Onboarding Preparedness Quiz

I am working on some materials around onboarding new engineers.

The Quiz

When was the last time you onboarded a new engineer?

Do different team members use different development environments to run integration tests locally?

Do you have architecture diagrams that are up to date as of last month? 

Do you have an up to date list of the APIs you consume and what their SLAs are? 

Do you have a list of all the internal tools your team uses to complete their work? 

Do you have a list of all the 3rd party tools your team uses to complete their work? 

Do you have a list of all the configuration files that your services use? 

Do you have a list of all the resources your services consume, object storage, disks, caches, VMs, queues, etc?

You can’t estimate a story in 2 minutes.


The typical agile process involves 5-10 people in a room. They go through one sentence descriptions of tasks, and each secretly comes up with their own estimate before every turns over their cards like in the final round of poker.

Then there will be a couple minutes spent discussing the estimates which will end with the average of the “votes”.

These meetings are usually confused by someone misunderstanding the breakout of one particular feature into one sentence descriptions, causing the group to backtrack and re-estimate several of the stories.

Most of the team will think about each task for a total of 2 minutes total. 

The intention is that the project manager will take all this data and gradually calibrate the estimates.

I have spent a great deal of time participating in these agile “estimation” sessions and I can assure you that the estimates I produce in this setting have almost always been far off the mark.

Projects commonly run over what they were estimated, and sometimes a project estimated to take a month actually takes a day if you get the right person to work on it.

After 3.5 years working in this environment the whole process seems pretty pointless. Everyone knows the estimates are wrong and plans for it. Why do we bother doing this process that is known to be unreliable at best?

In construction estimates are done off of detailed blueprints using well understood heuristics, equations and the price of materials. 

In software estimates are done off of one sentence descriptions of heterogeneous tasks and the gut feel of developers about how long something will take.

Continuous Delivery of Bugs


Agile Rapid Delivery means you deliver bugs to your customers everyday. 

Every once in a while I read an article about how software has become horrible over the last decade. Software has gotten slower, it uses too much memory, it has too many confusing features, why are developers producing worse software? 

Part of this is coincidence. We remember software being fast, but that was during a time when CPU speeds doubled every year. By the time you had used a program for a year, there was a processor out that could do it in half the time. Now Moore’s law ambles around at low double digit performance improvements. 

Another issue is feature complexity. “The old software just worked, why did they have to change it?” With the move to Software as a Service we can constantly add new features. 

Frequency matters if you deliver buggy code to your customers once a year on CD-ROMs. They are going to complain about buggy code once a year. If you deliver buggy code to your customers everyday, they will probably give up and accept their fate. 

Please don’t build yet another internal ‘tool’

We all know the feeling. You have just started a new contract and its time to find out the truth about their software stack. Do they still use Ant despite the fact that Maven and Gradle have been options for over 10 years? Are you about to find out that their website is ‘really’ built on a 20 year old Pearl web framework that hasn’t been used for anything anywhere else? 

Once you get past the initial shock of finding out what they doubled down on after it was wildly known to be a horrible idea. You get to find out about the internal tooling. Maybe you hear someone say, “we use the tools that are right for the problem”. Or maybe someone says “We don’t want to force people to use tools that are not a fit for their use case.” That is a sign that there might be more than a few internal tools lurking in the company wiki. 

In your first week you hear about how we have our own web framework and try to figure out how dependency management works here. You find it somewhat nifty that you can submit multi-repository Pull Requests to their custom version of bitbucket, except with a few less bells and whistles. Alright, we have our own RPC framework that uses XML kind of like how SOAP used to work. 

Then over the next few months you realize all the tools are custom. The company hasn’t taken advantage of any of the advances the rest of the industry has over the last 20 years. They have continuous integration and continuous delivery but it is all custom. The end to end test framework your project relies on was built by another team that disbanded quietly last year and is now entirely unsupported. You have the options of taking over the project (fuck no) or migrating to another promising internal end to end test framework. That team promises they will fully support it once they hit 1.0. 

Name a tool used in the industry. Go ahead, name one. “Alright, Terraform”, oh, Terraform is nice, but we have an internal tool Declarative Template Manager which allows you to write templates in Pearl. DTM templates are faster and supported as a first class citizen in our companies closed source deployment system. 

-you- “Cool, I need to track down this bug, how can I search the logs?”

-old time dev- “We use grep.”

-you- “Alright, how do I get to the logs?”

-old time dev- “WoodShed, were you keep logs, here is the link”

Its a link to a wiki page with the standard internal documentation you know and love. 

-you- “So how do I get the logs out of WoodShed?”

-old time dev- “Well, people who know what they are doing use Wood Chipper. WoodChipper is the best tool to use to grep logs from WoodShed. “

-you- “I just want to do a quick search, is there a web portal that can do that?”

-old time dev- “Well, we have SawMill, it can pool all the logs for your teams servers onto one host so you can ssh into one place and use grep!”

Now you might be thinking no serious software company would allow their engineers to ssh into production hosts just to grep logs, but you would be wrong. After all Devops means everyone has ssh access in production. You might think that the top software companies would have a better solution than STORING ALL THE LOGS IN S3 AND THEN GREP’ING THEM.

If you thought any of those things you were wrong. Every company has some private tooling, over time that ‘some’ goes from a couple tools to dozens and then hundreds. The reason software engineers love startups is because they get to dump all the legacy code (older than 2 years) and build something new. Thereby escaping the Chuthulu-esque horror building up in the company codebase. 

It’s amazing how much time not having standup adds to my day.


Now that I am working as an Individual Contributor again I actually get to do focused work. My team has standup at 10:45, I start thinking about standup at about 10:00 or so. What did I do yesterday? I put together some notes and then go back to working for a bit. After standup I have a few minutes of work and then go to lunch around 11:30 to avoid the rush. 

Contrast this with our ‘no meeting’ day. I get into the office and have an additional 90 minutes of uninterrupted work before lunch. The real difference is that I get two 4 hour blocks of uninterrupted work in a no-meeting day as opposed to one or zero on a day with meetings.