How much does using Haskell instead of Java cost your company?

Following on my earlier post “How much is turnover costing your company?” this post is about how much using non-standard technology stacks costs your company. Let’s say Company A is a high-performance company with a custom functional programming stack that uses Haskell and Miso for development. Company A doesn’t ship a lot of bugs to production. In fact, they run CI/CD with deployments to prod every day. Their SaaS product is one of the best in the industry and is growing at a good rate.

Now Company A has a competitor in the form of Company B. In this case B does stand for Boring. Company B offers a SaaS product that is almost identical to the one Company A sells, but Company B’s has more bugs because they use Java. In fact, Company B uses Java with Spring Boot and ReactJS. Both Company A and Company B started in 2018 when React was already mainstream. Company B also uses CI/CD deployments, but they can only deploy to production once a week because there are so many bugs. Company B also hired a QA engineer to help support the pipeline because of an embarrassing outage that Company A used in a humiliating ad.

In January 2019 both companies raise a round of funding for 10 million dollars. The mandate from the VCs is to 10x the company in size and destroy the competition. Both companies need to start hiring fast. Throughout 2019 Company A has an advantage: there are a bunch of passionate Haskell engineers who want to work on their product. Both Company A and Company B reach their growth targets and hit 100 engineers that year.

Fortunately, things go well for both companies in 2020. The market is growing and revenues are increasing. The VCs call for more growth. Now both companies want to grow engineering to 500 engineers.

Now Company A has a problem. They already hired all the passionate Haskell+Miso engineers they could find. Any new engineers will take 12 months to train on their technology stack. Company B doesn’t have this problem: everyone still remembers how ReactJS and Spring Boot work, even though they are no longer the top technologies. Company B only needs 6 months to train a new engineer on their technology stack.

Let’s run the numbers. How much is it going to cost to onboard 400 new engineers for each company?

We will keep our 2-year average tenure estimate and use 12 months of onboarding for Company A. Each engineer has a total cost of 200k per year including salary, equity, benefits, taxes and insurance.

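If onboarding proceeds linearly, as in the earlier post, an engineer delivers about half a year of work during 12 months of onboarding. So each engineer is going to cost around 100k to onboard at Company A.

With 6 months of onboarding, each engineer is going to cost about 50k to onboard at Company B.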
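
Here is a quick sketch of that arithmetic in Python. It assumes onboarding ramps up linearly, so on average half of the onboarding period is paid for but not delivered. The function and variable names are mine, purely for illustration.

```python
# Assumption: productivity ramps linearly from 0% to 100% during onboarding,
# so on average half of the onboarding period is paid for but not delivered.
ANNUAL_COST = 200_000  # total cost per engineer per year (figure from the post)
NEW_HIRES = 400        # engineers each company needs to onboard


def onboarding_cost(annual_cost, onboarding_months):
    """Cost of the output lost while one engineer ramps up."""
    onboarding_years = onboarding_months / 12
    return annual_cost * onboarding_years / 2


cost_a = onboarding_cost(ANNUAL_COST, 12)  # Company A: 12 months to onboard
cost_b = onboarding_cost(ANNUAL_COST, 6)   # Company B: 6 months to onboard

print(f"Company A: {cost_a:,.0f} per engineer, {cost_a * NEW_HIRES:,.0f} total")
print(f"Company B: {cost_b:,.0f} per engineer, {cost_b * NEW_HIRES:,.0f} total")
```

Under these assumptions the 400 new hires cost Company A about 40 million in ramp-up time versus about 20 million for Company B.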

Now the next question: how long will it take each company to get all 400 new engineers fully operational?

My assumption is that it takes 1 fully onboarded engineer to train a new engineer. Each company has 100 engineers to start so at the beginning the first 100 engineers will train the next 100. Then each company will have 200 engineers to train the next 200. Then things calm down a little and we have 400 engineers to train the last 100. 

After 12 months Company B has double the trained engineers Company A has. At 18 months Company B is finished onboarding and can have all 500 engineers focus on the product. Company A won’t catch up for another 18 months.
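
The schedule above can be sketched as a tiny simulation. This is only a rough model under the assumptions already stated: one trained engineer mentors one new hire at a time, and a cohort only starts mentoring once it is fully onboarded. The function name and its parameters are made up for this example.

```python
def trained_headcount_timeline(start, target, months_per_cohort):
    """Return [(month, trained_engineers)] under a 1:1 mentoring rule.

    Assumption: each trained engineer onboards exactly one new hire per
    cohort, and a cohort counts as trained only when its onboarding ends.
    """
    timeline = [(0, start)]
    trained, month = start, 0
    while trained < target:
        # A cohort can be no bigger than the number of available mentors.
        cohort = min(trained, target - trained)
        month += months_per_cohort
        trained += cohort
        timeline.append((month, trained))
    return timeline


company_a = trained_headcount_timeline(100, 500, months_per_cohort=12)
company_b = trained_headcount_timeline(100, 500, months_per_cohort=6)

print("Company A:", company_a)  # [(0, 100), (12, 200), (24, 400), (36, 500)]
print("Company B:", company_b)  # [(0, 100), (6, 200), (12, 400), (18, 500)]
```

At month 12 Company B has 400 trained engineers to Company A’s 200, and Company A only reaches 500 at month 36, which is 18 months after Company B.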

Most people will agree that Company A’s functional programming stack is not going to make up for the extra engineer years that Company B has at this point. 

Estimating in Agile

I have been reading Software Estimation again after a recent conversation with one of the product owners on my team. The senior engineer on my team has pushed us into a ‘pointless’ estimating process where the estimate is essentially the count of the subtasks in the story.

The product owner wanted us to provide time or t-shirt size estimates earlier in the process, before we did the work to break up the story into subtasks. This discussion broke down into people arguing over agile…

Personally, I am quite disillusioned with how we create estimates as part of the software development lifecycle. The standard estimation technique in my experience is to compare a new project with a past project based solely on personal memory and gut feel, then to double whatever estimate you came up with. The computer science and engineering classes I took did not cover anything related to estimating projects, which is a little strange since I assumed the other engineering disciplines like mechanical or electrical engineering would cover estimates.

The business executives need estimates so that they can coordinate work, sign deals and generally make money. If you promise a client something and the schedule slips 6 months, that is a problem. If you sign a 10 million dollar advertising deal and the delivery schedule slips 3 months, it’s a problem. So the business needs accurate estimates to plan things. But if we as the engineering team are just guessing, what is the point? If our estimates are constantly wrong and the levels above us ignore them, why do we need to waste our time ‘voting’ and playing ‘planning poker’?

Then there is the issue that in agile we are very loose with stories. Subtasks are added and removed with little oversight. After a story is estimated things can change, and suddenly the story goes from 3 subtasks to 10. The next day 8 of those subtasks are moved into another story. What were we estimating again?

I will be writing a bunch more on estimation, but one of the things I have learned so far is that expert judgement is considered the weakest means of getting an estimate. The best way is to count something, whether stories, square feet of drywall or some other work component. The second-best way is to use heuristics to compute your way to an estimate. The last resort is expert judgement or gut feel.

Count > Compute > Judgement

The usefulness of linking email threads in stories and tickets.

Have you ever had a great email thread about a feature? You and your manager hashed out a solution with the product team and everyone is in agreement. All that is left to do is create a JIRA ticket and get working. Now wouldn’t it be nice if you could just LINK THE EMAIL THREAD IN THE STORY!

I have seen email archives, I know they exist. Linux has them. My employer even has email archives. But how can I link to the archive from Outlook? There should be a way to do this super simple thing to share my emails in a story. 

I am going to investigate how the email archives work at my employer. 

How much is turnover costing your company?

The software industry has a turnover problem. The general consensus is that it takes around 6-12 months to fully onboard a new hire. Essentially, you hire someone for 100k a year, and then it takes roughly 6-12 months before that person starts producing work at the 100k level. This would be fine if software engineers did not change jobs frequently, but we tend to switch jobs every 1-3 years for higher pay.

Let us consider the case of a company with a complex project and slightly high turnover. They need a full 12 months to onboard a new engineer and they can expect that engineer to stay at the company for 2 years in total. If onboarding proceeds linearly, the engineer delivers about half a year of work during the first year, so the company ends up paying 2 years’ pay for 1.5 years’ worth of results. The company is spending 25% of salary on training over the lifetime of that employee.

If the company managed to get the average lifetime per employee up to 3 years, they would spend 3 years’ pay on 2.5 years of work, or roughly 17% of salary on training.

That is a difference of about 8 percentage points, so a company in this situation could afford to give roughly an 8% raise if it kept an employee around for an additional year.
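
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the calculation. It assumes the same linear ramp-up, so half of the onboarding period counts as lost output. The function name is hypothetical.

```python
def training_share(onboarding_years, tenure_years):
    """Fraction of total pay that buys no output, assuming a linear ramp-up."""
    lost_years = onboarding_years / 2  # linear ramp: half the period is lost
    return lost_years / tenure_years


two_year = training_share(onboarding_years=1, tenure_years=2)
three_year = training_share(onboarding_years=1, tenure_years=3)

print(f"2-year tenure: {two_year:.0%} of pay goes to training")    # 25%
print(f"3-year tenure: {three_year:.0%} of pay goes to training")  # 17%
print(f"Room for a raise: about {two_year - three_year:.0%}")      # 8%
```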

Onboarding Preparedness Quiz

I am working on some materials around onboarding new engineers.

The Quiz

When was the last time you onboarded a new engineer?

Do different team members use different development environments to run integration tests locally?

Do you have architecture diagrams that are up to date as of last month? 

Do you have an up to date list of the APIs you consume and what their SLAs are? 

Do you have a list of all the internal tools your team uses to complete their work? 

Do you have a list of all the 3rd party tools your team uses to complete their work? 

Do you have a list of all the configuration files that your services use? 

Do you have a list of all the resources your services consume, object storage, disks, caches, VMs, queues, etc?

You can’t estimate a story in 2 minutes.

The typical agile process involves 5-10 people in a room. They go through one-sentence descriptions of tasks, and each secretly comes up with their own estimate before everyone turns over their cards like in the final round of poker.

Then there will be a couple minutes spent discussing the estimates which will end with the average of the “votes”.

These meetings usually get derailed by someone misunderstanding how one particular feature was broken out into one-sentence descriptions, causing the group to backtrack and re-estimate several of the stories.

Most of the team will think about each task for a total of 2 minutes.

The intention is that the project manager will take all this data and gradually calibrate the estimates.

I have spent a great deal of time participating in these agile “estimation” sessions and I can assure you that the estimates I produce in this setting have almost always been far off the mark.

Projects commonly run over their estimates, and sometimes a project estimated to take a month actually takes a day if you get the right person to work on it.

After 3.5 years working in this environment the whole process seems pretty pointless. Everyone knows the estimates are wrong and plans for it. Why do we bother doing this process that is known to be unreliable at best?

In construction, estimates are done off of detailed blueprints using well understood heuristics, equations and the price of materials.

In software, estimates are done off of one-sentence descriptions of heterogeneous tasks and the gut feel of developers about how long something will take.

Continuous Delivery of Bugs

Agile Rapid Delivery means you deliver bugs to your customers every day.

Every once in a while I read an article about how software has become horrible over the last decade. Software has gotten slower, it uses too much memory, it has too many confusing features. Why are developers producing worse software?

Part of this is coincidence. We remember software being fast, but that was during a time when CPU performance doubled roughly every 18 to 24 months. By the time you had used a program for a couple of years, there was a processor out that could run it in half the time. Now Moore’s law ambles along at low double-digit performance improvements.

Another issue is feature complexity. “The old software just worked, why did they have to change it?” With the move to Software as a Service we can constantly add new features. 

Frequency matters. If you deliver buggy code to your customers once a year on CD-ROMs, they are going to complain about buggy code once a year. If you deliver buggy code to your customers every day, they will probably give up and accept their fate.

Please don’t build yet another internal ‘tool’

We all know the feeling. You have just started a new contract and it’s time to find out the truth about their software stack. Do they still use Ant despite the fact that Maven and Gradle have been options for over 10 years? Are you about to find out that their website is ‘really’ built on a 20-year-old Perl web framework that hasn’t been used for anything anywhere else?

Once you get past the initial shock of finding out what they doubled down on after it was widely known to be a horrible idea, you get to find out about the internal tooling. Maybe you hear someone say, “We use the tools that are right for the problem.” Or maybe someone says, “We don’t want to force people to use tools that are not a fit for their use case.” That is a sign that there might be more than a few internal tools lurking in the company wiki.

In your first week you hear about how we have our own web framework and try to figure out how dependency management works here. You find it somewhat nifty that you can submit multi-repository pull requests to their custom version of Bitbucket, except with a few fewer bells and whistles. Alright, we have our own RPC framework that uses XML, kind of like how SOAP used to work.

Then over the next few months you realize all the tools are custom. The company hasn’t taken advantage of any of the advances the rest of the industry has made over the last 20 years. They have continuous integration and continuous delivery, but it is all custom. The end-to-end test framework your project relies on was built by another team that disbanded quietly last year and is now entirely unsupported. You have the options of taking over the project (fuck no) or migrating to another promising internal end-to-end test framework. That team promises they will fully support it once they hit 1.0.

Name a tool used in the industry. Go ahead, name one. “Alright, Terraform.” Oh, Terraform is nice, but we have an internal tool, Declarative Template Manager, which allows you to write templates in Perl. DTM templates are faster and supported as a first-class citizen in our company’s closed-source deployment system.

-you- “Cool, I need to track down this bug, how can I search the logs?”

-old time dev- “We use grep.”

-you- “Alright, how do I get to the logs?”

-old time dev- “WoodShed, where you keep logs, here is the link”

It’s a link to a wiki page with the standard internal documentation you know and love.

-you- “So how do I get the logs out of WoodShed?”

-old time dev- “Well, people who know what they are doing use WoodChipper. WoodChipper is the best tool to use to grep logs from WoodShed.”

-you- “I just want to do a quick search, is there a web portal that can do that?”

-old time dev- “Well, we have SawMill, it can pool all the logs for your team’s servers onto one host so you can ssh into one place and use grep!”

Now you might be thinking no serious software company would allow their engineers to ssh into production hosts just to grep logs, but you would be wrong. After all, DevOps means everyone has ssh access in production. You might think that the top software companies would have a better solution than STORING ALL THE LOGS IN S3 AND THEN GREP’ING THEM.

If you thought any of those things, you were wrong. Every company has some private tooling, and over time that ‘some’ goes from a couple of tools to dozens and then hundreds. The reason software engineers love startups is that they get to dump all the legacy code (older than 2 years) and build something new, thereby escaping the Cthulhu-esque horror building up in the company codebase.

It’s amazing how much time not having standup adds to my day.

Now that I am working as an Individual Contributor again I actually get to do focused work. My team has standup at 10:45, I start thinking about standup at about 10:00 or so. What did I do yesterday? I put together some notes and then go back to working for a bit. After standup I have a few minutes of work and then go to lunch around 11:30 to avoid the rush. 

Contrast this with our ‘no meeting’ day. I get into the office and have an additional 90 minutes of uninterrupted work before lunch. The real difference is that I get two 4-hour blocks of uninterrupted work on a no-meeting day, as opposed to one or zero on a day with meetings.

Back to being a full-time dev: 6-month review.

Time has gone by faster than I expected. At this point I have been back as a full-time developer and individual contributor for about six months. It had been about 2 years since I did any Java development, and getting back into it was harder than I expected. It doesn’t help that the Java projects here are large and the system is massive. We have dozens of Java services with hundreds of thousands of lines of code, and we integrate into a lot of core business systems outside of our organization. This is the most complicated project I have worked on personally, with problems spanning prices, voice UI, internationalization of voice UI, international law and massive Java projects.

The software development lifecycle here is very mature and the best way to work I have come across in the business world so far. We practice what could be described as ‘informal agile’. We have sprints and stories and quarterly goals like many places. Compared to the Scaled Agile Framework we have a laughable amount of process. It took me a few months to realize that my open calendar was here to stay.

Java is a mature language with lots of support for anything you could imagine, but it is still a painful language to use. On the flip side, I have finally figured out how annotations work, and I have been experimenting with Lombok. I have returned to the world of XML configuration and Spring (not Boot).

All the less fun parts of development are back too: really long builds, massive numbers of dependencies, and pull request churn that means you have to rebase your changes constantly. But I get to write code again and my technical expertise is increasing again instead of declining.

Overall, I am pretty happy with my transition back into development. Would I have done it for a pay cut? No. But since it came with an increase, I am in a pretty good place.