Software Leviathans

Dis-economies of scale, why FAANG pays high salaries, the dominance of Java

The top end of software engineering jobs are dominated by what I’ve started thinking of as ‘Software Leviathans’, large software systems that are staffed by thousands of engineers. A few that come to mind are Amazon Alexa, Amazon.com, Google Search, Salesforce, Facebook.com. These are not “monoliths’ or large services that do everything. Instead they are the result of combining 100s of smaller ‘micro-services’ into one massive software product. 

These leviathans do many many things, few people on the planet can claim to know all of the features of facebook.com. It is quite possible that there exists no single list that enumerates every feature in that product. 

Similarly, development on these systems happens in parallel across many teams. It it is essentially impossible for any one person to keep track of everything that is being added to the system. 

Leviathans are too big for anyone to understand. It doesn’t matter what architecture or runtime choices are made. It could be one massive JVM, a million lambda functions, a hundred thousand docker containers or thousands of micro-services. Even if you work on the leviathan, you won’t have any real understanding of the total state of the system. Each engineer will be aware of and communicate with a tiny fraction of the total number of people working inside the leviathan. 

Leviathans are heterogeneous systems. The do not do ‘one thing well’. Leviathans do everything you can think of. Google.com is a search engine, but it’s also a calculator, an advertising system, a web scraper, a hotel booking tool, a flight booking tool, and many more. Leviathans grow in parallel, across myriad tentacles of functionality. New features emerge all the time usually to the surprise of other engineers on the project. 

Leviathans are difficult to work in. Despite appearing to be a sea of constant change from the outside. Any change made inside the Leviathan is extremely expensive in engineering hours. There are thousands of potential interactions each engineering team has to consider when evaluating changes to their system. The architecture must be constrained heavily to support parallel development in environments where coordination between different teams is impossible due to scale. Engineers working on a software leviathan spend a relatively small fraction of their time actually writing code as compared to debugging issues, research, coordinating changes, and documenting. 

Leviathans are interesting because they are the ‘core’ services powering the digital world these days. Their scale is at top of the chart in the software engineering world and as a result they expose the limitations of software engineering. 

Software diseconomies of scale are at their most evident in these software leviathans. They are massive projects with huge numbers of the best engineers working on them. But development is slow per engineer and code quality is not clearly superior to industry best practices.