Software Leviathans

Diseconomies of scale, why FAANG pays high salaries, the dominance of Java

The top end of software engineering jobs is dominated by what I’ve started thinking of as ‘Software Leviathans’: large software systems that are staffed by thousands of engineers. A few that come to mind are Amazon Alexa, Amazon.com, Google Search, Salesforce, and Facebook.com. These are not ‘monoliths’, large services that do everything. Instead they are the result of combining hundreds of smaller ‘micro-services’ into one massive software product.

These leviathans do many, many things. Few people on the planet can claim to know all of the features of facebook.com. It is quite possible that no single list exists that enumerates every feature in that product.

Similarly, development on these systems happens in parallel across many teams. It is essentially impossible for any one person to keep track of everything that is being added to the system.

Leviathans are too big for anyone to understand. It doesn’t matter what architecture or runtime choices are made. It could be one massive JVM, a million lambda functions, a hundred thousand Docker containers, or thousands of micro-services. Even if you work on the leviathan, you won’t have any real understanding of the total state of the system. Each engineer will be aware of, and communicate with, a tiny fraction of the total number of people working inside the leviathan.

Leviathans are heterogeneous systems. They do not do ‘one thing well’. Leviathans do everything you can think of. Google.com is a search engine, but it’s also a calculator, an advertising system, a web scraper, a hotel booking tool, a flight booking tool, and much more. Leviathans grow in parallel, across myriad tentacles of functionality. New features emerge all the time, usually to the surprise of other engineers on the project.

Leviathans are difficult to work in. Despite appearing to be a sea of constant change from the outside, any change made inside the leviathan is extremely expensive in engineering hours. There are thousands of potential interactions each engineering team has to consider when evaluating changes to their system. The architecture must be constrained heavily to support parallel development in environments where coordination between different teams is impossible due to scale. Engineers working on a software leviathan spend a relatively small fraction of their time actually writing code, compared to debugging issues, researching, coordinating changes, and documenting.
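One rough way to see where ‘thousands of potential interactions’ can come from is to count the possible pairs among the parts of a system. This is only a back-of-the-envelope sketch under a simplifying assumption that is not in the original argument: every pair of components could, in principle, interact. The function name `pairwise_interactions` is made up for illustration.

```python
from math import comb


def pairwise_interactions(components: int) -> int:
    """Count the distinct pairs among `components` parts: n * (n - 1) / 2.

    Assumption for this sketch: any two components might interact.
    Real systems have far fewer actual dependencies, but a team still
    has to rule the others out.
    """
    return comb(components, 2)


for n in (10, 100, 1000):
    print(f"{n:>5} components -> {pairwise_interactions(n):>7} possible pairs")
```

With 100 components the count is already 4950, so a product built from hundreds of services reaches the thousands quickly, and the number grows roughly with the square of the component count.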

Leviathans are interesting because they are the ‘core’ services powering the digital world these days. Their scale is at the top of the chart in the software engineering world, and as a result they expose the limitations of software engineering.

Software diseconomies of scale are at their most evident in these software leviathans. They are massive projects with huge numbers of the best engineers working on them. But development per engineer is slow, and the code quality is not clearly superior to what ordinary industry best practices produce.

Why I stopped going on Twitter, using time tracking apps to monitor your time with Qbserve

I’ve been an avid Twitter user for years, but had to stop this winter. I have been listening to ‘Deep Work’ while driving cross-country and have done a lot of thinking about how to do better work. One of the things recommended in the book is to quit social media, or at least exclude it from the part of your day when you work. I’ve typically just blocked Twitter from my network during the workday, then used it as much as I wanted afterwards.

Another thing I did in the pursuit of ‘deep work’ was to review my Qbserve stats for the last few months. My Twitter numbers were way higher than I expected. I have been spending thousands of dollars’ worth of time on Twitter, producing fun content that Twitter then monetizes. I could have gotten a part-time job or learned to paint.

Track your time. There are a bunch of apps that can do it. I use Qbserve because it stores data locally and feels like a more lightweight solution. I have also used RescueTime, but found having to log in again whenever I needed to restart tracking to be a pain.

Once you have tracking going, it gives you a lot of insight into what you are doing on your computer. Some people might think “ah, if I’m on the computer I’m working, what else would I use it for?” But for millennials and digital natives who spend most of their lives on a computer, it can really help.

For example, I know how much clock time I spent this year reading Xianxia, translated Chinese pulp fiction, on wuxiaworld.co: four whole days. That is more than double the amount of time I spent on news.ycombinator.com, which came in at 1 day and 13 hours. I also know how much time I spent writing, note taking, and journaling this year: around 30 hours so far. Admittedly, I haven’t run the app 24/7 and didn’t start until March, so I only have around 8 months’ worth of data.
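Totals like these are easier to compare once they are all in hours. The sketch below does that conversion for the three figures quoted above. The dictionary and its labels are made up for illustration and are not Qbserve’s export format.

```python
from datetime import timedelta

# Totals as quoted in the post; the labels are just keys for this sketch.
tracked = {
    "wuxiaworld.co": timedelta(days=4),
    "news.ycombinator.com": timedelta(days=1, hours=13),
    "writing, notes, journaling": timedelta(hours=30),
}


def hours(duration: timedelta) -> float:
    """Convert a duration to hours of clock time."""
    return duration.total_seconds() / 3600


# Use Hacker News as the baseline for the comparison.
baseline = hours(tracked["news.ycombinator.com"])

for label, duration in tracked.items():
    h = hours(duration)
    print(f"{label:<28} {h:5.0f} h   {h / baseline:.1f}x Hacker News")
```

Four days is 96 hours and 1 day 13 hours is 37 hours, so the fiction reading comes out at roughly 2.6 times the Hacker News time, while the 30 hours of writing is a bit less than either.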

I don’t think I would have realized how much time I was spending on Twitter without a time tracking app. It is a lot like television for normal people: it is just on all the time when you are home, and you don’t really think about its effects on your life. Most people underestimate how much time they spend watching television, but you don’t have to underestimate how much time you spend on YouTube. Just get Qbserve and review the data occasionally.

In the week or so since I quit, I’ve already read a couple of books and started writing on my blog again.

Burnout

I’ve dealt with burnout many times in my 6 years as a software engineer. Usually it happens when I get bored of a project or hit a slow period where I don’t have a lot of work to do. Counter to what you would expect, having less work makes it harder to get that work done than when there is more to do. I think it is because when there is less of it, the work feels less important, and subconsciously it feels like I’m not really needed. It is like being in a meeting where the whole team is trying to estimate the impact of an issue, but only three out of ten people can actually do anything before people start duplicating work. Then you end up waiting for other people to do basic things like read logs and tell you what they say. You could just read them yourself, but do we need 3 people reviewing the same logs right now?

I intentionally stayed on my current team for a relatively long period of time just to see what it was like. Earlier in my career I worked as a consultant where my longest time on one project was 9 months. Here at a product company we have been working on essentially the same problems for years. This is great in a way because I have been able to develop deep expertise in my systems and tooling, but the cost is of course burnout. 

The pandemic has made this year significantly worse by forcing remote work. I’ve lived in studio apartments since college and rely on having an office to provide a distinction between working and other activities. Efficiencies that normally make sense, like eating at my desk, serve to muddle work and play when everything happens in a 500 sq ft box.

Having everything muddled together makes it much harder to maintain flow, and the absence of flow makes everything more difficult, especially when your general happiness is influenced by your self-perceived productivity and usefulness, as mine is. A large reason for my career success so far is how I maintain focus in the office. I don’t let myself do certain activities in the office, like using Facebook, Reddit, Twitter, or almost anything non-work related. Figuring out how to extend those norms to a single-room lifestyle has been very difficult.

I haven’t been able to wait out burnout. In the past a team or job change alleviated the problem. This year it just got worse and worse. By the end of my stay in Seattle I didn’t want to fix it.

The good news is that I left Seattle. Living there has never felt right to me, and the winters are horrible. In the short term I will be itinerant, but eventually I will acquire a new permanent space, which will be larger. I’m hoping to move into a house or condo, but might end up in a one- or two-bedroom apartment with a dedicated office.

Fixing one part of my life that I knew I didn’t like has helped. It hasn’t fixed everything, but I’ve had a lot of time to think of ways to improve my working situation which I think will pay off.  

Links Post October

Here are some links I have seen lately that were pretty interesting.

In-window noise-canceling speakers

They will integrate these speakers into windows and walls and make them smaller, increasing the quality of life in dense cities. It is something that’s needed, since cities just keep getting larger.

Concept of a ‘feature store’ for typed ML model inputs (tensors, vectors, etc.)

https://www.logicalclocks.com/blog/feature-store-vs-data-warehouse

VM performance tests

Finally, we have a pair of great posts from tratt.net about VM warmup, with lots of data.

https://tratt.net/laurie/blog/entries/why_arent_more_users_more_happy_with_our_vms_part_1.html

https://tratt.net/laurie/blog/entries/why_arent_more_users_more_happy_with_our_vms_part_2.html

SledgeConf Retrospective

I envisioned SledgeConf as a tiny in-person conference, inspired by https://briancasel.com/tiny-conferences/. I hoped to hold one sometime in 2020, but was focused on some other projects. After the pandemic shut everything down, the idea of holding it remotely seemed pretty achievable. So I took my original plan for a 2-day event and tried to launch a 2-day virtual event.

Marketing and promotion were the two sticking points for SledgeConf. My audience is not very large and I’m not any kind of event promoter. In the end only one of my friends agreed to present at the remote conference. I considered canceling the event, but I had already taken a day off from my job, so I went ahead with the ‘conference’.

In the end we had two presentations and 7 people in total for SledgeConf. 5-10 people was the idea for the in-person conference, and 7 worked well for an intimate, casual setting. I also think that about 2 talks is the right number for an online conference. Watching presentations on your computer wears you out pretty fast, and by the end of the second presentation I was worn out.

My costs were super low, only a basic Zoom subscription. So despite being much smaller than hoped, things worked out pretty well. Several attendees expressed interest in another SledgeConf. I’m thinking of holding a ‘Winter SledgeConf’ this winter. The plan would be to just have 2 talks and give everyone more lead time.