Wide vs Deep software

Voice assistants are wide software. In the industry we call it the ‘long tail’ of functionality. There are hundreds of ‘tasks’ that your Alexa or Google Assistant can perform for you. You probably don’t know that most of them even exist. But not knowing that these tasks exist costs you nothing. The fact that you can buy pizzas via Alexa has no impact on your ability to get news briefs. You can do either without ever engaging with the other.

This ‘long tail’ attribute makes voice assistants extremely wide software. Alexa can do hundreds of things which could be standalone applications. But the trade-off is that voice assistants are extremely vast and don’t do anything particularly well. Over time the ‘main’ functionalities will be refined and optimized. But voice assistants will always suffer from the ‘long tail’ problem in that they have extremely wide feature sets.

Wide software spreads across multiple domains. More domains mean more leaky abstractions and more software whose only job is to map between them.

Deep software focuses on a single domain. Perhaps it is an order book or a workflow execution engine. Deep software has a clear purpose and domain. Wide software does everything. 

Over time deep software converges on clean abstractions and easy-to-understand code. Wide software, on the other hand, is never finished. Wide software is naturally expansive. There is always a reason to add new functionality to a voice assistant. And in fact there is no real barrier to entry. Adding new functionality to a voice assistant is a net positive to the system as a whole. The negatives of adding a new domain are already baked in, and many customers will enjoy the new functionality.

Deep software can be finished. It can solve a problem in one domain and be done. Hadoop is an example. No one has heard about miraculous developments in HDFS this decade. Hadoop is essentially feature complete and in maintenance mode. In reality development continues, but is it really new stuff or refinements?

Wide software cannot be ‘finished’. Wide software is an infinite sinkhole. Adding more code makes the sinkhole more valuable so more code keeps getting added. There is no real way to ‘solve’ the problems of wide software. You can partition wide software such that each domain exists separately. But if you allow one domain to reference another, now you are back into the pit. 
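The partitioning point can be pictured with a small check over a map of which domains reference which. This is only a minimal sketch: the domain names and the `cross_domain_references` helper are hypothetical and are not taken from any real voice assistant.

```python
from typing import Dict, List, Set, Tuple


def cross_domain_references(references: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (domain, referenced_domain) pairs where one domain reaches into another."""
    leaks = []
    for domain, targets in references.items():
        for target in targets:
            if target != domain:
                leaks.append((domain, target))
    return sorted(leaks)


# Hypothetical domains, fully partitioned: no domain references another.
partitioned = {"music": set(), "news": set(), "pizza": set()}

# One cross-domain feature (say, playing music while the pizza is on its way)
# is enough to couple two domains again.
coupled = {"music": set(), "news": set(), "pizza": {"music"}}

print(cross_domain_references(partitioned))
print(cross_domain_references(coupled))
```

An empty result means the partition still holds. Every pair in the result is a place where a change in one domain can now break another.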

Software Leviathans, which I’ve discussed in another post (https://www.sledgeworx.io/software-leviathans/), are wide software. Supporting more domains typically increases the value of the leviathan as a whole. A voice assistant which can order dry cleaning is better than an assistant that can’t order dry cleaning. Overall there isn’t a trade-off between the two. One has an additional ‘ability’ with no downside to adding that ability. You would have to make a voice assistant that only handled one domain to escape this constraint.

Wide software isn’t magical. Wide software does too many things to be incredible at any of them. Since there are countless features, the team has to spend a lot of time making sure they don’t break anything. In software leviathans, not breaking things is particularly difficult because nobody actually knows what all the features are.

Since wide software is always being pushed to add something new, energy and design focus are constantly shifted towards new features and problem domains. New domains expect old domains to support new features.

Wide software suffers from another problem: even if some domains in the project receive continual investment, say music for example, and even if music functionality is iterated on again and again, that particular domain being awesome doesn’t change the flavor of the beast. It is still a ball of mud, dirt, rocks, etc.

Spring into SledgeConf!

https://sledgeconf.dev

Our next SledgeConf is coming on Friday May 21st at 4:30pm PST!

Join us for discussion of the next 10 years of remote work. 

My presentation for this SledgeConf is titled “The next decade in remote work.”

I’m joined by Mason Traylor, presenting “How to waste Everybody’s Time with Remote Meetings”.

Agenda:

4:30 Introductions 

4:45 Presentation “How to waste Everybody’s Time with Remote Meetings”

5:00 Questions + Discussion

5:15 Presentation “The next decade in remote work.” 

5:45 Questions 

6:00 Promotional Round table 

6:15 Unstructured Discussion time

The Zoom link will be sent to the mailing list on the day of SledgeConf. Please go to SledgeConf.dev and sign up for the mailing list. If you are one of my connections on LinkedIn, the Zoom link will also be available there.

https://sledgeconf.dev

Software advances are slower than you expect.

Most people think of technological advances using the eureka metaphor. But software doesn’t work like that. Take a clear technological advance like a self-driving tractor. They are on the market now but there was no eureka moment, no sudden breakthrough. Self-driving technology just advanced to the base level required for tractors on clearly mapped fields, then a team of software engineers built a working system over several years.

There’s no invention or breakthrough moment, just a slow build-up in non-software capabilities, then an investment in building out software to leverage those capabilities.

There were no software engineering advances that enabled the self-driving tractor. Instead machine vision improved until it was good enough to unblock the software solution.

What advances in software are in sight?

Reproducible builds are one of my favorite recent advances in software. Strictly speaking, reproducible builds have been available for quite some time, but that doesn’t mean they aren’t a real advance. Having a trustworthy build environment where you can compare the inputs and the outputs across different machines is a big benefit. Without reproducible builds, software engineers end up spending extra time fixing build failures and figuring out dependency resolution overrides.
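To make ‘comparing the outputs’ concrete, here is a minimal sketch of one way to check whether two build output directories are byte-for-byte identical. The helper names `tree_digest` and `same_build` are made up for illustration, and real build tooling may do this differently.

```python
import hashlib
from pathlib import Path


def tree_digest(root: str) -> str:
    """Hash every file under root (relative path + contents) in a stable order."""
    base = Path(root)
    files = [p for p in base.rglob("*") if p.is_file()]
    digest = hashlib.sha256()
    # Sort by relative path so the result does not depend on directory listing order.
    for path in sorted(files, key=lambda p: p.relative_to(base).as_posix()):
        name = path.relative_to(base).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each chunk so file boundaries cannot blur together.
        for chunk in (name, data):
            digest.update(len(chunk).to_bytes(8, "big"))
            digest.update(chunk)
    return digest.hexdigest()


def same_build(dir_a: str, dir_b: str) -> bool:
    """True when both output trees contain the same files with the same bytes."""
    return tree_digest(dir_a) == tree_digest(dir_b)
```

If the same commit built on two machines gives different digests, something outside the source is leaking into the build, for example a timestamp, a file ordering, or an unpinned dependency.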

Golang is, in my opinion, an advancement in software engineering. Instead of focusing on productivity for a single engineer or a small number of engineers, Golang is focused on maximizing productivity for large software projects with thousands of engineers. It does this by simplifying the language as much as possible to ease readability and eliminate magical effects. Everything in Golang is very procedural and easy to follow.

Easy-to-use gradual typing is a more recent advancement in the Python and Ruby worlds. You can now build your MVP in a dynamic, quick-evaluating language, and then after you reach product-market fit you can add types to the code base. Typing is no longer an either/or thing but a sliding scale where you can choose the most opportune time to transition. Overall I think gradual typing has huge advantages compared to the old standard of rewriting the codebase in Java.
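As a rough illustration of that sliding scale, here is the same function before and after annotations are added. The function and type names are hypothetical, and the sketch assumes a checker such as mypy is run over the code separately.

```python
from typing import List, Optional, TypedDict


# Stage 1: MVP code with no annotations. It runs fine, and a type checker
# simply treats the arguments as dynamic.
def order_total(items, discount=None):
    total = sum(item["price"] * item["qty"] for item in items)
    if discount:
        total -= total * discount
    return total


# Stage 2: the same logic after annotations are added. Runtime behavior is
# unchanged, but a checker can now flag callers that pass the wrong shape.
class LineItem(TypedDict):
    price: float
    qty: int


def order_total_typed(items: List[LineItem], discount: Optional[float] = None) -> float:
    total = sum(item["price"] * item["qty"] for item in items)
    if discount:
        total -= total * discount
    return total
```

Both versions can live in the same codebase, so the migration can happen one module at a time rather than as a rewrite.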

Software Leviathans and the weird dominance of good enough.

One day in Spring 1989, I was sitting out on the Lucid porch with some of the hackers, and someone asked me why I thought people believed C and Unix were better than Lisp. I jokingly answered, “because, well, worse is better.” We laughed over it for a while as I tried to make up an argument for why something clearly lousy could be good.

https://www.dreamsongs.com/WorseIsBetter.html

It has long been wondered why Java took the crown for the ‘enterprise’ language. I can’t really argue on that topic since I came onto the scene long after Java was all there was. This article is about why software leviathans are written in Java more than anything else. 

You have a huge software project to build. What language do you build it in? The prototype was written in Ruby on Rails by one guy and an Adderall prescription. Now they want you to scale this thing to 1000+ engineers over 5 years of development. You might think “aha, this is my chance, let’s save an order of magnitude in lines of code and use Lisp”, except this story happened in the past and they chose Java.

Why is it always Java? Sure it’s reasonably fast, but Facebook made PHP work, can’t we at least use Haskell? Since we have the benefit of hindsight, we know that most of the biggest software systems are built in Java. Google built so many leviathans in Java that they bankrolled a new language like Java but with fewer features. Amazon is based on Java. Netflix is Java again. Facebook made their own language, and Microsoft is old enough to have existed before Java, but still made their own version of Java, C#.

The real question should be, “What is Java’s secret?”. 

Java just plays well with the major constraints in a software leviathan, and at leviathan scale that is all that matters.

Java requires a lot of boilerplate

This one is the corollary of “Java doesn’t support meta-programming”. Creating your own DSL is great, 1000 engineers creating their own DSL is 999 nightmares. Software Leviathans are too big for any one engineering team to understand. Any DSL you create makes your code unintelligible to the rest of the people working in hell with you. I can understand boilerplate written by a monkey, but a DSL written by another software engineer could take me days to understand. When your team gets poached to go work on a startup where the code base isn’t humongous, it’s a lot easier to bring in Java programmers to replace you lot than it would be to get Haskell engineers to figure out your undocumented dialect. 

Google got to the point where they figured Java had too much meta-programming ability so they created Go which is basically Java without inheritance. That is what happens when you work in a leviathan project. You begin to resent the ability of your peers to do anything unusual, because you know it’s just going to be more work for you. 

Adding more onboarding time to understand 1) the functional language and 2) the DSL your team created might push our already long 6-month onboarding period closer to the 1-year mark. I wrote an article about onboarding time and functional languages aimed at startups, but honestly I don’t think the hiring market is the real reason Java dominates the top end. FAANG is already willing to train new grads to work on their giant software projects.

It boils down to comprehension honestly. Humans can only comprehend so many things and at leviathan scale the max is a tiny fraction of the entire system. 

In a software leviathan your team constantly works with other teams’ systems. How does this API work? There isn’t any documentation, and one 30-minute office hours session isn’t going to explain that hairball. If you all use the same language and that language is Java, there is a chance you can open up their code base and figure out what is going on. They probably didn’t do anything you wouldn’t expect, like pre-allocating all of their memory and storing all objects in a ring buffer. But if they did do something crazy, you can probably figure it out. Besides, Java doesn’t have anything like Scalaz, so you won’t be surprised by a functor where you weren’t expecting it.

Let’s take the opposite side, away team work. You have been given the glorious task of implementing a new feature. But it’s impossible to do it cleanly without an API change in another team’s system. That team fully supports the change and has contributed 2 paragraphs to your architecture document describing the change to make in their system. But the change isn’t on their team’s roadmap, so you are going to have to do it.

Getting their service to run and pass integration tests in your virtual development machine takes a week. Now you need to navigate their system where they have conveniently used dependency injection to ensure that you can’t know which of the 5 implementations of this interface is in play. Do you still wish the other team could use Clojure? You might never figure out the DSL. 
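For readers who have not hit this, the dependency injection problem looks roughly like the sketch below. It is written in Python for brevity rather than Java, and every name in it (`Notifier`, `OrderService`, `build_service`) is hypothetical.

```python
from typing import Dict, Protocol


class Notifier(Protocol):
    def send(self, message: str) -> str: ...


class EmailNotifier:
    def send(self, message: str) -> str:
        return "email: " + message


class SmsNotifier:
    def send(self, message: str) -> str:
        return "sms: " + message


class OrderService:
    """The service only sees the interface; nothing here names a concrete class."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def confirm(self, order_id: str) -> str:
        return self._notifier.send("order " + order_id + " confirmed")


def build_service(config: Dict[str, str]) -> OrderService:
    # The concrete implementation is picked here, from configuration,
    # usually in a file far away from the code you are trying to change.
    registry = {"email": EmailNotifier, "sms": SmsNotifier}
    return OrderService(registry[config["notifier"]]())


service = build_service({"notifier": "sms"})
print(service.confirm("42"))
```

Reading `OrderService` alone tells you nothing about which implementation runs in production. You have to find the wiring and the configuration that feeds it.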

Have you ever looked at somebody else’s Lisp code and wondered what was inside the variables? Now imagine this is your job and you will spend the next month making a 200-line change to a 100,000-line API service you didn’t know existed until this week. Except this will happen every quarter for the rest of your career.

People complain about how Java forces you to write the type of things everywhere, but for software leviathans this is a benefit. I can see helpful type signatures everywhere, whether I’m reading your code in my IDE, an email, an excerpt in an arch doc, or in a Slack message you sent me at 3 am. 

Java and Go are great in Software Leviathans. You don’t have to worry about stumbling upon a programming mystery created 10 years ago by a disgruntled new grad. You can expect a consistent syntax and language whichever microservice you are working on. The code has self-documenting types that are ‘easy’ to understand. Honestly, there are a lot of benefits which make a tough coding environment a little more manageable.

Software Leviathans

Dis-economies of scale, why FAANG pays high salaries, the dominance of Java

The top end of software engineering jobs is dominated by what I’ve started thinking of as ‘Software Leviathans’: large software systems that are staffed by thousands of engineers. A few that come to mind are Amazon Alexa, Amazon.com, Google Search, Salesforce, and Facebook.com. These are not ‘monoliths’ or large services that do everything. Instead they are the result of combining hundreds of smaller ‘micro-services’ into one massive software product.

These leviathans do many, many things. Few people on the planet can claim to know all of the features of Facebook.com. It is quite possible that there exists no single list that enumerates every feature in that product.

Similarly, development on these systems happens in parallel across many teams. It is essentially impossible for any one person to keep track of everything that is being added to the system.

Leviathans are too big for anyone to understand. It doesn’t matter what architecture or runtime choices are made. It could be one massive JVM, a million lambda functions, a hundred thousand docker containers or thousands of micro-services. Even if you work on the leviathan, you won’t have any real understanding of the total state of the system. Each engineer will be aware of and communicate with a tiny fraction of the total number of people working inside the leviathan. 

Leviathans are heterogeneous systems. They do not do ‘one thing well’. Leviathans do everything you can think of. Google.com is a search engine, but it’s also a calculator, an advertising system, a web scraper, a hotel booking tool, a flight booking tool, and many more. Leviathans grow in parallel, across myriad tentacles of functionality. New features emerge all the time, usually to the surprise of other engineers on the project.

Leviathans are difficult to work in. Despite appearing to be a sea of constant change from the outside, any change made inside the leviathan is extremely expensive in engineering hours. There are thousands of potential interactions each engineering team has to consider when evaluating changes to their system. The architecture must be constrained heavily to support parallel development in environments where coordination between different teams is impossible due to scale. Engineers working on a software leviathan spend a relatively small fraction of their time actually writing code as compared to debugging issues, researching, coordinating changes, and documenting.

Leviathans are interesting because they are the ‘core’ services powering the digital world these days. Their scale is at the top of the chart in the software engineering world, and as a result they expose the limitations of software engineering.

Software diseconomies of scale are at their most evident in these software leviathans. They are massive projects with huge numbers of the best engineers working on them. But development is slow per engineer and code quality is not clearly superior to industry best practices.