Software advances are slower than you expect.

Most people think of technological advances using the eureka metaphor. But software doesn’t work like that. Take a clear technological advance like a self-driving tractor. They are on the market now but there was no eureka moment, no sudden breakthrough. Self-driving technology just advanced to the base level required for tractors on clearly mapped fields, then a team of software engineers built a working system over several years.

There’s no invention or breakthrough moment, just a slow build in none software capabilities than an investment in building out software to leverage those capabilities. 

There were no software engineering advances that enabled the self-driving tractor. Instead machine vision improved until it was good enough to unblock the software solution.

What advances in software are in sight?

Reproducible builds are one of my favorite advances in software recently. Strictly speaking reproducible builds have been available for quite some time, but that doesn’t mean it isn’t a real advance. Having a trustworthy build environment where you can debug the inputs vs the outputs for different machines is a big benefit. Without reproducible builds software engineers end up spending extra time fixing build failures and figuring out dependency resolution overrides. 

Golang is in my opinion an advancement in software engineering. Instead of focusing on productivity for a single or small number of engineers, Golang is focused on maximizing productivity for large software projects with thousands of engineers. It does this by simplifying the language as much as possible to ease readability and eliminate magical effects. Everything in Golang is very procedural and easy to follow. 

Easy to use gradual typing is a more recent advancement in the Python and Ruby worlds. You can now build your MVP in a dynamic quick evaluating language and then after you reach product market fit you can add types to the code base. Typing is no longer and either/or thing but a sliding scale where you can chose the most opportune time to transition. Overall I think gradual typing has huge advantages compared to the old standard of rewriting the codebase in Java. 

Software Leviathans and the weird dominance of good enough.

One day in Spring 1989, I was sitting out on the Lucid porch with some of the hackers, and someone asked me why I thought people believed C and Unix were better than Lisp. I jokingly answered, “because, well, worse is better.” We laughed over it for a while as I tried to make up an argument for why something clearly lousy could be good.

https://www.dreamsongs.com/WorseIsBetter.html

It has long been wondered why Java took the crown for the ‘enterprise’ language. I can’t really argue on that topic since I came onto the scene long after Java was all there was. This article is about why software leviathans are written in Java more than anything else. 

You have a huge software project to build. What language do you build it in? The prototype was written in ruby on rails by one guy and an Adderal prescription. Now they want you to scale this thing to a 1000+ engineers over 5 years of development. You might think “aha this is my chance, lets save an order of magnitude lines of code and use lisp”, except this story happened in the past and they chose Java. 

Why is it always Java? Sure it’s reasonably fast, but Facebook made PHP work, can’t we at least use Haskell? Since we have the benefit of hindsight, we know that most of the biggest software systems are built in Java. Google built so many leviathans in Java that they bankrolled a new language like Java but with less features. Amazon is based on Java. Netflix is java again. Facebook made their own language and Microsoft is old enough to have existed before Java, but still made their own version of Java, C#. 

The real question should be, “What is Java’s secret?”. 

Java requires a lot of boiler plate

Java just plays well with the major constraints in a software leviathan and at Leviathan scale that is all that matters. 

This one is the corollary of “Java doesn’t support meta-programming”. Creating your own DSL is great, 1000 engineers creating their own DSL is 999 nightmares. Software Leviathans are too big for any one engineering team to understand. Any DSL you create makes your code unintelligible to the rest of the people working in hell with you. I can understand boilerplate written by a monkey, but a DSL written by another software engineer could take me days to understand. When your team gets poached to go work on a startup where the code base isn’t humongous, it’s a lot easier to bring in Java programmers to replace you lot than it would be to get Haskell engineers to figure out your undocumented dialect. 

Google got to the point where they figured Java had too much meta-programming ability so they created Go which is basically Java without inheritance. That is what happens when you work in a leviathan project. You begin to resent the ability of your peers to do anything unusual, because you know it’s just going to be more work for you. 

Adding more onboarding time to understand 1) the functional language and 2) the DSL your team created might push our already long 6 month on-boarding period closer to the 1 year mark. I wrote an article about onboarding time and functional languages aimed at startups, but honestly I don’t think the hiring market is the real reason Java dominates the top end. FAANG is already willing to train new grads to work on their giant software projects. 

It boils down to comprehension honestly. Humans can only comprehend so many things and at leviathan scale the max is a tiny fraction of the entire system. 

In a software leviathan your team constantly works with other teams’ systems. How does this API work? There isn’t any documentation and one 30 minute office hours isn’t going to explain that hair ball. If you all use the same language and that language is Java there is a chance you can open up their code base and figure out what is going on. They probably didn’t do anything you wouldn’t expect like pre-allocating all of their memory and storing all objects into a ring buffer. But if they did do something crazy you can probably figure it out. Besides Java doesn’t have anything like Scalaz so you won’t be surprised by a functor where you weren’t expecting it. 

Lets take the opposite side, away team work. You have been given the glorious task of implementing a new feature. But it’s impossible to do it cleanly without an API change in another team’s system. That team fully supports the change and has contributed 2 paragraphs to your architecture document describing the change to make in their system. But the change isn’t on their team’s roadmap so you are going to have to do it. 

Getting their service to run and pass integration tests in your virtual development machine takes a week. Now you need to navigate their system where they have conveniently used dependency injection to ensure that you can’t know which of the 5 implementations of this interface is in play. Do you still wish the other team could use Clojure? You might never figure out the DSL. 

Have you ever looked at somebody else’s Lisp code and wondered what was inside the variables? Now imagine this is your job and you will spend the next month making a 200 line change to a 100,000 line of code API service you didn’t know existed until this week. Except this will happen every quarter for the rest of your career. 

People complain about how Java forces you to write the type of things everywhere, but for software leviathans this is a benefit. I can see helpful type signatures everywhere, whether I’m reading your code in my IDE, an email, an excerpt in an arch doc, or in a Slack message you sent me at 3 am. 

Java and Go are great in Software Leviathans. You don’t have to worry about stumbling upon a programming mystery created 10 years ago by a disgruntled new grad. You can expect a consistent syntax and language whichever microservice you are working on. The code has self-documenting types that are ‘easy’ to understand. Honestly, they are a lot of benefits which make a tough coding environment a little more manageable.

Software Leviathans

Dis-economies of scale, why FAANG pays high salaries, the dominance of Java

The top end of software engineering jobs are dominated by what I’ve started thinking of as ‘Software Leviathans’, large software systems that are staffed by thousands of engineers. A few that come to mind are Amazon Alexa, Amazon.com, Google Search, Salesforce, Facebook.com. These are not “monoliths’ or large services that do everything. Instead they are the result of combining 100s of smaller ‘micro-services’ into one massive software product. 

These leviathans do many many things, few people on the planet can claim to know all of the features of facebook.com. It is quite possible that there exists no single list that enumerates every feature in that product. 

Similarly, development on these systems happens in parallel across many teams. It it is essentially impossible for any one person to keep track of everything that is being added to the system. 

Leviathans are too big for anyone to understand. It doesn’t matter what architecture or runtime choices are made. It could be one massive JVM, a million lambda functions, a hundred thousand docker containers or thousands of micro-services. Even if you work on the leviathan, you won’t have any real understanding of the total state of the system. Each engineer will be aware of and communicate with a tiny fraction of the total number of people working inside the leviathan. 

Leviathans are heterogeneous systems. The do not do ‘one thing well’. Leviathans do everything you can think of. Google.com is a search engine, but it’s also a calculator, an advertising system, a web scraper, a hotel booking tool, a flight booking tool, and many more. Leviathans grow in parallel, across myriad tentacles of functionality. New features emerge all the time usually to the surprise of other engineers on the project. 

Leviathans are difficult to work in. Despite appearing to be a sea of constant change from the outside. Any change made inside the Leviathan is extremely expensive in engineering hours. There are thousands of potential interactions each engineering team has to consider when evaluating changes to their system. The architecture must be constrained heavily to support parallel development in environments where coordination between different teams is impossible due to scale. Engineers working on a software leviathan spend a relatively small fraction of their time actually writing code as compared to debugging issues, research, coordinating changes, and documenting. 

Leviathans are interesting because they are the ‘core’ services powering the digital world these days. Their scale is at top of the chart in the software engineering world and as a result they expose the limitations of software engineering. 

Software diseconomies of scale are at their most evident in these software leviathans. They are massive projects with huge numbers of the best engineers working on them. But development is slow per engineer and code quality is not clearly superior to industry best practices. 

Don’t use internal tooling, contribute to your tools.

There is a dichotomy in software engineering organizations, some only use public tooling, entirely avoiding building their owns tools. Other companies  follow the ‘Not invented here’ principle and try to only use software developed internally. 

There are two forces driving this split, first building software tooling is expensive, small companies often cannot afford to build and support internal tools. It is an expensive recurring cost that can easily get out of hand. Building 2 internal tools a year will set you on a path to supporting 20 tools 10 years from now eating huge chunks of your operations budget. 

The other force in play is that if you use public or off the shelf tooling you will encounter workflow discontinuities that are difficult to fix. Using off-the-shelf tool A with tool B might require an entire employee to bridge the gaps manually, while still be a pain for everyone involved. Decision makers at some companies think to themselves, “We can just build a tool B that works perfectly with our use case.” And that works great when you have one tool, but then when need C comes along your team  makes the same case again, “We already invested in custom tool B we can’t throw away that work, we need a new custom tool for need C”. And now your company is on the path towards building an alphabet’s worth of internal tools that aren’t useful outside of your business.

Luckily, there is a solution to this. Use Open Source tooling and when you run into a workflow problem, work with the maintainers to contribute a fix. Even in poorly managed projects extending the code to support your use case will be less costly than building an entirely new internal tool to solve the same problem. 

Previewing image uploads with stimulus.js

I’ve been working on a website with file uploads lately and after a bunch of googling put together a solution using stimulus.js instead of jQuery as not using jQuery is the thing now.

The website I’m working on allows people to upload images and then to tag them before sharing and I thought it was a bit sub-optimal to ask people to tag their images without being able to see them. 

I’m using rails so this component has two parts the javascript file and my html template file. In the html template file I have a html form created via the rails form_with helper. 

<%= form_with scope: :look, url: looks_path, local: true do |form| %>

  <div>
    <%= form.label :title %><br>
    <%= form.text_field :title %>
  </div>

  <div>
    <%= form.text_field :tag_list, multiple:true %>
  </div>

  <div data-controller="image-preview">
    <%= image_tag("preview.png",  :width => "100px", :hight => "100px", 
    data: { target: "image-preview.output" }) %>
  
    <div class="my-2">
      <%= form.label "Look Image" %><br>
      <%= form.file_field :image, :class => "form-control-file photo_upload", 
      data: {target: "image-preview.input", action: "image-preview#readURL" } %>
    </div>    
  </div>

  <div>
    <%= form.submit %>
  </div>
  <% end %>

You can see there that my stimulus data-controller encapsulates the form file_input and an image tag. The image tag loads a “preview.png” on page load and then replaces that white square when the user uploads an image. Ideally, you may want to play with the visibility attribute instead of using a blank square. 

// Visit The Stimulus Handbook for more details 
// https://stimulusjs.org/handbook/introduction

import { Controller } from "stimulus"

export default class extends Controller {
  static targets = [ "output", "input" ]

  readURL() {
    var input = this.inputTarget
    var output = this.outputTarget

    if (input.files && input.files[0]) {
      var reader = new FileReader();

      reader.onload = function () {
       output.src = reader.result
     }

     reader.readAsDataURL(input.files[0]);
   }
 }

}

The stimulus.js controller is here. It uses two targets the “input” which is the form file field and the “output” which is the image tag we are updating with the new image. This javascript uses the FileReader (https://developer.mozilla.org/en-US/docs/Web/API/FileReader) API to read the image off of the user’s computer and display it without needing to move the image across the network. 

This ended up being pretty simple to do in stimulus.js. I like that we don’t need QuerySelectors or anything non-deterministic like that and can just reference our targets directly. I haven’t included the webpack and rails integration code, it was a little tricky, and is a topic for another blog post.