New programmers sometimes ask me “what project should I work on next?” This project is one I drafted for myself because I wanted to build a more complex application with Django and PyTorch.
The Turtle Generator Project is an attempt to create a website on which people can submit pictures of turtles and vote on whether machine-generated turtles are “turtle” or “not turtle”. The user submissions and votes feed a GAN, or generative adversarial network, which both classifies and produces pictures of turtles. User submissions may also become part of our dataset of turtles.
Pages / Components
Draw or Submit turtle component
Drawing component where a visitor can use their mouse or touchpad to draw a turtle and submit it to the turtles dataset.
A voting component consisting of a picture of a turtle generated by our backend algorithms and buttons labeled “turtle” and “not turtle”.
Django frontend + PostgresDB
serves web pages and handles user interaction
votes and turtle drawing submissions are sent to the MachineLearner system via Celery
submits a “turtle image” request to the Celery queue and gets a turtle back
Possibly React or just Django templates
Online machine learning system based on PyTorch
takes celery tasks and either
generates a picture of a turtle
adds a vote submission to the training data (classification)
adds a turtle picture submission to the training data
Celery + Redis
Message queue used to handle queuing training tasks
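The three task kinds listed above can be sketched in plain Python. This is only an in-process illustration: the standard library `queue.Queue` stands in for Celery + Redis, and `TrainingData`, `handle_task`, `drain`, and the task dictionary keys are all hypothetical names I made up for the example, not part of any existing codebase. The generate branch returns a placeholder where a real worker would call the PyTorch model.

```python
from dataclasses import dataclass, field
from queue import Queue
from typing import Any


@dataclass
class TrainingData:
    """Hypothetical in-memory store; the real project would persist to the database."""
    votes: list = field(default_factory=list)     # (image_id, is_turtle) pairs
    drawings: list = field(default_factory=list)  # raw submitted images


def handle_task(task: dict, data: TrainingData) -> Any:
    """Route one queued task to one of the three actions in the list above."""
    kind = task["kind"]
    if kind == "generate":
        # Placeholder: a real worker would sample the generator network here.
        return {"image_id": task["request_id"], "pixels": None}
    if kind == "vote":
        data.votes.append((task["image_id"], task["is_turtle"]))
        return None
    if kind == "drawing":
        data.drawings.append(task["image"])
        return None
    raise ValueError(f"unknown task kind: {kind!r}")


def drain(queue: Queue, data: TrainingData) -> list:
    """Process every task currently in the queue and collect the results."""
    results = []
    while not queue.empty():
        results.append(handle_task(queue.get(), data))
    return results
```

In the real system each branch would probably become its own Celery task, so that the web frontend can enqueue votes and drawings without waiting and only block on the “turtle image” request.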
I was browsing Reddit this week and I saw a great comment from u/Swordbow.
“Thus as the obligation stretches across time, the cash flow must stretch across time.”
–Quote from u/Swordbow on reddit
Computer software isn’t like classical machinery: bugs and vulnerabilities that destroy the security of the application can be discovered after release. A year after release you might have to ship an urgent patch to protect your customers from being hacked. If your company goes bankrupt during that year, customers are in a very bad situation. Software needs to be maintained and, unlike typical machinery, it has to be maintained by its creators, not the end users.
Selling new releases of software is hard. Programs can be copied at zero marginal cost and don’t really wear out. Getting customers to buy a new version every year is a lot harder than getting customers to pay a monthly subscription. With Software as a Service, instead of getting paid when you sell a new version of your software, you get paid as long as customers use your software. And you don’t have to worry about dealing with upgrades.
Companies can reduce consumer surplus by switching to a SaaS model that requires customers to pay as long as they use the software. With SaaS, all customers pay for every update.
Project README files are typically an afterthought in the software development process. If a question comes up repeatedly, it gets added in an unstructured fashion. This is unfortunate, because the people who need READMEs the most are new engineers who are joining the team. They don’t know any of the team’s jargon. They probably don’t have a good understanding of what the project does. And they probably don’t understand the internal architecture of the project.
You want the first part of the README to be an introduction to your project. Answer the question “Why do we have this service?”
To help new engineers, use as little jargon as possible and define terms in the README.
Include a summary of the architecture of the project in the README. It should cover what abstractions you are using and why you picked the ones that you did. If you use any patterns that are not included in every project at your company, make sure to mention them in the README. The last thing you want is for people to take over the project from you, not be able to figure out why you chose these abstractions, and then remove them from the codebase.
Your README should also include the steps to get the project running. What permissions and credentials do new engineers need to run builds and integration tests? Who should they contact to get those permissions? Make sure to include the common failure cases that new engineers ask questions about.
Include a summary of the typical build process for the project. If you use make, write explanations for every make command you support and when they should be used. If you use a standard build tool like Maven, mention the extensions and plugins you use. “We use the JaCoCo plugin to ensure 80% code coverage. If you add a Spring configuration class, you can add it to the ignored list for JaCoCo.”
If you have integration or end-to-end tests in a different package, reference them in your README. Include an example of typical usage of the external package and expect people to read the README for that package if they run into trouble. Make sure to include common failure cases in the test suite. If external dependencies commonly cause your integration tests to fail, call out how a new engineer can determine that this is the case and what they should do in response.
Example Table of Contents for a README
Why does this project exist?
Where can I find additional documentation?
Where can I find our CI/CD infrastructure?
What is the basic architecture of the system?
MVC, SPA, messaging, RPC
Do we have any managed thread pools?
What are our asynchronous tasks?
What patterns do we use in our codebase?
Explain any unusual patterns you use and why you need them.
How to get builds running
What tools are needed to run builds?
What build commands and flags should a new engineer be using?
How to get tests running
What tools to use to run unit/functional/integration/end-to-end tests
Are any external packages needed?
How to retrieve the external packages
Basic commands for any external packages
How to know if the tests passed or failed.
I read REWORK by Jason Fried and David Heinemeier Hansson, the founders of Basecamp. The book is a series of short 200-500 word ‘sections’ that each elaborate on a point. There is no wasted space, and no pages full of empty words after the point has already been made. As a result the book flows incredibly well. It is a quick and light read. The ideas in the book are commonsense lessons learned from running a successful small business. A lot of the ideas are shared with agile and the ‘lean startup’ schools of thought. But REWORK is a superior book to ‘The Lean Startup’. Comparing the two books, it’s clear Hansson and Fried understand the space better.
A few points from the book stuck with me so I will go over them.
Don’t write it down
The top customer complaints will come up so often you will never be able to forget them. You shouldn’t need a long list of customer issues; if you are listening to your customers regularly, you won’t be able to ignore the top issues. If you get ten customer complaints each day and five of them are the same issue, you know what to work on.
The myth of the overnight sensation
“And on the rare occasion that instant success does come along, it usually doesn’t last —there’s no foundation there to support it.” — page 196.
I liked this phrasing of the overnight sensation. These days social media constantly spams us with success stories and lavish lifestyles we could be living. But if you are relying on luck to succeed it might not come a second time, and then you don’t have anything left.
Don’t scar on the first cut
Policies are only meant for situations that come up over and over again. You create a policy to make a common problem easier to solve. Without a policy you have to rely on judgement and escalating up the chain of command. That is expensive, but having a policy takes all the flexibility out of the situation. Don’t create policies unless it’s obvious that the issue is common and thinking about it is wasting people’s limited time.
Four letter words
Don’t use the words “Easy”, “Fast”, etc. Things are rarely done fast or easily. If they could be, we would have done them already. Using those words implies things that we probably don’t know.
Inspiration is perishable
If you want to do something, you have got to do it now. You can’t do it later because you won’t be inspired to do it later.
I started my software engineering career in consulting. We did application development for large companies that did not have enough engineers on staff to get their projects done. We had a Statement of Work and a list of stories to do. Then we handed over the code and went back to our nice benches. Consulting was a big thing: we had executives, million-dollar contracts, etc. I often ran into situations where I was spinning my wheels waiting for people to make decisions or to sell another contract. We ran some internal projects, but they were always an afterthought.
Now I work in a product company somewhat similar to the companies we used to consult with. We have what is essentially an unlimited amount of work. There isn’t any time for anyone to be ‘on the bench’. We have enough work for double our headcount. I have 5 real, valuable projects that I want to work on. It is just an issue of prioritizing and getting things done. But the thing is, a lot of stuff is falling through the cracks.
We don’t have time to do everything, but we do need a lot of things improved.
We could use an expert on CI/CD pipelines to come in for a month and upgrade our pipeline. We could use a Java performance expert to come in and help us cut end-user latency by 50%. We could use a consultant to help us break up the monolith. We have opportunities for a dozen consultants, but from the outside you don’t see the internal workings of the product company. You can’t see that we have hundreds of ways to make the product better and nowhere near enough people to do it.