New programmers sometimes ask me “what project should I work on next?” This project is one I drafted for myself because I wanted to build a more complex application with Django and PyTorch.
The Turtle Generator Project is an attempt to create a website where people can submit pictures of turtles and vote on whether machine-generated turtles are “turtle” or “not turtle”. The submissions and votes feed a GAN (generative adversarial network) that both classifies and produces pictures of turtles. User submissions may also be used as part of our dataset of turtles.
Pages / Components
Draw or Submit turtle component
Drawing component where a visitor can use their mouse or touchpad to draw a turtle and submit it to the turtles dataset.
A component consisting of a picture of a turtle generated by our backend algorithms and a button which says “turtle” or “not turtle”.
Django frontend + PostgresDB
serves web pages and handles user interaction
votes and turtle drawing submissions are submitted to the MachineLearner system via Celery
submits a “turtle image” request to the Celery queue and gets a turtle back
Possibly React or just Django templates
Online machine learning system based on PyTorch
takes Celery tasks and either
generates a picture of a turtle
adds a vote submission to the training data ( classification )
adds a turtle picture submission to the training data
Celery + Redis
Message queue used to handle queuing training tasks
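The three kinds of task the learner handles can be sketched without any of the real infrastructure. This is a minimal sketch that uses Python's standard-library queue as a stand-in for Celery + Redis and a placeholder in place of the PyTorch generator. The task names, the TrainingData class, and the payload fields are all hypothetical, chosen only for illustration.

```python
import queue
from dataclasses import dataclass, field

# Hypothetical task names; a real system would register these as Celery tasks.
GENERATE = "generate_turtle"
ADD_VOTE = "add_vote"
ADD_SUBMISSION = "add_submission"


@dataclass
class TrainingData:
    """In-memory stand-in for the training data kept by the learner."""
    votes: list = field(default_factory=list)        # (image_id, is_turtle) pairs
    submissions: list = field(default_factory=list)  # image bytes drawn by visitors


def handle_task(task, data):
    """Dispatch one queued (kind, payload) task to the matching action."""
    kind, payload = task
    if kind == GENERATE:
        # Placeholder: a real worker would sample from the generator network.
        return b"fake-turtle-image"
    if kind == ADD_VOTE:
        data.votes.append((payload["image_id"], payload["is_turtle"]))
        return None
    if kind == ADD_SUBMISSION:
        data.submissions.append(payload["image"])
        return None
    raise ValueError(f"unknown task kind: {kind}")


def drain(tasks, data):
    """Process every queued task in order and collect the results."""
    results = []
    while not tasks.empty():
        results.append(handle_task(tasks.get(), data))
    return results


# Example: the web frontend enqueues work, the learner drains it.
tasks = queue.Queue()
tasks.put((ADD_VOTE, {"image_id": 7, "is_turtle": True}))
tasks.put((ADD_SUBMISSION, {"image": b"visitor-drawing"}))
tasks.put((GENERATE, None))
data = TrainingData()
results = drain(tasks, data)
```

Only the generate task returns something to the caller. The other two just grow the training data, which is why they fit a fire-and-forget message queue.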
I was surfing Reddit this week and I saw a great comment from u/Swordbow.
“Thus as the obligation stretches across time, the cash flow must stretch across time.”
–Quote from u/Swordbow on reddit
Computer software is unlike classical machinery in that bugs and vulnerabilities that destroy the security of the application can be discovered after release. A year after release you might have to ship an urgent patch to protect your customers from attackers. If your company goes bankrupt during that year, customers are left in a very bad situation. Software needs to be maintained and, unlike typical machinery, it has to be maintained by its creators, not the end users.
Selling new releases of software is hard. Programs can be copied at zero marginal cost and don’t really wear out. Getting customers to buy a new version every year is a lot harder than getting customers to pay a monthly subscription. With Software as a Service, instead of getting paid when you sell a new version of your software, you get paid for as long as customers use your software. And you don’t have to worry about dealing with upgrades.
Companies can reduce consumer surplus by switching to a SaaS model that requires customers to pay for as long as they use the software. With SaaS, all customers pay for every update.
Project README files are typically an afterthought in the software development process. If a question comes up repeatedly, the answer gets added in an unstructured fashion. This is unfortunate, because the people who need READMEs the most are new engineers who are joining the team. They don’t know any of the team’s jargon. They probably don’t have a good understanding of what the project does. And they probably don’t understand the internal architecture of the project.
You want the first part of the README to be an introduction to your project. Answer the question “Why do we have this service?”.
To help new engineers, use as little jargon as possible and define the terms you do use in the README.
Include a summary of the architecture of the project in the README. It should cover what abstractions you are using and why you picked the ones that you did. If you use any patterns that are not included in every project at your company, make sure to mention them in the README. The last thing you want is for the people who take over the project from you to be unable to figure out why you chose these abstractions and then remove them from the codebase.
Your README should also include the steps to get the project running. What permissions and credentials do new engineers need to run builds and integration tests? Who should they contact to get those permissions? Make sure to include the common failure cases that new engineers ask questions about.
Include a summary of the typical build process for the project. If you use make, write explanations for every make command you support and when each should be used. If you use a standard build tool like Maven, mention the extensions and plugins you use. For example: “We use the JaCoCo plugin to ensure 80% code coverage. If you add a Spring configuration class, you can add it to the ignored list for JaCoCo.”
If you have integration or end-to-end tests in a different package, reference that package in your README. Include an example of typical usage of the external package and expect people to read the README for that package if they run into trouble. Make sure to document the common failure cases of the test suite. If external dependencies commonly cause your integration tests to fail, call out how a new engineer can determine that this is the case and what they should do in response.
Example Table of Contents for a README
Why does this project exist?
Where can I find additional documentation?
Where can I find our CI/CD infrastructure?
What is the basic architecture of the system?
MVC, SPA, messaging, RPC
Do we have any managed thread pools?
What are our asynchronous tasks?
What patterns do we use in our codebase?
Explain any unusual patterns you use and why you need them.
How to get builds running
What tools are needed to run builds?
What build commands and flags should a new engineer be using?
How to get tests running
What tools are used to run unit/functional/integration/end-to-end tests?
Are any external packages needed?
How to retrieve the external packages
Basic commands for any external packages
How to know if the tests passed or failed
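One way to keep a README from drifting away from an outline like this is to check for the sections automatically. This is a minimal sketch, assuming each section title appears on its own line in the README (optionally behind markdown-style # markers). The function name and the choice of which sections count as required are my own assumptions, not part of any standard tool.

```python
import re

# A subset of the section titles from the outline above; adjust to your own README.
REQUIRED_SECTIONS = [
    "Why does this project exist?",
    "What is the basic architecture of the system?",
    "How to get builds running",
    "How to get tests running",
]


def missing_sections(readme_text, required=REQUIRED_SECTIONS):
    """Return the required section titles that do not appear as a line in the README."""
    # Normalize each line: trim whitespace, drop leading '#' markers, lowercase.
    lines = {
        re.sub(r"^#+\s*", "", line.strip()).lower()
        for line in readme_text.splitlines()
    }
    return [title for title in required if title.lower() not in lines]


# Example: a README that only covers two of the four required sections.
sample = "# Why does this project exist?\nSome text.\n## How to get builds running\n"
gaps = missing_sections(sample)
```

A check like this could run in CI so that a missing section fails the build rather than waiting for a new engineer to notice it.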