Remote Work Revisited

About a week ago, on a particularly cold morning with fresh snow and ice on the streets, I decided to try working from home again. While I don’t hesitate to bike or walk to work in 20 or 30 degree weather, 3 degrees is too cold.

After making the decision I opened Slack and typed “WFH” into our office channel. About 30 minutes later, after getting coffee, I hopped back online to address my first request for clarification from an intern. I played some music and connected my external monitor to my laptop while debugging some Docker networking issues. For some reason, nearly all of my issues with Docker involve the network.

For lunch I walked over to Safeway and picked up some sandwich materials. Then I took a quick nap. At work, despite having an hour for lunch, I don’t have anywhere to nap even when I have the time.

Then it was back to debugging Docker and answering questions from the team. Overall, this working from home experience was much better than the last time I stayed home because of the cold. This time was more peaceful and I did not miss the office.

Certified Kubernetes Administrator!

I had the opportunity to take the [CKA](https://www.cncf.io/announcement/2016/11/08/cloud-native-computing-foundation-launches-certification-training-managed-service-provider-program-kubernetes), or ‘Certified Kubernetes Administrator’, exam because my employer is trying to get ‘Certified Kubernetes Service Provider’ status, which requires three certified administrators. Now, why would Certified Kubernetes Administrators or service providers be valuable for your company? The CKA is useful because it certifies a base level of knowledge and ability in Kubernetes administrators. Things you would expect from a CKA are the ability to debug clusters, perform upgrades, bootstrap clusters and handle application deployment tasks.

What does the exam test, and how? The exam tests your ability to perform operations against the Kubernetes API. The entire exam period is spent on the command line with kubectl in a standard Linux shell. The test validates your ability to ‘get things done’ in the Kubernetes environment.

Overall, I have a very positive outlook on the exam. I spent about a week preparing. I read a lot of the kubernetes.io documentation and ran through ‘Kubernetes the Hard Way’ three times, and I had previously worked with Kubernetes on the application side. I passed on the first try, but I did need all of the exam time and had to skip a few of the hard problems.

I don’t think that the CKA is essential for devops engineers or Kubernetes admins, but it is a good exam and great for filling some of the gaps you might have in your knowledge.

Getting a website up in 2017

I launched two websites in 2017, yourfoods.info and sledgianowski.com. This post goes over how I did it, what went well and where the weak points were.

### Hosting
Google Cloud Platform’s GCE is the hosting provider I chose. I think GCP has the best user interface of the public clouds, and its no-effort sustained use discounts make things simple for me. The Cloud Shell that Google offers is excellent and lets you ignore the SSH keys you would have to keep track of for AWS.

I will probably keep using GCP for my projects next year, although I am interested in testing packet.net’s bare metal hosting.

### Setup
My websites are a Ghost blog (https://ghost.org) and a Django+Postgres web application. My blog uses nginx as the frontend for SSL termination, with Ghost’s Node.js implementation as the backend.

Yourfoods.info uses an nginx frontend for SSL and gunicorn for serving the Django web service.

Let’s Encrypt is a big win for my websites. The integration between nginx and certbot, https://certbot.eff.org, is excellent, making it easy to set up SSL in minutes. The main issue I have had with it came up when my DNS was not pointing at my public IP in Google Cloud.
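The DNS problem above is easy to check for before requesting a certificate. Here is a minimal sketch, assuming you know the external IPv4 address of the VM; the function name `resolves_to` and the injectable `resolver` parameter are my own illustration, not part of certbot or any Google tooling.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.getaddrinfo):
    """Return True if any IPv4 address the hostname resolves to equals expected_ip.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    check can be exercised without real DNS (an assumption of this sketch).
    """
    try:
        records = resolver(
            hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # The name does not resolve at all, so it cannot point at our server.
        return False
    # Each record is (family, type, proto, canonname, sockaddr); for IPv4,
    # sockaddr is an (address, port) tuple.
    addresses = {sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in records}
    return expected_ip in addresses
```

Calling `resolves_to("yourfoods.info", external_ip)` with the VM's address before running certbot would catch the mismatch early instead of during the certificate challenge.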

### Domain names
I use Namecheap as my domain registrar and DNS provider. The pricing is decent and their DNS configuration support handles my use case well. Namecheap has two-factor authentication, but it’s SMS-based and somewhat wonky to use.

### Analytics
I am still using Google Analytics for user tracking. It’s free, provides a first-class UI and is very easy to set up. I gave a few free and private options a cursory glance, but they did not look like things I could set up in an hour.

### Conclusion
It’s pretty easy to set up a website these days. The Let’s Encrypt and nginx integration makes SSL quick and easy to get going, and the public cloud is great for small websites that do not use a lot of resources.

What I learned from a year of Devops

In 2017 I had the opportunity to spend a year working as a devops or platform engineer. I have mainly worked as a software engineer before, so moving into an automate-and-operate role was a bit of a leap. This was a fully remote engagement where I was embedded with the client’s first platform team and helped bootstrap it.

The project was building out continuous integration and delivery for a client of ours that had no AWS experience. Before they brought us in, they ran all their systems in their own datacenters in a Windows and .NET environment. We came in to assist with the move into the cloud and to help transition the company from .NET to Java, JavaScript and microservice development.

For the first few months we focused on building out the CI/CD with Jenkins pipelines and a great deal of AWS CLI scripting. Once we got the basics working, teams started to come out of microservices training and began developing against the platform. This was the start of operational support for us, and it set off a bit of a scramble while we tried to balance new features and the stability of the platform with hiring and onboarding.

We used Jenkins pipelines, Docker and CloudFormation to provide our users with a solid, customizable pipeline solution. Using our default templates, development teams could easily bootstrap their pipeline with CI/CD from dev to canary deploys in production. If they needed more than a stateless microservice, they could provide CloudFormation templates in their GitHub repository that would be run with each deploy to ensure the AWS environment was bootstrapped for their needs.

We started out with the intention of using Jenkins pipelines with Ansible to automate things, but the client’s team was more experienced with CloudFormation, and as a result I ended up writing most of our initial CI/CD code in a combination of Groovy and AWS CLI calls. This proved unwieldy and eventually led us to use Groovy + CloudFormation for nearly everything. CloudFormation works, but it is locked into AWS and its programming model is somewhat awkward. CloudFormation’s saving grace is the first-class integration and editor. Next time I would recommend starting with a commitment to Terraform or Ansible.

In the third quarter we started work on implementing canary deployments. Our solution ended up being a combination of a customized HTTP client with client-side load balancing and Jenkins pipelines. I started us off with a proof of concept that proved easier to write than we expected, which put us on good footing for the rest of the project. One of the client’s employees took advantage of the space we had to rewrite the shared Jenkins pipeline library in more idiomatic language, which turned out to be a great improvement.
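To give a rough idea of what client-side canary routing means, here is a minimal sketch, assuming the client knows the stable and canary backends up front and sends a fixed fraction of requests to the canary. It is not the client we built; the class name `CanaryBalancer`, the `canary_weight` parameter and the backend names are hypothetical.

```python
import random


class CanaryBalancer:
    """Pick a backend per request, sending a fixed fraction to canary hosts."""

    def __init__(self, stable, canary, canary_weight=0.05, rng=None):
        if not stable:
            raise ValueError("at least one stable backend is required")
        if not 0.0 <= canary_weight <= 1.0:
            raise ValueError("canary_weight must be between 0 and 1")
        self.stable = list(stable)
        self.canary = list(canary)
        self.canary_weight = canary_weight
        # An injectable random source keeps the routing reproducible in tests.
        self.rng = rng or random.Random()

    def choose(self):
        """Return the backend that should receive the next request."""
        # With no canary instances registered, everything goes to stable.
        if self.canary and self.rng.random() < self.canary_weight:
            return self.rng.choice(self.canary)
        return self.rng.choice(self.stable)


# Hypothetical usage: route roughly 10% of calls to the canary instance.
balancer = CanaryBalancer(["stable-1", "stable-2"], ["canary-1"], canary_weight=0.1)
target = balancer.choose()
```

In a setup like this the pipeline's job would be to register the canary backend, watch its error rate, and then either promote it or remove it from the client's list.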

We went live in Q4 and I moved on to another project. I am moving back into application development because I ended up doing 100% automation scripting instead of the 50-50 split I was expecting, so it will be good to get back to writing applications.

Team Skill Shaping

When running a team you need to balance bus factor and performance.

On an average software team of perhaps 10 people, you will naturally have people develop expertise in a particular part of the codebase. One developer will be an expert in the frontend, another in the SQL queries, the next in the controllers, etc. What you want to avoid is a bus factor of 1. Some teams try to keep every developer knowledgeable in every area of the system, which is a waste. If you have to work on every part of the system, you will not be able to master any single part. To get a bus factor of n by rotating developers through different parts of the system, you give up the efficiency that comes from letting a developer master a subsystem. My solution is to focus on getting a bus factor of 2 for each major subsystem. Have people focus on the two subsystems that they are interested in or that are available, and leave things there. Just try to avoid having two developers working on the same two subsystems. You are unlikely to have two developers get hit by the ‘bus’ at the same time.
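The rule of thumb above can be checked mechanically. Here is a minimal sketch, assuming you keep a simple mapping from each developer to the subsystems they know; the function names, developer names and subsystem names are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def bus_factors(assignments):
    """Map each subsystem to the number of developers who know it."""
    counts = defaultdict(int)
    for subsystems in assignments.values():
        for subsystem in set(subsystems):
            counts[subsystem] += 1
    return dict(counts)


def under_covered(assignments, subsystems, target=2):
    """List the subsystems whose bus factor is below the target."""
    factors = bus_factors(assignments)
    return sorted(s for s in subsystems if factors.get(s, 0) < target)


def duplicate_pairs(assignments):
    """Return pairs of developers who cover exactly the same subsystems."""
    return [
        (a, b)
        for a, b in combinations(sorted(assignments), 2)
        if set(assignments[a]) == set(assignments[b])
    ]


# Hypothetical team: each person owns two subsystems, with no identical pairs.
team = {
    "ana": ["frontend", "controllers"],
    "ben": ["controllers", "sql"],
    "cy": ["sql", "frontend"],
}
gaps = under_covered(team, ["frontend", "controllers", "sql", "deploy"])
overlaps = duplicate_pairs(team)
```

Here `gaps` flags `deploy` as the subsystem that still needs two people, and `overlaps` is empty because no two developers share the same pair of subsystems.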

Aim for a bus factor of 2 while trying to avoid a lot of overlap on subsystems, then leave things there.