Remote Work Revisited

About a week ago on a particularly cold morning with fresh snow and ice on the streets, I decided to try working from home again. While I don’t hesitate to bike or walk to work in 20 or 30 degree weather, 3 degrees is too cold.

After making the decision, I opened Slack and typed “WFH” into our office channel. About 30 minutes later, after getting coffee, I hopped back online to address my first intern request for clarity. I played some music and connected my external monitor to my laptop while debugging some Docker networking issues. For some reason, nearly all of my issues with Docker involve the network.

For lunch I walked over to Safeway and picked up some sandwich materials. Then I took a quick nap. At work, despite having an hour for lunch, I don’t have anywhere to nap even if I have time.

Then it was back to debugging Docker and answering questions from the team. Overall, this working from home experience was much better than the last time I stayed home due to the cold. This time was more peaceful and I did not miss the office.

Certified Kubernetes Administrator!

I had the opportunity to take the [CKA](https://www.cncf.io/announcement/2016/11/08/cloud-native-computing-foundation-launches-certification-training-managed-service-provider-program-kubernetes), or ‘Certified Kubernetes Administrator’, exam because my employer is trying to get ‘Certified Kubernetes Service Provider’ status, which requires 3 certified administrators. Now, why would Certified Kubernetes Administrators or service providers be valuable for your company? The CKA is good because it certifies a base level of knowledge and ability in Kubernetes administrators. Things you would expect from CKAs are the ability to debug clusters, perform upgrades, bootstrap clusters and handle application deployment tasks.

What does the exam test, and how? The exam tests your ability to perform operations against the Kubernetes API. The entire exam period is spent on the command line with kubectl in a standard Linux shell. The test validates your ability to ‘get things done’ in the Kubernetes environment.

Overall, I have a very positive outlook on the exam. I spent about a week preparing. I read a lot of the kubernetes.io documentation, ran through ‘Kubernetes the Hard Way’ three times and had worked with Kubernetes on the application side previously. I passed on the first try, but I did need all of the exam time and had to skip a few of the hard problems.

I don’t think that the CKA is essential for devops or Kubernetes admins, but it is a good exam and great for filling some of the gaps you might have in your knowledge.

Getting a website up in 2017

I launched two websites in 2017, yourfoods.info and sledgianowski.com. This post goes over how I did it, what went well and where the weak points were.

### Hosting
Google Cloud Platform’s GCE is the hosting provider I chose. I think GCP has the best user interface of the public clouds, and its no-effort sustained use discounts make things simple for me. The Cloud Shell that Google offers is excellent and lets you ignore the SSH keys you would have to keep track of for AWS.

I will probably keep using GCP for my projects next year, although I am interested in testing packet.net’s bare metal hosting.

### Setup
My websites are a Ghost blog (https://ghost.org) and a Django+Postgres web application. My blog uses nginx as the frontend for SSL encryption, with Ghost’s Node.js implementation as the backend.

Yourfoods.info uses an nginx frontend for SSL and gunicorn for serving the Django web service.

Let’s Encrypt is a big win for my websites. The integration between nginx and certbot (https://certbot.eff.org) is excellent, making it easy to set up SSL in minutes. The main issue I have had with it came up when my DNS was not pointing at my public IP in Google Cloud.

### Domain names
I use Namecheap as my domain registrar and DNS provider. The pricing is decent and their DNS configuration support handles my use case well. Namecheap has two-factor authentication, but it’s SMS-based and somewhat wonky to use.

### Analytics
I am still using Google Analytics for user tracking. It’s free, provides a first-class UI and is very easy to set up. I gave a few free and private options a cursory glance, but they did not look like things I could set up in an hour.

### Conclusion
It’s pretty easy to set up a website these days. The Let’s Encrypt and nginx integration makes SSL quick and easy to get going. And the public cloud is great for small websites that do not use a lot of resources.

What I learned from a year of Devops

In 2017 I had the opportunity to spend a year working as a devops or platform engineer. I had mainly worked as a software engineer before, so moving into an automate-and-operate role was a bit of a leap. This was a fully remote engagement where I was embedded with the client’s first platform team and helped bootstrap it.

The project was building out continuous integration and delivery for a client of ours that had no AWS experience. Before they brought us in, they ran all their systems in their own datacenters in a Windows and .NET environment. We came in to assist with the move into the cloud and to help transition the company from .NET to Java, JavaScript and microservice development.

For the first few months we focused on building out the CI/CD with Jenkins pipelines and a great deal of AWS CLI scripting. Once we got the basics working, teams started to come out of microservices training and began developing against it. This was the start of operational support for us, and it set off a bit of a scramble while we tried to balance new features and the stability of the platform with hiring and onboarding.

We used Jenkins pipelines, Docker and CloudFormation to provide our users with a solid, customizable pipeline solution. Using our default templates, development teams could easily bootstrap their pipeline with CI/CD from dev to canary deploys in production. If they needed more than a stateless microservice, we enabled them to provide CloudFormation templates in their GitHub repository that would be run with each deploy to ensure the AWS environment was bootstrapped for their needs.

We started out with the intention of using Jenkins pipelines with Ansible to automate things, but the client’s team was more experienced with CloudFormation, and as a result I ended up writing most of our initial CI/CD code in a combination of Groovy and AWS CLI calls. This proved unwieldy and eventually led us to using Groovy + CloudFormation for nearly everything. CloudFormation works, but it is locked into AWS and its programming model is somewhat awkward. CloudFormation’s saving grace is the first-class integration and editor. Next time I would recommend starting with a commitment to Terraform or Ansible.

In the 3rd quarter we started work on implementing canary deployments. Our solution ended up being a combination of a customized HTTP client that did client-side load balancing and Jenkins pipelines. I started us off with a proof of concept that proved easier to write than we expected, which put us on good footing for the rest of the project. One of the client’s employees took advantage of the space we had to rewrite the shared Jenkins pipeline library in more idiomatic language, which turned out to be a great improvement.
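
To make the client-side load balancing idea concrete, here is a minimal sketch in Go of how an HTTP client could split traffic between a stable and a canary version by weight. It is not the client’s implementation, which is not shown in this post. The `Backend` type, the `pick` and `totalWeight` helpers, the host names and the 95/5 split are all hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Backend is one deployed version of a service (hypothetical type).
type Backend struct {
	Name   string
	URL    string
	Weight int // relative share of traffic
}

// totalWeight sums the weights of all backends.
func totalWeight(backends []Backend) int {
	total := 0
	for _, b := range backends {
		total += b.Weight
	}
	return total
}

// pick walks the backends and returns the one whose weight range contains roll.
// roll is expected to be in [0, totalWeight(backends)); taking it as a
// parameter keeps the function deterministic and easy to test.
func pick(backends []Backend, roll int) Backend {
	for _, b := range backends {
		if roll < b.Weight {
			return b
		}
		roll -= b.Weight
	}
	// Out-of-range rolls fall back to the last backend.
	return backends[len(backends)-1]
}

func main() {
	// Hypothetical hosts and weights: roughly 5% of calls go to the canary.
	backends := []Backend{
		{Name: "stable", URL: "http://orders-stable.example", Weight: 95},
		{Name: "canary", URL: "http://orders-canary.example", Weight: 5},
	}

	total := totalWeight(backends)
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pick(backends, rand.Intn(total)).Name]++
	}
	fmt.Println(counts)
}
```

In a setup like this, a pipeline would only need to change the weights to shift traffic toward or away from the canary.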

We went live in Q4 and I moved on to another project. I am moving back into application development because I ended up doing 100% automation scripting instead of the 50-50 split I was expecting. It will be good to get back to writing applications.

Team Skill Shaping

When running a team you need to balance bus factor and performance.

On an average software team of perhaps 10 people, developers naturally build expertise in a particular part of the codebase. One developer will be an expert in the frontend, another in the SQL queries, the next in the controllers, and so on. What you want to avoid is a bus factor of 1, where only one person understands a subsystem.

Some teams try to keep every developer knowledgeable in every area of the system. This is a waste. If you have to work on every part of the system, you will not be able to master any single part. To get a bus factor of n by rotating developers through different parts of the system, you give up the efficiency that comes from letting a developer master a subsystem.

My solution is to focus on getting a bus factor of 2 for each major subsystem. Have people focus on the two subsystems that they are interested in or that are available, and leave things there. Just try to avoid having two developers working on the same two subsystems. You are unlikely to have two developers get hit by the ‘bus’ at the same time.

Aim for a bus factor of 2 while trying to avoid a lot of overlap on subsystems, then leave things there.
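
As a rough illustration of the rule above, the Go sketch below counts how many developers cover each subsystem and lists the subsystems that fall short of a bus factor of 2. The developer names, subsystem names, and the `busFactors` and `atRisk` helpers are all made up for the example.

```go
package main

import "fmt"

// busFactors counts how many developers know each subsystem.
// assign maps a developer to the subsystems they focus on.
func busFactors(assign map[string][]string) map[string]int {
	counts := map[string]int{}
	for _, subsystems := range assign {
		for _, s := range subsystems {
			counts[s]++
		}
	}
	return counts
}

// atRisk returns the subsystems covered by fewer than two developers,
// in the order they appear in subsystems.
func atRisk(subsystems []string, assign map[string][]string) []string {
	counts := busFactors(assign)
	var risky []string
	for _, s := range subsystems {
		if counts[s] < 2 {
			risky = append(risky, s)
		}
	}
	return risky
}

func main() {
	// Hypothetical team: each developer focuses on two subsystems,
	// and no two developers share the same pair.
	assign := map[string][]string{
		"ana": {"frontend", "sql"},
		"ben": {"sql", "controllers"},
		"cho": {"controllers", "frontend"},
	}
	subsystems := []string{"frontend", "sql", "controllers", "deploy"}

	fmt.Println("needs a second person:", atRisk(subsystems, assign))
}
```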

Changing KPIs — A tale of moving from individual contributor to team lead.

The biggest change after my move to team lead is that my KPIs (key performance indicators) have changed significantly. I still troubleshoot bugs, create architecture, and discuss architectures with teammates and persuade them. I get to write some code here and there. But the work that I am evaluated on is different. Instead of being evaluated on my ability to get coding done, to resolve bugs and be a good team member, I am evaluated based on the team’s performance. Was I able to keep everyone on the team from being blocked this sprint? Was I able to keep people on the team coordinated such that they didn’t duplicate code or write incompatible interfaces? Do we have the architecture and stories hashed out far enough ahead to keep working towards the release?

It’s been kind of a shock to me because I will be giving my update in standup, trying to remember what I did yesterday, and it’s something along the lines of “I sat in on a couple meetings, reviewed PRs and helped classify several bugs.” I worked all day and am exhausted now, but I didn’t commit any code or make any progress on the story I assigned to myself. It feels like I’m not getting anything done. What happened? I used to be good at my job.

But despite feeling like nothing is getting done, I am still hitting my KPIs as a team lead. My bosses are happy, the team seems happy enough with my work, the scrum master has what he needs, etc. It’s not that I am not getting any work done, it’s that my ‘work’ has changed to something different. I am focused more on coordinating the team’s effort and planning what we need to do next, keeping abreast of the features coming down the roadmap, and keeping track of technical debt and the maintenance work we need to do.

Starting a new Go project is delightful.

I started a couple of projects recently: a mock crypto exchange and, my latest, a Unicode manipulation library. What struck me is that it’s really simple to get started. You need the Go toolchain and a GOPATH set up.

Then you specify the package and the main function, and that is a valid program.

package main

func main() {}

Above is my program so far. It is a short program with a bit of exploratory code that converts strings into their Unicode rune ids. Back when I mainly used Java, I would have had to set up an IDE, integrate Maven and think about package structure. In Go I just have a main.go file. If I need dependencies, I create a /vendor directory and use go get github.com/gin-gonic/gin to pull the dependency.
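
For a sense of what that kind of exploratory code might look like, here is a minimal sketch that turns a string into its rune ids. It is an illustration, not the code from my library. The `runeIDs` helper and the sample string are assumptions made for this example.

```go
package main

import "fmt"

// runeIDs returns the Unicode code point of every rune in s.
// Ranging over a string decodes UTF-8, so a multi-byte character
// yields a single rune rather than several bytes.
func runeIDs(s string) []rune {
	ids := make([]rune, 0, len(s))
	for _, r := range s {
		ids = append(ids, r)
	}
	return ids
}

func main() {
	// "h\u00e9llo" contains one two-byte character (U+00E9).
	for _, r := range runeIDs("h\u00e9llo") {
		// %q prints the quoted character, %U the U+XXXX form, %d the decimal id.
		fmt.Printf("%q = %U (%d)\n", r, r, r)
	}
}
```

The whole thing still fits in a single main.go file with only the standard library.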

Overall, the lightweight Go tooling makes it very easy to build small programs.

Code Challenge Usability Testing

Coding challenges have become a common step in the software engineering interview process. I have had to review a couple of challenges, and the biggest mistake is a lack of usability testing. People are usually in a rush when they do coding challenges, either to limit the amount of time they are spending, or because they decided to attempt something challenging and time is running out. When we do side projects, we often think that because it’s just us, fewer complications will come up and our estimates will be more accurate. But the reality is that the project is still going to take longer than you expected.

Since most coding challenges end up being rushed, it’s surprisingly easy to stand out through usability testing. Ask a friend to clone your repository and try to get the code running without any support on your end. A surprising number of code samples and challenges don’t even start once they get into the interviewers’ hands. Or even worse, your code works perfectly, but the interviewers don’t realize it because you didn’t document what the ‘working’ state looks like well enough.

It might seem like a simple thing to do, but just having two people try to start your app and give you feedback can result in a smooth, stress-free review of your code sample and give you a strong foot in the door.

Using pprof to examine the heap and memory usage in Golang Programs

I had some trouble finding the info I needed to set up pprof in my program and the steps to get actionable data out of it, so here is my attempt to provide the minimum steps needed to use pprof.

Instrument your code

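package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
)

func main() {
	// Serve pprof data in the background so it does not block the program.
	go func() { log.Println(http.ListenAndServe("localhost:6060", nil)) }()

	// Your program
}
Make sure you have the above in your main.go file. This sets up a web server that serves pprof data at localhost:6060/debug/pprof/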
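
If you want to confirm that the handlers are registered before pointing go tool pprof at a long-running program, a quick check along these lines may help. It is only a sketch: the `firstHeapLine` helper is hypothetical, and it uses an httptest server on a random free port as a stand-in for localhost:6060.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
	"strings"
)

// firstHeapLine serves the default mux on a throwaway test server,
// requests the text form of the heap profile (debug=1) and returns
// its first line.
func firstHeapLine() (string, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/pprof/heap?debug=1")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.SplitN(string(body), "\n", 2)[0], nil
}

func main() {
	line, err := firstHeapLine()
	if err != nil {
		fmt.Println("pprof endpoint not reachable:", err)
		return
	}
	// The text form of the heap profile starts with a "heap profile:" summary line.
	fmt.Println(line)
}
```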

Heap Memory Usage

In your shell run:
go tool pprof http://localhost:6060/debug/pprof/heap
This will open an interactive command-line program. Enter top at the prompt.
This gives you the top 10 nodes of memory usage. It will truncate the results if most of the memory is in the top 3 nodes.

CPU time

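To sample 30 seconds of CPU time (the default duration for this endpoint), run the command below. pprof then reports how much CPU time it actually sampled during that window, for example 50ms.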
go tool pprof http://localhost:6060/debug/pprof/profile
pprof Godoc: https://golang.org/pkg/net/http/pprof/
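
For programs that do not run an HTTP server, the standard library’s runtime/pprof package can capture a CPU profile directly. This is a different route from the net/http/pprof endpoint used above, shown here only as a sketch. The `busyWork` and `captureCPUProfile` helpers and the cpu.prof file name are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime/pprof"
	"time"
)

// busyWork spins for roughly d so the profiler has CPU time to sample.
func busyWork(d time.Duration) int {
	deadline := time.Now().Add(d)
	n := 0
	for time.Now().Before(deadline) {
		n++
	}
	return n
}

// captureCPUProfile profiles busyWork for the given number of
// milliseconds and returns the raw profile bytes.
func captureCPUProfile(ms int) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	busyWork(time.Duration(ms) * time.Millisecond)
	// StopCPUProfile returns only after the profile has been fully written.
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

func main() {
	data, err := captureCPUProfile(500)
	if err != nil {
		fmt.Println("could not profile:", err)
		return
	}
	// cpu.prof is a hypothetical file name.
	if err := os.WriteFile("cpu.prof", data, 0o644); err != nil {
		fmt.Println("could not write profile:", err)
		return
	}
	fmt.Printf("wrote %d bytes to cpu.prof\n", len(data))
}
```

The resulting file can be opened with go tool pprof cpu.prof, and the top command works the same way as it does for the heap profile.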