Building a culture of documentation

Vinay Raghu - Software Developer

I work on a team that serves as a platform for other teams so they can quickly spin up new web applications and deploy them. Part of my job is to make architectural decisions and provide support for these teams who are technically my “customers”.

Having been a customer of several such internal teams myself, I have found that one of the factors that matters most to a good developer/customer experience is good documentation. More important than building a good set of documents is building a culture of documentation. Good documentation needs maintenance, which can only happen if everyone on the team is incentivized to maintain it.

What kind of documentation are we talking about?

Before we proceed, I’d like to explain what kind of documentation we are talking about, because there are multiple kinds. One kind is fairly code-adjacent and easy to build and maintain. Things like Swagger API docs or the typical repo README that helps with local setup are usually available because they are essential to getting work done.

The kind of documentation that is not commonly available covers things like workflow, processes, key decisions and code style guides. These are key things that affect a developer’s day-to-day experience in a codebase, but a lot of the time there is no documentation for them. One main reason this type of documentation is hard to find is that these practices are the outcome of passive consensus that evolves over time.

Isn’t code itself documentation?

It is, to an extent. Code provides you information regarding what decisions were made - what framework was picked, what libraries are used, how we call APIs. What’s usually missing is context – or why these decisions were made.

Think of some of these documents as analogous to good code comments. When you look at the code, you can already infer what it does. A good code comment would tell you why something is being done.

Documenting decisions

In my team, we started documenting key technical decisions like:

Our approach to CSS
Which framework we use to build our apps
Programming language of choice - TypeScript vs JavaScript vs Elm

The contents of the document in the picture below are deliberately obfuscated because they are immaterial. The overall format for this kind of documentation is to list a set of choices and evaluate their pros and cons. Once a final decision is made after discussing with the team, this document is published for posterity.
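The format described above (a list of choices, each with pros and cons, plus the final decision) could be captured roughly as follows. This is only a sketch: the type names, field names, and the renderDecision helper are hypothetical, and the sample values are placeholders rather than our actual evaluation.

```typescript
// Hypothetical shape of a decision document. Field names are illustrative
// assumptions, not the team's actual template.
interface DecisionOption {
  label: string;
  pros: string[];
  cons: string[];
}

interface DecisionRecord {
  title: string;
  optimizingFor: string[];
  options: DecisionOption[];
  chosen: string; // must match the label of one listed option
  acceptedTradeoffs: string[];
}

// Render the record as plain text suitable for pasting into a wiki page.
function renderDecision(record: DecisionRecord): string {
  const known = record.options.some((option) => option.label === record.chosen);
  if (!known) {
    throw new Error(`Chosen option "${record.chosen}" is not among the listed options`);
  }
  const lines: string[] = [`Decision: ${record.title}`, "", "Optimizing for:"];
  for (const goal of record.optimizingFor) lines.push(`  * ${goal}`);
  for (const option of record.options) {
    lines.push("", `Option: ${option.label}`);
    for (const pro of option.pros) lines.push(`  + ${pro}`);
    for (const con of option.cons) lines.push(`  - ${con}`);
  }
  lines.push("", `Chosen: ${record.chosen}`, "Accepted tradeoffs:");
  for (const tradeoff of record.acceptedTradeoffs) lines.push(`  * ${tradeoff}`);
  return lines.join("\n");
}

// Placeholder content for illustration only.
const exampleDecision: DecisionRecord = {
  title: "Programming language of choice",
  optimizingFor: ["placeholder goal"],
  options: [
    { label: "TypeScript", pros: ["placeholder pro"], cons: ["placeholder con"] },
    { label: "JavaScript", pros: ["placeholder pro"], cons: ["placeholder con"] },
    { label: "Elm", pros: ["placeholder pro"], cons: ["placeholder con"] },
  ],
  chosen: "TypeScript",
  acceptedTradeoffs: ["placeholder tradeoff"],
};

const renderedDecision = renderDecision(exampleDecision);
console.log(renderedDecision);
```

Keeping "optimizing for" and "accepted tradeoffs" as explicit fields is a design choice in this sketch: it forces whoever writes the document to state the context, not just the outcome.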

What does this achieve?

Context

The biggest win is that it gives a new engineer, who may be onboarding several months after these discussions took place, a full picture of the decisions that were made. It shows which problems we chose to optimize for and which tradeoffs we were okay with in the process.

It also shows the different choices we considered, which I think makes for a much better onboarding experience than “here’s a codebase, make some code”.

Saves time

We hire opinionated engineers because having strong opinions (within reason) is a mark of a good senior engineer. We fully expect everyone coming in to question and reevaluate these choices. However, it’s hard for any one person or group of people to have to explain and defend their choices constantly. This kind of documentation scales really well because any such question can be answered with a link to a page that contains more context and provides deeper insight.

Removes subjectivity

Another benefit is that it removes subjectivity from both the decision-making process and the questioning process. The document aims to be as objective as possible. It avoids conversations such as “we used to do X at my last job” or “I like doing X because” and puts the focus on what’s good for the project, not the person.

Decision framework

It serves as a template for any and all decision-making in the team. It’s nice to see this process continued for any new decisions that are made. This way the documentation process is democratized and all decisions follow a standard framework.

Avoids bike-shedding

Software always has tradeoffs, and it is inherently easy but unproductive to argue about them. With a document like this, we are clearly stating that we accept the tradeoffs that come with the decision we made, thereby avoiding any kind of bike-shedding around known tradeoffs when people want to propose changes.

Method to propose changes

One thing that is essential to maintaining a codebase over time is the ability to adapt to changes and reevaluate decisions. Since decisions are documented, it is a lot easier to propose changes that are no-brainers.

Do some assumptions still make sense?

E.g. Are we still optimizing to “get out the door quickly”?

Are there other choices that were not considered?

E.g. Was GraphQL not considered because it wasn’t as popular when the original decision was made?

The end result of having good documentation like this is a good onboarding experience - it not only familiarizes the new engineer with the codebase but also provides more rounded context around the whole project and the team. It encourages people to follow a similar process of documentation. It becomes a rite of passage for anyone coming in to update anything outdated they encounter. Also, I think it greatly reduces the friction between someone being added to the codebase and producing code, which is a huge metric if you are scaling a team from 0 to 60 engineers.

What format of documentation?

Different strokes for different folks as everyone’s learning style is different. But the key is to aim for use-case based documentation such as “how do I X”.

This is because people are more likely to consume this content when it directly impacts their ability to get work done, as opposed to optional reading. Although some like blogs and others like videos, the best kind of documentation IMO is the written kind, because it is easier to skim than other formats.

However, creating video content is a lot easier these days, now that everything is a call and you can simply hit the record button. Anytime I jump on a pair-programming session, if it is a common problem, we record the session and add it to the library so others can peruse it as well.

In addition, we have:

How-to articles
Tech talks diving deep into specific topics
Open-ended office hours for people to come in with questions, suggestions and concerns
A bunch of screencasts

The answer to what kind of documentation works best is all of the above.

Centralize documentation

Pick a tool - we used Confluence, because we were already in the ecosystem. Others have had success with Notion or similar tools. The point of centralizing documentation is that information does not get fragmented. For example, we used a GitLab repo and markdown files for some docs but learned that designers and PMs don’t have GitLab access. So we decided on a single location in Confluence and kept that as our central store.

Optimize for discovery

The key is “findability” - you don’t want people to spend a lot of time looking for a doc. We build indexes and reorganize content pretty often for better discovery. This means the folder structure keeps evolving over time, but that’s okay because Confluence can remember references and adjust them even when a page is moved.

Advantages

Fewer meetings

One huge advantage of building a knowledge base such as this is fewer meetings. Oftentimes, questions can be answered with a link to a document. E.g. “How do I request changes to a global component” is documented in an intake-request doc. “When can I expect this to be done” is codified in an SLA doc. People needing such changes often require both pieces of info and are sent links to both.

Less 1:1 troubleshooting

There is also less 1:1 support and troubleshooting. We document common issues as they arise, and part of everyone’s onboarding is to build on the existing troubleshooting document. That’s usually their first MR into the repo. This removes a lot of the troubleshooting where someone needs to buddy up with a newcomer to work through questions like:

Are you on a Mac or a PC?
Are you a vendor with your own machine or a full-timer with a company-provided laptop?
Are you on the right VPN group? How do you verify?
Do you have the right version of Node? (solved by Volta)

Empowered team

What I really liked about the whole approach was that we always talked about how important documentation is, which made it a first-class citizen. That means you can spend time in the sprint documenting work so others can pick it up easily. All subject matter expertise is written down, so we are not blocked if one person is on vacation.

Research spikes produce documentation - any decision on framework or tooling is documented. Code style guidelines are documented and constantly updated by the team. It’s lovely to see the team pick up on “oh, this seems like it should be documented” or to see comments like this in a code review: “we decided not to do this - see this document for more info”. This habit really starts to spread across the org as teams grow and more people get added to this repo.

This is powerful because this isn’t me manually writing down documents or asking people to write things down. Because they have been put into a situation where writing things down is valued, they automatically pick it up.

Problems

Before I wrap up, I’d like to talk about some disadvantages of these approaches.

Too much documentation

There is such a thing as too much documentation - if there’s so much written down that someone needs days to get up to speed, then they are most likely not going to read it. This is partly solved by positioning documentation around use cases, so you know that if someone needs to get something done they are going to read it. Even then, a wall of text is very intimidating to someone who just wants to get work done.

Not reading the documentation

Every once in a while, you’ll encounter a character who refuses to read what’s already documented and will either ping you 1:1 or ask questions in a group channel. One thing that comes in very handy is being able to link to a specific section in the document - where each heading is a page-level link. That goes a long way, because you can point folks to the exact question they want answered. It doubles as a way of shaming them for not doing a basic search. There is no way to avoid this problem entirely, though.
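To illustrate the idea of linking to a specific heading, here is a minimal sketch of building such a link. The slug scheme, the sectionLink helper and the example URL are all assumptions made for illustration; Confluence and other tools generate their own anchor formats, so in practice you would copy the link the tool gives you.

```typescript
// Hypothetical helper: turns a heading into a URL fragment.
// The slug format is an assumed scheme, not the one any particular wiki uses.
function slugifyHeading(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one dash
    .replace(/^-+|-+$/g, ""); // drop leading and trailing dashes
}

// Build a link that points at one heading of a documentation page.
function sectionLink(pageUrl: string, heading: string): string {
  return `${pageUrl}#${slugifyHeading(heading)}`;
}

// Example with a made-up page URL.
const vpnLink = sectionLink(
  "https://docs.example.com/onboarding",
  "Are you on the right VPN group?"
);
console.log(vpnLink);
```

Answering a repeated question then becomes a matter of pasting one link rather than retyping the explanation.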

Conclusion

In conclusion, I’ve found that starting with documentation and treating it as a first-class citizen enables teams to scale better and spend less time troubleshooting or onboarding someone. This comes in especially handy in a world where everyone is remote (thanks Coronavirus) and timezones are not always conducive to jumping on a call. There are usually not a lot of downsides to writing things down, but it does take deliberate effort to build a culture in which documentation is maintained and updated.