How we open sourced docs.github.com
A lot of work went into figuring out how to sync a public and private docs repo.
Last week we open sourced all of GitHub’s product documentation, along with the Node.js web application that powers it. Check out our new public repository at github.com/github/docs.
This post tells the story of why we wanted to open source the docs, what tools we built and open sourced along the way, and how we worked to make the project welcoming to external contributors.
Why open source?
More people, more ideas, more fun
The open source community loves to innovate. Our docs team is innovative too, but we are relatively few in number. We open sourced docs.github.com to welcome new ideas and contributions from a broader and more diverse range of people, like you!
Exemplifying open source practices for enterprises
We open sourced GitHub’s product documentation to help demonstrate that it’s possible (and beneficial) for private companies to open source their products. GitHub has open sourced hundreds of projects over the years, but docs.github.com is the first private production service that we’ve migrated into the open. We hope that our application design, automation workflows, and openness to external contributions can be a model for other organizations that want to open up their private projects and benefit from community collaboration.
Cross-org collaboration
docs.github.com is an internationalized website with documentation translated into Japanese, Simplified Chinese, Spanish, and Brazilian Portuguese, with support for more languages coming soon. Meanwhile the Node.js internationalization working group is also working to localize Node.js documentation. The Node.js project faces many of the same challenges that we do. They use GitHub repositories, GitHub Actions, and the Crowdin.com localization platform to build workflows and tooling for the open source community. Our hope is that open sourcing GitHub Docs will be mutually beneficial to the GitHub and Node.js communities. Together we’ll be able to share practices, tooling, and workflows more transparently. The @crowdin-node GitHub organization exists for this purpose, and the tools there are maintained by folks from the Electron, GitHub Docs, and Node.js projects. We currently work with professional translators to localize our content, but we hope to eventually open up the translation process to external contributors as well.
Access for our vendors
When members of the GitHub Docs team open support requests with third-party vendors like Fastly, Crowdin, Algolia, or Heroku, we often paint detailed pictures of the problems we’re trying to solve. This time-consuming process can lead to a lot of back-and-forth communication in private support channels. Now that github/docs is a public repo we can link directly to code or GitHub Issues that relate to the matter at hand. Instead of making suggestions for us to try, our vendors can (and often do) clone our repo, try out solutions, and even open pull requests to help solve problems. This not only helps our team, but strengthens our relationships with our vendors and leaves a public record of how we solve problems that others can learn from.
A long history
GitHub has open sourced many things over the years, from full-blown platforms like Electron to tools like GitHub Desktop and the gh CLI, to API client libraries like octokit/rest.js and octokit.rb, and lots of npm packages. We’ve also shared some things that aren’t software, such as open sourcing our employee intellectual property agreement and sharing our product roadmap publicly.
But github/docs is a bit different from all the other projects GitHub has open sourced: it’s not a standalone tool or library, but a repo containing all of our product documentation, along with the web application that powers it. This application was created over seven years ago in 2013, and has a long and colorful history.
The first incarnation of docs.github.com (originally help.github.com) was a Ruby on Rails application. It was later converted to a static site using Jekyll and then Nanoc, another Ruby static site generator. Today the site is powered by a Node.js web service that supports dynamic routing and content rendering.
The site’s tooling has changed over the years, but many of the tried-and-true authoring conventions of the Jekyll site have been preserved:
- Site content is written in Markdown files.
- Liquid templates can be used in Markdown for rendering dynamic data.
- Structured JSON and YAML files in a data directory can be referenced from templates.
- Markdown files can include key-value metadata in YAML frontmatter.
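To make the frontmatter convention concrete, here is a minimal sketch of how key-value metadata can be split from a Markdown file. It is an illustration only: the function name parseFrontmatter and the sample article are invented for this example, it handles only flat key: value pairs rather than full YAML, and it is not the implementation used by docs.github.com or by the docs/frontmatter module.

```javascript
// Toy frontmatter splitter (assumption: flat `key: value` pairs only).
// Real YAML is far richer; this is a hypothetical helper for illustration.
function parseFrontmatter(source) {
  // Match a leading block delimited by lines containing only `---`.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { data: {}, content: source };

  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines that are not key-value pairs
    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();
    if (key) data[key] = value;
  }

  // Everything after the closing delimiter is the Markdown body.
  return { data, content: source.slice(match[0].length) };
}

// Hypothetical article file, invented for this example.
const article = [
  "---",
  "title: About pull requests",
  "intro: Pull requests let you propose changes.",
  "---",
  "Body text in Markdown.",
].join("\n");

const parsed = parseFrontmatter(article);
console.log(parsed.data.title);
console.log(parsed.content);
```

A production site would want a real YAML parser plus schema validation, so that a typo in a metadata key fails a test instead of silently producing a broken page.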
What did we open source?
When rebuilding GitHub Docs as a Node.js application, we looked for opportunities to open source different pieces of it along the way:
- github/docs – the content and code that powers docs.github.com
- github/repo-sync – GitHub Actions for keeping git repositories in sync
- github/rest-api-description – OpenAPI description files for GitHub’s REST API
- docs/liquid – a Node.js implementation of the Liquid template language
- docs/render-content – a Node.js module for rendering Markdown and Liquid
- docs/frontmatter – a Node.js module for parsing and validating YAML frontmatter
- docs/data-directory – a Node.js module for loading structured data from a directory
Syncing public and private repos
One of the unique challenges we faced when open sourcing GitHub Docs was finding a way to make everything public while still being able to collaborate privately on upcoming product releases. To accomplish this, we decided to create two git repositories (one public, one private) and set about finding a way to keep the two in sync automatically.
We turned to the GitHub Marketplace for a solution. Although there didn’t seem to be an existing tool that suited our exact needs, there was something that came close: Pull is a GitHub App designed to automate the often tedious task of keeping a forked repo up to date with its upstream repo.
We reached out to the author of the Pull app, Wei He, to see if he’d be interested in collaborating with us to build a similar app, but for syncing repos bi-directionally. Wei happily accepted the challenge and together we set about building Repo Sync, a new open source tool using Docker, git, shell scripts, GitHub Actions, and the GitHub Container Registry.
Repo Sync is a set of flexible GitHub Actions for keeping git repositories in sync. Using a single Actions workflow file that runs on a schedule, we can keep the main branch of the private and public docs repositories in sync, without human intervention.
Structured REST API docs with OpenAPI
Earlier this year we merged help.github.com and developer.github.com into a single site, docs.github.com. Prior to this migration, all of GitHub’s REST API reference documentation was a mix of unstructured Markdown, embedded Ruby, Liquid templating, and a side of hand-pasted cURL output. When GitHub’s REST API was created over ten years ago, it was not documented in a structured way. There were automated tests in the github.com codebase, but no machine-readable specification for the API. As it turned out, the unstructured REST API docs on developer.github.com were the closest thing we had to a source of truth about GitHub’s REST API.
With the help of Octokit maintainer Gregor Martynus and contractors at Redoc.ly, we embarked on a long journey to reverse-engineer GitHub’s REST API reference docs into machine-readable and human-editable OpenAPI description files.
As of earlier this year, those OpenAPI descriptions live in the github.com codebase, and are used to create, validate, and test GitHub’s REST API. They are also used to generate the JavaScript and Ruby Octokit clients, and to render the new REST API reference docs at docs.github.com/rest. GitHub’s OpenAPI description files are now open source and can be downloaded at github.com/github/rest-api-description.
For more history on the creation of OpenAPI (originally Swagger) and GitHub’s eventual adoption of it, check out my Swagger origin story talk and my colleague’s Describing a 10-year old API with OpenAPI talk from this year’s API Specifications conference.
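To show what a machine-readable description makes possible, here is a small sketch that walks an OpenAPI-shaped object and lists its operations, which is roughly the kind of traversal a docs renderer or client generator starts from. The description object below is a hand-written fragment for illustration and is not copied from github/rest-api-description; the helper name listOperations is likewise an assumption.

```javascript
// Illustrative fragment in the shape of an OpenAPI 3 document.
// Hand-written for this example; summaries and operationIds are assumptions.
const description = {
  openapi: "3.0.3",
  info: { title: "Example API", version: "1.0.0" },
  paths: {
    "/repos/{owner}/{repo}": {
      get: { summary: "Get a repository", operationId: "repos/get" },
      delete: { summary: "Delete a repository", operationId: "repos/delete" },
    },
    "/user": {
      get: { summary: "Get the authenticated user", operationId: "users/get-authenticated" },
    },
  },
};

// HTTP methods that an OpenAPI 3 path item may define.
const HTTP_METHODS = ["get", "put", "post", "delete", "options", "head", "patch", "trace"];

// Flatten the nested `paths` object into one record per operation.
function listOperations(doc) {
  const operations = [];
  for (const [path, pathItem] of Object.entries(doc.paths || {})) {
    for (const method of HTTP_METHODS) {
      const operation = pathItem[method];
      if (!operation) continue; // this path does not define this method
      operations.push({
        verb: method.toUpperCase(),
        path,
        summary: operation.summary || "",
        operationId: operation.operationId || "",
      });
    }
  }
  return operations;
}

const operations = listOperations(description);
for (const op of operations) {
  console.log(`${op.verb} ${op.path}: ${op.summary}`);
}
```

Because every operation carries the same structured fields, the same data can feed reference pages, client libraries, and tests without anyone hand-pasting cURL output.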
Liquid
Liquid is an open source template language similar to Nunjucks or Handlebars. It was created by Shopify to give their customers a safe way to add dynamic content to their storefronts. Liquid is a Ruby project, and has seen wide adoption among projects using Rails or Jekyll. While Liquid is popular in the Ruby community, it is still relatively obscure and rarely used in the JavaScript ecosystem. We tried to find an npm package to parse and render our existing Liquid templates, but were unable to find a complete implementation that met all of our needs.
At this point we could have thrown in the towel and migrated to another template language, but we liked what we had: our team’s technical writers were comfortable using Liquid, and our engineers didn’t particularly want to take on the daunting and thankless task of migrating thousands of content files to a new template language.
We reached out to the authors of several existing Liquid-related npm packages, and with the help of multiple contributors we were able to deprecate several old and unmaintained packages, rebrand liquid-node as liquid, convert its codebase from CoffeeScript to JavaScript, and improve its tests and documentation.
The new liquid npm package lives at github.com/docs/liquid.
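For readers who haven’t seen Liquid before, the sketch below shows the basic idea behind its output tags: a placeholder like {{ site.name }} is replaced with a value looked up in a context object. This is a deliberately tiny toy that ignores filters and {% ... %} tags, and it is not the liquid npm package or its API. The function name renderOutputTags and the sample data are invented for this example.

```javascript
// Toy renderer for Liquid-style output tags such as {{ site.name }}.
// Assumption: dotted variable paths only; no filters, no {% ... %} tags.
function renderOutputTags(template, context) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_, path) => {
    // Walk the dotted path one key at a time, stopping at missing values.
    let value = context;
    for (const key of path.split(".")) {
      value = value == null ? undefined : value[key];
    }
    // Missing variables render as an empty string.
    return value == null ? "" : String(value);
  });
}

// Hypothetical template and data, invented for this example.
const template = "Welcome to {{ site.name }}, available in {{ site.languages }} languages.";
const rendered = renderOutputTags(template, {
  site: { name: "Example Docs", languages: 5 },
});
console.log(rendered);
```

A complete implementation also has to parse control-flow tags, filters, and includes, which is a large part of why finding a full Liquid implementation for Node.js mattered to us.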
Contributor Experience
We like to automate stuff. If a human task is boring, repetitive, time-consuming, or error-prone, we look for opportunities to replace it with automation.
The GitHub Docs team embraces the concept of GitHub Flow, a continuous delivery cycle wherein changes are staged, tested, and deployed to production automatically, as a byproduct of the normal pull request workflow. To put it another way, anyone should be able to make a contribution, preview it, test it, and ship it without leaving the github.com website.
When a pull request is opened on github/docs, the changes in that pull request’s branch are automatically deployed to an ephemeral staging/review app. This makes it easy for anyone to review the changes in a live app, without having to pull down the changes and view them in their local environment. When a pull request is merged to the default branch, the ephemeral staging app is destroyed and the changes are deployed to production.
Our team has enjoyed this “hands-free” continuous delivery workflow for over a year now, and we could never go back to the days of finite staging instances and chatops deployment commands.
We want the process of contributing to GitHub Docs to be as easy for external contributors as it is for GitHub employees, and we’ve built our testing and deployment tools with that in mind. When an external contributor opens a pull request on github/docs, we’ll run the same continuous integration tests and create the same ephemeral staging apps as we do for GitHub employees.
Acknowledging all contributions
Open source is about more than just code. Successful open source project maintainers know that the health of the project depends not just on the source code, but on the well-being of the people who participate in its development. In addition to adopting and enforcing a Code of Conduct, it’s important to acknowledge contributions of all kinds, so people know their efforts are valued.
All Contributors is a popular open source project that makes it easy to acknowledge contributions. All Contributors includes a specification, a GitHub bot, and a command-line tool for quickly adding contributors to your project. It also supports a wide array of contribution types, not just code.
We’re now using the All Contributors project on all of our docs-related open source repositories. For an example, check out the github/docs README.
This is just the beginning!
It took a long time and a lot of work to get here. Figuring out how to open source this long-lived private project has been a fun challenge, but I think the best is yet to come. Working in the open is the most fun way to build software, and I can’t wait to see how docs.github.com evolves as we invite contributors from around the world to help shape its future.