Migrating to GitHub Apps: Code Climate shares their story
Code Climate shares their experience and what they learned building with GitHub Apps.
We first started talking about GitHub Apps in 2017 as the recommended way for developers to build integrations on GitHub. GitHub Apps let developers use either GitHub’s REST or GraphQL APIs to interact with the GitHub ecosystem, and they provide a more flexible security model for users. Since the announcement, we have released new documentation and guides to simplify building new integrations or migrating existing OAuth Apps, and we have been talking to integrators about their experiences. For those considering building or migrating to GitHub Apps, we thought it would be interesting to share one of those stories, from Code Climate.
Code Climate has built on GitHub from the very beginning. Now they’re using GitHub Apps to build bots, manage server-to-server integrations, limit permission scopes, and provide better support to the engineers using their products. Here’s how they build with GitHub Apps, and what they learned in the process, in the words of Chris Hulton, Senior Engineer.
Code Climate Switches to GitHub Apps
Why are we so excited about GitHub Apps?
Though GitHub OAuth provides great capabilities for managing user-initiated actions, it presented us with a few limitations for building server-to-server integrations.
Using GitHub Apps to build bots
In our Quality product, we wanted to provide users with an Automated Code Review feature. Before GitHub Apps, we used individual user credentials to retrieve data from and post data to GitHub, which meant we couldn’t post review comments on our customers’ pull requests as a Code Climate bot. GitHub Apps solve this problem.
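As a simplified illustration (not our production code), once an installation access token is in hand, anything posted with it is attributed to the app’s bot identity rather than to an individual user. The repository name and the installation_token variable below are placeholders:

require "octokit"

# A client authenticated with a short-lived installation access token
# (how these tokens are generated is covered later in this post).
# Comments made through this client appear as the app's bot account,
# e.g. "codeclimate[bot]", rather than as an individual GitHub user.
client = Octokit::Client.new(access_token: installation_token)

# Post a comment on a pull request; review-specific endpoints work the same way.
client.add_comment("example-org/example-repo", 42, "Automated Code Review found 2 new issues.")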
Using GitHub Apps to manage server-to-server integrations
Some of our customers have strict security policies that don’t allow us to use SSH to clone repository data from GitHub. Prior to GitHub Apps, we implemented HTTPS repository data cloning by rotating through various users’ OAuth access tokens. This meant that these HTTPS calls were associated with specific GitHub users, even though they were not initiated by them. With GitHub Apps, we can authenticate to pull this type of data as a service.
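For HTTPS cloning specifically, GitHub accepts an installation access token as the password with the literal username x-access-token. A minimal sketch (the repository URL, destination path, and installation_token variable are placeholders):

# Clone over HTTPS using an installation access token instead of a user's
# OAuth token or an SSH key. "x-access-token" is the literal username GitHub
# expects when the password is an installation token.
clone_url = "https://x-access-token:#{installation_token}@github.com/example-org/example-repo.git"
system("git", "clone", clone_url, "/tmp/example-repo")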
Using GitHub Apps to limit permission scope
In the world of GitHub OAuth, read and write access to repository data are bundled together. GitHub Apps provides much more granularity in permissions. We are now able to ask for only the access level we need (read-only) on the specific resources we need (e.g. pull requests). This aligns the requested permission with the services we provide, making our users more comfortable.
How did we go about implementing GitHub Apps?
We found that switching to GitHub Apps did not require our existing system to be substantially restructured. The API endpoints and queries that we implemented via OAuth remained the same for GitHub Apps, and only our method of authentication changed.
For querying, we use GitHub’s graphql-client gem, with our client defined as:
class HTTP < ::GraphQL::Client::HTTP
  def initialize(vcs, access_token)
    super(vcs.graphql_api_url)
    @access_token = access_token
  end

  # Build request headers, preferring a per-query token passed in via context
  # and falling back to the token the client was initialized with.
  def headers(context)
    {}.tap do |h|
      if (token = context.fetch(:access_token, @access_token))
        h.merge!("Authorization" => "Bearer #{token}")
      end
    end
  end
end
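For context, here is roughly how an adapter like this plugs into the graphql-client gem and how a per-request token is supplied through context (this wiring snippet is illustrative; the schema loading, query definition, and variable names are ours):

require "graphql/client"
require "graphql/client/http"

# Wire the adapter above into a client, following graphql-client's documented pattern.
Adapter = HTTP.new(vcs, default_access_token)
Schema  = GraphQL::Client.load_schema(Adapter)
Client  = GraphQL::Client.new(schema: Schema, execute: Adapter)

RepositoryQuery = Client.parse(<<~GRAPHQL)
  query($owner: String!, $name: String!) {
    repository(owner: $owner, name: $name) { nameWithOwner }
  }
GRAPHQL

result = Client.query(
  RepositoryQuery,
  variables: { owner: "example-org", name: "example-repo" },
  context: { access_token: installation_token } # picked up by headers(context)
)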
With OAuth, for access_token, we provided the authorized OAuth token of a random user in the organization. Randomization helped us to balance usage across the organization, but was not ideal. We wanted to make the API request on behalf of the organization itself, which is where GitHub Apps came in.
With GitHub Apps, access is provided via short-lived, renewable tokens belonging to the installation, and the installation is granted the appropriate permissions. We introduced a component in our code that would generate a token for the organization’s GitHub App installation using the octokit gem and the credentials of our GitHub App:
def create_token
  # Exchange the app-level JWT for a short-lived installation access token.
  create_token_client.create_app_installation_access_token(
    external_database_id,
    accept: Octokit::Preview::PREVIEW_TYPES[:integrations],
  )
end

def create_token_client
  # An Octokit client authenticated as the GitHub App itself (via JWT),
  # used only to mint installation tokens.
  @create_token_client ||= Octokit::Client.new(
    bearer_token: generate_jwt_token,
    api_endpoint: vcs.api_url,
    web_endpoint: vcs.web_url,
    connection_options: {
      url: vcs.api_url,
    },
  )
end

def generate_jwt_token
  # Sign a short-lived (60-second) JWT with the app's private key;
  # the issuer is the GitHub App's ID.
  now = Time.now.to_i
  key = OpenSSL::PKey::RSA.new(vcs_app.private_key)

  opts = {
    iat: now,
    exp: now + 60,
    iss: vcs_app.external_database_id.to_i,
  }

  JWT.encode(opts, key, "RS256")
end
We then pass this token to the API client in exactly the same way as before, keeping the overall change small and isolated.
Additionally, each generated token comes with an associated expires_at timestamp. To improve performance, we store an encrypted copy of the temporary token along with this timestamp in our database, allowing us to refresh the token only when necessary.
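A rough sketch of that refresh-only-when-needed approach (the InstallationToken model and the encrypt/decrypt helpers here are hypothetical stand-ins, not our actual schema):

# Return a usable installation token, minting a fresh one only when the cached
# copy is missing or about to expire.
def access_token
  record = InstallationToken.find_or_initialize_by(installation_id: external_database_id)

  if record.new_record? || record.expires_at <= Time.now + 60
    fresh = create_token # defined above; the response includes token and expires_at
    record.update!(
      encrypted_token: encrypt(fresh.token),
      expires_at: fresh.expires_at,
    )
  end

  decrypt(record.encrypted_token)
end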
What did we learn in the process?
API Rate Limit Calculation
During this migration, we learned about the different rate-limiting mechanisms between GitHub OAuth and GitHub Apps.
With OAuth, we were able to cycle through user tokens to make API requests. This provided a very high capacity, as each token allowed for 5,000 GraphQL points per hour. When switching to GitHub Apps, however, the rate limit became organization-wide, meaning we had to think more critically about how often we were hitting the API and how expensive our queries were.
To understand our current GraphQL usage, we used the rateLimit object received from our GraphQL queries, and began tracking statistics using StatsD around how expensive each query was, as well as how often the tokens we used approached their rate limits:
def process_RateLimit(object)
  object = Connectors::GitHub::Graphql::Fragments::RateLimitFragment.new(object)
  prefix = ["graphql", definition.name.demodulize, "limit"].join(".")

  # Record the query's cost and the points remaining for the current token.
  $statsd.gauge("#{prefix}.cost", object.cost)
  $statsd.gauge("#{prefix}.remaining", object.remaining)
end
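The data this processor consumes comes from asking for the rateLimit object alongside each query. A representative (illustrative, not our actual) query definition might look like this:

# Hypothetical query: the rateLimit block reports the cost of this query and
# the points remaining for the current token, which feed the gauges above.
RateLimitedQuery = Client.parse(<<~GRAPHQL)
  query($owner: String!, $name: String!) {
    repository(owner: $owner, name: $name) {
      pullRequests(first: 50, states: OPEN) { totalCount }
    }
    rateLimit {
      cost
      remaining
      resetAt
    }
  }
GRAPHQL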
We found that many of our GraphQL queries were fairly inexpensive, but we identified one particularly complex query that was both expensive (15 GraphQL points) and frequent (run every 10 minutes for every repository in an organization). We thought about how this query would perform for a large organization in our system (for example, 100 repositories) when using GitHub Apps:
100 repositories × 15 GraphQL points × 6 queries per hour = 9,000 GraphQL points per hour
This was initially concerning, as the number was significantly more than the 5,000 GraphQL point rate limit provided for an installation. However, another advantage of GitHub Apps is that the rate limit scales with the size of the organization. For installations with more than 20 repositories, an additional 50 requests (or 50 GraphQL points) per hour are provided for each repository.
With this increase, we identified our modified rate limit would be:
5,000 GraphQL points base + (50 additional points × 100 repositories) = 10,000 GraphQL points per hour
This provided us with enough capacity to maintain our current system. To help us stay comfortably within our limit going forward, we worked with GitHub to identify additional strategies for keeping our API usage within its rate limit:
- Ingest real-time data via inbound webhooks
- Catch-up with historical data using the API
- Throttle your API requests to stay within your rate limits (see the sketch after this list)
- Use conditional requests, where possible (currently only available via v3 REST API)
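For the throttling strategy, here is a minimal sketch of the idea using Octokit’s rate-limit endpoint (our illustration, not Code Climate’s code; it checks the REST limit, and the GraphQL rateLimit object shown earlier serves the same purpose for GraphQL queries):

# Before a batch of API calls, check how much of the hourly budget remains and
# wait for the reset when running low. The 100-request threshold is arbitrary.
def with_throttling(client)
  limit = client.rate_limit

  if limit.remaining < 100
    sleep([limit.resets_at - Time.now, 0].max)
  end

  yield
end

with_throttling(client) do
  client.pull_requests("example-org/example-repo", state: "open")
end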
Impact on User Onboarding
Another lesson we learned during this migration is that GitHub Apps requires a different onboarding approach. Unlike OAuth, where users typically identify themselves and grant access in one step, with GitHub Apps, users are required to leave our site and install the application on GitHub before data can be processed. To smooth out this flow, we use the GitHub App installation endpoints to track the user’s onboarding progress and display in-app calls to action linking out to the GitHub App setup URL (when appropriate).
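As an example of the kind of check involved (a sketch on our part; the Octokit method is real, but the helper and its use here are simplified), an app-authenticated client can ask GitHub whether an organization has installed the app yet:

# Authenticated as the app (via JWT, like create_token_client above), list the
# app's installations and see whether the onboarding organization appears yet.
def installed_for?(org_login)
  installations = create_token_client.find_app_installations(
    accept: Octokit::Preview::PREVIEW_TYPES[:integrations],
  )

  installations.any? { |installation| installation.account.login == org_login }
end

# When it returns false, we show a call to action linking to the app's setup URL.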
Wrap up
We are very excited to continue to build on GitHub Apps and to take advantage of all the data exposed through the GraphQL API. We’re also looking forward to partnering with GitHub and pioneering more beta features in the future.
If you’re interested in taking a more data-driven approach to engineering management (and want to see our GitHub Apps workflow in action) check us out!