One of the slowest things you can do in Ruby is shell out to the operating system. As a contrived example, let’s read an empty file 1,000 times:

>> require 'benchmark'
>> `touch foo`
>> Benchmark.measure { 1000.times { `cat foo` } }.total
=> 4.51
>> Benchmark.measure { 1000.times { File.read('foo') } }.total
=> 0.04

The difference is clear: the shell version took over 100 times longer, so the very act of shelling out is expensive. And while 1,000 may seem high, we have plenty of content on GitHub with 30+ shell calls per page. It starts to add up.
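If you want to try the comparison outside of irb, here is a minimal sketch as a standalone script. The @compare_reads@ helper and its iteration count are made up for illustration, and it uses wall-clock time from @Benchmark.realtime@ rather than the @.total@ figure above, so expect different absolute numbers on your machine.

```ruby
require 'benchmark'
require 'shellwords'
require 'tempfile'

# Hypothetical helper (not from our codebase): reads `path` `iterations`
# times, once by shelling out to `cat` and once with File.read, and
# returns the wall-clock seconds each approach took.
def compare_reads(path, iterations = 100)
  escaped = Shellwords.escape(path)
  shell = Benchmark.realtime { iterations.times { `cat #{escaped}` } }
  native = Benchmark.realtime { iterations.times { File.read(path) } }
  { shell: shell, native: native }
end

# Use a throwaway empty file so nothing is left behind in the working directory.
Tempfile.create('foo') do |file|
  result = compare_reads(file.path)
  puts format('shell: %.4fs  File.read: %.4fs', result[:shell], result[:native])
end
```

Each backtick call forks a new process and waits for it to exit, which is where most of the shell time goes.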

The Problem with Grit

Our Grit library was written as an API to the @git@ binary using, you guessed it, shell calls. In the past few weeks, as the site became slower and less stable, we knew we had to begin rewriting parts of our infrastructure. Response times and memory usage were both spiking. We began seeing weird out-of-memory errors and @git@ segfaults.

Scott Chacon had been working on a pure Ruby implementation of Git for some time, which we’d been watching with interest. Instead of shelling out and asking the @git@ binary for information, Scott’s library understands the layout of @.git@ directories and uses methods like @File.read@ to procure the requested information.
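To give a feel for what reading a @.git@ directory directly looks like, here is a rough sketch that resolves @HEAD@ to a commit SHA without running @git@. The @head_sha@ helper is hypothetical and is not taken from Scott’s library; it only handles refs stored as loose files and ignores @packed-refs@.

```ruby
# Hypothetical helper, not part of Grit: resolves HEAD to a commit SHA by
# reading files under the .git directory instead of shelling out to git.
# Assumption: the ref is stored as a loose file; refs that live only in
# packed-refs are not handled and yield nil.
def head_sha(git_dir)
  head = File.read(File.join(git_dir, 'HEAD')).strip

  # A symbolic HEAD looks like "ref: refs/heads/master";
  # a detached HEAD holds the SHA itself.
  return head unless head.start_with?('ref: ')

  ref_path = File.join(git_dir, head.sub('ref: ', ''))
  File.exist?(ref_path) ? File.read(ref_path).strip : nil
end
```

Calling @head_sha('.git')@ from the root of a repository should return the same SHA as @git rev-parse HEAD@ when the current branch is a loose ref, at the cost of two file reads instead of a process spawn.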

Over the past few weeks we’ve been working with Scott to integrate his library into GitHub while he adds features and improves performance. Last night we rolled out a near-finished version of Scott’s library.

The result? Sweet, sweet speed.

Yep, we cut our average response time in half. (Lower numbers are better.)

Open Source

Scott will soon be merging the changes he made for us into his Grit fork. As a result, expect to see other Ruby-based Git hosting sites speed up in the next few weeks as they integrate the code we wrote.

We’re interested in funding the development of other Git-related open source projects. If you’re working on something awesome that will drive Git adoption, please send us an email.

Future Enhancements

We’re still working to improve our architecture. As we roll out more changes, you’ll see them here. Everyone loves scaling.