As I detailed in How We Made GitHub Fast, we have created a new data serialization and RPC protocol to power the GitHub backend. We have big plans for these technologies and I’d like to take a moment to explain what makes them special and the philosophy behind their creation.
The serialization format is called BERT (Binary ERlang Term) and is based on
the existing external term format already implemented by Erlang. The RPC protocol is called BERT-RPC and is a simple protocol built on top of BERT packets.
You can view the current specifications at http://bert-rpc.org.
This is a long article; if you want to see some example code of how easy it is to set up an Erlang/Ruby BERT-RPC server and call it from a Ruby BERT-RPC client, skip to the end.
For the new GitHub architecture, we decided to use a simple RPC mechanism to expose the Git repositories as a service. This allows us to federate users across disparate file servers and eliminates the need for a shared file system.
Choosing a data serialization and RPC protocol was a difficult task. My first thought was to look at Thrift and Protocol Buffers since they are both gaining traction as modern, low-latency RPC implementations.
I had some contact with Thrift when I worked at Powerset, I talk to a lot of people that use Thrift at their jobs, and Scott is using Thrift as part of some Cassandra experiments we’re doing. As much as I want to like Thrift, I just can’t. I find the entire concept behind IDLs and code generation abhorrent. Coming from a background in dynamic languages and automated testing, these ideas just seem silly. The developer overhead required to constantly maintain IDLs and keep the corresponding implementation code up to date is too frustrating. I don’t do these things when I write application code, so why should I be forced to do them when I write RPC code?
Protocol Buffers ends up looking very similar to Thrift. More IDLs and more code generation. Any solution that relies on these concepts does not fit well with my worldview. In addition, the set of types available to both Thrift and Protocol Buffers feels limiting compared to what I’d like to easily transmit over the wire.
XML-RPC, SOAP, and other XML-based protocols are hardly even worth mentioning. They are unnecessarily verbose and complex. XML is not convertible to a simple unambiguous data structure in any language I’ve ever used. I’ve wasted too many hours of my life clumsily extracting data from XML files to feel anything but animosity towards the format.
JSON-RPC is a nice system, much more in line with how I see the world. It’s simple, relatively compact, has support for a decent set of types, and works well in an agile workflow. A big problem here, though, is the lack of support for native binary data. Our applications will be transmitting large amounts of binary data, and it displeases me to think that every byte of binary data I send across the wire would have to be encoded into an inferior representation just because JSON is a text-based protocol.
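To make that concrete, here is a quick illustration (my own, not from any spec) of the tax a text-based protocol imposes on binary data. Base64 is the usual workaround, and it inflates every payload by a third before JSON’s own quoting and escaping even enter the picture:

```ruby
# Sending raw bytes through a text-based protocol like JSON means
# Base64-encoding them first: 4 output characters for every 3 input
# bytes, roughly 33% overhead on the wire.
require 'base64'
require 'json'

blob    = Random.new(42).bytes(3_000)   # 3,000 bytes of binary data
encoded = Base64.strict_encode64(blob)  # what JSON forces you to send

puts blob.bytesize                              # => 3000
puts encoded.bytesize                           # => 4000
puts JSON.generate(payload: encoded).bytesize   # larger still, plus quoting
```

A binary-native format can put those 3,000 bytes on the wire as-is, with only a small length header.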
After becoming thoroughly disenchanted with the current “state of the art” RPC protocols, I sat down and started thinking about what the ideal solution would look like. I came up with a list that looked something like this:
- Extreme simplicity
- Dynamic (No IDLs or code generation)
- Good set of types (nil, symbols, hashes, bignums, heterogeneous arrays, etc)
- Support for complex types (Time, Regex, etc)
- No need to encode binary data
- Synchronous and asynchronous calls
- Fast serialization/deserialization
- Streaming (to and from)
- Caching directives
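To show just how simple the underlying format is, here is a toy Ruby encoder for a few of the types above. The tag bytes come from Erlang’s documented external term format; this sketch emits one valid encoding for illustration and is in no way the reference library:

```ruby
# A toy encoder for a subset of BERT. Tag bytes are from Erlang's
# external term format; this is an illustrative sketch, not the
# reference implementation.
def bert_encode(obj)
  [131].pack('C') + encode_term(obj)          # 131 = format version magic
end

def encode_term(obj)
  case obj
  when 0..255
    [97, obj].pack('CC')                      # SMALL_INTEGER_EXT
  when Integer
    [98].pack('C') + [obj].pack('l>')         # INTEGER_EXT (32-bit signed)
  when Symbol
    name = obj.to_s
    [100, name.bytesize].pack('Cn') + name.b  # ATOM_EXT
  when String
    [109, obj.bytesize].pack('CN') + obj.b    # BINARY_EXT: raw bytes, no escaping
  when Array                                  # BERT maps Ruby arrays to lists
    return [106].pack('C') if obj.empty?      # NIL_EXT (empty list)
    [108, obj.length].pack('CN') +            # LIST_EXT
      obj.map { |e| encode_term(e) }.join +
      [106].pack('C')                         # proper-list tail
  end
end

bert_encode(:ok)     # bytes: 131, 100, 0, 2, "ok"
bert_encode([1, 2])  # bytes: 131, 108, 0,0,0,2, 97,1, 97,2, 106
```

Every type is a one-byte tag followed by a fixed-size length or value; there is no escaping, no quoting, and no lookahead, which is why the parser can be so fast.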
I mentioned before that I like JSON. I love the concept of extracting a subset of a language and using that to facilitate interprocess communication. This got me thinking about the work I’d done with Erlectricity. About two years ago I wrote a C extension for Erlectricity to speed up the deserialization of Erlang’s external term format. I remember being very impressed with the simplicity of the serialization format and how easy it was to parse. Since I was considering using Erlang more within the GitHub architecture, an Erlang-centric solution might be really nice. Putting these pieces together, I was struck by an idea.
Of course, the first thing any project needs is a good name, so I started brainstorming acronyms. EETF (Erlang External Term Format) is the obvious one, but it’s boring and not accurate for what I wanted to do since I would only be using a subset of EETF. After a while I came up with BERT for Binary ERlang Term. Not only did this moniker precisely describe the nature of the idea, but it was nearly a person’s name, just like JSON, offering a tip of the hat to my source of inspiration.
Over the next few weeks I sketched out specifications for BERT and BERT-RPC and showed them to a bunch of my developer friends. I got some great feedback on ways to simplify some confusing parts of the spec and was able to boil things down to what I think is the simplest manifestation that still enables the rich set of features that I want these technologies to support.
The responses were generally positive, and I found a lot of people looking for something simple to replace the nightmarish solutions they were currently forced to work with. If there’s one thing I’ve learned in doing open source over the last 5 years, it’s that if I find an idea compelling, then there are probably a boatload of people out there that will feel the same way. So I went ahead with the project and created reference implementations in Ruby that would eventually become the backbone of the new GitHub architecture.
But enough talk, let’s take a look at the Ruby workflow and you’ll see what I mean when I say that BERT and BERT-RPC are built around a philosophy of simplicity and Getting Things Done.
To give you an idea of how easy it is to get a Ruby based BERT-RPC service running, consider the following simple calculator service:
# calc.rb
require 'ernie'

mod(:calc) do
  fun(:add) do |a, b|
    a + b
  end
end
This is a complete service file suitable for use by my Erlang/Ruby hybrid BERT-RPC server framework called Ernie. You start up the service like so:
$ ernie -p 9999 -n 10 -h calc.rb
This fires up the server on port 9999 and spawns ten Ruby workers to handle requests. Ernie takes care of balancing and queuing incoming connections; all you have to worry about is writing your RPC functions.
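Under the hood, the framing Ernie speaks is just as simple. Per the spec at bert-rpc.org, each packet (a “BERP”) is a four-byte big-endian length header followed by that many bytes of BERT data. Here is a sketch of that framing, with a placeholder payload standing in for a real encoded request tuple like {call, calc, add, [1, 2]}:

```ruby
# Sketch of BERT-RPC framing: a BERP is a 4-byte big-endian length
# header followed by that many bytes of BERT-encoded data. The payload
# here is a placeholder, not real BERT bytes.
require 'stringio'

def write_berp(io, payload)
  io.write([payload.bytesize].pack('N'))  # 4-byte big-endian length
  io.write(payload)
end

def read_berp(io)
  len = io.read(4).unpack1('N')
  io.read(len)
end

wire = StringIO.new(''.b)
write_berp(wire, 'BERT bytes here'.b)
wire.rewind
read_berp(wire)  # => "BERT bytes here"
```

Because the length is known up front, a server can pull complete requests off a socket without scanning for delimiters, which keeps queuing and dispatch cheap.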
To call the service, you can use my Ruby BERT-RPC client called BERTRPC like so:
require 'bertrpc'

svc = BERTRPC::Service.new('localhost', 9999)
svc.call.calc.add(1, 2)
# => 3
That’s it! Nine lines of code to a working example. No IDLs. No code generation. If the module and function that you call from the client exist on the server, then everything goes well. If they don’t, then you get an exception, just as you would in your application code.
The Ernie framework and the BERTRPC library power the new GitHub and we use them exactly as-is. They’ve been in use since the move to Rackspace three weeks ago and are responsible for serving over 300 million RPC requests in that period. They are still incomplete implementations of the spec, but I plan to flesh them out as time goes on.
If you find BERT and BERT-RPC intriguing, I’d love to hear your feedback. The best place to hold discussions is on the official mailing list. If you want to participate, I’d love to see implementations in more languages. Together, we can make BERT and BERT-RPC the easiest way to get RPC done in every language!