Data at GitHub
There are over 1.5 million people working on over 2.5 million repositories on GitHub these days. Here are a few more fun stats I grabbed recently:
Exploring GitHub Data
The GitHub API provides access to the many public activities that happen on GitHub.com. Before today, if you were looking to analyze this data, you would need to archive it on your own and store it somewhere capable of querying such a large dataset.
Ilya Grigorik recently released a project called GitHub Archive.
GitHub Archive is a project to record the public GitHub timeline, archive it, and make it easily accessible for further analysis.
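For a sense of what archiving the timeline on your own involves, here is a minimal Python sketch. It is not GitHub Archive's actual implementation. It assumes each event is a dictionary with `id` and `type` fields, as in the payloads the GitHub API returns for public events. The function names and the sample events are hypothetical. The sketch drops events already seen in an earlier poll and stores the rest as JSON lines, which can then be counted by type.

```python
import json
from collections import Counter


def archive_events(events, seen_ids, archive):
    """Append events not seen before to the archive as JSON lines.

    Returns the number of newly archived events.
    """
    added = 0
    for event in events:
        if event["id"] in seen_ids:
            continue  # overlapping polls return the same event more than once
        seen_ids.add(event["id"])
        archive.append(json.dumps(event, sort_keys=True))
        added += 1
    return added


def count_event_types(archive):
    """Count archived events by their "type" field."""
    return Counter(json.loads(line)["type"] for line in archive)


# Hypothetical sample data standing in for two overlapping polls of the timeline.
first_poll = [
    {"id": "1", "type": "PushEvent"},
    {"id": "2", "type": "WatchEvent"},
]
second_poll = [
    {"id": "2", "type": "WatchEvent"},
    {"id": "3", "type": "PushEvent"},
]

seen_ids, archive = set(), []
added_first = archive_events(first_poll, seen_ids, archive)
added_second = archive_events(second_poll, seen_ids, archive)
type_counts = count_event_types(archive)
```

A real archiver would also need durable storage and a way to query it at scale, which is the part a hosted dataset takes off your hands.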
Analyze GitHub Data with Google BigQuery
We are happy to announce that the GitHub public timeline is now a featured public dataset available on Google BigQuery, which launched to the public today.
Running queries against the GitHub dataset is free. After logging into BigQuery, add the project name “githubarchive”.
Bonus Dataset: Programming Language Correlations
This dataset, available in the language_correlation table on BigQuery, explores the relationships between programming languages as seen by GitHub. For example, how likely is it that a programmer who writes in Objective-C also programs in Java? The answer is 31%. Read on to learn more about how this data was gathered.
Generals in the Editor War may note that Emacs users are 35.2% likely to also hack on Vim, while Vim users are only 17.3% likely to hack on Emacs, so there’s that.
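The asymmetry between those two numbers is what you would expect from a conditional share. The sketch below is an assumption about how such a figure could be computed, not the method behind the language_correlation table: it takes the fraction of users of language A who also use language B. The function name, the user names, and the toy data are all hypothetical.

```python
def language_correlation(user_languages, a, b):
    """Return the share of users of language `a` who also use language `b`.

    `user_languages` maps a user name to the set of languages they use.
    Returns None when nobody uses `a`.
    """
    users_of_a = [langs for langs in user_languages.values() if a in langs]
    if not users_of_a:
        return None
    also_b = sum(1 for langs in users_of_a if b in langs)
    return also_b / len(users_of_a)


# Hypothetical toy data: four Vim users, two Emacs users, one overlap.
toy_users = {
    "u1": {"Emacs Lisp", "VimL"},
    "u2": {"VimL"},
    "u3": {"VimL"},
    "u4": {"Emacs Lisp"},
    "u5": {"VimL", "Python"},
}

emacs_to_vim = language_correlation(toy_users, "Emacs Lisp", "VimL")
vim_to_emacs = language_correlation(toy_users, "VimL", "Emacs Lisp")
```

In the toy data the same single overlapping user is half of the Emacs group but only a quarter of the larger Vim group, so the two directions differ even though the overlap is identical.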