GitHub Availability Report: January 2021
Introduction
In January, we experienced one incident that significantly impacted and degraded the availability of the GitHub Actions service.
January 28 04:21 UTC (lasting 3 hours 53 minutes)
Our service monitors detected abnormal levels of errors affecting the Actions service. This incident resulted in the failure or delay of some queued jobs for a period of time. Jobs that were queued during the incident were run successfully after the issue was resolved.
We traced the issue to an infrastructure error in our SQL database layer. The database failure impacted one of the core microservices that facilitates authentication and communication between the Actions microservices, which affected queued jobs across the service. Under normal circumstances, automated processes would detect that the database was unhealthy and fail over with minimal or no customer impact. In this case, the automated processes did not recognize the failure pattern, and telemetry did not show issues with the database, so it took longer to determine the root cause and complete mitigation efforts.
To help avoid this class of failure in the future, we are updating the automation processes in our SQL database layer to improve error detection and failovers. Furthermore, we are continuing to invest in localizing failures to minimize the scope of impact resulting from infrastructure errors.
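As a purely illustrative sketch of the detection idea described above (not GitHub's actual automation), a health check can combine passive telemetry with an active end-to-end probe, so that a failure invisible to telemetry still counts against the database. The `HealthMonitor` class, its `probe` and `telemetry_ok` callables, and the threshold of three consecutive failures are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HealthMonitor:
    """Hypothetical monitor: signals failover after repeated failed checks."""
    probe: Callable[[], bool]          # active end-to-end check, e.g. a test query
    telemetry_ok: Callable[[], bool]   # passive signal, e.g. metrics within bounds
    failure_threshold: int = 3         # consecutive bad checks before failing over
    consecutive_failures: int = 0

    def check_once(self) -> bool:
        """Run one check; return True if a failover should be triggered."""
        # Treat the database as healthy only if BOTH signals agree.
        healthy = self.telemetry_ok() and self._safe_probe()
        self.consecutive_failures = 0 if healthy else self.consecutive_failures + 1
        return self.consecutive_failures >= self.failure_threshold

    def _safe_probe(self) -> bool:
        # A probe that raises (timeout, connection error) counts as a failure.
        try:
            return self.probe()
        except Exception:
            return False


# Telemetry looks fine, but the active probe keeps failing.
monitor = HealthMonitor(probe=lambda: False, telemetry_ok=lambda: True)
results = [monitor.check_once() for _ in range(3)]
print(results)  # [False, False, True]
```

Requiring several consecutive failures is one common way to avoid failing over on a single transient error; the right threshold and probe would depend on the system in question.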
In summary
We’ll continue to keep you updated on the progress we’re making on ensuring reliability of our services. To learn more about how teams across GitHub identify and address opportunities to improve our engineering systems, check out the GitHub Engineering blog.