GitHub Availability Report: January 2021
Introduction In January, we experienced one incident resulting in significant impact and degraded state of availability for the GitHub Actions service. January 28 04:21 UTC (lasting 3 hours 53 minutes)…
Introduction
In January, we experienced one incident resulting in significant impact and degraded state of availability for the GitHub Actions service.
January 28 04:21 UTC (lasting 3 hours 53 minutes)
Our service monitors detected abnormal levels of errors affecting the Actions service. This incident resulted in the failure or delay of some queued jobs for a period of time. Jobs that were queued during the incident were run successfully after the issue was resolved.
We identified the issue as caused by an infrastructure error in our SQL database layer. The database failure impacted one of the core microservices that facilitates authentication and communication between the Actions microservices, which affected queued jobs across the service. In normal circumstances, automated processes would detect that the database was unhealthy and failover with minimal or no customer impact. In this case, the failure pattern was not recognized by the automated processes, and telemetry did not show issues with the database, resulting in a longer time to determine the root cause and complete mitigation efforts.
To help avoid this class of failure in the future, we are updating the automation processes in our SQL database layer to improve error detection and failovers. Furthermore, we are continuing to invest in localizing failures to minimize the scope of impact resulting from infrastructure errors.
In summary
We’ll continue to keep you updated on the progress we’re making on ensuring reliability of our services. To learn more about how teams across GitHub identify and address opportunities to improve our engineering systems, check out the GitHub Engineering blog.
Tags:
Written by
Related posts
Celebrating the GitHub Awards 2024 recipients 🎉
The GitHub Awards celebrates the outstanding contributions and achievements in the developer community by honoring individuals, projects, and organizations for creating an outsized positive impact on the community.
New from Universe 2024: Get the latest previews and releases
Find out how we’re evolving GitHub and GitHub Copilot—and get access to the latest previews and GA releases.
Bringing developer choice to Copilot with Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s o1-preview
At GitHub Universe, we announced Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s o1-preview and o1-mini are coming to GitHub Copilot—bringing a new level of choice to every developer.