Yesterday’s Outage
A scheduled DB maintenance went haywire yesterday, taking a number of repositories temporarily offline.
While pushing and pulling were briefly offline (and for that we’re sorry!), the first phase of the migration worked. The problem was that we didn’t know it had worked – the tools we were using failed to report success (or anything, really). As a result we weren’t able to start phase two, which left some repositories inaccessible via the web interface.
What should have been a few minutes of interrupted service for some users turned into a huge pain. But I don’t want to blame our tools – the real problem is our maintenance strategy. Any amount of interrupted service is unacceptable at this point.
With that in mind, we’re going to rethink the way we do maintenance. Zero downtime and uninterrupted service are the goal. GitHub should be there when you need it.
When we have a solution we’ll post about it here (like we always do). Sorry for the outage – we really don’t want it to happen again.