In September, we experienced one incident that resulted in significant impact and degraded availability for the GitHub Pages service.
One of the nodes responsible for serving GitHub Pages sites became unresponsive to new builds. Since it was still online and able to serve requests, our system did not automatically remove it from service. This caused builds destined for that node to fail until we were able to manually disable that node. Going forward, we will automatically remediate failures of this type, and ensure builds are automatically retried on a different node when possible.
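As a rough illustration of the remediation described above, the sketch below retries a failed build on another node and takes the failing node out of the build rotation, even though it can still serve requests. This is not GitHub's actual implementation: the `Node` fields, `run_build`, `build_with_retry`, and the node and site names are all hypothetical, chosen only to show the idea of checking build health separately from serving health.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical model of a Pages node; field names are assumptions.
    name: str
    serving_ok: bool = True   # can still answer requests for published sites
    building_ok: bool = True  # can accept new builds
    enabled: bool = True      # still in the build rotation


class BuildFailed(Exception):
    """Raised when a node cannot complete a build."""


def run_build(node: Node, site: str) -> str:
    # Stand-in for dispatching a real build to a node.
    if not node.building_ok:
        raise BuildFailed(f"{node.name} did not accept build for {site}")
    return f"{site} built on {node.name}"


def build_with_retry(nodes: list[Node], site: str) -> str:
    # Try each enabled node in turn. A node that fails a build is removed
    # from the build rotation automatically, whether or not it still serves.
    for node in nodes:
        if not node.enabled:
            continue
        try:
            return run_build(node, site)
        except BuildFailed:
            node.enabled = False
    raise BuildFailed(f"no healthy node available for {site}")


# Usage: the first node serves requests but rejects builds, as in the incident.
nodes = [Node("pages-1", building_ok=False), Node("pages-2")]
result = build_with_retry(nodes, "example.github.io")
print(result)
```

The design point is that a node being online and able to serve requests says nothing about whether it can build, so the sketch treats a failed build as its own signal for removing the node from rotation.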
We’ll continue to keep you updated on the progress we’re making toward ensuring the reliability of our services. To learn more about what we’re working on, check out the GitHub engineering blog.