As GitHub Copilot continues to grow rapidly, we are seeing more patterns of high concurrency and intense usage. While we understand this can be driven by legitimate workflows, this type of usage places significant strain on our shared infrastructure and operating resources.

To ensure every user gets a fast, reliable Copilot experience, we’re updating limits to better balance capacity. These changes will roll out over the next few weeks. Users may see two types of limits, both meant to protect the system for everyone:

  • Limits for overall service reliability
  • Limits for specific model or model family capacity

What this means for you

  • When you hit a service reliability limit, you will need to wait until your current session resets. This will be visible in the error experience when you are rate limited.
  • When you hit a usage limit for a specific model or model family, you can switch to an alternative model or use Auto mode.

We recommend distributing requests more evenly over time when possible, rather than sending them in large, concentrated waves. You can also upgrade your plan for higher limits.
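
As a rough illustration of spreading requests over time, the Python sketch below paces a batch of calls evenly across a window instead of sending them all at once. It is a minimal client-side sketch, not an official Copilot API: `evenly_spaced_offsets`, `run_paced`, and the `send` callback are hypothetical names, and the window length is an arbitrary example value rather than a documented limit.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def evenly_spaced_offsets(count: int, window_seconds: float) -> List[float]:
    """Return start offsets (in seconds) that spread `count` requests across a window."""
    if count <= 0:
        return []
    interval = window_seconds / count
    return [i * interval for i in range(count)]


def run_paced(
    items: Iterable[T],
    send: Callable[[T], R],
    window_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[R]:
    """Call `send` once per item, spaced evenly over `window_seconds`.

    `send` is a hypothetical placeholder for whatever issues one request.
    `sleep` and `clock` are injectable so the pacing can be tested without waiting.
    """
    pending = list(items)
    offsets = evenly_spaced_offsets(len(pending), window_seconds)
    start = clock()
    results: List[R] = []
    for item, offset in zip(pending, offsets):
        # Wait only if this request's scheduled slot has not arrived yet.
        delay = start + offset - clock()
        if delay > 0:
            sleep(delay)
        results.append(send(item))
    return results
```

For example, `run_paced(prompts, send, window_seconds=60.0)` with four prompts would start one request roughly every 15 seconds rather than four at the same moment. The right window for your workload is something to tune yourself; this sketch does not reflect any specific Copilot limit.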

We know limits can be frustrating, and we are actively exploring new ways to offer increased capacity for all users. We will share updates as we identify durable solutions. Learn more about rate limiting in our docs.

To further improve service reliability, we are streamlining our model offerings and focusing resources on the models our users use the most. As a first step, we’ll be retiring Opus 4.6 Fast for Copilot Pro+ users, beginning today. We recommend using Opus 4.6 as an alternative model with similar capabilities.