This article is a special report following the 2019 Octoverse report, providing trends and insights into developer activity on GitHub in the early days of COVID-19. The analysis is brought to you from the GitHub Data Science Team.
Note: This report is best viewed on a large screen. The included charts and graphics are not optimized for mobile viewing.
COVID-19 brought a sudden—and global—need for people to stay home. This pressed many organizations to support remote work wherever possible, changing the routines and environment of millions of people in a matter of days. This shift means developers and other technology professionals are transitioning to a remote-first world, many while also caring for children or other family members, finding a new work cadence, and redefining what it means to balance (or merely maintain) both work and life.
How has this shift affected our work as developers? This report centers on three themes for which data is readily available: productivity and activity, work cadence, and collaboration. In our findings, we offer insights for software developers and company leaders, and suggest implications for leading newly distributed teams through uncertainty.
As the largest global developer platform, GitHub is in a unique position to see patterns and changes in developer activity. Our goal in conducting this analysis and sharing this report is to help the community understand what is happening in developer activity so we’re better prepared and able to adapt to future business disruptions. We plan to continue and extend these analyses later this year in our annual State of the Octoverse report. This will allow us to see how the trends we see may have changed over time, so stay tuned.
When we compare the first three months of 2020 to the same time period in 2019:
How to read this report
The analysis for each theme in this report is conducted on a different dataset, selected to answer the research questions posed in that section. You’ll see a description of the data in a box like this. Theme #3 includes two descriptions. At the end of this report, we describe the overall context of our analysis (GitHub user data), our research design decisions and limitations, and upcoming research.
Many technology platforms, such as the video conferencing tools Zoom and Microsoft Teams, have reported an increase in activity since COVID-19. While these statistics are an interesting glimpse into how we communicate and meet with our colleagues and customers, they could simply be evidence of people shifting meetings from in-person to online spaces.
If we want to understand people’s work—or in this case, changes to people’s work that happen during the COVID-19 outbreak—we can ask them directly (in interviews or surveys) or look at the artifacts created when they do their work (such as notebooks, source code, or system logs). For our investigation of developer productivity and activity, we analyze GitHub developer activity.
A complex construct such as productivity can’t be captured with simple or single metrics like lines of code or issues closed. These measures don’t capture the importance of impact, complexity, or outcome of work tasks. Recent research finds that developer productivity is a combination of several things, including time spent on code, completing code, and reviewing code (Meyer et al. 2014). Our analysis adopts a similar approach: by including several measures of developer activity, we present a composite measure of developer productivity and reflect developers’ own perceptions (Meyer et al. 2014).
Development doesn’t change
Development activity on GitHub is a good proxy for activity that is fairly robust to shifting work routines. While some things like kanban boards may have moved from whiteboards to online tools, development can be observed via the same pushes, pull requests, and issues regardless of whether we do that work from an office or at home. This means our year-over-year comparisons are reliable.
In contrast, the use of video conferencing increases when people can’t meet in person, even if the number of meetings stays the same. This makes year-over-year comparisons unreliable for people who were not previously working from home.
Theme #1: Data
The data for this section of the report comes from analyzing all GitHub platform activity—public (including open source) and private activity—year over year. The period of comparison is January 1, 2019 – March 31, 2019 vs. January 1, 2020 – March 31, 2020 respectively. The change in the geographic distribution of active users included in the analysis year over year is shown below.
Figure: Geographies included in Theme #1 analysis: Distribution of active users January 2019 – March 2020
Update—May 6, 2:00 pm PT: This chart was replaced to reflect the corrected percentage for Australia.
To allow for easier year-over-year comparisons, we will normalize our analysis by using figures per user for this report, unless noted otherwise.
For our composite measure of developer productivity, we investigated pull requests, pushes, reviewed pull requests, and commented issues per user. Overall, we see consistent or increased activity for these measures compared to last year.
We start by showing pull requests per user per day. As noted in the chart, the regular dips in activity correspond to weekends.
Chart: Pull requests per active user, year over year comparison (weekends included)
If we remove weekends, it gives us a smoother chart and eases readability. This is pull request creation per user per day, with weekends removed:
Chart: Pull requests per active user, year over year comparison (weekends excluded)
Moving forward, we will show charts with this normalized activity (per user) and smoothed (without weekends) to ease readability, except where otherwise noted.
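The normalization and weekend filtering described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used for this report: the function name `per_user_weekdays`, the dictionary inputs, and all counts are hypothetical.

```python
from datetime import date, timedelta

def per_user_weekdays(daily_events, daily_active_users):
    """Return {day: events per active user}, skipping Saturdays and Sundays."""
    rates = {}
    for day, events in sorted(daily_events.items()):
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            continue
        users = daily_active_users.get(day, 0)
        if users == 0:
            continue  # no active users recorded, so no rate can be computed
        rates[day] = events / users
    return rates

# Hypothetical counts for one week starting Monday, March 2, 2020.
start = date(2020, 3, 2)
events = {start + timedelta(days=i): n
          for i, n in enumerate([50, 60, 55, 65, 40, 10, 8])}
users = {start + timedelta(days=i): n
         for i, n in enumerate([100, 100, 110, 130, 80, 40, 40])}

rates = per_user_weekdays(events, users)
for day, rate in rates.items():
    print(day, round(rate, 2))
```

Dividing by active users keeps a year with more users from looking busier simply because of growth, and dropping weekends removes the regular dips so the weekday trend is easier to read.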
Next, we show push volume per user per day, weekends removed:
Chart: Pushes per active user, year over year comparison (weekends excluded)
Our investigation also includes reviewed pull requests and commented issues, and the pattern is similar: higher than last year and consistent across the first three months of 2020. We have not included the reviewed pull requests or commented issues chart for the sake of brevity.
At first glance, this suggests that developer activity has stayed consistent or slightly increased throughout the initial wave of the pandemic and shift to working from home. However, further investigation in Theme #2 sheds light on the timing and cadence of work and provides additional insight into how developers are accomplishing these levels of activity.
We also measure developer activity by looking at GitHub issues per user. When compared to 2019, daily issue creation per user on GitHub is lower or equal for most of the first three months of 2020 and dips lower in February. This started to shift in mid-March and continued throughout the month, as noted by the arrow in the chart. Again, the chart is shown without weekends.
Chart: Issues created per active user, year over year comparison (weekends excluded)
We conducted additional investigation into this shift, and see it alongside an increase in issue creation rates across all repositories, with the biggest increase seen in repositories owned by free user and paid team accounts. The following chart shows the volume of issues created by repository plan vs. last year (again without weekends):
Chart: Issues per active user: percentage increase year over year by repository owner type
When looking at the comparison chart, the activity in repositories owned by Enterprise Cloud accounts for the first three months of the year gives us some interesting insights. Notice two patterns in issue creation volume. First, there was a noticeable decrease in issue creation through the middle of February (noted by the arrow marked A in the chart). This corresponds to the time period when Asia and Europe were hit by COVID-19, and North America’s West Coast began shifting to working from home. Late February and early March showed a return to last year’s levels of activity, with continued dips for enterprise repositories on the weekends. These weekend dips in activity are not reflected in repositories owned by free user and paid team accounts, which may be expected given these repositories are more likely to contain open source and project work. Second, we then see issues exceed last year’s levels of activity, though not as high as the spikes seen among repositories owned by free user and paid team accounts (noted by the arrow marked B in the chart). This may signal resuming the activities involved in enterprise software development following an adjustment to remote work.
Why the swings in issue creation for enterprises? Understanding how GitHub issues are used offers insights into what may be happening: Issues are used for communication and planning. They are how many users track tasks, enhancements, features, and bugs. In our personal or hobby development, we may not go to the trouble of doing planning (or at least not logging it in an issue—we may just put it on a sticky note). However, enterprise development is usually much more structured and coordinated, planning larger and more complex projects and features, and requiring communication across teams. Issues will drive this work and are likely more sensitive to changes or disruptions in the ceremonies that surround our work.
The swings we saw in issues could be attributed to the disruption in the existing planning process around software development and the move to remote work and newly “forced” distributed teams. A probable explanatory scenario could be:
We’re happy to see this relatively fast recovery, which may be an initial rebound to the strong dip we saw in February. Given the complexity required to shift and adapt to the demands and requirements of new work, combined with the general economic uncertainty and extended shelter-in-place orders, we anticipate fluctuations to continue and settle through the summer.
Looking at patterns of work activity doesn’t give us a complete picture of a developer’s day. To better understand how COVID-19 is impacting developers, we also investigated how work may have shifted.
People who typically work from home tend to work more hours—up to one or two 8-hour days more per week (Hill et al. 2010). This is likely because our work begins to stretch into our lives, and the boundaries between work and home blur. With the sudden shift to remote work, we wondered if we would see any patterns of increased development work as well.
Theme #2: Data
The data for this section of the report comes from analyzing paid organization accounts that meet the following criteria:
More than 40,000 organizations were included in our analysis, with strongest representation in North America (40%), Europe (35%), and Asia (17%) as shown in the following figure:
Figure: Geographies included in Theme #2 analysis: Distribution of active paid accounts, January 2019 – March 2020
A work day was captured as the time difference between the first and the last git push to the repository’s default branch. This is a rough approximation for length of day worked and gives us one way to think about when a work day starts and ends.
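As a rough illustration of this proxy, the sketch below computes the hours between the first and last push on a given day. It makes several assumptions that the report does not state: pushes are grouped per user and calendar day, timestamps are already in the user's local time, and the names `work_day_lengths`, `dev_a`, and `dev_b` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def work_day_lengths(pushes):
    """pushes: iterable of (user, datetime).

    Returns {(user, date): hours between that day's first and last push}.
    """
    by_day = defaultdict(list)
    for user, ts in pushes:
        by_day[(user, ts.date())].append(ts)
    return {
        key: (max(stamps) - min(stamps)).total_seconds() / 3600
        for key, stamps in by_day.items()
    }

# Hypothetical push timestamps, assumed to be in local time.
pushes = [
    ("dev_a", datetime(2020, 3, 16, 9, 0)),
    ("dev_a", datetime(2020, 3, 16, 13, 30)),
    ("dev_a", datetime(2020, 3, 16, 18, 15)),
    ("dev_b", datetime(2020, 3, 16, 10, 0)),
]

lengths = work_day_lengths(pushes)
for (user, day), hours in lengths.items():
    print(user, day, hours)
```

A day with a single push yields a length of zero, which is one reason this measure is only a rough approximation of the time actually worked.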
For this analysis, we narrowed our investigation to two time zones in the US (Pacific and Eastern), and looked at both work days and work volume. We selected these two time zones because the coasts each issued fairly coordinated shelter-in-place orders while the Midwest did not (meaning any impacts are more likely to show up in the data), and because our sample size was large enough to be significant and not be overly influenced by a few organizations.
We began our investigation by looking at development activity cadence in the US Pacific Time Zone. In this analysis, we saw work days have variable lengths compared to last year and then significantly increase starting mid-March (typically 30-60 minutes per day). Much of this increase in work day length is seen on weekends.
Chart: Year over year change between first and last push, average (weekends included). Pacific Time Zone
To augment this analysis, we investigated the volume of work being done by developers over this same time period. This helped us understand if developers’ workdays might be stretching longer while doing the same amount of work. That is, perhaps developers are working in smaller shifts spread out over the day to accommodate family or childcare, but doing approximately the same amount of work as before. (For example, three three-hour blocks spread out over 12 hours equate to a 12-hour workday with our proxy measure, but are equivalent to a traditional nine-hour workday, which is eight hours of work with a one-hour lunch, when not having to accommodate a stretched schedule.) Therefore, we also investigated how often they pushed code in that timeframe as a rough proxy for the amount of work done. We call this work volume.
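The arithmetic in the parenthetical example can be made concrete with a small sketch. The schedules below are hypothetical, and `span_and_worked_hours` is an illustrative helper rather than anything used in the analysis; it simply contrasts the first-to-last span (what the work day proxy sees) with the hours actually spent in work blocks.

```python
def span_and_worked_hours(blocks):
    """blocks: list of non-overlapping (start_hour, end_hour) pairs for one day.

    Returns (span, worked): hours from first start to last end, and
    total hours inside the blocks.
    """
    span = max(end for _, end in blocks) - min(start for start, _ in blocks)
    worked = sum(end - start for start, end in blocks)
    return span, worked

# Three three-hour blocks spread over 12 hours (hypothetical schedule).
split_day = [(8, 11), (12.5, 15.5), (17, 20)]
# A traditional day: eight hours of work around a one-hour lunch.
office_day = [(9, 13), (14, 18)]

split_span, split_worked = span_and_worked_hours(split_day)
office_span, office_worked = span_and_worked_hours(office_day)
print("split day: ", split_span, "hour span,", split_worked, "hours worked")
print("office day:", office_span, "hour span,", office_worked, "hours worked")
```

The span alone cannot distinguish a stretched schedule from a genuinely longer day, which is why a second signal such as push counts is useful alongside it.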
We saw that users in the Pacific Time Zone have consistently increased their work volume compared to last year. This analysis suggests that these users are working longer days and doing more development work, particularly in March. This is likely impacted by those who typically commute to an office “recovering” that time now that they work from home. While this time could be split between home tasks and work activity, developers may be feeling pressure to push more often, which would show up as increased work volume in the data. Several factors could contribute: economic uncertainty and the desire to do well and stay employed, using work as a distraction to combat boredom when stuck at home, pressure from management to get products to market, or team norms to push frequently to maintain fast and stable software delivery cadences.
Chart: Year over year change in volume of pushes, average (weekends included). Pacific Time Zone
Next, we moved our analysis to the US East Coast, and observed that users here have seen shorter work days for much of 2020, and then an increase to the work day similar to the Pacific Time Zone starting mid-March (although shorter by 15-30 minutes). Again, much of this increase in work day length is observed on weekends.
Chart: Year over year change between first push and last push, average (weekends included). Eastern Time Zone
When looking at work volume that accompanies these work days, we saw that users in the Eastern Time Zone have an increased push volume that begins to decline after February. This analysis suggests that these users are spreading their work out over the day rather than doing more of it, particularly as these trends correlate with shelter-in-place orders.
Chart: Year over year change in volume of pushes, average (weekends included). Eastern Time Zone
Our analysis suggests that developers are continuing to do sustained and even increased amounts of development, which some may cheer as evidence that productivity has continued in the face of uncertainty. However, combined with our work cadence analysis, we caution that developers, leaders, and organizations should take proactive steps to prevent burnout, and watch for it among their teams and peers.
The World Health Organization has recognized burnout as “an occupational phenomenon resulting from chronic workplace stress that has not been successfully managed.” While burnout specifically refers to workplace stress, it can be difficult to manage right now when our work is invading our personal space.
Dealing with burnout is important for our mental well-being, both in the workplace and in our personal lives. Teams and leaders that support flexible and sustainable work schedules and watch for burnout will have colleagues and teams that are happier and more productive. Remember, we’re all in this together.
For more on burnout and what you can do to address it, we point you to the article Understanding Job Burnout from Dr. Christina Maslach, an expert on workplace burnout. If you prefer, you can watch or listen to her conference talk from DevOps Enterprise Summit 2018 in Las Vegas.
Writing software is an inherently collaborative endeavor, even though so many depictions of developers in pop culture show it as a solitary activity. Don’t get us wrong: It can be fun and satisfying to solve problems on our own, but there’s a special magic in working with a team or community to build something together.
Shifting to remote work has changed a lot about the way we work: The communication and coordination ceremonies are different, and the artifacts may have changed from in-person whiteboards and sticky notes to digital analogs of these old mainstays as well as new technology.
In this final section of the report, we investigate how people are working together to make software.
Theme #3: Data for pull request analysis
The data for the following analysis in the report comes from paid organizations that meet the following criteria:
This resulted in more than 40,000 organizations for analysis. Note: This is the same data identified in Theme #2.
A pull request is how developers tell others about changes they make to a repository. Merging a pull request involves a group of interested developers reviewing changes, discussing modification, and sometimes doing follow-up work through commits. Finally, the pull request is merged into the relevant branch of the intended repository. To proxy this full collaborative process and see how it may have changed compared to last year, we measured the time to merge pull requests.
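One simple way to compute a time-to-merge figure is to average the gap between when a pull request was opened and when it was merged. The sketch below assumes each pull request is represented as a hypothetical `(created_at, merged_at)` pair and that unmerged pull requests are excluded; the report does not specify these details.

```python
from datetime import datetime
from statistics import mean

def mean_hours_to_merge(pull_requests):
    """pull_requests: iterable of (created_at, merged_at) datetimes.

    Pull requests that are not merged have merged_at set to None and
    are skipped. Returns the mean hours to merge, or None if no pull
    request was merged.
    """
    durations = [
        (merged - created).total_seconds() / 3600
        for created, merged in pull_requests
        if merged is not None
    ]
    return mean(durations) if durations else None

# Hypothetical pull requests.
prs = [
    (datetime(2020, 3, 2, 9, 0), datetime(2020, 3, 2, 15, 0)),   # merged same day
    (datetime(2020, 3, 3, 10, 0), datetime(2020, 3, 4, 10, 0)),  # merged next day
    (datetime(2020, 3, 4, 11, 0), None),                         # still open
]

avg = mean_hours_to_merge(prs)
print(avg)
```

Comparing this average week by week against the matching week of the prior year would give a year-over-year change like the one charted in this section.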
In Theme #1: Productivity and activity, we observed that pull requests created per user had remained relatively consistent or increased compared to last year. However, we do see that users’ behavior around pull requests has shifted: In January, time to merge took longer than last year by four to five hours for repositories owned by Enterprise Cloud accounts, and approximately one hour longer than last year for repositories owned by paid Team accounts. This increased time to merge pull requests early in the year is an interesting observation, and one we don’t have an explanation for.
In March, we saw the average time to merge pull requests drop: repositories owned by Enterprise Cloud accounts took 15-30 minutes longer to merge a pull request compared to last year (showing three hours or more improvement from January), and repositories owned by paid Team accounts took about the same amount of time to merge compared to last year (showing approximately an hour improvement from January). This drop in time to merge from January, especially in March when many shelter-in-place orders took effect, may suggest a few things resulting from being more online: People are online and ready to review a pull request, have more time to focus, and can be more responsive when they are at home.
Chart: Year over year change in time to merge pull requests for paid accounts, average (weekly)
Theme #3: Data for open source analysis
The data for the analysis in this subsection came from the GitHub platform for open source software activity. Comparisons are conducted year over year, and include data from January 1, 2019 through March 31, 2020.
We did an additional analysis into time to merge pull requests for open source repositories, and see that early in the year, they took a few hours longer to merge compared to last year. In March, time to merge was faster compared to last year, ranging from 45 minutes to almost four hours faster in comparison. This could suggest that people are more engaged in open source projects and more responsive, particularly in March, as they are finding more projects they can do from home.
Chart: Year over year change in time to merge pull requests for open source projects, average (weekly)
This increased contribution and engagement with the open source community was exciting to see, so we took a deeper look.
We did an analysis to see what open source project creation looked like and saw there was growth throughout this year, with 27.62% more open source repositories created this year in late March compared to last year. Note: This chart includes weekends.
Chart: Year over year change in rate of open source project creation (weekends included)
We did an analysis on the top public projects, and saw that collaboration (measured by distinct contributors, where a contributor is anyone who made a contribution) jumped in mid-March, just as we saw with time to merge pull requests.
Chart: Select open source projects with largest increase in distinct contributors in 2020
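Counting distinct contributors means each person is counted once per period, no matter how many contributions they made. The sketch below shows one way to do that with weekly buckets. The weekly granularity, the function name, and the contributor labels are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date

def distinct_contributors_by_week(contributions):
    """contributions: iterable of (contributor, date).

    Returns {(iso_year, iso_week): number of distinct contributors}.
    """
    seen = defaultdict(set)
    for contributor, day in contributions:
        iso_year, iso_week, _ = day.isocalendar()
        seen[(iso_year, iso_week)].add(contributor)
    return {week: len(people) for week, people in seen.items()}

# Hypothetical contributions to one project over two weeks.
contributions = [
    ("a", date(2020, 3, 9)),
    ("a", date(2020, 3, 10)),  # repeat contributor, counted once that week
    ("b", date(2020, 3, 11)),
    ("a", date(2020, 3, 16)),
    ("b", date(2020, 3, 17)),
    ("c", date(2020, 3, 18)),
]

weekly = distinct_contributors_by_week(contributions)
print(weekly)
```

Using a set per week is what makes the count "distinct": a contributor who pushes many times in a week adds only one entry.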
Jitsi is a set of free and open source projects that provide fully encrypted voice (VoIP), videoconferencing, and instant messaging. The project includes Jitsi Videobridge (for multi-party calling) and Jitsi Meet (full video conferencing with web, Android, and iOS clients).
According to jitsi.org, the project first started as a student project in 2003 and was renamed Jitsi in 2011 (from the Bulgarian “жици”, or “wires”), once it supported audio and video. In 2020, Jitsi passed 10 million monthly average users.
We’re also seeing some fantastic work in the community around public COVID-19 projects. This work started picking up in January, and we see a similar spike in contributors in mid-March.
Chart: Select public and open source COVID-19 related projects with largest percentage increase in distinct contributors in 2020
Overall, GitHub showed an increase in the number of active users compared to last year. This increase over 2019 was largely consistent through the first three months of the year, even through the COVID-19 outbreak. This year-over-year growth is comparable to the growth we’ve seen in previous years, but it may be notable considering the overall economic climate and uncertainty. We are unable to provide numbers due to the confidentiality and sensitivity of the data; however, we do provide a visual representation of the data so you can see the trend year over year. The regular dips in activity correspond to weekends, as noted in the chart by the gray bars.
Chart: Daily active users, year over year comparison (weekends included)
For this report, we analyzed developer activity on GitHub, looking at patterns of work and how they compare to activity last year. We focused on the first three months of 2020, when COVID-19 first impacted work routines around the world. To optimize for comparability, we compared the data to one year ago and matched by day of the week, so the data is shifted by two days (one day for yearly drift and one day for leap year). We limited our comparison to the first three months of 2020 because this allowed us to focus on the time period of interest for this special report. Extending the analysis beyond these three months may allow us to spot more long-standing trends in the data, but it introduces variability and seasonality that we were not able to address in time for this report (like downturns in activity that may happen in summer months). We may include this in a future analysis, particularly as we continue and extend the analyses in this report for our annual State of the Octoverse report later in the year to revisit these findings.
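One common way to match days of the week across years is to step back exactly 52 weeks (364 days), so a Monday is always compared with a Monday. The sketch below illustrates that idea. It is an assumption about how such matching could be done, not a description of the exact method used for this report, and `matched_prior_year_day` is a hypothetical helper.

```python
from datetime import date, timedelta

def matched_prior_year_day(day):
    """Return the date 52 weeks earlier, which falls on the same weekday."""
    return day - timedelta(weeks=52)

for d in (date(2020, 1, 15), date(2020, 3, 16)):
    prior = matched_prior_year_day(d)
    print(d, d.strftime("%A"), "->", prior, prior.strftime("%A"))
```

Under this sketch, dates after February 29, 2020 map to a 2019 date two calendar days later (March 16, 2020 pairs with March 18, 2019), while dates before the leap day map to a date one calendar day later.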
All data is reported in aggregate, and we provide details in each section to describe how our sample was selected, but do not provide access to the actual data to preserve anonymity.
Many thanks to our data scientists, contributors, and reviewers. Each is listed alphabetically by type of contribution.