Just like artists, musicians, filmmakers, and photographers, software developers are important copyright holders. But code differs in important ways from the works of these other creators, and applying the same rules to software often produces unintended consequences. GitHub is committed to representing the interests of developers in policy discussions to ensure that rules generally developed for copyright owners also make sense when applied to software.
One key example is the set of rules that apply to online platforms that host copyrighted works. Over the past several years, lawmakers and regulators have proposed that online platforms proactively implement “technical measures,” such as upload filtering or automated scanning, to identify and protect copyrighted works. As developers understand, these measures are ill-suited to the unique environment of code collaboration, especially for open source software. These technical measures do not effectively address software copyright infringement and can have unintended consequences, such as the removal of critical code, which can break developer ecosystems.
Over the past year, we have been participating in a series of technical measures consultations with the US Copyright Office (CO), which is reviewing the effect of such measures on a variety of works hosted on a variety of online platforms. We are pleased that in the closing plenary, the CO reflected the concerns GitHub shared about the impact of these measures on developers. In a key finding from its consultation, the CO acknowledged that automated detection and filtering do not work well with software. Rather than automatically removing access to potentially matched code, software cases call for human review to understand the content, the context, and the implications of removal and remediation.
What we shared with the US Copyright Office
Code is different from other copyrightable content.
Software code on GitHub is generally licensed openly and distributed without requiring payment. While all code is highly functional, only the expressive aspects of code are protected by copyright law. Identifying those protected aspects of code is not a simple task that lends itself to automation. Additionally, code is a dynamic work, often not finished but regularly updated. Code is stored with version control technology, like Git, that makes it resistant to removal and makes specific pieces of content more difficult to detect with filtering technologies.
Technical detection measures are ill-suited to source code.
Because of these unique traits, technical measures used for detecting copyright infringement of other forms of content do not translate well to source code and can result in seemingly duplicate code being flagged in error. Applying technical measures to remove access to detected source code can have a high false-positive rate, because these measures are rarely sensitive to the context of code that copyright law considers before concluding whether its use is infringing. These false positives can have an outsized effect on interdependent code ecosystems with shared functionalities. Due to the complex and interdependent nature of code, GitHub routinely subjects copyright takedown notices to technical review.
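To make the false-positive problem concrete, here is a minimal sketch of a naive line-fingerprint matcher. It is an illustration only, not a description of how any real filtering product works: the function names, the sample files, the `mathutils` project, and the `FLAG_THRESHOLD` value are all assumptions invented for this example.

```python
import hashlib

# Hypothetical cutoff above which the naive matcher would flag a file.
FLAG_THRESHOLD = 0.7


def line_fingerprints(source):
    """Hash each non-blank line after collapsing its whitespace."""
    return {
        hashlib.sha256(" ".join(line.split()).encode("utf-8")).hexdigest()
        for line in source.splitlines()
        if line.strip()
    }


def overlap_ratio(candidate, reference):
    """Fraction of the candidate's distinct lines that also occur in the reference."""
    cand = line_fingerprints(candidate)
    ref = line_fingerprints(reference)
    return len(cand & ref) / len(cand) if cand else 0.0


# Illustrative files: an openly licensed helper and a fork that credits it.
reference = """# SPDX-License-Identifier: MIT
def clamp(value, low, high):
    return max(low, min(value, high))
"""

candidate = """# SPDX-License-Identifier: MIT
# Adapted from the hypothetical 'mathutils' project
def clamp(value, low, high):
    return max(low, min(value, high))
"""

ratio = overlap_ratio(candidate, reference)
flagged = ratio >= FLAG_THRESHOLD
print(ratio, flagged)  # 0.75 True
```

The matcher flags the fork because three of its four lines match, yet nothing in the fingerprints tells it whether the reuse is licensed, attributed, or purely functional. Those are the contextual questions that a human reviewer has to answer.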
Technical removal measures do not translate well to the sharing norms of open source code.
The most valuable content hosted by code hosting platforms is licensed under terms that allow sharing and remixing. Auto-removal of code is ill-suited to the norms of sharing and remixing code on code-hosting platforms. When software developers share their code publicly under an open source license, that typically means they want their code to be shared. Oftentimes, copyright disputes arising on sites like GitHub are not about whether particular code may ever be posted publicly, but instead are about meeting specific requirements, such as attribution, under the terms of the relevant licenses. Many copyright disputes on GitHub can be resolved not by removing code, but by a developer making changes to the identified code or its accompanying license.
Wrongful code takedowns break developer ecosystems.
Code on GitHub may be in use by millions of computers around the world, and a wrongful takedown can have enormous consequences not just for GitHub’s users, but throughout the developer ecosystem. In modern software development, programmers write code that “depends” on other tested, proven, and widely accessible software (usually open source software) written by third parties. All types of software, from mobile apps to enterprise software run by corporations and governments, rely on these “dependencies.” When even a single dependency is removed from a collaboration platform like GitHub in response to an infringement complaint, its removal can break every program that “depends” on that code, directly or through other dependencies, and that number can be very large.
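The cascade described above can be sketched as a walk over a dependency graph. This is a minimal illustration under stated assumptions: the package names and the `transitive_dependents` helper are hypothetical, and real package managers resolve dependencies with far more nuance (versions, mirrors, caches).

```python
from collections import defaultdict, deque


def transitive_dependents(dependencies, removed):
    """Return every package that directly or indirectly depends on `removed`."""
    # Invert the graph: for each package, record which packages depend on it.
    dependents = defaultdict(set)
    for package, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(package)

    # Breadth-first walk upward from the removed package.
    broken = set()
    queue = deque([removed])
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in broken:
                broken.add(parent)
                queue.append(parent)
    return broken


# Hypothetical package names, for illustration only.
dependencies = {
    "mobile-app": ["http-client", "ui-kit"],
    "enterprise-tool": ["http-client"],
    "http-client": ["string-utils"],
    "ui-kit": ["string-utils"],
    "string-utils": [],
    "standalone-cli": [],
}

broken = transitive_dependents(dependencies, "string-utils")
print(sorted(broken))
# ['enterprise-tool', 'http-client', 'mobile-app', 'ui-kit']
```

In this toy graph, removing one small utility breaks four of the five other packages, including two applications that never reference it directly.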
GitHub is proud to represent developers in policy discussions, especially when developer interests differ significantly from those of other stakeholders. Policymakers taking aim at copyright infringement may not be aware of the unique environment of code collaboration and the unintended downstream impacts that can result from wrongful takedowns. In the past, we successfully advocated for excluding software development platforms from upload filter requirements in the EU Copyright Directive with the help of developers speaking up about how it could affect them. We also seek to inform developers about our Digital Millennium Copyright Act (DMCA) Takedown Policy and demystify how the DMCA applies to developers, and we have established a Developer Defense Fund to provide independent legal support from the Stanford Juelsgaard Clinic to review and handle appropriate DMCA cases for developers on GitHub and across the software ecosystem. We are encouraged by the findings the CO reported from the technical measures consultations, and will work to ensure that future copyright regulations take the unique norms of code collaboration into consideration.
Watch the recordings of all the plenary sessions here.