More than meets the pull request: maintainers talk contributions
Creating an open source project can feel a bit like sending out an open invite to a party: will it be a roaring good time, or will you begrudgingly dine on leftover junk food for the following week after nobody shows? When the first guest arrives, you breathe a sigh of relief. The party’s a success, and you really didn’t need to eat potato chips for a week, anyway. But then that one really popular friend shows up with a crew in tow, and the next thing you know, someone spills wine on your new couch and their (uninvited) dog is chewing on your shoes. The question is: How do you move forward from here? Do you cut the music, stand on the coffee table, and yell “Everyone out!”? Or do you grab the club soda, a towel, and a soup stock bone to keep Fido occupied?
Open source maintainers often find themselves in a similar role, hosting an open invite party—especially when they start out. With time, however, they learn how to better craft their invite, set ground rules for attendees, and make sure their carpet (to continue the metaphor) remains unstained.
As Senior Editor for The ReadME Project, I spoke with three maintainers for this month’s Q&A about how they approach community contributions and the lessons they’ve learned along the way.
Mike Bayer is the creator and lead maintainer of the database toolkit SQLAlchemy. He has been working in the software construction industry since the mid-1990s, across a wide variety of internet-related startups. After building out many database abstraction layers in languages like Perl, C, C++, and Java, he created SQLAlchemy, a one-stop toolkit to solve all those database problems that were universal to every project. He currently works for Red Hat.
Thea Flowers creates open source hardware synthesizers at Winterbloom and serves on the board of the Open Source Hardware Association. Previously, she maintained and contributed to widely used Python packages such as urllib3, Twine, warehouse, and Nox. She’s been recognized as a Python Software Foundation fellow for her work in the Python community. She spends her free time playing guitar, 3D printing, and squading up in Fortnite. She’s also a noted weasel enthusiast.
Jordan Harband is a prolific open source maintainer who has been a TC39 committee delegate since 2014 and serves as a voting member on the Cross Project Council of the OpenJS Foundation. When he’s not talking to anyone who will listen about OSS, he’s spending time with his family, consuming sci-fi, gardening, skiing, and, of course, maintaining open source projects.
Mike Melanson: How did you first approach accepting contributions when you started out as a maintainer? How has this changed over time?
Thea: In the early days, I was super eager about everything and just excited that someone was looking at my project. At the same time, I took everything personally. I had this identity built around my projects, which wasn’t super healthy, and if someone said something in my code was bad, I’d get upset. But for the most part, I was just happy to see a contribution, so I would do everything I could to accept it and not question it too much. These days, now that I’m older and wiser, I’m a lot more cautious about code contributions, especially if I haven’t had a chance to talk to the person. There are a lot of things about your project that aren’t communicated with just the code—assumptions and things like that—so I always feel bad if somebody sends me a big contribution and it’s absolutely not what we want for the project. I’m a lot more deliberate about what I accept nowadays, and I try to guide people rather than just reviewing code and accepting or rejecting it. But some things just don’t fit in the project. In that case, I try to make affordances. Maybe they’re trying to do something the project isn’t intended to do, but a small change could enable them to create some code in their own repository that would accomplish their goals without me having to bring that code into our codebase and maintain it indefinitely.
Jordan: When I first started out, accepting contributions made me nervous. I’d inherited other people’s projects and I was very gun-shy. I didn’t want to break things, so I erred on the side of caution and tried to be very thorough. I still value all those things, but I’m more confident now. It’s always been really difficult to find people who are interested in making one contribution, let alone more than one, so I try to be as welcoming as possible while not breaking the code. My approach varies based on the project, but I always try to have the best test cases I can. That way, if something breaks, it’s my fault for not providing a good enough test instead of being the user’s fault. If I had started out with my own projects, I would have probably accepted everything too. Starting out with someone handing me the keys to a project that people already used felt like having someone hand me the keys to a sports car at 16. The rational response was to be nervous that you’re about to crash it. I know that’s not everyone’s response, but I felt like I was being given responsibility that, impostor syndrome-wise, I wasn’t sure I deserved. I wanted to be really careful with it and not screw up. But yeah, if I had started out just making something that tripped and fell into popularity, I think I could have easily been very gung-ho and reckless about it.
Mike Bayer: Collaboration has changed so much since I started. My project predates not just GitHub and pull requests, but Git itself, and they have revolutionized collaboration. When I started, I didn’t fully understand the implications of contributions. If I liked the code people sent, I’d include it in the project, but that was a naive approach. Accepting a feature, especially one that is poorly thought out, means that you now own it. I’d accept features that didn’t fit into a larger vision because the project was not mature enough to have a vision. Now, as the owner of a project, I consider how features fit into a larger package, and I’ve learned that others might not have considered the same things when they submit a pull request. Building a plugin API, or some other way that users can add on functionality independently, is often a good alternative to adding new features to the core project.
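To make the plugin idea concrete, here is a minimal sketch in Python of one way a project could let users add functionality without merging it into core. It is an illustration only and is not taken from SQLAlchemy; the names `register`, `run`, and `shout` are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a plugin name to a callable that transforms text.
_PLUGINS: Dict[str, Callable[[str], str]] = {}


def register(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that outside code uses to add behavior without touching core."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        if name in _PLUGINS:
            raise ValueError(f"plugin {name!r} is already registered")
        _PLUGINS[name] = func
        return func
    return decorator


def run(name: str, value: str) -> str:
    """Core entry point: look up a plugin by name and apply it."""
    try:
        plugin = _PLUGINS[name]
    except KeyError:
        raise LookupError(f"no plugin named {name!r}") from None
    return plugin(value)


# Everything below would live in the contributor's own repository, not in core.
@register("shout")
def shout(value: str) -> str:
    return value.upper() + "!"


print(run("shout", "hello"))  # prints "HELLO!"
```

With this shape, the core project only has to maintain the small registry, while the contributor owns and maintains the feature itself.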
Mike Melanson: Different types of contributions have different implications down the line, which can change your approach. So, how do you differ in your approach to a bug fix versus a feature request versus documentation, for example?
Mike Bayer: If you have a large project with a lot of interconnected parts, even what you think might be a small change could have big implications. We absolutely require clear issue reports before accepting new features or even bug fixes. We have a lot of people who want to come with the PR directly, but we go through huge efforts to ask that they don’t. We really want to see what the problem is clearly, as we’re a very large library with lots and lots of touchpoints. If it’s a feature request, we want to know how it would work, why you need it, and all the rationale behind it. Some people are in a hurry to just give you some code, but the code is the smallest, easiest part. Tests are much harder, and documentation is the most difficult to create. It’s in the inverse order of what people want to work on.
Thea: I think how heavy-handed I am comes down to the end-user impact of the change, and also what I call inertia: how hard it is to undo the change or make another change on top of it. For things like documentation updates and stuff like that, there’s no inertia, right? We can just roll back a change or make another update, so if there’s a grammar mistake, I’ll just fix it and merge it. Tests are great because they generally don’t have any impact on the end-user, and I usually just speed those through as well. But things like bug fixes can be a little bit trickier. If it’s something obvious and there was a test for it, and we know that it’s not going to cause everybody’s workflow to fall apart, great, I’m going to take it. The more focused and central that bug fix is, the easier it is for me to just say, “Yeah, cool.” But when we’re talking about a big bug fix that changes a lot of behavior or has ramifications across the codebase or requires a lot of refactoring, that’s where I’m going to be cautious and make sure that we’ve covered our bases. For things like feature requests, I want to understand what the benefit is for end users, and I’m going to be looking for consensus with the other maintainers. Feature requests are fundamentally a different mindset for contributions. The other contributions are really like patching up a wall or repainting a fence, but a feature request is like building a new shed in your backyard. You’ve got to really think about it a bit.
Jordan: For me, it’s not a bug fix unless it comes with a regression test. My policy is: If you can’t provide some tests that would have failed without your fix, then the fix doesn’t go in. For feature requests, I try to think first about how it can be non-breaking, because I want to make sure it’s semver minor and not semver major. How can it be done in a way that’s generic enough and extensible enough without over-engineering it? That’s a subtle, subjective process. There’s no rubric, I just kind of think about it and hope that I’m thinking right. I think that including documentation is very important, but I recognize that the rare contributor who comes along might not be good at writing it, so I try not to make it a hard blocker. I can always add documentation if they don’t. That’s a burden I put on myself as part of accepting a feature request. I generally reject breaking changes. If it’s something important and we can’t add it without breaking something else, then I might add the new feature under a feature flag so that the breaking change isn’t enabled by default. That makes the transition much smoother and you get a lot more time testing the breakage before making it default. Lastly, I’ll consider complexity and whether something should or could be a separate package. Sometimes I’ll accept a change because the consequences of forking the ecosystem are worse, but sometimes I’ll reject a change because the consequences of adding the complexity to the library are worse.
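Two of the policies Jordan describes, gating new behavior behind an opt-in flag and requiring a test that would fail without the change, can be sketched in a few lines. The example is in Python purely for illustration and is not drawn from any of his projects; `split_tags` and its `drop_empty` flag are hypothetical.

```python
from typing import List


def split_tags(text: str, *, drop_empty: bool = False) -> List[str]:
    """Split a comma-separated string into tags.

    drop_empty is a hypothetical opt-in flag. The default keeps the old
    behavior, so adding the flag is a backwards-compatible (semver minor) change.
    """
    tags = [part.strip() for part in text.split(",")]
    if drop_empty:
        tags = [tag for tag in tags if tag]
    return tags


def test_default_behavior_is_unchanged() -> None:
    # Guards existing callers: empty entries are still returned by default.
    assert split_tags("a,,b") == ["a", "", "b"]


def test_drop_empty_flag_removes_blank_tags() -> None:
    # Fails if the flag is ignored or removed, which is what makes it a
    # regression test for the new behavior.
    assert split_tags("a,,b", drop_empty=True) == ["a", "b"]


test_default_behavior_is_unchanged()
test_drop_empty_flag_removes_blank_tags()
```

If the new behavior later becomes the default in a major release, the first test is the one that has to change, which makes the breaking change explicit in the diff.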
Mike Melanson: We often focus on code contributions, but you still need non-code contributions like design, documentation, or project management. What methods have you used to attract non-code contributions and contributors?
Thea: The best thing you can do to encourage all different kinds of contributions is to have an active community that you engage with. You need the interpersonal aspect of building a community where you actually get to know each other. Platforms like Discord or Gitter, which allow for semi-real-time chats, are great for this purpose. It makes people feel less hesitant to ask quick questions. A lot of people are intimidated to ask questions on repositories, especially if they’re not code contributions. Talk to the people who want to contribute; they’re usually your users. Also, make those types of contributions as easy as possible to make. For example, make sure that each page of documentation has an “edit this” button that takes users directly to that page’s source on GitHub. Make it easy for people to run test suites and identify missing elements. For project management and the abstract non-code aspects, it’s all about empowerment. When you find someone who’s interested in doing these things, hold their hand for the first couple of times and then empower them as soon as you can. The more barriers you put up, the fewer people will want to contribute or continue contributing. So take down the barriers and open up the doors as fast as you can, while ensuring that they’re not going to ruin everything.
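As a rough illustration of the “edit this” button idea, a documentation build could generate a per-page link along these lines. This is a sketch, not how any particular documentation tool does it: it assumes GitHub's `/edit/<branch>/<path>` web editor URL layout, and the owner, repository, branch, and path values are placeholders.

```python
from urllib.parse import quote


def edit_url(owner: str, repo: str, branch: str, source_path: str) -> str:
    """Build a link to GitHub's web editor for one documentation source file."""
    # quote() escapes spaces and other unsafe characters but keeps "/" intact.
    return (
        f"https://github.com/{quote(owner)}/{quote(repo)}"
        f"/edit/{quote(branch)}/{quote(source_path)}"
    )


# Placeholder values; a real build would take these from the site's configuration.
print(edit_url("example-org", "example-project", "main", "docs/getting started.md"))
```

A template can then render this URL next to each page title so that a reader who spots a mistake is one click away from proposing a fix.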
Jordan: I try to slap ‘Help Wanted’ labels on as many things as I can. This serves as a signal that the maintainers want this change and that any attempts to fix it will be welcomed. If it doesn’t have the label, then you may need to read carefully to determine if it’s desirable. If someone asks about documentation, I’ll immediately slap a ‘Help Wanted’ label on it. On my own projects, any documentation I write will inevitably be tainted by my already knowing how it works. I think newcomers can write way better documentation that can then be reviewed by experts. As for attracting contributors, I don’t really know how to do that. They just show up sometimes, and it’s up to me to keep them around, which is difficult.
Mike Bayer: I haven’t found many ways to attract contributors. I reached out on Twitter one time asking for graphic design help and found someone to create some of the cartoon icons in our documentation. And I still reach out on Twitter and Mastodon for feedback on our documentation. We also attempted to do a sprint at PyCon many years ago to focus on documentation, but it didn’t really work out. It was my naivete in understanding the people who were participating in the sprint and what their interests were. New sprinters and more entry-level programmers who don’t have a pre-existing investment in your project are often fairly eager to see what it’s like to code on a big project. Think of it like this: If you had some visitors to a racetrack, they’d be mostly interested in going for a ride in the racecar and meeting the celebrities rather than organizing all the nuts and bolts in the pit. So, if you decide to organize a sprint, you have to think up front who is going to be in the sprint and what perspective they are coming from. Then when the sprint happens, if you get people who are not what you hoped for, you probably have to accommodate them and let go of whatever you were hoping for. Like so many other things in open source, sprints can end up not actually being helpful for the lead maintainer looking to lighten their workload.
Mike Melanson: Are there types of contributions you wish you’d get more of, and types you wish you’d get less?
Jordan: Really, I’d like to have more of almost everything. The only thing I’d want less of is people trying to update dependency versions, or PRs to remove test files in my published packages, as if I’d just forgotten to omit them from a project I’ve been maintaining for a decade. These types of presumptive requests can be frustrating, but if I had to pick, I’d rather get all of those than get less of the rest. It remains a challenge to get contributions. People who contribute as part of their job or those with a specific feature or bug in mind sometimes do contribute, but they don’t always stick around. Ultimately, what I wish for the most is more long-term contributions from individuals. I’m excited to add maintainers to my projects, but there are shockingly few people who have the stamina to stick around.
Thea: Documentation is always a great first contribution for anyone. Every project requires documentation, whether it’s updating existing documents or creating new ones. Automation is also a valuable contribution, particularly for newcomers who are just getting started with a project and might notice things that we, as maintainers, have become accustomed to and no longer notice. Automating certain tasks can make things easier for everyone. Any contribution that helps with active repository maintenance is always welcome. For instance, testing and eliminating technical debt, such as removing deprecated features, is an excellent contribution. Personally, I love it when I receive a pull request with nothing but deletions. If there are misspellings or sentences that make no sense, or if a sentence has serious grammatical errors, then feel free to submit a pull request to fix them. But please don’t make tiny edits, like removing a single comma or dash, for the sake of contributing. It’s not a good use of anyone’s time. One thing I’ve encountered surprisingly often is bug reports or issues that are incredibly long and detailed, as if they were academic papers. For example, they might examine a design and point out everything that is wrong with it. While they may be correct, the way they present the information can be condescending and unhelpful. I think it’s much better to provide feedback in a clear and concise manner that’s respectful of everyone’s time.
Mike Bayer: We get a lot of typo fixes, and maybe some grammatical ones, and those are all great, but people writing whole sections of documentation is rare. I think it’s happened maybe once in 20 years. If someone is really good at writing documentation and wants to come along and write some sections, that would be amazing. And like Jordan was saying, it’s really hard for developers to write docs, because you’ve been looking at the code for so long. Having another pair of eyes is crazy helpful for docs, and the same goes for reviewing patches. We have some people that help review patches now, and it’s super helpful. Beyond that, we use GitHub Discussions, and we’re always looking for more people to join the discussions and help each other out. Some people asking questions have never programmed before, or they just started a new job and they’re lost, and they need someone to be patient with them and help them, and I only have so much patience.
Mike Melanson: What about quality? Is there such a thing as poor quality contributions? If so, how do you ensure that you’re getting quality contributions?
Thea: As I mentioned earlier, poor quality contributions do exist, such as minor changes or issue reports that lack any useful information. These are what I’d call low quality contributions. But, in most cases, individuals interacting with your GitHub repository or community are attempting to make a useful contribution. As maintainers, it’s crucial to establish guidelines so that people can make a valuable contribution instead of wasting their time.
Mike Bayer: We use issue templates, pull request templates, extensive code style checks, formatters and tooling, continuous integration and code review tools to try to keep things high quality. There are several stages where we try to communicate to the contributor what we want them to do. They still don’t get it right most of the time, but it’s always a good sign when they try. When a contributor comes in and we can tell that they were trying to follow our guidelines, it makes our job easier, and we trust them a lot more.
Jordan: First of all, a poor quality contribution does not mean that the contributor is of poor quality. That’s a hard thing for us humans to separate, on both ends. Second, the reason for poor quality contributions can be any number of things. It could be poor quality because there are no tests, the documentation is insufficient, the API naming is unintuitive, or any other reason. It could also be poor quality because the fix they propose is absurd, even if the thing they want to fix is great. For maintainers, it can be a struggle when the rare contributor shows up. We want to encourage and invest time in them, but not everyone is interested in learning. It’s a trade-off. The more things that can be automated in continuous integration (CI), the better. Linting, tests, and code coverage are crucial because they can help catch errors and mistakes that contributors may miss. Even if the error messages aren’t super helpful, it feels better to have a machine tell you to fix something than to have a human do it, even if the human programmed the machine. This is just another weird aspect of human nature. To minimize the need for human intervention, you can do several things. You can create a CONTRIBUTING.md, have a code of conduct, use CI automation, and, as mentioned, use issue and PR templates to make things as clear as possible. These are all desperate attempts to add as many layers as possible to reduce the need for human molding, because interpersonal conflicts can come up when there’s too much human intervention. You want to give contributors the maximal opportunity to make the best possible contribution.
Mike Melanson: What one tip do you have for other maintainers who are just starting out?
Jordan: If you’re maintaining multiple projects, automate and consistify as much as possible. If you’re just maintaining one project, it’s more about setting expectations. You need to set expectations clearly, and stick to them, both in terms of responsiveness and backwards compatibility support. Both are important, but they change with your situation.
Mike Bayer: Remember that, while contributors are trying to be helpful, they’re probably not going to stick around, so whatever they’re giving you, that’s going to be yours. You need to know how it works. Some very good programmers will give you something very intricate, and it’s going to be hard to maintain. When it breaks, that’s going to be your problem.
Thea: Learn to say “no” often and politely, and sometimes assertively. Ultimately, this is your garden, right? Other people may come and want to plant things, but it’s really up to you to tend to it.
Do you have a burning question about Git, GitHub, or open source software development? Drop your question to us on your social channel of choice using #askRMP, and it may be answered by a panel of experts in an upcoming newsletter or a future episode of The ReadME Podcast!