Eventually, any interesting software project will come to depend on another project, library, or framework. Git provides submodules to help with this. Submodules allow you to include or embed one or more repositories as a sub-folder inside another repository.
For many projects, submodules aren’t the best answer (more on this below), and even at their best, working with submodules can be tricky, but let’s start by looking at a straightforward example.
Adding a submodule
Let’s say you’re working on a project called Slingshot. You’ve got code for a
y-shaped-stick and a
rubber-band.
At the same time, in another repository, you’ve got another project called Rock—it’s just a generic
rock library, but you think it’d be perfect for Slingshot.
You can add
rock as a submodule of
slingshot. In the
slingshot repository, run:
git submodule add https://github.com/<user>/rock rock
At this point, you’ll have a
rock folder inside
slingshot, but if you were to peek inside that folder, depending on your version of Git, you might see … nothing.
Newer versions of Git will do this automatically, but older versions will require you to explicitly tell Git to download the contents of
rock:
git submodule update --init --recursive
If everything looks good, you can commit this change and you’ll have a
rock folder in the
slingshot repository with all the content from the
rock repository.
On GitHub, the
rock folder icon will have a little indicator showing that it is a submodule:
And clicking on the
rock folder will take you over to the
rock repository.
That’s it! You’ve embedded the
rock repository inside the
slingshot repository. You can interact with all the content from
rock as if it were a folder inside
slingshot (because it is).
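To see what adding a submodule actually records, here is a minimal sketch that can be run in a throwaway directory. It assumes local stand-in repositories instead of the GitHub URLs used above; the sandbox path, file contents, and commit messages are all made up for illustration.

```shell
set -eu
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the Rock repository, with a single commit.
git init -q "$work/rock"
echo "a generic rock" > "$work/rock/README"
git -C "$work/rock" add README
git -C "$work/rock" commit -q -m "Initial rock"

# The parent repository.
git init -q "$work/slingshot"
cd "$work/slingshot"
git commit -q --allow-empty -m "Initial slingshot"

# Recent Git versions refuse local-path submodules unless the file
# protocol is allowed explicitly; an https URL would not need this.
git -c protocol.file.allow=always submodule add "$work/rock" rock
git commit -q -m "Add rock submodule"

cat .gitmodules        # the submodule's path and URL
git submodule status   # the rock commit that slingshot records
```

The parent repository stores only two things: an entry in .gitmodules and a pointer to one specific commit of rock. The files inside rock stay in rock’s own history.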
On the command line, Git commands issued from
slingshot (or any of the other folders, such as
rubber-band and
y-shaped-stick) will operate on the “parent repository”,
slingshot, but commands you issue from the
rock folder will operate on just the
rock repository:
cd ~/projects/slingshot
git log # log shows commits from Project Slingshot
cd ~/projects/slingshot/rubber-band
git log # still commits from Project Slingshot
cd ~/projects/slingshot/rock
git log # commits from Rock
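If it is ever unclear which repository a command will act on, Git can report it. The sketch below assumes a local sandbox with made-up paths and file contents, and uses git rev-parse to print the repository root from an ordinary folder and from the submodule folder.

```shell
set -eu
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/rock"
echo "a generic rock" > "$work/rock/README"
git -C "$work/rock" add README
git -C "$work/rock" commit -q -m "Initial rock"

git init -q "$work/slingshot"
cd "$work/slingshot"
mkdir rubber-band
echo "stretchy" > rubber-band/band.txt
git add rubber-band
git commit -q -m "Add rubber-band"
# Local-path submodules need the file protocol allowed on recent Git.
git -c protocol.file.allow=always submodule add "$work/rock" rock
git commit -q -m "Add rock submodule"

# From an ordinary folder, the repository root is slingshot.
cd "$work/slingshot/rubber-band"
in_band=$(git rev-parse --show-toplevel)
echo "$in_band"

# From the submodule folder, the root is rock itself, and Git can
# also name the parent ("superproject") working tree.
cd "$work/slingshot/rock"
in_rock=$(git rev-parse --show-toplevel)
parent_of_rock=$(git rev-parse --show-superproject-working-tree)
echo "$in_rock"
echo "$parent_of_rock"
```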
Joining a project using submodules
Now, say you’re a new collaborator joining Project Slingshot. You’d start by running
git clone to download the contents of the
slingshot repository. At this point, if you were to peek inside the
rock folder, you’d see … nothing.
Again, Git expects us to explicitly ask it to download the submodule’s content. You can use
git submodule update --init --recursive here as well, but if you’re cloning
slingshot for the first time, you can use a modified
clone command to ensure you download everything, including any submodules:
git clone --recursive <project url>
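The difference between the two ways of joining a project can be sketched with local stand-in repositories. The paths and names below are hypothetical. The sketch uses --recurse-submodules, the option name in current Git documentation; --recursive is accepted as a synonym.

```shell
set -eu
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/rock"
echo "a generic rock" > "$work/rock/README"
git -C "$work/rock" add README
git -C "$work/rock" commit -q -m "Initial rock"

git init -q "$work/slingshot"
# Local-path submodules need the file protocol allowed on recent Git.
git -C "$work/slingshot" -c protocol.file.allow=always submodule add "$work/rock" rock
git -C "$work/slingshot" commit -q -m "Add rock submodule"

# A plain clone creates the rock folder but leaves it empty.
git clone -q "$work/slingshot" "$work/plain"
before=$(ls -A "$work/plain/rock")
echo "rock after plain clone: '$before'"

# Populate it after the fact.
git -C "$work/plain" -c protocol.file.allow=always submodule update --init --recursive
ls -A "$work/plain/rock"

# Or fetch the parent and its submodules in one step.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/slingshot" "$work/full"
ls -A "$work/full/rock"
```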
Switching to submodules
It can be a little tricky to take an existing subfolder and turn it into an external dependency. Let’s look at an example.
You’re about to start a new project, a magic roll-back can, which also needs a
rubber-band. Let’s take the
rubber-band you built for
slingshot, split it out into a stand-alone repository, and then embed it into both projects via submodules.
You can take everything from Project Slingshot’s
rubber-band folder and extract it into a new repository, and even maintain the commit history.
Let’s begin by extracting the contents of the
rubber-band folder out of
slingshot. You can use
git filter-branch to do this, leaving you with just the commits related to
rubber-band. The
git filter-branch command will rewrite our repository’s history, making it look as if the
rubber-band folder had been its own repository all along. For more information on
git filter-branch, see this article.
The first step is to make a copy of
slingshot to work on—the end-goal is for
rubber-band to stand as its own repository, so leave
slingshot as is. You can use
cp -r to recursively copy the entire
slingshot folder to a new folder,
rubber-band:
cd ..
cp -r slingshot rubber-band
It looks like
rubber-band is just another
slingshot, but now, from the
rubber-band repository, run
git filter-branch:
cd rubber-band
pwd # (double check before proceeding!)
git filter-branch --subdirectory-filter rubber-band -- --all
At this point, you’ll have a folder
rubber-band, which is a repository that sort of resembles Project Slingshot, but it only has the files and commit history from the
rubber-band folder.
Since you copied this from
slingshot, the new repository will still have any remote tracking branches you set up when it was
slingshot. You don’t want to push
rubber-band back onto
slingshot. You want to push this to a new repository.
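Because filter-branch rewrites history, it can be worth rehearsing the extraction on a scratch repository first. The sketch below builds a tiny hypothetical slingshot with three commits (file names and messages are invented), copies it, and runs the same filter, so you can check that only the rubber-band history survives. Current Git documentation also points to the separate git filter-repo tool as an alternative; it is not used here.

```shell
set -eu
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Skip the warning and pause that newer Git prints before filtering.
export FILTER_BRANCH_SQUELCH_WARNING=1

git init -q "$work/slingshot"
cd "$work/slingshot"
mkdir rubber-band y-shaped-stick
echo "stretchy" > rubber-band/band.txt
echo "wooden" > y-shaped-stick/stick.txt
git add .
git commit -q -m "Add stick and band"
echo "more stretchy" >> rubber-band/band.txt
git commit -q -am "Improve band"
echo "sanded" >> y-shaped-stick/stick.txt
git commit -q -am "Sand the stick"

# Work on a copy so the original slingshot stays untouched.
cd "$work"
cp -r slingshot rubber-band
cd rubber-band
git filter-branch --subdirectory-filter rubber-band -- --all

ls                  # band.txt now sits at the top level
git log --oneline   # only the commits that touched rubber-band
```

In this sketch the copy ends up with two commits, while the original slingshot keeps all three.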
Create a new repository for
rubber-band on GitHub, then update the remote for
rubber-band. Assuming you were calling the remote
origin, you could:
git remote set-url origin https://github.com/<user>/rubber-band
Then you can publish the new “generic rubber-band module” with
git push.
Now that you’ve separated
rubber-band into its own repository, you need to delete the old
rubber-band folder from the
slingshot repository:
git rm -r rubber-band
git commit -m "Remove rubber-band (preparing for submodule)"
Then update
slingshot to use
rubber-band as a submodule:
git submodule add https://github.com/<user>/rubber-band rubber-band
git commit -m "rubber-band submodule"
Like we saw before when we were adding
rock, we now have a repository-in-a-repository. Three repositories, in fact: the “parent” repository
slingshot, plus the two “sub” repositories,
rock and
rubber-band.
In addition, if we dive back into
slingshot’s history, we’ll see the commits we originally made into
rubber-band back when it was a folder; deleting the folder didn’t erase any of the history. This can sometimes be a little confusing: since the
rubber-band “child” repository has a copied-and-modified version of those old
slingshot commits, it can sometimes feel like you’re having déjà vu.
Unfortunately, any collaborator who pulls
slingshot at this point will have an empty
rubber-band folder. You might want to remind your collaborators to run this command to ensure they have all the submodule’s content:
git submodule update --init --recursive
You’ll also want to add the
rubber-band submodule to
magic roll-back can. Luckily, all you need to do is follow the same procedure you used earlier when you added
rock to
slingshot, in “Adding a submodule.”
cd ~/projects/roll-back-can
git submodule add https://github.com/<user>/rubber-band rubber-band
git commit -m "rubber-band submodule"
git submodule update --init --recursive
Advice on using submodules (or not)
- Remember that Git doesn’t download submodule contents by default. If you’re adding a submodule to an existing project, make sure anyone who works on the project knows they need to run commands like
git submodule update and
git clone --recursive to ensure they get everything. This includes any automated deployment or testing service that might be involved in the project! We recommend you use something like our “Scripts to Rule Them All” to ensure that all collaborators and services have access to the same repository content everywhere.
- Submodules require you to carefully balance consistency and convenience. The setup used here strongly prefers consistency, at the cost of a little convenience. It’s generally best to have a project’s submodules locked to a specific SHA, so all collaborators receive the same content. But this setup also makes it difficult for developers in the “parent” repository to contribute changes back to the submodule repository.
- Remember that collaborators won’t automatically see updates to submodules. If you update a submodule, you may need to remind your colleagues to run
git submodule update or they will likely see odd behavior.
- Managing dynamic, rapidly evolving or heavily co-dependent repositories with submodules can quickly become frustrating. This post was focused on simple, relatively static parent-child repository relationships. A future follow-up post will detail some strategies to help manage more complex submodule workflows.
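The points above about pinned SHAs and stale checkouts can be seen in a small local experiment. This is only a sketch under assumptions: the repositories, the colleague clone, and all paths are hypothetical stand-ins, and the maintainer moves the submodule pointer by pulling inside rock and committing the result in slingshot.

```shell
set -eu
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/rock"
echo "a generic rock" > "$work/rock/README"
git -C "$work/rock" add README
git -C "$work/rock" commit -q -m "Initial rock"

git init -q "$work/slingshot"
cd "$work/slingshot"
# Local-path submodules need the file protocol allowed on recent Git.
git -c protocol.file.allow=always submodule add "$work/rock" rock
git commit -q -m "Add rock submodule"
git ls-tree HEAD rock   # mode 160000, type "commit": the pinned SHA

# A colleague clones the project, submodules included.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/slingshot" "$work/colleague"

# Rock gains a commit; the maintainer moves slingshot's pointer to it.
echo "sharper" >> "$work/rock/README"
git -C "$work/rock" commit -q -am "Sharpen rock"
git -C rock pull -q --ff-only
git add rock
git commit -q -m "Update rock submodule"

# The colleague pulls the parent only; their rock checkout is now stale.
cd "$work/colleague"
git pull -q --ff-only --no-recurse-submodules
before=$(git submodule status)
echo "$before"          # leading "+": checkout differs from the recorded SHA

git -c protocol.file.allow=always submodule update --init --recursive
git submodule status    # checkout matches the recorded SHA again
```

Until the colleague runs git submodule update, their working copy of rock is one commit behind what slingshot expects, which is the kind of odd behavior the advice warns about.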