Highlights from Git 2.24
Take a look at some of the new features in the latest Git release.
The open source Git project just released Git 2.24 with features and bug fixes from over 78 contributors, 21 of them new. Here’s our look at some of the most exciting features and changes introduced since Git 2.23.
Feature macros
Since the very early days, Git has shipped with a configuration subsystem that lets you configure different global or repository-specific settings. For example, the first time you wrote a commit on a new machine, you might have been reminded to set your user.name and user.email settings[1] if you hadn’t already.
It turns out that git config is used for many things, ranging from the identity you commit with and which line endings to use, all the way to configuring aliases for other Git commands and choosing the algorithm used to produce diffs.
Usually, configuring some behavior requires only a single configuration change, like enabling or disabling any of the aforementioned values. But what about when it doesn’t? What do you do when you don’t know which configuration values to change? For example, let’s say you want to live on the bleeding-edge of the latest from upstream Git, but don’t have a chance to discover all the new configurable options. In Git 2.24, you can now opt into feature macros—one Git configuration that implies many others. These are hand-selected by the developers of Git, and they let you opt into a certain feature or adopt a handful of settings based on the characteristics of your repository.
For example, let’s pretend that you have a particularly large repository, and you’re noticing some slow-downs. With enough searching, you might find that setting index.version to 4 could help, but discovering this can seem like a stretch. Instead, you can enable feature.manyFiles with:
$ git config feature.manyFiles true
Now you’re opted into the features that will make your experience with Git the smoothest it can be. Setting this signals to Git that you’re willing to adopt whichever settings the Git developers feel can improve your experience (right now, this means setting index.version and core.untrackedCache to enable path-prefix compression and the untracked cache, respectively). Not only that, but you can feel even better knowing that any new features in a release that might help your use case will be included in the macro.
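To make this concrete, here is a minimal sketch that opts a throwaway repository into the macro and reads it back. The temporary directory and variable names are only for illustration. It also sets index.version explicitly, on the understanding that feature macros change defaults, so a value you set yourself still wins.

```shell
# Work in a throwaway repository so no real configuration is touched.
repo=$(mktemp -d)
git init -q "$repo"

# Opt this repository into the macro.
git -C "$repo" config feature.manyFiles true

# The macro is stored like any other setting and can be read back.
macro=$(git -C "$repo" config --get feature.manyFiles)
echo "feature.manyFiles=$macro"

# An explicit setting can sit alongside the macro and is read back as written.
git -C "$repo" config index.version 2
index_version=$(git -C "$repo" config --get index.version)
echo "index.version=$index_version"
```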
[source]
Commit graphs by default
You may remember commit graphs, a feature that we have discussed in some of our previous highlights. Since its introduction in Git 2.19, this feature has received a steady stream of attention. When enabled, and kept reasonably up to date, commit graphs can represent an order of magnitude improvement in the performance of loading commits.
Now in Git 2.24, commit graphs are enabled by default, meaning that your repository will see an improvement the next time you run git gc. Previously, this feature was opt-in behind an experimental core.commitGraph configuration (as well as a handful of others), but after extensive testing[2], it’s ready for prime time.
Besides being the new standard, here are a few other changes to commit graphs:
- All commit-graph sub-commands (e.g., git commit-graph write, git commit-graph verify, and so on) now support --[no-]progress. The progress meters for these commands now behave in the usual way: writing only to a terminal by default, and respecting --[no-]progress to override that.
- A new configuration value to automatically update the commit-graph file while fetching has been introduced, which takes advantage of commit graph chains to write a portion of history onto a commit-graph chain for later compaction. This means that every time you get new commits from a remote, they are guaranteed to be in a commit-graph immediately, and you don’t have to wait around for the next auto-gc. To try this out today, set the fetch.writeCommitGraph configuration variable to true.
- Lots of bug fixes to improve the performance and reliability of the commit-graph commands, especially when faced with corrupt repositories.
- The commit-graph commands now also support Git’s latest tracing mechanism, trace2.
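If you want to try the commit-graph changes by hand, the following sketch may help. It uses a throwaway repository with a single placeholder commit (the identity and commit message are made up), turns on fetch.writeCommitGraph, and then writes and verifies a commit-graph with the progress meter silenced.

```shell
# Throwaway repository with one commit to index.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Ask Git to refresh the commit-graph on every fetch.
git -C "$repo" config fetch.writeCommitGraph true

# Write a commit-graph for everything reachable from refs, without progress output.
git -C "$repo" commit-graph write --reachable --no-progress

# Check the file that was just written.
git -C "$repo" commit-graph verify --no-progress && graph_ok=yes
echo "commit-graph verified: ${graph_ok:-no}"
```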
[source, source, source, source, source, source, source]
Adopting the Contributor Covenant
Since the last release, the Git project has discussed at length adopting a code of conduct to solidify welcoming and inclusive behavior on the mailing list where Git development takes place.
Because communication between Git developers happens over email, it can be intimidating or unwelcoming to new contributors who may not be familiar with the values of the people contributing to Git. The Git community has long relied on a policy of “be nice, as much as possible” (to quote this thread). This approach is in the right spirit, but it may not be readily apparent to new contributors unfamiliar with the existing culture. Likewise, it can make an individual feel uncomfortable engaging with a project that has not solidified its values.
By adopting a code of conduct, the Git project is making it clear which behaviors it encourages and which it won’t tolerate. New contributors are able to see explicitly what the project’s values are, and they can put their trust in Git’s choice of using the well-trusted and widely-adopted Contributor Covenant. This code of conduct is enforced by the project’s leadership, who will handle any case in which an individual does not adhere to the guidelines.
New contributors can be assured that the Git community is behind this adoption with the introduction of the Code of Conduct, Acked-by 16 prominent members of the Git community.
[source]
Alternative history rewriting tools
If you’ve ever wanted to perform a complicated operation over the history of your repository (like expunging a file from a repository’s history or extracting the history pertaining to just one directory), you may have visited the documentation for git filter-branch.
git filter-branch is a long-standing and powerful tool for rewriting history[3]. With git filter-branch, you can do all of those aforementioned operations and much more. However, this flexibility comes at a hefty cost: git filter-branch is notoriously complicated to use (not to mention slow), and can often lead its users towards unintended changes, including repository corruption and data loss.
In other words, git filter-branch is starting to show its age. Now, as of Git 2.24, the Git project instead recommends a new, independent tool, git filter-repo.
git filter-repo serves to avoid many of the pitfalls that users experienced with git filter-branch. Instead of reprocessing every commit in order, git filter-repo operates on an efficient, stream representation of history to run much faster. The tool is extremely powerful, and all of its capabilities are documented thoroughly. Here are a few highlights about how you can use git filter-repo:
- git filter-repo --analyze provides a human-readable selection of metrics profiling the size of your repository. This includes how many objects of each kind there are, which files and directories are largest, which extensions take up the most space, and so on. And this isn’t your only option in this space. For additional metrics on the shape of your repository, check out another tool, git sizer.
- You can also filter the history of your repository to contain only certain paths, with --path-{glob,regex} and similar options. [source]
- Likewise, you can run a “find and replace” operation over history, as well as strip blobs that are larger than a fixed threshold. [source]
- When rewriting history, any rewritten commits (along with their descendants) will get a new SHA-1 to identify them. By default, git filter-repo updates all other references to these SHA-1s, like other commit messages that reference them. By a similar token, git filter-repo also has options to rewrite the names of contributors using .mailmap. [source, source]
- Finally, git filter-repo is extensible. It provides a flexible interface for specifying callbacks in Python (e.g., calling a function when git filter-repo encounters a blob/tree/commit, new filetype, etc.), as well as defining new sub-commands entirely. View a portfolio of demo extensions, or define your own to support a complex history rewrite. [source]
git filter-branch will, however, remain included in the usual distributions of Git for some time. git filter-repo is another alternative for performing complex modifications to your repository’s history, and it’s now the official recommendation from upstream.
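As a rough illustration of path filtering, the sketch below builds a tiny throwaway repository (the file names and identity are made up) and keeps only the history under src/. Because git filter-repo is distributed separately from Git, the rewrite is skipped when the tool is not installed.

```shell
# Build a tiny throwaway repository with two top-level directories.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
git config user.name "Example"
git config user.email "example@example.com"
mkdir src docs
echo "code" > src/app.txt
echo "notes" > docs/readme.txt
git add .
git commit -q -m "Add src and docs"

# Only attempt the rewrite when git-filter-repo is on the PATH.
if command -v git-filter-repo >/dev/null 2>&1; then
    # Keep only history touching src/. --force is needed here because
    # this repository is not a fresh clone.
    git filter-repo --force --path src/
fi

# List what is tracked afterwards.
remaining=$(git ls-files)
echo "$remaining"
```

The other highlights follow the same shape: --analyze writes its report without rewriting anything, while --strip-blobs-bigger-than, --replace-text, and --mailmap each take an argument describing what to drop or rewrite. Try them on a fresh clone first, since the rewrite is destructive.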
[source]
Tidbits
You might be aware that many Git commands take one or more optional reference names as arguments. For example, git log without arguments will display a log of everything that’s reachable from the currently checked-out branch, but git log my-feature ^master will show you only what’s on my-feature and not on master.
But what if your branch is called --super-dangerous-option? You probably don’t want to pass that name to git log as-is, since Git will interpret the argument as an option, not a branch name. You could try to disambiguate by invoking git log 'refs/heads/--super-dangerous-option', but if you’re scripting, you may not know which namespace the argument you’re given belongs to.
Git 2.24 has a new way to prevent this sort of option injection attack using --end-of-options. When Git sees this as an argument to any command, it knows to treat the remaining arguments as non-option arguments (such as revisions), and won’t interpret them as more options.
So, instead of the string previously mentioned, you could write the following to get the history of your (admittedly, pretty oddly named) branch:
$ git log --end-of-options --super-dangerous-option
Not using the standard -- was an intentional choice here, since this is already a widely-used mechanism in Git to separate reference names from files. In this example, you could have also written git log --end-of-options --super-dangerous-option ^master -- path/to/file to get only the history over that range which modified that specific file.
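In a script, this usually means placing --end-of-options in front of any revision name you did not choose yourself. Here is a minimal sketch, assuming Git 2.24 or newer. The show_history helper and the throwaway repository are hypothetical, and the oddly named branch is created directly as a ref.

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Create the oddly named branch directly as a ref.
git -C "$repo" update-ref refs/heads/--super-dangerous-option HEAD

# Hypothetical helper: $1 is the repository, $2 is an untrusted revision name.
show_history() {
    git -C "$1" log --oneline --end-of-options "$2"
}

history=$(show_history "$repo" --super-dangerous-option)
echo "$history"
```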
[source]
In a previous post, we talked about git rebase’s new --rebase-merges option, which allows users to perform rebases while preserving the structure of history. But when Git encounters a merge point, how does it unify the two histories? By default, it uses a strategy known internally as “recursive”, which is most likely the merge strategy you’re currently using.
You might not know that you can tell Git which merge strategy to use, and picking one over the other may result in a different resolution[4].
Now, git rebase --rebase-merges supports the --strategy and --strategy-option options of git rebase, so you can rebase history while both preserving its structural integrity and specifying your own merge resolution strategy.
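Here is one way this could look in practice, sketched against a throwaway repository. The branch names, file names, and the add_commit helper are all made up for the example, and the strategy and strategy option shown are just one possible choice. It assumes Git 2.24 or newer.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: add one file and commit it.
add_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "Add $1"
}

add_commit base
trunk=$(git symbolic-ref --short HEAD)   # default branch name varies by setup

# Build a feature branch that contains a real merge commit.
git checkout -q -b feature
add_commit feature
git checkout -q -b side "$trunk"
add_commit side
git checkout -q feature
git merge -q --no-ff -m "Merge side into feature" side

# Move the trunk forward so there is something to rebase onto.
git checkout -q "$trunk"
add_commit trunk-update

# Rebase feature, recreating the merge with an explicit strategy and option.
git checkout -q feature
git rebase -q --rebase-merges --strategy=recursive --strategy-option=patience "$trunk"

merges=$(git rev-list --merges --count "$trunk"..feature)
echo "merge commits kept on feature: $merges"
```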
[source]
Git supports a number of hooks, which are specially-named executable files that Git will run at various points during your workflow. For example, a pre-push hook is invoked after you run git push but before the push actually occurs.
A new hook has been added to allow callers to interact with Git after a merge has been carried out, but before the resulting commit is written. To intercept this point, callers can place an executable file of their choice in .git/hooks/pre-merge-commit.
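As a sketch of what such a hook might contain, the script below installs a pre-merge-commit hook into a throwaway repository. The policy it enforces (refusing to record a merge while a file named MERGE_FREEZE exists) is invented purely for illustration. As with other pre-* hooks, exiting with a non-zero status stops the merge before the commit is created.

```shell
repo=$(mktemp -d)
git init -q "$repo"

hook="$repo/.git/hooks/pre-merge-commit"
mkdir -p "$repo/.git/hooks"

# Write the hook. The MERGE_FREEZE policy is a made-up example.
cat > "$hook" <<'EOF'
#!/bin/sh
# Refuse to record a merge while MERGE_FREEZE exists at the top of the
# working tree, which is where Git runs this hook from.
if [ -e MERGE_FREEZE ]; then
    echo "Merges are frozen; remove MERGE_FREEZE to continue." >&2
    exit 1
fi
EOF

# Git only runs hooks that are executable.
chmod +x "$hook"
```

The hook does not run for fast-forward merges, since no merge commit is written in that case.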
[source]
Git has learned a handful of new tricks since the last release to handle partial clones. For those who aren’t up to date on what partial cloning in Git looks like, here’s a quick primer. When cloning a repository, users can specify that they would only like some of its objects by using a filter. When doing so, the remote from which the user clones is designated as a “promisor,” meaning that it promises to send the remaining objects later on if the user requests them down the road.
Up until 2.24, Git only supported a single promisor remote, but this latest release now supports more than one promisor remote. This is especially interesting since it means that users can configure a handful of geographically “close” remotes, and not all of those remotes have to have all of the objects.
There’s more work planned in this area, so stay tuned for more updates on this feature in the future.
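For a sense of what the configuration side might look like, the sketch below marks two remotes as promisors in a throwaway repository. The remote names and URLs are placeholders, nothing is fetched, and in practice a partial clone sets this up for the first remote on your behalf.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Two hypothetical remotes; the names and URLs are placeholders.
git -C "$repo" remote add nearby https://example.com/nearby/project.git
git -C "$repo" remote add faraway https://example.com/faraway/project.git

# Mark each remote as a promisor and record the filter to use with it.
for name in nearby faraway; do
    git -C "$repo" config "remote.$name.promisor" true
    git -C "$repo" config "remote.$name.partialclonefilter" blob:none
done

nearby_promisor=$(git -C "$repo" config --get remote.nearby.promisor)
faraway_filter=$(git -C "$repo" config --get remote.faraway.partialclonefilter)
echo "nearby promisor: $nearby_promisor, faraway filter: $faraway_filter"
```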
[source]
Last but not least, Git’s command-line completion engine has learned how to complete configuration variable names when they are given as per-command configuration.
Git has a hierarchy of places where gitconfig files can be found: your repository (via .git/config), your home directory, and so on. But Git also supports the top-level -c flag to specify configuration variables for a single command. Here’s an example:
$ git -c core.autocrlf=false add path/to/my/file
This invocation will disable Git’s auto-CRLF conversion just for the duration of the git add (and any other internal commands that Git may run as a part of git add).
However, if you forget the name of the variable that you were trying to set, Git’s command-line completion engine can provide a completion list of configuration variable names as of Git 2.24. So, if you ever forget where you are in the middle of a -c ..., all you need to do is press tab.
[source]
[1] When you create a commit, these are the configuration values that Git looks at to generate the signature, which is Git parlance for the name/email pair that is shown wherever an identity is expected.
[2] Including at GitHub, where we’ve used the commit-graph file behind the scenes since August to achieve speed-ups of over 50 percent on operations which traverse history.
[3] In fact, this tool appeared in git/git with 6f6826c52b, which was first released in v1.5.3 over 12 years ago.
[4] For example, the patience option to the recursive strategy (passed with --strategy-option) is widely regarded as one that avoids mismatching chunks of your diff (these are usually known as “hunks”) that look textually related (for example, closing function braces), but would in fact produce awkward-looking results.
Learn more
That’s just a sample of changes from the latest version. Check out the release notes for 2.24 or any previous versions in the Git repository.