Highlights from Git 2.24

Image of Taylor Blau

The open source Git project just released Git 2.24 with features and bug fixes from over 78 contributors, 21 of them new. Here’s our look at some of the most exciting features and changes introduced since Git 2.23.

Feature macros

Since the very early days, Git has shipped with a configuration subsystem that lets you configure different global or repository-specific settings. For example, the first time you wrote a commit on a new machine, you might have been reminded to set your user.name and user.email settings[1] if you haven’t already.

It turns out, git config is used for many things ranging anywhere from the identity you commit with and what line endings to use all the way to configuring aliases to other Git commands and what algorithm is chosen to produce diffs.

Usually, configuring some behavior requires only a single configuration change, like enabling or disabling any of the aforementioned values. But what about when it doesn’t? What do you do when you don’t know which configuration values to change? For example, let’s say you want to live on the bleeding-edge of the latest from upstream Git, but don’t have a chance to discover all the new configurable options. In Git 2.24, you can now opt into feature macros—one Git configuration that implies many others. These are hand-selected by the developers of Git, and they let you opt into a certain feature or adopt a handful of settings based on the characteristics of your repository.

For example, let’s pretend that you have a particularly large repository, and you’re noticing some slow-downs. With enough searching, you might find that setting index.version to 4 could help, but discovering this can seem like a stretch. Instead, you can enable feature.manyFiles with:

$ git config feature.manyFiles true

Now you’re opted into the features that will make your experience with Git the smoothest it can be. Setting this signals to Git that you’re willing to adopt whichever settings Git developers feel can make your experience smoothest (right now, this means that the index.version and core.untrackedCache to enable path-prefix compression and the untracked cache, respectively). Not only that, but you can feel even better knowing that any new features in a release that might help your use case will be included in the macro.

[source]

Commit graphs by default

You may remember commit graphs, a feature that we have discussed in some of our previous highlights.  Since its introduction in Git 2.19, this feature has received a steady stream of attention. When enabled, and kept reasonably up to date, commit graphs can represent an order of magnitude improvement in the performance of loading commits.

Now in Git 2.24, commit graphs are enabled by default, meaning that your repository will see an improvement the next time you run git gc. Previously, this feature was an opt-in behind an experimental core.commitGraph configuration (as well as a handful of others), but after extensive testing[2], it’s ready for prime time.

Besides being the new standard, here are a few other changes to commit graphs:

  • All commit-graph sub-commands (e.g., git commit-graph write, git commit-graph verify, and so on) now have support for the -[no-]progress. Now the progress meters for these commands behave in the usual way: writing only to a terminal by default, and respecting -[no-]progress to override that.
  • A new configuration value to automatically update the commit-graph file while fetching has been introduced, which takes advantage of commit graph chains to write a portion of history onto a commit-graph chain for later compaction. This means that every time you get new commits from a remote, they are guaranteed to be in a commit-graph immediately, and you don’t have to wait around for the next auto-gc. To try this out today, set the fetch.writeCommitGraph configuration variable to true.
  • Lots of bug fixes to improve the performance and reliability of the commit-graph commands, especially when faced with corrupt repositories.
  • The commit-graph commands now also support Git’s latest tracing mechanism, trace2.

[source, source, source, source, source, source, source]

Adopting the Contributor Covenant

Since the last release, the Git project has discussed at length adopting a code of conduct to solidify welcoming and inclusive behavior on the mailing list where Git development takes place.

Because communication between Git developers happens over email, it can be intimidating or unwelcoming to new contributors who may not be familiar with the values of the people contributing to Git. The Git community has long relied on a policy of “be nice, as much as possible” (to quote this thread). This approach is in the right spirit, but it may not be readily apparent to new contributors unfamiliar with the existing culture. Likewise, it can make an individual feel uncomfortable engaging with a project that has not solidified its values.

By adopting a code of conduct, the Git project is making it clear which behaviors it encourages and which it won’t tolerate. New contributors are able to see explicitly what the project’s values are, and they can put their trust in Git’s choice of using the well-trusted and widely-adopted Contributor Covenant. This code of conduct is enforced by the project’s leadership, who will handle any case in which an individual does not adhere to the guidelines.

New contributors can be assured that the Git community is behind this adoption with the introduction of the Code of Conduct, Acked-by 16 prominent members of the Git community.

[source]

Alternative history rewriting tools

If you’ve ever wanted to perform a complicated operation over the history of your repository—like expunging a file from a repository’s history or extracting the history pertaining to just one directory—you may have visited the documentation for git filter-branch.

git filter-branch is a long-standing and powerful tool for rewriting history[3]. With git filter-branch, you can do all of those aforementioned operations and much more. However, this flexibility comes at a hefty cost: git filter-branch is notoriously complicated to use (not to mention slow), and can often lead its users towards unintended changes, including repository corruption and data loss.

In other words, git filter-branch is starting to show its age. Now, as of Git 2.24, the Git project instead recommends a new, independent tool, git filter-repo.

git filter-repo serves to avoid many of the pitfalls that users experienced with git filter-branch. Instead of reprocessing every commit in order, git filter-repo operates on an efficient, stream representation of history to run much faster. The tool is extremely powerful, and all of its capabilities are documented thoroughly. Here are a few highlights about how you can use git filter-repo:

  • git filter-repo --analyze provides a human-readable selection of metrics profiling the size of your repository. This includes how many objects of each kind there are, which files and directories are largest, which extensions take up the most space, and so on. And this isn’t your only option in this space. For additional metrics on the shape of your repository, check out another tool, git sizer.
  • You can also filter the history of your repository to contain only certain paths, with --path-{glob,regex} and similar options. [source]
  • Likewise, you can run a “find and replace” operation over history, as well as strip blobs that are larger than a fixed threshold. [source]
  • When rewriting history, any rewritten commits (along with their ancestors) will get a new SHA-1 to identify them. By default, git filter-repo updates all other references to these SHA-1s, like other commit messages that reference them. By a similar token, git filter-repo also has options to rewrite the names of contributors using .mailmap. [source, source]
  • Finally, git filter-repo is extensible. It provides a flexible interface for specifying callbacks in Python (e.g., calling a function when git filter-repo encounters a blob/tree/commit, new filetype, etc.), as well as defining new sub-commands entirely. View a portfolio of demo extensions, or define your own to support a complex history rewrite. [source]

git filter-branch will, however, remain included in the usual distributions of Git for some time. git filter-repo is another alternative for performing complex modifications to your repository’s history, and it’s now the official recommendation from upstream.

[source]

Tidbits

You might be aware that many Git commands take one or more optional reference names as arguments. For example, git log without arguments will display a log of everything that’s reachable from the currently checked-out branch, but git log my-feature ^master will show you only what’s on my-feature and not on master.

But what if your branch is called --super-dangerous-option, you probably don’t want to invoke git log since it’ll interpret the argument as an option, not a branch name. You could try and disambiguate by invoking git log 'refs/heads/--super-dangerous-option', but if you’re scripting, you may not know under what namespace the argument you’re getting belongs.

Git 2.24 has a new way to prevent this sort of option injection attack using --end-of-options. When Git sees this as an argument to any command, it knows to treat the remaining arguments as such, and won’t interpret them as more options.

So, instead of the string previously mentioned, you could write the following to get the history of your (admittedly, pretty oddly named) branch:

$ git log --end-of-options --super-dangerous-option

Not using the standard -- was an intentional choice here, since this is already a widely-used mechanism in Git to separate reference names from files. In this example, you could have also written git log --end-of-options --super-dangerous-option ^master -- path/to/file to get only the history over that range which modified that specific file.

[source]


In a previous post, we talked about git rebase‘s new --rebase-merges option, which allows users to perform rebases while preserving the structure of history. But when Git encounters a merge point how does it unify the two histories? By default, it uses a strategy known internally as “recursive”, which is most likely the merge strategy you’re currently using.

You might not know that you can tell Git which merge strategy to use, and picking one over the other may result in a different resolution[4].

Now, git rebase --rebase-merges supports the --strategy and --strategy-option options of git rebase, so you can rebase history while both preserving its structural integrity and specifying your own merge resolution strategy.

[source]


Git supports a number of hooks, which are specially-named executable files that Git will run at various points during your workflow. For example, a pre-push hook is invoked after running git push but before the push actually occurs and so on.

A new hook has been added to allow callers to interact with Git after a merge has been carried out, but before the resulting commit is written. To intercept this point, callers can place an executable file of their choice in .git/hooks/pre-merge-commit.

[source]


Git has learned a handful of new tricks since the last release to handle partial clones. For those who aren’t up to date on what partial cloning in Git looks like, here’s a quick primer. When cloning a repository, users can specify that they would only like some of its objects by using a filter. When doing so, the remote from which the user clones is designated as a “promisor,” meaning that it promises to send the remaining objects later on if the user requests them down the road.

Up until 2.24, Git only supported a single promisor remote, but this latest release now supports more than one promisor remote. This is especially interesting since it means that users can configure a handful of geographically “close” remotes, and not all of those remotes have to have all of the objects.

There’s more work planned in this area, so stay tuned for more updates on this feature in the future.

[source]


Last but not least, Git’s command-line completion engine has learned how to complete configuration variables on per-command configurations.

Git has a hierarchy of places where gitconfig files can be found: your repository (via .git/config), your home directory, and so on. But, Git also supports the top-level -c flag to specify configuration variables for each command. Here’s an example:

$ git -c core.autocrlf=false add path/to/my/file

This invocation will disable Git’s auto-CRLF conversion just for the duration of the git add (and any other internal commands that Git may run as a part of git add).

However, if you forgot the name of the variable that you were trying to set Git’s command-line completion engine learned how to provide a completion list of configuration variable names in Git 2.24. So, if you ever forget where you are in the middle of a -c ..., all you need to do is press tab.

[source]



[1] When you create a commit, these are the configuration values that Git is looking at to generate the signature; Git-parlance for the name/email-pair that is shown wherever an identity is expected.

[2] Including at GitHub, where we’ve used the commit-graph file behind the scenes since August to achieve speed-ups of over 50 percent on operations which traverse history.

[3] In fact, this tool appeared in git/git with 6f6826c52b which was first released in v1.5.3 over 12 years ago.

[4] For example, the patience merge strategy is widely regarded as one that will “move up” chunks of your diff (these are usually known as “hunks”) that look textually related (for example, closing function braces), but in fact produce awkward-looking diffs.

Learn more

That’s just a sample of changes from the latest version. Check out the release notes for 2.24 or any previous versions in the Git repository.

See what launched at GitHub Universe

Missed the main event? Learn more about everything that launched at GitHub Universe, from GitHub for mobile and a redesigned notifications experience to the GitHub Archive Program.

Read the day one keynote recap

Secure the world's code, together

On day two of GitHub Universe, we announced GitHub Security Lab, bringing together security researchers, maintainers, and companies across the industry to secure open source.

Read the day two keynote recap