Git 2.6, including flexible fsck and improved status


The open source Git project has just released Git 2.6.0. Here’s our take on its most useful new features:

git fsck flexibility

If you’re tired of having git fsck remind you of peccadillos from your distant past, read on. With Git 2.6, you can tell git fsck not to be quite so picky about minor data errors in your project’s history.

The oddly-named[1] git fsck command verifies the integrity of a Git repository, checking both that its data has not been corrupted and that its objects have the right format. You can even enable Git’s transfer.fsckObjects configuration variable to make Git apply the same checks to objects that are pushed into your repository. At GitHub, for example, we enable this feature on our servers to block malformed objects from entering your project’s history in the first place.
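As a minimal sketch, here is how you might turn that check on for a single repository. The throwaway repository created with mktemp is only there for illustration; in practice you would run the git config line inside your own repository (or with --global).

```shell
# Throwaway repository so the setting does not touch a real project.
repo=$(mktemp -d)
git init -q "$repo"

# Check incoming objects on both fetch and receive (push) in this repository.
git -C "$repo" config transfer.fsckObjects true

# Read the value back to confirm it was stored.
git -C "$repo" config --get transfer.fsckObjects
```

transfer.fsckObjects acts as the default for both fetch.fsckObjects and receive.fsckObjects, so you can set one of those instead if you only want checks in one direction.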

But what happens when you detect a problem in an object that is already part of your project’s history? You may recall that Git objects are content-addressed and immutable. Changing an object’s content, even to fix a syntactic error, changes its SHA-1 hash. And any commits which build on that object will need to update their references to it, changing their SHA-1 hashes, and so on.
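To see content addressing in action, the sketch below hashes two blobs that differ by a single character. The sample strings are arbitrary; the point is only that the two IDs differ.

```shell
# Without -w, hash-object only computes the object ID; nothing is stored.
a=$(printf 'hello\n' | git hash-object --stdin)
b=$(printf 'hellp\n' | git hash-object --stdin)

echo "$a"
echo "$b"
```

The same thing happens one level up: a commit records the IDs of its tree and parents, so a new ID anywhere underneath produces a new commit ID as well.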

Therefore, fixing a broken object requires all subsequent history to be rewritten. Sometimes it is worth doing. But sometimes, for mild forms of corruption in a repository that has already been cloned by many users, the repair process can be more trouble than it’s worth. Accepting the objects as they are can sometimes be the least bad solution. But until now, that meant turning off these fsck checks entirely, losing all of the extra protection.

With Git v2.6, there is a better option. You can now adjust the severity of particular fsck warnings, or even tell fsck to ignore warnings entirely for a particular object. To demonstrate, let’s create two commit objects with broken email addresses:

$ git cat-file commit HEAD | sed 's/</<</' | git hash-object --stdin -w -t commit
2fae34b45b7202796fbe07fdb73d47ba94af1878
$ git cat-file commit HEAD^ | sed 's/</<</' | git hash-object --stdin -w -t commit
b4a1fc6eed23c19fd5eed3f17d50b2a155d56aa9

git fsck rightly complains about the broken objects:

$ git fsck
error in commit 2fae34b45b7202796fbe07fdb73d47ba94af1878: badEmail: invalid author/committer line - bad email
error in commit b4a1fc6eed23c19fd5eed3f17d50b2a155d56aa9: badEmail: invalid author/committer line - bad email
Checking object directories: 100% (256/256), done.

If your history is riddled with invalid email addresses, you might decide to tell git fsck to ignore this type of corruption in the whole repository:

$ git config fsck.badEmail ignore
$ git fsck
Checking object directories: 100% (256/256), done.

On the other hand, perhaps you only want to tolerate the existing broken objects, but you still want git fsck to complain if any new ones enter your repository. If so, you can tell fsck “don’t give these objects any trouble; they’re with me.”

$ git config fsck.skiplist "$PWD/.git/skiplist"
$ echo 2fae34b45b7202796fbe07fdb73d47ba94af1878 >>.git/skiplist
$ echo b4a1fc6eed23c19fd5eed3f17d50b2a155d56aa9 >>.git/skiplist
$ git fsck
Checking object directories: 100% (256/256), done.

Since Git objects are immutable, you know that the objects you have vouched for cannot develop new problems—unless of course the object’s data gets corrupted on disk, in which case fsck will complain anyway, regardless of the skiplist.
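The fsck.<msg-id> settings accept error, warn, and ignore, so there is also a middle ground: keep the message but downgrade it to a warning. The sketch below is self-contained and builds its own throwaway repository; the identity and commit message are placeholders. It also passes --literally to git hash-object, which the commands earlier in this post do not use. That is an assumption on our part that newer Git versions may refuse to write a commit that fails these checks.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name='Example' -c user.email='example@example.com' \
    commit -q --allow-empty -m 'initial commit'

# Write a copy of HEAD with a doubled "<" in the author/committer lines.
# --literally (an assumption, see above) skips hash-object's own format checks.
bad=$(git cat-file commit HEAD | sed 's/</<</' |
      git hash-object --stdin -w -t commit --literally)

# Report badEmail as a warning rather than an error.
git config fsck.badEmail warn

# fsck writes its complaints to stderr; capture both streams.
fsck_output=$(git fsck 2>&1) || true
echo "$fsck_output"
```

With this setting the problem still shows up in the fsck output, which is handy if you want to keep an eye on it without treating it as a failure.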

git status more informative during interactive rebase

If you’ve ever used Git’s interactive rebase on a large series of commits, you know that it’s easy to get confused about where you are. The git status command used to report that you are rebasing, but not much more:

$ git status
rebase in progress; onto 1e7a542
You are currently editing a commit while rebasing branch 'master' on '1e7a542'.

nothing to commit, working directory clean

But as of v2.6, git status gives much more context about your progress through the rebase:

$ git status
interactive rebase in progress; onto 1e7a542
Last commands done (5 commands done):
   pick da6bc48 commit four
   edit 12ee9f3 commit five
Next commands to do (5 remaining commands):
   pick 8262b99 commit six
   squash 0f9e8ec commit seven
You are currently editing a commit while rebasing branch 'master' on '1e7a542'.

nothing to commit, working directory clean

log --date can use custom formats

Git’s --date option (and the matching log.date configuration variable) lets you format commit dates in a variety of standard formats, including rfc2822 and iso8601. But until now, you couldn’t invent your own formats. As of v2.6, Git supports the same formatting language as your system’s strftime(3) function. This lets you do silly things:

$ git log --date=format:"On the %d day of %B, my true git gave to me..."
commit bee53d49e0899e697f72ea3cee87427f319f5707
Author: Jeff King <peff@peff.net>
Date:   On the 05 day of April, my true git gave to me...

    ...a commit message in a pear tree.

But it also lets you take advantage of your system’s localized date code. For example, %c shows the time in your preferred local format:

$ git config log.date "format:%c"
$ export LANG=es_ES.UTF-8
$ git log
commit bee53d49e0899e697f72ea3cee87427f319f5707
Author: Jeff King <peff@peff.net>
Date:   dom 05 abr 2015 13:02:03 GMT

    ...a commit message in a pear tree
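For a more everyday use, here is a sketch that prints a compact, sortable timestamp next to each subject line. It builds a throwaway repository with a fixed commit date so the result is reproducible; the identity and the date are placeholders chosen for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Fixed dates so the output does not depend on when this is run.
GIT_AUTHOR_DATE='2015-04-05 13:02:03 +0000' \
GIT_COMMITTER_DATE='2015-04-05 13:02:03 +0000' \
git -c user.name='Example' -c user.email='example@example.com' \
    commit -q --allow-empty -m 'a commit message in a pear tree'

# %ad prints the author date using whatever --date specifies.
git log -1 --date=format:'%Y-%m-%d %H:%M' --pretty='format:%ad  %s'
```

Because the format string is handed to strftime(3), any conversion your system supports should work here, not just the ones shown.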

git log --cc now shows diffs by default

The --cc option (e.g., to git log) modifies the way Git displays the diffs for merge commits. But until now it did not actually tell Git to show the diff. Now --cc implies -p, so that you don’t have to specify the latter option separately.
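If you want to try this yourself, the sketch below builds a small throwaway repository with a conflicting merge and then asks for the combined diff. The file name, branch name topic, and identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name 'Example'
git config user.email 'example@example.com'

echo base >file.txt
git add file.txt
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

git checkout -q -b topic
echo topic >file.txt
git commit -q -a -m 'topic change'

git checkout -q "$main"
echo main >file.txt
git commit -q -a -m 'main change'

# The merge conflicts; resolve it by hand so the result differs from both parents.
git merge -q topic >/dev/null 2>&1 || true
printf 'main\ntopic\n' >file.txt
git add file.txt
git commit -q -m 'merge topic'

# As of v2.6 this shows the combined diff; older versions needed "-p --cc".
git log -1 --cc
```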

The rest of the iceberg

That’s just a peek at the improvements in Git v2.6. For the full list of changes, check out the release notes.

This Git release has 665 commits from 67 contributors. We’d especially like to shout out to Junio C Hamano, who recently celebrated his 10th year as Git’s maintainer. Thanks, everybody!


[1] The name isn’t so odd if you know the history. Unix systems have long had an fsck command, which does a File System ChecK. The original idea of git was to act as a content-addressable filesystem for versioned data. Thus, its checking tool was naturally named git-fsck.
