Improved Commit Diffs
We recently rolled out a bunch of improvements to commit pages to make reviewing diffs a bit more pleasant.
Diffstats
Diffstat-style histograms of insertions and deletions for each file are now displayed on commit pages. This is useful for getting a high-level feel for the impact of a commit:
The diffstat display is similar in spirit to the output generated by git diff --stat: a number representing the total count of changed lines (insertions + deletions) followed by a simple visualization of the insertion-to-deletion ratio.
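As a rough illustration of that display, the sketch below builds a text version of one diffstat entry from a file's insertion and deletion counts. The function name diffstat_bar, the five-cell bar width, and the rounding behavior are assumptions made for the example; they are not taken from the commit page implementation or from git itself.

```python
def diffstat_bar(insertions: int, deletions: int, width: int = 5) -> str:
    """Return a diffstat-like string: total changed lines plus a +/- bar.

    Hypothetical helper for illustration only; width should be at least 2.
    """
    total = insertions + deletions
    if total == 0:
        return "0"
    # Scale the insertion share of the change to the bar width.
    plus = round(width * insertions / total)
    # Keep at least one cell for any side that has changes at all.
    if insertions and plus == 0:
        plus = 1
    if deletions and plus == width:
        plus = width - 1
    return f"{total} " + "+" * plus + "-" * (width - plus)


if __name__ == "__main__":
    print(diffstat_bar(8, 2))
    print(diffstat_bar(0, 3))
```

The leading number is the total of insertions and deletions, and the bar only conveys the ratio between the two, which is the same split of information described above.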
Rename Detection
Git doesn’t track file renames, but it does support heuristic detection of renamed files when performing diff and log operations. We’ve enabled it. The file list now displays a single line for renames instead of separate file add/remove lines:
While it’s nice to see renames reported as such in the file list, the larger benefit comes with the actual diff. Without rename detection, commits with even a small number of renamed files can generate large and noisy diffs. The entire file content is displayed twice: first with all deleted lines and then again with all added lines. These same diffs are reduced down to pure signal with rename detection enabled because only the lines modified between the two files are shown:
See the -M option to git-diff(1) for information on using rename detection from the command line.
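To make the idea of heuristic rename detection concrete, here is a minimal sketch that pairs deleted and added files when their contents are similar enough. It is not Git's algorithm: it uses Python's difflib as a stand-in similarity measure, and the function name detect_renames and its return shape are invented for the example. The 0.5 default mirrors the 50% similarity threshold that -M uses when no value is given.

```python
from difflib import SequenceMatcher


def detect_renames(deleted, added, threshold=0.5):
    """Pair deleted and added files whose line contents are similar.

    deleted and added map path -> file text. Returns a tuple of
    (renames, unmatched_deleted, unmatched_added), where each rename is
    (old_path, new_path, similarity). Illustrative only, not Git's method.
    """
    renames = []
    remaining = dict(added)  # added files not yet claimed by a rename
    unmatched_deleted = []
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in remaining.items():
            score = SequenceMatcher(
                None, old_text.splitlines(), new_text.splitlines()
            ).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is None:
            unmatched_deleted.append(old_path)
        else:
            renames.append((old_path, best_path, best_score))
            del remaining[best_path]
    return renames, unmatched_deleted, sorted(remaining)


if __name__ == "__main__":
    deleted = {"old.txt": "a\nb\nc\nd\n"}
    added = {"new.txt": "a\nb\nc\nx\n", "other.txt": "zzz\n"}
    print(detect_renames(deleted, added))
```

Once a pair is recognized as a rename, a diff between the old and new contents shows only the lines that actually changed, rather than the whole file twice.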
Added / Removed Files
Previously, files added or removed in a commit were shown in the file list at the top of commit pages but the actual diffs were omitted. This was a simple guard against Insanely Large Diffs That Crashed Browsers but had a few notable drawbacks:
- It was easy to miss important changes introduced by added or removed files when reviewing commits.
- It wasn’t possible to comment on specific lines in added or removed files.
- It didn’t always avoid large diffs. Consider cases like SQL database dumps where each line of a large generated file is modified as part of an otherwise tiny commit. Omitting added/removed files gave no guarantee that diffs would not exceed a reasonable size.
According to Aldo Cortesi’s GitHub project analysis, the average commit touches about 4 files and 19 lines of code. We felt that commit pages needed to do a better job showing all pertinent information on these common case commits, so from now on you’ll see diffs for added and removed files:
Large Diffs
Displaying added/removed files left the problem of how to deal with very large diffs. What we came up with is a set of rules for omitting portions of large diffs that ensures a sane upper bound on overall diff size. It works something like this:
- Diffs are not shown for any individual file with more than 300 changed lines (this includes modified files as well as added/removed files).
- No more than 150 total file diffs are displayed.
- No more than 3,000 total changed lines are shown across all diffs.
While we expect to tune these numbers over the coming weeks, the result so far has been diffs that show more of what you typically want to see and less of what you don’t.
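The three rules above can be sketched as a simple filter over the files in a commit. This is only one plausible reading: the order in which the limits are checked, the choice to stop entirely once the file or line budget is used up, and the names FileChange and select_diffs are all assumptions made for the example, not a description of the production code.

```python
from dataclasses import dataclass

# Limits quoted in the post; expected to be tuned over time.
MAX_LINES_PER_FILE = 300
MAX_FILES = 150
MAX_TOTAL_LINES = 3000


@dataclass
class FileChange:
    path: str
    changed_lines: int  # insertions + deletions


def select_diffs(files):
    """Return the paths whose diffs would be displayed, in input order.

    Hypothetical sketch: oversized files are skipped individually, and
    processing stops once the file-count or total-line budget is spent.
    """
    shown = []
    total = 0
    for f in files:
        if f.changed_lines > MAX_LINES_PER_FILE:
            continue  # rule 1: this file alone is too large
        if len(shown) >= MAX_FILES:
            break  # rule 2: file diff budget used up
        if total + f.changed_lines > MAX_TOTAL_LINES:
            break  # rule 3: total changed-line budget used up
        shown.append(f.path)
        total += f.changed_lines
    return shown


if __name__ == "__main__":
    commit = [
        FileChange("app.py", 10),
        FileChange("dump.sql", 5000),
        FileChange("schema.rb", 300),
    ]
    print(select_diffs(commit))
```

Because every shown file contributes at most 300 lines and the totals are capped, the overall diff size has a fixed upper bound no matter how large the commit is.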