jj `diff --stat` Output: Binary Files vs. Git's Visual Alignment
Let's dive into the visual representation of binary file changes in `diff --stat` output, specifically within the context of the jj version control system. This article breaks down the recent discussions and decisions surrounding how these changes are displayed, comparing jj's approach to that of Git, and exploring the nuances of creating a clear and consistent user experience. We'll examine the trade-offs between showing size deltas and before/after sizes, and how these choices impact readability. By the end, you'll have a solid grasp of the challenges involved in visualizing binary file differences and the rationale behind the current implementation in jj.
The Initial Change and its Output
The change in question stems from a fix implemented in commit 60a6e9421bc8d8f67d3bfa8778fdbde2398b3b86 to address issue #6865. This fix brought about a new output format for changes involving binary files. To illustrate, consider the following output snippet:
win32/dll/dinput.dll | (binary) +1024 bytes
win32/src/winapi/dinput/builtin.rs | 48 ++++++++++++++++++++++++++++++++++++++++++++----
win32/src/winapi/dinput/dinput.rs | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
Here, the binary file `win32/dll/dinput.dll` is represented by `(binary) +1024 bytes`, indicating a size increase of 1024 bytes. In contrast, the text files show the number of lines changed, followed by a bar of `+` and `-` characters for the added and removed lines.
Git's Approach to Binary File Diffs
Now, let's compare this to how Git handles the same scenario. Git renders the output like this:
win32/dll/dinput.dll | Bin 2560 -> 3584 bytes
win32/src/winapi/dinput/builtin.rs | 48 ++++++++++++++++++++++++++++++++++++++++++++----
Notice the difference? Git displays the file size before and after the change (`2560 -> 3584 bytes`), prefixed with the word `Bin`. This approach provides more information at a glance, but the question arises: is it the most consistent way to represent binary file changes alongside text file diffs?
The Rationale Behind Showing Size Deltas
The core argument in favor of showing only the size delta (e.g., `+1024 bytes`) is consistency. For text files, we typically see the number of changed lines, not the total number of lines before and after the change. Therefore, it seems logical to apply a similar principle to binary files, displaying only the change in size. This aligns with the overall goal of providing a concise and easily digestible summary of changes.
However, the discussion didn't end there. While the size delta approach offers consistency, there's a visual aspect to consider. Git's clever formatting, where the `Bin` text is padded to align with the text diff counts, contributes to a cleaner and more organized output. This brings us to the crux of the issue: how can jj achieve a similar level of visual harmony?
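The alignment idea can be sketched in a few lines: compute one width for the count column, then right-align either the line count or the word `Bin` inside it. This is an illustrative sketch, not Git's or jj's actual implementation; `render_stat` and its row format are hypothetical, and the `+`/`-` bar for text files is left out to keep the focus on the column.

```python
def render_stat(rows):
    """Render stat lines with a shared, right-aligned count column.

    rows: list of (path, change). For a text file, change is an int line
    count; for a binary file, it is an (old_size, new_size) tuple.
    """
    path_width = max(len(path) for path, _ in rows)
    # The count column is as wide as the widest line count, and never
    # narrower than the word "Bin" so binary rows fit in the same column.
    counts = [change for _, change in rows if isinstance(change, int)]
    count_width = max([len(str(c)) for c in counts] + [len("Bin")])

    lines = []
    for path, change in rows:
        if isinstance(change, int):
            cell = f"{change:>{count_width}}"
        else:
            old_size, new_size = change
            cell = f"{'Bin':>{count_width}} {old_size} -> {new_size} bytes"
        lines.append(f"{path:<{path_width}} | {cell}")
    return lines


rows = [
    ("win32/dll/dinput.dll", (2560, 3584)),
    ("win32/src/winapi/dinput/builtin.rs", 48),
    ("win32/src/winapi/dinput/dinput.rs", 53),
]
print("\n".join(render_stat(rows)))
```

Because `Bin` ends in the same column as the line counts, the variable-width size information falls to the right of that column, in the region the change bars would otherwise occupy.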
The Visual Noise Problem
The initial implementation in jj, while consistent in its display of size deltas, introduced a degree of visual clutter. Consider this example output generated using `diff -f zzzz` on the jj repository:
demos/demo_resolve_conflicts.sh | 49 +
demos/demo_working_copy.sh | 58 +
demos/git_compat.png | (binary) +77170 bytes
demos/git_compat.svg | 128 +
demos/helpers.sh | 49 +
demos/juggle_conflicts.png | (binary) +62497 bytes
demos/juggle_conflicts.svg | 122 +
demos/operation_log.png | (binary) +132128 bytes
demos/operation_log.svg | 176 +
The varying lengths of the byte counts for binary files disrupt the visual flow, making it harder to quickly scan the output. The `(binary) +[size] bytes` format, while informative, doesn't align as neatly with the line counts for text files. This is where the