
Unified Git diff format

In document Mastering Git (Page 92-97)

Git, by default and in most cases, will show the changes in unified diff output format. Understanding this output is very important, not only when examining changes to be committed, but also when reviewing and examining changes (for example, in code review, or in finding bugs after git bisect has found the suspected commit).

You can request only statistics of changes with the --stat or --dirstat option, or just the names of the changed files with --name-only, or file names with the type of change with --name-status, or a tree-level view of changes with --raw, or a condensed summary of extended header information with --summary (see later for an explanation of what extended header means and what information it contains). You can also request a word diff, rather than a line diff, with --word-diff; this changes only the formatting of chunks of changes, while headers and chunk headers remain similar.

Diff generation can also be configured for specific files or types of files with appropriate gitattributes. You can specify an external diff helper, that is, the command that describes the changes, or you can specify a text conversion filter for binary files (you will learn more about this in Chapter 4, Managing Your Worktree).

If you prefer to examine the changes in a graphical tool (which usually provides a side-by-side diff), you can do so by using git difftool in place of git diff. This may require some configuration, and will be explained in Chapter 10, Customizing and Extending Git.

Let's take a look at an example of an advanced diff from the Git project history. Let's use the diff from the commit 1088261f from the git.git repository. You can view these changes in a web browser, for example, on GitHub; this is the third patch in this commit:

diff --git a/builtin-http-fetch.c b/http-fetch.c
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
index f3e63d7..e8f44ba 100644
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
@@ -1,8 +1,9 @@
 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+    const char *prefix;
     struct walker *walker;
     int commits_on_stdin = 0;
     int commits;
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv,
     int get_verbosely = 0;
     int get_recover = 0;
 
+    prefix = setup_git_directory();
+
     git_config(git_default_config, NULL);
 
     while (arg < argc && argv[arg][0] == '-') {

Let's analyze this patch line after line:

• The first line, diff --git a/builtin-http-fetch.c b/http-fetch.c, is a git diff header in the form diff --git a/file1 b/file2. The a/ and b/ filenames are the same unless a rename or copy is involved (as in our case), even if the file is added or deleted. The --git option means that the diff is in the git diff output format.

• The next lines are one or more extended header lines. The first three lines in this example tell us that the file was renamed from builtin-http-fetch.c to http-fetch.c and that these two files are 95% identical (this similarity score is what was used to detect the rename):

similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c

Extended header lines describe information that cannot be represented in an ordinary unified diff (except for the information that the file was renamed). Besides the similarity (or dissimilarity) score, as in this example, they can describe changes in file type (for example, from non-executable to executable).

• The last line in the extended diff header, which in this example is index f3e63d7..e8f44ba 100644, tells us about the mode of the given file (100644 means that it is an ordinary file and not a symbolic link, and that it doesn't have the executable permission bit set; ordinary file, executable file, and symbolic link are the only three file variants whose mode Git tracks), and about the shortened hashes of the pre-image (the version of the file before the given change) and the post-image (the version of the file after the change). This line is used by git am --3way to try to do a three-way merge if the patch cannot be applied cleanly on its own. For new files, the pre-image hash is 0000000; likewise, for deleted files, the post-image hash is 0000000.

• Next is the unified diff header, which consists of two lines:

--- a/builtin-http-fetch.c
+++ b/http-fetch.c

Compared to the diff -U result, it doesn't have from-file-modification-time or to-file-modification-time after the source (pre-image) and destination or target (post-image) filenames. If the file was created, the source would be /dev/null; if the file was deleted, the target would be /dev/null.

If you set the diff.mnemonicPrefix configuration variable to true, then instead of the a/ prefix for pre-image and b/ for post-image in this two-line header, Git uses the c/ prefix for commit, i/ for index, w/ for worktree, and o/ for object, to show what you are comparing.
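The header lines described so far are regular enough to be picked apart mechanically. The following is a minimal Python sketch, not part of Git itself; the function name parse_git_diff_header and the dictionary keys are made up for illustration, and it assumes paths without spaces, as in the example patch.

```python
import re

def parse_git_diff_header(lines):
    """Collect the header fields of one file's patch into a dict.

    Hypothetical helper for illustration; assumes paths contain no spaces.
    """
    info = {}
    for line in lines:
        if line.startswith("diff --git "):
            # "diff --git a/file1 b/file2" -> four space-separated fields
            _, _, old, new = line.split(" ")
            info["old_path"] = old[2:]  # drop the two-character a/ prefix
            info["new_path"] = new[2:]  # drop the two-character b/ prefix
        elif line.startswith("similarity index "):
            info["similarity"] = int(line.split()[-1].rstrip("%"))
        elif line.startswith("rename from "):
            info["rename_from"] = line[len("rename from "):]
        elif line.startswith("rename to "):
            info["rename_to"] = line[len("rename to "):]
        elif line.startswith("index "):
            # "index <pre-image hash>..<post-image hash> [<mode>]"
            m = re.match(r"index ([0-9a-f]+)\.\.([0-9a-f]+)(?: (\d+))?$", line)
            if m:
                info["pre_hash"], info["post_hash"], info["mode"] = m.groups()
        elif line.startswith("@@"):
            break  # the first hunk header ends the file header
    return info

example_header = [
    "diff --git a/builtin-http-fetch.c b/http-fetch.c",
    "similarity index 95%",
    "rename from builtin-http-fetch.c",
    "rename to http-fetch.c",
    "index f3e63d7..e8f44ba 100644",
    "--- a/builtin-http-fetch.c",
    "+++ b/http-fetch.c",
]
header_info = parse_git_diff_header(example_header)
```

For the example patch, header_info records the rename from builtin-http-fetch.c to http-fetch.c, the 95% similarity score, both shortened hashes, and the mode 100644. The two-line unified diff header is skipped here because it repeats the paths already taken from the first line.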

• Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks start with the line describing where the changes were in the file:

@@ -1,8 +1,9 @@

This line is in the format @@ from-file-range to-file-range @@. The from-file-range is in the form -<start line>,<number of lines>, and to-file-range is +<start line>,<number of lines>. Both start-line and number-of-lines refer to the position and length of the hunk in the pre-image and post-image, respectively. If number-of-lines is not shown, it means that it is 1.

In this example, the changes, both in pre-image (file before the changes) and post-image (file after the changes), begin at the first line of the file, and the fragment of code corresponding to this hunk of diff has 8 lines in pre-image and 9 lines in post-image (one line is added). By default, Git will also show up to three unchanged lines before and after the changes (three context lines). Git will also show the function where each change occurs (or equivalent, if any, for other types of files; this can be configured with .gitattributes); it is like the -p option in GNU diff:

@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv,

• Next is the description of where and how files differ. The lines common to both the files begin with a space (" ") indicator character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:

° +: A line was added here to the second file

° -: A line was removed here from the first file

Note that the changed line is denoted as removing the old version and adding the new version of the line.

In the plain word-diff format, instead of comparing file contents line by line, Git marks changed words: added words are surrounded by {+ and +}, and removed words by [- and -].

• If the last hunk includes, among its lines, the very last line of either version of the file, and that last line is incomplete (which means that the file does not end with an end-of-line character), you would find:

\ No newline at end of file
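The relationship between a hunk header and the indicator characters of the lines that follow it can be checked with a short script. This is a minimal Python sketch under stated assumptions: the names HUNK_RE, parse_hunk_header, and count_hunk_lines are hypothetical, an omitted line count is treated as 1 (as in GNU unified diff), and lines starting with a backslash, such as the "No newline at end of file" marker, are not counted.

```python
import re

# "@@ -<start>[,<count>] +<start>[,<count>] @@ optional function context"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count)."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError("not a hunk header: %r" % line)
    old_start, old_count, new_start, new_count = m.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,  # omitted count is 1
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )

def count_hunk_lines(body):
    """Count the lines a hunk body contributes to pre-image and post-image."""
    old = new = 0
    for line in body:
        if line.startswith("\\"):
            continue  # "\ No newline at end of file" belongs to neither side
        if line.startswith("-"):
            old += 1  # removed line: pre-image only
        elif line.startswith("+"):
            new += 1  # added line: post-image only
        else:
            old += 1  # context line: present in both versions
            new += 1
    return old, new

second_hunk_header = "@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv,"
second_hunk_body = [
    "     int get_verbosely = 0;",
    "     int get_recover = 0;",
    " ",
    "+    prefix = setup_git_directory();",
    "+",
    "     git_config(git_default_config, NULL);",
    " ",
    "     while (arg < argc && argv[arg][0] == '-') {",
]
old_start, old_count, new_start, new_count = parse_hunk_header(second_hunk_header)
counted = count_hunk_lines(second_hunk_body)
```

For the second hunk of the example, the six context lines plus the two added lines give 6 lines on the pre-image side and 8 on the post-image side, which matches the -18,6 +19,8 ranges in its header.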

So, for the example used here, the first chunk means that cmd_http_fetch was replaced by main and the const char *prefix; line was added:

 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+    const char *prefix;
     struct walker *walker;
     int commits_on_stdin = 0;
     int commits;

See how, for the replaced line, the old version of the line appears as removed (-) and the new version as added (+).

In other words, before the change, the appropriate fragment of the file, that was then named builtin-http-fetch.c, looked similar to the following:

#include "cache.h"
#include "walker.h"

int cmd_http_fetch(int argc, const char **argv, const char *prefix)
{
    struct walker *walker;
    int commits_on_stdin = 0;
    int commits;

After the change, this fragment of the file, which is now named http-fetch.c, looks similar to this instead:

#include "cache.h"
#include "walker.h"

int main(int argc, const char **argv)
{
    const char *prefix;
    struct walker *walker;
    int commits_on_stdin = 0;
    int commits;
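Deriving the "before" and "after" fragments from a hunk, as done by hand above, follows directly from the indicator characters. Here is a minimal Python sketch of that step; split_hunk is a hypothetical helper name, and the indentation of the sample lines is illustrative.

```python
def split_hunk(body):
    """Split hunk body lines into (pre_image_lines, post_image_lines)."""
    pre, post = [], []
    for line in body:
        marker, text = line[:1], line[1:]
        if marker == "\\":
            continue  # skip the "\ No newline at end of file" marker
        if marker != "+":
            pre.append(text)   # context and removed lines are in the old file
        if marker != "-":
            post.append(text)  # context and added lines are in the new file
    return pre, post

first_hunk_body = [
    ' #include "cache.h"',
    ' #include "walker.h"',
    " ",
    "-int cmd_http_fetch(int argc, const char **argv, const char *prefix)",
    "+int main(int argc, const char **argv)",
    " {",
    "+    const char *prefix;",
    "     struct walker *walker;",
    "     int commits_on_stdin = 0;",
    "     int commits;",
]
before, after = split_hunk(first_hunk_body)
```

Here before holds the 8 lines of the fragment as it was in builtin-http-fetch.c, and after holds the 9 lines as they are in http-fetch.c, in agreement with the @@ -1,8 +1,9 @@ hunk header.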
