The open source Git project just released Git 2.19, with features and bug-fixes from over 60 contributors. Here’s a look at some of the most interesting features introduced in the latest versions of Git.
You might have used git rebase, which is a powerful tool for rewriting history
by altering commits, commit order, or branch bases, to name a few. Many people
do this to “polish” a series of commits before proposing to merge them into a
project. But how can we visualize the differences between two sets of commits,
before and after a rebase?
We can use git diff to show the difference between the two end states, but
that doesn’t provide information about the individual commits. And if the base
on which the commits were built has changed, the resulting state might be
quite different, even if the changes in the commits are largely the same.
Git 2.19 introduces git range-diff, a tool for comparing two sequences of
commits, including changes to their order, commit messages, and the actual
content changes they introduce.
In this example, we rewrote a series of three commits, and compared the tips of
each version using git range-diff. The output shows that we moved the README.md
commit to be first instead of second, amended both the commit message and body
of the typo fix, and introduced a new commit.
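To try this yourself, here is a minimal sketch using a throwaway repository and a smaller, hypothetical two-commit series (the branch names base, topic-v1, and topic-v2 and the file contents are assumptions made up for illustration). It rebuilds the series in a different order with a typo fixed, then asks git range-diff to compare the two versions commit by commit.

```shell
#!/bin/sh
# Build a throwaway repository holding two versions of the same series.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Shared starting point for both versions (hypothetical content).
echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "initial commit"
git branch base

# Version 1: README first (with a typo), then the Makefile.
git checkout -q -b topic-v1 base
printf 'Example project\n\nThis is teh README.\nBuild with make.\n' > README.md
git add README.md
git commit -q -m "add README"
printf 'all:\n\tcc -o main main.c\n' > Makefile
git add Makefile
git commit -q -m "add Makefile"

# Version 2: the same series rebuilt on base, reordered, typo fixed.
git checkout -q -b topic-v2 base
printf 'all:\n\tcc -o main main.c\n' > Makefile
git add Makefile
git commit -q -m "add Makefile"
printf 'Example project\n\nThis is the README.\nBuild with make.\n' > README.md
git add README.md
git commit -q -m "add README"

# Compare base..topic-v1 against base..topic-v2.
out=$(git range-diff base topic-v1 topic-v2)
printf '%s\n' "$out"
```

Each output line pairs a commit from the first range with its counterpart in the second; the abbreviated hashes will differ on every run.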
git grep’s new tricks
When you search for a phrase using git grep, it’s often helpful to have
additional information pertaining to each match, such as its line number.
In Git 2.19 you can now also locate the first matching column of your query
with git grep --column.
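As a small sketch of what that looks like, the following creates a throwaway repository with one hypothetical file (file.txt and its contents are made up for the demonstration) and searches it with both line numbers and columns enabled.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# git grep searches tracked files, so the sample file is staged first.
printf 'first line\nhay needle hay\n' > file.txt
git add file.txt

# -n adds the line number; --column adds the 1-indexed column of the
# first match on that line.
out=$(git grep -n --column needle)
printf '%s\n' "$out"
```

On this sample file the result is file.txt:2:5:hay needle hay, that is, file name, line, column, then the matching line.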
If you’re using Vim, you can also try out git-jump, a Git add-on that
converts useful locations in your code to jump locations in your text editor.
git-jump can take you to merge conflicts, diff hunks, and now, exact grep
matches, using git grep --column.
git grep also learned the new -o option (meaning --only-matching). This is
useful if you have a non-trivial regular expression and want to gather only the
matching parts of your search.
For example, if you want to count all of the various ways that the Git source code spells “SHA-1” (e.g., “sha1”, “SHA1”, and so on):
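A sketch of one way to do this, run against a throwaway repository rather than the Git source tree (the sample file notes.txt and the pattern sha-?1 are assumptions chosen for illustration):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical sample text with several spellings.
printf 'uses sha1 and SHA1\nsee SHA-1 docs\nsha1 again\n' > notes.txt
git add notes.txt

# -o prints only the matched text, one match per output line.
# -E selects extended regular expressions so that "-?" means an
# optional hyphen.
counts=$(git grep -ohiI -E 'sha-?1' | LC_ALL=C sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

Piping through sort and uniq -c groups identical spellings together and prefixes each with its count; the final sort -rn puts the most common spelling first.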
(The other options, -hiI, are to omit the filename, search case-insensitively,
and ignore matches in binary files, respectively.)
The git branch command, like git tag (and their scriptable counterpart,
git for-each-ref), takes a --sort option to let you order the results by a
number of properties. For example, to show branches in the order of most
recent update, you could use git branch --sort=-authordate. But if you always
prefer that order, typing that sort option can get tiresome.
Now, you can use the branch.sort config to set the default ordering of
git branch output.
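A minimal sketch of the setting in action, using a throwaway repository with two branches named master and newest (the commit messages and dates are made up so that the ordering is predictable):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name

# Explicit, well-separated dates keep the ordering stable.
GIT_AUTHOR_DATE='2018-01-01 00:00:00 +0000' \
GIT_COMMITTER_DATE='2018-01-01 00:00:00 +0000' \
    git commit -q --allow-empty -m "old work"
git checkout -q -b newest
GIT_AUTHOR_DATE='2018-06-01 00:00:00 +0000' \
GIT_COMMITTER_DATE='2018-06-01 00:00:00 +0000' \
    git commit -q --allow-empty -m "new work"
git checkout -q master

# Default ordering (by refname), then the configured ordering.
before=$(git branch --format='%(refname:short)')
git config branch.sort -authordate
after=$(git branch --format='%(refname:short)')

printf 'default order:\n%s\n\nwith branch.sort:\n%s\n' "$before" "$after"
```

The --format option is used here only to print bare branch names; a plain git branch honors the same setting.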
Note that by default, git branch sorts by refname, hence master is first and
newest is last. In the above example, we tell Git that we would instead prefer
the most recently updated branch first, and the rest in descending order. Hence,
newest is first and master is last.
You might also want to try these other sorting options:
--sort=numparent shows merges by how awesome they are
--sort=refname sorts branches alphabetically by their name (this is the default, but may be useful to override in your configuration)
--sort=upstream sorts branches by the remote from which they originate
Git has always detected renamed files as part of merges. For example, if one
branch moves a file from A to B and another modifies content in A, the
resulting merge will apply that modification to the content’s new location
in B.
The same thing can happen with files in a directory. If one branch moves a
directory A to B but another adds a new file A/file, we can infer that the
file should become B/file when the two are merged. In Git 2.18, git merge
does this whenever rename detection is enabled (which it is by default).
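The directory case can be sketched in a throwaway repository as follows. The branch names move-dir and add-file are hypothetical. One assumption to flag: later Git releases gate this behavior behind a merge.directoryRenames setting, so the sketch passes it explicitly; versions that predate the setting ignore it.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Common ancestor: a directory A with one file.
mkdir A
echo one > A/one.txt
git add A
git commit -q -m "add A/one.txt"
git branch base

# One branch renames the directory A to B.
git checkout -q -b move-dir base
git mv A B
git commit -q -m "rename A to B"

# The other branch adds a new file under the old name.
git checkout -q -b add-file base
echo two > A/file
git add A/file
git commit -q -m "add A/file"

# Merge: the new file should follow the directory rename.
git checkout -q move-dir
git -c merge.directoryRenames=true merge -q --no-edit add-file
git ls-files
```

After the merge, the listing should contain B/file and B/one.txt, with nothing left under A.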
In Git v2.18, a remote code execution vulnerability was fixed, where an
attacker could execute scripts when the victim cloned with
--recurse-submodules. If you haven’t upgraded, please do! The fix was also
backported to v2.17.1, v2.16.4, v2.15.2, v2.14.4, and v2.13.7, so you’re safe
if you’re running one of those.
Have you ever run into a Git command line option that should have tab-completed but didn’t? Keeping these up to date has long been an annoying source of manual work for the project, but now the completion of options for most commands is generated automatically (along with the list of commands itself, the names of config options, and more).
GPG signing and verification of commits and tags has been extended to work
with gpgsm, which uses X.509 certificates instead of OpenPGP keys. These
certificates may be easier to manage for centralized groups (e.g., developers
working for a large enterprise).
To fetch a configuration variable with a “fallback” value, it’s common for
scripts to say git config core.myFoo || echo <default>. But that doesn’t
give Git the opportunity to interpret <default> for you. When it comes to
colors, this is especially important for instances where you ultimately need
the ANSI color code for, say, “bold red”, but don’t want to type it out by
hand. git config has long supported this with a special --get-color option, but
now there are options that can be applied uniformly to all types of config.
For example, git config --type=int --default=2M core.myInt will expand the
default to 2097152, and git config --type=expiry --default=2.weeks.ago
gc.pruneExpire consistently returns a number of seconds.
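The fallback behavior can be sketched in a throwaway repository. Note that core.myInt and core.myFlag are placeholder keys, not real Git settings; they are used only because they start out unset.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Neither key is set, so Git falls back to --default and still
# interprets the value according to --type.
int_value=$(git config --type=int --default=2M core.myInt)
bool_value=$(git config --type=bool --default=yes core.myFlag)

# Once the key is set, the configured value wins over the default.
git config core.myInt 1k
set_value=$(git config --type=int --default=2M core.myInt)

printf '%s\n%s\n%s\n' "$int_value" "$bool_value" "$set_value"
```

This prints 2097152, true, and 1024: the size suffixes and the boolean spelling are normalized by Git rather than by the calling script.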
Quick quiz: if git tag -l is shorthand for git tag --list, then what does
git branch -l do? If you thought, “surely it doesn’t list all branches”,
then congratulations: you’re a veteran Git user!
git branch -l has been used since 2006 to establish a reflog for a
newly created branch, something that you probably didn’t care about since it
became the default shortly after being introduced.
That usage has been deprecated (you will receive a warning if you use
git branch -l), thus clearing the way for git branch -l to mean
git branch --list.
In our last post, we discussed the new --color-moved option, which
(unsurprisingly) colors lines moved in a diff. The lines that were moved must
be identical, meaning that the feature would miss re-indented code unless you
specified a diff option such as --ignore-space-change. Keep in mind that
this option would affect the whole diff, potentially missing space changes
that you do care about. In Git 2.19, whitespace handling for move detection
can be configured independently of the rest of the diff.
Many of Git’s commands are colorized, like git status and others. Since 2.17,
a few more commands improved their support for colorization: git blame learned
to colorize lines, messages sent from a remote server are now colorized based
on their keyword (e.g., “error”, “warning”, etc.), and push errors are now
painted red.
If you’ve ever run git checkout with the name of a remote branch, you might
know that Git will automatically create a local branch that tracks the
remote one. However, if that branch name is found in more than one remote, Git
does not know which to use, and simply gives up.
In 2.19, Git learned the checkout.defaultRemote configuration, which
specifies a remote to default to when resolving such an ambiguity.
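A sketch of the ambiguity and its resolution, using local throwaway repositories. The remote name mirror and the branch name topic are hypothetical; the second remote simply points at the same stand-in server so that the branch exists in two places.

```shell
#!/bin/sh
set -e
top=$(mktemp -d)
cd "$top"

# A stand-in "server" repository with a branch named topic.
git init -q upstream
git -C upstream config user.name "Example User"
git -C upstream config user.email "user@example.com"
git -C upstream commit -q --allow-empty -m "initial commit"
git -C upstream branch topic

# Clone it, then add the same repository under a second remote name.
git clone -q upstream work
cd work
git remote add mirror "$top/upstream"
git fetch -q mirror

# topic now exists as origin/topic and mirror/topic, so a plain
# checkout cannot pick one.
if ! git checkout topic 2>/dev/null; then
    echo "plain checkout refused: topic exists on more than one remote"
fi

# Tell Git which remote wins, then try again.
git config checkout.defaultRemote origin
git checkout -q topic
git rev-parse --abbrev-ref 'topic@{upstream}'
```

The last command prints the upstream that the new local branch tracks, which should be origin/topic.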
Git interprets certain text encodings (e.g., UTF-16) as binary, meaning that
git diff will not show a textual diff. Normally it’s recommended to store
your text files as UTF-8, but this isn’t always possible if other tools
generate or expect another encoding.
You can now tell Git which encoding you prefer in your working tree on a
per-file basis by setting the working-tree-encoding attribute. This will
cause Git to store the files as UTF-8 internally, and convert them back to
your preferred encoding on checkout. The result looks good in git diff, as
well as on hosting sites.
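As a rough sketch, the attribute can be tried in a throwaway repository like this. It assumes the iconv command-line tool is installed and that your Git build includes iconv support; the file name note.txt and the *.txt pattern are made up for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Declare that .txt files are UTF-16 in the working tree.
echo '*.txt text working-tree-encoding=UTF-16' > .gitattributes

# Create a UTF-16 file, as another tool might.
printf 'hello\n' | iconv -f UTF-8 -t UTF-16 > note.txt
git add .gitattributes note.txt

# The staged blob holds the UTF-8 form, so it reads as plain text.
stored=$(git cat-file -p :note.txt)
printf '%s\n' "$stored"
```

The working tree copy stays in UTF-16, while the content Git stores (and diffs) is the UTF-8 text.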
Some features are so big that they’re developed over the course of several releases. We have historically avoided reporting on works in progress in these posts, since the features are often still experimental, or there’s nothing you can directly start using.
That said, some of the topics upstream around this release are too exciting to ignore! So, here’s an incomplete summary of what’s happening upstream:
An important part of Git’s decentralized design is that all clones receive the full history of the project, making all clones true peers of one another. When there aren’t a large number of objects in your repository, things go quickly, but at a certain size clones can become frustratingly slow.
There’s ongoing work to allow “partial” clones which omit some blob and tree
objects, in favor of requesting objects from the server as needed. You can see a
design overview of the feature, or even start experimenting yourself. Note
that most public servers do not yet support the feature, but you can play with
git clone --filter=blob:none against your local Git 2.19 install.
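One way to experiment locally is sketched below. The repository names server and partial are hypothetical. Two assumptions are worth flagging: the serving repository has to opt in via uploadpack.allowFilter, and the clone uses a file:// URL so that it goes through the regular fetch machinery instead of a plain local copy.

```shell
#!/bin/sh
set -e
top=$(mktemp -d)
cd "$top"

# A stand-in "server" repository with one file.
git init -q server
git -C server config user.name "Example User"
git -C server config user.email "user@example.com"
echo "some file content" > server/data.txt
git -C server add data.txt
git -C server commit -q -m "add data.txt"

# The serving side must allow filtered fetches.
git -C server config uploadpack.allowFilter true

# Clone without blobs; --no-checkout avoids fetching them right away
# to populate the working tree.
git clone -q --no-checkout --filter=blob:none "file://$top/server" partial

# Objects that are known but not downloaded are listed with a "?".
missing=$(git -C partial rev-list --objects --missing=print HEAD \
    | grep -c '^?' || true)
echo "missing objects: $missing"
```

Checking out files in the partial clone afterwards should fetch the missing blobs on demand.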
Git has a very simple data model: everything is an object named after the hash
of its contents, and objects point to each other by those names. Many operations
walk the graph formed by those pointers. For example, asking “which releases
contain this bug-fix” is really “which tag objects have a path to walk back to
commit X” (where X is the commit fixing the aforementioned bug).
Those walks have traditionally required loading each object from disk to find its pointers. But now Git can compute and store properties of each commit in a more efficient format, leading to significantly faster traversals. You can read more about it in a series of blog posts from the feature’s author.
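If you want to try it, a minimal sketch follows. It assumes the git commit-graph command with its --reachable flag and the core.commitGraph setting, in versions that provide them; the tag name v1.0 is made up.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# A short history with a tag at the tip.
for i in 1 2 3; do
    git commit -q --allow-empty -m "commit $i"
done
git tag v1.0

# Opt in to reading the file, then write it for all reachable commits.
git config core.commitGraph true
git commit-graph write --reachable
ls -l .git/objects/info/commit-graph

# A "which tags contain this commit" walk, as described above.
contains=$(git tag --contains HEAD~2)
printf '%s\n' "$contains"
```

On a three-commit history the speedup is invisible; the file matters on repositories with very long histories.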
Git still uses roughly the same protocol for fetching that was developed in 2005: after a client connects, the server dumps the current state of all branches and tags (called the “ref advertisement”), and then the client asks for the parts it needs to update. As repositories have grown, the cost of this advertisement has become a source of inefficiency.
The protocol has added new features over the years in a backwards-compatible way by negotiating capabilities between the server and client. But one thing that couldn’t be changed is the ref advertisement itself, because it happens before there’s a chance to negotiate.
Now there’s a new protocol which addresses this (and more), providing a way to transfer the advertisement more efficiently. Only a few servers support the new protocol so far, but you can read more about it in this blog post from its designer.
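On the client side, opting in is a configuration choice. The sketch below assumes the protocol.version setting and requests version 2 for a single command against a local stand-in repository (the names server and topic are hypothetical). A server that does not support the new protocol should simply fall back to the old one.

```shell
#!/bin/sh
set -e
top=$(mktemp -d)
cd "$top"

# A stand-in "server" repository with an extra branch.
git init -q server
git -C server config user.name "Example User"
git -C server config user.email "user@example.com"
git -C server commit -q --allow-empty -m "initial commit"
git -C server branch topic

# -c applies the setting to this one command only.
refs=$(git -c protocol.version=2 ls-remote "file://$top/server")
printf '%s\n' "$refs"
```

The listing looks the same as before; the difference is in how the client and server negotiate before the refs are sent.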
We mentioned earlier that all Git objects are named according to a hash of their contents. You might know that the algorithm that determines the value of that hash is SHA-1, which has not been considered safe for some time. In fact, a collision attack was discovered and published last year, which we wrote about in our post on its remediation.
Though SHA-1 collisions in Git are unlikely in practice, the Git project has decided to pick a new hashing algorithm and has made significant progress towards implementing it. Git has chosen SHA-256 as the successor to SHA-1, and is working through the transition plan to convert to it.