Motivation

The default merge method for GitHub pull requests creates a merge commit. GitHub also offers other merge methods, and “Squash and merge” is one of them. Because a squash-merged commit is a new commit that is not the same as any of the commits on the local branch, git branch --merged does not list a local branch that was squash-merged. Therefore, we cannot clean up outdated local branches with the following commands.

$ git fetch --prune
$ git branch --merged | grep -Ev "(^\*|master|ANY_BRANCH_YOU_WANT_TO_EXCLUDE)" | xargs -I % git branch -d %

These commands prune remote-tracking branches whose branches were deleted on GitHub and remove the merged local branches, but they only work when pull requests are merged with the default merge commits.

To clean up a local branch whose PR was squash-merged, I use git cherry, git merge-base, and git commit-tree. Because I rarely use these commands in my day-to-day development, I want to explain how they work.

Goal

  • Explain git cherry, git-merge-base, and git-commit-tree
  • Share the shell script to clean up obsolete branches from my local git

Strategy to detect squash-merged branch

Let’s say we have a topic branch A, and A has been merged into master on GitHub with the “Squash and merge” method. In that situation, we can detect the squash-merged branch in the local repository with the steps below.

  1. Find the parent commit (common ancestor) of master and topic A.
  2. Create a temporary commit that squashes everything from the parent commit up to the latest commit of A.
  3. Check whether an equivalent of the squashed commit has already been applied to master.
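
The three steps can be sketched as a small shell function. This is only a minimal sketch: the function name is_squash_merged and its two arguments are hypothetical and do not appear in the script later in this article, although the git commands are the same ones explained in the following sections.

```shell
# Hypothetical helper: succeeds (exit status 0) when <branch> looks
# squash-merged into <base>, and fails otherwise.
is_squash_merged() {
  base=$1
  branch=$2

  # Step 1: find the common ancestor of the two branches.
  ancestor=$(git merge-base "$base" "$branch") || return 1

  # Step 2: create a temporary commit that has the final tree of <branch>
  # and the ancestor as its only parent. No branch is moved by this.
  squashed=$(git commit-tree "$(git rev-parse "$branch^{tree}")" \
    -p "$ancestor" -m "temporary squashed commit") || return 1

  # Step 3: a leading "-" means <base> already contains an equivalent change.
  case $(git cherry "$base" "$squashed") in
    "-"*) return 0 ;;
    *)    return 1 ;;
  esac
}
```

Inside a repository, `is_squash_merged master A && echo "A can be deleted"` would print the message only when master already contains the changes of A.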

Verification of the strategy

Based on the strategy described above, I will walk through each step and show how we can detect the squash-merged branch in my local repository.

Preparation before experiment

The repository I use during the experiment is prepared with the following commands. First, I initialize a git repository and add the first two commits.

$ mkdir remote-squash-merge-branch
$ cd remote-squash-merge-branch
$ git init
$ echo "1" > README.md
$ git add README.md
$ git commit -m "Initial commit"
$ echo "parent commit for master and A" >> README.md
$ git commit -am "parent commit for master and A"

$ git log --oneline
546d16d (HEAD -> master) parent commit for master and A
6669578 Initial commit

Secondly, a topic branch named A is created and two commits are added.

$ git checkout -b A
$ echo "First commit with topic branch A" >> README.md
$ git commit -am "First commit with topic branch A"
$ echo "Second commit with topic branch A" >> README.md
$ git commit -am "Second commit with topic branch A"

We can check the current state of the commit graph with git log --graph. From the graph below, we can confirm that A is two commits ahead of the master branch.

$ git log --graph --date-order --all --pretty=format:'%h %Cred%d %Cgreen%ad %Cblue%cn %Creset%s' --date=short
* 709b12e  (HEAD -> A) 2020-04-18 Takayuki WATANABE Second commit with topic branch A
* 6a55918  2020-04-18 Takayuki WATANABE First commit with topic branch A
* 546d16d  (master) 2020-04-18 Takayuki WATANABE parent commit for master and A
* 6669578  2020-04-18 Takayuki WATANABE Initial commit

Thirdly, we have to push master and A to a remote repository and create a pull request for A.

$ git remote add origin git@github.com:takanabe/remote-squash-merged-branch.git
$ git push origin master
$ git push origin A

Finally, hit “Squash and merge” on GitHub, then check out the local master branch and merge the squash-merged commit into it.

$ git checkout master
$ git fetch --all
$ git merge origin/master

The following graph shows the history I use for the experiment. As you can see, the local master includes the squash-merged commit 2b3d4ae. The common ancestor of the master and A branches is commit 546d16d.

$ git log --graph --date-order --all --pretty=format:'%h %Cred%d %Cgreen%ad %Cblue%cn %Creset%s' --date=short
* 2b3d4ae  (HEAD -> master, origin/master) 2020-04-18 GitHub A (#1)
| * 709b12e  (origin/A, A) 2020-04-18 Takayuki WATANABE Second commit with topic branch A
| * 6a55918  2020-04-18 Takayuki WATANABE First commit with topic branch A
|/
* 546d16d  2020-04-18 Takayuki WATANABE parent commit for master and A
* 6669578  2020-04-18 Takayuki WATANABE Initial commit
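
If you would rather reproduce this history without a GitHub repository, the sketch below approximates it in a throwaway directory. It rests on a few assumptions that are not part of the original experiment: git merge --squash followed by git commit stands in for GitHub's “Squash and merge” button, the author identity is a demo value, and the commit hashes will differ from the ones shown in this article.

```shell
# Demo identity so that commits succeed in a clean environment (assumption).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master

echo "1" > README.md
git add README.md
git commit -q -m "Initial commit"
echo "parent commit for master and A" >> README.md
git commit -q -am "parent commit for master and A"

git checkout -q -b A
echo "First commit with topic branch A" >> README.md
git commit -q -am "First commit with topic branch A"
echo "Second commit with topic branch A" >> README.md
git commit -q -am "Second commit with topic branch A"

# Local stand-in for "Squash and merge" on GitHub.
git checkout -q master
git merge -q --squash A
git commit -q -m "A (#1)"

# A is not reported as merged, although master contains all of its changes.
merged=$(git branch --merged master --format='%(refname:short)')
echo "$merged"
```

Only master is listed by the last command, which is exactly the problem described in the motivation.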

Step 1. Find a parent commit for master and topic A

All right, let’s start the experiment. As the first step, we need to find the parent commit for master and topic A, whose commit hash is 546d16d. There is a handy git sub-command called merge-base. As its name suggests, git merge-base finds the best common ancestor(s) of master and A to use for a merge.

$ git merge-base master A
546d16de58d48d045a24649817ab9e0e256d4f7a
$ ANCESTOR=`git merge-base master A`

$ git show --oneline $ANCESTOR
546d16d parent commit for master and A
diff --git a/README.md b/README.md
index d00491f..817a93f 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,2 @@
 1
+parent commit for master and A

After running ANCESTOR=`git merge-base master A`, the variable ANCESTOR holds the commit hash of the ancestor commit.

Step 2. Create a temporary squashed commit

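The next step is creating a temporary commit that contains all the changes of A squashed into a single commit. While creating it, we don’t want to modify the commits or the tree of A. git commit-tree creates a new commit object from a tree object and a parent commit object; here we pass the tree of A and use the ancestor commit as the parent. Note that, despite its name, the variable TEMP_TREE below holds the hash of the new commit object, not of a tree object.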

$ TEMP_TREE=`git commit-tree $(git rev-parse A^{tree}) -p $ANCESTOR -m "temporary tree object"`

$ git log --oneline $TEMP_TREE

be75fbc temporary tree object
546d16d parent commit for master and A
6669578 Initial commit

$ git show --oneline $TEMP_TREE

be75fbc temporary tree object
diff --git a/README.md b/README.md
index 817a93f..e5f7cf8 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,4 @@
 1
 parent commit for master and A
+First commit with topic branch A
+Second commit with topic branch A

As shown above, commit-tree does not change existing commit or tree objects; it creates a new commit object instead. Because no branch points to the new commit, you cannot find it in the graph, even with --all.

$ git log --graph --date-order --all --pretty=format:'%h %Cred%d %Cgreen%ad %Cblue%cn %Creset%s' --date=short
* 2b3d4ae  (HEAD -> master, origin/master) 2020-04-18 GitHub A (#1)
| * 709b12e  (origin/A, A) 2020-04-18 Takayuki WATANABE Second commit with topic branch A
| * 6a55918  2020-04-18 Takayuki WATANABE First commit with topic branch A
|/
* 546d16d  2020-04-18 Takayuki WATANABE parent commit for master and A
* 6669578  2020-04-18 Takayuki WATANABE Initial commit

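If you pass the temporary commit hash to git log, you can confirm that the commit exists. (The output below was captured after the extra commit 9f8364a, which is added to master in the additional experiment later in this article.)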

$ git log --graph --date-order --all --pretty=format:'%h %Cred%d %Cgreen%ad %Cblue%cn %Creset%s' --date=short $TEMP_TREE

* 9f8364a  (HEAD -> master) 2020-04-18 Takayuki WATANABE One step forward from squash-merged commit
| * be75fbc  2020-04-18 Takayuki WATANABE temporary tree object <- This is the temporary commit object
* | 2b3d4ae  (origin/master) 2020-04-18 GitHub A (#1)
|/
| * 709b12e  (origin/A, A) 2020-04-18 Takayuki WATANABE Second commit with topic branch A
| * 6a55918  2020-04-18 Takayuki WATANABE First commit with topic branch A
|/
* 546d16d  2020-04-18 Takayuki WATANABE parent commit for master and A
* 6669578  2020-04-18 Takayuki WATANABE Initial commit
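
Another way to see what commit-tree produced is to inspect the new object directly. The following is a minimal sketch in a throwaway repository; the directory, the demo identity, the commit messages, and the variable names (TEMP_COMMIT, A_BEFORE, A_AFTER, REFS_AT_TEMP) are assumptions made for illustration only.

```shell
# Demo identity so that commits succeed in a clean environment (assumption).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > README.md
git add README.md
git commit -q -m "base"
git checkout -q -b A
echo "topic" >> README.md
git commit -q -am "topic change"

ANCESTOR=$(git merge-base master A)
A_BEFORE=$(git rev-parse A)
TEMP_COMMIT=$(git commit-tree "$(git rev-parse 'A^{tree}')" -p "$ANCESTOR" -m "temporary commit")

# The raw commit object: its "tree" line is the tree of A and its
# "parent" line is the ancestor commit.
git cat-file -p "$TEMP_COMMIT"

# Nothing else changed: A still points at the same commit, and no ref
# points at the temporary commit.
A_AFTER=$(git rev-parse A)
REFS_AT_TEMP=$(git for-each-ref --points-at "$TEMP_COMMIT")
```

Since no ref points at the temporary commit, it is an unreachable object that git can garbage-collect later, so there is nothing to clean up by hand.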

Step 3. Validation of squash-merged branch

Finally, we can compare the current master with the temporary commit created in step 2. Since a squash creates a new commit with a new commit hash, comparing commit hashes is of no use. However, because the temporary commit contains all the changes of A in one commit, we can verify the squash merge on GitHub by checking whether the master branch includes a commit with the same changes as the temporary commit.

git cherry lets us determine whether master has an equivalent commit. Given git cherry <upstream> <head>, the command prints the SHA1 of every commit in <upstream>..<head>, prefixed with - if an equivalent change exists in <upstream> and with + otherwise. Therefore, when the temporary commit is equivalent to one of the commits in master, the command outputs its SHA1 prefixed with -.

$ git cherry master $TEMP_TREE

- be75fbc3b3e3136e622b44936c391fc0f7252c33

Cool. We have now verified that the master branch includes the commit squash-merged on GitHub.
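
git cherry judges equivalence by the diff a commit introduces rather than by its hash. The git patch-id command exposes a similar fingerprint, so the match can be checked by hand. The sketch below uses a throwaway repository in which git merge --squash stands in for GitHub's “Squash and merge”; the demo identity, commit messages, and variable names are assumptions for illustration.

```shell
# Demo identity so that commits succeed in a clean environment (assumption).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > README.md
git add README.md
git commit -q -m "base"
git checkout -q -b A
echo "first" >> README.md
git commit -q -am "first"
echo "second" >> README.md
git commit -q -am "second"
git checkout -q master
git merge -q --squash A          # local stand-in for "Squash and merge"
git commit -q -m "A (#1)"

ANCESTOR=$(git merge-base master A)
TEMP_COMMIT=$(git commit-tree "$(git rev-parse 'A^{tree}')" -p "$ANCESTOR" -m "temporary commit")

# The two commits have different hashes ...
echo "master: $(git rev-parse master)"
echo "temp:   $TEMP_COMMIT"

# ... but the same patch ID, because both introduce the same diff.
id_master=$(git show master | git patch-id | cut -d' ' -f1)
id_temp=$(git show "$TEMP_COMMIT" | git patch-id | cut -d' ' -f1)
echo "patch id (master): $id_master"
echo "patch id (temp):   $id_temp"
```

The commit message, author, and date do not influence the patch ID, which is why the commit authored by GitHub and the local temporary commit still match.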

Additional experiment

git cherry printed the SHA1 with a - prefix. Naturally, we can expect a different prefix when the compared commits are not equivalent. Let’s add one more commit to master and run git cherry with the arguments swapped, so that the commits on master are the ones listed.

$ echo "One step forward from squash-merged commit" >> README.md
$ git commit -am "One step forward from squash-merged commit"

$ git log --graph --date-order --all --pretty=format:'%h %Cred%d %Cgreen%ad %Cblue%cn %Creset%s' --date=short $TEMP_TREE

* 9f8364a  (HEAD -> master) 2020-04-18 Takayuki WATANABE One step forward from squash-merged commit
| * be75fbc  2020-04-18 Takayuki WATANABE temporary tree object
* | 2b3d4ae  (origin/master) 2020-04-18 GitHub A (#1)
|/
| * 709b12e  (origin/A, A) 2020-04-18 Takayuki WATANABE Second commit with topic branch A
| * 6a55918  2020-04-18 Takayuki WATANABE First commit with topic branch A
|/
* 546d16d  2020-04-18 Takayuki WATANABE parent commit for master and A
* 6669578  2020-04-18 Takayuki WATANABE Initial commit

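$ git cherry be75fbc master
- 2b3d4ae8cf714a14d03604396a7dc0a6d4a23e7a
+ 9f8364a3e99911527d22aa39bfa0d76649a997ce

+ is the prefix for commits that have no equivalent on the other side. Here, 2b3d4ae is equivalent to the temporary commit, while 9f8364a exists only in master.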

Removing outdated remote and local branches

We can remove obsolete local branches by combining the deletion of merged branches with the deletion of squash-merged branches. git for-each-ref and git rev-parse are useful for this kind of automation. Here is an example shell script that prunes stale remote-tracking branches and removes outdated local branches in one go.

#!/usr/bin/env zsh

# gcl: git-cleanup-remote-and-local-branches
#
# Cleaning up remote and local branches proceeds as follows:
# 1. Prune remote-tracking branches whose remote branches were deleted or merged
# 2. Remove local branches that are already merged into the current branch
# 3. Remove local branches whose changes were squash-merged into master

function git_prune_remote() {
  echo "Start removing out-dated remote merged branches"
  git fetch --prune
  echo "Finish removing out-dated remote merged branches"
}

function git_remove_merged_local_branch() {
  echo "Start removing out-dated local merged branches"
  git branch --merged | grep -Ev "(^\*|master|ANY_BRANCH_YOU_WANT_TO_EXCLUDE)" | xargs -I % git branch -d %
  echo "Finish removing out-dated local merged branches"
}

# When we use `Squash and merge` on GitHub,
# `git branch --merged` cannot detect the squash-merged branches.
# As a result, git_remove_merged_local_branch() cannot clean up
# unused local branches. This function detects and removes local branches
# when remote branches are squash-merged.
#
# There is an edge case. If suggested changes are committed on GitHub,
# the contents of the local and remote branches differ. As a result,
# this cleanup function cannot remove the local squash-merged branch.
function git_remove_squash_merged_local_branch() {
  echo "Start removing out-dated local squash-merged branches"
  git checkout -q master &&
    git for-each-ref refs/heads/ "--format=%(refname:short)" |
    while read branch; do
      ancestor=$(git merge-base master $branch) &&
        [[ $(git cherry master $(git commit-tree $(git rev-parse $branch^{tree}) -p $ancestor -m _)) == "-"* ]] &&
        git branch -D $branch
    done
  echo "Finish removing out-dated local squash-merged branches"
}

# Clean up remote and local branches
function gcl() {
  git_prune_remote
  git_remove_merged_local_branch
  git_remove_squash_merged_local_branch
}

gcl
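
Because the script deletes branches with git branch -D, it may be reassuring to preview the candidates first. The function below is a hypothetical dry-run variant, not part of the script above: the name git_list_squash_merged_local_branch and the optional base-branch argument are assumptions. It applies the same detection logic but only prints branch names, and it skips the base branch itself.

```shell
# Hypothetical dry-run helper: prints local branches that look squash-merged
# into the base branch ($1, default: master) without deleting anything.
git_list_squash_merged_local_branch() {
  base=${1:-master}
  git for-each-ref refs/heads/ --format='%(refname:short)' |
  while read -r branch; do
    # Never report the base branch itself.
    [ "$branch" = "$base" ] && continue
    ancestor=$(git merge-base "$base" "$branch") || continue
    # Temporary commit: the final tree of the branch on top of the ancestor.
    temp=$(git commit-tree "$(git rev-parse "$branch^{tree}")" \
      -p "$ancestor" -m _) || continue
    # "-" means the base branch already contains an equivalent change.
    case $(git cherry "$base" "$temp") in
      "-"*) echo "$branch" ;;
    esac
  done
}
```

Running `git_list_squash_merged_local_branch master` inside a repository prints one branch name per line, and nothing when no branch qualifies.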

Tips for suggested commits on GitHub

During pull request reviews on GitHub, reviewers can suggest code changes, and the author can commit them directly on GitHub. In that case, the remote branch contains commits that the local branch lacks, so we have to pull the suggested changes into the local branch before cleaning up. Otherwise, the script above cannot remove the local branch.
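
The effect can be reproduced locally. In the sketch below, which runs in a throwaway repository, the branch remote-A is a hypothetical stand-in for origin/A after a suggestion was committed on GitHub, git merge --squash stands in for “Squash and merge”, and git merge --ff-only stands in for pulling the suggested commit. The helper name check and the demo identity are assumptions as well.

```shell
# Demo identity so that commits succeed in a clean environment (assumption).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > README.md
git add README.md
git commit -q -m "base"

# Local topic branch A with one commit.
git checkout -q -b A
echo "topic" >> README.md
git commit -q -am "topic change"

# Stand-in for origin/A: the same branch plus a suggested commit.
git checkout -q -b remote-A
echo "suggested" >> README.md
git commit -q -am "Apply suggestion"

# Stand-in for "Squash and merge" of the pull request.
git checkout -q master
git merge -q --squash remote-A
git commit -q -m "A (#1)"

# Prints the git cherry prefix ("-" or "+") for branch $1 against master.
check() {
  ancestor=$(git merge-base master "$1")
  temp=$(git commit-tree "$(git rev-parse "$1^{tree}")" -p "$ancestor" -m _)
  git cherry master "$temp" | cut -c1
}

before=$(check A)                 # local A lacks the suggested commit
git checkout -q A
git merge -q --ff-only remote-A   # stand-in for pulling the suggestion
git checkout -q master
after=$(check A)
echo "before: $before, after: $after"
```

Before the suggested commit is pulled, the squashed diff of the local branch differs from the commit on master, so git cherry reports + and the branch is kept. After pulling, the diffs match and the prefix becomes -.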

Conclusion

We can detect squash-merged branches with git merge-base, git commit-tree, and git cherry. Thus, we can remove all outdated local branches even when we use the “Squash and merge” method on GitHub.

Acknowledgements

This article was inspired by not-an-aardvark/git-delete-squashed.

References