Bitbucket Git Tutorial PDF
Objective
Learn the basics of Git with this space-themed tutorial.
Mission Brief
Your mission is to learn the ropes of Git by completing the tutorial and tracking
down all your team's space stations. Commands covered in this tutorial:
git clone, git config, git add, git status, git commit, git push, git pull, git
branch, git checkout, and git merge
You have a Bitbucket account
As our new Bitbucket space station administrator, you need to be organized. When
you make files for your space station, you’ll want to keep them in one place and
shareable with teammates, no matter where they are in the universe. With Bitbucket,
that means adding everything to a repository. Let’s create one!
You have access to all files in your local repository, whether you are
working on one file or multiple files.
You can view public repositories without a Bitbucket account if you have
the URL for that repository.
The repository owner is the only person who can delete the repository.
If the repository belongs to a team, an admin can delete the repository.
Click + from the global sidebar for common actions for a repository. Click items in
the navigation sidebar to see what's behind each one, including Settings to update
repository details and other settings. To view the shortcuts available to navigate
these items, press the ? key on your keyboard.
When you click the Commits option in the sidebar, you find that you have no
commits because you have not created any content for your repository. Your
repository is private and you have not invited anyone to the repository, so the only
person who can create or edit the repository's content right now is you, the
repository owner.
Now that you have a place to add and share your space station files, you need a way
to get to it from your local system. To set that up, you want to copy the Bitbucket
repository to your system. Git refers to copying a repository as "cloning" it. When
you clone a repository, you create a connection between the Bitbucket server (which
Git knows as origin) and your local system.
You are about to use a whole bunch of Git and non-Git commands from a
terminal. If you've never used the command line before, learn where to find it at
The Command Line Crash Course.
Step 1. Clone your repository to your local system
Open a browser and a terminal window from your desktop. After opening the
terminal window, do the following:
$ cd ~
As you use Bitbucket more, you will probably work in multiple repositories. For
that reason, it's a good idea to create a directory to contain all those
repositories.
$ mkdir repos
3. From the terminal, update the directory you want to work in to your new
repos directory.
$ cd ~/repos
Bitbucket displays a pop-up clone dialog. By default, the clone dialog sets the
protocol to HTTPS or SSH, depending on your settings. For the purposes of
this tutorial, don't change your default protocol.
7. From your terminal window, paste the command you copied from Bitbucket
and press Return.
8. Enter your Bitbucket password when the terminal asks for it. If you created an
account by linking to Google, use your password for that account.
$ git clone
https://emmap1@bitbucket.org/emmap1/bitbucke
Cloning into 'bitbucketspacestation'...
fatal: could not read
Password for 'https://emmap1@bitbucket.org'
If you get this error, enter the following at the command line:
Then go back to step 4 and repeat the clone process. The bash
agent should now prompt you for your password. You should
only have to do this once.
$ cd ~/repos
$ git clone https://emmap1@bitbucket.org/emmap1/
Cloning into 'bitbucketstationlocations'...
Password
warning: You appear to have cloned an empty repo
You already knew that your repository was empty, right? Remember that
you have not added any source files to it yet.
9. List the contents of your repos directory and you should see your
bitbucketstationlocations directory in it.
$ ls
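The clone steps above can be rehearsed offline before you touch a real repository. The following is a minimal sketch, assuming a local bare repository as a stand-in for Bitbucket; the sandbox directory is a hypothetical scratch location, and with a real repository you would paste the URL from the clone dialog instead.

```shell
# A local bare repository stands in for Bitbucket here (assumption).
# With a real repository, use the URL from the clone dialog instead.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/bitbucketstationlocations.git"

# Keep all repositories under one directory, then clone into it.
mkdir "$sandbox/repos"
cd "$sandbox/repos"
git clone -q "$sandbox/bitbucketstationlocations.git"

# The clone lands in a directory named after the repository.
ls

# Git records the source of the clone under the name "origin".
cd bitbucketstationlocations
git remote -v
```

Because the stand-in repository is empty, Git prints the same "empty repository" warning you saw above.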
1. Go to your terminal window and navigate to the top level of your local
repository.
$ cd ~/repos/bitbucketstationlocations/
2. Enter the following line into your terminal window to create a new file with
content.
If the command line doesn't return anything, it means you created the file
correctly!
3. Get the status of your local repository. The git status command tells you
about how your project is progressing in comparison to your Bitbucket
repository.
At this point, Git is aware that you created a new file, and you'll see something
like this:
$ git status
On branch master
Initial commit
Untracked files:
(use "git add <file>..." to include in what will be c
locations.txt
nothing added to commit but untracked files present (
The file is untracked, meaning that Git sees a file not part of a previous
commit. The status output also shows you the next step: adding the file.
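If you want to see the untracked state in isolation, the sketch below creates a file in a scratch repository and asks for a short status. The sandbox directory and the file's content are assumptions made for illustration; only the file name locations.txt comes from this tutorial.

```shell
# Scratch repository so your real one stays untouched (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q

# Create a file with one line of content (placeholder text).
echo "Earth's Moon" >> locations.txt

# --short prints one line per file; "??" marks an untracked file.
git status --short
```

The short format is handy once you know the long output shown above, because it reduces each file to a two-character state and a name.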
4. Tell Git to track your new locations.txt file using the git add command. Just
like when you created a file, the git add command doesn't return anything
when you enter it correctly.
The git add command moves changes from the working directory to the Git
staging area. The staging area is where you prepare a snapshot of a set of
changes before committing them to the official history.
$ git status
On branch master
Initial commit
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: locations.txt
Now you can see the new file has been added (staged) and you can commit it
when you are ready. The git status command displays the state of the
working directory and the staged snapshot.
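The move from untracked to staged can be sketched as follows. This is an illustration in a scratch repository, not a transcript of the tutorial's own commands; the sandbox directory and file content are placeholders.

```shell
# Scratch repository with one new file (assumption for illustration).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
echo "Earth's Moon" >> locations.txt

# Stage the file. The command prints nothing on success.
git add locations.txt

# "A" in the first column marks a new file in the staging area.
git status --short
```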
6. Issue the git commit command with a commit message, as shown on the next
line. The -m indicates that a commit message follows.
Up until this point, everything you have done is on your local system and
invisible to your Bitbucket repository until you push those changes.
7. Go back to your local terminal window and send your committed changes to
Bitbucket using git push origin master. This command specifies that you are
pushing to the master branch (the branch on Bitbucket) on origin (the
Bitbucket server).
8. Go to your BitbucketStationLocations repository on Bitbucket.
Remember how the repository looked when you first created it? It probably looks a
bit different now.
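The commit and push steps above can be tried end to end without a network connection. This sketch assumes a local bare repository in place of Bitbucket, and the identity, commit message, and file content are placeholders.

```shell
# A local bare repository plays the role of Bitbucket (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Space Admin"        # placeholder identity
git config user.email "admin@example.com"

echo "Earth's Moon" >> locations.txt
git add locations.txt

# -m supplies the commit message on the command line.
git commit -q -m "Initial commit"

# Publish the local master branch to the remote called origin.
git push -q origin master

# The remote now reports the same commit id as the local branch.
git rev-parse master
git ls-remote origin master
```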
Next on your list of space station administrator activities, you need a file with more
details about your locations. Since you don't have many locations at the moment,
you are going to add them right from Bitbucket.
2. From the Source page, click New file in the top right corner. This button only
appears after you have added at least one file to the repository.
A page for creating the new file opens, as shown in the following image.
A. Branch with new file: Change if you want to add file to a different branch.
B. New file area: Add content for your new file here.
You now have a new file in Bitbucket! You are taken to a page with details of the
commit, where you can see the change you just made:
If you want to see a list of the commits you've made so far, click Commits in the
sidebar.
1. Open your terminal window and navigate to the top level of your local
repository.
$ cd ~/repos/bitbucketstationlocations/
2. Enter the git pull --all command to pull all the changes from Bitbucket. (In
more complex branching workflows, pulling and merging all changes might
not be appropriate.) Enter your Bitbucket password when asked for it. Your
terminal should look similar to the following:
The git pull command merges the file from your remote repository
(Bitbucket) into your local repository with a single command.
3. Navigate to your repository folder on your local system and you'll see the file
you just added.
Fantastic! With the addition of the two files about your space station location, you
have performed the basic Git workflow (clone, add, commit, push, and pull) between
Bitbucket and your local system.
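To watch a pull bring down a file that was added somewhere else, you can simulate both sides locally. In this sketch a bare repository stands in for Bitbucket and a second clone named web plays the part of the online editor; those names, the identities, and the file contents are all assumptions for illustration.

```shell
# Bare repository standing in for Bitbucket (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git --git-dir="$sandbox/origin.git" symbolic-ref HEAD refs/heads/master

# Local repository: first file, pushed to origin.
git init -q "$sandbox/local"
cd "$sandbox/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Space Admin"
git config user.email "admin@example.com"
git remote add origin "$sandbox/origin.git"
echo "Earth's Moon" >> locations.txt
git add locations.txt
git commit -q -m "Initial commit"
git push -q -u origin master        # -u records origin/master as upstream

# Second clone: pretends to be the file created on Bitbucket.
git clone -q "$sandbox/origin.git" "$sandbox/web"
cd "$sandbox/web"
git config user.name "Web Editor"
git config user.email "web@example.com"
echo "<p>Space station details</p>" > stationlocations
git add stationlocations
git commit -q -m "stationlocations created online"
git push -q origin master

# Back on the local system, pull fetches and merges in one command.
cd "$sandbox/local"
git pull -q --all
ls
```

After the pull, both locations.txt and stationlocations are present in the local working directory.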
Branches are most powerful when you're working on a team. You can work on your
own part of a project from your own branch, pull updates from Bitbucket, and then
merge all your work into the main branch when it's ready. Our
documentation includes more explanation of why you would want to use branches.
1. Go to your terminal window and navigate to the top level of your local
repository using the following command:
$ cd ~/repos/bitbucketstationlocations/
This command creates a branch but does not switch you to that branch, so
your repository looks something like this:
The repository history remains unchanged. All you get is a new pointer to the
current branch. To begin working on the new branch, you have to check out
the branch you want to use.
3. Check out the new branch you just created to start using it.
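The difference between creating a branch and checking it out can be sketched in a scratch repository. The branch name future-plans matches this tutorial; the sandbox directory, identity, and file content are placeholders.

```shell
# Scratch repository with one commit on master (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "Earth's Moon" >> locations.txt
git add locations.txt
git commit -q -m "Initial commit"

# Creating the branch adds a pointer but does not switch to it.
git branch future-plans
git rev-parse --abbrev-ref HEAD      # still master

# Both names point at the same commit: history is unchanged.
git rev-parse master future-plans

# Checking out the branch makes it the active one.
git checkout -q future-plans
git rev-parse --abbrev-ref HEAD      # now future-plans
```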
8. Enter git status in the terminal window. You will see something like this:
$ git status
On branch future-plans
Changes not staged for commit:
(use "git add <file>..." to update what will be commi
(use "git checkout -- <file>..." to discard changes i
modified: stationlocations
no changes added to commit (use "git add" and/or "git
10. Enter the git commit command in the terminal window, as shown with the
following:
With this recent commit, your repository looks something like this:
Now it's time to merge the change that you just made back into the master
branch.
Because you created only one branch and made one change, use the fast-forward
branch method to merge. You can do a fast-forward merge because you have a
linear path from the current branch tip to the target branch. Instead of “actually”
merging the branches, all Git has to do to integrate the histories is move (i.e., “fast-
forward”) the current branch tip up to the target branch tip. This effectively combines
the histories, since all of the commits reachable from the target branch are now
available through the current one.
This branch workflow is common for short-lived topic branches with smaller changes
and is not as common for longer-running features.
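A fast-forward merge is easy to verify for yourself: after the merge, both branch names point at the same commit and no merge commit exists. The following is a sketch in a scratch repository, with placeholder identity, messages, and file content. It uses --ff-only, which is an addition for this sketch, so that Git stops instead of creating a merge commit if a fast-forward is not possible.

```shell
# Scratch repository with one commit on master (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "Earth's Moon" > stationlocations
git add stationlocations
git commit -q -m "Initial commit"

# One change on a short-lived branch.
git checkout -q -b future-plans
echo "Mars" >> stationlocations
git commit -q -am "Adding a future location"

# Merge it back: master has no new commits, so Git only moves the pointer.
git checkout -q master
git merge --ff-only future-plans

# Same commit id twice, and zero merge commits in the history.
git rev-parse master future-plans
git rev-list --merges --count master
```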
1. Go to your terminal window and navigate to the top level of your local
repository.
$ cd ~/repos/bitbucketstationlocations/
2. Enter the git status command to be sure you have all your changes
committed and find out what branch you have checked out.
$ git status
On branch future-plans
nothing to commit, working directory clean
You've essentially moved the pointer for the master branch forward to the
current head and your repository looks something like the fast forward merge
above.
5. Because you don't plan on using future-plans anymore, you can delete the
branch.
When you delete future-plans, you can still access the branch from master
using a commit id. For example, if you want to undo the changes added from
future-plans, use the commit id you just received to go back to that branch.
6. Enter git status to see the results of your merge, which show that your local
repository is one ahead of your remote repository. It will look something like
this:
$ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working directory clean
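The branch deletion in step 5 removes only the branch name; the commit it pointed to stays in the history of master. A minimal sketch of that behavior, assuming a scratch repository with placeholder identity and content:

```shell
# Scratch repository: one commit on master, one on future-plans (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "Earth's Moon" > stationlocations
git add stationlocations
git commit -q -m "Initial commit"
git checkout -q -b future-plans
echo "Mars" >> stationlocations
git commit -q -am "Adding a future location"
git checkout -q master
git merge -q --ff-only future-plans

# Remember the commit id, then delete the merged branch.
tip=$(git rev-parse future-plans)
git branch -d future-plans

# The name is gone, but the commit is still reachable by its id.
git branch --list future-plans
git cat-file -t "$tip"
git log --oneline -1 "$tip"
```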
Next, we need to push all this work back up to Bitbucket, your remote repository.
This diagram shows what happens when your local repository has changes that the
central repository does not have and you push those changes to Bitbucket.
2. Click the Overview page of your Bitbucket repository, and notice you can see
your push in the Recent Activity stream.
3. Click Commits and you can see the commit you made on your local system.
Notice that the change keeps the same commit id as it had on your local
system.
You can also see that the line to the left of the commits list has a
straightforward path and shows no branches. That’s because the future-plans branch
never interacted with the remote repository, only the change we created and
committed.
4. Click Branches and notice that the page has no record of the branch either.
5. Click Source, and then click the stationlocations file. You can see the last
change to the file has the commit id you just pushed.
6. Click the file history list to see the changes committed for this file, which will
look similar to the following figure.
Objective
Start a repository with someone else and get some feedback about your change.
Mission Brief
So far, you've been the only person working in a repository. But what if you wanted
to collaborate with your teammates on a repository? You can do that, whether you're
in the same room or across the universe.
When you work on a team with multiple Bitbucket users, you'll want to work on
your own set of code separately from the main codebase. Branches allow you to do
just that. A branch represents an independent line of development for your
repository. Think of it as a brand-new working directory, staging area, and project
history. After you create a branch, you work on and commit code to that branch, pull
updates from Bitbucket to keep your branch up-to-date, and then push all your work
to Bitbucket.
Once you've got code changes on a branch in Bitbucket, you can create a pull
request, which is where code review takes place. Your teammates will comment on
your code with feedback and questions and eventually (hopefully) approve the pull
request. When you have enough approvals, merge the pull request to merge your
branch into the main code.
You just arrived at the Bitbucket space station and it's time to go through the
orientation process, part of which involves making updates to your welcome
package and getting them approved. To get you started, we'll walk you through
creating a team repository with some content and giving someone access.
Step 1. Create a team and add a teammate
Start by creating a team for your repository and teammate. No need to have a
teammate for this tutorial. For our purposes, you'll make a new friend that goes by
the username breezycloud.
2. Enter a Team name you'd like to use. If you can't think of any, we suggest
starting with Planet followed by a number or name, for example Planet Breezy
Cloud or Planet 727. If you pick a team name that already exists, you may
need to edit the Team ID because all IDs must be unique.
3. Click Done.
Now when you create a pull request for your future repository, you'll have someone
to review it!
5. From Version control system, pick an option for the type of repository you
want to create. If you're not sure, keep as is.
8. Name the file survey.html, then copy this code and paste it into the main text
area.
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; chars
<style media="screen" type="text/css">
body {
margin: auto;
width: 700px;
color: #FFFFFF;
font-family: Arial, sans-serif;
background-color: #172B4D;
}
body>h1 {
margin: 50px;
font-size: 50px;
text-align: center;
color: #0052CC;
}
</style>
</head>
<body>
<h1>Team up in space</h1>
<p>
Welcome to the team! You've made it this far so we know t
</p>
<p>
Because you're on your own branch, you can go crazy. Spic
</p>
<br>
<p>
<b>Question 1</b>: Have you used pull requests before?
</p>
<p>
<b>Answer 1</b>: **** Your answer here ****
</p>
<p>
<b>Question 2</b>: Why do you want to learn about code re
</p>
<p>
<b>Answer 2</b>: **** Your answer here ****
</p>
<p>
<b>Question 3</b>: Who do you plan to work with on Bitbuc
</p>
<p>
<b>Answer 3</b>: **** Your answer here ****
</p>
</body>
</html>
In a typical team context, you'd most likely already have the repository cloned before
creating a branch. So that's what we're going to do first before you set up your own
branch.
3. From a terminal window, change into the local directory where you want to
clone your repository.
$ cd ~/<path_to_directory>
3. After you create a branch, you need to check it out on your local system.
Bitbucket provides you with a fetch and checkout command that you can copy
and paste into your command line, similar to the following:
As you can see, you've switched to your new branch locally, allowing you to work on
and push that separate line of code.
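The fetch and checkout step can be rehearsed locally. This sketch assumes the command takes the form git fetch followed by git checkout with the branch name; a bare repository stands in for Bitbucket, and a helper repository named seed (hypothetical) creates the branch on the remote the way the Branches page would.

```shell
# Bare repository standing in for Bitbucket (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git --git-dir="$sandbox/origin.git" symbolic-ref HEAD refs/heads/master

# Seed the remote with one commit on master.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "<h1>Team up in space</h1>" > survey.html
git add survey.html
git commit -q -m "Add survey"
git push -q "$sandbox/origin.git" master

# Your clone, made before the new branch exists.
git clone -q "$sandbox/origin.git" "$sandbox/work"

# The branch is then created on the remote side.
git push -q "$sandbox/origin.git" master:my-updates

# Fetch learns about the new branch; checkout creates a local
# branch of the same name that tracks origin/my-updates.
cd "$sandbox/work"
git fetch -q
git checkout -q my-updates
git status -sb
```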
1. Open the survey.html file (or whatever you named it) with a text editor.
2. Make your changes, big or small, and then save and close the file.
3. From your terminal window, you should still be in the repository directory
unless you've changed something. Display the status of the repository with
git status. You should see the survey.html file you modified. If you added or
modified other files, you'll see those as well.
$ git status
On branch my-updates
Your branch is up-to-date with 'origin/my-updates'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committe
(use "git checkout -- <file>..." to discard changes in w
modified: survey.html
no changes added to commit (use "git add" and/or "git comm
5. Commit your changes locally with git commit -m "your commit message":
6. Enter git push origin <branch_name> to push the changes to your branch on
Bitbucket, and enter your password to finish pushing changes.
$ git push origin my-updates
To https://bitbucket.org/planetbreezycloud/first-impressio
1. From the repository, click + in the global sidebar and select Clone this
repository under Get to work.
4. Click the Clone button.
1. Click Branches from the left navigation. You'll see that you already have one
branch — your main branch.
3. Enter a Branch name and click Create. If you aren't sure what to name your
branch, go with something like my-updates.
4. After you create a branch, you need to check it out from your local system. To
do so, click the Check out in Sourcetree button.
Now you've got a branch in Bitbucket and it's checked out to your local system,
allowing you to work on and push that separate line of code.
2. Open the survey.html file (or whatever you named it) with a text editor.
3. Make your changes, big or small, and then save and close the file.
10. From the dialog that appears, click OK to push your branch with the commit
to Bitbucket.
11. From Bitbucket, click the Source page of your repository. You should see both
branches in the dropdown. Any other commits you make to my-updates will
also appear on that branch.
To alert your teammates to your updates and get their approval, your next step is to
create a pull request. In addition to a place for code review, a pull request shows
a comparison of your changes against the original repository (also known as a diff)
and provides an easy way to merge code when ready.
1. From your repository, click + in the global sidebar. Then, click Create a pull
request under Get to work.
Bitbucket displays the request form.
Bitbucket opens the pull request and your reviewer receives an email notification
with details of the pull request for them to review.
From the pull request, the reviewer can view the diff and add comments to start a
discussion before clicking the Approve button.
When someone approves your pull request, you'll get an email notification. Once
you've got the approvals you need (in this case just one!), you can merge. From the
pull request, click Merge. And that's it! If you want to see what it looks like when
your branch merges with the main branch, click Commits to see the commit tree.
Learn Branching with Bitbucket Cloud
Objective
This tutorial will teach you the basics of creating, working in, reviewing, and merging
branches using Git and Bitbucket Cloud.
You have a Bitbucket account
This tutorial is for you if you already understand the basic Git workflow including
how to:
Clone: copy the remote repository in Bitbucket Cloud to your local system
Add or stage: take changes you have made and get them ready to add to
your git history
Commit: add new or changed files to the git history for the repository
Pull: get new changes others have added to the repository into your local
repository
Push: get changes from your local system onto the remote repository
If you don't know the Git basics, don't worry. Just check out our Learn Git with
Bitbucket Cloud tutorial and you'll be up to speed in no time.
Get set up
Since we want you to feel like you're working on a team, in a common Bitbucket
repository, we will have you fork a public repository we have supplied.
What is a fork?
A fork is another way of saving a clone or copy. The term fork (in
programming) derives from a Unix system call that creates a copy of an
existing process. So, unlike a branch, a fork is independent from the original
repository. If the original repository is deleted, the fork remains. If you fork a
repository, you get that repository and all of its branches.
1. Go to tutorials/tutorials.git.bitbucket.org
2. Click + > Fork this repository on the left side of the screen.
4. Create a directory for the repository which will be easy to navigate to. You
might choose something like this:
$ mkdir test-repositories
$ cd test-repositories/
The preceding example creates the test-repositories directory using the mkdir
(make directory) command and switches to that directory using the cd
(change directory) command.
5. Clone the forked repository into the directory you just created. It might look
something like this:
This clones the repository using the git clone command and creates a
directory for the clone named mygittutorial.git.bitbucket.io.
2. Check out the branch you just created using the git checkout command.
$ git branch
master
* test-1
Note: your change isn't committed to the Git history yet; it's in a "waiting"
state. We learned about this in Saving changes.
Note: now the change is part of the Git history as a single "commit". We
learned about this in Saving changes.
git push
fatal: The current branch test-1 has no upstream bran
To push the current branch and set the remote as upst
git push --set-upstream origin test-1
You will see an error because the first time you push a branch you created
locally, you have to designate its upstream branch on the remote.
This tells the system that the origin repository is the destination of this new
branch.
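The upstream setting can be inspected after the push. Below is a sketch, assuming a local bare repository in place of Bitbucket; the identity, messages, and file content are placeholders, while the names test-1 and editme.html follow this tutorial.

```shell
# Bare repository standing in for Bitbucket (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git config user.name "Space Admin"
git config user.email "admin@example.com"
git remote add origin "$sandbox/origin.git"
echo "<p>first draft</p>" > editme.html
git add editme.html
git commit -q -m "Add editme.html"

# New local branch with one change.
git checkout -q -b test-1
echo "<p>my change</p>" >> editme.html
git commit -q -am "Changing things on test-1"

# With default settings, a bare "git push" is refused at this point.
git push 2>/dev/null || echo "push refused: test-1 has no upstream yet"

# Name the remote and the branch once.
git push -q --set-upstream origin test-1

# The upstream is now recorded, so later pushes need no arguments.
git rev-parse --abbrev-ref 'test-1@{upstream}'
git push -q
```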
9. Open your tutorial repository and click Branches. You should now see both the
master and the test-1 branches. It should look something like this:
3. Copy the git fetch command in the check out your branch dialog. It will
probably look something like this:
$ git branch
master
test-1
* test-2
The branch with the asterisk * is the active branch. This is critical to remember
when you are working in any branching workflow.
$ git status
On branch test-2
Your branch is up-to-date with 'origin/test-2'.
nothing to commit, working tree clean
You can see what branch you're on and that the branch is currently up to date
with your remote (origin) branch.
You would also add reviewers on your team to the pull request. Learn
more about pull requests
3. Make a comment in the pull request by selecting a line in the diff (the area
displaying the change you made to the editme.html file).
4. Click Approve in the top left of the page. Of course, in a real pull request
you'd have reviewers making comments.
5. Click Merge.
8. Click Commits and you will see how the branch you just merged fits into the
larger scheme of changes.
$ git status
On branch test-1
nothing to commit, working tree clean
You can see you're on the branch you just used to make your change and that
you don't have any changes. We're ready to get rid of that branch now that
we've finished that work.
Notice that the message says you are up-to-date? This is only your local
branch. We know this because we just merged a change into master and
haven't pulled that change from the remote repository to our local system.
That's what we'll do next.
3. Run the git pull command. The result should look something like this:
$ git pull
remote: Counting objects: 1, done.
remote: Total 1 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (1/1), done.
From https://bitbucket.org/dstevenstest/dans.git.bitb
2d4c0ab..dd424cb master -> origin/master
Updating 2d4c0ab..dd424cb
Fast-forward
editme.html | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
What happened is that when you pulled the changes from the remote repository,
git ran a fast-forward merge to integrate the changes you made. It also lists
how many files changed and how many lines changed in each file.
You can see that it deleted the branch and what the last commit hash was for
that branch. This is the safe way to delete a branch because git won't allow
you to delete the branch if it has commits that are not merged into its
upstream branch (or into the current branch when no upstream is set). You
should be aware, however, that a branch whose commits are all on its upstream
can be deleted this way even if it was never merged into master.
6. Merge the master branch into your working branch using the git merge
master test-2 command. The result will look something like this:
To see what branch is active at any time, use git branch and the active
branch will have an asterisk, or use git status and it will tell you what
branch you are on and if there are pending local changes.
We hope you've learned a bit about branching and the commands involved. Let's
review what we just covered:
The Git Feature Branch workflow is an efficient way to get working with your team in
Bitbucket. In this workflow, all feature development takes place on branches separate
from the main master branch. As a result, multiple developers can work on their own
features without touching the main code.
Create a new-branch
Use a separate branch for each feature or issue you work on.
After creating a branch, check it out locally so that any
changes you make will be on that branch.
Resolve feedback
Now your teammates comment and approve. Resolve their
comments locally, commit, and push changes to Bitbucket.
Your updates appear in the pull request.
Objective
Learn how to undo changes on your local machine and a Bitbucket Cloud repository
while collaborating with others.
Mission Brief
Commands covered in this tutorial: git revert, git reset, git log, and git status
You have a Bitbucket account
Everyone makes mistakes. Not every push is perfect, so this tutorial will help you use
the most common git functions to undo a change or changes safely.
git clone
git commit
git pull
git push
If you don't know those commands, we can help you Learn git with Bitbucket Cloud.
Then come back here and learn how to undo changes. These git commands are
applicable to a Windows or Unix environment. This tutorial uses Unix command
line utilities when instructing file system navigation.
When the change you want to undo is on your local system and hasn't been pushed
to a remote repository there are two primary ways to undo your change:
Command     Definition
git reset   --soft: Only resets the HEAD to the commit you select. Works
            basically the same as git checkout <commit #> but does not
            create a detached head state.
            --mixed: Resets the HEAD to the commit you select in the
            history and undoes the changes in the index.
            --hard: Resets the HEAD to the commit you select in the
            history, undoes the changes in the index, and undoes the
            changes in your working directory. We won't be testing a hard
            reset for this tutorial.
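The practical difference between --soft and --mixed shows up in git status after the reset. The sketch below runs both in a scratch repository; the file name, contents, and identity are placeholders, and --hard is left out because it discards work.

```shell
# Scratch repository with two commits (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "version 2" > notes.txt
git commit -q -am "second"

# --soft: HEAD moves back one commit; the change stays staged.
git reset -q --soft HEAD~1
after_soft=$(git status --short)
echo "after --soft:  [$after_soft]"

# Recreate the second commit so --mixed starts from the same place.
git commit -q -m "second"

# --mixed: HEAD moves back and the index is reset; the change is
# still in the working directory but no longer staged.
git reset -q --mixed HEAD~1
after_mixed=$(git status --short)
echo "after --mixed: [$after_mixed]"

# The edit itself survives both kinds of reset.
cat notes.txt
```

In the short status format, an M in the first column means the change is staged and an M in the second column means it is not.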
As you progress through the tutorial you'll learn several other git commands as part
of learning how to undo changes, so let's get started.
Fork a repository
Let's begin by creating a unique repository with all the code from the original. This
process is called “forking a repository”. Forking is an extended git process that is
enabled when a shared repository is hosted with a 3rd party hosting service like
Bitbucket.
2. Click the + symbol on the left sidebar, then select Fork this repository, review
the dialog and click Fork repository.
Now that you've got a repository full of code and an existing history on your local
system you're ready to begin undoing some changes.
You'll have to be able to find and reference the change you want to undo. This can
be accomplished by browsing the commit UI on Bitbucket and there are a few
command line utilities that can locate a specific change.
git status
Git status returns the state of your working directory (the location of the repository
on your local system) and the staging area (where you prepare a set of changes to
add to the project history) and will show any files which have changes and if those
changes have been added to the staging area. Let us now execute git status and
examine the current state of the repository.
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean
The output of git status here shows us that everything is up-to-date with the
remote master branch and no pending changes are waiting to be
committed. In the next example we will make some edits to the repository and
examine it in a pending changes state. This means you have changes to files in the
repository on your local system that you haven't prepared (or staged) to be added to
the project history.
To demonstrate this next example, first open the myquote2.html file. Make some
modifications to the contents of myquote2.html, save and exit the file. Let us once
again execute git status to examine the repository in this state.
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Modified: myquote2.html
The output here shows that the repository has pending modifications to
myquote2.html. Good news! If the change you want to undo has, like the example
above, not been added to the staging area yet you can just edit the file and keep
going. Git only starts tracking a change when you add it to the staging area and then
commit it to the project history.
Let us now “undo” the changes we have made to myquote2.html. Because this is a
simplified example with minimal changes, we have two available methods for
undoing the changes. If we execute git checkout myquote2.html, the repository will
restore myquote2.html to the previously committed version. Alternatively, we can
execute git reset --hard, which will reset the whole repository to the last commit.
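Restoring a single file to its committed version can be sketched as follows. The repository here is a scratch one with placeholder content and identity; the extra -- in the command is an addition that tells Git the argument is a file name rather than a branch.

```shell
# Scratch repository with one committed file (assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Space Admin"
git config user.email "admin@example.com"
echo "original quote" > myquote2.html
git add myquote2.html
git commit -q -m "Add quote"

# Make an edit you later regret.
echo "an edit we regret" >> myquote2.html
git status --short

# Discard the uncommitted edit for this one file only.
git checkout -- myquote2.html

# The working directory is clean and the file is back as committed.
git status --short
cat myquote2.html
```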
git log
The git log command lets you list the project history, filter it, and search for specific
changes. While git status lets you inspect the working directory and the staging
area, git log only shows the committed history.
The same log of committed history can be found within the Bitbucket UI by accessing
the “commits” view of a repository. The commits view for our demo repository can
be found at: https://bitbucket.org/dans9190/tutorial-documentation-tests/commits/all.
This view will have similar output to the git log command line
utility. It can be used to find and identify a commit to undo.
In the following example you can see several things in the history, but each change is,
at its root, a commit, so that's what we'll need to find and undo.
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
$ git log
commit 1f08a70e28d84d5034a8076db9103f22ec2e982c
Author: Daniel Stevens <dstevens@atlassian.com>
Date: Wed Feb 7 17:06:50 2018 +0000
commit 52f823ca251a132225dd1cc18ad768de8d336e84
Author: Daniel Stevens <dstevens@atlassian.com>
Date: Fri Sep 30 15:50:58 2016 -0700
commit 4801b87c2147dce83f1bf31acfcffa6cb1d7e0a5
Merge: 1a6a403 3b29606
Author: Dan Stevens [Atlassian] <dstevens@atlassian.com>
Date: Fri Jul 29 18:45:34 2016 +0000
Changes
commit 52f823ca251a132225dd1cc18ad768de8d336e84
Author: Daniel Stevens <dstevens@atlassian.com>
Date: Fri Sep 30 15:50:58 2016 -0700
What you can see is each commit message has four elements:
Element          Description
Commit message   Best practice tip: write short descriptive commit messages and
                 you'll help create a more harmonious working repository for
                 everyone.
1. Go to your terminal window and navigate to the top level of your local
repository using the cd (change directory) command.
$ cd ~/repos/tutorial-documentation-tests/
2. Enter the git log --oneline command. Adding --oneline will display each commit
on a single line, which lets you see more history in your terminal.
Press the q key to exit the commit log and return to your command prompt at any
time.
3. Locate the commit with the hash c5826da and the message "more changes" in the
list the git log command produced. Someone didn't write a descriptive commit
message, so we'll have to figure out if that's got the changes we need.
4. Highlight and copy the commit hash c5826da from the git log result in your
terminal window.
5. Type git show then paste or transcribe the commit hash you copied and press
enter. You should see something like this:
more changes
* Summary of set up
-* Configuration
-* Dependencies
-* Database configuration
-* How to run tests
-* Deployment instructions
-* more stuff and things
:
The prompt at the bottom will continue to fill in until it shows the entire change.
Press q to exit to your command prompt.
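The steps above can be tried safely in a throwaway repository. The following is a minimal sketch, not the tutorial's demo repository: the scratch directory, the placeholder identity, and the file contents are assumptions made for illustration, so the hashes you see will differ from c5826da.

```shell
# Scratch repository so the example does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "* Summary of set up" > README.md
git add README.md
git commit -q -m "Add README"

echo "* more stuff and things" >> README.md
git commit -q -am "more changes"

# One line per commit: abbreviated hash followed by the commit message subject.
git --no-pager log --oneline

# Look up the commit by its (undescriptive) message, then inspect the change.
hash=$(git log --format=%h --grep="more changes" -1)
git --no-pager show "$hash"
```

The --no-pager option only keeps the output from opening in a pager, so there is no need to press q here.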
Filter the git log to find a specific commit
You can filter and adjust the output of the git log with the following additions:
-<n>    Limits the number of commits shown    git log -10    The 10 most recent commits in the history
This was a very brief look at the git log command. If you like working in the
command line, you'll probably want to check out the advanced git log tutorial.
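As a rough illustration of filtering, the sketch below builds a small scratch history and narrows it three ways. Only the -<n> limit comes from the table above; --grep and the trailing -- <path> are other standard git log options added here for comparison, and the file names and messages are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Three commits that touch notes.txt, then one that adds README.md.
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git add notes.txt
  git commit -q -m "Update notes ($n)"
done
echo "docs" > README.md
git add README.md
git commit -q -m "Add README"

git --no-pager log --oneline -2               # limit: the 2 most recent commits
git --no-pager log --oneline --grep="notes"   # commits whose message matches "notes"
git --no-pager log --oneline -- README.md     # commits that touched README.md
```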
To get started, let's just undo the latest commit in the history. In this case, let's say
you just enabled Bitbucket's CI/CD solution, Pipelines, but realized the script isn't
quite right.
2. Copy the commit hash for the second commit in the log: 52f823c then
press q to exit the log.
3. Enter git reset --soft 52f823c in your terminal window. The command
prints nothing if it succeeds. That's it, you've undone your first
change. Now let's see the result of this action.
4. Enter git status in your terminal window and you will see the commit was
undone and is now an uncommitted change. It should look something like
this:
$ git status
On branch master
Your branch is behind 'origin/master' by 1 commit, and can
(use "git pull" to update your local branch)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
6. You can see the new HEAD of the branch is commit 52f823c which is exactly
what you wanted.
7. Press q to exit the log. Leave your terminal open because now that you've
learned how to do a simple reset, let's try something a little more complex.
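To see the same effect without touching the tutorial repository, here is a minimal sketch in a scratch repository. The file contents and commit messages are placeholders; the point is that after a soft reset the undone commit's changes are still staged.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "first" > file.txt
git add file.txt
git commit -q -m "Initial commit"

echo "# placeholder pipeline config" > bitbucket-pipelines.yml
git add bitbucket-pipelines.yml
git commit -q -m "Enable pipelines"

# Move HEAD back one commit but keep the undone change in the staging area.
target=$(git rev-parse --short HEAD~1)
git reset --soft "$target"

git status --short              # the pipelines file shows up as staged ("A")
git --no-pager log --oneline    # only the initial commit remains in the history
```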
Let's say you've realized that pull request #6 (4801b87) needs to be reworked and
you want to keep a clean history, so you'll reset the HEAD to commit 1a6a403. This
time you'll use the plain git reset command.
2. Copy the commit hash 1a6a403 (myquote edited online with Bitbucket) which
is the commit just below pull request #6 which has the changes we want to
undo.
3. Enter git reset 1a6a403 in your terminal window. The output should look
something like this:
You can see that the changes are now in an uncommitted state. This means that now
we've removed several changes from both the history of the project and the staging
area.
4. Enter git status in your terminal window. The output should look something
like this:
$ git status
On branch master
Your branch is behind 'origin/master' by 6 commits, and ca
(use "git pull" to update your local branch)
modified: README.md
modified: myquote2.html
Untracked files:
(use "git add <file>..." to include in what will be comm
bitbucket-pipelines.yml
Now you can see that the first change we undid (the bitbucket-pipelines.yml file) is
now completely untracked by git. This is because invoking git reset removes the
change from both the head of the branch and the tracking or index area of git. The
underlying process is more complex than we can cover here; you can read more
in git reset.
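The difference from the soft reset can be reproduced in a scratch repository. This is a sketch under assumed file names and messages, not the tutorial's actual history: a plain git reset moves HEAD and also clears the staging area, so edits to tracked files appear as unstaged modifications and files added by the undone commits become untracked.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "# Station docs" > README.md
git add README.md
git commit -q -m "Add README"

echo "more details" >> README.md
git commit -q -am "Expand README"

echo "# placeholder pipeline config" > bitbucket-pipelines.yml
git add bitbucket-pipelines.yml
git commit -q -m "Enable pipelines"

# Plain (mixed) reset: HEAD and the index go back two commits,
# while the working directory keeps every file as it is.
git reset -q HEAD~2

git status --short
```

The status output should list README.md as modified but unstaged and bitbucket-pipelines.yml as untracked.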
The log output now shows the commit history has also been modified and begins at
commit 1a6a403. For the sake of demonstration and further example, let’s say we
want to now undo the reset we just did. After further consideration, maybe we
wanted to keep the contents of pull request #6.
Git resets are one of a few “undo” methods git offers. Resets are generally
considered an ‘unsafe’ option for undoing changes. Resets are fine when working
locally on isolated code but become risky when shared with team members.
In order to share a branch that has been reset with a remote team a ‘forced push’
has to be executed. A ‘forced push’ is initiated by executing git push -f. A forced
push will destroy any history on the branch that was built after the point of the push.
Dev B has been working on the same branch developing a separate feature.
Dev B decides to reset the branch to an earlier state before both Dev A and
Dev B started work.
Dev B then force pushes the reset branch to the remote repository.
Dev A pulls the branch to receive any updates. During this pull Dev A receives
the forced update. This resets Dev A’s local branch back in time before any of
their feature work was done and loses their commits.
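Why a reset branch needs a forced push can be shown with a local bare repository standing in for the Bitbucket server. This is only a sketch: the directory names are made up, and it covers the remote side of the story (what the server ends up with), not what each teammate's clone looks like afterwards.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git               # stands in for the Bitbucket server
git clone -q remote.git local 2>/dev/null   # hides the "empty repository" warning
cd local
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"    # placeholder identity

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "two" >> file.txt
git commit -q -am "Second commit"
git push -q origin HEAD

# Reset the local branch to before the second commit.
git reset -q --hard HEAD~1

# A normal push is refused because the remote branch is ahead of the reset branch.
if git push -q origin HEAD 2>/dev/null; then
  plain_push="accepted"
else
  plain_push="rejected"
fi
echo "plain push: $plain_push"

# A forced push overwrites the remote branch, dropping the second commit from it.
git push -q -f origin HEAD
remote_commits=$(git --git-dir="$work/remote.git" rev-list --count --all)
echo "commits reachable on the remote: $remote_commits"
```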
Undo a git reset
So far we have been passing Git commit SHA hashes to git reset. The git log
output is now missing the commits that we have reset. How will we get those commits
back? Git never fully deletes a commit unless it has become detached from any pointers to
it. Furthermore, Git stores a separate log of all ref movement called “the reflog”. We
can examine the reflog by executing git reflog.
Your output from git reflog should be similar to the above. You can see a history of
actions on the repo. The top line is a reference to the reset we did to reset pull
request #6. Let us now reset the reset to restore pull request #6. The second column
of this reflog output indicates a ref pointer to a modification action taken on the repo.
Here HEAD@{0} is a reference to the reset command we previously executed. We do
not want to replay that reset command, so we will restore the repo to HEAD@{1}.
Let us now examine the repo's commit history with git log --oneline:
Here we can see that the repo’s commit history has been restored to the previous
version we were experimenting with. We can see that commit 4801b87 is restored even
though it appeared lost after the first reset operation. The git reflog is a powerful
tool for undoing changes in the repository. Learn more about in-depth usage on the
git reflog page.
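The round trip of resetting and then undoing the reset can be sketched in a scratch repository. The commits below are placeholders, and using git reset --hard with HEAD@{1} as the restore step is an assumption about how the restore is performed; it discards any uncommitted edits, which is acceptable here because the edits are exactly the commits being restored.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

for n in 1 2 3; do
  echo "quote $n" >> myquote.txt
  git add myquote.txt
  git commit -q -m "Edit quote ($n)"
done
before=$(git rev-parse HEAD)

# Throw away the last two commits from the branch history.
git reset -q HEAD~2
git --no-pager reflog -3      # HEAD@{0} is the reset, HEAD@{1} is where HEAD was before

# Undo the reset by returning to the position recorded before it.
git reset -q --hard 'HEAD@{1}'
after=$(git rev-parse HEAD)

git --no-pager log --oneline  # all three commits are back
```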
git revert
The previous set of examples did some serious time traveling undo operations using
git reset and git reflog. Git contains another ‘undo’ utility which is often
considered ‘safer’ than resetting. Reverting creates new commits which contain an
inverse of the specified commit's changes. These revert commits can then be safely
pushed to remote repositories to share with other developers.
The following section will demonstrate git revert usage. Let us continue with our
example from the previous section. To start let us examine the log and find a commit
to revert.
For this example let’s pick the most recent commit 1f08a70 as our commit to
operate on. For this scenario let's say that we want to undo the edits made in that
commit. Execute:
This will kick off a git merge workflow. Git will create a new commit whose content is
the reverse of the commit that was specified for the revert. Git will then open up a
configured text editor to prompt for a new commit message. Reverts are considered
the safer undo option because of this commit workflow. The creation of revert
commits leaves a clear trail in the commit history of when an undo operation was
executed.
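A revert can be tried end to end in a scratch repository. In this sketch the file and messages are placeholders, and --no-edit is used so that Git keeps its default revert message instead of opening the text editor described above.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "line one" > quote.txt
git add quote.txt
git commit -q -m "Add quote"

echo "line two" >> quote.txt
git commit -q -am "Edit quote"

# Revert the most recent commit: history grows by one commit
# whose changes are the inverse of "Edit quote".
bad=$(git rev-parse --short HEAD)
git revert --no-edit "$bad"

git --no-pager log --oneline
cat quote.txt
```

Because nothing is removed from the history, the result can be shared with a normal git push.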
Congratulations, you’re done! Come back to this tutorial any time or head to the
Undoing Changes section to go more in depth. Keep up the good work in Bitbucket!
What is version control
Benefits of version control
Version control systems are a category of software tools that help a software team
manage changes to source code over time. Version control software keeps track of
every modification to the code in a special kind of database. If a mistake is made,
developers can turn back the clock and compare earlier versions of the code to help
fix the mistake while minimizing disruption to all team members.
For almost all software projects, the source code is like the crown jewels - a precious
asset whose value must be protected. For most software teams, the source code is a
repository of the invaluable knowledge and understanding about the problem
domain that the developers have collected and refined through careful effort.
Version control protects source code from both catastrophe and the casual
degradation of human error and unintended consequences.
Software developers working in teams are continually writing new source code and
changing existing source code. The code for a project, app or software component is
typically organized in a folder structure or "file tree". One developer on the team
may be working on a new feature while another developer fixes an unrelated bug by
changing code; each developer may make their changes in several parts of the file
tree.
Version control helps teams solve these kinds of problems, tracking every individual
change by each contributor and helping prevent concurrent work from conflicting.
Changes made in one part of the software can be incompatible with those made by
another developer working at the same time. This problem should be discovered and
solved in an orderly manner without blocking the work of the rest of the team.
Further, in all software development, any change can introduce new bugs on its own
and new software can't be trusted until it's tested. So testing and development
proceed together until a new version is ready.
Software teams that do not use any form of version control often run into problems
like not knowing which changes that have been made are available to users or the
creation of incompatible changes between two unrelated pieces of work that must
then be painstakingly untangled and reworked. If you're a developer who has never
used version control you may have added versions to your files, perhaps with suffixes
like "final" or "latest" and then had to later deal with a new final version. Perhaps
you've commented out code blocks because you want to disable certain functionality
without deleting the code, fearing that there may be a use for it later. Version control
is a way out of these problems.
Version control software is an essential part of the modern software team's everyday
professional practices. Individual software developers who are accustomed to
working with a capable version control system in their teams typically recognize the
incredible value version control also gives them even on small solo projects. Once
accustomed to the powerful benefits of version control systems, many developers
wouldn't consider working without it even for non-software projects.
Developing software without using version control is risky, like not having backups.
Version control can also enable developers to move faster and it allows software
teams to preserve efficiency and agility as the team scales to include more
developers.
Version Control Systems (VCS) have seen great improvements over the past few
decades and some are better than others. VCS are sometimes known as SCM (Source
Code Management) tools or RCS (Revision Control System). One of the most popular
VCS tools in use today is called Git. Git is a Distributed VCS, a category known as
DVCS, more on that later. Like many of the most popular VCS systems available
today, Git is free and open source. Regardless of what they are called, or which
system is used, the primary benefits you should expect from version control are as
follows.
1. A complete long-term change history of every file. This means every change
made by many individuals over the years. Changes include the creation and
deletion of files as well as edits to their contents. Different VCS tools differ on
how well they handle renaming and moving of files. This history should also
include the author, date and written notes on the purpose of each change.
Having the complete history enables going back to previous versions to help
in root cause analysis for bugs and it is crucial when needing to fix problems
in older versions of software. If the software is being actively worked on,
almost everything can be considered an "older version" of the software.
3. Traceability. Being able to trace each change made to the software and
connect it to project management and bug tracking software such as Jira, and
being able to annotate each change with a message describing the purpose
and intent of the change, can help with root cause analysis and other
forensics. Having the annotated history of the code at your fingertips when
you are reading the code, trying to understand what it is doing and why it is
so designed can enable developers to make correct and harmonious changes
that are in accord with the intended long-term design of the system. This can
be especially important for working effectively with legacy code and is crucial
in enabling developers to estimate future work with any accuracy.
While it is possible to develop software without using any version control, doing so
subjects the project to a huge risk that no professional team would be advised to
accept. So the question is not whether to use version control but which version
control system to use.
There are many choices, but here we are going to focus on just one, Git. Learn more
about other types of version control software.
What is Git
By far, the most widely used modern version control system in the world today is Git.
Git is a mature, actively maintained open source project originally developed in 2005
by Linus Torvalds, the famous creator of the Linux operating system kernel. A
staggering number of software projects rely on Git for version control, including
commercial projects as well as open source. Developers who have worked with Git
are well represented in the pool of available software development talent and it
works well on a wide range of operating systems and IDEs (Integrated Development
Environments).
In addition to being distributed, Git has been designed with performance, security
and flexibility in mind.
Performance
The raw performance characteristics of Git are very strong when compared to many
alternatives. Committing new changes, branching, merging and comparing past
versions are all optimized for performance. The algorithms implemented inside Git
take advantage of deep knowledge about common attributes of real source code file
trees, how they are usually modified over time and what the access patterns are.
Unlike some version control software, Git is not fooled by the names of the files
when determining what the storage and version history of the file tree should be;
instead, Git focuses on the file content itself. After all, source code files are frequently
renamed, split, and rearranged. The object format of Git's repository files uses a
combination of delta encoding (storing content differences) and compression, and
explicitly stores directory contents and version metadata objects.
For example, say a developer, Alice, makes changes to source code, adding a feature
for the upcoming 2.0 release, then commits those changes with descriptive
messages. She then works on a second feature and commits those changes too.
Naturally these are stored as separate pieces of work in the version history. Alice
then switches to the version 1.3 branch of the same software to fix a bug that affects
only that older version. The purpose of this is to enable Alice's team to ship a bug fix
release, version 1.3.1, before version 2.0 is ready. Alice can then return to the 2.0
branch to continue working on new features for 2.0 and all of this can occur without
any network access and is therefore fast and reliable. She could even do it on an
airplane. When she is ready to send all of the individually committed changes to the
remote repository, Alice can "push" them in one command.
Security
Git has been designed with the integrity of managed source code as a top priority.
The content of the files as well as the true relationships between files and directories,
versions, tags and commits, all of these objects in the Git repository are secured with
a cryptographic hashing algorithm called SHA-1. This protects the code
and the change history against both accidental and malicious change and ensures
that the history is fully traceable.
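The idea that objects are identified by a hash of their content can be observed directly with git hash-object, which prints the ID Git would assign to a piece of content. This is a small sketch with made-up input strings: identical content always gets the same ID, and changing a single character produces a different one.

```shell
# Same bytes in, same object ID out.
a=$(printf 'hello station\n' | git hash-object --stdin)
b=$(printf 'hello station\n' | git hash-object --stdin)

# One character differs, so the object ID differs as well.
c=$(printf 'hello Station\n' | git hash-object --stdin)

echo "$a"
echo "$b"
echo "$c"
```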
With Git, you can be sure you have an authentic content history of your source code.
Some other version control systems have no protections against secret alteration at
a later date. This can be a serious information security vulnerability for any
organization that relies on software development.
Flexibility
One of Git's key design objectives is flexibility. Git is flexible in several respects: in
support for various kinds of nonlinear development workflows, in its efficiency in
both small and large projects and in its compatibility with many existing systems and
protocols.
Git has been designed to support branching and tagging as first-class citizens (unlike
SVN) and operations that affect branches and tags (such as merging or reverting) are
also stored as part of the change history. Not all version control systems feature this
level of tracking.
Git is the best choice for most software teams today. While every team is different
and should do their own analysis, here are the main reasons why version control with
Git is preferred over alternatives:
Git is good
Git has the functionality, performance, security and flexibility that most teams and
individual developers need. These attributes of Git are detailed above. In side-by-
side comparisons with most other alternatives, many teams find that Git is very
favorable.
Vast numbers of developers already have Git experience and a significant proportion
of college graduates may have experience with only Git. While some organizations
may need to climb the learning curve when migrating to Git from another version
control system, many of their existing and future developers do not need to be
trained on Git.
In addition to the benefits of a large talent pool, the predominance of Git also means
that many third party software tools and services are already integrated with Git
including IDEs, and our own tools like DVCS desktop client Sourcetree, issue and
project tracking software, Jira, and code hosting service, Bitbucket.
If you are an inexperienced developer wanting to build up valuable skills in software
development tools, when it comes to version control, Git should be on your list.
Git enjoys great community support and a vast user base. Documentation is
excellent and plentiful, including books, tutorials and dedicated web sites. There are
also podcasts and video tutorials.
Being open source lowers the cost for hobbyist developers as they can use Git
without paying a fee. For use in open-source projects, Git is undoubtedly the
successor to the previous generations of successful open source version control
systems, SVN and CVS.
Criticism of Git
One common criticism of Git is that it can be difficult to learn. Some of the
terminology in Git will be novel to newcomers, and for users of other systems the Git
terminology may be different. For example, revert in Git has a different meaning
than in SVN or CVS. Nevertheless, Git is very capable and provides a lot of power to
its users. Learning to use that power can take some time; however, once it has been
learned, that power can be used by the team to increase their development speed.
For those teams coming from a non-distributed VCS, having a central repository may
seem like a good thing that they don't want to lose. However, while Git has been
designed as a distributed version control system (DVCS), with Git, you can still have
an official, canonical repository where all changes to the software must be stored.
With Git, because each developer's repository is complete, their work doesn't need
to be constrained by the availability and performance of the "central" server. During
outages or while offline, developers can still consult the full project history. Because
Git is flexible as well as being distributed, you can work the way you are accustomed
to but gain the additional benefits of Git, some of which you may not even realise
you're missing.
Now that you understand what version control is, what Git is and why software
teams should use it, read on to discover the benefits Git can provide across the
whole organization.
Why Git for your organization
Switching from a centralized version control system to Git changes the way your
development team creates software. And, if you’re a company that relies on its
software for mission-critical applications, altering your development workflow
impacts your entire business.
In this article, we’ll discuss how Git benefits each aspect of your organization, from
your development team to your marketing team, and everything in between. By the
end of this article, it should be clear that Git isn’t just for agile software development
—it’s for agile business.
Using feature branches is not only more reliable than directly editing production
code, but it also provides organizational benefits. They let you represent
development work at the same granularity as your agile backlog. For example,
you might implement a policy where each Jira ticket is addressed in its own feature
branch.
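A one-branch-per-ticket policy might look like the sketch below. The ticket key PROJ-123, the branch name, and the file are hypothetical; the branch-naming scheme is just one possible convention, not something Git or Jira requires.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "station docs" > README.md
git add README.md
git commit -q -m "Initial commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Work for the (hypothetical) ticket PROJ-123 happens on its own branch.
git checkout -q -b feature/PROJ-123-docking-status
echo "docking status: ok" > status.txt
git add status.txt
git commit -q -m "PROJ-123 Add docking status"

# The main branch is untouched until the feature is merged.
git checkout -q "$main_branch"
if [ -e status.txt ]; then before_merge="present"; else before_merge="absent"; fi
echo "status.txt on $main_branch before merge: $before_merge"

git merge -q --no-edit feature/PROJ-123-docking-status
git --no-pager log --oneline
```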
Distributed Development
In SVN, each developer gets a working copy that points back to a single central
repository. Git, however, is a distributed version control system. Instead of a working
copy, each developer gets their own local repository, complete with a full history of
commits.
Having a full local history makes Git fast, since it means you don’t need a network
connection to create commits, inspect previous versions of a file, or perform diffs
between commits.
Community
In many circles, Git has come to be the expected version control system for new
projects. If your team is using Git, odds are you won’t have to train new hires on your
workflow, because they’ll already be familiar with distributed development.
In addition, Git is very popular among open source projects. This means it’s easy to
leverage 3rd-party libraries and encourage others to fork your own open source
code.
As you might expect, Git works very well with continuous integration and continuous
delivery environments. Git hooks allow you to run scripts when certain events occur
inside of a repository, which lets you automate deployment to your heart’s content.
You can even build or deploy code from specific branches to different servers.
For example, you might want to configure Git to deploy the most recent commit
from the develop branch to a test server whenever anyone merges a pull request
into it. Combining this kind of build automation with peer review means you have
the highest possible confidence in your code as it moves from development to
staging to production.
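As a much simplified illustration of hooks, the sketch below installs a local post-commit hook in a scratch repository. It is not the server-side deployment described above: the "deploy" step is a hypothetical stand-in that only appends the new commit's hash to a file named deploy.log.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Hooks are executable scripts in .git/hooks named after the event they react to.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Hypothetical stand-in for a deploy step: record the commit that would ship.
git rev-parse --short HEAD >> deploy.log
EOF
chmod +x .git/hooks/post-commit

echo "v1" > app.txt
git add app.txt
git commit -q -m "Release v1"   # Git runs the hook right after creating the commit

cat deploy.log
```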
The shorter development cycle facilitated by Git makes it much easier to divide these
into individual releases. This gives marketers more to talk about, more often. In the
above scenario, marketing can build out three campaigns that revolve around each
feature, and thus target very specific market segments.
For instance, they might prepare a big PR push for the game changing feature, a
corporate blog post and newsletter blurb for Mary’s feature, and some guest posts
about Rick’s underlying UX theory for sending to external design blogs. All of these
activities can be synchronized with a separate release.
The benefits of Git for product management are much the same as for marketing.
More frequent releases mean more frequent customer feedback and faster updates
in reaction to that feedback. Instead of waiting for the next release 8 weeks from
now, you can push a solution out to customers as quickly as your developers can
write the code.
The feature branch workflow also provides flexibility when priorities change. For
instance, if you’re halfway through a release cycle and you want to postpone one
feature in lieu of another time-critical one, it’s no problem. That initial feature can sit
around in its own branch until engineering has time to come back to it.
This same functionality makes it easy to manage innovation projects, beta tests, and
rapid prototypes as independent codebases.
Pull requests take this one step further and provide a formal place for interested
parties to discuss the new interface. Designers can make any necessary changes, and
the resulting commits will show up in the pull request. This invites everybody to
participate in the iteration process.
Perhaps the best part of prototyping with branches is that it’s just as easy to merge
the changes into production as it is to throw them away. There’s no pressure to do
either one. This encourages designers and UI developers to experiment while
ensuring that only the best ideas make it through to the customer.
Customer support and customer success often have a different take on updates than
product managers. When a customer calls them up, they’re usually experiencing
some kind of problem. If that problem is caused by your company’s software, a bug
fix needs to be pushed out as soon as possible.
Git’s streamlined development cycle avoids postponing bug fixes until the next
monolithic release. A developer can patch the problem and push it directly to
production. Faster fixes mean happier customers and fewer repeat support tickets.
Instead of being stuck with “Sorry, we’ll get right on that,” your customer support
team can start responding with “We’ve already fixed it!”
To a certain extent, your software development workflow determines who you hire. It
always helps to hire engineers that are familiar with your technologies and
workflows, but using Git also provides other advantages.
Employees are drawn to companies that provide career growth opportunities, and
understanding how to leverage Git in both large and small organizations is a boon to
any programmer. By choosing Git as your version control system, you’re making the
decision to attract forward-looking developers.
Git is all about efficiency. For developers, it eliminates everything from the time
wasted passing commits over a network connection to the man hours required to
integrate changes in a centralized version control system. It even makes better use of
junior developers by giving them a safe environment to work in. All of this affects the
bottom line of your engineering department.
But, don’t forget that these efficiencies also extend outside your development team.
They prevent marketing from pouring energy into collateral for features that aren’t
popular. They let designers test new interfaces on the actual product with little
overhead. They let you react to customer complaints immediately.
Being agile is all about finding out what works as quickly as possible, magnifying
efforts that are successful, and eliminating ones that aren’t. Git serves as a multiplier
for all your business activities by making sure every department is doing their job
more efficiently.
Install Git
There are several ways to install Git on a Mac. In fact, if you've installed Xcode (or its
Command Line Tools), Git may already be installed. To find out, open a terminal and
enter git --version.
$ git --version
git version 2.7.0 (Apple Git-66)
Apple actually maintain and ship their own fork of Git, but it tends to lag behind
mainstream Git by several major versions. You may want to install a newer version of
Git using one of the methods below:
$ git --version
git version 2.9.2
4. Configure your Git username and email using the following commands,
replacing Emma's name with your own. These details will be associated with
any commits that you create:
5. (Optional) To make Git remember your username and password when working
with HTTPS repositories, configure the git-credential-osxkeychain helper.
3. Configure your Git username and email using the following commands,
replacing Emma's name with your own. These details will be associated with
any commits that you create:
4. (Optional) To make Git remember your username and password when working
with HTTPS repositories, install the git-credential-osxkeychain helper.
3. Install Git with bash completion, the OS X keychain helper, and the docs:
4. Configure your Git username and email using the following commands,
replacing Emma's name with your own. These details will be associated with
any commits that you create:
5. (Optional) To make Git remember your username and password when working
with HTTPS repositories, configure the git-credential-osxkeychain helper.
$ git credential-osxkeychain
usage: git credential-osxkeychain <get|store|erase>
If you receive a usage statement, skip to step 4. If the helper is not installed,
go to step 2.
$ curl -O http://github-media-downloads.s3.amazonaws.
$ sudo mv git-credential-osxkeychain /usr/local/bin/
The next time Git prompts you for a username and password, it will cache
them in your keychain for future use.
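Telling Git to use the helper is a single configuration setting. The sketch below assumes the git-credential-osxkeychain helper is already installed; it runs against a scratch HOME directory (the demo_git wrapper is an illustration device, not part of Git) so it can be tried without changing your real settings. Run the same git config command without the wrapper to apply it to your own account.

```shell
# Scratch HOME so the example does not modify your real ~/.gitconfig.
demo_home=$(mktemp -d)
demo_git() { HOME="$demo_home" XDG_CONFIG_HOME="$demo_home/.config" git "$@"; }

# Store credentials for HTTPS remotes in the macOS keychain.
demo_git config --global credential.helper osxkeychain

# Read the setting back to confirm it was saved.
demo_git config --global credential.helper
```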
To learn how to use Git with Sourcetree (and how to host your Git repositories on
Bitbucket) you can follow our comprehensive Git tutorial with Bitbucket and
Sourcetree.
1. From your terminal, install Xcode's Command Line Tools (if you haven't
already):
$ xcode-select --install
2. Install Homebrew.
2. When you've successfully started the installer, you should see the Git
Setup wizard screen. Follow the Next and Finish prompts to complete the
installation. The default options are pretty sensible for most users.
3. Open a Command Prompt (or Git Bash if during installation you elected not to
use Git from the Windows Command Prompt).
4. Run the following commands to configure your Git username and email,
replacing Emma's name with your own. These details
will be associated with any commits that you create:
Bitbucket supports pushing and pulling over HTTP to your remote Git
repositories on Bitbucket. Every time you interact with the remote repository,
you must supply a username/password combination. You can store these
credentials, instead of supplying the combination every time, with the Git
Credential Manager for Windows.
$ git --version
git version 2.9.2
3. Configure your Git username and email using the following commands,
replacing Emma's name with your own. These details will be associated with
any commits that you create:
Fedora (dnf/yum)
Git packages are available via both yum and dnf:
1. From your shell, install Git using dnf (or yum, on older versions of Fedora):
or
$ git --version
git version 2.9.2
3. Configure your Git username and email using the following commands,
replacing Emma's name with your own. These details will be associated with
any commits that you create
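The configuration step, which is the same on every platform, might look like the sketch below. The name "Emma Paris" and the example.com address are placeholders to replace with your own details. The demo_git wrapper is only an illustration device that points Git at a scratch HOME directory so the example can be run without changing your real settings; run the git config commands without it to configure your own account.

```shell
# Scratch HOME so the example does not modify your real ~/.gitconfig.
demo_home=$(mktemp -d)
demo_git() { HOME="$demo_home" XDG_CONFIG_HOME="$demo_home/.config" git "$@"; }

# --global applies the identity to every repository for the current user.
demo_git config --global user.name "Emma Paris"           # placeholder name
demo_git config --global user.email "eparis@example.com"  # placeholder email

# Read the values back to confirm they were saved.
demo_git config --global user.name
demo_git config --global user.email
```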
Git requires several dependencies to build on Linux. These are available via apt:
1. From your shell, install the necessary dependencies using apt-get:
2. Clone the Git source (or if you don't yet have a version of Git
installed, download and extract it):
Fedora
Git requires several dependencies to build on Linux. These are available via
both yum and dnf:
1. From your shell, install the necessary build dependencies using dnf (or yum,
on older versions of Fedora):
or using yum. For yum, you may need to install the Extra Packages for
Enterprise Linux (EPEL) repository first:
3. Clone the Git source (or if you don't yet have a version of Git
installed, download and extract it):
git init <directory>
    Create empty Git repo in specified directory. Run with no arguments
    to initialize the current directory as a git repository.
git clone <repo>
    Clone repo located at <repo> onto local machine. Original repo can be
    located on the local filesystem or on a remote machine via HTTP or SSH.
git config user.name <name>
    Define author name to be used for all commits in current repo. Devs
    commonly use --global flag to set config options for current user.
git add <directory>
    Stage all changes in <directory> for the next commit.
    Replace <directory> with a <file> to change a specific file.
git commit -m "<message>"
    Commit the staged snapshot, but instead of launching a text editor,
    use <message> as the commit message.
git status
    List which files are staged, unstaged, and untracked.
git log
    Display the entire commit history using the default format.
    For customization see additional options.
git diff
    Show unstaged changes between your index and working directory.
git commit --amend
    Replace the last commit with the staged changes and last commit
    combined. Use with nothing staged to edit the last commit's message.
git rebase <base>
    Rebase the current branch onto <base>. <base> can be a commit ID,
    a branch name, a tag, or a relative reference to HEAD.
git reflog
    Show a log of changes to the local repository's HEAD. Add
    --relative-date flag to show date info or --all to show all refs.

Undoing Changes
git revert <commit>
    Create new commit that undoes all of the changes made in
    <commit>, then apply it to the current branch.
git reset <file>
    Remove <file> from the staging area, but leave the working directory
    unchanged. This unstages a file without overwriting any changes.
git clean -n
    Shows which files would be removed from working directory. Use
    the -f flag in place of the -n flag to execute the clean.

Git Branches
git branch
    List all of the branches in your repo. Add a <branch> argument to
    create a new branch with the name <branch>.
git checkout -b <branch>
    Create and check out a new branch named <branch>. Drop the -b
    flag to checkout an existing branch.
git merge <branch>
    Merge <branch> into the current branch.

Remote Repositories
git remote add <name> <url>
    Create a new connection to a remote repo. After adding a remote,
    you can use <name> as a shortcut for <url> in other commands.
git fetch <remote> <branch>
    Fetches a specific <branch> from the repo. Leave off <branch> to
    fetch all remote refs.
git pull <remote>
    Fetch the specified remote's copy of current branch and immediately
    merge it into the local copy.
git push <remote> <branch>
    Push the branch to <remote>, along with necessary commits and
    objects. Creates named branch in the remote repo if it doesn't exist.

git config
git config --global user.name <name>
    Define the author name to be used for all commits by the current user.
git config --global user.email <email>
    Define the author email to be used for all commits by the current user.
git config --global alias.<alias-name> <git-command>
    Create shortcut for a Git command. E.g. alias.glog log --graph
    --oneline will set git glog equivalent to git log --graph --oneline.
git config --system core.editor <editor>
    Set text editor used by commands for all users on the machine. <editor>
    arg should be the command that launches the desired editor (e.g., vi).
git config --global --edit
    Open the global configuration file in a text editor for manual editing.

git log
git log -<limit>
    Limit number of commits by <limit>. E.g. git log -5 will limit to 5
    commits.
git log --oneline
    Condense each commit to a single line.
git log -p
    Display the full diff of each commit.
git log --stat
    Include which files were altered and the relative number of lines
    that were added or deleted from each of them.
git log --author="<pattern>"
    Search for commits by a particular author.
git log --grep="<pattern>"
    Search for commits with a commit message that matches <pattern>.
git log <since>..<until>
    Show commits that occur between <since> and <until>. Args can be a
    commit ID, branch name, HEAD, or any other kind of revision reference.
git log -- <file>
    Only display commits that have the specified file.
git log --graph --decorate
    --graph flag draws a text based graph of commits on left side of commit
    msgs. --decorate adds names of branches or tags of commits shown.

git diff
git diff HEAD
    Show difference between working directory and last commit.
git diff --cached
    Show difference between staged changes and last commit.

git reset
git reset
    Reset staging area to match most recent commit, but leave the
    working directory unchanged.
git reset --hard
    Reset staging area and working directory to match most recent
    commit and overwrites all changes in the working directory.
git reset <commit>
    Move the current branch tip backward to <commit>, reset the
    staging area to match, but leave the working directory alone.
git reset --hard <commit>
    Same as previous, but resets both the staging area & working directory to
    match. Deletes uncommitted changes, and all commits after <commit>.

git rebase
git rebase -i <base>
    Interactively rebase current branch onto <base>. Launches editor to enter
    commands for how each commit will be transferred to the new base.

git pull
git pull --rebase <remote>
    Fetch the remote's copy of current branch and rebases it into the local
    copy. Uses git rebase instead of merge to integrate the branches.

git push
git push <remote> --force
    Forces the git push even if it results in a non-fast-forward merge. Do not use
    the --force flag unless you're absolutely sure you know what you're doing.
git push <remote> --all
    Push all of your local branches to the specified remote.
git push <remote> --tags
    Tags aren't automatically pushed when you push a branch or use the
    --all flag. The --tags flag sends all of your local tags to the remote repo.
This tutorial provides an overview of how to set up a repository (repo) under Git
version control. This resource will walk you through initializing a Git repository for a
new or existing project. Included below are workflow examples of repositories both
created locally and cloned from remote repositories. This guide assumes a basic
familiarity with a command-line interface.
By the end of this module, you should be able to create a Git repo, use common Git
commands, commit a modified file, view your project’s history and configure a
connection to a Git hosting service (Bitbucket).
A Git repository is a virtual storage of your project. It allows you to save versions of
your code, which you can access when needed.
To create a new repo, you'll use the git init command. git init is a one-time
command you use during the initial setup of a new repo. Executing this command
will create a new .git subdirectory in your current working directory. This will also
create a new master branch.
cd /path/to/your/existing/code
git init
Pointing git init to an existing project directory will execute the same initialization
setup as mentioned above, but scoped to that project directory.
git init <project directory>
Visit the git init page for a more detailed resource on git init.
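As a minimal sketch of both forms of git init described above, the following script initializes one directory from the inside and another by passing its path. The scratch location and the directory names existing-code and new-project are assumptions made for illustration only.

```shell
# Scratch location for the sketch; any empty directory would do.
work="$(mktemp -d)"

# Case 1: initialize the directory you are already in.
mkdir "$work/existing-code"
cd "$work/existing-code"
git init -q

# Case 2: point git init at a project directory (created if it is missing).
git init -q "$work/new-project"

# Both directories now contain a .git subdirectory.
ls -d "$work/existing-code/.git" "$work/new-project/.git"
```

The -q flag only silences the informational output; dropping it shows the message Git prints when it creates the repository.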
If a project has already been set up in a central repository, the clone command is the
most common way for users to obtain a local development clone. Like git init,
cloning is generally a one-time operation. Once a developer has obtained a working
copy, all version control operations are managed through their local repository.
git clone is used to create a copy or clone of remote repositories. You pass
git clone a repository URL. Git supports a few different network protocols and
corresponding URL formats. In this example, we'll be using the Git SSH protocol. Git
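SSH URLs follow a template of: git@HOSTNAME:USERNAME/REPONAME.git
An example Git SSH URL would be git@bitbucket.org:rhyolight/javascript-data-store.git,
where the template values are:
HOSTNAME: bitbucket.org
USERNAME: rhyolight
REPONAME: javascript-data-store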
When git clone is executed with this URL, the latest version of the remote repo
files on the master branch will be pulled down and added to a new folder. The new
folder will be named after the REPONAME, in this case javascript-data-store. The
folder will contain the full history of the remote repository and a newly created
master branch.
For more documentation on git clone usage and supported Git URL formats, visit
the git clone page.
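Cloning over SSH needs network access and credentials, so the sketch below substitutes a local repository for the remote one. Apart from the folder name javascript-data-store, every path, the placeholder identity, and the README file are assumptions made for the illustration.

```shell
work="$(mktemp -d)"

# Stand-in for the remote: a local repository with one commit.
git init -q "$work/javascript-data-store"
cd "$work/javascript-data-store"
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "initial commit"

# Clone it; the new folder is named after the repository.
mkdir "$work/clones"
cd "$work/clones"
git clone -q "$work/javascript-data-store"

# The clone carries the full history and remembers where it came from.
git -C javascript-data-store log --oneline
git -C javascript-data-store remote -v
```

With a hosted repository, only the argument to git clone changes: the SSH URL takes the place of the local path.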
Now that you have a repository cloned or initialized, you can commit file version
changes to it. The following example assumes you have set up a project at
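/path/to/project. The steps being taken in this example are:
Change directories to /path/to/project
Create a new file CommitTest.txt with the contents "test content for git tutorial"
Add CommitTest.txt to the repository staging area with git add
Create a new commit with a message describing what work was done in the
commit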
cd /path/to/project
echo "test content for git tutorial" >> CommitTest.txt
git add CommitTest.txt
git commit -m "added CommitTest.txt to the repo"
After executing this example, your repo will now have CommitTest.txt added to the
history and will track future updates to the file.
This example introduced two additional git commands: add and commit. This was a
very limited example, but both commands are covered in more depth on the git
add and git commit pages. Another common use case for git add is the --all
option. Executing git add --all will take any changed and untracked files in the
repo and stage them for the next commit by updating the repo's index.
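The following sketch extends the CommitTest.txt example to show git add --all picking up both a modified file and a new one. The scratch repository, the placeholder identity, and the Notes.txt file are assumptions added for the illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"

echo "test content for git tutorial" >> CommitTest.txt
git add CommitTest.txt
git commit -q -m "added CommitTest.txt to the repo"

# Make one change to a tracked file and add one new, untracked file.
echo "more content" >> CommitTest.txt
echo "notes" > Notes.txt
git status --short          # one modified file, one untracked file

# Stage everything in one step, then commit.
git add --all
git status --short          # both files are now staged
git commit -q -m "update CommitTest.txt and add Notes.txt"
```

Running git status before and after the add is a cheap way to confirm what the next commit will contain.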
It’s important to understand that Git’s idea of a “working copy” is very different from
the working copy you get by checking out source code from an SVN repository.
Unlike SVN, Git makes no distinction between the working copies and the central
repository—they're all full-fledged Git repositories.
This makes collaborating with Git fundamentally different than with SVN. Whereas
SVN depends on the relationship between the central repository and the working
copy, Git’s collaboration model is based on repository-to-repository interaction.
Instead of checking a working copy into SVN’s central repository, you push or pull
commits from one repository to another.
Of course, there’s nothing stopping you from giving certain Git repos special
meaning. For example, by simply designating one Git repo as the “central”
repository, it’s possible to replicate a centralized workflow using Git. This is
accomplished through conventions rather than being hardwired into the VCS itself.
If you used git init to make a fresh repo, you'll have no remote repo to push
changes to. A common pattern when initializing a new repo is to go to a hosted Git
service like Bitbucket and create a repo there. The service will provide a Git URL that
you can then add to your local Git repository and git push to the hosted repo. Once
you have created a remote repo with your service of choice you will need to update
your local repo with a mapping. We discuss this process in the Configuration & Set
Up guide below.
If you prefer to host your own remote repo, you'll need to set up a "Bare Repository."
Both git init and git clone accept a --bare argument. The most common use
case for a bare repo is to create a remote central Git repository.
Once you have a remote repo set up, you will need to add a remote repo URL to your
local git config, and set an upstream branch for your local branches. The
git remote command offers such utility.
Running git push -u <remote_name> <local_branch_name> will push the local repo
branch <local_branch_name> to the remote repo at <remote_name> and record it as
the upstream of that branch.
For a more in-depth look at git remote, see the git remote page.
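The sequence of adding a remote and pushing with an upstream can be sketched as follows. A bare repository on the local disk stands in for the hosted repo here; the remote name origin, the paths, and the placeholder identity are assumptions, not values from the tutorial.

```shell
work="$(mktemp -d)"

# Stand-in for the hosted repo: a bare repository on the local disk.
git init -q --bare "$work/remote.git"

# A fresh local repo with one commit.
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "initial commit"

# Map the remote URL to the name "origin", then push the current
# branch and record it as the upstream.
git remote add origin "$work/remote.git"
git push -q -u origin HEAD

git remote -v
git branch -vv      # shows the upstream each local branch tracks
```

Pushing HEAD avoids hard-coding a branch name, which can differ between Git installations.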
In addition to configuring a remote repo URL, you may also need to set global Git
configuration options such as username, or email. The git config command lets you
configure your Git installation (or an individual repository) from the command line.
This command can define everything from user info, to preferences, to the behavior
of a repository. Several common configuration options are listed below.
Git stores configuration options in three separate files, which lets you scope options
to individual repositories (local), users (global), or the entire system (system):
Local: <repo>/.git/config – Repository-specific settings.
Global: ~/.gitconfig – User-specific settings. This is where options set with the
--global flag are stored.
System: $(prefix)/etc/gitconfig – System-wide settings.
git config user.name <name>
Define the author name to be used for all commits in the current repository.
Typically, you’ll want to use the --global flag to set configuration options for the
current user.
git config --global user.name <name>
Define the author name to be used for all commits by the current user.
Adding the --local option, or not passing a config level option at all, will set the
user.name for the current local repository.
git config --global user.email <email>
Define the author email to be used for all commits by the current user.
git config --global alias.<alias-name> <git-command>
Create a shortcut for a Git command. This is a powerful utility to create custom
shortcuts for commonly used git commands. A simplistic example would be:
git config --global alias.ci commit
This creates a ci command that you can execute as a shortcut to git commit. To
learn more about git aliases visit the git config page.
git config --system core.editor <editor>
Define the text editor used by commands like git commit for all users on the current
machine. The <editor> argument should be the command that launches the desired
editor (e.g., vi). This example introduces the --system option. The --system option
will set the configuration for the entire system, meaning all users and repos on a
machine. For more detailed information on configuration levels visit the git config
page.
git config --global --edit
Open the global configuration file in a text editor for manual editing. An in-depth
guide on how to configure a text editor for git to use can be found on the Git config
page.
Discussion
All configuration options are stored in plaintext files, so the git config command is
really just a convenient command-line interface. Typically, you’ll only need to
configure a Git installation the first time you start working on a new development
machine, and for virtually all cases, you'll want to use the --global flag. One
important exception is to override the author email address. You may wish to set
your personal email address for personal and open source repositories, and your
professional email address for work-related repositories.
Git stores configuration options in three separate files, which lets you scope options
to individual repositories, users, or the entire system:
When options in these files conflict, local settings override user settings, which
override system-wide. If you open any of these files, you’ll see something like the
following:
You can manually edit these values to the exact same effect as git config.
Example
The first thing you’ll want to do after installing Git is tell it your name/email and
customize some of the default settings. A typical initial configuration might look
something like the following:
This will produce the ~/.gitconfig file from the previous section. Take a more
in-depth look at git config on the git config page.
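As a rough sketch of such an initial configuration, the script below writes a few typical settings and prints the resulting plaintext file. It uses git config --file with a scratch file so that nothing in your real setup changes; replacing --file "$cfg" with --global would write to ~/.gitconfig instead. The name, email address, editor, and aliases are placeholders.

```shell
cfg="$(mktemp -d)/gitconfig"

# Tell Git who you are (placeholder identity).
git config --file "$cfg" user.name "John Smith"
git config --file "$cfg" user.email "john@example.com"

# Select a text editor.
git config --file "$cfg" core.editor vim

# Add a couple of short aliases.
git config --file "$cfg" alias.st status
git config --file "$cfg" alias.ci commit

# The result is an ordinary plaintext file with [section] headers.
cat "$cfg"
```

Editing the printed file by hand has the same effect as running the commands again with different values.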
Summary
Read our guide about which code repository system is right for your team!
git init
This page will explore the git init command in depth. By the end of this page you
will be informed on the core functionality and extended feature set of git init. This
exploration includes:
.git directory overview
git init templates
The git init command creates a new Git repository. It can be used to convert an
existing, unversioned project to a Git repository or initialize a new, empty repository.
Most other Git commands are not available outside of an initialized repository, so
this is usually the first command you'll run in a new project.
Aside from the .git directory, in the root directory of the project, an existing project
remains unaltered (unlike SVN, Git doesn't require a .git subdirectory in every
subdirectory).
Usage
Compared to SVN, the git init command is an incredibly easy way to create new
version-controlled projects. Git doesn’t require you to create a repository, import
files, and check out a working copy. Additionally, Git does not require any pre-
existing server or admin privileges. All you have to do is cd into your project
subdirectory and run git init, and you'll have a fully functional Git repository.
git init
Transform the current directory into a Git repository. This adds a .git subdirectory
to the current directory and makes it possible to start recording revisions of the
project.
If you've already run git init on a project directory and it contains a .git
subdirectory, you can safely run git init again on the same project directory. It will
not override an existing .git configuration.
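A quick way to convince yourself of this is to store a repository-local setting, run git init a second time, and read the setting back. The sketch below does that in a scratch directory; the email address is a placeholder.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "user@example.com"   # a repository-local setting

# Running git init again reinitializes the repo without wiping .git/config.
git init -q
git config --local --get user.email
```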
git init --bare <directory>
Initialize an empty Git repository, but omit the working directory. Shared repositories
should always be created with the --bare flag (see discussion below).
Conventionally, repositories initialized with the --bare flag end in .git. For example,
the bare version of a repository called my-project should be stored in a directory
called my-project.git.
The --bare flag creates a repository that doesn’t have a working directory, making it
impossible to edit files and commit changes in that repository. You would create a
bare repository to git push and git pull from, but never directly commit to it. Central
repositories should always be created as bare repositories because pushing branches
to a non-bare repository has the potential to overwrite changes. Think of --bare as a
way to mark a repository as a storage facility, as opposed to a development
environment. This means that for virtually all Git workflows, the central repository is
bare, and developers' local repositories are non-bare.
The most common use case for git init --bare is to create a remote central
repository:
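In practice the central repository usually lives on a server and is created over SSH. The sketch below runs the same git init --bare step locally so it can be tried without a server; the paths are assumptions, and my-project.git follows the naming convention described above.

```shell
work="$(mktemp -d)"

# Create the central repository; by convention its name ends in .git.
git init -q --bare "$work/my-project.git"

# A bare repo has no working tree: its top level holds what .git normally would.
git -C "$work/my-project.git" rev-parse --is-bare-repository
ls "$work/my-project.git"

# Developers work in ordinary (non-bare) clones of it.
git clone -q "$work/my-project.git" "$work/dev"
git -C "$work/dev" rev-parse --is-bare-repository
```

Cloning an empty repository prints a warning, which is expected here because nothing has been pushed yet.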
git init <directory> --template=<template_directory>
Initializes a new Git repository and copies files from the <template_directory> into
the repository.
The default templates are a good reference and example of how to utilize template
features. A powerful feature of templates that's exhibited in the default templates is
Git Hook configuration. You can create a template with predefined Git hooks and
initialize your new git repositories with common hooks ready to go. Learn more
about Git Hooks at the Git Hook page.
Configuration
-q
--quiet
Only prints "critical level" messages, errors, and warnings. All other output is
silenced.
--bare
--template=<template_directory>
Specifies the directory from which templates will be used. (See the "Git Init
Templates" section above.)
--separate-git-dir=<git dir>
Creates a text file containing the path to <git dir>. This file acts as a link to the
.git directory. This is useful if you would like to store your .git directory in a
separate location or drive from your project's working files. Some common use cases
for --separate-git-dir are:
To keep your system configuration "dotfiles" ( .bashrc, .vimrc, etc.) in the
home directory while keeping the .git folder elsewhere
Your Git history has grown very large in disk size and you need to move it
elsewhere to a separate high-capacity drive
You can call git init --separate-git-dir on an existing repository and the
.git dir will be moved to the specified <git dir> path.
--shared[=(false|true|umask|group|all|world|everybody|0xxx)]
Set access permissions for the new repository. This specifies which users and groups
using Unix-level permissions are allowed to push/pull to the repository.
Examples
cd /path/to/code
git init
git add .
git commit
mkdir -p /absolute/path/to/template
echo "Hello World" >> /absolute/path/to/template/README
git init /new/repo/path --template=/absolute/path/to/templ
cd /new/repo/path
cat /new/repo/path/.git/README
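A self-contained variant of the template example is sketched below, using scratch directories in place of the placeholder paths. It relies on the fact that Git copies template files into the new repository's .git directory rather than into the working tree; the directory names template and new-repo are assumptions.

```shell
work="$(mktemp -d)"

# Build a template directory containing a single file.
mkdir -p "$work/template"
echo "Hello World" >> "$work/template/README"

# Initialize a new repo from the template.
git init -q --template="$work/template" "$work/new-repo"

# Template files land in the new repo's .git directory.
cat "$work/new-repo/.git/README"
```

A template that contains a hooks directory would be copied the same way, which is how predefined Git hooks can be shipped with every new repository.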
git clone
Here we'll examine the git clone command in depth. git clone is a Git command
line utility which is used to target an existing repository and create a clone, or copy
of the target repository. In this page we'll discuss extended configuration options
and common use cases of git clone. Some points we'll cover here are:
On the setting up a repository guide, we covered a basic use case of git clone. This
page will explore more complex cloning and configuration scenarios.
If a project has already been set up in a central repository, the git clone command
is the most common way for users to obtain a development copy. Like git init,
cloning is generally a one-time operation. Once a developer has obtained a working
copy, all version control operations and collaborations are managed through their
local repository.
Repo-to-repo collaboration
It’s important to understand that Git’s idea of a “working copy” is very different from
the working copy you get by checking out code from an SVN repository. Unlike SVN,
Git makes no distinction between the working copy and the central repository—
they're all full-fledged Git repositories.
This makes collaborating with Git fundamentally different than with SVN. Whereas
SVN depends on the relationship between the central repository and the working
copy, Git’s collaboration model is based on repository-to-repository interaction.
Instead of checking a working copy into SVN’s central repository,
you push or pull commits from one repository to another.
Of course, there’s nothing stopping you from giving certain Git repos special
meaning. For example, by simply designating one Git repo as the “central”
repository, it’s possible to replicate a centralized workflow using Git. The point is, this
is accomplished through conventions rather than being hardwired into the VCS itself.
Usage
git clone is primarily used to point to an existing repo and make a clone or copy of
that repo in a new directory, at another location. The original repository can be
located on the local filesystem or on a remote machine accessible via supported
protocols. The git clone command copies an existing Git repository. This is sort of
like SVN checkout, except the “working copy” is a full-fledged Git repository—it has
its own history, manages its own files, and is a completely isolated environment from
the original repository.
git clone <repo> <directory>
Clone the repository located at <repo> into the folder called <directory> on the
local machine.
git clone --branch <tag> <repo>
Clone the repository located at <repo> and only clone the ref for <tag>.
Shallow clone
Configuration options
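The above example would clone the remote Git repository and check out the
new_feature branch instead of the branch the remote's HEAD points to. This is
purely a convenience utility that saves you from checking out the HEAD ref of the
repository and then having to additionally switch to the ref you need. Add the
--single-branch option if you want to download only that branch.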
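The branch and shallow clone options can be tried without a network connection, as in the sketch below. A local repository with two commits stands in for the remote; its path, the file name, and the placeholder identity are assumptions. The shallow clone uses a file:// URL because Git ignores --depth when cloning from a plain local path.

```shell
work="$(mktemp -d)"

# Stand-in remote with two commits and a new_feature branch.
git init -q "$work/src"
cd "$work/src"
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"
echo "one" > file.txt && git add file.txt && git commit -q -m "first"
git checkout -q -b new_feature
echo "two" >> file.txt && git commit -q -a -m "second"
git checkout -q -                          # back to the original branch

# Check out new_feature directly instead of the remote's HEAD.
git clone -q --branch new_feature "$work/src" "$work/by-branch"

# Fetch only that branch, and only its most recent commit.
git clone -q --branch new_feature --single-branch --depth=1 \
    "file://$work/src" "$work/shallow"

git -C "$work/by-branch" symbolic-ref --short HEAD
git -C "$work/shallow" rev-list --count HEAD
```

The first clone still contains every branch of the source; only the second one limits what is downloaded.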
git clone --template=<template directory> <repo location>
Clones the repo at <repo location> and applies the template from
<template directory> to the newly created local repository. A thorough reference on Git
templates can be found on our git init page.
Git URLs
Git has its own URL syntax which is used to pass remote repository locations to Git
commands. Because git clone is most commonly used on remote repositories we
will examine Git URL syntax here.
- GIT
A protocol unique to git. Git comes with a daemon that runs on port 9418. The
protocol is similar to SSH; however, it has NO AUTHENTICATION.
git://host.xz[:port]/path/to/repo.git/
- HTTP
Hypertext Transfer Protocol. The protocol of the web, most commonly used for
transferring web page HTML data over the Internet. Git can be configured to
communicate over HTTP.
http[s]://host.xz[:port]/path/to/repo.git/
Summary
In this document we took a deep look at git clone. The most important takeaways
are:
4. There are many different configuration options available that change the content
of the clone
For further, deeper reference on git clone functionality, consult the official Git
documentation. We also cover practical examples of git clone in our setting up a
repository guide.
git config
In this document, we'll take an in-depth look at the git config command. We briefly
discussed git config usage on our Setting up a Repository page. The git config
command is a convenience function that is used to set Git configuration values on a
global or local project level. These configuration levels correspond to .gitconfig text
files. Executing git config will modify a configuration text file. We'll be covering
common configuration settings like email, username, and editor. We'll discuss Git
aliases, which allow you to create shortcuts for frequently used Git operations.
Becoming familiar with git config and the various Git configuration settings will
help you create a powerful, customized Git workflow.
Usage
The most basic use case for git config is to invoke it with a configuration name,
which will display the set value at that name. Configuration names are dot delimited
strings composed of a 'section' and a 'key' based on their hierarchy. For example:
git config user.email
In this example, email is a child property of the user configuration block. This will
return the configured email address, if any, that Git will associate with locally created
commits.
--local
--global
--system
System-level configuration is applied across an entire machine. This covers all users
on an operating system and all repos. The system-level configuration file lives in a
gitconfig file off the system root path: $(prefix)/etc/gitconfig on Unix systems.
On Windows, this file can be found at
C:\Documents and Settings\All Users\Application Data\Git\config on Windows
XP, and at C:\ProgramData\Git\config on Windows Vista and newer.
Thus the order of priority for configuration levels is: local, global, system. This means
when looking for a configuration value, Git will start at the local level and bubble up
to the system level.
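The lookup order can be observed directly, as in the following sketch. It points HOME at a scratch directory so that the --global writes never touch your real ~/.gitconfig; that redirection, the paths, and the two email addresses are assumptions made for the demonstration.

```shell
# Use a scratch HOME so the sketch never touches your real ~/.gitconfig.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

repo="$(mktemp -d)"
cd "$repo"
git init -q

git config --global user.email "global@example.com"
git config --local  user.email "local@example.com"

# The local value wins while it is set...
with_local="$(git config user.email)"

# ...and Git falls back to the global value once it is removed.
git config --local --unset user.email
without_local="$(git config user.email)"

echo "with local:    $with_local"
echo "without local: $without_local"
```

This is the same mechanism that lets a work repository carry a professional email address while personal repositories use the global one.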
Writing a value
Expanding on what we already know about git config, let's look at an example in
which we write a value:
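A minimal sketch of writing a value and reading it back is shown here. It writes at the local level inside a scratch repository so it is safe to run; adding --global would store the value in ~/.gitconfig instead. The email address is a placeholder.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Write: the configuration name followed by the value. With no level
# flag, the value goes to the repository's own .git/config.
git config user.email "your_email@example.com"

# Read it back by passing the name alone.
git config user.email

# The same setting as it is stored on disk.
cat .git/config
```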
Many Git commands will launch a text editor to prompt for further input. One of the
most common use cases for git config is configuring which editor Git should use.
Listed below is a table of popular editors and matching git config commands:
Sublime Text (Mac)
    git config --global core.editor "subl -n -w"
Sublime Text (Win, 32-bit install)
    git config --global core.editor "'c:/program files (x86)/sublime text 3/sublimetext.exe' -w"
Sublime Text (Win, 64-bit install)
    git config --global core.editor "'c:/program files/sublime text 3/sublimetext.exe' -w"
Merge tools
In the event of a merge conflict, Git will launch a "merge tool." By default, Git uses an
internal implementation of the common Unix diff program. The internal Git diff is a
minimal merge conflict viewer. There are many external third-party merge conflict
resolution tools that can be used instead. For an overview of various merge tools and
configuration, see our guide on tips and tools to resolve conflicts with Git.
Colored outputs
Git supports colored terminal output which helps with rapidly reading Git output.
You can customize your Git output to use a personalized color theme. The
git config command is used to set these color values.
color.ui
This is the master variable for Git colors. Setting it to false will disable all Git's
colored terminal output.
By default, color.ui is set to auto which will apply colors to the immediate terminal
output stream. The auto setting will omit color code output if the output stream is
redirected to a file or piped to another process.
You can set the color.ui value to always which will also apply color code output
when redirecting the output stream to files or pipes. This can unintentionally cause
problems since the receiving pipe may not be expecting color-coded input.
normal
black
red
green
yellow
blue
magenta
cyan
white
Colors may also be specified as hexadecimal color codes like #ff0000, or ANSI 256
color values if your terminal supports it.
2. color.branch.<slot>
This value is also applicable to Git branch output. <slot> is one of the
following:
1. current: the current branch
3. color.diff
Applies colors to git diff, git log, and git show output
Configuring a <slot> value under color.diff tells git which part of the patch
to use a specific color on.
1. context: The context text of the diff. Git context is the lines of text
content shown in a diff or patch that highlights changes.
5. color.decorate.<slot>
Customize the color for git log --decorate output. The supported <slot>
values are: branch, remoteBranch, tag, stash, or HEAD. They are respectively
applicable to local branches, remote-tracking branches, tags, stashed changes
and HEAD.
6. color.grep
7. color.grep.<slot>
Also applicable to git grep. The <slot> variable specifies which part of the
grep output to apply color.
1. context: non-matching text in context lines
8. color.interactive
This variable applies color for interactive prompts and displays. Examples are
git add --interactive and git clean --interactive
9. color.interactive.<slot>
10. color.pager
11. color.showBranch
Enables or disables color output for the git show-branch command
12. color.status
A boolean value that enables or disables color output for Git status
13. color.status.<slot>
Used to specify custom color for specified git status elements. <slot> supports the
following values:
1. header
Targets the header text of the status area
2. added or updated
Both target files which are added but not committed
3. changed
Targets files that are modified but not added to the git index
4. untracked
Targets files which are not tracked by Git
5. branch
Applies color to the current branch
6. nobranch
The color the "no branch" warning is shown in
7. unmerged
Colors files which have unmerged changes
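To make the slot mechanism concrete, the sketch below assigns colors to a few git status slots in a scratch repository and lists what was stored. The particular color choices are arbitrary, and the settings are written at the local level so they do not affect other repositories.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Color terminal output only, not redirected or piped output.
git config color.ui auto

# Pick colors for individual git status slots.
git config color.status.changed yellow
git config color.status.untracked magenta
git config color.status.branch cyan

# List the color settings stored for this repository.
git config --local --get-regexp '^color\.'
```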
Aliases
You may be familiar with the concept of aliases from your operating system
command-line; if not, they're custom shortcuts that expand to longer or combined
commands. Aliases save you the time and energy cost
of typing frequently used commands. Git provides its own alias system. A common
use case for Git aliases is shortening the commit command. Git aliases are stored in
Git configuration files. This means you can use the git config command to
configure aliases.
This example creates an alias amend which composes the ci alias into a new alias
that uses the --amend flag.
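The ci and amend aliases can be sketched as follows. The aliases are stored at the local level of a scratch repository here, whereas the tutorial would typically use --global; the file name, commit message, and placeholder identity are assumptions. Only the ci alias is invoked, since expanding an alias that refers to another alias is only supported by reasonably recent versions of Git.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"

# Shorten the commit command.
git config alias.ci commit
# Build a second alias on top of the first one.
git config alias.amend "ci --amend"

echo "draft" > notes.txt
git add notes.txt
git ci -q -m "add notes"      # same as: git commit -q -m "add notes"

# Show the aliases that are currently defined.
git config --get-regexp '^alias\.'
```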
Git has several "whitespace" features that can be configured to highlight whitespace
issues when using git diff. The whitespace issues will be highlighted using the
configured color color.diff.whitespace
Summary
In this article, we covered the use of the git config command. We discussed how the
command is a convenience method for editing raw git config files on the filesystem.
We looked at basic read and write operations for configuration options. We took a
look at common config patterns:
Overall, git config is a helper tool that provides a shortcut to editing raw
git config files on disk. We covered in depth personal customization options. Basic
knowledge of git configuration options is a prerequisite for setting up a repository.
See our guide there for a demonstration of the basics.
Git Alias
This section will focus on Git aliases. To better understand the value of Git aliases we
must first discuss what an alias is. The term alias is synonymous with a shortcut. Alias
creation is a common pattern found in other popular utilities like the bash shell. Aliases
are used to create shorter commands that map to longer commands. Aliases enable
more efficient workflows by requiring fewer keystrokes to execute a command. For a
brief example, consider the git checkout command. The checkout command is a
frequently used git command, which adds up in cumulative keystrokes over time. An
alias can be created that maps git co to git checkout, which saves precious human
fingertip power by allowing the shorter keystroke form: git co to be typed instead.
The previous code example creates globally stored shortcuts for common git
commands. Creating the aliases will not modify the source commands. So
git checkout will still be available even though we now have the git co alias. These
aliases were created with the --global flag which means they will be stored in Git's
global operating system level configuration file. On Linux systems, the global config
file is located in the user's home directory at ~/.gitconfig.
[alias]
co = checkout
br = branch
ci = commit
st = status
This demonstrates that the aliases are now equivalent to the source commands.
Usage
Git aliasing is enabled through the use of git config. For command-line option and
usage examples, please review the git config documentation.
Examples
The preceding code example creates a new alias unstage. This now enables the
invocation of git unstage, which will perform a reset on the staging
area. This makes the following two commands equivalent.
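One way such an unstage alias could be defined and used is sketched below, assuming it expands to reset HEAD --. The alias is kept local to a scratch repository, and the file name fileA and the placeholder identity are illustrative.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # identity for this sketch only
git config user.email "user@example.com"
echo "a" > fileA
git add fileA
git commit -q -m "add fileA"

# Define the alias (repo-local here; add --global to keep it everywhere).
git config alias.unstage "reset HEAD --"

# Stage a change, then take it back out of the staging area.
echo "b" >> fileA
git add fileA
git unstage fileA        # equivalent to: git reset HEAD -- fileA

git status --short       # the edit is still in the working directory
```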
Discussion
The config files will respect an [alias] section that looks like:
[alias]
co = checkout
Invoking this command will update the underlying global config file just as if it had
been edited by hand, as in our previous example.
When working in Git, or other version control systems, the concept of "saving" is a
more nuanced process than saving in a word processor or other traditional file
editing applications. The traditional software expression of "saving" is synonymous
with the Git term "committing". A commit is the Git equivalent of a "save". Traditional
saving should be thought of as a file system operation that is used to overwrite an
existing file or write a new file. Alternatively, Git committing is an operation that acts
upon a collection of files and directories.
Saving changes in Git vs SVN is also a different process. SVN Commits or 'check-ins'
are operations that make a remote push to a centralized server. This means an SVN
commit needs Internet access in order to fully 'save' project changes. Git commits
can be captured and built up locally, then pushed to a remote server as needed
using the git push -u origin master command. The difference between the two
methods is a fundamental difference between architecture designs. Git is a
distributed application model whereas SVN is a centralized model. Distributed
applications are generally more robust as they do not have a single point of failure
like a centralized server.
The commands: git add, git status, and git commit are all used in combination to
save a snapshot of a Git project's current state.
Git has an additional saving mechanism called 'the stash'. The stash is an ephemeral
storage area for changes that are not ready to be committed. The stash operates on
the working directory, the first of the three trees and has extensive usage options. To
learn more visit the git stash page.
A Git repository can be configured to ignore specific files or directories. This will
prevent Git from saving changes to any ignored content. Git has multiple methods of
configuration that manage the ignore list. Git ignore configuration is discussed in
further detail on the git ignore page.
git add
The git add command adds a change in the working directory to the staging area. It
tells Git that you want to include updates to a particular file in the next commit.
However, git add doesn't really affect the repository in any significant way—
changes are not actually recorded until you run git commit.
In conjunction with these commands, you'll also need git status to view the state
of the working directory and the staging area.
How it works
The git add and git commit commands compose the fundamental Git workflow.
These are the two commands that every Git user needs to understand, regardless of
their team’s collaboration model. They are the means to record versions of a project
into the repository’s history.
Developing a project revolves around the basic edit/stage/commit pattern. First, you
edit your files in the working directory. When you’re ready to save a copy of the
current state of the project, you stage changes with git add. After you’re happy with
the staged snapshot, you commit it to the project history with git commit. The
git reset command is used to undo a commit or staged snapshot.
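The edit/stage/commit pattern can be sketched as follows. The scratch repository, the file name hello.py and the commit message are illustrative assumptions; git status --short is used only to make each state visible.

```shell
# Scratch repository for the demonstration.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"

echo 'print("Hello")' > hello.py   # 1. edit a file in the working directory
git status --short                 #    hello.py is listed as untracked (??)

git add hello.py                   # 2. stage the change
git status --short                 #    hello.py is listed as added (A)

git commit -q -m "Add hello.py"    # 3. commit the staged snapshot
git status --short                 #    prints nothing: the working tree is clean
```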
In addition to git add and git commit, a third command git push is essential for a
complete collaborative Git workflow. git push is utilized to send the committed
changes to remote repositories for collaboration. This enables other team members
to access a set of saved changes.
The git add command should not be confused with svn add, which adds a file to
the repository. Instead, git add works on the more abstract level of changes. This
means that git add needs to be called every time you alter a file, whereas svn add
only needs to be called once for each file. It may sound redundant, but this workflow
makes it much easier to keep a project organized.
Instead of committing all of the changes you've made since the last commit, the
stage lets you group related changes into highly focused snapshots before actually
committing them to the project history. This means you can make all sorts of edits to
unrelated files, then go back and split them up into logical commits by adding
related changes to the stage and committing them piece by piece. As in any revision
control system, it’s important to create atomic commits so that it’s easy to track
down bugs and revert changes with minimal impact on the rest of the project.
Common options
git add -p
Begin an interactive staging session that lets you choose portions of a file to add to
the next commit. This will present you with a chunk of changes and prompt you for a
command. Use y to stage the chunk, n to ignore the chunk, s to split it into smaller
chunks, e to manually edit the chunk, and q to exit.
Examples
When you’re starting a new project, git add serves the same function as svn import.
To create an initial commit of the current directory, use the following two commands:
git add .
git commit
Once you’ve got your project up-and-running, new files can be added by passing the
path to git add:
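A minimal sketch of adding a new file by path, since the original listing is missing here. The file names and commit messages are hypothetical.

```shell
# Scratch repository with an initial commit of the current directory.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo 'print("Hello")' > hello.py
git add . && git commit -q -m "Initial commit"

# Later on, add a single new file by passing its path to git add.
echo 'print("Goodbye")' > goodbye.py
git add goodbye.py
git commit -q -m "Add goodbye.py"

# Both commits are now part of the project history.
git log --oneline
```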
The above commands can also be used to record changes to existing files. Again, Git
doesn’t differentiate between staging changes in new files vs. changes in files that
have already been added to the repository.
Summary
In review, git add is the first command in a chain of operations that directs Git to
"save" a snapshot of the current project state, into the commit history. When used
on its own, git add will promote pending changes from the working directory to the
staging area. The git status command is used to examine the current state of the
repository and can be used to confirm a git add promotion. The git reset
command is used to undo a git add. The git commit command is then used to
commit a snapshot of the staging area to the repository's commit history.
git commit
The git commit command commits the staged snapshot to the project history. Committed snapshots can be thought of as “safe” versions of a project—Git will never change
them unless you explicitly ask it to. Along with git add, this is one of the most important Git commands.
While they share the same name, this command is nothing like svn commit. Snapshots are committed to the local repository, and this requires absolutely no interaction with
other Git repositories.
Usage
git commit
Commit the staged snapshot. This will launch a text editor prompting you for a commit message. After you’ve entered a message, save the file and close the editor to create the
actual commit.
git commit -m "<message>"
Commit the staged snapshot, but instead of launching a text editor, use <message> as the commit message.
git commit -a
Commit a snapshot of all changes in the working directory. This only includes modifications to tracked files (those that have been added with git add at some point in their
history).
Discussion
Snapshots are always committed to the local repository. This is fundamentally different from SVN, wherein the working copy is committed to the central repository. In contrast,
Git doesn’t force you to interact with the central repository until you’re ready. Just as the staging area is a buffer between the working directory and the project history, each
developer’s local repository is a buffer between their contributions and the central repository.
This changes the basic development model for Git users. Instead of making a change and committing it directly to the central repo, Git developers have the opportunity to
accumulate commits in their local repo. This has many advantages over SVN-style collaboration: it makes it easier to split up a feature into atomic commits, keep related
commits grouped together, and clean up local history before publishing it to the central repository. It also lets developers work in an isolated environment, deferring integration
until they’re at a convenient break point.
Git's snapshot model has a far-reaching impact on virtually every aspect of its version control model, affecting everything from its branching and merging tools to its
collaboration workflows.
Example
The following example assumes you’ve edited some content in a file called hello.py and are ready to commit it to the project history. First, you need to stage the file with
git add, then you can commit the staged snapshot.
This will open a text editor (customizable via git config) asking for a commit message, along with a list of what’s being committed:
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
#modified: hello.py
Git doesn't require commit messages to follow any specific formatting constraints, but the canonical format is to summarize the entire commit on the first line in fewer than 50
characters, leave a blank line, then give a detailed explanation of what’s been changed. For example:
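The original example message is not reproduced in this copy. As a hedged illustration of the format, the sketch below uses two -m options, which Git joins as separate paragraphs (summary line, blank line, body). The message text and the scratch repository are our own assumptions.

```shell
# Scratch repository with one earlier commit.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo 'print("Hello")' > hello.py
git add hello.py && git commit -q -m "Add hello.py"

# Change the file, then commit with a short summary and a longer body.
echo 'print("Hello, crew")' > hello.py
git add hello.py
git commit -q \
  -m "Change the message displayed by hello.py" \
  -m "Update the greeting so that it addresses the whole crew."

# Show the full message of the latest commit.
git log -1 --format=%B
```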
Note that many developers also like to use present tense in their commit messages. This makes them read more like actions on the repository, which makes many of the
history-rewriting operations more intuitive.
Git diff
Diffing is a function that takes two input data sets and outputs the changes between
them. git diff is a multi-use Git command that when executed runs a diff function
on Git data sources. These data sources can be commits, branches, files and more.
This document will discuss common invocations of git diff and diffing work flow
patterns. The git diff command is often used along with git status and git log
to analyze the current state of a Git repo.
If we execute git diff at this point, there will be no output. This is expected
behavior as there are no changes in the repo to diff. Once the repo is created and
we've added the diff_test.txt file, we can change the contents of the file to start
experimenting with diff output.
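The setup and the change discussed in this walkthrough can be sketched as follows. The file name and both versions of its content are taken from the diff output analysed below; the scratch directory and the commit message are assumptions.

```shell
# Create a scratch repository containing diff_test.txt.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "this is a git diff test example" > diff_test.txt
git add diff_test.txt
git commit -q -m "Add diff test file"

# Nothing has changed since the commit, so this prints nothing.
git diff

# Change the content of the file, then diff again.
echo "this is a diff example" > diff_test.txt
git diff
```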
Executing this command will change the content of the diff_test.txt file. Once
modified, we can view a diff and analyze the output. Now executing git diff will
produce the following output:
1. Comparison input
diff --git a/diff_test.txt b/diff_test.txt
This line displays the input sources of the diff. We can see that a/diff_test.txt and
b/diff_test.txt have been passed to the diff.
2. Meta data
This line displays some internal Git metadata. You will most likely not need this
information. The numbers in this output correspond to Git object version hash
identifiers.
3. Markers for changes
--- a/diff_test.txt
+++ b/diff_test.txt
These lines are a legend that assigns symbols to each diff input source. In this case,
changes from a/diff_test.txt are marked with a --- and the changes from
b/diff_test.txt are marked with the +++ symbol.
4. Diff chunks
The remaining diff output is a list of diff 'chunks'. A diff only displays the sections of
the file that have changes. In our current example, we only have one chunk as we are
working with a simple scenario. Chunks have their own granular output semantics.
@@ -1 +1 @@
-this is a git diff test example
+this is a diff example
The first line is the chunk header. Each chunk is preceded by a header enclosed
within @@ symbols. The content of the header is a summary of changes made to the
file. In our simplified example, we have -1 +1, meaning line one had changes. In a
more realistic diff, you would see a header like:
@@ -34,6 +34,8 @@
In this header example, the chunk covers 6 lines of the original file, starting at line
number 34, and 8 lines of the changed file, also starting at line number 34.
The remaining content of the diff chunk displays the recent changes. Each changed
line is prepended with a + or - symbol indicating which version of the diff input the
changes come from. As we previously discussed, - indicates changes from the
a/diff_test.txt and + indicates changes from b/diff_test.txt.
Highlighting changes
Now the output displays only the color-coded words that have changed.
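The command that produces this word-level output is missing from this copy. Assuming it was git diff --color-words, a minimal sketch on the same diff_test.txt example follows. The --word-diff=plain form is added here only because it marks the same word-level changes with [-removed-] and {+added+} markers instead of colors, which is easier to read in a plain-text listing.

```shell
# Recreate the diff_test.txt example in a scratch repository.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "this is a git diff test example" > diff_test.txt
git add diff_test.txt && git commit -q -m "Add diff test file"
echo "this is a diff example" > diff_test.txt

# Word-level diff, with changed words shown in color on a terminal.
git diff --color-words

# The same word-level diff using textual markers instead of colors.
git diff --word-diff=plain
```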
2. git diff-highlight
If you clone the git source, you’ll find a sub-directory called contrib. It contains a
bunch of git-related tools and other interesting bits and pieces that haven’t yet been
promoted to git core. One of these is a Perl script called diff-highlight. Diff-highlight
pairs up matching lines of diff output and highlights sub-word fragments that have
changed.
Now we’ve pared down our diff to the smallest possible change.
In addition to the text file utilities we have thus far demonstrated, git diff can be
run on binary files. Unfortunately, the default output is not very helpful.
Git does have a feature that allows you to specify a shell command to transform the
content of your binary files into text prior to performing the diff. It does require a
little set up though. First, you need to specify a textconv filter describing how to
convert a certain type of binary to text. We're using a simple utility called pdftohtml
(available via homebrew) to convert our PDFs into human readable HTML. You can
set this up for a single repository by editing your .git/config file, or globally by
editing ~/.gitconfig
[diff "pdfconv"]
textconv=pdftohtml -stdout
Then all you need to do is associate one or more file patterns with our pdfconv filter.
You can do this by creating a .gitattributes file in the root of your repository.
*.pdf diff=pdfconv
Once configured, git diff will first run the binary file through the configured
converter script and diff the converter output. The same technique can be applied to
get useful diffs from all sorts of binary files, for example:
zips, jars and other archives: using unzip -l (or similar) in place of pdftohtml will
show you paths that have been added or removed between commits
images: exiv2 can be used to show metadata changes such as image dimensions
documents: conversion tools exist for transforming .odf, .doc and other document
formats to plain text
In a pinch, strings will often work for binary files where no formal converter exists.
The git diff command can be passed an explicit file path option. When a file path
is passed to git diff the diff operation will be scoped to the specified file. The
below examples demonstrate this usage.
This example is scoped to ./path/to/file. Invoked with HEAD, as in
git diff HEAD ./path/to/file, it compares the file in the working directory against
the most recent commit. Omitting HEAD, as in git diff ./path/to/file, compares the
working directory against the index instead, showing only the changes that are not
staged yet. The two forms produce the same output when nothing has been staged.
When git diff is invoked with the --cached option, the diff will compare the staged
changes with the most recent commit. The --cached option is synonymous with
--staged.
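The three file-scoped comparisons can be sketched side by side. The path ./path/to/file comes from the text above; the file contents and the scratch repository are hypothetical.

```shell
# Scratch repository with one committed file.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
mkdir -p path/to
echo "one" > path/to/file
git add . && git commit -q -m "Add file"

echo "two" >> path/to/file && git add path/to/file   # a staged change
echo "three" >> path/to/file                          # an unstaged change

git diff ./path/to/file            # working directory vs. index: adds "three"
git diff --cached ./path/to/file   # index vs. last commit: adds "two"
git diff HEAD ./path/to/file       # working directory vs. last commit: adds both
```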
Invoking git diff without a file path will compare changes across the entire
repository. The above, file specific examples, can be invoked without the
./path/to/file argument and have the same output results across all files in the
local repo.
By default, git diff will show you any uncommitted changes since the last commit
that have not yet been staged.
git diff
git diff can be passed Git refs to commits to diff. Some example refs are HEAD,
tags, and branch names. Every commit in Git has a commit ID, which you can get
by executing git log. You can also pass this commit ID to git diff.
Comparing branches
This example introduces the dot operator. The two dots in this example indicate the
diff input is the tips of both branches. The same effect happens if the dots are
omitted and a space is used between the branches. Additionally, there is a three dot
operator:
The three dot operator initiates the diff by changing the first input parameter,
branch1. It changes branch1 into a ref of the shared common ancestor commit
between the two diff inputs, that is, the shared ancestor of branch1 and
other-feature-branch. The last input parameter remains unchanged as the tip of
other-feature-branch.
To compare a specific file across branches, pass in the path of the file as the third
argument to git diff.
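The branch comparisons in this section can be sketched as follows. The branch names branch1 and other-feature-branch come from the text; the files, their contents and the scratch repository are hypothetical.

```shell
# Scratch repository whose starting branch is named branch1.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "base" > file.txt && git add file.txt && git commit -q -m "Base commit"
git branch -M branch1

# One commit on the feature branch, one different commit on branch1.
git checkout -q -b other-feature-branch
echo "feature" >> file.txt && git commit -q -am "Feature work"
git checkout -q branch1
echo "mainline" > notes.txt && git add notes.txt && git commit -q -m "Mainline work"

# Two dots: tip of branch1 vs. tip of other-feature-branch (both files differ).
git diff branch1..other-feature-branch

# Three dots: common ancestor vs. tip of other-feature-branch (only file.txt).
git diff branch1...other-feature-branch

# Limit the comparison to a single file.
git diff branch1 other-feature-branch ./file.txt
```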
Summary
This page discussed the Git diffing process and the git diff command. We
discussed how to read git diff output and the various data included in the output.
Examples were provided on how to alter the git diff output with highlighting and
colors. We discussed different diffing strategies such as how to diff files in branches
and specific commits. In addition to the git diff command, we also used git log
and git checkout.
Git stash
Stashing your work
The git stash command takes your uncommitted changes (both staged and
unstaged), saves them away for later use, and then reverts them from your working
copy. For example:
$ git status
On branch master
Changes to be committed:
new file: style.css
Changes not staged for commit:
modified: index.html
$ git stash
Saved working directory and index state WIP on master: 500
HEAD is now at 5002d47 our new homepage
$ git status
On branch master
nothing to commit, working tree clean
At this point you're free to make changes, create new commits, switch branches, and
perform any other Git operations; then come back and re-apply your stash when
you're ready.
Note that the stash is local to your Git repository; stashes are not transferred to the
server when you push.
Re-applying your stashed changes
You can reapply previously stashed changes with git stash pop:
$ git status
On branch master
nothing to commit, working tree clean
$ git stash pop
On branch master
Changes to be committed:
new file: style.css
Changes not staged for commit:
modified: index.html
Dropped refs/stash@{0} (32b3aa1d185dfe6d57b3c3cc3b32cbf3e3
Popping your stash removes the changes from your stash and reapplies them to your
working copy.
Alternatively, you can reapply the changes to your working copy and keep them in
your stash with git stash apply:
This is useful if you want to apply the same stashed changes to multiple branches.
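A minimal sketch of git stash apply, assuming a scratch repository with a hypothetical index.html. The point to observe is that the stash entry is still listed after the changes have been reapplied.

```shell
# Scratch repository with one commit and one uncommitted change.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "<h1>Home</h1>" > index.html
git add index.html && git commit -q -m "our new homepage"
echo "<p>Work in progress</p>" >> index.html

git stash          # shelve the change; the working tree is clean again
git stash apply    # reapply the change, but keep it in the stash
git stash list     # the stash entry is still there
```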
Now that you know the basics of stashing, there is one caveat with git stash you
need to be aware of: by default Git won't stash changes made to untracked or
ignored files.
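By default, running git stash will stash:
changes that have been added to your index (staged changes)
changes made to files that are currently tracked by Git (unstaged changes)
But it will not stash:
new files in your working copy that have not yet been staged
files that have been ignored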
So if we add a third file to our example above, but don't stage it (i.e. we don't run
git add), git stash won't stash it.
$ touch script.js
$ git status
On branch master
Changes to be committed:
new file: style.css
Changes not staged for commit:
modified: index.html
Untracked files:
script.js
$ git stash
Saved working directory and index state WIP on master: 500
HEAD is now at 5002d47 our new homepage
$ git status
On branch master
Untracked files:
script.js
Adding the -u option (or --include-untracked) tells git stash to also stash your
untracked files:
$ git status
On branch master
Changes to be committed:
new file: style.css
Changes not staged for commit:
modified: index.html
Untracked files:
script.js
$ git stash -u
Saved working directory and index state WIP on master: 500
HEAD is now at 5002d47 our new homepage
$ git status
On branch master
nothing to commit, working tree clean
You can include changes to ignored files as well by passing the -a option (or --all)
when running git stash.
git stash
git stash -u
git stash -a
You aren't limited to a single stash. You can run git stash several times to create
multiple stashes, and then use git stash list to view them. By default, stashes are
identified simply as a "WIP" – work in progress – on top of the branch and commit
that you created the stash from. After a while it can be difficult to remember what
each stash contains:
To provide a bit more context, it's good practice to annotate your stashes with a
description, using git stash save "message":
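A short sketch of annotated stashes. The messages and file contents are hypothetical. Note that recent Git versions document git stash save as deprecated in favor of git stash push -m "message"; the form used below follows the text above.

```shell
# Scratch repository with one commit.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "<h1>Home</h1>" > index.html
git add index.html && git commit -q -m "our new homepage"

# Create two stashes, each with a description.
echo "<p>one</p>" >> index.html
git stash save "add paragraph one"
echo "<p>two</p>" >> index.html
git stash save "add paragraph two"

# Each entry now shows its description instead of a generic WIP label.
git stash list
```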
By default, git stash pop will re-apply the most recently created stash: stash@{0}.
You can choose which stash to re-apply by passing its identifier as the last argument,
for example, git stash pop stash@{1}.
You can view a summary of a stash with git stash show, or pass the -p option (or
--patch) to view the full diff of a stash:
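The following sketch inspects an older stash and then re-applies it by identifier. The stash messages, file contents and scratch repository are hypothetical.

```shell
# Scratch repository with two stashes.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "<h1>Home</h1>" > index.html
git add index.html && git commit -q -m "our new homepage"
echo "<p>one</p>" >> index.html && git stash save "add paragraph one"
echo "<p>two</p>" >> index.html && git stash save "add paragraph two"

git stash show 'stash@{1}'       # summary of the files changed in the older stash
git stash show -p 'stash@{1}'    # full diff of the older stash

git stash pop 'stash@{1}'        # re-apply the older stash and drop it
git stash list                   # only the newer stash remains
```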
Partial stashes
You can also choose to stash just a single file, a collection of files, or individual
changes from within files. If you pass the -p option (or --patch) to git stash, it will
iterate through each changed "hunk" in your working copy and ask whether you
wish to stash it:
$ git stash -p
diff --git a/style.css b/style.css
new file mode 100644
index 0000000..d92368b
--- /dev/null
+++ b/style.css
@@ -0,0 +1,3 @@
+* {
+ text-decoration: blink;
+}
Stash this hunk [y,n,q,a,d,/,e,?]? y
diff --git a/index.html b/index.html
index 9daeafb..ebdcbd2 100644
--- a/index.html
+++ b/index.html
@@ -1 +1,2 @@
+<link rel="stylesheet" href="style.css"/>
Stash this hunk [y,n,q,a,d,/,e,?]? n
You can hit ? for a full list of hunk commands. Commonly useful ones are:
Command Description
? help
q quit (any hunks that have already been selected will be stashed)
There is no explicit "abort" command, but hitting CTRL-C (SIGINT) will abort the stash
process.
If the changes on your branch diverge from the changes in your stash, you may run
into conflicts when popping or applying your stash. Instead, you can use
git stash branch to create a new branch to apply your stashed changes to:
This checks out a new branch based on the commit that you created your stash from,
and then pops your stashed changes onto it.
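A sketch of git stash branch in a situation where the branch has moved on since the stash was created. The branch name add-stylesheet and the file contents are hypothetical.

```shell
# Scratch repository with a stashed change.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "<h1>Home</h1>" > index.html
git add index.html && git commit -q -m "our new homepage"
echo "<p>stashed</p>" >> index.html
git stash

# The branch diverges: the same file changes in a new commit.
echo "<p>diverged</p>" >> index.html
git commit -q -am "Diverge from the stashed work"

# Create a branch at the commit the stash was made from, and pop the stash there.
git stash branch add-stylesheet 'stash@{0}'

git status --short    # index.html is modified on the new branch
git stash list        # prints nothing: the stash was dropped
```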
If you decide you no longer need a particular stash, you can delete it with
git stash drop:
$ git stash drop stash@{1}
Dropped stash@{1} (17e2697fd8251df6163117cb3d58c1f62a5e7cd
If you just wanted to know how to use git stash, you can stop reading here. But if
you're curious about how Git (and git stash) works under the hood, read on!
Stashes are actually encoded in your repository as commit objects. The special ref at
.git/refs/stash points to your most recently created stash, and previously created
stashes are referenced by the stash ref's reflog. This is why you refer to stashes by
stash@{n}: you're actually referring to the nth reflog entry for the stash ref. Since a
stash is just a commit, you can inspect it with git log:
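Because a stash is an ordinary commit, the usual inspection commands work on it. The sketch below stashes one tracked change and one untracked file with -u, then prints the stash commit's history and its parents. The file names are hypothetical.

```shell
# Scratch repository with a tracked change and an untracked file.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "<h1>Home</h1>" > index.html
git add index.html && git commit -q -m "our new homepage"
echo "<p>change</p>" >> index.html    # tracked, unstaged change
echo "// script" > script.js          # untracked file
git stash -u

# The stash is a merge-like commit sitting on top of HEAD.
git log --oneline --graph 'stash@{0}'

# Print the stash commit followed by its parents. With -u and an untracked
# file present there are three parents: HEAD, the index, the untracked files.
git rev-list --parents -n 1 'stash@{0}'
```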
Depending on what you stashed, a single git stash operation creates either two or
three new commits. The commits are:
stash@{0}, a new commit to store the tracked files that were in your working
copy when you ran git stash
stash@{0}'s first parent, the pre-existing commit that was at HEAD when you
ran git stash
stash@{0}'s second parent, a new commit representing the index when you
ran git stash
stash@{0}'s third parent, a new commit representing untracked files that were
in your working copy when you ran git stash. This third parent is only created
if:
your working copy actually contained untracked files; and
you specified the --include-untracked or --all option when invoking git stash.
Invoking git stash encodes any changes to tracked files as two new commits
in your DAG: one for unstaged changes, and one for changes staged in the
index. The special refs/stash ref is updated to point to them.
Using the --include-untracked option also encodes any changes to untracked
files as an additional commit.
When you run git stash pop, the changes from the commits above are used to
update your working copy and index, and the stash reflog is shuffled to remove the
popped commit. Note that the popped commits aren't immediately deleted, but do
become candidates for future garbage collection.
.gitignore
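Git sees every file in your working copy as one of three things:
tracked: a file which has been previously staged or committed
untracked: a file which has not been staged or committed
ignored: a file which Git has been explicitly told to ignore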
Ignored files are usually build artifacts and machine generated files that can be
derived from your repository source or should otherwise not be committed. Some
common examples are:
Ignored files are tracked in a special file named .gitignore that is checked in at the
root of your repository. There is no explicit git ignore command: instead the
.gitignore file must be edited and committed by hand when you have new files that
you wish to ignore. .gitignore files contain patterns that are matched against file
names in your repository to determine whether or not they should be ignored.
Pattern: *.log followed by !important.log
Example matches: debug.log and trace.log, but not important.log or logs/important.log
Explanation: Prepending an exclamation mark to a pattern negates it. If a file matches a pattern, but also matches a negating pattern defined later in the file, it will not be ignored.
Pattern: debug?.log
Example matches: debug0.log and debugg.log, but not debug10.log
Explanation: A question mark matches exactly one character.
Pattern: debug[!01].log
Example matches: debug2.log, but not debug0.log, debug1.log or debug01.log
Explanation: An exclamation mark can be used to match any character except one from the specified set.
Pattern: debug[a-z].log
Example matches: debuga.log and debugb.log, but not debug1.log
Explanation: Ranges can be numeric or alphabetic.
Pattern: logs/ followed by !logs/important.log
Example matches: logs/debug.log and logs/important.log
Explanation: Due to a performance-related quirk in Git, you cannot negate a file that is ignored due to a pattern matching a directory.
Pattern: logs/**/debug.log
Example matches: logs/debug.log, logs/monday/debug.log and logs/monday/pm/debug.log
Explanation: A double asterisk matches zero or more directories.
Pattern: logs/*day/debug.log
Example matches: logs/monday/debug.log and logs/tuesday/debug.log, but not logs/latest/debug.log
Explanation: Wildcards can be used in directory names as well.
These explanations assume your .gitignore file is in the top level directory of your
repository, as is the convention. If your repository has multiple .gitignore files, simply
mentally replace "repository root" with "directory containing the .gitignore file" (and
consider unifying them, for the sanity of your team).
You can use \ to escape .gitignore pattern characters if you have files or directories
containing them:
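The pattern rules above, including escaping, can be checked with git check-ignore, which exits with status 0 when a path is ignored. The sketch below is an assumption-laden illustration: the file names are hypothetical, and foo\[01\].txt is our own example of escaping the bracket characters so that they match literally.

```shell
# Scratch repository with a handful of ignore rules.
repo="$(mktemp -d)" && cd "$repo" && git init -q
cat > .gitignore <<'EOF'
*.log
!important.log
logs/
!logs/important.log
foo\[01\].txt
EOF
mkdir logs && touch logs/important.log

# Report whether each path is ignored by the rules above.
for f in debug.log important.log logs/important.log 'foo[01].txt' foo0.txt; do
  if git check-ignore -q "$f"; then
    echo "ignored:     $f"
  else
    echo "not ignored: $f"
  fi
done
```

With these rules, logs/important.log stays ignored despite the negation, because its parent directory is ignored; foo[01].txt is ignored while foo0.txt is not, because the escaped brackets are matched literally.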
Git ignore rules are usually defined in a .gitignore file at the root of your repository.
However, you can choose to define multiple .gitignore files in different directories
in your repository. Each pattern in a particular .gitignore file is tested relative to the
directory containing that file. However the convention, and simplest approach, is to
define a single .gitignore file in the root. As your .gitignore file is checked in, it is
versioned like any other file in your repository and shared with your teammates
when you push. Typically you should only include patterns in .gitignore that will
benefit other users of the repository.
You can also define personal ignore patterns for a particular repository in a special
file at .git/info/exclude. These are not versioned, and not distributed with your
repository, so it's an appropriate place to include patterns that will likely only benefit
you. For example if you have a custom logging setup, or special development tools
that produce files in your repository's working directory, you could consider adding
them to .git/info/exclude to prevent them from being accidentally committed to
your repository.
In addition, you can define global Git ignore patterns for all repositories on your
local system by setting the Git core.excludesFile property. You'll have to create this
file yourself. If you're unsure where to put your global .gitignore file, your home
directory isn't a bad choice (and makes it easy to find later). Once you've created the
file, you'll need to configure its location with git config:
$ touch ~/.gitignore
$ git config --global core.excludesFile ~/.gitignore
You should be careful what patterns you choose to globally ignore, as different file
types are relevant for different projects. Special operating system files (e.g.
.DS_Store and thumbs.db) or temporary files created by some developer tools are
typical candidates for ignoring globally.
If you want to ignore a file that you've committed in the past, you'll need to delete
the file from your repository and then add a .gitignore rule for it. Using the
--cached option with git rm means that the file will be deleted from your
repository, but will remain in your working directory as an ignored file.
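A minimal sketch of this procedure, assuming a hypothetical debug.log that was committed by mistake:

```shell
# Scratch repository in which debug.log has already been committed.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "log line" > debug.log
git add debug.log && git commit -q -m "Start logging"

# Add the ignore rule, then stop tracking the file without deleting it.
echo "debug.log" >> .gitignore
git rm --cached debug.log
git add .gitignore
git commit -q -m "Start ignoring debug.log"

ls debug.log          # the file is still in the working directory
git status --short    # prints nothing: debug.log is now ignored
```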
You can omit the --cached option if you want to delete the file from both the
repository and your local file system.
It is possible to force an ignored file to be committed to the repository using the
-f (or --force) option with git add:
$ cat .gitignore
*.log
$ git add -f debug.log
$ git commit -m "Force adding debug.log"
You might consider doing this if you have a general pattern (like *.log) defined, but
you want to commit a specific file. However a better solution is to define an
exception to the general rule:
This approach is more obvious, and less confusing, for your teammates.
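The exception rule mentioned above can be sketched as follows, assuming the same hypothetical debug.log and a second log file, trace.log, that should stay ignored:

```shell
# Scratch repository whose .gitignore ignores all logs except debug.log.
repo="$(mktemp -d)" && cd "$repo" && git init -q
git config user.name "Demo User" && git config user.email "demo@example.com"
printf '*.log\n!debug.log\n' > .gitignore
echo "kept" > debug.log
echo "ignored" > trace.log

# No -f is needed: debug.log is picked up, trace.log is skipped.
git add .
git status --short
git commit -q -m "Adding debug.log"
```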
git stash is a powerful Git feature for temporarily shelving and reverting local
changes, allowing you to re-apply them later on. As you'd expect, by default
git stash ignores ignored files and only stashes changes to files that are tracked by
Git. However, you can invoke git stash with the --all option to stash changes to
ignored and untracked files as well.
You can pass multiple file names to git check-ignore if you like, and the names
themselves don't even have to correspond to files that exist in your repository.
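The introduction to git check-ignore is missing from this copy. As a hedged sketch: with the -v (or --verbose) option, the command reports which pattern, in which file and on which line, causes each path to be ignored. The file names below are hypothetical and do not need to exist.

```shell
# Scratch repository with a single ignore rule.
repo="$(mktemp -d)" && cd "$repo" && git init -q
printf '*.log\n' > .gitignore

# For each ignored path, print <source file>:<line number>:<pattern> and the path.
# notes.txt matches no pattern, so nothing is printed for it.
git check-ignore -v debug.log trace.log notes.txt
```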
Git Status: Inspecting a repository
git status
The git status command displays the state of the working directory and the
staging area. It lets you see which changes have been staged, which haven’t, and
which files aren’t being tracked by Git. Status output does not show you any
information regarding the committed project history. For this, you need to use
git log.
git blame
The high-level function of git blame is the display of author metadata
attached to specific committed lines in a file. This is used to explore the
history of specific code and answer questions about what, how, and
why the code was added to a repository.
git log
The git log command displays committed snapshots. It lets you list the
project history, filter it, and search for specific changes.
Usage
git status
Discussion
The git status command is a relatively straightforward command. It simply shows
you what's been going on with git add and git commit. Status messages also
include relevant instructions for staging/unstaging files. Sample output showing the
three main categories of a git status call is included below:
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
#modified: hello.py
#
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committe
# (use "git checkout -- <file>..." to discard changes in w
#
#modified: main.py
#
# Untracked files:
# (use "git add <file>..." to include in what will be comm
#
#hello.pyc
Ignoring Files
Untracked files typically fall into two categories. They're either files that have just
been added to the project and haven't been committed yet, or they're compiled
binaries like .pyc, .obj, .exe, etc. While it's definitely beneficial to include the
former in the git status output, the latter can make it hard to see what’s actually
going on in your repository.
For this reason, Git lets you completely ignore files by placing paths in a special file
called .gitignore. Any files that you'd like to ignore should be included on a
separate line, and the * symbol can be used as a wildcard. For example, adding the
following to a .gitignore file in your project root will prevent compiled Python
modules from appearing in git status:
*.pyc
Example
It's good practice to check the state of your repository before committing changes
so that you don't accidentally commit something you don't mean to. This example
displays the repository status before and after staging and committing a snapshot:
# Edit hello.py
git status
# hello.py is listed under "Changes not staged for commit"
git add hello.py
git status
# hello.py is listed under "Changes to be committed"
git commit
git status
# nothing to commit (working directory clean)
The first status output will show the file as unstaged. The git add action will be
reflected in the second git status, and the final status output will tell you that there
is nothing to commit—the working directory matches the most recent commit. Some
Git commands (e.g., git merge) require the working directory to be clean so that you
don't accidentally overwrite changes.
git log
The git log command displays committed snapshots. It lets you list the project
history, filter it, and search for specific changes. While git status lets you inspect
the working directory and the staging area, git log only operates on the committed
history.
Log output can be customized in several ways, from simply filtering commits to
displaying them in a completely user-defined format. Some of the most common
configurations of git log are presented below.
Usage
git log
Display the entire commit history using the default formatting. If the output takes up
more than one screen, you can use Space to scroll and q to exit.
git log -n <limit>
Limit the number of commits by <limit>. For example, git log -n 3 will display only
3 commits.
git log --oneline
Condense each commit to a single line. This is useful for getting a high-level
overview of the project history.
git log --stat
Along with the ordinary git log information, include which files were altered and
the relative number of lines that were added or deleted from each of them.
git log -p
Display the patch representing each commit. This shows the full diff of each commit,
which is the most detailed view you can have of your project history.
git log <since>..<until>
Show only commits that occur between <since> and <until>. Each argument can be
a commit ID, a branch name, HEAD, or any other kind of revision reference.
git log <file>
Only display commits that include the specified file. This is an easy way to see the
history of a particular file.
A few other options are worth knowing. The --graph flag draws a text-based graph
of the commits to the left of the commit messages. --decorate adds the names of
the branches or tags that point to the commits shown. --oneline shows each commit
on a single line, making it easier to browse through commits at a glance.
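These options can be tried side by side in a scratch repository. This is a minimal sketch; the temporary directory, the demo identity, and the commit messages are all assumptions made for illustration.

```shell
export GIT_PAGER=cat          # keep output inline instead of opening a pager
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

# Build a small history of four commits.
for i in 1 2 3 4; do
  echo "line $i" >> hello.py
  git add hello.py
  git commit -q -m "Commit number $i"
done

git log --oneline -n 3                  # three newest commits, one line each
git log --stat -n 1                     # newest commit plus per-file line counts
git log --graph --decorate --oneline    # text graph with branch and tag names

shown="$(git log --oneline -n 3 | wc -l | tr -d ' ')"
```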
Discussion
The git log command is Git's basic tool for exploring a repository’s history. It’s what
you use when you need to find a specific version of a project or figure out what
changes will be introduced by merging in a feature branch.
commit 3157ee3718e180a9476bf2e5cab8e3f1e78a73b7
Author: John Smith
Most of this is pretty straightforward; however, the first line warrants some
explanation. The 40-character string after commit is an SHA-1 checksum of the
commit’s contents. This serves two purposes. First, it ensures the integrity of the
commit—if it was ever corrupted, the commit would generate a different checksum.
Second, it serves as a unique ID for the commit.
This ID can be used in commands like git log <since>..<until> to refer to specific
commits. For instance, git log 3157e..5ab91 will display everything between the
commits with IDs 3157e and 5ab91. Aside from checksums, branch names
(discussed in the Branch Module) and the HEAD keyword are other common
methods for referring to individual commits. HEAD always refers to the current
commit, whether that is the tip of a branch or a specific commit checked out directly.
The ~ character is useful for making relative references to the parent of a commit.
For example, 3157e~1 refers to the commit before 3157e, and HEAD~3 is the great-
grandparent of the current commit.
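Relative references can be resolved with git rev-parse to see which commit they name. A minimal sketch, assuming a scratch repository with four placeholder commits:

```shell
export GIT_PAGER=cat
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

for i in 1 2 3 4; do
  echo "line $i" >> hello.py
  git add hello.py
  git commit -q -m "Commit number $i"
done

git rev-parse HEAD~1                    # full ID of the parent of the current commit
short="$(git rev-parse --short HEAD)"   # abbreviated ID, like 3157e in the text
git rev-parse "${short}~1"              # same parent, reached from the abbreviated ID

# HEAD~3 is the great-grandparent: the first of the four commits.
first_subject="$(git log -n 1 --format=%s HEAD~3)"
echo "$first_subject"
```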
The idea behind all of these identification methods is to let you perform actions
based on specific commits. The git log command is typically the starting point for
these interactions, as it lets you find the commits you want to work with.
Example
The Usage section provides many examples of git log, but keep in mind that several
options can be combined into a single command:
git log --author="John Smith" -p hello.py
This will display a full diff of all the changes John Smith has made to the file
hello.py.
The .. syntax is a very useful tool for comparing branches. The next example displays
a brief overview of all the commits that are in some-feature that are not in master.
git log --oneline master..some-feature
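The branch comparison can be reproduced in a scratch repository. In this sketch the base branch name is read from Git rather than assumed to be master, because the default differs between installations; the commit messages are placeholders.

```shell
export GIT_PAGER=cat
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

base="$(git symbolic-ref --short HEAD)"     # name of the initial branch
echo "v1" > hello.py
git add hello.py
git commit -q -m "Initial commit"

git checkout -q -b some-feature
echo "v2" >> hello.py
git commit -q -am "Add feature part 1"
echo "v3" >> hello.py
git commit -q -am "Add feature part 2"

# Commits reachable from some-feature but not from the base branch.
git log --oneline "$base"..some-feature
ahead="$(git rev-list --count "$base"..some-feature)"
```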
Tagging
This document will discuss the Git concept of tagging and the git tag command.
Tags are refs that point to specific points in Git history. Tagging is generally used to
capture a point in history that is used for a marked version release (e.g., v1.0.1). A tag
is like a branch that doesn’t change. Unlike branches, tags, after being created, have
no further history of commits. For more info on branches visit the git branch page.
This document will cover the different kinds of tags, how to create tags, listing all
tags, deleting tags, sharing tags, and more.
Creating a tag
To create a new tag, execute the following command:
git tag <tagname>
Replace <tagname> with a semantic identifier for the state of the repo at the time the
tag is being created. A common pattern is to use version numbers like git tag v1.4.
Git supports two different types of tags, annotated and lightweight tags. The
previous example created a lightweight tag. Lightweight tags and annotated tags
differ in the amount of accompanying metadata they store. A best practice is to
consider annotated tags as public, and lightweight tags as private. Annotated tags
store extra metadata such as the tagger name, email, and date. This is important
data for a public release. Lightweight tags are essentially 'bookmarks' to a commit.
They are just a name and a pointer to a commit, useful for creating quick links to
relevant commits.
Annotated Tags
Annotated tags are stored as full objects in the Git database. To reiterate, they store
extra metadata such as the tagger name, email, and date. Similar to commits and
commit messages, annotated tags have a tagging message. Additionally, for security,
annotated tags can be signed and verified with GNU Privacy Guard (GPG). The
suggested best practice for Git tagging is to prefer annotated tags over lightweight
ones so you have all the associated metadata.
git tag -a v1.4
Executing this command will create a new annotated tag identified with v1.4. The
command will then open up the configured default text editor to prompt for further
metadata input.
git tag -a v1.4 -m "my version 1.4"
Executing this command is similar to the previous invocation. However, this version
of the command is passed the -m option and a message. This is a convenience
method similar to git commit -m that will immediately create a new tag and forgo
opening the local text editor in favor of saving the message passed in with the -m
option.
Lightweight Tags
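The difference between the two kinds of tag shows up in how Git stores them. The following sketch, run in a scratch repository with a placeholder commit, creates one tag of each kind and asks git cat-file for the type of object each name points to. The tag name v1.4-lw is only an illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"
echo "hello" > hello.py
git add hello.py
git commit -q -m "Initial commit"

git tag v1.4-lw                          # lightweight: no -a, -s, or -m option
git tag -a v1.4 -m "my version 1.4"      # annotated: stored as its own object

lw_type="$(git cat-file -t v1.4-lw)"     # a lightweight tag points straight at the commit
an_type="$(git cat-file -t v1.4)"        # an annotated tag is a separate tag object
echo "v1.4-lw -> $lw_type"
echo "v1.4    -> $an_type"
```

The lightweight name resolves to a commit object, while the annotated name resolves to a tag object that carries the tagger, date, and message.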
Listing Tags
git tag
v0.10.0
v0.10.0-rc1
v0.11.0
v0.11.0-rc1
v0.11.1
v0.11.2
v0.12.0
v0.12.0-rc1
v0.12.1
v0.12.2
v0.13.0
v0.13.0-rc1
v0.13.0-rc2
To refine the list of tags, the -l option can be passed with a wildcard expression:
git tag -l "*-rc*"
This example uses the -l option with the wildcard expression *-rc* to return all tags
whose names contain -rc, a marker traditionally used to identify release
candidates.
Tagging Old Commits
Executing git log will output a list of commits. In this example we will pick the
topmost commit, Merge branch 'feature', for the new tag. We will need to pass the
commit's SHA hash to Git:
Executing the above git tag invocation will create a new annotated tag
identified as v1.2 for the commit we selected in the previous git log example.
If you try to create a tag with the same identifier as an existing tag, Git will refuse
and report that the tag already exists.
Git will report the same error if you try to tag an older commit with an existing tag
identifier.
If you must update an existing tag, pass the -f (force) option.
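Tagging an older commit, hitting the duplicate-name error, and forcing the update can be walked through in a scratch repository. This is a sketch with placeholder commits and messages; the older commit is picked with HEAD~2 instead of a hash copied from git log.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

for i in 1 2 3; do
  echo "line $i" >> hello.py
  git add hello.py
  git commit -q -m "Commit number $i"
done

old="$(git rev-parse HEAD~2)"               # ID of the oldest commit
git tag -a v1.2 -m "version 1.2" "$old"     # tag a commit other than HEAD

# Reusing the name fails unless -f is given.
git tag -a v1.2 -m "version 1.2" HEAD || echo "refused: v1.2 is already taken"

# -f moves the existing tag to the new target.
git tag -a -f v1.2 -m "version 1.2 (moved)" HEAD
tagged="$(git rev-parse 'v1.2^{commit}')"   # commit the tag now points to
```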
Sharing tags is similar to pushing branches. By default, git push will not push tags.
Tags have to be explicitly passed to git push.
$ git push origin v1.4
Counting objects: 14, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (12/12), done.
Writing objects: 100% (14/14), 2.05 KiB | 0 bytes/s, done.
Total 14 (delta 3), reused 0 (delta 0)
To git@bitbucket.com:atlasbro/gittagdocs.git
* [new tag] v1.4 -> v1.4
To push multiple tags simultaneously, pass the --tags option to the git push command.
When another user clones or pulls a repo they will receive the new tags.
You can view the state of a repo at a tag by using the git checkout command.
git checkout v1.4
The above command will check out the v1.4 tag. This puts the repo in a detached
HEAD state. This means any changes made will not update the tag. They will create a
new detached commit. This new detached commit will not be part of any branch and
will only be reachable directly by the commit's SHA hash. Therefore it is a best
practice to create a new branch anytime you're making changes in a detached HEAD
state.
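The detached HEAD state and the recommended new branch can be seen in a scratch repository. A minimal sketch; the branch name hotfix-from-tag and the file contents are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
git tag -a v1.4 -m "my version 1.4"
echo "two" >> file.txt
git commit -q -am "Second commit"

git checkout -q v1.4                        # HEAD now points at a commit, not a branch
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Create a branch before committing so the new work stays reachable.
git checkout -q -b hotfix-from-tag
echo "fix" >> file.txt
git commit -q -am "Fix on top of v1.4"
current="$(git symbolic-ref --short HEAD)"
```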
Deleting Tags
$ git tag
v1
v2
v3
$ git tag -d v1
$ git tag
v2
v3
In this example, git tag is executed to display a list of tags showing v1, v2, and v3.
Then git tag -d v1 is executed, which deletes the v1 tag.
Summary
Git blame is often used with a GUI display. Online Git hosting sites
like Bitbucket offer blame views, which are UI wrappers to git blame. These views are
referenced in collaborative discussions around pull requests and commits.
Additionally, most IDEs that have Git integration also have dynamic blame views.
How It Works
In order to demonstrate git blame we need a repository with some history. We will
use the open source project git-blame-example. This open source project is a simple
repository that contains a README.md file which has a few commits from different
authors. The first step of our git blame usage example is to git clone the example
repository.
Now that we have a copy of the example code we can start exploring it with
git blame. The state of the example repo can be examined using git log. The
commit history should look like the following:
$ git log
commit 548dabed82e4e5f3734c219d5a742b1c259926b2
Author: Juni Mukherjee <jmukherjee@atlassian.com>
Date: Thu Mar 1 19:55:15 2018 +0000
Another commit to help git blame track the who, the what,
commit eb06faedb1fdd159d62e4438fc8dbe9c9fe0728b
Author: Juni Mukherjee <jmukherjee@atlassian.com>
Date: Thu Mar 1 19:53:23 2018 +0000
Creating the third commit, along with Kev and Albert, so t
commit 990c2b6a84464fee153253dbf02e845a4db372bb
Merge: 82496ea 89feb84
Author: Albert So <aso@atlassian.com>
Date: Thu Mar 1 05:33:01 2018 +0000
Merged in albert-so/git-blame-example/albert-so/readmemd-e
README.md edited online with Bitbucket
commit 89feb84d885fe33d1182f2112885c2a64a4206ec
Author: Albert So <aso@atlassian.com>
Date: Thu Mar 1 00:54:03 2018 +0000
README.md edited online with Bitbucket
git blame only operates on individual files. A file path is required for any useful
output; executed without one, git blame simply prints its help menu. For this
example, we will operate on the README.md file. It is a common open source
software practice to include a README file in the root of a Git repository as a
documentation source for the project.
git blame README.md
Executing the above command will give us our first sample of blame output. The
following output is a subset of the full blame output of the README. Additionally,
this output is static and reflects the state of the repo at the time of this writing.
This is a sample of the first 13 lines of the README.md file. To better understand this
output, let's break down a line. The following table displays the content of line 3, and
the columns of the table indicate the column content.
Id | Author | Timestamp | Line Number | Line Content
If we review the blame output list, we can make some observations. There are three
authors listed. In addition to the project's maintainer Kev Zettler, Albert So and Juni
Mukherjee are also listed. Authors are generally the most valuable part of git blame
output. The timestamp column is also helpful. The line content column shows what
the change was.
Common Options
git blame -L 1,5 README.md
The -L option will restrict the output to the requested line range. Here we have
restricted the output to lines 1 through 5.
git blame -M README.md
The -M option detects moved or copied lines within the same file. This will report
the original author of the lines instead of the last author that moved or copied the
lines.
git blame -C README.md
The -C option detects lines that were moved or copied from other files. This will
report the original author of the lines instead of the last author that moved or copied
the lines.
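The line-range option can be tried without cloning the example project. The following sketch builds a two-commit scratch repository; the file contents and the two author identities are made up for illustration, and the -p (porcelain) option is used only to read the author name in a script-friendly form.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

printf 'one\ntwo\nthree\n' > README.md
git add README.md
git commit -q -m "Add README"

# A second, hypothetical author rewrites line 2.
printf 'one\ntwo changed\nthree\n' > README.md
git commit -q -am "Edit line 2" --author="Second Author <second@example.com>"

git blame -L 1,2 README.md                  # restrict output to lines 1 through 2
shown="$(git blame -L 1,2 README.md | wc -l | tr -d ' ')"

# Porcelain output carries an "author <name>" line for each blamed line.
line2_author="$(git blame -L 2,2 -p README.md | sed -n 's/^author //p')"
echo "line 2 last changed by: $line2_author"
```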
While git blame displays the last author that modified a line, you will often
want to know when a line was originally added. This can be cumbersome to achieve
using git blame. It requires a combination of the -w, -C, and -M options. It can be
far more convenient to use the git log command.
To list all the commits in which a specific piece of code was added or modified,
execute git log with the -S option, followed by the code you are looking for. Let's
take one of the lines from the README output above to use as an example: the text
"CSS3D and WebGL renderers" from line 12 of the README output.
This output shows us that content from the README was added or modified 3 times
by 3 different authors. It was originally added in commit cb20237cc by Mr.doob. In
this example, git log has also been passed the --pretty=format option.
This option converts the default output format of git log into one that resembles
the format of git blame. For more information on usage and configuration options, visit
the git log page.
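The -S search can be reproduced on a small scale. This sketch uses a scratch repository with two placeholder commits, only one of which touches the searched text; the format string shown is one possible choice, not necessarily the one used for the output discussed above.

```shell
export GIT_PAGER=cat
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "CSS3D and WebGL renderers" > README.md
git add README.md
git commit -q -m "Add renderer note"
echo "Unrelated line" >> README.md
git commit -q -am "Unrelated edit"

# -S lists commits that changed how many times the string occurs.
git log -S"CSS3D and WebGL renderers" --pretty=format:'%h %an %ad %s'
echo

hits="$(git log -S"CSS3D and WebGL renderers" --format=%s)"
```

Only the commit that introduced the text is listed; the unrelated edit is skipped even though it changed the same file.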
Summary
The git blame command is used to examine the contents of a file line by line and
see when each line was last modified and who the author of the modifications was.
The output format of git blame can be altered with various command line options.
Online Git hosting solutions like Bitbucket offer blame views, which offer a superior
user experience to command line git blame usage. git blame and git log can be
used in combination to help discover the history of a file's contents. The git log
command has some similar blame functionality. To learn more, visit the git log
overview page.
Undoing Commits & Changes
git checkout git clean git revert git reset git rm
In this section, we will discuss the available 'undo' Git strategies and commands. It is
first important to note that Git does not have a traditional 'undo' system like those
found in a word processing application. It will be beneficial to refrain from mapping
Git operations to any traditional 'undo' mental model. Additionally, Git has its own
nomenclature for 'undo' operations that it is best to leverage in a discussion. This
nomenclature includes terms like reset, revert, checkout, clean, and more.
This tutorial provides all of the necessary skills to work with previous revisions of a
software project. First, it shows you how to explore old commits, then it explains the
difference between reverting public commits in the project history vs. resetting
unpublished changes on your local machine.
The whole idea behind any version control system is to store “safe” copies of a
project so that you never have to worry about irreparably breaking your code base.
Once you’ve built up a project history of commits, you can review and revisit any
commit in the history. One of the best utilities for reviewing the history of a Git
repository is the git log command. In the example below, we use git log to get a
list of the latest commits to a popular open-source graphics library.
Each commit has a unique SHA-1 identifying hash. These IDs are used to travel
through the committed timeline and revisit commits. By default, git log will only
show commits for the currently selected branch. It is entirely possible that the
commit you're looking for is on another branch. You can view all commits across all
branches by executing git log --branches=*. The command git branch is used to
view and visit other branches. Invoking the command, git branch -a will return a list
of all known branch names. One of these branch names can then be logged using
git log <branch_name>.
When you have found a commit reference to the point in history you want to visit,
you can utilize the git checkout command to visit that commit. Git checkout is an
easy way to “load” any of these saved snapshots onto your development machine.
During the normal course of development, the HEAD usually points to master or
some other local branch, but when you check out a previous commit, HEAD no longer
points to a branch—it points directly to a commit. This is called a “detached HEAD”
state, and it can be visualized as the following:
Checking out an old file does not move the HEAD pointer. It remains on the same
branch and same commit, avoiding a 'detached head' state. You can then commit the
old version of the file in a new snapshot as you would any other changes. So, in
effect, this usage of git checkout on a file serves as a way to revert to an old
version of an individual file. For more information on these two modes, visit the
git checkout page.
This example assumes that you’ve started developing a crazy experiment, but you’re
not sure if you want to keep it or not. To help you decide, you want to take a look at
the state of the project before you started your experiment. First, you’ll need to find
the ID of the revision you want to see.
Let’s say your project history looks something like the following:
You can use git checkout to view the “Make some important changes to hello.txt”
commit as follows:
git checkout a1e8fb5
This makes your working directory match the exact state of the a1e8fb5 commit. You
can look at files, compile the project, run tests, and even edit files without worrying
about losing the current state of the project. Nothing you do in here will be saved in
your repository. To continue developing, you need to get back to the “current” state
of your project:
git checkout master
This assumes that you're developing on the default master branch. Once you’re back
in the master branch, you can use either git revert or git reset to undo any
undesired changes.
There are technically several different strategies to 'undo' a commit. The following
examples will assume we have a commit history that looks like:
We will focus on undoing the 872fa7e Try something crazy commit. Maybe things
got a little too crazy.
Using the git checkout command we can checkout the previous commit, a1e8fb5,
putting the repository in a state before the crazy commit happened. Checking out a
specific commit will put the repo in a "detached HEAD" state. This means you are no
longer working on any branch. In a detached state, any new commits you make will
be orphaned when you change branches back to an established branch. Orphaned
commits are up for deletion by Git's garbage collector. The garbage collector runs on
a configured interval and permanently destroys orphaned commits. To prevent
orphaned commits from being garbage collected, we need to ensure we are on a
branch.
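One way to stay on a branch is to create it right after checking out the earlier commit. The following is a sketch in a scratch repository; the branch name without-crazy and the commit messages are hypothetical, and HEAD~1 stands in for a hash such as a1e8fb5.

```shell
export GIT_PAGER=cat
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
echo "important" >> hello.txt
git commit -q -am "Make some changes to hello.txt"
echo "crazy" >> hello.txt
git commit -q -am "Try something crazy"

git checkout -q HEAD~1                  # detached HEAD, one commit before the experiment
git checkout -q -b without-crazy        # HEAD is attached to a branch again
echo "calm" >> hello.txt
git commit -q -am "Continue from the earlier state"

git log --oneline                       # history of the new branch
subjects="$(git log --format=%s)"
```

The original branch still contains the experimental commit; the new branch simply starts from the commit before it.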
Let's assume we are back to our original commit history example. The history that
includes the 872fa7e commit. This time let's try a revert 'undo'. If we execute
git revert HEAD, Git will create a new commit with the inverse of the last commit.
This adds a new commit to the current branch history and now makes it look like:
For this undo strategy we will continue with our working example. git reset is an
extensive command with multiple uses and functions. If we invoke
git reset --hard a1e8fb5 the commit history is reset to that specified commit.
Examining the commit history with git log will now look like:
The log output shows the e2f9a78 and 872fa7e commits no longer exist in the
commit history. At this point, we can continue working and creating new commits as
if the 'crazy' commits never happened. This method of undoing changes has the
cleanest effect on history. Doing a reset is great for local changes; however, it adds
complications when working with a shared remote repository. If we have a shared
remote repository that has the 872fa7e commit pushed to it, and we try to git push
a branch where we have reset the history, Git will catch this and throw an error. Git
will assume that the branch being pushed is not up to date because of its missing
commits. In these scenarios, git revert should be the preferred undo method.
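The local effect of a hard reset can be observed in a scratch repository. A minimal sketch with placeholder commits; HEAD~1 stands in for a hash such as a1e8fb5, and nothing here is pushed anywhere.

```shell
export GIT_PAGER=cat
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
echo "important" >> hello.txt
git commit -q -am "Make some changes to hello.txt"
echo "crazy" >> hello.txt
git commit -q -am "Try something crazy"

before="$(git rev-list --count HEAD)"
git reset -q --hard HEAD~1      # drop the newest commit and its file changes
after="$(git rev-list --count HEAD)"

git log --oneline               # the experimental commit is no longer listed
echo "commits before: $before, after: $after"
```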
In the previous section, we discussed different strategies for undoing commits. These
strategies are all applicable to the most recent commit as well. In some cases
though, you might not need to remove or reset the last commit. Maybe it was just
made prematurely. In this case you can amend the most recent commit. Once you
have made more changes in the working directory and staged them for commit by
using git add, you can execute git commit --amend. This will have Git open the
configured system editor and let you modify the last commit message. The new
changes will be added to the amended commit.
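Amending can be sketched in a scratch repository as follows. The file names are placeholders, and --no-edit is added here only so the example runs without opening an editor; omit it to change the commit message as described above.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

echo "part one" > feature.txt
git add feature.txt
git commit -q -m "Add feature"              # made prematurely: one file is missing

echo "part two" > forgotten.txt
git add forgotten.txt
git commit -q --amend --no-edit             # fold the staged file into the last commit

# Files touched by the amended commit.
files="$(git diff-tree --no-commit-id --name-only -r HEAD)"
echo "$files"
total="$(git rev-list --count HEAD)"        # still two commits, not three
```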
Before changes are committed to the repository history, they live in the staging
index and the working directory. You may need to undo changes within these two
areas. The staging index and working directory are internal Git state management
mechanisms. For more detailed information on how these two mechanisms operate,
visit the git reset page, which explores them in depth.
The working directory
The working directory is generally in sync with the local file system. To undo changes
in the working directory you can edit files like you normally would using your
favorite editor. Git has a couple of utilities that help manage the working directory.
There is the git clean command, which is a convenience utility for undoing changes to
the working directory. Additionally, git reset invoked with the --hard option resets
the working directory along with the staging index (the default --mixed mode resets
only the staging index).
The staging index
The git add command is used to add changes to the staging index. Git reset is
primarily used to undo the staging index changes. A --mixed reset will move any
pending changes from the staging index back into the working directory.
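The move from staged to unstaged can be watched with git status --porcelain, whose first column reports the staging index and whose second column reports the working directory. A minimal sketch in a scratch repository with a placeholder file:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"

echo "v2" >> hello.txt
git add hello.txt
staged="$(git status --porcelain)"      # "M " in the first column: change is staged

git reset --mixed                       # unstage; the edit stays in the working directory
unstaged="$(git status --porcelain)"    # " M" in the second column: change is unstaged

echo "before reset: [$staged]"
echo "after reset:  [$unstaged]"
```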
The preferred method of undoing shared history is git revert. A revert is safer than
a reset because it will not remove any commits from a shared history. A revert will
retain the commits you want to undo and create a new commit that inverts the
undesired commit. This method is safer for shared remote collaboration because a
remote developer can then pull the branch and receive the new revert commit which
undoes the undesired commit.
Summary
We covered many high-level strategies for undoing things in Git. It's important to
remember that there is more than one way to 'undo' in a Git project. Most of the
discussion on this page touched on deeper topics that are more thoroughly
explained on pages specific to the relevant Git commands. The most commonly used
'undo' tools are git checkout, git revert, and git reset. Some key points to
remember are:
Use git checkout to move around and review the commit history
git revert is the best tool for undoing shared public changes
In addition to the primary undo commands, we took a look at other Git utilities:
git log for finding lost commits
git clean for undoing uncommitted changes
git add for modifying the staging index
Each of these commands has its own in-depth documentation. To learn more about a
specific command mentioned here, visit the corresponding links.
Git Clean
$ mkdir git_clean_test
$ cd git_clean_test/
$ git init .
Initialized empty Git repository in /Users/kev/code/git_cl
$ echo "tracked" > ./tracked_file
$ git add ./tracked_file
$ echo "untracked" > ./untracked_file
$ mkdir ./untracked_dir && touch ./untracked_dir/file
$ git status
On branch master
Initial commit
Changes to be committed: (use "git rm --cached <file>..."
new file: tracked_file
Untracked files: (use "git add <file>..." to include in wh
At this point, executing the default git clean command may produce a fatal error.
The example above demonstrates what this may look like. By default, Git is globally
configured to require that git clean be passed a "force" option to initiate. This is an
important safety mechanism, because git clean cannot be undone once it has run.
When fully executed, git clean will make a hard filesystem deletion, similar to
executing the command line rm utility. Make sure you really want to delete the
untracked files before you run it.
Given the previous explanation of the default git clean behaviors and caveats, the
following content demonstrates various git clean use cases and the accompanying
command line options required for their operation.
-n
The -n option will perform a “dry run” of git clean. This will show you which files
are going to be removed without actually removing them. It is a best practice to
always first perform a dry run of git clean. We can demonstrate this option in the
demo repo we created earlier.
$ git clean -n
Would remove untracked_file
The output tells us that untracked_file will be removed when the git clean
command is executed. Notice that the untracked_dir is not reported in the output
here. By default git clean will not operate recursively on directories. This is another
safety mechanism to prevent accidental permanent deletion.
-f or --force
The force option initiates the actual deletion of untracked files from the current
directory. Force is required unless the clean.requireForce configuration option is
set to false. This will not remove untracked folders or files specified by .gitignore.
Let us now execute a live git clean in our example repo.
$ git clean -f
Removing untracked_file
The command will output the files that are removed. You can see here that
untracked_file has been removed. Executing git status at this point or running ls
will show that untracked_file has been deleted and is nowhere to be found. By
default, git clean -f will operate on all untracked files in the current directory.
Additionally, a <path> value can be passed with the -f option to remove a
specific file.
-d
The -d option tells git clean that you also want to remove any untracked
directories. By default it will ignore directories. We can add the -d option to our
previous examples:
Here we have executed a 'dry run' using the -dn combination, which reports that
untracked_dir is up for removal. Then we execute a forced clean, and receive output
that untracked_dir is removed.
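The dry run followed by the forced clean can be sketched as below. It rebuilds the demo repository from earlier in a fresh temporary directory, so here both the untracked file and the untracked directory are still present when the commands run.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .

echo "tracked" > tracked_file
git add tracked_file
echo "untracked" > untracked_file
mkdir untracked_dir && touch untracked_dir/file

dry="$(git clean -dn)"      # dry run: report what -d would remove, delete nothing
echo "$dry"

git clean -df               # forced clean including untracked directories
ls                          # only tracked_file is left
```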
-x
Like the -d option, -x can be passed and composed with other options. This
example demonstrates a combination with -f that will remove untracked files from
the current directory as well as any files that Git usually ignores.
$ git clean -xf
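The difference that -x makes is easiest to see with two dry runs. A minimal sketch, assuming a scratch repository whose .gitignore ignores *.pyc; the file names are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .

echo '*.pyc' > .gitignore
git add .gitignore              # staged, so it is not an untracked file
touch hello.pyc notes.txt       # one ignored file, one plain untracked file

plain="$(git clean -n)"         # dry run without -x: ignored files are skipped
with_x="$(git clean -xn)"       # dry run with -x: ignored files are included
echo "$plain"
echo "$with_x"

git clean -xf                   # remove both untracked and ignored files
ls -a
```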
-i or --interactive
We have initiated the interactive session with the -d option so it will also act upon
our untracked_dir. The interactive mode will display a What now> prompt that
requests a command to apply to the untracked files. The commands themselves are
fairly self-explanatory. We'll take a brief look at each in a random order starting with
command 6: help. Selecting command 6 will further explain the other commands:
What now> 6
clean - start cleaning
filter by pattern - exclude items from deletion
select by numbers - select items to be deleted by numbers
ask each - confirm each deletion (like "rm -i")
quit - stop cleaning
help - this screen
? - help for prompt selection
5: quit
1: clean
Will delete the indicated items. If we were to execute 1: clean at this point,
untracked_dir/ and untracked_file would be removed.
4: ask each
Will iterate over each untracked file and display a Y/N prompt for a deletion. It looks
like the following:
2: filter by pattern
Will display an additional prompt that takes input used to filter the list of untracked
files.
Here we input the *_file wildcard pattern which then restricts the untracked file list
to just untracked_dir.
3: select by numbers
Similar to command 2, command 3 works to refine the list of untracked file names.
The interactive session will prompt for numbers that correspond to an untracked file
name.
Summary
To recap, git clean is a convenience method for deleting untracked files in a repo's
working directory. Untracked files are those that are in the repo's directory but have
not yet been added to the repo's index with git add. Overall, the effect of git clean
can be accomplished using git status and the operating system's native deletion
tools. Git clean can be used alongside git reset to fully undo any additions and
commits in a repository.
Git Revert
Reverting should be used when you want to apply the inverse of a commit from your
project history. This can be useful, for example, if you’re tracking down a bug and
find that it was introduced by a single commit. Instead of manually going in, fixing it,
and committing a new snapshot, you can use git revert to automatically do all of
this for you.
How it works
To demonstrate this, let's create an example repo using the command line examples
below:
$ mkdir git_revert_test
$ cd git_revert_test/
$ git init .
Initialized empty Git repository in /git_revert_test/.git/
$ touch demo_file
$ git add demo_file
$ git commit -am"initial commit"
[master (root-commit) 299b15f] initial commit
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 demo_file
$ echo "initial content" >> demo_file
$ git commit -am"add new content to demo file"
[master 3602d88] add new content to demo file
1 file changed, 1 insertion(+)
$ echo "prepended line content" >> demo_file
$ git commit -am"prepend content to demo file"
[master 86bb32e] prepend content to demo file
1 file changed, 1 insertion(+)
$ git log --oneline
86bb32e prepend content to demo file
3602d88 add new content to demo file
299b15f initial commit
Here we have initialized a repo in a newly created directory named git_revert_test.
We have made 3 commits to the repo in which we have added a file demo_file and
modified its content twice. At the end of the repo setup procedure, we invoke
git log to display the commit history, showing a total of 3 commits. With the repo
in this state, we are ready to initiate a git revert.
$ git revert HEAD
Git revert expects a commit ref passed in and will not execute without one. Here
we have passed in the HEAD ref. This will revert the latest commit, 86bb32e, and
leaves the files in the same state they had at commit
3602d8815dbfa78cd37cd4d189552764b5e96c58.
Similar to a merge, a revert will create a new commit, which will open up the
configured system editor prompting for a new commit message. Once a commit
message has been entered and saved, Git will resume operation. We can now
examine the state of the repo using git log and see that there is a new commit
added to the previous log:
Note that the 3rd commit is still in the project history after the revert. Instead of
deleting it, git revert added a new commit to undo its changes. As a result, the 2nd
and 4th commits represent the exact same code base and the 3rd commit is still in
our history just in case we want to go back to it down the road.
Common options
-e
--edit
This is a default option and doesn't need to be specified. This option will open the
configured system editor and prompt you to edit the commit message prior to
committing the revert.
--no-edit
This is the inverse of the -e option. The revert will not open the editor.
-n
--no-commit
Passing this option will prevent git revert from creating a new commit that inverts
the target commit. Instead of creating the new commit, this option will add the
inverse changes to the Staging Index and Working Directory. These are the other
trees Git uses to manage the state of the repository. For more info visit the
git reset page.
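The --no-commit behavior can be checked in a scratch repository. This sketch uses a shortened version of the demo history (two placeholder commits) and inspects the result with git status --porcelain, where "M" in the first column means the change is staged.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "initial content" > demo_file
git add demo_file
git commit -q -m "add new content to demo file"
echo "more content" >> demo_file
git commit -q -am "append content to demo file"

git revert --no-commit HEAD             # stage the inverse changes, create no commit

state="$(git status --porcelain)"       # the inverse change is staged
total="$(git rev-list --count HEAD)"    # history is unchanged so far
echo "$state"
cat demo_file
```

Running git commit afterwards would record the staged inverse changes as the revert commit.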
Resetting vs. reverting
It's important to understand that git revert undoes a single commit—it does not
"revert" back to the previous state of a project by removing all subsequent commits.
In Git, this is actually called a reset, not a revert.
Reverting has two important advantages over resetting. First, it doesn’t change the
project history, which makes it a “safe” operation for commits that have already been
published to a shared repository. For details about why altering shared history is
dangerous, please see the git reset page.
Second, git revert is able to target an individual commit at an arbitrary point in the
history, whereas git reset can only work backward from the current commit. For
example, if you wanted to undo an old commit with git reset, you would have to
remove all of the commits that occurred after the target commit, remove it, then re-
commit all of the subsequent commits. Needless to say, this is not an elegant undo
solution. For a more detailed discussion on the differences between git revert and
other 'undo' commands see Resetting, Checking Out and Reverting.
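As a rough illustration of targeting a commit at an arbitrary point in the history, the sketch below reverts the middle of three commits while the later one stays in place. The file names a.txt, b.txt, and c.txt and the demo identity are assumptions made for the example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"

# Three commits, each adding one file
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "add $name.txt"
done

# Undo only the middle commit; the commit after it is left alone
git revert --no-edit HEAD~1

git log --oneline   # four commits: the three originals plus the revert
```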
Summary
The git reset command is a complex and versatile tool for undoing changes. It has three
primary forms of invocation, which correspond to the command line arguments
--soft, --mixed, and --hard. Each of the three arguments corresponds to one of Git's three
internal state management mechanisms: the Commit Tree (HEAD), the Staging Index, and the
Working Directory.
To properly understand git reset usage, we must first understand Git's internal state
management systems. Sometimes these mechanisms are called Git's "three trees". Trees
may be a misnomer, as they are not strictly traditional tree data-structures. They are,
however, node and pointer-based data structures that Git uses to track a timeline of edits.
The best way to demonstrate these mechanisms is to create a changeset in a repository
and follow it through the three trees.
To get started we will create a new repository with the commands below:
$ mkdir git_reset_test
$ cd git_reset_test/
$ git init .
Initialized empty Git repository in /git_reset_test/.git/
$ touch reset_lifecycle_file
$ git add reset_lifecycle_file
$ git commit -m"initial commit"
[master (root-commit) d386d86] initial commit
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 reset_lifecycle_file
The above example code creates a new git repository with a single empty file,
reset_lifecycle_file. At this point, the example repository has a single commit
(d386d86) from adding reset_lifecycle_file.
The first tree we will examine is "The Working Directory". This tree is in sync with the local
filesystem and is representative of the immediate changes made to content in files and
directories.
In our demo repository, we modify and add some content to the reset_lifecycle_file.
Invoking git status shows that Git is aware of the changes to the file. These changes are
currently a part of the first tree, "The Working Directory". Git status can be used to show
changes to the Working Directory. They will be displayed in red with a 'modified'
prefix.
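The commands for this step are not shown in this listing. A minimal sketch of what it could look like follows; the text written into the file and the demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"

# Change the file in the Working Directory only (no git add yet)
echo 'hello git reset' > reset_lifecycle_file

# Short status: ' M' (second column) means "modified, not staged"
wd_status=$(git status --short)
echo "$wd_status"
```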
Staging index
Next up is the 'Staging Index' tree. This tree tracks Working Directory changes that
have been promoted with git add and will be stored in the next commit. This tree is a
complex internal caching mechanism. Git generally tries to hide the implementation
details of the Staging Index from the user.
To accurately view the state of the Staging Index we must use a lesser-known Git
command, git ls-files. The git ls-files command is essentially a debug utility for
inspecting the state of the Staging Index tree.
git ls-files -s
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 reset_lifecy
Here we have executed git ls-files with the -s or --stage option. Without the -s
option the git ls-files output is simply a list of file names and paths that are currently
part of the index. The -s option displays additional metadata for the files in the Staging
Index. This metadata is the staged contents' mode bits, object name, and stage number.
Here we are interested in the object name, the second value
(e69de29bb2d1d6434b8b29ae775ad8c2e48c5391). This is a standard Git object SHA-1 hash. It
is a hash of the file's content. The Commit History stores its own object SHAs for
identifying pointers to commits and refs, and the Staging Index has its own object SHAs
for tracking versions of files in the index.
Here we have invoked git add reset_lifecycle_file which adds the file to the Staging
Index. Invoking git status now shows reset_lifecycle_file in green under "Changes
to be committed". It is important to note that git status is not a true representation of
the Staging Index. The git status command output displays changes between the
Commit History and the Staging Index. Let us examine the Staging Index content at this
point.
$ git ls-files -s
100644 d7d77c1b04b5edd5acfc85de0b592449e5303770 0 reset_lifecy
We can see that the object SHA for reset_lifecycle_file has been updated from
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 to
d7d77c1b04b5edd5acfc85de0b592449e5303770.
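The relationship between git add and the object name in the index can be checked with a short sketch. It assumes a fresh throwaway repository; the file content and the use of awk to pick out the second column of the git ls-files -s output are choices made for this example.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"

# Object name recorded in the Staging Index for the empty file
before=$(git ls-files -s reset_lifecycle_file | awk '{print $2}')

# Editing the file alone does not touch the Staging Index
echo 'hello git reset' > reset_lifecycle_file
unchanged=$(git ls-files -s reset_lifecycle_file | awk '{print $2}')

# git add promotes the change, so the index now records a new object name
git add reset_lifecycle_file
after=$(git ls-files -s reset_lifecycle_file | awk '{print $2}')

# The object name is the hash of the file's content
content_hash=$(git hash-object reset_lifecycle_file)
echo "$before -> $after"
```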
Commit history
The final tree is the Commit History. The git commit command adds changes to a
permanent snapshot that lives in the Commit History. This snapshot also includes the
state of the Staging Index at the time of commit.
How it works
At a surface level, git reset is similar in behavior to git checkout. Where git checkout
solely operates on the HEAD ref pointer, git reset will move the HEAD ref pointer and the
current branch ref pointer. To better demonstrate this behavior consider the following
example:
git checkout b
With git checkout, the master ref is still pointing to d. The HEAD ref has been moved,
and now points at commit b. The repo is now in a 'detached HEAD' state.
git reset b
By comparison, git reset moves both the HEAD and branch refs to the specified commit.
In addition to updating the commit ref pointers, git reset will modify the state of the
three trees. The ref pointer modification always happens and is an update to the third
tree, the Commit tree. The command line arguments --soft, --mixed, and --hard direct
how to modify the Staging Index and Working Directory trees.
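The difference in ref movement can be sketched as follows. The four commits stand in for the a, b, c, d history used above; the file name, demo identity, and variable names are assumptions for illustration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"

# Build a linear history of four commits: a, b, c, d
for name in a b c d; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "$name"
done
branch=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called
b=$(git rev-parse HEAD~2)
d=$(git rev-parse HEAD)

# checkout moves HEAD only; the branch ref stays at d (detached HEAD)
git checkout -q "$b"
after_checkout=$(git rev-parse "$branch")

# reset moves HEAD and the current branch ref together
git checkout -q "$branch"
git reset -q "$b"
after_reset=$(git rev-parse "$branch")
```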
Main Options
The default invocation of git reset has implicit arguments of --mixed and HEAD. This
means executing git reset is equivalent to executing git reset --mixed HEAD. In this
form HEAD is the specified commit. Instead of HEAD any Git SHA-1 commit hash can be
used.
--hard
This is the most direct, DANGEROUS, and frequently used option. When passed --hard,
the Commit History ref pointers are updated to the specified commit. Then, the Staging
Index and Working Directory are reset to match that of the specified commit. Any
previously pending changes to the Staging Index and the Working Directory are reset to
match the state of the Commit Tree. This means any pending work that was hanging out
in the Staging Index and Working Directory will be lost.
To demonstrate this, let's continue with the three tree example repo we established
earlier. First let's make some modifications to the repo. Execute the following commands
in the example repo:
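The commands themselves are missing from this listing. A sketch consistent with the description that follows is given here; the file contents and the demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"

# Assumed stand-ins for the missing commands
echo 'new file content' > new_file
git add new_file                                  # staged: lands in the Staging Index
echo 'changed content' >> reset_lifecycle_file    # unstaged: Working Directory only

git status --short
```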
These commands have created a new file named new_file and added it to the repo.
Additionally, the content of reset_lifecycle_file will be modified. With these changes
in place let us now examine the state of the repo using git status.
$ git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: new_file
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working dire
modified: reset_lifecycle_file
We can see that there are now pending changes to the repo. The Staging Index tree has a
pending change for the addition of new_file and the Working Directory has a pending
change for the modifications to reset_lifecycle_file.
Before moving forward let us also examine the state of the Staging Index:
$ git ls-files -s
100644 8e66654a5477b1bf4765946147c49509a431f963 0 new_file
100644 d7d77c1b04b5edd5acfc85de0b592449e5303770 0 reset_lifecy
We can see that new_file has been added to the index. We have made updates to
reset_lifecycle_file but the Staging Index SHA
(d7d77c1b04b5edd5acfc85de0b592449e5303770) remains the same. This is expected behavior
because we have not used git add to promote these changes to the Staging Index. These
changes exist in the Working Directory.
Let us now execute a git reset --hard and examine the new state of the repository.
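The listing for this step is not reproduced here. The sketch below rebuilds the pending changes in a throwaway repository and then runs the hard reset; file contents, the demo identity, and variable names are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"

# Pending work: one staged new file, one unstaged modification
echo 'new file content' > new_file
git add new_file
echo 'changed content' >> reset_lifecycle_file

# Reset the Staging Index and Working Directory to match HEAD
git reset --hard

remaining=$(git status --short)   # nothing pending any more
index_files=$(git ls-files)       # new_file is gone from the index
echo "$index_files"
```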
Here we have executed a "hard reset" using the --hard option. Git displays output
indicating that HEAD is pointing to the latest commit dc67808. Next, we check the state of
the repo with git status. Git indicates there are no pending changes. We also examine
the state of the Staging Index and see that it has been reset to a point before new_file
was added. Our modifications to reset_lifecycle_file and the addition of new_file
have been destroyed. This data loss cannot be undone, which is critical to take note of.
--mixed
This is the default operating mode. The ref pointers are updated. The Staging Index is
reset to the state of the specified commit. Any changes that have been undone from the
Staging Index are moved to the Working Directory. Let us continue.
In the example above we have made some modifications to the repository. Again, we
have added a new_file and modified the contents of reset_lifecycle_file. These
changes are then applied to the Staging Index with git add. With the repo in this state,
we will now execute the reset.
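A sketch of the whole sequence, from staging the changes to running the reset, might look like this. The file contents and the demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"

# Stage a new file and a modification
echo 'new file content' > new_file
echo 'append content' >> reset_lifecycle_file
git add new_file reset_lifecycle_file

# --mixed is the default, so a plain `git reset` behaves the same way
git reset --mixed

git status --short   # both changes are still on disk, but no longer staged
```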
The important thing to take note of here is that git status shows us that there are
modifications to reset_lifecycle_file and there is an untracked file: new_file. This is
the explicit --mixed behavior. The Staging Index has been reset and the pending changes
have been moved into the Working Directory. Compare this to the --hard reset case
where the Staging Index was reset and the Working Directory was reset as well, losing
these updates.
--soft
When the --soft argument is passed, the ref pointers are updated and the reset stops
there. The Staging Index and the Working Directory are left untouched. This behavior can
be hard to clearly demonstrate. Let's continue with our demo repo and prepare it for a
soft reset.
Here we have again used git add to promote the modified reset_lifecycle_file into
the Staging Index. We confirm that the index has been updated with the git ls-files
output. The output from git status now displays the "Changes to be committed" in
green. The new_file from our previous examples is floating around in the Working
Directory as an untracked file. Let's quickly execute rm new_file to delete the file as we
will not need it for the upcoming examples.
We have executed a 'soft reset'. Examining the repo state with git status and
git ls-files shows that nothing has changed. This is expected behavior. A soft reset will
only reset the Commit History. By default, git reset is invoked with HEAD as the target
commit. Since our Commit History was already sitting on HEAD and we implicitly reset to
HEAD nothing really happened.
To better understand and utilize --soft we need a target commit that is not HEAD. We
have reset_lifecycle_file waiting in the Staging Index. Let's create a new commit.
At this point, our repo should have three commits. We will be going back in time to the
first commit. To do this we will need the first commit's ID. This can be found by viewing
output from git log.
$ git log
commit 62e793f6941c7e0d4ad9a1345a175fe8f45cb9df
Author: bitbucket
Date: Fri Dec 1 15:03:07 2017 -0800
prepend content to reset_lifecycle_file
commit dc67808a6da9f0dec51ed16d3d8823f28e1a72a
Author: bitbucket
Date: Fri Dec 1 10:21:57 2017 -0800
update content of reset_lifecycle_file
commit 780411da3b47117270c0e3a8d5dcfd11d28d04a4
Author: bitbucket
Date: Thu Nov 30 16:50:39 2017 -0800
initial commit
Keep in mind that Commit History IDs will be unique to each system. This means the
commit IDs in this example will be different from what you see on your personal
machine. The commit ID we are interested in for this example is
780411da3b47117270c0e3a8d5dcfd11d28d04a4. This is the ID that corresponds to the "initial
commit". Once we have located this ID we will use it as the target for our soft reset.
Before we travel back in time let's first check the current state of the repo.
Here we execute a combo command of git status and git ls-files -s. This shows us
there are pending changes to the repo and reset_lifecycle_file in the Staging Index is
at a version of 67cc52710639e5da6b515416fd779d0741e3762e. With this in mind let's execute
a soft reset back to our first commit.
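The listing for this step is missing. A self-contained sketch of a soft reset to the first commit is shown below; the commit messages, file contents, demo identity, and variable names are assumptions, and the commit IDs will differ from the ones quoted in the text.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"

touch reset_lifecycle_file
git add reset_lifecycle_file
git commit -q -m "initial commit"
first=$(git rev-parse HEAD)               # ID of the first commit

# Two more commits (contents are placeholders)
echo "update" >> reset_lifecycle_file
git commit -q -am "second commit"
echo "more" >> reset_lifecycle_file
git commit -q -am "third commit"

index_before=$(git ls-files -s)
git reset --soft "$first"                 # only the ref pointers move
index_after=$(git ls-files -s)

# The combo command: repo state plus Staging Index contents
git status --short
git ls-files -s
commits=$(git rev-list --count HEAD)
```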
The code above executes a "soft reset" and also invokes the git status and
git ls-files combo command, which outputs the state of the repository. We can
examine the repo state output and note some interesting observations. First, git status
indicates there are modifications to reset_lifecycle_file and highlights them indicating
they are changes staged for the next commit. Second, the git ls-files output indicates
that the Staging Index has not changed and retains the SHA
67cc52710639e5da6b515416fd779d0741e3762e we had earlier.
To further clarify what has happened in this reset let us examine the git log:
$ git log
commit 780411da3b47117270c0e3a8d5dcfd11d28d04a4
Author: bitbucket
Date: Thu Nov 30 16:50:39 2017 -0800
initial commit
The log output now shows that there is a single commit in the Commit History. This helps
to clearly illustrate what --soft has done. As with all git reset invocations, the first
action reset takes is to reset the commit tree. Our previous examples with --hard and
--mixed have both been against the HEAD and have not moved the Commit Tree back in
time. During a soft reset, this is all that happens.
It may then be confusing that git status indicates there are modified files.
--soft does not touch the Staging Index, so the updates to our Staging Index followed
us back in time through the commit history. This can be confirmed by the output of
git ls-files -s showing that the SHA for reset_lifecycle_file is unchanged. As a
reminder, git status does not show the state of 'the three trees', it essentially shows a
diff between them. In this case, it is displaying that the Staging Index is ahead of the
changes in the Commit History as if we have already staged them.
Resetting vs Reverting
If git revert is a “safe” way to undo changes, you can think of git reset as the
dangerous method. There is a real risk of losing work with git reset. Git reset will
never delete a commit; however, commits can become 'orphaned', which means there is
no direct path from a ref to access them. These orphaned commits can usually be found
and restored using git reflog. Git will permanently delete orphaned commits once their
reflog entries have expired and the internal garbage collector runs. By default, reflog
entries for unreachable commits expire after 30 days. Commit History is one of the
'three Git trees'; the other two, the Staging Index and the Working Directory, are not as
permanent as commits. Care must be taken when using this tool, as it’s one of the only
Git commands that has the potential to lose your work.
Whereas reverting is designed to safely undo a public commit, git reset is designed to
undo local changes to the Staging Index and Working Directory. Because of their distinct
goals, the two commands are implemented differently: resetting completely removes a
changeset, whereas reverting maintains the original changeset and uses a new commit to
apply the undo.
You should never use git reset <commit> when any snapshots after <commit> have
been pushed to a public repository. After publishing a commit, you have to assume that
other developers are reliant upon it.
Removing a commit that other team members have continued developing poses serious
problems for collaboration. When they try to sync up with your repository, it will look like
a chunk of the project history abruptly disappeared. The sequence below demonstrates
what happens when you try to reset a public commit. The origin/master branch is the
central repository’s version of your local master branch.
As soon as you add new commits after the reset, Git will think that your local history has
diverged from origin/master, and the merge commit required to synchronize your
repositories is likely to confuse and frustrate your team.
The point is, make sure that you’re using git reset <commit> on a local experiment that
went wrong—not on published changes. If you need to fix a public commit, the
git revert command was designed specifically for this purpose.
Examples
git reset <file>
Remove the specified file from the staging area, but leave the working directory
unchanged. This unstages a file without overwriting any changes.
git reset
Reset the staging area to match the most recent commit, but leave the working directory
unchanged. This unstages all files without overwriting any changes, giving you the
opportunity to re-build the staged snapshot from scratch.
git reset --hard
Reset the staging area and the working directory to match the most recent commit. In
addition to unstaging changes, the --hard flag tells Git to overwrite all changes in the
working directory, too. Put another way: this obliterates all uncommitted changes, so
make sure you really want to throw away your local developments before using it.
git reset <commit>
Move the current branch tip backward to <commit>, reset the staging area to match, but
leave the working directory alone. All changes made since <commit> will reside in the
working directory, which lets you re-commit the project history using cleaner, more
atomic snapshots.
git reset --hard <commit>
Move the current branch tip backward to <commit> and reset both the staging area and
the working directory to match. This obliterates not only the uncommitted changes, but
all commits after <commit>, as well.
Unstaging a file
The git reset command is frequently encountered while preparing the staged snapshot.
The next example assumes you have two files called hello.py and main.py that you’ve
already added to the repository.
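The example listing is missing here. The following sketch walks through the workflow in a throwaway repository; the file contents, commit messages, and demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo 'print("hello")' > hello.py
echo 'print("main")' > main.py
git add hello.py main.py
git commit -q -m "Add hello.py and main.py"

# Edit both files, then stage everything in the current directory
echo '# tweak' >> hello.py
echo '# tweak' >> main.py
git add .

# Realize the edits belong in separate snapshots: unstage main.py
git reset -q -- main.py

# Commit only hello.py
git commit -q -m "Make some changes to hello.py"

# Commit main.py in a separate snapshot
git add main.py
git commit -q -m "Edit main.py"
```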
As you can see, git reset helps you keep your commits highly-focused by letting you
unstage changes that aren’t related to the next commit.
The next example shows a more advanced use case. It demonstrates what happens when
you’ve been working on a new experiment for a while, but decide to completely throw it
away after committing a few snapshots.
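The listing for this example is also missing. The sketch below uses the --hard form so that the experiment's files are removed from the working directory as well; the file names, contents, and demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo 'print("main")' > main.py
git add main.py
git commit -q -m "Add main.py"

# Create a new file called foo.py and commit it to the project history
echo 'x = 1' > foo.py
git add foo.py
git commit -q -m "Start developing a crazy feature"

# Edit foo.py again and commit another snapshot
echo 'x = 2' >> foo.py
git commit -q -am "Continue my crazy feature"

# Decide to scrap the feature and remove the associated commits
git reset --hard HEAD~2
```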
The git reset HEAD~2 command moves the current branch backward by two commits,
effectively removing the two snapshots we just created from the project history.
Remember that this kind of reset should only be used on unpublished commits. Never
perform the above operation if you’ve already pushed your commits to a shared
repository.
Summary
To review, git reset is a powerful command that is used to undo local changes to the
state of a Git repo. Git reset operates on "The Three Trees of Git". These trees are the
Commit History (HEAD), the Staging Index, and the Working Directory. There are three
command line options that correspond to the three trees. The options --soft, --mixed,
and --hard can be passed to git reset.
In this article we leveraged several other Git commands to help demonstrate the reset
processes. Learn more about those commands on their individual pages at:
git status, git log, git add, git checkout, git reflog, and git revert.
Git RM
A common question when getting started with Git is "How do I tell Git not to track a
file (or files) any more?" The git rm command is used to remove files from a Git
repository. It can be thought of as the inverse of the git add command.
Git rm Overview
The git rm command can be used to remove individual files or a collection of files.
The primary function of git rm is to remove tracked files from the Git index.
Additionally, git rm can be used to remove files from both the staging index and the
working directory. There is no option to remove a file from only the working
directory. The files being operated on must be identical to the files in the current
HEAD. If there is a discrepancy between the HEAD version of a file and the staging
index or working tree version, Git will block the removal. This block is a safety
mechanism to prevent removal of in-progress changes.
Note that git rm does not remove branches. Learn more about using git branches.
Usage
<file>…
Specifies the target files to remove. The option value can be an individual file, a
space-delimited list of files (file1 file2 file3), or a wildcard file glob
(./directory/*).
-f
--force
The -f option is used to override the safety check that Git makes to ensure that the
files in HEAD match the current content in the staging index and working directory.
-n
--dry-run
The "dry run" option is a safeguard that will execute the git rm command but not
actually delete the files. Instead it will output which files it would have removed.
-r
The -r option is shorthand for 'recursive'. When operating in recursive mode git rm
will remove a target directory and all the contents of that directory.
--
The separator option is used to explicitly distinguish between a list of file names and
the arguments being passed to git rm. This is useful if some of the file names have
syntax that might be mistaken for other options.
--cached
The cached option specifies that the removal should happen only on the staging
index. Working directory files will be left alone.
--ignore-unmatch
This causes the command to exit with a zero status even if no files matched.
This is a Unix-level exit code. The code 0 indicates a successful invocation of the
command. The --ignore-unmatch option can be helpful when using git rm as part
of a larger shell script that needs to fail gracefully.
-q
--quiet
The quiet option hides the output of the git rm command. The command normally
outputs one line for each file removed.
Executing git rm is not a permanent update. The command will update the staging
index and the working directory. These changes will not be persisted until a new
commit is created and the changes are added to the commit history. This means that
the changes here can be "undone" using common Git commands.
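git reset --hard HEAD
A hard reset will revert the current staging index and working directory back to the HEAD
commit. This will undo a git rm. Note that it also discards any other uncommitted changes.
git checkout HEAD -- .
Checking out the paths from HEAD will have the same effect and restore the latest
version of a file from HEAD.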
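Both ways of undoing an uncommitted git rm can be tried in a throwaway repository. The file name keep.txt, its content, and the demo identity are assumptions; the checkout form names HEAD explicitly so the path is restored to both the staging index and the working directory.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo "data" > keep.txt
git add keep.txt
git commit -q -m "add keep.txt"

# Undo with a hard reset
git rm -q keep.txt                 # removed from the index and the working directory
git reset -q --hard HEAD           # both are restored from HEAD
restored_by_reset=$(cat keep.txt)

# Undo with a checkout of the path from HEAD
git rm -q keep.txt
git checkout HEAD -- keep.txt
restored_by_checkout=$(cat keep.txt)

git status --short                 # nothing pending after either undo
```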
In the event that git rm was executed and a new commit was created that persists
the removal, git reflog can be used to find a ref from before the git rm execution.
Learn more about using git reflog.
Discussion
The <file> argument given to the command can be exact paths, wildcard file glob
patterns, or exact directory names. The command removes only paths currently
committed to the Git repository.
The git rm command operates on the current branch only. The removal event is
only applied to the working directory and staging index trees. The file removal is not
persisted to the repository history until a new commit is created.
A Git repository will recognize when a regular shell rm command has been executed
on a file it is tracking. It will update the working directory to reflect the removal. It
will not update the staging index with the removal. An additional git add command
will have to be executed on the removed file paths to add the changes to the staging
index. The git rm command acts as a shortcut in that it will update the working
directory and the staging index with the removal.
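The difference is visible in the short status output. This is a small sketch with assumed file names and a demo identity.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "add a.txt and b.txt"

rm a.txt           # shell rm: only the working directory changes
git rm -q b.txt    # git rm: working directory and staging index change

# ' D' = deleted but not staged, 'D ' = deletion staged for the next commit
git status --short
```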
Examples
git rm Documentation/\*.txt
This example uses a wildcard file glob to remove all *.txt files that are children of
the Documentation directory and any of its subdirectories.
Note that the asterisk * is escaped with a backslash in this example; this is a guard that
prevents the shell from expanding the wildcard. Git then expands the wildcard itself against
the pathnames of files and subdirectories under the Documentation/ directory.
git rm -f git-*.sh
This example uses the force option and targets all wildcard git-*.sh files. The force
option explicitly removes the target files from both the working directory and
staging index.
As stated above in "Why use git rm instead of rm" , git rm is actually a convenience
command that combines the standard shell rm and git add to remove a file from
the working directory and promote that removal to the staging index. A repository
can get into a cumbersome state in the event that several files have been removed
using only the standard shell rm command.
If intentions are to record all the explicitly removed files as part of the next commit,
git commit -a will add all the removal events to the staging index in preparation of
the next commit.
If however, intentions are to persistently remove the files that were removed with the
shell rm, use the following command:
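The command is missing from this listing. One pipeline that matches the description below, and that also appears in the git rm manual, is shown here inside a throwaway repository; the file names and demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo "a" > a.txt
echo "b" > b.txt
echo "keep" > keep.txt
git add a.txt b.txt keep.txt
git commit -q -m "add files"

# Files removed with the plain shell rm
rm a.txt b.txt

# List the deleted paths (NUL-separated) and stage each removal
git diff --name-only --diff-filter=D -z | xargs -0 git rm --cached

git status --short   # both deletions are now staged
```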
This command will generate a list of the removed files from the working directory
and pipe that list to git rm --cached which will update the staging index.
Git rm summary
git rm is a command that operates on two of the primary Git internal state
management trees: the working directory, and staging index. git rm is used to
remove a file from a Git repository. It is a convenience method that combines the
effect of the default shell rm command with git add. This means that it will first
remove a target from the filesystem and then add that removal event to the staging
index. The command is one of many that can be used for undoing changes in Git.
Rewriting history
Git commit --amend and other methods of rewriting history
Intro
This tutorial will cover various methods of rewriting and altering Git history. Git uses
a few different methods to record changes. We will discuss the strengths and
weaknesses of the different methods and give examples of how to work with them.
This tutorial discusses some of the most common reasons for overwriting committed
snapshots and shows you how to avoid the pitfalls of doing so.
Git's main job is to make sure you never lose a committed change. But it's also
designed to give you total control over your development workflow. This includes
letting you define exactly what your project history looks like; however, it also creates
the potential of losing commits. Git provides its history-rewriting commands under
the disclaimer that using them may result in lost content.
Git has several mechanisms for storing history and saving changes. These
mechanisms include git commit --amend, git rebase, and git reflog. These options
give you powerful workflow customization options. By the end of this tutorial, you'll
be familiar with commands that will let you restructure your Git commits, and be
able to avoid pitfalls that are commonly encountered when rewriting history.
Let's say you just committed and you made a mistake in your commit log message.
Running git commit --amend when there is nothing staged lets you edit the previous
commit’s message without altering its snapshot.
Premature commits happen all the time in the course of your everyday development.
It’s easy to forget to stage a file or to format your commit message the wrong way.
The --amend flag is a convenient way to fix these minor mistakes.
Adding the -m option allows you to pass in a new message from the command line
without being prompted to open an editor.
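A short sketch of fixing a message with --amend -m follows. The file name, the messages, and the demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo "content" > readme.txt
git add readme.txt
git commit -q -m "Fix tpyo"               # message contains a mistake
before=$(git rev-parse HEAD)

# Replace the last commit's message from the command line, no editor involved
git commit -q --amend -m "Fix typo"
after=$(git rev-parse HEAD)

git log --oneline   # still one commit, now with the corrected message
```

The commit ID changes because the amended commit is a new commit that replaces the old one.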
The --no-edit flag will allow you to make the amendment to your commit without
changing its commit message. The resulting commit will replace the incomplete one,
and it will look like we committed the changes to hello.py and main.py in a single
snapshot.
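The scenario can be sketched as follows: main.py is forgotten in the first commit and folded in afterwards. File contents, the commit message, and the demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"
echo 'print("hello")' > hello.py
echo 'print("main")' > main.py
git add hello.py main.py
git commit -q -m "Add hello.py and main.py"

# Edit both files, but only stage and commit hello.py
echo '# tweak' >> hello.py
echo '# tweak' >> main.py
git add hello.py
git commit -q -m "Update hello.py and main.py"

# Realize main.py was forgotten: stage it and fold it into the last commit
git add main.py
git commit -q --amend --no-edit

git diff --name-only HEAD~1 HEAD   # the last commit now contains both files
```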
Don’t amend public commits
Amended commits are actually entirely new commits and the previous commit will
no longer be on your current branch. This has the same consequences as resetting a
public snapshot. Avoid amending a commit that other developers have based their
work on. This is a confusing situation for developers to be in and it’s complicated to
recover from.
Recap
To review, git commit --amend lets you take the most recent commit and add new
staged changes to it. You can add or remove changes from the Git staging area to
apply with a --amend commit. If there are no changes staged, a --amend will still
prompt you to modify the last commit message log. Be cautious when using --amend
on commits shared with other team members. Amending a commit that is shared
with another user will potentially require confusing and lengthy merge conflict
resolutions.
To modify older or multiple commits, you can use git rebase to combine a
sequence of commits into a new base commit. In standard mode, git rebase allows
you to literally rewrite history — automatically applying commits in your current
working branch to the passed branch head. Since your new commits will be
replacing the old, it's important to not use git rebase on commits that have been
pushed public, or it will appear that your project history disappeared.
In these or similar instances where it's important to preserve a clean project history,
adding the -i option to git rebase allows you to run rebase interactively. This
gives you the opportunity to alter individual commits in the process, rather than
moving all commits. You can learn more about interactive rebasing and additional
rebase commands on the git rebase page.
Multiple messages
Each regular Git commit will have a log message explaining what happened in the
commit. These messages provide valuable insight into the project history. During a
rebase, you can run a few commands on commits to modify commit messages.
Reword or 'r' will stop rebase playback and let you rewrite the individual
commit message.
Squash or 's': during rebase playback, Git will pause on any commit marked 's'
and prompt you to edit the separate commit messages into a
combined message. More on this in the squash commits section below.
Fixup or 'f' has the same combining effect as squash. Unlike squash, fixup
commits will not interrupt rebase playback to open an editor to combine
commit messages. The commits marked 'f' will have their messages discarded
in favor of the previous commit's message.
Squash commits for a clean history
The s "squash" command is where we see the true utility of rebase. Squash allows
you to specify which commits you want to merge into the previous commits. This is
what enables a "clean history." During rebase playback, Git will execute the specified
rebase command for each commit. In the case of squash commits, Git will open your
configured text editor and prompt to combine the specified commit messages. This
entire process can be visualized as follows:
Note that the commits modified with a rebase command have a different ID than
either of the original commits. Commits marked with pick will have a new ID if the
previous commits have been rewritten.
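Interactive rebasing normally means editing the todo list by hand. For a reproducible sketch, the example below scripts that edit through the GIT_SEQUENCE_EDITOR environment variable, using sed to mark the second listed commit as a fixup. The file name, commit messages, and demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "line $n" >> log.txt
  git add log.txt
  git commit -q -m "commit $n"
done
old_tip=$(git rev-parse HEAD)

# Stand-in for hand-editing the todo list: change line 2 from "pick" to "fixup",
# which folds "commit 3" into "commit 2" and keeps the message of "commit 2"
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2
new_tip=$(git rev-parse HEAD)

git log --oneline   # two commits remain; the combined one has a new ID
```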
Modern Git hosting solutions like Bitbucket now offer "auto squashing" features
upon merge. These features will automatically rebase and squash a branch's commits
for you when utilizing the hosted solution's UI. For more info see "Squash commits
when merging a Git branch with Bitbucket."
Recap
Git rebase gives you the power to modify your history, and interactive rebasing
allows you to do so without leaving a “messy” trail. This creates the freedom to make
and correct errors and refine your work, while still maintaining a clean, linear project
history.
Reference logs, or "reflogs" are a mechanism Git uses to record updates applied to
tips of branches and other commit references. Reflog allows you to go back to
commits even though they are not referenced by any branch or tag. After rewriting
history, the reflog contains information about the old state of branches and allows
you to go back to that state if necessary. Every time your branch tip is updated for
any reason (by switching branches, pulling in new changes, rewriting history or
simply by adding new commits), a new entry will be added to the reflog. In this
section we will take a high level look at the git reflog command and explore some
common uses.
Usage
git reflog
This displays the reflog for the local repository.
git reflog --relative-date
This shows the reflog with relative date information (e.g. 2 weeks ago).
Example
To understand git reflog, let's run through an example.
The reflog above shows a checkout from master to the 2.2 branch and back. From
there, there's a hard reset to an older commit. The latest activity is represented at the
top labeled HEAD@{0}.
If it turns out that you accidentally moved back, the reflog will contain the commit
master pointed to (0254ea7) before you accidentally dropped 2 commits.
Using git reset, it is now possible to change master back to the commit it was at
before. This provides a safety net in case the history was accidentally changed.
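The recovery can be sketched in a throwaway repository. The commit IDs will differ from the 0254ea7 quoted above, and the file name, commit messages, and demo identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "demo"               # assumed local identity so commits succeed
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "line $n" >> log.txt
  git add log.txt
  git commit -q -m "commit $n"
done
tip=$(git rev-parse HEAD)

# "Accidentally" drop the last two commits
git reset -q --hard HEAD~2

# HEAD@{0} is the reset itself; HEAD@{1} is where HEAD pointed before it
git reflog

# Move the branch back to the commit it pointed to before the reset
git reset -q --hard "HEAD@{1}"
recovered=$(git rev-parse HEAD)
```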
It's important to note that the reflog only provides a safety net if changes have been
committed to your local repository, and that it only tracks movements of the
repository's branch tips. Additionally, reflog entries have an expiration date. The
default expiration time for reflog entries is 90 days.
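The recovery described above can be sketched as a short shell session. This is only an illustration under stated assumptions: the temporary repository, the notes.txt file, and the commit messages are invented for the demo, and the commit IDs you see will differ from the 0254ea7 mentioned in the text.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits on master.
for n in 1 2 3; do
  echo "line $n" >> notes.txt
  git add notes.txt
  git commit -q -m "commit $n"
done
before=$(git rev-parse HEAD)

# Accidentally move master back, dropping 2 commits.
git reset -q --hard HEAD~2

# The reflog still records where HEAD pointed before the reset.
git --no-pager reflog

# HEAD@{1} is the position of HEAD just before the most recent move.
git reset -q --hard 'HEAD@{1}'
after=$(git rev-parse HEAD)
```

After the second reset, master points at the same commit it did before the accidental reset.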
Summary
In this article we discussed several methods of changing Git history and undoing Git
changes. We took a high-level look at the git rebase process. Some key takeaways
are:
Use git commit --amend to make modifications to the most recent commit.
git rebase -i gives much more fine-grained control over history
modifications than a standard git rebase.
git rebase
git reflog
git rebase
This document will serve as an in-depth discussion of the git rebase command. The
rebase command has also been covered on the setting up a
repository and rewriting history pages. This page will take a more detailed look at
git rebase configuration and execution. Common rebase use cases and pitfalls will
be covered here.
Rebase is one of two Git utilities that specializes in integrating changes from one
branch onto another. The other change integration utility is git merge. Merge is
always a forward moving change record. Alternatively, rebase has powerful history
rewriting features. For a detailed look at Merge vs. Rebase, visit our Merging vs
Rebasing guide. Rebase itself has 2 main modes: "manual" and "interactive" mode.
We will cover the different Rebase modes in more detail below.
From a content perspective, rebasing is changing the base of your branch from one
commit to another making it appear as if you'd created your branch from a different
commit. Internally, Git accomplishes this by creating new commits and applying
them to the specified base. It's very important to understand that even though the
branch looks the same, it's composed of entirely new commits.
Usage
The primary reason for rebasing is to maintain a linear project history. For example,
consider a situation where the master branch has progressed since you started
working on a feature branch. You want to get the latest updates to the master
branch in your feature branch, but you want to keep your branch's history clean so it
appears as if you've been working off the latest master branch. This gives the later
benefit of a clean merge of your feature branch back into the master branch. Why do
we want to maintain a "clean history"? The benefits of having a clean history become
tangible when performing Git operations to investigate the introduction of a
regression. A more real-world scenario would be:
3. The developer cannot identify when the bug was introduced using git log, so
the developer executes a git bisect.
4. Because the git history is clean, git bisect has a refined set of commits to
compare when looking for the regression. The developer quickly finds the
commit that introduced the bug and is able to act accordingly.
You have two options for integrating your feature into the master branch: merging
directly or rebasing and then merging. The former option results in a 3-way merge
and a merge commit, while the latter results in a fast-forward merge and a perfectly
linear history. The following diagram demonstrates how rebasing onto the master
branch facilitates a fast-forward merge.
Rebasing is a common way to integrate upstream changes into your local repository.
Pulling in upstream changes with Git merge results in a superfluous merge commit
every time you want to see how the project has progressed. On the other hand,
rebasing is like saying, “I want to base my changes on what everybody has already
done.”
Git rebase in standard mode will automatically take the commits in your current
working branch and apply them to the head of the passed branch.
git rebase <base>
This automatically rebases the current branch onto <base>, which can be any kind of
commit reference (for example an ID, a branch name, a tag, or a relative reference to
HEAD).
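As a minimal sketch of a standard rebase, the session below builds a tiny repository in which master has progressed after a feature branch was created, then replays the feature commit onto master. The repository, the branch name feature, and the file names are assumptions made up for the illustration.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# Start a feature branch and commit on it.
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"
old_feature=$(git rev-parse feature)

# Meanwhile, master progresses.
git checkout -q master
echo update > update.txt
git add update.txt
git commit -q -m "update master"

# Replay the feature commits on top of the current master.
git checkout -q feature
git rebase master
new_feature=$(git rev-parse feature)

# The history is now linear: feature sits directly on top of master.
git --no-pager log --oneline --graph
```

The feature branch contains the same change as before, but its commit has a new ID because it was recreated on the new base.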
Running git rebase with the -i flag begins an interactive rebasing session. Instead
of blindly moving all of the commits to the new base, interactive rebasing gives you
the opportunity to alter individual commits in the process. This lets you clean up
history by removing, splitting, and altering an existing series of commits. It's like
git commit --amend on steroids.
git rebase --interactive <base>
This rebases the current branch onto <base> but uses an interactive rebasing
session. This opens an editor where you can enter commands (described below) for
each commit to be rebased. These commands determine how individual commits will
be transferred to the new base. You can also reorder the commit listing to change
the order of the commits themselves. Once you've specified commands for each
commit in the rebase, Git will begin playing back commits applying the rebase
commands. The rebasing edit commands are as follows:
pick (p) leaves the commit as is. It will not modify the commit's
message or content, and the commit will still be an individual commit in the
branch's history.
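An interactive session normally opens your editor, but for illustration it can be scripted. The sketch below is one possible way to do that, not the tutorial's own method: it sets GIT_SEQUENCE_EDITOR to a sed command that changes the second pick line of the todo list to squash, and sets GIT_EDITOR to true so the combined commit message is accepted unchanged. The repository and commit messages are assumptions for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

for msg in "initial commit" "add feature" "fix typo in feature"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# The todo list for HEAD~2 contains two lines:
#   pick <id> add feature
#   pick <id> fix typo in feature
# The sed command rewrites line 2 to "squash", which melds that commit
# into the one above it. GIT_EDITOR=true accepts the combined message.
GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/squash/"' GIT_EDITOR=true \
  git rebase -i HEAD~2

git --no-pager log --oneline
```

In everyday use you would simply run git rebase -i <base> and edit the todo list by hand; the environment variables only make the example reproducible.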
Recap
Interactive rebasing gives you complete control over what your project history looks
like. This affords a lot of freedom to developers, as it lets them commit a "messy"
history while they're focused on writing code, then go back and clean it up after the
fact.
Most developers like to use an interactive rebase to polish a feature branch before
merging it into the main code base. This gives them the opportunity to squash
insignificant commits, delete obsolete ones, and make sure everything else is in
order before committing to the “official” project history. To everybody else, it will
look like the entire feature was developed in a single series of well-planned commits.
The real power of interactive rebasing can be seen in the history of the resulting
master branch. To everybody else, it looks like you're a brilliant developer who
implemented the new feature with the perfect amount of commits the first time
around. This is how interactive rebasing can keep a project's history clean and
meaningful.
Configuration options
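There are a few rebase properties that can be set using git config. These options
alter the look and feel of the git rebase output.
The rebase.missingCommitsCheck option controls how rebase reacts when commits are
removed from the interactive todo list. Its values include:
error
Stops the rebase and prints removed commit warning messages.
ignore
Set by default; this ignores any missing commit warnings.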
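As a small sketch, the error and ignore values described above belong to Git's rebase.missingCommitsCheck setting, and can be set per repository with git config. The throwaway repository is an assumption for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Make an interactive rebase stop with an error if a commit line is
# deleted from the todo list instead of being explicitly dropped.
git config rebase.missingCommitsCheck error

# Read the value back to confirm it was stored.
git config --get rebase.missingCommitsCheck
```

Adding --global to the first git config call would apply the setting to all of your repositories instead of just this one.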
The --onto option enables a more powerful form of rebase that allows passing
specific refs to be the tips of a rebase.
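Let’s say we have an example repo with branches like:
   o---o---o---o---o  master
        \
         o---o---o---o---o  featureA
              \
               o---o---o  featureB
featureB is based on featureA. If featureB does not depend on any of the changes
in featureA, it can be moved so that it branches off master instead:
git rebase --onto master featureA featureB
The result is then:
                      o---o---o  featureB
                     /
    o---o---o---o---o  master
     \
      o---o---o---o---o  featureA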
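The branch layout in the diagrams can be reproduced in miniature. The following is a sketch under assumptions: the make_commit helper, the file names, and the single commit per branch are invented to keep the demo short, while the branch names match the diagrams.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: create one file and commit it.
make_commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit m1
git checkout -q -b featureA
make_commit a1
git checkout -q -b featureB      # featureB starts from featureA
make_commit b1
git checkout -q master
make_commit m2

# Take the commits that are on featureB but not on featureA
# and replay them onto master.
git rebase --onto master featureA featureB

git --no-pager log --oneline --graph --all
```

After the rebase, featureB contains master's commits plus its own, and no longer includes the commit made on featureA.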
One caveat to consider when working with git rebase is that merge conflicts may
become more frequent during a rebase workflow. This occurs if you have a
long-lived branch that has strayed from master. Eventually you will want to rebase
against master, and at that time it may contain many new commits that your branch
changes may conflict with. This is easily remedied by rebasing your branch frequently
against master and making more frequent commits. The --continue and --abort
command-line arguments can be passed to git rebase to advance or reset the rebase
when dealing with conflicts.
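To illustrate --abort and --continue, the sketch below deliberately creates a conflict by changing the same line on two branches. It first abandons the rebase, then repeats it and resolves the conflict. The repository, file contents, and the "resolved version" text are assumptions for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "feature change"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "master change"

git checkout -q feature
start=$(git rev-parse HEAD)

# The rebase stops with a non-zero exit status because of the conflict.
git rebase master || echo "rebase stopped on a conflict"
git status --short

# Option 1: give up and return the branch to its pre-rebase state.
git rebase --abort
aborted=$(git rev-parse HEAD)

# Option 2: run it again, resolve the conflict, stage it, and continue.
git rebase master || true
echo "resolved version" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue
```

GIT_EDITOR=true is used only so that the commit message is accepted without opening an editor.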
A more serious rebase caveat is lost commits from interactive history rewriting.
Running rebase in interactive mode and executing subcommands like squash or
drop will remove commits from your branch's immediate log. At first glance this can
appear as though the commits are permanently gone. Using git reflog, these
commits can be restored and the entire rebase can be undone. For more info on
using git reflog to find lost commits, visit our Git reflog documentation page.
Git Rebase itself is not seriously dangerous. The real danger cases arise when
executing history rewriting interactive rebases and force pushing the results to a
remote branch that's shared by other users. This is a pattern that should be avoided
as it has the capability to overwrite other remote users' work when they pull.
If another user has rebased and force pushed to the branch that you’re committing
to, a git pull will then overwrite any commits you have based off that previous
branch with the tip that was force pushed. Luckily, using git reflog you can get the
reflog of the remote branch. On the remote branch's reflog you can find a ref before
it was rebased. You can then rebase your branch against that remote ref using the
--onto option as discussed above in the Advanced Rebase Application section.
Summary
In this article we covered git rebase usage. We discussed basic use cases and
more advanced examples. Some key discussion points are:
We looked at git rebase usage with other tools like git reflog, git fetch, and
git push. Visit their corresponding pages for further information.
Rewriting history
Git commit --amend and other methods of rewriting history
Intro
This tutorial will cover various methods of rewriting and altering Git history. Git uses
a few different methods to record changes. We will discuss the strengths and
weaknesses of the different methods and give examples of how to work with them.
This tutorial discusses some of the most common reasons for overwriting committed
snapshots and shows you how to avoid the pitfalls of doing so.
Git's main job is to make sure you never lose a committed change. But it's also
designed to give you total control over your development workflow. This includes
letting you define exactly what your project history looks like; however, it also creates
the potential of losing commits. Git provides its history-rewriting commands under
the disclaimer that using them may result in lost content.
Git has several mechanisms for storing history and saving changes. These
mechanisms include git commit --amend, git rebase, and git reflog. These options
give you powerful workflow customization options. By the end of this tutorial, you'll
be familiar with commands that will let you restructure your Git commits, and be
able to avoid pitfalls that are commonly encountered when rewriting history.
Let's say you just committed and you made a mistake in your commit log message.
Running git commit --amend when there is nothing staged lets you edit the previous
commit’s message without altering its snapshot.
Premature commits happen all the time in the course of your everyday development.
It’s easy to forget to stage a file or to format your commit message the wrong way.
The --amend flag is a convenient way to fix these minor mistakes.
Adding the -m option allows you to pass in a new message from the command line
without being prompted to open an editor.
The --no-edit flag will allow you to make the amendment to your commit without
changing its commit message. The resulting commit will replace the incomplete one,
and it will look like we committed the changes to hello.py and main.py in a single
snapshot.
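The amend scenarios above can be sketched as follows. The hello.py and main.py file names come from the text; the repository, file contents, and commit messages are assumptions for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo 'print("hello")' > hello.py
echo 'print("main")' > main.py

# Oops: only hello.py was staged before committing.
git add hello.py
git commit -q -m "Add hello.py and main.py"
first=$(git rev-parse HEAD)

# Stage the forgotten file and fold it into the previous commit,
# keeping the existing commit message.
git add main.py
git commit -q --amend --no-edit

# With nothing staged, -m replaces only the message, with no editor prompt.
git commit -q --amend -m "Add hello.py and main.py scripts"
amended=$(git rev-parse HEAD)

git --no-pager log --oneline --stat
```

There is still only one commit, but its ID has changed, which is why amending commits that others have already pulled is discouraged.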
Don’t amend public commits
Amended commits are actually entirely new commits and the previous commit will
no longer be on your current branch. This has the same consequences as resetting a
public snapshot. Avoid amending a commit that other developers have based their
work on. This is a confusing situation for developers to be in and it’s complicated to
recover from.
Recap
To review, git commit --amend lets you take the most recent commit and add new
staged changes to it. You can add or remove changes from the Git staging area to
apply with a --amend commit. If there are no changes staged, a --amend will still
prompt you to modify the last commit message log. Be cautious when using --amend
on commits shared with other team members. Amending a commit that is shared
with another user will potentially require confusing and lengthy merge conflict
resolutions.
To modify older or multiple commits, you can use git rebase to combine a
sequence of commits into a new base commit. In standard mode, git rebase allows
you to literally rewrite history — automatically applying commits in your current
working branch to the passed branch head. Since your new commits will be
replacing the old, it's important to not use git rebase on commits that have been
pushed public, or it will appear that your project history disappeared.
In these or similar instances where it's important to preserve a clean project history,
adding the -i option to git rebase allows you to run the rebase interactively. This
gives you the opportunity to alter individual commits in the process, rather than
moving all commits. You can learn more about interactive rebasing and additional
rebase commands on the git rebase page.
Multiple messages
Each regular Git commit will have a log message explaining what happened in the
commit. These messages provide valuable insight into the project history. During a
rebase, you can run a few commands on commits to modify commit messages.
Reword or 'r' will stop rebase playback and let you rewrite the individual
commit message.
Squash or 's': during rebase playback, Git will pause on any commits marked s
and prompt you to edit the separate commit messages into a
combined message. More on this in the squash commits section below.
Fixup or 'f' has the same combining effect as squash. Unlike squash, fixup
commits will not interrupt rebase playback to open an editor to combine
commit messages. The commits marked 'f' will have their messages discarded
in favor of the previous commit's message.
Squash commits for a clean history
The s "squash" command is where we see the true utility of rebase. Squash allows
you to specify which commits you want to merge into the previous commits. This is
what enables a "clean history." During rebase playback, Git will execute the specified
rebase command for each commit. In the case of squash commits, Git will open your
configured text editor and prompt to combine the specified commit messages. This
entire process can be visualized as follows:
Note that the commits modified with a rebase command have a different ID than
either of the original commits. Commits marked with pick will have a new ID if the
previous commits have been rewritten.
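The effect on commit IDs can be checked with a scripted fixup. This is a sketch under assumptions: the repository, file names, and messages are invented, and the todo list is edited by a sed command passed through GIT_SEQUENCE_EDITOR rather than by hand.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "readme" > README.txt
git add README.txt
git commit -q -m "initial commit"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
echo "feature, fixed" > feature.txt
git commit -q -am "fix typo"
echo "docs" > docs.txt
git add docs.txt
git commit -q -m "add docs"
old_docs=$(git rev-parse HEAD)

# Todo list for HEAD~3: pick "add feature", pick "fix typo", pick "add docs".
# Mark line 2 as fixup: its changes are folded into "add feature" and its
# message is discarded, so no editor is opened.
GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/fixup/"' git rebase -i HEAD~3
new_docs=$(git rev-parse HEAD)

git --no-pager log --oneline
```

The "add docs" commit was only picked, yet it receives a new ID because the commit before it was rewritten.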
git reflog
This page provides a detailed discussion of the git reflog command. Git keeps
track of updates to the tip of branches using a mechanism called reference logs, or
"reflogs." Many Git commands accept a parameter for specifying a reference or "ref",
which is a pointer to a commit. Common examples include:
git checkout
git reset
git merge
Reflogs track when Git refs were updated in the local repository. In addition to
branch tip reflogs, a special reflog is maintained for the Git stash. Reflogs are stored
in directories under the local repository's .git directory. git reflog directories can
be found at .git/logs/refs/heads/., .git/logs/HEAD, and also
.git/logs/refs/stash if the git stash has been used on the repo.
Basic usage
git reflog
This will output the HEAD reflog. You should see output similar to:
Reflog references
By default, git reflog will output the reflog of the HEAD ref. HEAD is a symbolic
reference to the currently active branch. Reflogs are available for other refs as well.
The syntax to access a git ref is name@{qualifier}. In addition to HEAD refs, other
branches, tags, remotes, and the Git stash can be referenced as well.
You can get a complete reflog of all refs by executing git reflog show --all.
To see the reflog for a specific branch, pass that branch name to git reflog show,
for example git reflog show otherbranch.
Executing this example will show a reflog for the otherbranch branch. The following
example assumes you have previously stashed some changes using the git stash
command.
This will output a reflog for the Git stash. The returned ref pointers can be passed to
other Git commands:
When executed, this example code will display Git diff output comparing the
stash@{0} changes against the otherbranch@{0} ref.
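The reflog references discussed above can be tried out in a scratch repository. In this sketch the otherbranch name comes from the text, while the repository, the notes.txt file, and its contents are assumptions for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

git checkout -q -b otherbranch
echo "v2" > notes.txt
git commit -q -am "work on otherbranch"
git checkout -q master

# Stash an uncommitted change so that a stash reflog exists.
echo "uncommitted" >> notes.txt
git stash -q

git --no-pager reflog show --all          # reflogs of all refs
git --no-pager reflog show otherbranch    # reflog of one branch
git --no-pager reflog show stash          # reflog of the stash

# Reflog references can be passed to other commands.
git --no-pager diff 'stash@{0}' 'otherbranch@{0}'
```

The quotes around stash@{0} and otherbranch@{0} keep the shell from interpreting the braces.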
Timed reflogs
Every reflog entry has a timestamp attached to it. These timestamps can be
leveraged as the qualifier token of Git ref pointer syntax. This enables filtering Git
reflogs by time. The following are some examples of available time qualifiers:
1.minute.ago
1.hour.ago
1.day.ago
yesterday
1.week.ago
1.month.ago
1.year.ago
2011-05-17.09:00:00
For example, git diff master@{0} master@{1.day.ago} will diff the current master
branch against master as it was 1 day ago. This is very useful if you want to know
what changes have occurred within a time frame.
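A time qualifier only resolves to something interesting if the reflog reaches far enough back, so the sketch below backdates its first commit by two days using the GIT_COMMITTER_DATE environment variable. That backdating trick, the repository, and the file contents are assumptions made for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Record the first commit (and its reflog entry) as two days old.
two_days_ago=$(( $(date +%s) - 172800 ))
echo "v1" > file.txt
git add file.txt
GIT_COMMITTER_DATE="@$two_days_ago +0000" git commit -q -m "old commit"
old=$(git rev-parse HEAD)

# A second commit made "now".
echo "v2" > file.txt
git commit -q -am "new commit"

# Compare the current master against master as it was one day ago.
git --no-pager diff 'master@{0}' 'master@{1.day.ago}'

resolved=$(git rev-parse 'master@{1.day.ago}')
```

Here master@{1.day.ago} resolves to the older commit, because that is where master pointed one day ago according to the reflog.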
By default, the reflog expiration date is set to 90 days. An expire time can be
specified by passing a command line argument --expire=time to git reflog expire
or by setting a git configuration name of gc.reflogExpire.
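Both ways of specifying an expire time can be sketched as below. The 30-day value is an arbitrary assumption chosen for illustration, as is the throwaway repository.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Persistently shorten the expiration time for this repository.
git config gc.reflogExpire "30.days.ago"

# One-off: prune reflog entries older than 30 days from all refs.
# Nothing is removed here because every entry is only seconds old.
git reflog expire --expire=30.days.ago --all

git --no-pager reflog
```

Expired reflog entries cannot be brought back, so shortening the expiration time also shortens the window in which the reflog can act as a safety net.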
Git never really loses anything, even when performing history rewriting operations
like rebasing or commit amending. For the next example let's assume that we have
made some new changes to our repo. Our git log --pretty=oneline looks like the
following:
338fbcb41de10f7f2e54095f5649426cb4bf2458 extended content
1e63ceab309da94256db8fb1f35b1678fb74abd4 bunch of content
c49257493a95185997c87e0bc3a9481715270086 flesh out intro
eff544f986d270d7f97c77618314a06f024c7916 migrate existing
bf871fd762d8ef2e146d7f0226e81a92f91975ad Add Git Reflog ou
35aee4a4404c42128bee8468a9517418ed0eb3dc initial commit ad
We then commit those changes as "some WIP changes", so the log gains a new entry
at the top.
Next we perform an interactive rebase. During the rebase, we mark commits for
squash with the s rebase subcommand and squash a few commits into the most recent
"some WIP changes" commit.
If we examine git log at this point it appears that we no longer have the commits
that were marked for squashing. What if we want to operate on one of the squashed
commits? Maybe to remove its changes from history? This is an opportunity to
leverage the reflog.
git reflog
37656e1 HEAD@{0}: rebase -i (finish): returning to refs/he
37656e1 HEAD@{1}: rebase -i (start): checkout origin/maste
37656e1 HEAD@{2}: commit: some WIP changes
We can see there are reflog entries for the start and finish of the rebase and prior to
those is our "some WIP changes" commit. We can pass the reflog ref to git reset
and reset to a commit that was before the rebase.
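The recovery can be sketched end to end. This illustration makes two assumptions worth flagging: the squash is scripted through GIT_SEQUENCE_EDITOR instead of an editor, and the reset uses the branch's own reflog (master@{1}) rather than a HEAD@{n} entry, because the number of HEAD entries a rebase writes depends on how many steps it performs. The repository and commit messages are also invented for the demo.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

for msg in "initial commit" "flesh out intro" "some WIP changes"; do
  echo "$msg" >> article.txt
  git add article.txt
  git commit -q -m "$msg"
done
before=$(git rev-parse master)

# Squash the last commit into the one before it.
GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/squash/"' GIT_EDITOR=true \
  git rebase -i HEAD~2

# The HEAD reflog lists the rebase steps and the commits before them.
git --no-pager reflog

# The branch reflog received a single entry for the whole rebase,
# so master@{1} is the tip of master from before the rebase.
git reset -q --hard 'master@{1}'
```

After the reset, git log shows the three original commits again.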
Summary
In this tutorial we discussed the git reflog command. Some key points covered
were:
We briefly mentioned that git reflog can be used with other git commands like git
checkout, git reset, and git merge. Learn more at their respective pages. For
additional discussion on refs and the reflog, learn more here.
git syncing
SVN uses a single centralized repository to serve as the communication hub for
developers, and collaboration takes place by passing changesets between the
developers’ working copies and the central repository. This is different from Git's
distributed collaboration model, which gives every developer their own copy of the
repository, complete with its own local history and branch structure. Users typically
need to share a series of commits rather than a single changeset. Instead of
committing a changeset from a working copy to the central repository, Git lets you
share entire branches between repositories.
The git remote command is one piece of the broader system which is responsible
for syncing changes. Records registered through the git remote command are used
in conjunction with the git fetch, git push, and git pull commands. These
commands all have their own syncing responsibilities which can be explored on the
corresponding links.
Git remote
The git remote command lets you create, view, and delete connections to other
repositories. Remote connections are more like bookmarks than direct links
into other repositories. Instead of providing real-time access to another repository,
they serve as convenient names that can be used to reference a not-so-convenient
URL.
For example, the following diagram shows two remote connections from your repo
into the central repo and another developer’s repo. Instead of referencing them by
their full URLs, you can pass the origin and john shortcuts to other Git commands.
git remote
List the remote connections you have to other repositories.
git remote -v
Same as the above command, but include the URL of each connection.
When you clone a repository with git clone, it automatically creates a remote
connection called origin pointing back to the cloned repository. This is useful for
developers creating a local copy of a central repository, since it provides an easy way
to pull upstream changes or publish local commits. This behavior is also why most
Git-based projects call their central repository origin.
Repository URLs
Git supports many ways to reference a remote repository. Two of the easiest ways to
access a remote repo are via the HTTP and the SSH protocols. HTTP is an easy way to
allow anonymous, read-only access to a repository. For example:
http://host/path/to/repo.git
But, it’s generally not possible to push commits to an HTTP address (you wouldn’t
want to allow anonymous pushes anyway). For read-write access, you should use
SSH instead:
ssh://user@host/path/to/repo.git
You’ll need a valid SSH account on the host machine, but other than that, Git
supports authenticated access via SSH out of the box. Modern secure 3rd party
hosting solutions like Bitbucket.com will provide these URLs for you.
The git remote command is one of many Git commands that takes additional
appended 'subcommands'. Below is an examination of the commonly used
git remote subcommands.
ADD <NAME> <URL>
Adds a record to ./.git/config for a remote named <name> at the repository URL
<url>.
Accepts a -f option, which will git fetch <name> immediately after the remote record
is created.
Accepts a --tags option, which will git fetch <name> immediately and import every
tag from the remote repository.
RENAME <OLD> <NEW>
Updates ./.git/config to rename the record <OLD> to <NEW>.
REMOVE or RM <NAME>
Removes the record for the remote named <NAME> from ./.git/config.
GET-URL <NAME>
Outputs the URLs for a remote record.
Accepts --push, in which case push URLs are queried rather than fetch URLs.
SHOW <NAME>
Outputs detailed information about the remote <NAME>.
PRUNE <NAME>
Deletes any local branches for <NAME> that are not present on the remote
repository.
Accepts a --dry-run option, which will list what branches are set to be pruned, but
will not actually prune them.
By default, the git remote command will list previously stored remote connections
to other repositories. This produces single-line output listing the "bookmark"
names of the remote repos.
$ git remote
origin
upstream
other_users_repo
Invoking git remote with the -v option will print the list of bookmarked repository
names and additionally, the corresponding repository URL. The -v option stands for
"verbose". Below is example output of verbose git remote output.
git remote -v
origin git@bitbucket.com:origin_user/reponame.git (fetch)
origin git@bitbucket.com:origin_user/reponame.git (push)
upstream https://bitbucket.com/upstream_user/reponame.g
upstream https://bitbucket.com/upstream_user/reponame.g
other_users_repo https://bitbucket.com/other_users_repo
other_users_repo https://bitbucket.com/other_users_repo
The git remote add command will create a new connection record to a remote
repository. After adding a remote, you’ll be able to use <name> as a convenient
shortcut for <url> in other Git commands. For more information on the accepted
URL syntax, view the "Repository URLs" section above. This command will create a
new record within the repository's ./.git/config. An example of this config file
update follows:
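As a sketch of the config update, the session below adds a remote and prints the resulting ./.git/config. The remote name remote_test is the one used later in the text; the local bare repository standing in for a hosted URL is an assumption for the demo.

```shell
set -e
# Scratch directory with a bare repository standing in for a hosted remote
# (hypothetical; created only for this demo).
work=$(mktemp -d)
git init -q --bare "$work/remote_test.git"
git init -q "$work/local"
cd "$work/local"

# Register the remote under the name remote_test.
git remote add remote_test "$work/remote_test.git"

# The name and URL now appear in the list of remotes...
git remote -v

# ...and as a [remote "remote_test"] section in the config file,
# with url and fetch entries.
cat .git/config
```

With a hosted repository you would pass its HTTPS or SSH URL instead of a local path.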
Inspecting a Remote
The show subcommand can be appended to git remote to give detailed output on
the configuration of a remote. This output will contain a list of branches associated
with the remote and also the endpoints attached for fetching and pushing.
Once a remote record has been configured through the use of the git remote
command, the remote name can be passed as an argument to other Git commands
to communicate with the remote repo. Both git fetch and git pull can be used to
read from a remote repository. Both commands have different operations that are
explained in further depth on their respective links.
The git push command is used to write to a remote repository. For example,
git push <remote-name> <branch-name> will upload the local state of <branch-name>
to the remote repository specified by <remote-name>.
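A minimal sketch of writing to a remote follows. The bare repository central.git, the remote name origin, and the log.txt file are assumptions for the demo.

```shell
set -e
# Scratch directory with a bare repository standing in for the remote
# (hypothetical; created only for this demo).
work=$(mktemp -d)
git init -q --bare "$work/central.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "station log" > log.txt
git add log.txt
git commit -q -m "initial commit"

# Register the remote, then upload the local state of master to it.
git remote add origin "$work/central.git"
git push -q origin master

# The remote now has a master branch at the same commit.
git ls-remote origin
```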
Renaming and Removing Remotes
The command git remote rm will remove the connection to the remote repository
specified by the <name> parameter. To demonstrate let us 'undo' the remote
addition from our last example. If we execute git remote rm remote_test, and then
examine the contents of ./.git/config we can see that the [remote "remote_test"]
record is no longer there.
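Renaming and removing can be sketched together. The example.com URL and the new name renamed_test are placeholders invented for the demo; neither command contacts the remote, so the URL does not need to exist.

```shell
set -e
# Throwaway repository (hypothetical; created only for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder URL; nothing is fetched from it.
git remote add remote_test https://example.com/some_user/some_repo.git

# Rename the remote record, then list it.
git remote rename remote_test renamed_test
git remote -v

# Remove the record again.
git remote rm renamed_test

# No remotes are left, and the [remote ...] section is gone from the config.
git remote
cat .git/config
```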
git fetch
The git fetch command downloads commits, files, and refs from a remote
repository into your local repo. Fetching is what you do when you want to see what
everybody else has been working on. It’s similar to svn update in that it lets you see
how the central history has progressed, but it doesn’t force you to actually merge
the changes into your repository. Git isolates fetched content from existing local
content; it has absolutely no effect on your local development work. Fetched content
has to be explicitly checked out using the git checkout command. This makes
fetching a safe way to review commits before integrating them with your local
repository.
When downloading content from a remote repo, the git pull and git fetch commands
are available to accomplish the task. You can consider git fetch the 'safe' version of
the two commands. It will download the remote content but not update your local
repo's working state, leaving your current work intact. git pull is the more
aggressive alternative: it will download the remote content for the active local branch
and immediately execute git merge to create a merge commit for the new remote
content. If you have pending changes in progress, this will cause conflicts and kick off
the merge conflict resolution flow.
To better understand how git fetch works let us discuss how Git organizes and
stores commits. Behind the scenes, in the repository's ./.git/objects directory, Git
stores all commits, local and remote. Git keeps remote and local branch commits
distinctly separate through the use of branch refs. The refs for local branches are
stored in the ./.git/refs/heads/. Executing the git branch command will output a
list of the local branch refs. The following is an example of git branch output with
some demo branch names.
git branch
master
feature1
debug2
ls ./.git/refs/heads/
master
feature1
debug2
Remote branches are just like local branches, except they map to commits from
somebody else’s repository. Remote branches are prefixed by the remote they
belong to so that you don’t mix them up with local branches. Like local branches, Git
also has refs for remote branches. Remote branch refs live in the
./.git/refs/remotes/ directory. The next example code snippet shows the branches
you might see after fetching a remote repo conveniently named remote-repo:
git branch -r
# origin/master
# origin/feature1
# origin/debug2
# remote-repo/master
# remote-repo/other-feature
This output displays the local branches we had previously examined but now
displays them prefixed with origin/. Additionally, we now see the remote branches
prefixed with remote-repo. You can check out a remote branch just like a local one,
but this puts you in a detached HEAD state (just like checking out an old commit).
You can think of them as read-only branches. To view your remote branches, simply
pass the -r flag to the git branch command.
You can inspect remote branches with the usual git checkout and git log
commands. If you approve the changes a remote branch contains, you can merge it
into a local branch with a normal git merge. So, unlike SVN, synchronizing your local
repository with a remote repository is actually a two-step process: fetch, then merge.
The git pull command is a convenient shortcut for this process.
git fetch <remote>
Fetch all of the branches from the repository. This also downloads all of the required
commits and files from the other repository.
git fetch <remote> <branch>
Same as the above command, but only fetch the specified branch.
git fetch --all
A power move which fetches all registered remotes and their branches.
git fetch --dry-run
The --dry-run option will perform a demo run of the command. It will output
examples of actions it would take during the fetch but not apply them.
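The fetch variants can be compared in a scratch setup. In this sketch the remote name remote-repo and the branch other-feature echo the earlier listing; the directories and file names are assumptions for the demo.

```shell
set -e
# Scratch directory with a second repository acting as the remote
# (hypothetical; created only for this demo).
work=$(mktemp -d)
git init -q "$work/remote-repo"
cd "$work/remote-repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "a" > a.txt
git add a.txt
git commit -q -m "initial commit"
git branch other-feature

git init -q "$work/local"
cd "$work/local"
git remote add remote-repo "$work/remote-repo"

# Preview what would be fetched; no refs are updated.
git fetch --dry-run remote-repo

# Fetch a single branch.
git fetch remote-repo other-feature

# Fetch every branch of one remote.
git fetch remote-repo

# Fetch every branch of every registered remote.
git fetch --all

git --no-pager branch -r
```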
Firstly we will need to configure the remote repo using the git remote command.
Here we have created a reference to the coworker's repo using the repo URL. We will
now pass that remote name to git fetch to download the contents.
You are in 'detached HEAD' state. You can look around, mak
changes and commit them, and you can discard any commits y
state without impacting any branches by performing another
The output from this checkout operation indicates that we are in a detached HEAD
state. This is expected and means that our HEAD ref is pointing to a ref that is not in
sequence with our local history. Because HEAD is pointed at the
coworkers/feature_branch ref, we can create a new local branch from that ref. The
'detached HEAD' output shows us how to do this using the git checkout command:
Here we have created a new local branch named local_feature_branch. This
updates HEAD to point at the latest remote content, and we can continue
development on it from this point.
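The coworker workflow above can be sketched with two local repositories. The names coworkers, feature_branch, and local_feature_branch come from the text; the directory layout and file names are assumptions, and a local path stands in for the coworker's repo URL.

```shell
set -e
# Scratch directory; one repository plays the coworker's repo
# (hypothetical; created only for this demo).
work=$(mktemp -d)
git init -q "$work/coworkers_repo"
cd "$work/coworkers_repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Coworker"
git config user.email "coworker@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
git checkout -q -b feature_branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "coworker's feature"

# Our own repository.
git init -q "$work/local"
cd "$work/local"

# Create a reference to the coworker's repo and download its contents.
git remote add coworkers "$work/coworkers_repo"
git fetch -q coworkers

# Checking out the remote branch puts us in a detached HEAD state.
git checkout coworkers/feature_branch

# Create a local branch from that point to continue development.
git checkout -b local_feature_branch
```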
The following example walks through the typical workflow for synchronizing your
local repository with the central repository's master branch.
The commits from these new remote branches are shown as squares instead of
circles in the diagram below. As you can see, git fetch gives you access to the
entire branch structure of another repository.
To see what commits have been added to the upstream master, you can run a
git log using origin/master as a filter:
To approve the changes and merge them into your local master branch, use the
following commands:
The origin/master and master branches now point to the same commit, and you are
synchronized with the upstream developments.
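The fetch, inspect, and merge steps of this workflow can be tried in a sandbox. This is a sketch under stated assumptions: central.git is a local bare repository standing in for the central server, and mine and teammate are hypothetical clones.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# A throwaway "central" repository with one commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git

git clone -q central.git mine
git clone -q central.git teammate

# The teammate publishes a new commit upstream.
echo "v2" >> teammate/station.txt
git -C teammate commit -q -am "Update station"
git -C teammate push -q origin master

cd mine
git fetch -q origin                               # step 1: download
git log --oneline master..origin/master           # review what is new upstream
new_commits=$(git rev-list --count master..origin/master)

git checkout -q master
git merge -q origin/master                        # step 2: integrate
```

Because the local master had no commits of its own, the merge here is a fast-forward and both refs end up on the same commit.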
git push <remote> <branch>
Push the specified branch to <remote>, along with all of the necessary commits and
internal objects. This creates a local branch in the destination repository. To prevent
you from overwriting commits, Git won’t let you push when it results in a non-fast-
forward merge in the destination repository.
git push <remote> --force
Same as the above command, but force the push even if it results in a non-fast-
forward merge. Do not use the --force flag unless you’re absolutely sure you know
what you’re doing.
git push <remote> --all
Push all of your local branches to the specified remote.
git push <remote> --tags
Tags are not automatically pushed when you push a branch or use the --all option.
The --tags flag sends all of your local tags to the remote repository.
git push is most commonly used to publish and upload local changes to a central
repository. After a local repository has been modified, a push is executed to share the
modifications with remote team members.
The above diagram shows what happens when your local master has progressed
past the central repository’s master and you publish changes by running
git push origin master. Notice how git push is essentially the same as running
git merge master from inside the remote repository.
git push is one component of many used in the overall Git "syncing" process. The
syncing commands operate on remote branches which are configured using the
git remote command. git push can be considered an 'upload' command, whereas
git fetch and git pull can be thought of as 'download' commands. Once
changesets have been moved via a download or upload, a git merge may be
performed at the destination to integrate the changes.
Force Pushing
Git prevents you from overwriting the central repository’s history by refusing push
requests when they result in a non-fast-forward merge. So, if the remote history has
diverged from your history, you need to pull the remote branch and merge it into
your local one, then try pushing again. This is similar to how SVN makes you
synchronize with the central repository via svn update before committing a
changeset.
The --force flag overrides this behavior and makes the remote repository’s branch
match your local one, deleting any upstream changes that may have occurred since
you last pulled. The only time you should ever need to force push is when you realize
that the commits you just shared were not quite right and you fixed them with a
git commit --amend or an interactive rebase. However, you must be absolutely
certain that none of your teammates have pulled those commits before using the
--force option.
Examples
The following example describes one of the standard methods for publishing local
contributions to the central repository. First, it makes sure your local master is up-to-
date by fetching the central repository’s copy and rebasing your changes on top of
them. The interactive rebase is also a good opportunity to clean up your commits
before sharing them. Then, the git push command sends all of the commits on your
local master to the central repository.
Since we already made sure the local master was up-to-date, this should result in a
fast-forward merge, and git push should not complain about any of the non-fast-
forward issues discussed above.
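The publishing sequence described in this example can be sketched as follows. Assumptions: central.git is a local bare repository standing in for the central server, the clone names are hypothetical, and a plain git rebase replaces the interactive rebase so the script runs without opening an editor.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git
git clone -q central.git mine
git clone -q central.git teammate

# Upstream gains a commit while we work locally.
echo "docking schedule" > teammate/schedule.txt
git -C teammate add schedule.txt
git -C teammate commit -q -m "Add schedule"
git -C teammate push -q origin master

cd mine
echo "supply list" > supplies.txt
git add supplies.txt
git commit -q -m "Add supplies"

git checkout -q master
git fetch -q origin master        # get the central copy of master
git rebase -q origin/master       # replay local commits on top of it
git push -q origin master         # now a fast-forward for the remote
```

After the rebase, the local commit sits directly on top of the fetched history, so the push is accepted without a merge commit.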
The git commit command accepts a --amend option which will update the previous
commit. A commit is often amended to update the commit message or add new
changes. Once a commit that has already been pushed is amended, a plain git push
will fail because Git sees the amended commit and the remote commit as diverged
content. The --force option must be used to push an amended commit.
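The amend-then-force-push sequence can be observed in a sandbox. This is a sketch, assuming a local bare repository (central.git) in place of the real remote; the file name and commit messages are made up for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git
git clone -q central.git mine
cd mine

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin master

# Rewrite the commit that was just shared.
git commit -q --amend -m "Add mission notes"

# A plain push is refused because the histories have diverged.
if git push -q origin master 2>/dev/null; then plain_push=accepted; else plain_push=rejected; fi
echo "plain push: $plain_push"

# Forcing replaces the remote branch with the local one.
git push -q --force origin master
```

The forced push discards the original commit on the remote, which is why it is only safe when nobody else has pulled it.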
Passing a branch name prefixed with a colon to git push deletes that branch on the
remote. For example, git push <remote> :branch_name deletes the remote branch
named branch_name.
git pull
The git pull command is used to fetch and download content from a remote
repository and immediately update the local repository to match that content.
Merging remote upstream changes into your local repository is a common task in
Git-based collaboration workflows. The git pull command is actually a
combination of two other commands, git fetch followed by git merge. In the first
stage of operation, git pull will execute a git fetch scoped to the local branch
that HEAD is pointed at. Once the content is downloaded, git pull will enter a
merge workflow. A new merge commit will be created and HEAD updated to point at
the new commit.
How it works
The git pull command first runs git fetch which downloads content from the
specified remote repository. Then a git merge is executed to merge the remote
content refs and heads into a new local merge commit. To better demonstrate the
pull and merging process let us consider the following example. Assume we have a
repository with a master branch and a remote origin.
In this scenario, git pull will download all the changes from the point where the
local master and the remote master diverged. In this example, that point is E.
git pull will fetch the diverged remote commits, which are A-B-C. The pull process
will then create a new local merge commit containing the content of the new
diverged remote commits.
In the above diagram, we can see the new commit H. This commit is a new merge
commit that contains the contents of remote A-B-C commits and has a combined
log message. This example is one of a few git pull merging strategies. A --rebase
option can be passed to git pull to use a rebase merging strategy instead of a
merge commit. The next example will demonstrate how a rebase pull works. Assume
that we are at a starting point of our first diagram, and we have executed
git pull --rebase.
In this diagram, we can now see that a rebase pull does not create the new H
commit. Instead, the rebase has placed the remote commits A--B--C in the local
history and replayed the local commits on top of them.
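The difference between the two pull styles shows up in the number of merge commits each leaves behind. The sketch below assumes local stand-in repositories (central.git, teammate, merger, rebaser), all hypothetical names. The strategy is passed explicitly because recent Git versions refuse to reconcile diverged branches until one is chosen.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git
git clone -q central.git teammate
git clone -q central.git merger
git clone -q central.git rebaser

# Upstream moves on...
echo "remote work" > teammate/remote.txt
git -C teammate add remote.txt
git -C teammate commit -q -m "Remote commit"
git -C teammate push -q origin master

# ...while each local clone gains its own commit.
for repo in merger rebaser; do
  echo "local work" > "$repo/local.txt"
  git -C "$repo" add local.txt
  git -C "$repo" commit -q -m "Local commit"
done

git -C merger pull -q --no-rebase --no-edit origin master   # fetch + merge
git -C rebaser pull -q --rebase origin master               # fetch + rebase

merge_count=$(git -C merger rev-list --count --merges HEAD)
rebase_count=$(git -C rebaser rev-list --count --merges HEAD)
echo "merge commits after plain pull:    $merge_count"
echo "merge commits after pull --rebase: $rebase_count"
```

In the rebaser clone the local commit ends up directly on top of origin/master, giving a linear history.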
Common Options
git pull <remote>
Fetch the specified remote’s copy of the current branch and immediately merge it
into the local copy. This is the same as git fetch <remote> followed by
git merge origin/<current-branch>.
git pull --no-commit <remote>
Similar to the default invocation, fetches the remote content but does not create a
new merge commit.
git pull --rebase <remote>
Like the default invocation, but instead of using git merge to integrate the remote
branch with the local one, it uses git rebase.
git pull --verbose
Gives verbose output during a pull which displays the content being downloaded
and the merge details.
You can think of git pull as Git's version of svn update. It’s an easy way to
synchronize your local repository with upstream changes. The following diagram
explains each step of the pulling process.
You start out thinking your repository is synchronized, but then git fetch reveals
that origin's version of master has progressed since you last checked it. Then
git merge immediately integrates the remote master into the local one.
git pull is one of many commands that claim the responsibility of 'syncing' remote
content. The git remote command is used to specify what remote endpoints the
syncing commands will operate on. The git push command is used to upload
content to a remote repository.
The git fetch command can be confused with git pull. They are both used to
download remote content. An important safety distinction can be made between
git pull and git fetch. git fetch can be considered the "safe" option, whereas
git pull can be considered unsafe. git fetch will download the remote content
and not alter the state of the local repository. Alternatively, git pull will download
remote content and immediately attempt to change the local state to match that
content. This may unintentionally cause the local repository to get in a conflicted
state.
In fact, pulling with --rebase is such a common workflow that there is a dedicated
configuration option for it:
After running that command, all git pull commands will integrate via git rebase
instead of git merge.
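The command itself is not shown in this copy of the tutorial. As a hedged sketch, two real Git settings fit the description: pull.rebase, which makes git pull rebase on every branch, and branch.autosetuprebase, which marks newly created tracking branches for rebase. The example stores them in a throwaway repository; adding --global would apply them to every repository for the current user.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo

# Repository-local settings; use `git config --global ...` to apply them everywhere.
git config pull.rebase true
git config branch.autosetuprebase always

git config --get pull.rebase
git config --get branch.autosetuprebase
```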
The following examples demonstrate how to use git pull in common scenarios:
Default Behavior
git pull
This simply moves your local changes onto the top of what everybody else has
already contributed.
Making a Pull Request
Pull requests are a feature that makes it easier for developers to collaborate
using Bitbucket. They provide a user-friendly web interface for discussing proposed
changes before integrating them into the official project.
In their simplest form, pull requests are a mechanism for a developer to notify team
members that they have completed a feature. Once their feature branch is ready, the
developer files a pull request via their Bitbucket account. This lets everybody
involved know that they need to review the code and merge it into the master
branch.
But, the pull request is more than just a notification—it’s a dedicated forum for
discussing the proposed feature. If there are any problems with the changes,
teammates can post feedback in the pull request and even tweak the feature by
pushing follow-up commits. All of this activity is tracked directly inside of the pull
request.
Compared to other collaboration models, this formal solution for sharing commits
makes for a much more streamlined workflow. SVN and Git can both automatically
send notification emails with a simple script; however, when it comes to discussing
changes, developers typically have to rely on email threads. This can become
haphazard, especially when follow-up commits are involved. Pull requests put all of
this functionality into a friendly web interface right next to your Bitbucket
repositories.
Anatomy of a Pull Request
When you file a pull request, all you’re doing is requesting that another developer
(e.g., the project maintainer) pulls a branch from your repository into their repository.
This means that you need to provide 4 pieces of information to file a pull request:
the source repository, the source branch, the destination repository, and the
destination branch.
How it works
4. The rest of the team reviews the code, discusses it, and alters it.
5. The project maintainer merges the feature into the official repository and
closes the pull request.
The rest of this section describes how pull requests can be leveraged against
different collaboration workflows.
After receiving the pull request, the project maintainer has to decide what to do. If
the feature is ready to go, they can simply merge it into master and close the pull
request. But, if there are problems with the proposed changes, they can post
feedback in the pull request. Follow-up commits will show up right next to the
relevant comments.
It’s also possible to file a pull request for a feature that is incomplete. For example, if
a developer is having trouble implementing a particular requirement, they can file a
pull request containing their work-in-progress. Other developers can then provide
suggestions inside of the pull request, or even fix the problem themselves with
additional commits.
The mechanics of pull requests in the Gitflow Workflow are the exact same as the
previous section: a developer simply files a pull request when a feature, release, or
hotfix branch needs to be reviewed, and the rest of the team will be notified via
Bitbucket.
Features are generally merged into the develop branch, while release and hotfix
branches are merged into both develop and master. Pull requests can be used to
formally manage all of these merges.
Since each developer has their own public repository, the pull request’s source
repository will differ from its destination repository. The source repository is the
developer’s public repository and the source branch is the one that contains the
proposed changes. If the developer is trying to merge the feature into the main
codebase, then the destination repository is the official project and the destination
branch is master.
Pull requests can also be used to collaborate with other developers outside of the
official project. For example, if a developer was working on a feature with a
teammate, they could file a pull request using the teammate’s Bitbucket repository
for the destination instead of the official project. They would then use the same
feature branch for the source and destination branches.
The two developers could discuss and develop the feature inside of the pull request.
When they’re done, one of them would file another pull request asking to merge the
feature into the official master branch. This kind of flexibility makes pull requests a
very powerful collaboration tool in the Forking Workflow.
Example
The example below demonstrates how pull requests can be used in the Forking
Workflow. It is equally applicable to developers working in small teams and to a
third-party developer contributing to an open source project.
In the example, Mary is a developer, and John is the project maintainer. Both of them
have their own public Bitbucket repositories, and John’s contains the official project.
To start working in the project, Mary first needs to fork John’s Bitbucket repository.
She can do this by signing in to Bitbucket, navigating to John’s repository, and
clicking the Fork button.
After filling out the name and description for the forked repository, she will have a
server-side copy of the project.
Next, Mary needs to clone the Bitbucket repository that she just forked. This will give
her a working copy of the project on her local machine. She can do this by running
the following command:
Keep in mind that git clone automatically creates an origin remote that points
back to Mary’s forked repository.
Before she starts writing any code, Mary needs to create a new branch for the
feature. This branch is what she will use as the source branch of the pull request.
After her feature is complete, Mary pushes the feature branch to her own Bitbucket
repository (not the official repository) with a simple git push:
This makes her changes available to the project maintainer (or any collaborators who
might need access to them).
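Mary's clone, branch, and push steps can be rehearsed locally. This is only a sketch: the bare repositories john-official.git and mary-fork.git stand in for the two Bitbucket repositories, a second git clone --bare imitates the Fork button, and the branch name some-feature is hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=Mary GIT_AUTHOR_EMAIL=mary@example.com
export GIT_COMMITTER_NAME=Mary GIT_COMMITTER_EMAIL=mary@example.com
cd "$(mktemp -d)"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"

git clone -q --bare seed john-official.git           # John's official repository
git clone -q --bare john-official.git mary-fork.git  # stands in for the Fork button

git clone -q mary-fork.git mary                      # Mary's working copy
cd mary
origin_url=$(git config --get remote.origin.url)     # origin points at the fork
echo "origin -> $origin_url"

git checkout -q -b some-feature                      # source branch for the pull request
echo "new module" > module.txt
git add module.txt
git commit -q -m "Add some feature"
git push -q origin some-feature                      # goes to the fork, not the official repo
```

The feature branch now exists only in Mary's fork, which is exactly what the pull request form expects as its source.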
After Bitbucket has her feature branch, Mary can create the pull request through her
Bitbucket account by navigating to her forked repository and clicking the Pull
request button in the top-right corner. The resulting form automatically sets Mary’s
repository as the source repository, and it asks her to specify the source branch, the
destination repository, and the destination branch.
Mary wants to merge her feature into the main codebase, so the source branch is her
feature branch, the destination repository is John’s public repository, and the
destination branch is master. She’ll also need to provide a title and description for
the pull request. If there are other people who need to approve the code besides
John, she can enter them in the Reviewers field.
After she creates the pull request, a notification will be sent to John via his Bitbucket
feed and (optionally) via email.
John can access all of the pull requests people have filed by clicking on the Pull
request tab in his own Bitbucket repository. Clicking on Mary’s pull request will show
him a description of the pull request, the feature’s commit history, and a diff of all
the changes it contains.
If he thinks the feature is ready to merge into the project, all he has to do is hit
the Merge button to approve the pull request and merge Mary’s feature into his
master branch.
But, for this example, let’s say John found a small bug in Mary’s code, and needs her
to fix it before merging it in. He can either post a comment to the pull request as a
whole, or he can select a specific commit in the feature’s history to comment on.
To correct the error, Mary adds another commit to her feature branch and pushes it
to her Bitbucket repository, just like she did the first time around. This commit is
automatically added to the original pull request, and John can review the changes
again, right next to his original comment.
John accepts the pull request
Finally, John accepts the changes, merges the feature branch into master, and closes
the pull request. The feature is now integrated into the project, and any other
developers working on it can pull it into their own local repositories using the
standard git pull command.
You should now have all of the tools you need to start integrating pull requests into
your existing workflow. Remember, pull requests are not a replacement for any of
the Git-based collaboration workflows, but rather a convenient addition to them that
makes collaboration more accessible to all of your team members.
Git Branch
The diagram above visualizes a repository with two isolated lines of development,
one for a little feature, and one for a longer-running feature. By developing them in
branches, it’s not only possible to work on both of them in parallel, but it also keeps
the main master branch free from questionable code.
The implementation behind Git branches is much more lightweight than other
version control system models. Instead of copying files from directory to directory,
Git stores a branch as a reference to a commit. In this sense, a branch represents the
tip of a series of commits—it's not a container for commits. The history for a branch
is extrapolated through the commit relationships.
As you read, remember that Git branches aren't like SVN branches. Whereas SVN
branches are only used to capture the occasional large-scale development effort, Git
branches are an integral part of your everyday workflow. The following content will
expand on the internal Git branching architecture.
How it works
The git branch command lets you create, list, rename, and delete branches. It
doesn’t let you switch between branches or put a forked history back together again.
For this reason, git branch is tightly integrated with the git checkout and
git merge commands.
Common Options
git branch
List all of the branches in your repository.
git branch <branch>
Create a new branch called <branch>. This does not check out the new branch.
git branch -d <branch>
Delete the specified branch. This is a “safe” operation in that Git prevents you from
deleting the branch if it has unmerged changes.
git branch -D <branch>
Force delete the specified branch, even if it has unmerged changes. This is the
command to use if you want to permanently throw away all of the commits
associated with a particular line of development.
git branch -m <branch>
Rename the current branch to <branch>.
git branch -a
List all local and remote-tracking branches.
Creating Branches
It's important to understand that branches are just pointers to commits. When you
create a branch, all Git needs to do is create a new pointer; it doesn’t change the
repository in any other way. If you start with a repository that looks like this:
The repository history remains unchanged. All you get is a new pointer to the current
commit:
Note that this only creates the new branch. To start adding commits to it, you need
to select it with git checkout, and then use the standard git add and git commit
commands.
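That a new branch is only a pointer can be checked directly by comparing commit hashes. A minimal sketch in a throwaway repository, using the branch name crazy-experiment from this section; the file name is illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > station.txt
git add station.txt
git commit -q -m "Initial commit"

git branch crazy-experiment                 # creates the pointer, nothing else

before_master=$(git rev-parse master)
before_branch=$(git rev-parse crazy-experiment)
current=$(git symbolic-ref --short HEAD)    # still master: the branch was not checked out

# Select the branch, then commit to it as usual.
git checkout -q crazy-experiment
echo "experiment" >> station.txt
git commit -q -am "Start experiment"
ahead=$(git rev-list --count master..crazy-experiment)
echo "crazy-experiment is $ahead commit(s) ahead of master"
```

Both names resolve to the same commit right after git branch; only the checked-out branch moves when a commit is added.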
So far these examples have all demonstrated local branch operations. The
git branch command also works on remote branches. In order to operate on remote
branches, a remote repo must first be configured and added to the local repo config.
This command will push a copy of the local branch crazy-experiment to the remote
repo <remote>.
Deleting Branches
Once you’ve finished working on a branch and have merged it into the main code
base, you’re free to delete the branch without losing any history:
However, if the branch hasn’t been merged, the above command will output an error
message:
This protects you from losing access to that entire line of development. If you really
want to delete the branch (e.g., it’s a failed experiment), you can use the capital -D
flag:
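The difference between the two delete flags can be seen with an unmerged branch. A sketch in a throwaway repository, reusing the branch name crazy-experiment; everything else is illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > station.txt
git add station.txt
git commit -q -m "Initial commit"

# Give the branch a commit that master does not have.
git checkout -q -b crazy-experiment
echo "experiment" >> station.txt
git commit -q -am "Failed experiment"
git checkout -q master

# -d refuses because the branch is not merged into master.
if git branch -d crazy-experiment 2>/dev/null; then safe_delete=ok; else safe_delete=refused; fi
echo "git branch -d: $safe_delete"

# -D deletes it regardless.
git branch -q -D crazy-experiment
```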
The previous commands will delete a local copy of a branch. The branch may still
exist in remote repos. To delete a remote branch execute the following.
Or
This will push a delete signal to the remote origin repository that triggers a delete of
the remote crazy-experiment branch.
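Both spellings of a remote delete can be exercised against a local stand-in for the remote. This sketch assumes a bare repository central.git as origin; second-experiment is a hypothetical extra branch added so that each form deletes one branch.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/station.txt
git -C seed add station.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git
git clone -q central.git mine
cd mine

# Publish two branches so there is something to delete.
git branch crazy-experiment
git branch second-experiment
git push -q origin crazy-experiment second-experiment

git push -q origin --delete crazy-experiment   # explicit form
git push -q origin :second-experiment          # colon form

left=$(git ls-remote --heads origin crazy-experiment second-experiment)
```

The local branches still exist afterwards; only the copies on the remote are removed.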
Summary
In this document we discussed Git's branching behavior and the git branch
command. The git branch command's primary functions are to create, list, rename,
and delete branches. To operate further on the resulting branches, the command is
commonly used with other commands like git checkout. Learn more about
git checkout branch operations, such as switching branches and merging branches,
on the git checkout page.
Compared to other VCSs, Git's branch operations are inexpensive and frequently
used. This flexibility enables powerful Git workflow customization. For more info on
Git workflows visit our extended workflow discussion pages: The
Feature Branch Workflow, GitFlow Workflow, and Forking Workflow.
Git Checkout
Checking out branches is similar to checking out old commits and files in that the
working directory is updated to match the selected branch/revision; however, new
changes are saved in the project history—that is, it’s not a read-only operation.
The git checkout command lets you navigate between the branches created by
git branch. Checking out a branch updates the files in the working directory to
match the version stored in that branch, and it tells Git to record all new commits on
that branch. Think of it as a way to select which line of development you’re working
on.
Having a dedicated branch for each new feature is a dramatic shift from a traditional
SVN workflow. It makes it ridiculously easy to try new experiments without the fear
of destroying existing functionality, and it makes it possible to work on many
unrelated features at the same time. In addition, branches also facilitate several
collaborative workflows.
The git checkout command may occasionally be confused with git clone. The
difference between the two commands is that clone works to fetch code from a
remote repository, whereas checkout works to switch between versions of code
already on the local system.
Assuming the repo you're working in contains pre-existing branches, you can switch
between these branches using git checkout. To find out what branches are available
and what the current branch name is, execute git branch.
New Branches
Git checkout works hand-in-hand with git branch. The git branch command can
be used to create a new branch. When you want to start a new feature, you create a
new branch off master using git branch new_branch. Once created you can then
use git checkout new_branch to switch to that branch. Additionally, the
git checkout command accepts a -b argument that acts as a convenience method
which will create the new branch and immediately switch to it. You can work on
multiple features in a single repository by switching between them with
git checkout.
The above example simultaneously creates and checks out <new-branch>. The -b
option is a convenience flag that tells Git to run git branch <new-branch> before
running git checkout <new-branch>.
By default git checkout -b will base the new-branch off the current HEAD. An
optional additional branch parameter can be passed to git checkout. In the above
example, <existing-branch> is passed which then bases new-branch off of
existing-branch instead of the current HEAD.
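Both forms of git checkout -b can be compared by looking at where the new branch starts. A sketch in a throwaway repository; the branch names from-head, from-existing, and existing-branch are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > station.txt
git add station.txt
git commit -q -m "Initial commit"

# An existing branch that is one commit ahead of master.
git checkout -q -b existing-branch
echo "more" >> station.txt
git commit -q -am "Work on existing branch"
git checkout -q master

git checkout -q -b from-head                       # based on the current HEAD (master)
git checkout -q -b from-existing existing-branch   # based on existing-branch instead

git branch
```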
Switching Branches
Git tracks a history of checkout operations in the reflog. You can execute git reflog
to view the history.
In modern versions of Git, you can then check out the remote branch like a local
branch.
Additionally, you can check out a new local branch and reset it to the remote
branch's last commit.
Detached HEADS
Now that we’ve seen the three main uses of git checkout on branches, it's
important to discuss the “detached HEAD” state. Remember that the HEAD is Git’s way
of referring to the current snapshot. Internally, the git checkout command simply
updates the HEAD to point to either the specified branch or commit. When it points
to a branch, Git doesn't complain, but when you check out a commit, it switches into
a “detached HEAD” state.
This is a warning telling you that everything you’re doing is “detached” from the rest
of your project’s development. If you were to start developing a feature while in a
detached HEAD state, there would be no branch allowing you to get back to it. When
you inevitably check out another branch (e.g., to merge your feature in), there would
be no way to reference your feature:
The point is, your development should always take place on a branch—never on a
detached HEAD. This makes sure you always have a reference to your new commits.
However, if you’re just looking at an old commit, it doesn’t really matter if you’re in a
detached HEAD state or not.
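Whether HEAD is detached can be tested from a script, and a commit made in that state can be kept by creating a branch before moving away. A sketch in a throwaway repository; the branch name rescued-experiment is hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "one" > station.txt
git add station.txt
git commit -q -m "First commit"
echo "two" >> station.txt
git commit -q -am "Second commit"

git checkout -q HEAD~1          # check out a commit, not a branch

# symbolic-ref fails when HEAD does not point at a branch.
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "HEAD is $state"

# A commit made here belongs to no branch...
echo "experiment" > experiment.txt
git add experiment.txt
git commit -q -m "Experimental commit"
orphan=$(git rev-parse HEAD)

# ...until a branch is created to reference it.
git checkout -q -b rescued-experiment
```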
Summary
Merging is Git's way of putting a forked history back together again. The git merge
command lets you take the independent lines of development created by
git branch and integrate them into a single branch.
Note that all of the commands presented below merge into the current branch.
The current branch will be updated to reflect the merge, but the target branch will be
completely unaffected. Again, this means that git merge is often used in conjunction
with git checkout for selecting the current branch and git branch -d for deleting
the obsolete target branch.
How it works
Git merge will combine multiple sequences of commits into one unified history. In
the most frequent use cases, git merge is used to combine two branches.
The following examples in this document will focus on this branch merging
pattern. In these scenarios, git merge takes two commit pointers, usually the
branch tips, and will find a common base commit between them. Once Git finds a
common base commit it will create a new "merge commit" that combines the
changes of each queued merge commit sequence.
Say we have a new branch feature that is based off the master branch. We now want
to merge this feature branch into master.
Invoking this command will merge the specified branch feature into the
current branch, we'll assume master. Git will determine the merge
algorithm automatically (discussed below).
Merge commits differ from other commits in that they have
two parent commits. When creating a merge commit, Git will attempt to
automatically merge the separate histories for you. If Git encounters a piece of data that
is changed in both histories, it will be unable to automatically combine them.
This scenario is a version control conflict, and Git will need user intervention
to continue.
Preparing to merge
Before performing a merge there are a couple of preparation steps to take to ensure
the merge goes smoothly.
Execute git status to ensure that HEAD is pointing to the correct merge-receiving
branch. If needed, execute git checkout <receiving> to switch to the receiving
branch. In our case we will execute git checkout master.
Make sure the receiving branch and the merging branch are up-to-date with
the latest remote changes. Execute git fetch to pull the latest remote commits.
Once the fetch is completed ensure the master branch has the latest updates
by executing git pull.
Merging
Once the previously discussed "preparing to merge" steps have been taken a
merge can be initiated by executing git merge <branch name> where <branch name>
is the name of the branch that will be merged into the receiving branch.
A fast-forward merge can occur when there is a linear path from the current branch
tip to the target branch. Instead of “actually” merging the branches, all Git has to do
to integrate the histories is move (i.e., “fast forward”) the current branch tip up to the
target branch tip. This effectively combines the histories, since all of the commits
reachable from the target branch are now available through the current one. For
example, a fast forward merge of some-feature into master would look something
like the following:
However, a fast-forward merge is not possible if the branches have diverged. When there
is not a linear path to the target branch, Git has no choice but to combine them via a 3-
way merge. 3-way merges use a dedicated commit to tie together the two histories. The
nomenclature comes from the fact that Git uses three commits to generate the merge
commit: the two branch tips and their common ancestor.
While you can use either of these merge strategies, many developers like to use fast-
forward merges (facilitated through rebasing) for small features or bug fixes, while
reserving 3-way merges for the integration of longer-running features. In the latter case,
the resulting merge commit serves as a symbolic joining of the two branches.
Our first example demonstrates a fast-forward merge. The code below creates a new
branch, adds two commits to it, then integrates it into the main line with a fast-forward
merge.
# Start a new feature
git checkout -b new-feature master
# Edit some files
git add <file>
git commit -m "Start a feature"
# Edit some files
git add <file>
git commit -m "Finish a feature"
# Merge in the new-feature branch
git checkout master
git merge new-feature
git branch -d new-feature
This is a common workflow for short-lived topic branches that are used more as an
isolated development than an organizational tool for longer-running features.
Also note that Git should not complain about the git branch -d, since new-feature is
now accessible from the master branch.
In the event that you require a merge commit during a fast-forward merge for record
keeping purposes, you can execute git merge with the --no-ff option.
This command merges the specified branch into the current branch, but always generates
a merge commit (even if it was a fast-forward merge). This is useful for documenting all
merges that occur in your repository.
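The effect of --no-ff is visible in the parents of the resulting commit. A sketch in a throwaway repository, reusing the branch name new-feature from the example above; --no-edit is added so the script does not open an editor.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > station.txt
git add station.txt
git commit -q -m "Initial commit"

git checkout -q -b new-feature master
echo "feature" >> station.txt
git commit -q -am "Finish a feature"

# master has not moved, so a plain merge would fast-forward.
git checkout -q master
git merge -q --no-ff --no-edit new-feature

merges=$(git rev-list --count --merges master)
echo "merge commits on master: $merges"
```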
3-way merge
The next example is very similar, but requires a 3-way merge because master progresses
while the feature is in-progress. This is a common scenario for large features or when
several developers are working on a project simultaneously.
Note that it’s impossible for Git to perform a fast-forward merge, as there is no way to
move master up to new-feature without backtracking.
For most workflows, new-feature would be a much larger feature that took a long time to
develop, which would be why new commits would appear on master in the meantime. If
your feature branch was actually as small as the one in the above example, you would
probably be better off rebasing it onto master and doing a fast-forward merge. This
prevents superfluous merge commits from cluttering up the project history.
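A 3-way merge can be reproduced by letting master and the feature branch each gain a commit. This is a sketch in a throwaway repository; the file names are illustrative and the two branches touch different files so no conflict arises.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > station.txt
git add station.txt
git commit -q -m "Initial commit"

# Start a new feature
git checkout -q -b new-feature master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start a feature"

# Develop the master branch in the meantime
git checkout -q master
echo "mainline" > mainline.txt
git add mainline.txt
git commit -q -m "Make some changes to master"

master_tip=$(git rev-parse master)
feature_tip=$(git rev-parse new-feature)

# The histories have diverged, so Git creates a merge commit.
git merge -q --no-edit new-feature
git branch -q -d new-feature

git log --oneline --graph
```

The merge commit records the old master tip as its first parent and the feature tip as its second.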
Resolving conflict
If the two branches you're trying to merge both changed the same part of the same file,
Git won't be able to figure out which version to use. When such a situation occurs, it
stops right before the merge commit so that you can resolve the conflicts manually.
The great part of Git's merging process is that it uses the familiar edit/stage/commit
workflow to resolve merge conflicts. When you encounter a merge conflict, running the
git status command shows you which files need to be resolved. For example, if both
branches modified the same section of hello.py, you would see something like the
following:
On branch master
Unmerged paths:
(use "git add/rm ..." as appropriate to mark resolution)
both modified: hello.py
When Git encounters a conflict during a merge, it will edit the content of the affected files
with visual indicators that mark both sides of the conflicted content. These visual markers
are: <<<<<<<, =======, and >>>>>>>. It's helpful to search a project for these
indicators during a merge to find where conflicts need to be resolved.
Generally the content before the ======= marker is the receiving branch and the part
after is the merging branch.
Once you've identified conflicting sections, you can go in and fix up the merge to your
liking. When you're ready to finish the merge, all you have to do is run git add on the
conflicted file(s) to tell Git they're resolved. Then, you run a normal git commit to
generate the merge commit. It’s the exact same process as committing an ordinary
snapshot, which means it’s easy for normal developers to manage their own merges.
Note that merge conflicts will only occur in the event of a 3-way merge. It’s not possible
to have conflicting changes in a fast-forward merge.
Summary
1. Git merging combines sequences of commits into one unified history of commits.
2. There are two main ways Git will merge: fast-forward and 3-way.
3. Git can automatically merge commits unless there are changes that conflict in both
commit sequences.
This document integrated and referenced other Git commands like: git branch, git pull,
and git fetch. Visit their corresponding stand-alone pages for more information.
Git merge conflicts
Version control systems are all about managing contributions between multiple
distributed authors (usually developers). Sometimes multiple developers may try to
edit the same content. If Developer A tries to edit code that Developer B is editing, a
conflict may occur. To reduce the occurrence of conflicts, developers work in
separate isolated branches. The git merge command's primary responsibility is to
combine separate branches and resolve any conflicting edits.
Merging and conflicts are a common part of the Git experience. Conflicts in other
version control tools like SVN can be costly and time-consuming. Git makes merging
super easy. Most of the time, Git will figure out how to automatically integrate new
changes.
Conflicts generally arise when two people have changed the same lines in a file, or if
one developer deleted a file while another developer was modifying it. In these
cases, Git cannot automatically determine what is correct. Conflicts only affect the
developer conducting the merge; the rest of the team is unaware of the conflict. Git
will mark the file as being conflicted and halt the merging process. It is then the
developer's responsibility to resolve the conflict.
A merge can enter a conflicted state at two separate points: when starting and
during a merge process. The following is a discussion of how to address each of
these conflict scenarios.
To become familiar with merge conflicts, the next section simulates a
conflict to later examine and resolve. The example uses a Unix-like
command-line Git interface.
$ mkdir git-merge-test
$ cd git-merge-test
$ git init .
$ echo "this is some content to mess with" > merge.txt
$ git add merge.txt
$ git commit -am"we are commiting the inital content"
[master (root-commit) d48e74c] we are commiting the inital
1 file changed, 1 insertion(+)
create mode 100644 merge.txt
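This code example executes a sequence of commands that accomplish the following:
1. Create a new directory named git-merge-test, change to that directory, and
initialize it as a new Git repo.
2. Create a new text file merge.txt with some content in it.
3. Add merge.txt to the repo and commit it.
Now we have a new repo with one branch master and a file merge.txt with content
in it. Next, we will create a new branch to use as the conflicting merge.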
This chain of commands checks out the master branch, appends content to
merge.txt, and commits it. This now puts our example repo in a state where we have
2 new commits: one in the master branch and one in the
new_branch_to_merge_later branch. At this time, let's run
git merge new_branch_to_merge_later and see what happens!
BOOM 💥. A conflict appears. Thanks, Git for letting us know about this!
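The whole simulation described above can be reproduced with a sketch like the following. The commit messages and the exported identity are assumptions made for illustration; the file and branch names come from the example. It runs in a throwaway temporary directory.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is named master
echo "this is some content to mess with" > merge.txt
git add merge.txt
git commit -q -m "initial content"

# Create the branch that will later conflict, and overwrite the file there.
git checkout -q -b new_branch_to_merge_later
echo "totally different content to merge later" > merge.txt
git commit -q -am "edited the content of merge.txt to cause a conflict"

# Back on master, append to the same file.
git checkout -q master
echo "content to append" >> merge.txt
git commit -q -am "appended content to merge.txt"

# Both branches touched the same region, so the merge stops with a conflict.
git merge new_branch_to_merge_later || echo "merge stopped with exit status $?"
```

The `|| echo` at the end only keeps the script from aborting under `set -e`; at an interactive prompt you would simply read Git's CONFLICT message.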
As we have experienced from the preceding example, Git will produce some
descriptive output letting us know that a CONFLICT has occurred. We can gain
further insight by running the git status command:
$ git status
On branch master
You have unmerged paths.
(fix conflicts and run "git commit")
(use "git merge --abort" to abort the merge)
Unmerged paths:
(use "git add <file>..." to mark resolution)
both modified: merge.txt
The output from git status indicates that there are unmerged paths due to a
conflict. The merge.txt file now appears in a modified state. Let's examine the file
and see what's modified.
$ cat merge.txt
<<<<<<< HEAD
this is some content to mess with
content to append
=======
totally different content to merge later
>>>>>>> new_branch_to_merge_later
Here we have used the cat command to print the contents of the merge.txt file.
We can see some strange new additions:
<<<<<<< HEAD
=======
>>>>>>> new_branch_to_merge_later
Think of these new lines as "conflict dividers". The ======= line is the "center" of the
conflict. All the content between the center and the <<<<<<< HEAD line is content that
exists in the current branch master, which the HEAD ref is pointing to. Alternatively, all
content between the center and >>>>>>> new_branch_to_merge_later is content that
is present in our merging branch.
The most direct way to resolve a merge conflict is to edit the conflicted file. Open
the merge.txt file in your favorite editor. For our example, let's simply remove all the
conflict dividers. The modified merge.txt content should then look like:
this is some content to mess with
content to append
totally different content to merge later
Once the file has been edited use git add merge.txt to stage the new merged
content. To finalize the merge create a new commit by executing:
Git will see that the conflict has been resolved and creates a new merge commit to
finalize the merge.
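The resolution steps can be sketched end to end as follows. The script first recreates the conflict from the example, then strips the divider lines with grep as a stand-in for editing the file by hand. The commit messages and the exported identity are assumptions for illustration.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Recreate the conflicted state from the example.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "this is some content to mess with" > merge.txt
git add merge.txt
git commit -q -m "initial content"
git checkout -q -b new_branch_to_merge_later
echo "totally different content to merge later" > merge.txt
git commit -q -am "edit on the branch"
git checkout -q master
echo "content to append" >> merge.txt
git commit -q -am "append on master"
git merge new_branch_to_merge_later || echo "conflict, as expected"

# "Edit" the file: drop the three divider lines and keep both sides.
grep -v -e '^<<<<<<< ' -e '^=======$' -e '^>>>>>>> ' merge.txt > merge.resolved
mv merge.resolved merge.txt

# Stage the resolved file and conclude the merge with an ordinary commit.
git add merge.txt
git commit -q -m "merged and resolved the conflict in merge.txt"
git --no-pager log --oneline --graph
```

Keeping both sides is only one possible resolution; in a real conflict you would decide line by line which content should survive.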
General tools
git status
The status command is used frequently when working with Git, and during a merge
it will help identify conflicted files.
git log --merge
Passing the --merge argument to the git log command will produce a log with a list
of commits that conflict between the merging branches.
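git diff
diff shows the differences between states of a repository or its files.
Tools for when git fails to start a merge
git checkout
checkout can be used to undo changes to files or to switch branches.
git reset
reset can be used to undo changes to the working directory and staging area.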
Tools for when git conflicts arise during a merge
git merge --abort
Executing git merge with the --abort option will exit from the merge process and
return the branch to the state before the merge began.
git reset
Git reset can be used during a merge conflict to reset conflicted files to a known
good state.
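A short sketch of how these tools might be used together during a conflicted merge, reusing the merge.txt scenario from the example. The commit messages and the exported identity are assumptions for illustration.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Recreate the conflicted state from the example.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "this is some content to mess with" > merge.txt
git add merge.txt
git commit -q -m "initial content"
git checkout -q -b new_branch_to_merge_later
echo "totally different content to merge later" > merge.txt
git commit -q -am "edit on the branch"
git checkout -q master
echo "content to append" >> merge.txt
git commit -q -am "append on master"
git merge new_branch_to_merge_later || echo "conflict, as expected"

git status --short                      # conflicted files are flagged, e.g. "UU merge.txt"
git --no-pager log --merge --oneline    # commits on either side that touch the conflicted file
git --no-pager diff                     # combined diff showing both sides of the conflict

# Give up on the merge and return to the pre-merge state.
git merge --abort
git status --short                      # prints nothing: the working tree is clean again
```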
Summary
Merge conflicts can be an intimidating experience. Luckily, Git offers powerful tools
to help navigate and resolve conflicts. Git can handle most merges on its own with
automatic merging features. A conflict arises when two separate branches have
made edits to the same line in a file, or when a file has been deleted in one branch
but edited in the other. Conflicts will most likely happen when working in a team
environment.
There are many tools to help resolve merge conflicts. Git has plenty of command line
tools we discussed here. For more detailed information on these tools, visit the
stand-alone pages for git log, git reset, git status, and git checkout. In
addition to Git, many third-party tools offer streamlined merge conflict support
features.
Comparing Workflows
A Git Workflow is a recipe or recommendation for how to use Git to accomplish work
in a consistent and productive manner. Git workflows encourage users to leverage
Git effectively and consistently. Git offers a lot of flexibility in how users manage
changes. Given Git's focus on flexibility, there is no standardized process on how to
interact with Git. When working with a team on a Git managed project, it’s important
to make sure the team is all in agreement on how the flow of changes will be
applied. To ensure the team is on the same page, an agreed upon Git workflow
should be developed or selected. There are several publicized Git workflows that
may be a good fit for your team. Here, we’ll be discussing some of these workflow
options.
The array of possible workflows can make it hard to know where to begin when
implementing Git in the workplace. This page provides a starting point by surveying
the most common Git workflows for software teams.
As you read through, remember that these workflows are designed to be guidelines
rather than concrete rules. We want to show you what’s possible, so you can mix and
match aspects from different workflows to suit your individual needs.
When evaluating a workflow for your team, it's most important that you consider
your team’s culture. You want the workflow to enhance the effectiveness of your
team and not be a burden that limits productivity. Some things to consider when
evaluating a Git workflow are:
Does this workflow impose any new unnecessary cognitive overhead on the
team?
Centralized Workflow
The Centralized Workflow is a great Git workflow for teams transitioning from SVN.
Like Subversion, the Centralized Workflow uses a central repository to serve as the
single point-of-entry for all changes to the project. Instead of trunk, the default
development branch is called master and all changes are committed into this
branch. This workflow doesn’t require any other branches besides master.
Transitioning to a distributed version control system may seem like a daunting task,
but you don’t have to change your existing workflow to take advantage of Git. Your
team can develop projects in the exact same way as they do with Subversion.
However, using Git to power your development workflow presents a few advantages
over SVN. First, it gives every developer their own local copy of the entire project.
This isolated environment lets each developer work independently of all other
changes to a project - they can add commits to their local repository and completely
forget about upstream developments until it's convenient for them.
Second, it gives you access to Git’s robust branching and merging model. Unlike
SVN, Git branches are designed to be a fail-safe mechanism for integrating code and
sharing changes between repositories. The Centralized Workflow is similar to other
workflows in its utilization of a remote server-side hosted repository that developers
push and pull from. Compared to other workflows, the Centralized Workflow has no
defined pull request or forking patterns. A Centralized Workflow is generally better
suited for teams migrating from SVN to Git and for smaller teams.
How it works
Developers start by cloning the central repository. In their own local copies of the
project, they edit files and commit changes as they would with SVN; however, these
new commits are stored locally - they’re completely isolated from the central
repository. This lets developers defer synchronizing upstream until they’re at a
convenient break point.
To publish changes to the official project, developers "push" their local master branch
to the central repository. This is the equivalent of svn commit, except that it adds all
of the local commits that aren’t already in the central master branch.
First, someone needs to create the central repository on a server. If it’s a new project,
you can initialize an empty repository. Otherwise, you’ll need to import an existing
Git or SVN repository.
Be sure to use a valid SSH username for user, the domain or IP address of your
server for host, and the location where you'd like to store your repo for
/path/to/repo.git. Note that the .git extension is conventionally appended to the
repository name to indicate that it’s a bare repository.
When you clone a repository, Git automatically adds a shortcut called origin that
points back to the “parent” repository, under the assumption that you'll want to
interact with it further on down the road.
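As a local sketch of these two steps, the following creates a bare repository on disk in place of a real server and clones it. The directory names are hypothetical; on a real server the equivalent commands would run over SSH with your own user, host, and /path/to/repo.git, as noted in the comments.

```shell
set -eu
work="$(mktemp -d)"

# Create the central repository. Over SSH this would look something like:
#   ssh user@host git init --bare /path/to/repo.git
git init -q --bare "$work/repo.git"

# Clone it. Against a real server this would look something like:
#   git clone ssh://user@host/path/to/repo.git
git clone -q "$work/repo.git" "$work/clone"   # Git warns that the repository is empty

# The clone already has a remote named origin that points back at the central repo.
git -C "$work/clone" remote -v
```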
Remember that since these commands create local commits, John can repeat this
process as many times as he wants without worrying about what’s going on in the
central repository. This can be very useful for large features that need to be broken
down into simpler, more atomic chunks.
Managing conflicts
The central repository represents the official project, so its commit history should be
treated as sacred and immutable. If a developer’s local commits diverge from the
central repository, Git will refuse to push their changes because this would overwrite
official commits.
Before the developer can publish their feature, they need to fetch the updated
central commits and rebase their changes on top of them. This is like saying, “I want
to add my changes to what everyone else has already done.” The result is a perfectly
linear history, just like in traditional SVN workflows.
If local changes directly conflict with upstream commits, Git will pause the rebasing
process and give you a chance to manually resolve the conflicts. The nice thing
about Git is that it uses the same git status and git add commands for both
generating commits and resolving merge conflicts. This makes it easy for new
developers to manage their own merges. Plus, if they get themselves into trouble, Git
makes it very easy to abort the entire rebase and try again (or go find help).
Example
Let’s take a look at a general example of how a typical small team would collaborate using this
workflow. We’ll see how two developers, John and Mary, can work on separate
features and share their contributions via a centralized repository.
In his local repository, John can develop features using the standard Git commit
process: edit, stage, and commit.
Remember that since these commands create local commits, John can repeat this
process as many times as he wants without worrying about what’s going on in the
central repository.
Meanwhile, Mary is working on her own feature in her own local repository using the
same edit/stage/commit process. Like John, she doesn’t care what’s going on in the
central repository, and she really doesn’t care what John is doing in his local
repository, since all local repositories are private.
Once John finishes his feature, he should publish his local commits to the central
repository so other team members can access it. He can do this with the git push
command, like so:
Remember that origin is the remote connection to the central repository that Git
created when John cloned it. The master argument tells Git to try to make the
origin’s master branch look like his local master branch. Since the central
repository hasn’t been updated since John cloned it, this won’t result in any conflicts
and the push will work as expected.
But when Mary tries to push her feature in the same way, her local history has
diverged from the central repository, so Git will refuse the request with a rather
verbose error message.
This prevents Mary from overwriting official commits. She needs to pull John’s
updates into her repository, integrate them with her local changes, and then try
again.
Mary can use git pull to incorporate upstream changes into her repository. This
command is sort of like svn update—it pulls the entire upstream commit history into
Mary’s local repository and tries to integrate it with her local commits:
The --rebase option tells Git to move all of Mary’s commits to the tip of the master
branch after synchronising it with the changes from the central repository, as shown
below:
The pull would still work if you forgot this option, but you would wind up with a
superfluous “merge commit” every time someone needed to synchronize with the
central repository. For this workflow, it’s always better to rebase instead of
generating a merge commit.
Rebasing works by transferring each local commit to the updated master branch one
at a time. This means that you catch merge conflicts on a commit-by-commit basis
rather than resolving all of them in one massive merge commit. This keeps your
commits as focused as possible and makes for a clean project history. In turn, this
makes it much easier to figure out where bugs were introduced and, if necessary, to
roll back changes with minimal impact on the project.
If Mary and John are working on unrelated features, it’s unlikely that the rebasing
process will generate conflicts. But if it does, Git will pause the rebase at the current
commit and output the following message, along with some relevant instructions:
The great thing about Git is that anyone can resolve their own merge conflicts. In our
example, Mary would simply run a git status to see where the problem is.
Conflicted files will appear in the Unmerged paths section:
# Unmerged paths:
# (use "git reset HEAD <some-file>..." to unstage)
# (use "git add/rm <some-file>..." as appropriate to mark resolution)
#
# both modified: <some-file>
Then, she’ll edit the file(s) to her liking. Once she’s happy with the result, she can
stage the file(s) in the usual fashion and let git rebase do the rest:
And that’s all there is to it. Git will move on to the next commit and repeat the
process for any other commits that generate conflicts.
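The pause, resolve, and continue cycle can be tried in a single throwaway repository. In this sketch a local branch named marys-work stands in for Mary's unpublished commits and master stands in for the updated central history, so a plain git rebase master plays the role of git pull --rebase. The file name, branch name, and commit messages are assumptions for illustration.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_EDITOR=true    # keep the original commit message without opening an editor

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "original line" > shared.txt
git add shared.txt
git commit -q -m "initial commit"

git checkout -q -b marys-work           # Mary's local, unpublished change
echo "Mary's line" > shared.txt
git commit -q -am "Mary's change"

git checkout -q master                  # John's change, already "upstream"
echo "John's line" > shared.txt
git commit -q -am "John's change"

git checkout -q marys-work
git rebase master || echo "rebase paused on a conflict"
git status --short                      # shows "UU shared.txt"

# Resolve by writing the content Mary wants, then stage and continue.
printf "John's line\nMary's line\n" > shared.txt
git add shared.txt
git rebase --continue
git --no-pager log --oneline
```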
If you get to this point and realize you have no idea what’s going on, don’t
panic. Just execute the following command and you’ll be right back to where you
started:
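A minimal sketch of backing out of a paused rebase with git rebase --abort. As in a single-repository simulation, the branch marys-work stands in for Mary's local commits and master for the updated central history; all names are hypothetical.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "original line" > shared.txt
git add shared.txt
git commit -q -m "initial commit"
git checkout -q -b marys-work
echo "Mary's line" > shared.txt
git commit -q -am "Mary's change"
git checkout -q master
echo "John's line" > shared.txt
git commit -q -am "John's change"
git checkout -q marys-work

before="$(git rev-parse HEAD)"          # remember where Mary started
git rebase master || echo "rebase paused on a conflict"

# Abandon the rebase: the branch and working tree return to their earlier state.
git rebase --abort
git status --short                      # prints nothing
```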
After she’s done synchronizing with the central repository, Mary will be able to
publish her changes successfully:
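The whole John and Mary exchange can be simulated locally. This is a sketch under stated assumptions: a bare repository on disk (central.git) plays the central server, the file names and commit messages are invented, and the two developers work on unrelated files so the rebase does not conflict.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
git init -q --bare "$work/central.git"
git -C "$work/central.git" symbolic-ref HEAD refs/heads/master

# Seed the central repository with one commit.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "project" > README
git add README
git commit -q -m "initial commit"
git push -q "$work/central.git" master

# John and Mary each clone and commit locally.
git clone -q "$work/central.git" "$work/john"
git clone -q "$work/central.git" "$work/mary"
cd "$work/john"
echo "john" > john.txt
git add john.txt
git commit -q -m "John's feature"
cd "$work/mary"
echo "mary" > mary.txt
git add mary.txt
git commit -q -m "Mary's feature"

# John publishes first; nothing has changed upstream, so this succeeds.
cd "$work/john"
git push -q origin master

# Mary's history has now diverged, so her push is refused.
cd "$work/mary"
git push -q origin master || echo "push rejected: fetch and rebase first"

# She replays her commit on top of John's and tries again.
git pull -q --rebase origin master
git push -q origin master
git --no-pager log --oneline
```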
The Centralized Workflow is great for small teams. The conflict resolution process
detailed above can form a bottleneck as your team scales in size. If your team is
comfortable with the Centralized Workflow but wants to streamline its collaboration
efforts, it's definitely worth exploring the benefits of the Feature Branch Workflow. By
dedicating an isolated branch to each feature, it’s possible to initiate in-depth
discussions around new additions before integrating them into the official project.
Other common workflows
The Centralized Workflow is essentially a building block for other Git workflows.
Most popular Git workflows will have some sort of centralized repo that individual
developers will push and pull from. Below we will briefly discuss some other popular
Git workflows. These extended workflows offer more specialized patterns in regard
to managing branches for feature development, hot fixes, and eventual release.
Feature branching
Gitflow Workflow
Forking Workflow
Guidelines
There is no one size fits all Git workflow. As previously stated, it’s important to
develop a Git workflow that is a productivity enhancement for your team. In addition
to team culture, a workflow should also complement business culture. Git features
like branches and tags should complement your business’s release schedule. If your
team is using task tracking project management software you may want to use
branches that correspond with tasks in progress. In addition, some guidelines to
consider when deciding on a workflow are:
Short-lived branches
The longer a branch lives separate from the production branch, the higher the risk
for merge conflicts and deployment challenges. Short-lived branches promote
cleaner merges and deploys.
Summary
To read about the next Git workflow check out our comprehensive breakdown of
the Feature Branch Workflow.
Git Feature Branch Workflow
The core idea behind the Feature Branch Workflow is that all feature
development should take place in a dedicated branch instead of the
master branch. This encapsulation makes it easy for multiple
developers to work on a particular feature without disturbing the
main codebase. It also means the master branch will never contain
broken code, which is a huge advantage for continuous integration
environments.
How it works
This switches the repo to the master branch, pulls the latest commits
and resets the repo's local copy of master to match the latest version.
Create a new branch
Use a separate branch for each feature or issue you work on. After
creating a branch, check it out locally so that any changes you make
will be on that branch.
This checks out a branch called new-feature based on master; the -b flag tells
Git to create the branch first (the command fails if a branch with that name
already exists).
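A minimal sketch of the branch creation step in a throwaway repository. The initial empty commit and the exported identity are assumptions for illustration.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

# Create new-feature from master and switch to it in one step.
git checkout -b new-feature master
git branch                              # the asterisk marks new-feature as current
```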
git status
git add <some-file>
git commit
Resolve feedback
Now teammates comment and approve the pushed commits. Resolve
their comments locally, commit, and push the suggested changes to
Bitbucket. Your updates appear in the pull request.
Merge your pull request
Before you merge, you may have to resolve merge conflicts if others
have made changes to the repo. When your pull request is approved
and conflict-free, you can add your code to the master branch. Merge
from the pull request in Bitbucket.
Pull requests
Example
git status
git add <some-file>
git commit
Mary adds a few commits to her feature over the course of the
morning. Before she leaves for lunch, it’s a good idea to push her
feature branch up to the central repository. This serves as a
convenient backup, but if Mary was collaborating with other
developers, this would also give them access to her initial commits.
When Mary gets back from lunch, she completes her feature. Before
merging it into master, she needs to file a pull request letting the rest
of the team know she's done. But first, she should make sure the
central repository has her most recent commits:
git push
Then, she files the pull request in her Git GUI asking to merge
marys-feature into master, and team members will be notified
automatically. The great thing about pull requests is that they show
comments right next to their related commits, so it's easy to ask
questions about specific changesets.
Bill receives the pull request
Bill gets the pull request and takes a look at marys-feature. He
decides he wants to make a few changes before integrating it into the
official project, and he and Mary have some back-and-forth via the
pull request.
To make the changes, Mary uses the exact same process as she did to
create the first iteration of her feature. She edits, stages, commits, and
pushes updates to the central repository. All her activity shows up in
the pull request, and Bill can still make comments along the way.
Once Bill is ready to accept the pull request, someone needs to merge
the feature into the stable project (this can be done by either Bill or
Mary):
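One way the final merge could look on the command line, simulated with a local bare repository standing in for the central one. Mary's and Bill's directories, the file names, and the commit messages are assumptions for illustration; in practice the merge is often done with the merge button in the pull request instead.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
git init -q --bare "$work/central.git"
git -C "$work/central.git" symbolic-ref HEAD refs/heads/master

# Mary seeds master, then publishes her feature branch.
git init -q "$work/mary"
cd "$work/mary"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/central.git"
echo "project" > README
git add README
git commit -q -m "initial commit"
git push -q origin master
git checkout -q -b marys-feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add Mary's feature"
git push -q -u origin marys-feature

# Bill merges the approved feature into master and publishes the result.
git clone -q "$work/central.git" "$work/bill"
cd "$work/bill"
git checkout -q master
git pull -q origin master                    # make sure master is up to date
git pull -q --no-edit origin marys-feature   # merge the feature branch
git push -q origin master
```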
Summary
Gitflow Workflow is a Git workflow design that was first published and
made popular by Vincent Driessen at nvie. The Gitflow Workflow
defines a strict branching model designed around the project release.
This provides a robust framework for managing larger projects.
Getting Started
How it works
This branch will contain the complete history of the project, whereas
master will contain an abridged version. Other developers should
now clone the central repository and create a tracking branch for
develop.
When using the git-flow extension library, executing git flow init on
an existing repo will create the develop branch:
$ git branch
* develop
master
Feature Branches
Each new feature should reside in its own branch, which can
be pushed to the central repository for backup/collaboration. But,
instead of branching off of master, feature branches use develop as
their parent branch. When a feature is complete, it gets merged back
into develop. Features should never interact directly with master.
Continue your work and use Git like you normally would.
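Without the git-flow extension, the life of a feature branch can be sketched with plain Git commands. The branch name feature_branch, the file name, and the commit messages are hypothetical.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

git branch develop                      # develop sits alongside master
git checkout -q develop
git checkout -q -b feature_branch       # features branch off develop, not master

echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

# When the feature is complete, it goes back into develop (never straight to master).
git checkout -q develop
git merge -q --no-edit feature_branch
git branch -d feature_branch
git branch
```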
Release Branches
Once the release is ready to ship, it will get merged into master and
develop, then the release branch will be deleted. It’s important to
merge back into develop because critical updates may have been
added to the release branch and they need to be accessible to new
features. If your organization stresses code review, this would be an
ideal place for a pull request.
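A plain-Git sketch of a release branch from creation to cleanup. The branch name release/0.1.0, the VERSION file, and the commit messages are hypothetical; --no-ff is used here as an assumption so that each merge is recorded as its own commit.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"
git checkout -q -b develop
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

# Fork the release branch off develop and do release-only work on it.
git checkout -q -b release/0.1.0 develop
echo "0.1.0" > VERSION
git add VERSION
git commit -q -m "prepare release 0.1.0"
release_tip="$(git rev-parse HEAD)"

# Ship: merge into master, then back into develop, then delete the branch.
git checkout -q master
git merge -q --no-ff -m "Release 0.1.0" release/0.1.0
git checkout -q develop
git merge -q --no-ff -m "Merge release/0.1.0 back into develop" release/0.1.0
git branch -d release/0.1.0
```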
Hotfix Branches
Having a dedicated line of development for bug fixes lets your team
address issues without interrupting the rest of the workflow or
waiting for the next release cycle. You can think of maintenance
branches as ad hoc release branches that work directly with master.
A hotfix branch can be created using the following methods:
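With plain Git, a hotfix branch is forked directly off master. The sketch below also shows one plausible way to finish it by merging into both master and develop, which is an assumption based on how release branches are handled. The branch name hotfix_branch, the file names, and the commit messages are hypothetical.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"
git checkout -q -b develop
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "ongoing development"

# Create the hotfix branch from master, not from develop.
git checkout -q -b hotfix_branch master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix production bug"
hotfix_tip="$(git rev-parse HEAD)"

# Finish: the fix goes to master and also to develop so it is not lost.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix_branch into master" hotfix_branch
git checkout -q develop
git merge -q -m "Merge hotfix_branch into develop" hotfix_branch
git branch -d hotfix_branch
```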
Example
How it works
When they're ready to publish a local commit, they push the commit
to their own public repository—not the official one. Then, they file a
pull request with the main repository, which lets the project
maintainer know that an update is ready to be integrated. The pull
request also serves as a convenient discussion thread if there are
issues with the contributed code. The following is a step-by-step
example of this workflow.
8. The developer opens a pull request from the new branch to the
'official' repository.
9. The pull request gets approved for merge and is merged into
the original server-side repository
Forking vs cloning
It's important to note that "forked" repositories and "forking" are not
special operations. Forked repositories are created using the
standard git clone command. Forked repositories are generally
"server-side clones" and usually managed and hosted by a 3rd party
Git service like Bitbucket. There is no unique Git command to create
forked repositories. A clone operation is essentially a copy of a
repository and its history.
Fork a repository
Adding a remote
Whereas other Git workflows use a single origin remote that points to
the central repository, the Forking Workflow requires two remotes—
one for the official repository, and one for the developer’s personal
server-side repository. While you can call these remotes anything you
want, a common convention is to use origin as the remote for your
forked repository (this will be created automatically when you run
git clone) and upstream for the official repository.
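The two-remote setup can be tried locally. In this sketch, bare repositories on disk stand in for the official repository and the developer's server-side fork; every path and name here is hypothetical, and with a hosted service the upstream argument would be the official repository's URL.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"

# Stand-in for the official repository, seeded with one commit.
git init -q --bare "$work/official.git"
git -C "$work/official.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/maintainer"
cd "$work/maintainer"
git symbolic-ref HEAD refs/heads/master
echo "project" > README
git add README
git commit -q -m "initial commit"
git push -q "$work/official.git" master

# Stand-in for the server-side fork, then the developer's local clone of it.
git clone -q --bare "$work/official.git" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
cd "$work/local"

# origin was created by git clone; upstream has to be added by hand.
git remote add upstream "$work/official.git"
git remote -v

# When the official project moves forward, the new commits come from upstream.
cd "$work/maintainer"
echo "more" >> README
git commit -q -am "official update"
git push -q "$work/official.git" master
cd "$work/local"
git pull -q upstream master
```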
You’ll need to create the upstream remote yourself using the above
command. This will let you easily keep your local repository up-to-
date as the official project progresses. Note that if your upstream
repository has authentication enabled (i.e., it's not open source), you'll
need to supply a username, like so:
In the developer's local copy of the forked repository they can edit
code, commit changes, and create branches just like in other Git
workflows:
All of their changes will be entirely private until they push it to their
public repository. And, if the official project has moved forward, they
can access new commits with git pull:
git pull upstream master
This diverges from the other workflows in that the origin remote
points to the developer’s personal server-side repository, not the
main codebase.
Second, they need to notify the project maintainer that they want to
merge their feature into the official codebase. Bitbucket provides a
“pull request” button that leads to a form asking you to specify which
branch you want to merge into the official repository. Typically, you’ll
want to integrate your feature branch into the upstream remote’s
master branch.
Summary
7. Using Bitbucket you open up a pull request for the new branch
against the original repo at bitbucket.org/userA/open-project
Git is all about working with divergent history. Its git merge and git rebase
commands offer alternative ways to integrate commits from different branches, and
both options come with their own advantages. In this article, we’ll discuss how and
when a basic git merge operation can be replaced with a rebase.
Learn more »
The git reset, git checkout, and git revert commands are all similar in that they
undo some type of change in your repository. But, they all affect different
combinations of the working directory, staged snapshot, and commit history. This
article clearly defines how these commands differ and when each of them should be
used in the standard Git workflows.
Learn more »
The git log command is what makes your project history useful. Without it, you
wouldn’t be able to access any of your commits. But, if you’re like most aspiring Git
users, you’ve probably only scratched the surface of what’s possible with git log.
This article walks you through its advanced formatting and filtering options, giving
you the power to extract all sorts of interesting information from your Git repository.
Learn more »
Git Hooks
If you want to perform custom actions when a certain event takes place in a Git
repository, hooks are your tool of choice. They let you normalize commit messages,
automate testing suites, notify continuous integration systems, and much more.
After this article, you’ll understand the many ways in which Git hooks can streamline
your workflow.
Learn more »
A ref is Git’s internal way of referring to a commit. You’re already familiar with many
categories of refs, including commit hashes and branch names. But, there are many
other types of refs, and virtually every Git command utilizes them in some form or
another. You’ll walk away from this article with an intimate knowledge of Git’s inner
workings.
Learn more »
Merging vs. Rebasing
Conceptual Overview · The Golden Rule of Rebasing · Workflow Walkthrough · Summary
The git rebase command has a reputation for being magical Git voodoo that
beginners should stay away from, but it can actually make life much easier for a
development team when used with care. In this article, we’ll compare git rebase with
the related git merge command and identify all of the potential opportunities to
incorporate rebasing into the typical Git workflow.
Conceptual Overview
The first thing to understand about git rebase is that it solves the same problem as
git merge. Both of these commands are designed to integrate changes from one
branch into another branch—they just do it in very different ways.
Consider what happens when you start working on a new feature in a dedicated
branch, then another team member updates the master branch with new commits.
This results in a forked history, which should be familiar to anyone who has used Git
as a collaboration tool.
Now, let’s say that the new commits in master are relevant to the feature that you’re
working on. To incorporate the new commits into your feature branch, you have
two options: merging or rebasing.
This creates a new “merge commit” in the feature branch that ties together the
histories of both branches, giving you a branch structure that looks like this:
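The merging option can be sketched in a throwaway repository as follows. The branch name feature matches the text; the file names, commit messages, and exported identity are assumptions for illustration. The final log command draws the resulting branch structure as text.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

git checkout -q -b feature              # start the feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "feature commit"

git checkout -q master                  # meanwhile, master moves on
echo "master work" > master.txt
git add master.txt
git commit -q -m "new commit on master"

# Bring master's new commit into feature with a merge commit.
git checkout -q feature
git merge -q -m "Merge master into feature" master
git --no-pager log --graph --oneline
```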
On the other hand, this also means that the feature branch will have an extraneous
merge commit every time you need to incorporate upstream changes. If master is
very active, this can pollute your feature branch’s history quite a bit. While it’s
possible to mitigate this issue with advanced git log options, it can make it hard for
other developers to understand the history of the project.
This moves the entire feature branch to begin on the tip of the master branch,
effectively incorporating all of the new commits in master. But, instead of using a
merge commit, rebasing re-writes the project history by creating brand new commits
for each commit in the original branch.
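The rebasing option can be sketched on the same kind of forked history. File names, commit messages, and the exported identity are assumptions for illustration. Note how the tip of feature gets a new commit hash even though its changes are the same.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "feature commit"
old_tip="$(git rev-parse feature)"      # hash before the rebase

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "new commit on master"

# Replay the feature commits on top of the current master.
git checkout -q feature
git rebase -q master
git --no-pager log --graph --oneline    # a single straight line, no merge commit
echo "old tip: $old_tip"
echo "new tip: $(git rev-parse feature)"
```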
The major benefit of rebasing is that you get a much cleaner project history. First, it
eliminates the unnecessary merge commits required by git merge. Second, as you
can see in the above diagram, rebasing also results in a perfectly linear project
history—you can follow the tip of feature all the way to the beginning of the project
without any forks. This makes it easier to navigate your project with commands like
git log, git bisect, and gitk.
But, there are two trade-offs for this pristine commit history: safety and traceability. If
you don’t follow the Golden Rule of Rebasing, re-writing project history can be
potentially catastrophic for your collaboration workflow. And, less importantly,
rebasing loses the context provided by a merge commit—you can’t see when
upstream changes were incorporated into the feature.
Interactive Rebasing
Interactive rebasing gives you the opportunity to alter commits as they are moved to
the new branch. This is even more powerful than an automated rebase, since it offers
complete control over the branch’s commit history. Typically, this is used to clean up
a messy history before merging a feature branch into master.
To begin an interactive rebasing session, pass the -i option to the git rebase
command:
This will open a text editor listing all of the commits that are about to be moved:
This listing defines exactly what the branch will look like after the rebase is
performed. By changing the pick command and/or re-ordering the entries, you can
make the branch’s history look like whatever you want. For example, if the 2nd
commit fixes a small problem in the 1st commit, you can condense them into a
single commit with the fixup command:
Eliminating insignificant commits like this makes your feature’s history much easier
to understand. This is something that git merge simply cannot do.
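Interactive rebasing normally opens an editor, but the fixup step can be scripted for demonstration. The sketch below is an assumption-laden stand-in for editing the list by hand: a small helper script (hypothetical name fixup-second.sh) is installed through the GIT_SEQUENCE_EDITOR environment variable and changes the second pick line to fixup. File names and commit messages are invented.

```shell
set -eu
# Illustrative identity (an assumption) so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"

git checkout -q -b feature
echo "feture line" > feature.txt        # first commit contains a typo
git add feature.txt
git commit -q -m "add feature file"
echo "feature line" > feature.txt       # second commit only fixes it
git commit -q -am "fix typo"

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "new commit on master"
git checkout -q feature

# Helper that edits the rebase to-do list: turn the 2nd "pick" into "fixup".
cat > "$work/fixup-second.sh" <<'EOF'
#!/bin/sh
sed '2s/^pick/fixup/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF
chmod +x "$work/fixup-second.sh"

# Git runs the helper instead of opening an interactive editor.
GIT_SEQUENCE_EDITOR="'$work/fixup-second.sh'" git rebase -i master
git --no-pager log --oneline
```

At an interactive prompt you would run git rebase -i master without the environment variable and change pick to fixup in the editor yourself.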
Once you understand what rebasing is, the most important thing to learn is
when not to do it. The golden rule of git rebase is to never use it on public branches.
For example, think about what would happen if you rebased master onto your
feature branch:
The rebase moves all of the commits in master onto the tip of feature. The problem
is that this only happened in your repository. All of the other developers are still
working with the original master. Since rebasing results in brand new commits, Git
will think that your master branch’s history has diverged from everybody else’s.
The only way to synchronize the two master branches is to merge them back
together, resulting in an extra merge commit and two sets of commits that contain
the same changes (the original ones, and the ones from your rebased branch).
Needless to say, this is a very confusing situation.
So, before you run git rebase, always ask yourself, “Is anyone else looking at this
branch?” If the answer is yes, take your hands off the keyboard and start thinking
about a non-destructive way to make your changes (e.g., the git revert command).
Otherwise, you’re safe to re-write history as much as you like.
Force-Pushing
If you try to push the rebased master branch back to a remote repository, Git will
prevent you from doing so because it conflicts with the remote master branch. But,
you can force the push to go through by passing the --force flag, like so:
This overwrites the remote master branch to match the rebased one from your
repository and makes things very confusing for the rest of your team. So, be very
careful to use this command only when you know exactly what you’re doing.
One of the only times you should be force-pushing is when you’ve performed a local
cleanup after you’ve pushed a private feature branch to a remote repository (e.g., for
backup purposes). This is like saying, “Oops, I didn’t really want to push that original
version of the feature branch. Take the current one instead.” Again, it’s important
that nobody is working off of the commits from the original version of the feature
branch.
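The private-branch case can be sketched as follows. This is only an illustration: the "remote" is a bare repository in a temporary directory, and the branch, file, and commit names are made up. It pushes a feature branch, re-writes the pushed commit locally, and then shows that a plain push is rejected while a forced push goes through.

```shell
# Throwaway "remote" and local repository; all names are hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git init -q local && cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
git remote add origin ../remote.git
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"
git push -q origin master

# Push a private feature branch (e.g., for backup), then clean it up locally.
git checkout -q -b feature
echo draft > feature.txt && git add feature.txt && git commit -q -m "WIP"
git push -q origin feature
git commit -q --amend -m "Add a new feature"   # re-writes the commit that was pushed

# A plain push is refused because the local and remote histories have diverged...
if git push -q origin feature 2>/dev/null; then plain_push=accepted; else plain_push=rejected; fi
# ...so the push has to be forced.
git push -q --force origin feature

remote_subject=$(git --git-dir=../remote.git log -1 --format=%s feature)
echo "plain push: $plain_push, remote tip: $remote_subject"
```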
Workflow Walkthrough
Rebasing can be incorporated into your existing Git workflow as much or as little as
your team is comfortable with. In this section, we’ll take a look at the benefits that
rebasing can offer at the various stages of a feature’s development.
The first step in any workflow that leverages git rebase is to create a dedicated
branch for each feature. This gives you the necessary branch structure to safely
utilize rebasing:
Local Cleanup
One of the best ways to incorporate rebasing into your workflow is to clean up local,
in-progress features. By periodically performing an interactive rebase, you can make
sure each commit in your feature is focused and meaningful. This lets you write your
code without worrying about breaking it up into isolated commits—you can fix it up
after the fact.
When calling git rebase, you have two options for the new base: The feature’s
parent branch (e.g., master), or an earlier commit in your feature. We saw an
example of the first option in the Interactive Rebasing section. The latter option is
nice when you only need to fix up the last few commits. For example, the following
command begins an interactive rebase of only the last 3 commits.
git checkout feature
git rebase -i HEAD~3
By specifying HEAD~3 as the new base, you’re not actually moving the branch—you’re
just interactively re-writing the 3 commits that follow it. Note that this
will not incorporate upstream changes into the feature branch.
If you want to re-write the entire feature using this method, the git merge-base
command can be useful to find the original base of the feature branch. The
following returns the commit ID of the original base, which you can then pass to
git rebase:
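A minimal sketch of that lookup, run in a throwaway repository with made-up file names and commit messages. It records the commit where feature forks off of master and then checks that git merge-base reports the same ID.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"
fork_point=$(git rev-parse HEAD)   # feature will branch off of this commit

git checkout -q -b feature
echo one > feature.txt && git add feature.txt && git commit -q -m "Start the feature"
echo two >> feature.txt && git commit -q -am "Continue the feature"

git checkout -q master
echo more >> base.txt && git commit -q -am "Update master"

# Find the original base of the feature branch.
base=$(git merge-base feature master)
echo "$base"

# With feature checked out, that ID could then be handed to an interactive rebase:
#   git rebase -i "$base"
```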
This use of interactive rebasing is a great way to introduce git rebase into your
workflow, as it only affects local branches. The only thing other developers will see is
your finished product, which should be a clean, easy-to-follow feature branch
history.
But again, this only works for private feature branches. If you’re collaborating with
other developers via the same feature branch, that branch is public, and you’re not
allowed to re-write its history.
There is no git merge alternative for cleaning up local commits with an interactive
rebase.
This use of git rebase is similar to a local cleanup (and can be performed
simultaneously), but in the process it incorporates those upstream commits from
master.
Keep in mind that it’s perfectly legal to rebase onto a remote branch instead of
master. This can happen when collaborating on the same feature with another
developer and you need to incorporate their changes into your repository.
For example, if you and another developer named John added commits to the
feature branch, your repository might look like the following after fetching the
remote feature branch from John’s repository:
You can resolve this fork the exact same way as you integrate upstream changes
from master: either merge your local feature with john/feature, or rebase your
local feature onto the tip of john/feature.
Note that this rebase doesn’t violate the Golden Rule of Rebasing because only your
local feature commits are being moved—everything before that is untouched. This
is like saying, “add my changes to what John has already done.” In most
circumstances, this is more intuitive than synchronizing with the remote branch via a
merge commit.
By default, the git pull command performs a merge, but you can force it to
integrate the remote branch with a rebase by passing it the --rebase option.
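Here is one way the John scenario might look with git pull --rebase. The two repositories live in a temporary directory, and the remote name john, the file names, and the commit messages are assumptions made for the sketch.

```shell
# John's repository, with a feature branch (all names are hypothetical).
work=$(mktemp -d) && cd "$work"
git init -q john && cd john
git symbolic-ref HEAD refs/heads/feature
git config user.name "John" && git config user.email "john@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Start the feature"

# Your clone of it; -o names the remote "john" instead of "origin".
cd "$work" && git clone -q -o john john mine

# Both of you add a commit to feature, so the two histories fork.
(cd john && echo john > john.txt && git add john.txt && git commit -q -m "Add work from John")
cd mine
git config user.name "Mary" && git config user.email "mary@example.com"
echo mine > mine.txt && git add mine.txt && git commit -q -m "Add my work"

# Fetch John's commit and replay the local commit on top of it.
git pull -q --rebase john feature

history=$(git log --format=%s)
merge_count=$(git rev-list --merges --count HEAD)
echo "$history"
```

The resulting history is linear: your commit sits on top of John's, and no merge commit is created.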
Any changes from other developers need to be incorporated with git merge instead
of git rebase.
For this reason, it’s usually a good idea to clean up your code with an interactive
rebase before submitting your pull request.
And that’s all you really need to know to start rebasing your branches. If you would
prefer a clean, linear history free of unnecessary merge commits, you should reach
for git rebase instead of git merge when integrating changes from another branch.
On the other hand, if you want to preserve the complete history of your project and
avoid the risk of re-writing public commits, you can stick with git merge. Either
option is perfectly valid, but at least now you have the option of leveraging the
benefits of git rebase.
Resetting, Checking Out & Reverting
The git reset, git checkout, and git revert commands are some of the most
useful tools in your Git toolbox. They all let you undo some kind of change in your
repository, and the first two commands can be used to manipulate either commits or
individual files.
Because they’re so similar, it’s very easy to mix up which command should be used in
any given development scenario. In this article, we’ll compare the most common
configurations of git reset, git checkout, and git revert. Hopefully, you’ll walk
away with the confidence to navigate your repository using any of these commands.
It helps to think about each command in terms of its effect on the three state
management mechanisms of a Git repository: the working directory, the staged
snapshot, and the commit history. These components are sometimes known as "The
three trees" of Git. We explore the three trees in depth on the git reset page. Keep
these mechanisms in mind as you read through this article.
A revert is an operation that takes a specified commit and creates a new commit
that inverts the changes introduced by the specified commit. git revert can only be run at a commit level
scope and has no file level functionality.
A reset is an operation that takes a specified commit and resets the "three trees" to
match the state of the repository at that specified commit. A reset can be invoked in
three different modes which correspond to the three trees.
Checkout and reset are generally used for making local or private 'undos'. They
modify the history of a repository, which can cause conflicts when pushing to remote
shared repositories. Revert is considered a safe operation for 'public undos' because it
creates new history that can be shared remotely and doesn't overwrite history that
remote team members may depend on.
The table below sums up the most common use cases for all of these commands. Be
sure to keep this reference handy, as you’ll undoubtedly need to use at least some of
them during your Git career.
Command        Scope          Common use cases
git checkout   Commit-level   Switch between branches or inspect old snapshots
git revert     Commit-level   Undo commits in a public branch
The parameters that you pass to git reset and git checkout determine their scope.
When you don’t include a file path as a parameter, they operate on whole commits.
That’s what we’ll be exploring in this section. Note that git revert has no file-level
counterpart.
The two commits that were on the end of hotfix are now dangling, or orphaned
commits. This means they will be deleted the next time Git performs a garbage
collection. In other words, you’re saying that you want to throw away these commits.
This can be visualized as the following:
This usage of git reset is a simple way to undo changes that haven’t been shared
with anyone else. It’s your go-to command when you’ve started working on a feature
and find yourself thinking, “Oh crap, what am I doing? I should just start over.”
In addition to moving the current branch, you can also get git reset to alter the
staged snapshot and/or the working directory by passing it one of the following
flags:
--soft – The staged snapshot and working directory are not altered in any
way.
--mixed – The staged snapshot is updated to match the specified commit, but
the working directory is not affected. This is the default option.
--hard – The staged snapshot and the working directory are both updated to
match the specified commit.
It’s easier to think of these modes as defining the scope of a git reset operation.
For further detailed information visit the git reset page.
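The three modes can be compared side by side in a throwaway repository. The sketch below is illustrative only (foo.py and the commit messages are made up): it undoes the same commit three times and records what git status --porcelain reports after each reset.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo v1 > foo.py && git add foo.py && git commit -q -m "Add foo"
echo v2 > foo.py && git commit -q -am "Change foo"

# --soft: only the branch moves, so the change is still staged.
git reset -q --soft HEAD~1
soft_status=$(git status --porcelain)    # "M  foo.py" (modified in the staged snapshot)
git commit -q -m "Change foo"

# --mixed (the default): the staged snapshot is reset too, so the change
# only exists in the working directory.
git reset -q --mixed HEAD~1
mixed_status=$(git status --porcelain)   # " M foo.py" (modified in the working directory)
git commit -q -am "Change foo"

# --hard: the staged snapshot and the working directory are both reset.
git reset -q --hard HEAD~1
hard_status=$(git status --porcelain)    # empty: nothing left to commit
hard_content=$(cat foo.py)               # back to "v1"

printf 'soft=[%s] mixed=[%s] hard=[%s]\n' "$soft_status" "$mixed_status" "$hard_status"
```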
Internally, all the above command does is move HEAD to a different branch and
update the working directory to match. Since this has the potential to overwrite local
changes, Git forces you to commit or stash any changes in the working directory that
will be lost during the checkout operation. Unlike git reset, git checkout doesn’t
move any branches around.
You can also check out arbitrary commits by passing the commit reference instead of
a branch. This does the exact same thing as checking out a branch: it moves the
HEAD reference to the specified commit. For example, the following command will
check out the grandparent of the current commit:
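A small sketch of that command in a throwaway repository with three made-up commits. It also checks two points from the text: HEAD ends up pointing straight at a commit rather than at a branch, and master itself does not move.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
for n in 1 2 3; do
  echo "v$n" > foo.py && git add foo.py && git commit -q -m "Commit $n"
done
tip=$(git rev-parse master)

# Check out the grandparent of the current commit.
git checkout -q HEAD~2

current_subject=$(git log -1 --format=%s)
# symbolic-ref fails when HEAD points directly at a commit instead of a branch.
if git symbolic-ref -q HEAD >/dev/null; then head_state=attached; else head_state=detached; fi
echo "HEAD is $head_state at '$current_subject'"

# git checkout -q master   # would return to the tip of the branch
```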
You can also think of git revert as a tool for undoing committed changes, while
git reset HEAD is for undoing uncommitted changes.
Like git checkout, git revert has the potential to overwrite files in the working
directory, so it will ask you to commit or stash changes that would be lost during the
revert operation.
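To make the commit-level revert concrete, here is a minimal sketch in a throwaway repository (foo.py and the commit messages are made up). It reverts the most recent commit and shows that the history grows by one commit instead of being re-written.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo v1 > foo.py && git add foo.py && git commit -q -m "Add foo"
echo v2 > foo.py && git commit -q -am "Break foo"

# Undo the last commit by adding a new commit that reverses it.
# --no-edit accepts the default message instead of opening an editor.
git revert --no-edit HEAD

commit_count=$(git rev-list --count HEAD)   # the original two commits plus the revert
content=$(cat foo.py)                       # back to the contents from "Add foo"
echo "$commit_count commits, foo.py contains: $content"
```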
File-level Operations
The git reset and git checkout commands also accept an optional file path as a
parameter. This dramatically alters their behavior. Instead of operating on entire
snapshots, this forces them to limit their operations to a single file.
The --soft, --mixed, and --hard flags do not have any effect on the file-level
version of git reset, as the staged snapshot is always updated, and the working
directory is never updated.
For example, the following command makes foo.py in the working directory match
the one from the 2nd-to-last commit:
Just like the commit-level invocation of git checkout, this can be used to inspect old
versions of a project—but the scope is limited to the specified file.
If you stage and commit the checked-out file, this has the effect of “reverting” to the
old version of that file. Note that this removes all of the subsequent changes to the
file, whereas the git revert command undoes only the changes introduced by the
specified commit.
Like git reset, this is commonly used with HEAD as the commit reference. For
instance, git checkout HEAD foo.py has the effect of discarding unstaged changes
to foo.py. This is similar behavior to git reset HEAD --hard, but it operates only on
the specified file.
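The file-level forms of both commands can be tried out in a throwaway repository. In this sketch, foo.py and its three versions are made up; the commands follow the descriptions above.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
for n in 1 2 3; do
  echo "v$n" > foo.py && git add foo.py && git commit -q -m "Version $n"
done

# File-level reset: stage the version from the 2nd-to-last commit.
# The working directory is left alone.
git reset -q HEAD~2 -- foo.py
staged=$(git show :foo.py)      # v1 (the staged snapshot)
on_disk=$(cat foo.py)           # v3 (the working directory)

# File-level checkout: make the working directory copy match that commit.
git checkout -q HEAD~2 -- foo.py
old_copy=$(cat foo.py)          # v1

# Checking out from HEAD puts the committed version back.
git checkout -q HEAD -- foo.py
restored=$(cat foo.py)          # v3
final_status=$(git status --porcelain)

echo "staged=$staged on_disk=$on_disk old_copy=$old_copy restored=$restored"
```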
Summary
You should now have all the tools you could ever need to undo changes in a Git
repository. The git reset, git checkout, and git revert commands can be
confusing, but when you think about their effects on the working directory, staged
snapshot, and commit history, it should be easier to discern which command fits the
development task at hand.
Advanced Git log
Contents: Formatting Log Output, Filtering the Commit History, Summary
The purpose of any version control system is to record changes to your code. This
gives you the power to go back into your project history to see who contributed
what, figure out where bugs were introduced, and revert problematic changes. But,
having all of this history available is useless if you don’t know how to navigate it.
That’s where the git log command comes in.
By now, you should already know the basic git log command for displaying commits.
But, you can alter this output by passing many different parameters to git log.
The advanced features of git log can be split into two categories: formatting how
each commit is displayed, and filtering which commits are included in the output.
Together, these two skills give you the power to go back into your project and find
any information that you could possibly need.
First, this article will take a look at the many ways in which git log’s output can be
formatted. Most of these come in the form of flags that let you request more or less
information from git log.
If you don’t like the default git log format, you can use git config’s aliasing
functionality to create a shortcut for any of the formatting options discussed below.
See The git config Command for how to set up an alias.
Oneline
The --oneline flag condenses each commit to a single line. By default, it displays
only the commit ID and the first line of the commit message. Your typical
git log --oneline output will look something like this:
Decorating
Many times it’s useful to know which branch or tag each commit is associated with.
The --decorate flag makes git log display all of the references (e.g., branches, tags,
etc) that point to each commit.
This can be combined with other configuration options. For example, running
git log --oneline --decorate will format the commit history like so:
0e25143 (HEAD, master) Merge branch 'feature'
ad8621a (feature) Fix a bug in the feature
16b36c6 Add a new feature
23ad9ad (tag: v0.9) Add the initial code base
This lets you know that the top commit is also checked out (denoted by HEAD) and
that it is also the tip of the master branch. The second commit has another branch
pointing to it called feature, and finally the 4th commit is tagged as v0.9.
Branches, tags, HEAD, and the commit history are almost all of the information
contained in your Git repository, so this gives you a more complete view of the
logical structure of your repository.
Diffs
The git log command includes many options for displaying diffs with each commit.
Two of the most common options are --stat and -p.
The --stat option displays the number of insertions and deletions to each file
altered by each commit (note that modifying a line is represented as 1 insertion and
1 deletion). This is useful when you want a brief summary of the changes introduced
by each commit. For example, the following commit added 67 lines to the hello.py
file and removed 38 lines:
commit f2a238924e89ca1d4947662928218a06d39068c3
Author: John <john@example.com>
Date: Fri Jun 25 17:30:28 2014 -0500
Add a new feature
hello.py | 105 ++++++++++++++++++++++++-----------------
1 file changed, 67 insertions(+), 38 deletions(-)
The number of + and - signs next to the file name shows the relative number of
changes to each file altered by the commit. This gives you an idea of where the
changes for each commit can be found.
If you want to see the actual changes introduced by each commit, you can pass the
-p option to git log. This outputs the entire patch representing that commit:
commit 16b36c697eb2d24302f89aa22d9170dfe609855b
Author: Mary <mary@example.com>
Date: Fri Jun 25 17:31:57 2014 -0500
Fix a bug in the feature
diff --git a/hello.py b/hello.py
index 18ca709..c673b40 100644
--- a/hello.py
+++ b/hello.py
@@ -13,14 +13,14 @@ B
-print("Hello, World!")
+print("Hello, Git!")
For commits with a lot of changes, the resulting output can become quite long and
unwieldy. More often than not, if you’re displaying a full patch, you’re probably
searching for a specific change. For this, you want to use the pickaxe option.
The Shortlog
The git shortlog command is a special version of git log intended for creating
release announcements. It groups each commit by author and displays the first line
of each commit message. This is an easy way to see who’s been working on what.
For example, if two developers have contributed 5 commits to a project, the
git shortlog output might look like the following:
Mary (2):
Fix a bug in the feature
Fix a serious security hole in our framework
John (3):
Add the initial code base
Add a new feature
Merge branch 'feature'
By default, git shortlog sorts the output by author name, but you can also pass the
-n option to sort by the number of commits per author.
Graphs
The --graph option draws an ASCII graph representing the branch structure of the
commit history. This is commonly used in conjunction with the --oneline and
--decorate flags to make it easier to see which commit belongs to which
branch:
For a simple repository with just 2 branches, this will produce the following:
The asterisk shows which branch the commit was on, so the above graph tells us that
the 23ad9ad and 16b36c6 commits are on a topic branch and the rest are on the
master branch.
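If you want to see such a graph for yourself, the sketch below builds a small two-branch history in a throwaway repository and prints it. The commits here are made up, so the hashes and messages will not match the ones mentioned above.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"

git checkout -q -b feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "Add a new feature"

git checkout -q master
echo fix >> base.txt && git commit -q -am "Fix a typo on master"
git merge -q --no-edit feature   # creates a merge commit

# Each asterisk marks a commit; the lines show where the branches fork and join.
graph=$(git log --graph --oneline --decorate)
echo "$graph"
```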
While this is a nice option for simple repositories, you’re probably better off with a
more full-featured visualization tool like gitk or Sourcetree for projects that are
heavily branched.
Custom Formatting
For all of your other git log formatting needs, you can use the
--pretty=format:"<string>" option. This lets you display each commit however you
want using printf-style placeholders.
For example, the %cn, %h and %cd characters in the following command are replaced
with the committer name, abbreviated commit hash, and the committer date,
respectively.
Aside from letting you view only the information that you’re interested in, the
--pretty=format:"<string>" option is particularly useful when you’re trying to pipe
git log output into another command.
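As a hedged illustration of those three placeholders, the following sketch formats a single commit in a throwaway repository. The committer name Mary and the wording of the format string are assumptions chosen for the example.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"

# %cn = committer name, %h = abbreviated commit hash, %cd = committer date
formatted=$(git log --pretty=format:"%cn committed %h on %cd")
echo "$formatted"

# Because every commit is a single predictable line, the output is easy to
# pipe into other commands, e.g. counting commits per committer:
git log --pretty=format:"%cn" | sort | uniq -c
```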
Formatting how each commit gets displayed is only half the battle of learning
git log. The other half is understanding how to navigate the commit history. The
rest of this article introduces some of the advanced ways to pick out specific
commits in your project history using git log. All of these can be combined with
any of the formatting options discussed above.
By Amount
The most basic filtering option for git log is to limit the number of commits that are
displayed. When you’re only interested in the last few commits, this saves you the
trouble of viewing all the commits in a page.
You can limit git log’s output by including the -<n> option. For example, the
following command will display only the 3 most recent commits.
git log -3
By Date
If you’re looking for a commit from a specific time frame, you can use the --after or
--before flags for filtering commits by date. These both accept a variety of date
formats as a parameter. For example, the following command only shows commits
that were created after July 1st, 2014 (inclusive):
You can also pass in relative references like "1 week ago" and "yesterday":
To search for commits that were created between two dates, you can provide both
a --before and --after date. For instance, to display all the commits added
between July 1st, 2014 and July 4th, 2014, you would use the following:
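The date filters can be sketched in a throwaway repository whose commits are given fixed dates. The commit_on helper, the dates, and the commit messages are made up for this example; the dates are deliberately kept well away from the boundaries being filtered on.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"

# Hypothetical helper: commit_on <date> <message> creates a commit with that date.
commit_on() {
  echo "$2" >> log.txt && git add log.txt &&
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" git commit -q -m "$2"
}
commit_on "2014-06-20T12:00:00" "June work"
commit_on "2014-07-02T12:00:00" "Early July work"
commit_on "2014-07-10T12:00:00" "Later July work"

# Commits created after July 1st, 2014
since_july=$(git log --after="2014-07-01" --format=%s)

# Commits created between July 1st, 2014 and July 4th, 2014
in_range=$(git log --after="2014-07-01" --before="2014-07-04" --format=%s)

echo "$in_range"
# Relative dates work the same way: git log --after="1 week ago"
```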
By Author
When you’re only looking for commits created by a particular user, use the --author
flag. This accepts a regular expression, and returns all commits whose author
matches that pattern. If you know exactly who you’re looking for, you can use a plain
old string instead of a regular expression:
This displays all commits whose author includes the name John. The author name
doesn’t need to be an exact match—it just needs to contain the specified phrase.
You can also use regular expressions to create more complex searches. For example,
the following command searches for commits by either Mary or John.
Note that the author’s email is also included with the author’s name, so you can use
this option to search by email, too.
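The author filters described above might be used like this. The sketch runs in a throwaway repository; the commit_as helper and the three authors are made up. For the "either Mary or John" search it assumes the -E option, which makes git log treat the pattern as an extended regular expression so that a plain | can be used for alternation.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Tutorial User" && git config user.email "tutorial@example.com"

# Hypothetical helper: commit_as "<Name> <email>" "<message>"
commit_as() {
  echo "$2" >> notes.txt && git add notes.txt && git commit -q --author="$1" -m "$2"
}
commit_as "John <john@example.com>" "Add the initial code base"
commit_as "Mary <mary@example.com>" "Fix a bug in the feature"
commit_as "Alice <alice@example.com>" "Update the docs"

by_john=$(git log --author="John" --format=%s)              # plain string
by_either=$(git log -E --author="John|Mary" --format=%s)    # regular expression
by_email=$(git log --author="alice@example.com" --format=%s)

echo "$by_either"
```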
By Message
To filter commits by their commit message, use the --grep flag. This works just like
the --author flag discussed above, but it matches against the commit message
instead of the author.
For example, if your team includes relevant issue numbers in each commit message,
you can use something like the following to pull out all of the commits related to
that issue:
You can also pass in the -i parameter to git log to make it ignore case differences
while pattern matching.
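A short sketch of message filtering, using a throwaway repository and a made-up issue key (JRA-224). It runs the same search with and without -i to show the effect of ignoring case.

```shell
# Throwaway repository; the issue keys and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
for msg in "JRA-224: Fix the login form" "JRA-225: Update the docs" "jra-224: Follow-up fix"; do
  echo "$msg" >> notes.txt && git add notes.txt && git commit -q -m "$msg"
done

# Case-sensitive by default
exact=$(git log --grep="JRA-224:" --format=%s)

# -i ignores case differences while matching
ignore_case=$(git log -i --grep="JRA-224:" --format=%s)

echo "$ignore_case"
```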
By File
Many times, you’re only interested in changes that happened to a particular file. To
show the history related to a file, all you have to do is pass in the file path. For
example, the following returns all commits that affected either the foo.py or the
bar.py file:
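A minimal sketch of filtering by path, in a throwaway repository with three made-up files. The -- separator is included as a precaution: it tells git log that everything after it is a file path rather than a branch name.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
for f in foo.py bar.py baz.py; do
  echo "# $f" > "$f" && git add "$f" && git commit -q -m "Add $f"
done

# Only commits that affected foo.py or bar.py
file_history=$(git log --format=%s -- foo.py bar.py)
echo "$file_history"
```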
By Content
It’s also possible to search for commits that introduce or remove a particular line of
source code. This is called a pickaxe, and it takes the form of -S"<string>". For
example, if you want to know when the string Hello, World! was added to any file in
the project, you would use the following command:
If you want to search using a regular expression instead of a string, you can use the
-G"<regex>" flag instead.
This is a very powerful debugging tool, as it lets you locate all of the commits that
affect a particular line of code. It can even show you when a line was copied or
moved to another file.
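The pickaxe can be tried out with the sketch below, which uses a throwaway repository and a made-up hello.py. The search reports the commit that added the string and the commit that removed it, and skips the unrelated commit in between.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
printf 'print("Hello, World!")\n' > hello.py
git add hello.py && git commit -q -m "Add hello.py"
echo "notes" > notes.txt && git add notes.txt && git commit -q -m "Add notes"
printf 'print("Hello, Git!")\n' > hello.py
git commit -q -am "Change the greeting"

# Commits that added or removed the string "Hello, World!"
pickaxe=$(git log -S"Hello, World!" --format=%s)
echo "$pickaxe"
```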
By Range
You can pass a range of commits to git log to show only the commits contained in
that range. The range is specified in the following format, where <since> and
<until> are commit references:
This command is particularly useful when you use branch references as the
parameters. It’s a simple way to show the differences between 2 branches. Consider
the following command:
The master..feature range contains all of the commits that are in the feature
branch, but aren’t in the master branch. In other words, this is how far feature has
progressed since it forked off of master. You can visualize this as follows:
Note that if you switch the order of the range ( feature..master), you will get all of
the commits in master, but not in feature. If git log outputs commits for both
versions, this tells you that your history has diverged.
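Both directions of the range can be checked in a throwaway repository where master and feature have each moved on since the fork. The file names and commit messages below are made up.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"

git checkout -q -b feature
echo one > feature.txt && git add feature.txt && git commit -q -m "Start the feature"
echo two >> feature.txt && git commit -q -am "Continue the feature"

git checkout -q master
echo more >> base.txt && git commit -q -am "Update master"

# Commits in feature that aren't in master
only_feature=$(git log --format=%s master..feature)
# Commits in master that aren't in feature
only_master=$(git log --format=%s feature..master)

# Both ranges are non-empty here, so the two histories have diverged.
printf 'feature only:\n%s\nmaster only:\n%s\n' "$only_feature" "$only_master"
```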
By default, git log includes merge commits in its output. You can prevent git log
from displaying them by passing the --no-merges flag:
On the other hand, if you’re only interested in the merge commits, you can use the
--merges flag:
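Both flags are shown in the sketch below, which creates a single merge commit in a throwaway repository. As elsewhere, the file names and commit messages are made up.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "Add the initial code base"
git checkout -q -b feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "Add a new feature"
git checkout -q master
echo fix >> base.txt && git commit -q -am "Fix a typo on master"
git merge -q --no-edit feature   # creates a merge commit

# Everything except merge commits
without_merges=$(git log --no-merges --format=%s)
# Only merge commits
only_merges=$(git log --merges --format=%s)

echo "$only_merges"
```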
Summary
You should now be fairly comfortable using git log’s advanced parameters to
format its output and select which commits you want to display. This gives you the
power to pull out exactly what you need from your project history.
These new skills are an important part of your Git toolkit, but remember that git log
is often used in conjunction with other Git commands. Once you’ve found the commit
you’re looking for, you typically pass it off to git checkout, git revert, or some
other tool for manipulating your commit history. So, be sure to keep on learning
about Git’s advanced features.
Git Hooks
Contents: Conceptual Overview, Local Hooks, Server-side Hooks, Summary
Git hooks are scripts that run automatically every time a particular event occurs in a
Git repository. They let you customize Git’s internal behavior and trigger
customizable actions at key points in the development life cycle.
Common use cases for Git hooks include encouraging a commit policy, altering the
project environment depending on the state of the repository, and implementing
continuous integration workflows. But, since scripts are infinitely customizable, you
can use Git hooks to automate or optimize virtually any aspect of your development
workflow.
In this article, we’ll start with a conceptual overview of how Git hooks work. Then,
we’ll survey some of the most popular hooks for use in both local and server-side
repositories.
Conceptual Overview
All Git hooks are ordinary scripts that Git executes when certain events occur in the
repository. This makes them very easy to install and configure.
Hooks can reside in either local or server-side repositories, and they are only
executed in response to actions in that repository. We’ll take a concrete look at
categories of hooks later in this article. The configuration discussed in the rest of this
section applies to both local and server-side hooks.
Installing Hooks
Hooks reside in the .git/hooks directory of every Git repository. Git automatically
populates this directory with example scripts when you initialize a repository. If you
take a look inside .git/hooks, you’ll find the following files:
applypatch-msg.sample pre-push.sample
commit-msg.sample pre-rebase.sample
post-update.sample prepare-commit-msg.sample
pre-applypatch.sample update.sample
pre-commit.sample
These represent most of the available hooks, but the .sample extension prevents
them from executing by default. To “install” a hook, all you have to do is remove the
.sample extension. Or, if you’re writing a new script from scratch, you can simply add
a new file matching one of the above filenames, minus the .sample extension.
#!/bin/sh
echo "# Please include a useful commit message!" > "$1"
Hooks need to be executable, so you may need to change the file permissions of the
script if you’re creating it from scratch. For example, to make sure that
prepare-commit-msg is executable, you would run the following command:
chmod +x prepare-commit-msg
You should now see this message in place of the default commit message every time
you run git commit. We’ll take a closer look at how this actually works in the Prepare
Commit Message section. For now, let’s just revel in the fact that we can customize
some of Git’s internal functionality.
The built-in sample scripts are very useful references, as they document the
parameters that are passed in to each hook (they vary from hook to hook).
Scripting Languages
The built-in scripts are mostly shell and Perl scripts, but you can use any scripting
language you like as long as it can be run as an executable. The shebang line (
#!/bin/sh) in each script defines how your file should be interpreted. So, to use a
different language, all you have to do is change it to the path of your interpreter.
#!/usr/bin/env python
import sys, os
commit_msg_filepath = sys.argv[1]
with open(commit_msg_filepath, 'w') as f:
    f.write("# Please include a useful commit message!")
Notice how the first line changed to point to the Python interpreter. And, instead of
using $1 to access the first argument passed to the script, we used sys.argv[1]
(again, more on this in a moment).
This is a very powerful feature for Git hooks because it lets you work in whatever
language you’re most comfortable with.
Scope of Hooks
Hooks are local to any given Git repository, and they are not copied over to the new
repository when you run git clone. And, since hooks are local, they can be altered
by anybody with access to the repository.
This has an important impact when configuring hooks for a team of developers. First,
you need to find a way to make sure hooks stay up-to-date amongst your team
members. Second, you can’t force developers to create commits that look a certain
way—you can only encourage them to do so.
Maintaining hooks for a team of developers can be a little tricky because the
.git/hooks directory isn’t cloned with the rest of your project, nor is it under version
control. A simple solution to both of these problems is to store your hooks in the
actual project directory (above the .git directory). This lets you edit them like any
other version-controlled file. To install the hook, you can either create a symlink to it
in .git/hooks, or you can simply copy and paste it into the .git/hooks directory
whenever the hook is updated.
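The symlink approach might look something like the sketch below. It assumes a top-level hooks directory inside the project (the directory name is our choice, not a Git convention) and reuses the simple prepare-commit-msg script from earlier, all inside a throwaway repository.

```shell
# Throwaway repository; the "hooks" directory name is hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Mary" && git config user.email "mary@example.com"

# Keep the hook in the project directory so it can be version-controlled.
mkdir hooks
cat > hooks/prepare-commit-msg <<'EOF'
#!/bin/sh
echo "# Please include a useful commit message!" > "$1"
EOF
chmod +x hooks/prepare-commit-msg
git add hooks && git commit -q -m "Add shared hooks"

# "Install" the hook by linking it into .git/hooks. The link target is
# resolved relative to .git/hooks, hence the ../../ prefix.
mkdir -p .git/hooks
ln -s ../../hooks/prepare-commit-msg .git/hooks/prepare-commit-msg

tracked=$(git ls-files hooks)
ls -l .git/hooks/prepare-commit-msg
```

Because the installed hook is only a link, pulling an updated hooks/prepare-commit-msg updates the installed hook as well. Each developer still has to create the link once in their own clone.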
That said, it is possible to reject commits that do not conform to some standard
using server-side hooks. We’ll talk more about this later in the article.
Local Hooks
Local hooks affect only the repository in which they reside. As you read through this
section, remember that each developer can alter their own local hooks, so you can’t
use them as a way to enforce a commit policy. They can, however, make it much
easier for developers to adhere to certain guidelines.
pre-commit
prepare-commit-msg
commit-msg
post-commit
post-checkout
pre-rebase
The first 4 hooks let you plug into the entire commit life cycle, and the final 2 let you
perform some extra actions or safety checks for the git checkout and git rebase
commands, respectively.
All of the pre- hooks let you alter the action that’s about to take place, while the
post- hooks are used only for notifications.
We’ll also see some useful techniques for parsing hook arguments and requesting
information about the repository using lower-level Git commands.
Pre-Commit
The pre-commit script is executed every time you run git commit before Git asks the
developer for a commit message or generates a commit object. You can use this
hook to inspect the snapshot that is about to be committed. For example, you may
want to run some automated tests that make sure the commit doesn’t break any
existing functionality.
#!/bin/sh
# Check if this is the initial commit
if git rev-parse --verify HEAD >/dev/null 2>&1
then
    echo "pre-commit: About to create a new commit..."
    against=HEAD
else
    echo "pre-commit: About to create the first commit..."
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi
# Use git diff-index to check for whitespace errors
echo "pre-commit: Testing for whitespace errors..."
if ! git diff-index --check --cached $against
then
    echo "pre-commit: Aborting commit due to whitespace errors"
    exit 1
else
    echo "pre-commit: No whitespace errors :)"
    exit 0
fi
In order to use git diff-index, we need to figure out which commit reference we’re
comparing the index to. Normally, this is HEAD; however, HEAD doesn’t exist when
creating the initial commit, so our first task is to account for this edge case. We do
this with git rev-parse --verify, which simply checks whether or not the argument
( HEAD) is a valid reference. The >/dev/null 2>&1 portion silences any output from
git rev-parse. Either HEAD or the ID of an empty tree object is stored in the against
variable for use with git diff-index. The 4b825d... hash is the well-known ID of
Git’s empty tree object, which stands in for a commit that contains no files.
This is just one example of the pre-commit hook. It happens to use existing Git
commands to run tests on the changes introduced by the proposed commit, but you
can do anything you want in pre-commit including executing other scripts, running a
3rd-party test suite, or checking code style with Lint.
Prepare Commit Message
The prepare-commit-msg hook is called after the pre-commit hook to populate the
text editor with a commit message. This is a good place to alter the automatically
generated commit messages for squashed or merged commits. The hook is passed one
to three arguments:
1. The name of a temporary file that contains the message. You change the
commit message by altering this file in-place.
2. The type of commit. This can be message ( -m or -F option), template ( -t
option), merge (if the commit is a merge commit), or squash (if the commit is
squashing other commits).
3. The SHA1 hash of the relevant commit. Only given if -c, -C, or --amend option
was given.
We already saw a simple example that edited the commit message, but let’s take a
look at a more useful script. When using an issue tracker, a common convention is to
address each issue in a separate branch. If you include the issue number in the
branch name, you can write a prepare-commit-msg hook to automatically include it in
each commit message on that branch.
#!/usr/bin/env python
import sys, os, re
from subprocess import check_output

# Collect the parameters
commit_msg_filepath = sys.argv[1]
if len(sys.argv) > 2:
    commit_type = sys.argv[2]
else:
    commit_type = ''
if len(sys.argv) > 3:
    commit_hash = sys.argv[3]
else:
    commit_hash = ''
print("prepare-commit-msg: File: %s\nType: %s\nHash: %s" %
      (commit_msg_filepath, commit_type, commit_hash))

# Figure out which branch we're on (decode the output so it is text, not bytes)
branch = check_output(['git', 'symbolic-ref', '--short', 'HEAD']).decode('utf-8').strip()
print("prepare-commit-msg: On branch '%s'" % branch)

# Populate the commit message with the issue #, if there is one
if branch.startswith('issue-'):
    print("prepare-commit-msg: Oh hey, it's an issue branch.")
    result = re.match('issue-(.*)', branch)
    issue_number = result.group(1)
    with open(commit_msg_filepath, 'r+') as f:
        content = f.read()
        f.seek(0, 0)
        f.write("ISSUE-%s %s" % (issue_number, content))
First, the above prepare-commit-msg hook shows you how to collect all of the
parameters that are passed to the script. Then, it calls
git symbolic-ref --short HEAD to get the branch name that corresponds to HEAD. If
this branch name starts with issue-, it re-writes the commit message file contents to
include the issue number in the first line. So, if your branch name is issue-224, this
will generate the following commit message.
ISSUE-224
# Please enter the commit message for your changes. Lines
# with '#' will be ignored, and an empty message aborts th
# On branch issue-224
# Changes to be committed:
# modified: test.txt
One thing to keep in mind when using prepare-commit-msg is that it runs even when
the user passes in a message with the -m option of git commit. This means that the
above script will automatically insert the ISSUE-[#] string without letting the user
edit it. You can handle this case by checking whether the second parameter
(commit_type) is equal to message.
However, without the -m option, the prepare-commit-msg hook does allow the user
to edit the message after it's generated, so this is really more of a convenience script
than a way to enforce a commit message policy. For that, you need the commit-msg
hook discussed in the next section.
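One way to handle the -m case described above is to consult commit_type before touching the message file. The following is only a minimal sketch, not part of the original hook: the helper name prefixed_message is hypothetical, and it assumes you want to leave messages supplied with -m or -F completely untouched.

```python
import re

def prefixed_message(branch, commit_type, content):
    """Return the text that the hook should write to the message file.

    Hypothetical helper: the message is left alone when it came from
    -m/-F (commit_type == 'message') or when the branch is not an issue
    branch; otherwise ISSUE-<n> is prepended.
    """
    if commit_type == 'message':
        return content
    match = re.match(r'issue-(.+)', branch)
    if match is None:
        return content
    return "ISSUE-%s %s" % (match.group(1), content)
```

In the hook itself you would pass in the branch name and commit_type collected earlier, and write the returned string back to the commit message file.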
Commit Message
The commit-msg hook is much like the prepare-commit-msg hook, but it’s
called after the user enters a commit message. This is an appropriate place to warn
developers that their message doesn’t adhere to your team’s standards.
The only argument passed to this hook is the name of the file that contains the
message. If it doesn’t like the message that the user entered, it can alter this file in-
place (just like with prepare-commit-msg) or it can abort the commit entirely by
exiting with a non-zero status.
For example, the following script checks to make sure that the user didn’t delete the
ISSUE-[#] string that was automatically generated by the prepare-commit-msg hook
in the previous section.
#!/usr/bin/env python3
import sys, os, re
from subprocess import check_output
# Collect the parameters
commit_msg_filepath = sys.argv[1]
# Figure out which branch we're on
branch = check_output(['git', 'symbolic-ref', '--short', 'HEAD']).decode().strip()
print("commit-msg: On branch '%s'" % branch)
# Check the commit message if we're on an issue branch
if branch.startswith('issue-'):
    print("commit-msg: Oh hey, it's an issue branch.")
    result = re.match('issue-(.*)', branch)
    issue_number = result.group(1)
    required_message = "ISSUE-%s" % issue_number
    with open(commit_msg_filepath, 'r') as f:
        content = f.read()
        if not content.startswith(required_message):
            print("commit-msg: ERROR! The commit message must start with '%s'" % required_message)
            sys.exit(1)
While this script is called every time the user creates a commit, you should avoid
doing much outside of checking the commit message. If you need to notify other
services that a snapshot was committed, you should use the post-commit hook
instead.
Post-Commit
The post-commit hook is called immediately after the commit-msg hook. It can’t
change the outcome of the git commit operation, so it’s used primarily for
notification purposes.
The script takes no parameters and its exit status does not affect the commit in any
way. For most post-commit scripts, you’ll want access to the commit that was just
created. You can use git rev-parse HEAD to get the new commit’s SHA1 hash, or you
can use git log -1 HEAD to get all of its information.
For example, if you want to email your boss every time you commit a snapshot
(probably not the best idea for most workflows), you could add the following
post-commit hook.
#!/usr/bin/env python3
import smtplib
from email.mime.text import MIMEText
from subprocess import check_output
# Get the git log --stat entry of the new commit
log = check_output(['git', 'log', '-1', '--stat', 'HEAD']).decode()
# Create a plaintext email message
msg = MIMEText("Look, I'm actually doing some work:\n\n%s" % log)
msg['Subject'] = 'Git post-commit hook notification'
msg['From'] = 'mary@example.com'
msg['To'] = 'boss@example.com'
# Send the message
SMTP_SERVER = 'smtp.example.com'
SMTP_PORT = 587
session = smtplib.SMTP(SMTP_SERVER, SMTP_PORT)
session.ehlo()
session.starttls()
session.ehlo()
session.login(msg['From'], 'secretPassword')
session.sendmail(msg['From'], msg['To'], msg.as_string())
session.quit()
It’s possible to use post-commit to trigger a local continuous integration system, but
most of the time you’ll want to be doing this in the post-receive hook. This runs on
the server instead of the user’s local machine, and it also runs every
time any developer pushes their code. This makes it a much more appropriate place
to perform your continuous integration.
Post-Checkout
The post-checkout hook works a lot like the post-commit hook, but it’s called
whenever you successfully check out a reference with git checkout. This is nice for
clearing out your working directory of generated files that would otherwise cause
confusion.
This hook accepts three parameters, and its exit status has no effect on the
git checkout command.
1. The ref of the previous HEAD
2. The ref of the new HEAD
3. A flag telling you if it was a branch checkout or a file checkout. The flag will
be 1 and 0, respectively.
A common problem with Python developers occurs when generated .pyc files stick
around after switching branches. The interpreter sometimes uses these .pyc instead
of the .py source file. To avoid any confusion, you can delete all .pyc files every
time you check out a new branch using the following post-checkout script:
#!/usr/bin/env python3
import sys, os, re
from subprocess import check_output
# Collect the parameters
previous_head = sys.argv[1]
new_head = sys.argv[2]
is_branch_checkout = sys.argv[3]
if is_branch_checkout == "0":
    print("post-checkout: This is a file checkout. Nothing to do.")
    sys.exit(0)
print("post-checkout: Deleting all '.pyc' files in working directory")
for root, dirs, files in os.walk('.'):
    for filename in files:
        ext = os.path.splitext(filename)[1]
        if ext == '.pyc':
            os.unlink(os.path.join(root, filename))
The current working directory for hook scripts is always set to the root of the
repository, so the os.walk('.') call iterates through every file in the repository.
Then, we check its extension and delete it if it’s a .pyc file.
You can also use the post-checkout hook to alter your working directory based on
which branch you have checked out. For example, you might use a plugins branch
to store all of your plugins outside of the core codebase. If these plugins require a
lot of binaries that other branches do not, you can selectively build them only when
you’re on the plugins branch.
Pre-Rebase
The pre-rebase hook is called before git rebase changes anything, making it a
good place to make sure something terrible isn’t about to happen.
This hook takes 2 parameters: the upstream branch that the series was forked from,
and the branch being rebased. The second parameter is empty when rebasing the
current branch. To abort the rebase, exit with a non-zero status.
For example, if you want to completely disallow rebasing in your repository, you
could use the following pre-rebase script:
#!/bin/sh
# Disallow all rebasing
echo "pre-rebase: Rebasing is dangerous. Don't do it."
exit 1
Now, every time you run git rebase, you’ll see this message:
pre-rebase: Rebasing is dangerous. Don't do it.
The pre-rebase hook refused to rebase.
Server-side Hooks
Server-side hooks work just like local ones, except they reside in server-side
repositories (e.g., a central repository, or a developer’s public repository). When
attached to the official repository, some of these can serve as a way to enforce policy
by rejecting certain commits.
There are 3 server-side hooks that we’ll be discussing in the rest of this article:
pre-receive
update
post-receive
All of these hooks let you react to different stages of the git push process.
The output from server-side hooks is piped to the client’s console, so it’s very easy
to send messages back to the developer. But, you should also keep in mind that
these scripts don’t return control of the terminal until they finish executing, so you
should be careful about performing long-running operations.
Pre-Receive
The pre-receive hook is executed every time somebody uses git push to push
commits to the repository. It should always reside in the remote repository that is the
destination of the push, not in the originating repository.
The hook runs before any references are updated, so it’s a good place to enforce any
kind of development policy that you want. If you don’t like who is doing the pushing,
how the commit message is formatted, or the changes contained in the commit, you
can simply reject it. While you can’t stop developers from making malformed
commits, you can prevent these commits from entering the official codebase by
rejecting them with pre-receive.
The script takes no parameters, but each ref that is being pushed is passed to the
script on a separate line on standard input in the following format:
<old-value> <new-value> <ref-name>
You can see how this hook works using a very basic pre-receive script that simply
reads in the pushed refs and prints them out.
#!/usr/bin/env python3
import sys
import fileinput
# Read in each ref that the user is trying to update
for line in fileinput.input():
    print("pre-receive: Trying to push ref: %s" % line)
# Abort the push
# sys.exit(1)
Again, this is a little different than the other hooks because information is passed to
the script via standard input instead of as command-line arguments. After placing
the above script in the .git/hooks directory of a remote repository and pushing the
master branch, you’ll see something like the following in your console:
b6b36c697eb2d24302f89aa22d9170dfe609855b 85baa88c22b52ddd2
You can use these SHA1 hashes, along with some lower-level Git commands, to
inspect the changes that are going to be introduced. Some common use cases
include:
Checking that the user has the correct permissions to make the intended
changes (mostly used for centralized Git workflows)
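Before a pre-receive hook can perform checks like these, it has to split each line of standard input into its three fields. The following is a minimal sketch of that step; the names RefUpdate and parse_ref_updates are hypothetical, and the sketch assumes each line consists of the old value, the new value, and the ref name separated by single spaces.

```python
from collections import namedtuple

# One pushed ref: the commit it pointed to before, the commit it will
# point to after the push, and the full ref name.
RefUpdate = namedtuple("RefUpdate", "old new ref")

def parse_ref_updates(lines):
    """Turn the lines a pre-receive hook reads into RefUpdate tuples."""
    updates = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        old, new, ref = line.split(" ", 2)
        updates.append(RefUpdate(old, new, ref))
    return updates
```

In a real hook you would call parse_ref_updates(sys.stdin) and then inspect each update with lower-level Git commands before deciding whether to exit with a non-zero status.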
Update
The update hook is called after pre-receive, and it works much the same way. It’s
still called before anything is actually updated, but it’s called separately for each ref
that was pushed. That means if the user tries to push 4 branches, update is executed
4 times. Unlike pre-receive, this hook doesn’t need to read from standard input.
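Instead, it accepts the following 3 arguments:
1. The name of the ref being updated
2. The old object name stored in the ref
3. The new object name stored in the ref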
This is the same information passed to pre-receive, but since update is invoked
separately for each ref, you can reject some refs while allowing others.
#!/usr/bin/env python3
import sys
branch = sys.argv[1]
old_commit = sys.argv[2]
new_commit = sys.argv[3]
print("Moving '%s' from %s to %s" % (branch, old_commit, new_commit))
# Abort pushing only this branch
# sys.exit(1)
The above update hook simply outputs the branch and the old/new commit hashes.
When pushing more than one branch to the remote repository, you’ll see the print
statement execute for each branch.
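Because update runs once per ref, it can reject one ref and accept the rest. The sketch below illustrates the idea with a hypothetical policy that is not from the original article: it refuses to delete a protected branch. It relies on the fact that Git passes an all-zero object name as the new value when a ref is being deleted; the function name and the protected list are assumptions made for illustration.

```python
# 40 zeros: the object name Git uses for "no commit" with SHA-1 hashes
ZERO = "0" * 40

def update_exit_status(ref_name, old_commit, new_commit,
                       protected=("refs/heads/master",)):
    """Return the exit status for an update hook: 1 rejects this ref, 0 accepts it."""
    if ref_name in protected and new_commit == ZERO:
        return 1  # someone is trying to delete a protected branch
    return 0
```

A hook built on this would finish with sys.exit(update_exit_status(sys.argv[1], sys.argv[2], sys.argv[3])), so a rejected deletion does not block the other branches in the same push.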
Post-Receive
The post-receive hook gets called after a successful push operation, making it a
good place to perform notifications. For many workflows, this is a better place to
trigger notifications than post-commit because the changes are available on a public
server instead of residing only on the user’s local machine. Emailing other developers
and triggering a continuous integration system are common use cases for
post-receive.
The script takes no parameters, but is sent the same information as pre-receive via
standard input.
Summary
In this article, we learned how Git hooks can be used to alter internal behavior and
receive notifications when certain events occur in a repository. Hooks are ordinary
scripts that reside in the .git/hooks directory, which makes them very easy to
install and customize.
We also looked at some of the most common local and server-side hooks. These let
us plug in to the entire development life cycle. We now know how to perform
customizable actions at every stage in the commit creation process, as well as the
git push process. With a little bit of scripting knowledge, this lets you do virtually
anything you can imagine with a Git repository.
Refs and the Reflog
Git is all about commits: you stage commits, create commits, view old commits, and
transfer commits between repositories using many different Git commands. The
majority of these commands operate on a commit in some form or another, and
many of them accept a commit reference as a parameter. For example, you can use
git checkout to view an old commit by passing in a commit hash, or you can use it
to switch branches by passing in a branch name.
By understanding the many ways to refer to a commit, you make all of these
commands that much more powerful. In this chapter, we’ll shed some light on the
internal workings of common commands like git checkout, git branch, and
git push by exploring the many methods of referring to a commit.
We’ll also learn how to revive seemingly “lost” commits by accessing them through
Git’s reflog mechanism.
Hashes
The most direct way to reference a commit is via its SHA-1 hash. This acts as the
unique ID for each commit. You can find the hash of all your commits in the git log
output.
commit 0c708fdec272bc4446c6cabea4f0022c2b616eba
Author: Mary Johnson <mary@example.com>
Date: Wed Jul 9 16:37:42 2014 -0500
Some commit message
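When passing the commit to other Git commands, you only need to specify enough
characters to uniquely identify the commit. For example, you can inspect the above
commit with git show by running the following command:
git show 0c708f
It’s sometimes necessary to resolve a branch, tag, or other indirect reference into
the corresponding commit hash. For this, you can use the git rev-parse command. For
example, the following returns the hash of the commit pointed to by the master branch:
git rev-parse master
This is particularly useful when writing custom scripts that accept a commit
reference. Instead of parsing the commit reference manually, you can let
git rev-parse normalize the input for you.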
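The idea that a short prefix is enough to identify a commit can be sketched in a few lines. This is only an illustration of the matching rule, not how Git implements it internally; the function name resolve_prefix and the list of known hashes are hypothetical.

```python
def resolve_prefix(prefix, known_hashes):
    """Return the single full hash that starts with prefix.

    Raises KeyError when nothing matches and ValueError when the prefix
    is too short to identify exactly one commit.
    """
    prefix = prefix.lower()
    matches = [h for h in known_hashes if h.startswith(prefix)]
    if not matches:
        raise KeyError("no commit matches %r" % prefix)
    if len(matches) > 1:
        raise ValueError("prefix %r is ambiguous (%d matches)" % (prefix, len(matches)))
    return matches[0]
```

The longer the history, the more characters you typically need before the prefix becomes unique.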
Refs
Refs are stored as normal text files in the .git/refs directory of your repository.
To explore the refs in one of your repositories, navigate to .git/refs.
You should see the following structure, but it will contain different files depending on
what branches, tags, and remotes you have in your repo:
.git/refs/
heads/
master
some-feature
remotes/
origin/
master
tags/
v0.9
The heads directory defines all of the local branches in your repository. Each
filename matches the name of the corresponding branch, and inside the file you’ll
find a commit hash. This commit hash is the location of the tip of the branch. To
verify this, try running the following two commands from the root of the Git
repository:
cat .git/refs/heads/master
git log -1 master
The commit hash returned by the cat command should match the commit ID
displayed by git log.
To change the location of the master branch, all Git has to do is change the contents
of the refs/heads/master file. Similarly, creating a new branch is simply a matter of
writing a commit hash to a new file. This is part of the reason why Git branches are
so lightweight compared to SVN.
The tags directory works the exact same way, but it contains tags instead of
branches. The remotes directory lists all remote repositories that you created with
git remote as separate subdirectories. Inside each one, you’ll find all the remote
branches that have been fetched into your repository.
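Since these refs are plain text files, you can read them with ordinary file operations. The following is a minimal sketch under the assumption that the refs have not been packed into a single file; the function names are hypothetical, and git_dir stands for the path to your .git directory.

```python
import os

def read_loose_ref(git_dir, ref_name):
    """Return the commit hash stored in a ref file such as refs/heads/master."""
    path = os.path.join(git_dir, *ref_name.split("/"))
    with open(path) as ref_file:
        return ref_file.read().strip()

def list_local_branches(git_dir):
    """Return the branch names found as files under refs/heads."""
    heads = os.path.join(git_dir, "refs", "heads")
    names = []
    for root, _dirs, files in os.walk(heads):
        for filename in files:
            full_path = os.path.join(root, filename)
            # Branch names always use forward slashes, whatever the OS
            names.append(os.path.relpath(full_path, heads).replace(os.sep, "/"))
    return sorted(names)
```

For example, read_loose_ref('.git', 'refs/heads/master') returns the hash at the tip of the master branch when that file exists.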
Specifying Refs
When passing a ref to a Git command, you can either define the full name of the ref,
or use a short name and let Git search for a matching ref. You should already be
familiar with short names for refs, as this is what you’re using each time you refer to
a branch by name.
git show some-feature
The some-feature argument in the above command is actually a short name for the
branch. Git resolves this to refs/heads/some-feature before using it. You can also
specify the full ref on the command line, like so:
git show refs/heads/some-feature
This avoids any ambiguity regarding the location of the ref. This is necessary, for
instance, if you had both a tag and a branch called some-feature. However, if you’re
using proper naming conventions, ambiguity between tags and branches shouldn’t
generally be a problem.
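To see why a tag and a branch with the same name are ambiguous, it helps to look at the order in which a short name is expanded. The sketch below follows the search order described in Git's revision documentation; the function name and the set of existing refs are hypothetical, and individual commands such as git checkout may apply their own preferences on top of this.

```python
# Candidate locations for a short name, tried from first to last
SEARCH_ORDER = (
    "%s",
    "refs/%s",
    "refs/tags/%s",
    "refs/heads/%s",
    "refs/remotes/%s",
    "refs/remotes/%s/HEAD",
)

def expand_short_name(name, existing_refs):
    """Return the first full ref name that exists for name, or None."""
    for pattern in SEARCH_ORDER:
        candidate = pattern % name
        if candidate in existing_refs:
            return candidate
    return None
```

Note that tags are tried before branches in this order, which is one more reason to pass the full ref name when both exist.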
Packed Refs
For large repositories, Git will periodically perform a garbage collection to remove
unnecessary objects and compress refs into a single file for more efficient
performance. You can force this compression with the garbage collection command:
git gc
This moves all of the individual branch and tag files in the refs folder into a single
file called packed-refs located in the top of the .git directory. If you open up this
file, you’ll find a mapping of commit hashes to refs:
00f54250cf4e549fdfcafe2cf9a2c90bc3800285 refs/heads/featur
0e25143693cfe9d5c2e83944bbaf6d3c4505eb17 refs/heads/master
bb883e4c91c870b5fed88fd36696e752fb6cf8e6 refs/tags/v0.9
On the outside, normal Git functionality won’t be affected in any way. But, if you’re
wondering why your .git/refs folder is empty, this is where the refs went.
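If a script needs to look up refs after they have been packed, it can read this file directly. The following is a minimal sketch, assuming the file uses the hash-then-ref layout shown above, with header lines that start with # and extra lines that start with ^ for annotated tags; the function name is hypothetical.

```python
def parse_packed_refs(text):
    """Return a dict that maps each ref name to its commit hash."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the '# pack-refs with:' header, and the
        # '^<hash>' lines that follow annotated tags.
        if not line or line.startswith("#") or line.startswith("^"):
            continue
        commit_hash, ref_name = line.split(" ", 1)
        refs[ref_name] = commit_hash
    return refs
```

A complete lookup would check the individual files under .git/refs first and fall back to packed-refs only when no such file exists.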
Special Refs
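In addition to the refs directory, there are a few special refs that reside in the
top-level .git directory. They are listed below:
HEAD – The currently checked-out commit or branch.
FETCH_HEAD – The most recently fetched branch from a remote repo.
ORIG_HEAD – A backup reference to HEAD before drastic changes to it.
MERGE_HEAD – The commit(s) that you’re merging into the current branch with
git merge.
CHERRY_PICK_HEAD – The commit that you’re cherry-picking.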
These refs are all created and updated by Git when necessary. For example, the
git pull command first runs git fetch, which updates the FETCH_HEAD reference.
Then, it runs git merge FETCH_HEAD to finish pulling the fetched branches into the
repository. Of course, you can use all of these like any other ref, as I’m sure you’ve
done with HEAD.
These files contain different content depending on their type and the state of your
repository. The HEAD ref can contain either a symbolic ref, which is simply a
reference to another ref instead of a commit hash, or a commit hash. For example,
take a look at the contents of HEAD when you’re on the master branch:
cat .git/HEAD
This will output ref: refs/heads/master, which means that HEAD points to the
refs/heads/master ref. This is how Git knows that the master branch is currently
checked out. If you were to switch to another branch, the contents of HEAD would be
updated to reflect the new branch. But, if you were to check out a commit instead of
a branch, HEAD would contain a commit hash instead of a symbolic ref. This is how
Git knows that it’s in a detached HEAD state.
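The two possible states of HEAD are easy to tell apart in a script. The following is a minimal sketch that classifies the contents of the HEAD file; the function name and the returned labels are hypothetical.

```python
def describe_head(head_contents):
    """Classify the text stored in .git/HEAD.

    Returns ('symbolic', <ref name>) when HEAD points at a branch, or
    ('detached', <commit hash>) when it holds a commit hash directly.
    """
    text = head_contents.strip()
    marker = "ref: "
    if text.startswith(marker):
        return ("symbolic", text[len(marker):])
    return ("detached", text)
```

You could feed it the result of reading .git/HEAD to decide whether a lower-level script is running on a branch or in a detached HEAD state.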
For the most part, HEAD is the only reference that you’ll be using directly. The others
are generally only useful when writing lower-level scripts that need to hook into Git’s
internal workings.
Refspecs
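A refspec maps a branch in the local repository to a branch in a remote repository.
It is written as [+]<src>:<dst>, where <src> is the source branch in the local
repository and <dst> is the destination branch in the remote repository. The
optional + sign forces the remote repository to perform a non-fast-forward update.
Refspecs can be used with the git push command to give a different name to the
remote branch. For example, the following command pushes the master branch to
the origin remote repo like an ordinary git push, but it uses qa-master as the
name for the branch in the origin repo. This is useful for QA teams that need to
push their own branches to a remote repo.
git push origin master:refs/heads/qa-master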
You can also use refspecs for deleting remote branches. This is a common situation
for feature-branch workflows that push the feature branches to a remote repo (e.g.,
for backup purposes). The remote feature branches still reside in the remote repo
after they are deleted from the local repo, so you get a build-up of dead feature
branches as your project progresses. You can delete them by pushing a refspec that
has an empty <src> parameter, like so:
git push origin :some-feature
By adding a few lines to the Git configuration file, you can use refspecs to alter the
behavior of git fetch. By default, git fetch fetches all of the branches in the
remote repository. The reason for this is the following section of the .git/config file:
[remote "origin"]
url = https://github.com/mary/example-repo.git
fetch = +refs/heads/*:refs/remotes/origin/*
The fetch line tells git fetch to download all of the branches from the origin repo.
But, some workflows don’t need all of them. For example, many continuous
integration workflows only care about the master branch. To fetch only the master
branch, change the fetch line to match the following:
[remote "origin"]
url = https://github.com/mary/example-repo.git
fetch = +refs/heads/master:refs/remotes/origin/master
You can also configure git push in a similar manner. For instance, if you want to
always push the master branch to qa-master in the origin remote (as we did
above), you would change the config file to:
[remote "origin"]
url = https://github.com/mary/example-repo.git
fetch = +refs/heads/master:refs/remotes/origin/master
push = refs/heads/master:refs/heads/qa-master
Refspecs give you complete control over how various Git commands transfer
branches between repositories. They let you rename and delete branches from your
local repository, fetch/push to branches with different names, and configure
git push and git fetch to work with only the branches that you want.
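To make the mapping performed by a refspec concrete, the following sketch splits a refspec into its parts and applies it to a ref name. It is a simplified illustration rather than Git's own matching logic: the function names are hypothetical, and only a single trailing * wildcard is handled.

```python
def parse_refspec(refspec):
    """Split '[+]<src>:<dst>' into (force, src, dst)."""
    force = refspec.startswith("+")
    if force:
        refspec = refspec[1:]
    src, _, dst = refspec.partition(":")
    return force, src, dst

def map_ref(refspec, ref_name):
    """Return the destination ref for ref_name, or None if it does not match."""
    _, src, dst = parse_refspec(refspec)
    if src.endswith("*"):
        prefix = src[:-1]
        if not ref_name.startswith(prefix):
            return None
        if dst.endswith("*"):
            # Carry the part matched by the wildcard over to the destination
            return dst[:-1] + ref_name[len(prefix):]
        return dst
    return dst if ref_name == src else None
```

With the default fetch line, every branch under refs/heads/ is mapped to a remote-tracking branch, while the master-only variant leaves every other branch unmatched.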
Relative Refs
You can also refer to commits relative to another commit. The ~ character lets you
reach parent commits. For example, the following displays the grandparent of HEAD:
git show HEAD~2
But, when working with merge commits, things get a little more complicated. Since
merge commits have more than one parent, there is more than one path that you
can follow. For 3-way merges, the first parent is from the branch that you were on
when you performed the merge, and the second parent is from the branch that you
passed to the git merge command.
The ~ character will always follow the first parent of a merge commit. If you want to
follow a different parent, you need to specify which one with the ^ character. For
example, if HEAD is a merge commit, the following returns the second parent of HEAD.
git show HEAD^2
You can use more than one ^ character to move more than one generation. For
instance, this displays the grandparent of HEAD (assuming it’s a merge commit) that
rests on the second parent.
git show HEAD^2^1
To clarify how ~ and ^ work, the following figure shows you how to reach any
commit from A using relative references. In some cases, there are multiple ways to
reach a commit.
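The rules for ~ and ^ can also be expressed as a small traversal over a commit graph. The sketch below is an illustration only: the graph is a hypothetical dictionary that maps each commit name to its list of parents (first parent first), and the function name is made up for this example.

```python
import re

def resolve_relative(expr, parents):
    """Resolve a relative ref such as 'A~2' or 'A^2^1'.

    parents maps each commit name to its list of parents, first parent first.
    """
    match = re.match(r"([^~^]+)((?:[~^]\d*)*)$", expr)
    if match is None:
        raise ValueError("cannot parse %r" % expr)
    commit = match.group(1)
    for operator, digits in re.findall(r"([~^])(\d*)", match.group(2)):
        count = int(digits) if digits else 1
        if operator == "~":
            # ~n: follow the first parent n times
            for _ in range(count):
                commit = parents[commit][0]
        elif count > 0:
            # ^n: jump to the n-th parent (^0 leaves the commit unchanged)
            commit = parents[commit][count - 1]
    return commit
```

In a graph where A is a merge of B and C, A~1 and A^1 both give B, while A^2 gives C, which matches the way ~ always follows the first parent.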
Relative refs can be used with the same commands that a normal ref can be used.
For example, all of the following commands use a relative reference:
The Reflog
The reflog is Git’s safety net. It records almost every change you make in your
repository, regardless of whether you committed a snapshot or not. You can think of
it as a chronological history of everything you’ve done in your local repo. To view the
reflog, run the git reflog command. It should output something that looks like the
following:
The HEAD@{<n>} syntax lets you reference commits stored in the reflog. It works a lot
like the HEAD~<n> references from the previous section, but the <n> refers to an entry
in the reflog instead of the commit history.
You can use this to revert to a state that would otherwise be lost. For example, let’s
say you just scrapped a new feature with git reset. Your reflog might look
something like this:
The three commits before the git reset are now dangling, which means that there
is no way to reference them—except through the reflog. Now, let’s say you realize
that you shouldn’t have thrown away all of your work. All you have to do is check out
the HEAD@{1} commit to get back to the state of your repository before you ran
git reset.
git checkout HEAD@{1}
This puts you in a detached HEAD state. From here, you can create a new branch and
continue working on your feature.
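To illustrate how HEAD@{<n>} indexes into the reflog, the sketch below looks up an entry in the text of a reflog file. It assumes the log for HEAD is stored in .git/logs/HEAD with one entry per line (old hash, new hash, author details, then a tab and the message) and with the newest entry last; the function name and the sample data in the usage note are hypothetical.

```python
def reflog_lookup(log_text, n):
    """Return (commit hash, message) for HEAD@{n}.

    HEAD@{0} is the most recent entry, which is the last line of the file.
    """
    entries = [line for line in log_text.splitlines() if line.strip()]
    line = entries[-1 - n]
    fields, _, message = line.partition("\t")
    old_hash, new_hash = fields.split(" ")[:2]
    # The state of HEAD after the recorded action is the new hash
    return new_hash, message
```

After a git reset, HEAD@{0} is the reset itself and HEAD@{1} is the commit you were on just before it, which is why checking out HEAD@{1} recovers the discarded work.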
Summary
We also took a look at the reflog, which is a way to reference commits that are not
available through any other means. This is a great way to recover from those little
“Oops, I shouldn’t have done that” situations.
The point of all this was to be able to pick out exactly the commit that you need in
any given development scenario. It’s very easy to leverage the skills you learned in
this article against your existing Git knowledge, as some of the most common
commands accept refs as arguments, including git log, git show, git checkout,
git reset, git revert, git rebase, and many others.
Git LFS
Git LFS does this by replacing large files in your repository with tiny pointer files.
During normal usage, you'll never see these pointer files as they are handled
automatically by Git LFS:
1. When you add a file to your repository, Git LFS replaces its contents with a
pointer, and stores the file contents in a local Git LFS cache.
2. When you push new commits to the server, any Git LFS files referenced by the
newly pushed commits are transferred from your local Git LFS cache to the
remote Git LFS store tied to your Git repository.
3. When you check out a commit that contains Git LFS pointers, they are replaced
with files from your local Git LFS cache, or downloaded from the remote Git
LFS store.
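The pointer files mentioned in these steps are small text files. The sketch below builds and reads one, assuming the pointer layout from the Git LFS specification (a version line, an oid line with a SHA-256 hash of the content, and a size line in bytes); the function names are hypothetical and this is not the Git LFS client's own code.

```python
import hashlib

def make_pointer(content):
    """Build the pointer text that stands in for the given file content (bytes)."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:%s\n"
        "size %d\n" % (oid, len(content))
    )

def parse_pointer(text):
    """Return (oid, size) read from a pointer file's text."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    oid = fields["oid"].split(":", 1)[1]
    return oid, int(fields["size"])
```

Because the pointer is only a few lines long no matter how large the real file is, the repository itself stays small while the content lives in the Git LFS cache and remote store.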
Git LFS is seamless: in your working copy you'll only see your actual file content. This
means you can use Git LFS without changing your existing Git workflow; you simply
git checkout, edit, git add, and git commit as normal. git clone and git pull
operations will be significantly faster as you only download the versions of large files
referenced by commits that you actually check out, rather than every version of the
file that ever existed.
To use Git LFS, you will need a Git LFS aware host such as Bitbucket
Cloud or Bitbucket Server. Repository users will need to have the Git LFS command-
line client installed, or a Git LFS aware GUI client such as Sourcetree. Fun fact: Steve
Streeting, the Atlassian developer who invented Sourcetree, is also a major
contributor to the Git LFS project, so Sourcetree and Git LFS work together rather
well.
c. Install Sourcetree, a free Git GUI client that comes bundled with Git LFS.
2. Once git-lfs is on your path, run git lfs install to initialize Git LFS (you can skip
this step if you installed Sourcetree):
You'll only need to run git lfs install once. Once initialized for your system,
Git LFS will bootstrap itself automatically when you clone a repository
containing Git LFS content.
# initialize Git
$ mkdir Atlasteroids
$ cd Atlasteroids
$ git init
Initialized empty Git repository in /Users/tpettersen/Atla
# initialize Git LFS
$ git lfs install
Updated pre-push hook.
Git LFS initialized.
This installs a special pre-push Git hook in your repository that will transfer Git LFS
files to the server when you git push.
There are four PNGs in this repository being tracked by Git LFS. When running git
clone, Git LFS files are downloaded one at a time as pointer files are checked out of
your repository.
Speeding up clones
If you're cloning a repository with a large number of LFS files, the explicit
git lfs clone command offers far better performance:
Rather than downloading Git LFS files one at a time, the git lfs clone command
waits until the checkout is complete, and then downloads any required Git LFS files
as a batch. This takes advantage of parallelized downloads, and dramatically reduces
the number of HTTP requests and processes spawned (which is especially important
for improving performance on Windows).
$ git pull
Updating 4784e9d..7039f0a
Downloading Assets/Sprites/powerup.png (21.14 KB)
Fast-forward
Assets/Sprites/powerup.png | 3 +
Assets/Sprites/powerup.png.meta | 4133 +++++++++++++++++++
2 files changed, 4136 insertions(+)
create mode 100644 Assets/Sprites/projectiles-spritesheet.
create mode 100644 Assets/Sprites/projectiles-spritesheet.
No explicit commands are needed to retrieve Git LFS content. However, if the
checkout fails for an unexpected reason, you can download any missing Git LFS
content for the current commit with git lfs pull:
Speeding up pulls
Like git lfs clone, git lfs pull downloads your Git LFS files as a batch. If you
know a large number of files have changed since the last time you pulled, you may
wish to disable the automatic Git LFS download during checkout, and then batch
download your Git LFS content with an explicit git lfs pull. This can be done by
overriding your Git config with the -c option when you invoke git pull:
Since that's rather a lot of typing, you may wish to create a simple Git alias to
perform a batched Git and Git LFS pull for you:
This will greatly improve performance when a large number of Git LFS files need to
be downloaded (again, especially on Windows).
The patterns supported by Git LFS are the same as those supported by .gitignore,
for example:
These patterns are relative to the directory in which you ran the git lfs track
command. To keep things simple, it is best to run git lfs track from the root of
your repository. Note that Git LFS does not support negative patterns like .gitignore
does.
After running git lfs track, you'll notice a new file named .gitattributes in the
directory you ran the command from. .gitattributes is a Git mechanism for
binding special behaviors to certain file patterns. Git LFS automatically creates or
updates .gitattributes files to bind tracked file patterns to the Git LFS filter.
However, you will need to commit any changes to the .gitattributes file to your
repository yourself:
For ease of maintenance, it is simplest to keep all Git LFS patterns in a single
.gitattributes file by always running git lfs track from the root of your
repository. However, you can display a list of all patterns that are currently tracked
by Git LFS (and the .gitattributes files they are defined in) by invoking
git lfs track with no arguments:
You can stop tracking a particular pattern with Git LFS by simply removing the
appropriate line from your .gitattributes file, or by running the git lfs untrack
command:
After running git lfs untrack you will again have to commit the changes to
.gitattributes yourself.
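To make the role of .gitattributes concrete, the sketch below lists the patterns that a .gitattributes file binds to the Git LFS filter. It assumes each tracked pattern appears on its own line followed by attributes that include filter=lfs, which is the form git lfs track writes; the function name is hypothetical, and patterns containing spaces are not handled.

```python
def lfs_patterns(gitattributes_text):
    """Return the file patterns that are bound to the Git LFS filter."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns
```

Removing a pattern's line from the file, by hand or with git lfs untrack, makes it disappear from this list, which is why the change still has to be committed.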
$ git push
Git LFS: (3 of 3 files) 4.68 MB / 4.68 MB
Counting objects: 8, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 1.16 KiB | 0 bytes/s, done.
Total 8 (delta 1), reused 0 (delta 0)
To git@bitbucket.org:tpettersen/atlasteroids.git
7039f0a..b3684d3 master -> master
If transferring the LFS files fails for some reason, the push will be aborted and you
can safely try again. Like Git, Git LFS storage is content addressable: content is stored
against a key which is a SHA-256 hash of the content itself. This means it is always
safe to re-attempt transferring Git LFS files to the server; you can't accidentally
overwrite a Git LFS file's contents with the wrong version.
For example, to move an entire Git and Git LFS repository from a remote named github to
a remote named bitbucket 😉:
This is useful for batch downloading new Git LFS content while you're out at lunch,
or if you're planning on reviewing work from your teammates and will not be able to
download content later on due to limited internet connectivity. For example, you
may wish to run git lfs fetch --recent before jumping on a plane!
Git LFS considers any branch or tag containing a commit newer than seven days as
recent. You can configure the number of days considered as recent by setting the
lfs.fetchrecentrefsdays property:
By default, git lfs fetch --recent will only download Git LFS content for the
commit at the tip of a recent branch or tag.
However you can configure Git LFS to download content for earlier commits on
recent branches and tags by configuring the lfs.fetchrecentcommitsdays property:
Use this setting with care: if you have fast moving branches, this can result in
a huge amount of data being downloaded. However, it can be useful if you need to
review interstitial changes on a branch, cherry-pick commits across branches, or
rewrite history.
As discussed in Moving a Git LFS repository between hosts, you can also elect to
fetch all Git LFS content for your repository with git lfs fetch --all:
This will delete any local Git LFS files that are considered old. An old file is any
file not referenced by:
a commit that has not yet been pushed (to origin, or whatever
lfs.pruneremotetocheck is set to)
a recent commit
Unlike Git's built-in garbage collection, Git LFS content is not pruned automatically,
so running git lfs prune on a regular basis is a good idea to keep your local
repository size down.
You can test out what effect a prune operation will have with
git lfs prune --dry-run:
The long hexadecimal strings output by --verbose mode are SHA-256 hashes (also
known as Object IDs, or OIDs) of the Git LFS objects to be pruned. You can use the
techniques described in Finding paths or commits that reference a Git LFS object to
find out more about the objects that will be pruned.
Or you can enable remote verification for just the current repository by omitting the
--global option from the command above.
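The command referred to is not shown in this copy. As a sketch, and assuming the property in question is lfs.pruneverifyremotealways from the Git LFS configuration, the per-repository form would look like this (the temporary repository is only for illustration):

```shell
# Throwaway repository, used only for this illustration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# No --global: the setting is written to this repository's .git/config only.
git config lfs.pruneverifyremotealways true

# --local reads back only the repository-level value.
git config --local --get lfs.pruneverifyremotealways
```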
In Bitbucket Cloud, you can view and delete Git LFS files via Repository Settings >
Git LFS:
Note that each Git LFS file is indexed by its SHA-256 OID; the paths that reference
each file are not visible via the UI. This is because there could be many different
paths at many different commits that may refer to a given object, so looking them
up would be a very slow process.
To determine what a given Git LFS file actually contains, you have three options:
look at the file preview image and file type in the left hand column of the
Bitbucket Git LFS UI
download the file using the link in the right hand column of the Bitbucket Git
LFS UI
search for commits referencing the Git LFS object's SHA-256 OID, as
discussed in the next section
This git log incantation generates a patch (-p) from commits on any branch (--all)
that add or remove a line (-S) containing the specified string (a Git LFS SHA-256
OID).
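Here is a small, self-contained sketch of that search. The OID, the path Assets/sprite.png, and the commit message are made up for illustration, and the pointer file is written by hand so the example runs even without Git LFS installed.

```shell
# Throwaway repository with a local identity so the commit succeeds.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical OID: 64 hexadecimal characters, not a real object.
oid=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef

# Hand-written file in the Git LFS pointer format.
mkdir -p Assets
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize 12\n' "$oid" > Assets/sprite.png
git add Assets/sprite.png
git commit -q -m "Add sprite"

# Search every branch for commits that add or remove the OID, with patches.
git log --all -p -S "$oid"
```

The patch in the output names the path of the pointer file, which is the path the Git LFS object is checked out to.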
The patch shows you the commit and the path to the LFS object, as well as who
added it, and when it was committed. You can simply check out the commit, and Git
LFS will download the file if needed and place it in your working copy.
If you suspect that a particular Git LFS object is in your current HEAD, or on a
particular branch, you can use git grep to find the file path that references it:
You can replace HEAD or power-ups with any ref, commit, or tree that contains the Git
LFS object.
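The git grep commands themselves are missing from this copy, so the following is a sketch of the idea. The OID and the file path are invented for illustration, and the pointer file is written by hand so that Git LFS is not required; power-ups is the branch name used in the text.

```shell
# Throwaway repository with a local identity so the commit succeeds.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical OID: 64 hexadecimal characters, not a real object.
oid=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef

# Hand-written file in the Git LFS pointer format.
mkdir -p Assets
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize 12\n' "$oid" > Assets/sprite.png
git add Assets/sprite.png
git commit -q -m "Add sprite"
git branch power-ups

# Find the path that references the OID in the current HEAD...
git grep "$oid" HEAD
# ...or on a particular branch.
git grep "$oid" power-ups
```

Each match is printed as ref:path:matching line, so the file path sits between the first two colons.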
You can exclude a pattern or subdirectory using git lfs fetch -X (or --exclude):
Alternatively, you may want to only include a particular pattern or subdirectory. For
example, an audio engineer could fetch just ogg and wav files with git lfs fetch -I
(or --include):
If you combine includes and excludes, only files that match an include pattern and do
not match an exclude pattern will be fetched. For example, you can fetch everything
in your Assets directory except gifs with:
$ git lfs fetch -I "Assets/**" -X "*.gif"
Excludes and includes support the same patterns as git lfs track and .gitignore.
You can make these patterns permanent for a particular repository by setting the
lfs.fetchinclude and lfs.fetchexclude config properties:
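As a sketch, the Assets example from above could be made permanent like this. The patterns are reused from that example, and the temporary repository is an assumption that keeps the snippet self-contained.

```shell
# Throwaway repository, used only for this illustration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Fetch everything under Assets, but never gifs.
git config lfs.fetchinclude "Assets/**"
git config lfs.fetchexclude "*.gif"

# Show both settings.
git config --get-regexp '^lfs\.fetch'
```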
These settings can also be applied to every repository on your system by adding
the --global option to the git config command.
Until then, the best way to avoid merge conflicts is to communicate with team
members before making changes to a binary file that they are likely to be modifying
at the same time as you.
In order to understand the effects of git prune we need to simulate a scenario
where a commit becomes unreachable. The following is a sequence of command line
executions that will simulate this experience.
~ $ mkdir git-prune-demo
~ $ cd git-prune-demo/
~/git-prune-demo $ git init .
Initialized empty Git repository in /Users/kev/Dropbox/git
~/git-prune-demo $ echo "hello git prune" > hello.txt
~/git-prune-demo $ git add hello.txt
~/git-prune-demo $ git commit -am "added hello.txt"
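The transcript above only shows the first commit. The second one, referred to later by its message "added another line to hello.txt", would be created along these lines. This is a sketch: the text appended to hello.txt and the temporary directory standing in for ~/git-prune-demo are assumptions.

```shell
# Temporary directory standing in for ~/git-prune-demo.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"

# First commit, as in the transcript.
echo "hello git prune" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"

# Second commit: append a line (the exact text is an assumption) and commit.
echo "this is a second line" >> hello.txt
git commit -q -am "added another line to hello.txt"

git log --oneline
```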
We now have a 2 commit history in this demo repo. We can verify by using git log:
commit 994b122045cf4bf0b97139231b4dd52ea2643c7e
Author: kevzettler <kevzettler@gmail.com>
Date: Sun Sep 30 09:43:41 2018 -0700
added hello.txt
The git log output displays the 2 commits and corresponding commit messages
about the edits made to hello.txt. The next step is for us to make one of the
commits unreachable. We will do this by utilizing the git reset command. We reset
the state of the repo to the first commit, the "added hello.txt" commit.
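The reset could be sketched as follows. The repository is rebuilt in a temporary directory so the snippet runs on its own, and the variable names first and second are assumptions introduced to hold the two commit hashes.

```shell
# Rebuild the two-commit demo repository in a temporary directory.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello git prune" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"
echo "this is a second line" >> hello.txt
git commit -q -am "added another line to hello.txt"

# Remember both commits, then move the branch back to the first one.
second=$(git rev-parse HEAD)
first=$(git rev-parse HEAD~1)
git reset --hard "$first"

# Only the first commit is listed now...
git log --oneline
# ...but the second commit object still exists in the repository.
git cat-file -t "$second"
```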
If we now use git log to examine the state of the repository we can see that we
only have one commit:
added hello.txt
The demo repository is now in a state that contains a detached commit. The second
commit we made with the message "added another line to hello.txt" is no longer
displayed in the git log output and is now detached. It may appear as though we
have lost or deleted the commit, but Git is very strict about not deleting history. We
can confirm it is still available, but detached, by using git checkout to visit it directly:
You are in 'detached HEAD' state. You can look around, mak
changes and commit them, and you can discard any commits y
state without impacting any branches by performing another
commit 994b122045cf4bf0b97139231b4dd52ea2643c7e
Author: kevzettler <kevzettler@gmail.com>
Date: Sun Sep 30 09:43:41 2018 -0700
added hello.txt
When we check out the detached commit, Git is thoughtful enough to give us a
detailed message explaining that we are in a detached state. If we examine the log
here we can see that the "added another line to hello.txt" commit is now back in the
log output! Now that we know the repository is in a good simulation state with a
detached commit we can practice using git prune. First though, let us return to the
master branch using git checkout:
When returning to master via git checkout, Git is again thoughtful enough to let us
know that we are leaving a detached commit behind. It's now time to prune the
detached commit! Next, we will execute git prune but we must be sure to pass
some options to it. --dry-run and --verbose will display output indicating what is
set to be pruned but not actually prune it.
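A self-contained sketch of this dry run follows. It rebuilds the demo repository in a temporary directory and skips the detached checkout, which does not affect the result.

```shell
# Rebuild the demo repository and detach the second commit.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello git prune" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"
echo "this is a second line" >> hello.txt
git commit -q -am "added another line to hello.txt"
git reset -q --hard HEAD~1

# Report what would be pruned, without deleting anything.
git prune --dry-run --verbose

# The reflog still records the second commit.
git reflog
```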
This command will most likely return empty output. Empty output implies that the
prune will not actually delete anything. Why would this happen? Well, the commit is
most likely not fully detached. Somewhere Git is still maintaining a reference to it.
This is a prime example of why git prune is not to be used stand-alone outside of
git gc. This is also a good example of how it is hard to fully lose data with Git.
Most likely Git is storing a reference to our detached commit in the reflog. We can
investigate by running git reflog. You should see some output describing the
sequence of actions we took to get here. For more info on git reflog visit the
git reflog page. In addition to preserving history in the reflog, Git has internal
expiration dates on when it will prune detached commits. Again, these are all
implementation details that git gc handles and git prune should not be used
standalone.
The above command will force expire all entries in the reflog that are older than
now. This is a brutal and dangerous command that you should never have to use as
a casual Git user. We are executing this command to demonstrate a successful
git prune. With the reflog totally wiped we can now execute git prune.
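The two steps can be sketched together as follows. The reflog expire flags shown are one way to wipe every entry and are an assumption about the command the text refers to; the repository is a temporary copy of the demo, so nothing of value is lost. Do not run the expire step on a repository you care about.

```shell
# Rebuild the demo repository and detach the second commit.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello git prune" > hello.txt
git add hello.txt
git commit -q -m "added hello.txt"
first=$(git rev-parse HEAD)
echo "this is a second line" >> hello.txt
git commit -q -am "added another line to hello.txt"
second=$(git rev-parse HEAD)
git reset -q --hard "$first"

# Dangerous: expire every reflog entry, reachable or not, for all refs.
git reflog expire --expire=now --expire-unreachable=now --all

# With no reflog entry left, the dry run lists the detached objects.
git prune --dry-run --verbose

# Actually remove them.
git prune --verbose
```

After the final command the second commit can no longer be checked out, while the first commit is untouched.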
This command should output a list of Git SHA object references that looks like the
above.
Usage
git prune has a short list of options that we covered in the overview section.
-n --dry-run
--progress
--expire <time>
<head>…
Discussion
The git prune command is intended to be invoked as a child command to git gc. It
is highly unlikely you will ever need to invoke git prune in a day to day software
engineering capacity. Other commands are needed to understand the effects of
git prune. Some commands used in this article were git log, git reflog, and
git checkout.