Are you using Git for the first time and feeling the challenge of the initial Git concepts?
I have mentored a number of people learning Git and have found that a few core concepts are the most important to understand first. After discussing these concepts, most Git learners are able to perform daily work with Git. This post summarizes the concepts and provides some general information to help with learning and understanding Git.
Branch management, alternate commit approaches, multiple remote repos, and more come with time, but only after you understand the basic Git workflow.
This diagram shows the basic actions/commands and the movement of the files/content with Git. Refer to this diagram while reading the sections below.
Concept 1: Repositories
Git has two repository types: local and remote. The local repo is on your computer for only your direct use. The remote repo is typically elsewhere and for your indirect use. Git supports multiple remote repositories.
Typically, we work in teams and need to work on a codebase together. The codebase is on a “central” server and people retrieve files from it and commit to it.
Git refers to the centralized server as a “remote repository”. The remote repo is usually not on your machine and is the one shared by the team. Team members “push” commits to it when they are ready to share their work. While one of your remote repos could be another team member’s local repo, in a corporate environment at least one (or the only!) is typically a Git repo on a server anointed as the central/official repo.
Note that a remote repo is optional. When not sharing code with others, there is technically no need for a remote repo (you may want one for backup or CI). There is also no need for a remote repo if your local repo is considered the central one by all team members (which means your local repo is their remote repo).
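As a minimal sketch of working with no remote repo at all, the following creates a local-only repo in a scratch directory and commits to it. The directory name, file name, and identity are placeholders, and the -b option assumes Git 2.28 or later.

```shell
set -e
# Hypothetical scratch directory so nothing touches your real projects.
work=$(mktemp -d)
cd "$work"

# Create a local repo; no remote is involved.
git init -q -b master solo
cd solo
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "First commit, no remote needed"

# Prints nothing: no remote repo is configured.
git remote -v
# The full history is still available locally.
git log --oneline
```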
The local repo is on your computer and has all the files and their commit history, enabling full diffs, history review, and committing when offline. Having the full repository history locally is one of the key features of a “distributed” version control system (DVCS).
Creating a Local Repo from a Remote Repo
Use the Git clone command to create a local repo with all of the remote repo’s history. Only use this command once to create the local repo from a remote. Git is very conservative about overwriting files – in this case, the clone command will stop with an error message when the directory you specify to create the local repo in is not empty.
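The clone behavior described above can be tried safely with a stand-in remote. This sketch uses a bare repo on the local disk in place of a real server; all directory and file names are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the team's remote repo: a bare repo on the local disk.
git init -q --bare -b master central.git

# Seed the remote with one commit so there is history to clone.
git init -q -b master seed
cd seed
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git push -q "$work/central.git" master
cd "$work"

# Clone once: this creates the directory "project" with the full history.
git clone -q central.git project
git -C project log --oneline

# Cloning into a non-empty directory is refused rather than overwriting files.
mkdir occupied
echo "existing work" > occupied/keep.txt
if git clone -q central.git occupied 2>/dev/null; then
    echo "unexpected: clone succeeded"
else
    echo "clone refused: destination is not empty"
fi
```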
Concept 2: Committing is a Multi-Step Process
Sharing your files with the team is a three-step process in Git:
- Add. This copies new or updated files to the “stage” or “index” (you will see doc and info that use both terms).
- Commit. This copies your staged files to the local repo.
- Push. This copies your files from the local repo to the remote repo (only the changes the remote repo does not have).
1. Add/Stage (Index) All Files to Commit
Git commits files from the “staged files” list, also referred to as “indexed files”. Contrary to most other SCMs, Git requires staging all files before committing, not just new files. To put files into the Git stage area, use either the stage or add command (stage is a synonym for the add command).
The add command is more commonly used than the stage command, but I have found that stage is clearer to most new Git users. New Git users tend to misunderstand the Git add command because they relate it to other SCM add commands, such as those in Subversion and Perforce, which use add solely to place new files under source control. However, it doesn’t take long to remember the Git meaning of add, and most then begin using add as well.
Note that, while the stage command is available from the command line, most Git GUI programs use the word Add for the action, not stage. Even an error message for the stage command mentions add:
$ git stage
Nothing specified, nothing added.
Maybe you wanted to say 'git add .'?
So if you use the stage command instead of add, don’t let the error message confuse you!
$ git add filename.txt
$ git add .
Git commits all staged files together as an atomic commit to the local repo.
$ git commit -m "This is a basic commit message."
Use the push command to share the local repository’s commits with the remote repository. Without any arguments, push uses the configured remote repo for the current branch.
$ git push
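To see the three steps run end to end without a real server, here is a sketch that uses a local bare repo as the stand-in remote. The names central.git, project, and report.txt are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"

git init -q --bare -b master central.git   # stand-in remote repo
git init -q -b master project              # local repo
cd project
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git remote add origin "$work/central.git"

echo "first draft" > report.txt
git add report.txt                  # 1. workspace -> stage
git commit -q -m "Add report"       # 2. stage -> local repo
git push -q -u origin master        # 3. local repo -> remote repo

# After -u has configured the remote for this branch, a plain push is enough.
echo "second draft" >> report.txt
git add report.txt
git commit -q -m "Revise report"
git push -q

# Both repos now point at the same commit.
git rev-parse master
git -C "$work/central.git" rev-parse master
```

The -u option on the first push records origin/master as the upstream for the branch, which is what lets the later push run without arguments.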
Concept 3: File Diffs in Workspace, Stage, and Repo
Files can exist in three locations with Git. The same file can have different content in each location:
- Committed in the repo: the HEAD version, the contents as the file was last committed.
- Staged in the index: edits made or the file removed, added to the index, ready to commit.
- Workspace: Work in progress (usually most files are unchanged, having the same content as the committed version).
The staged and workspace versions are “changes in progress”.
An important consequence is that Git does not auto-update a staged file with additional edits. When you edit a file again after staging it, the staged version does not contain the new edits. You must “git add” the file once more for the staged version to include them. This behavior is useful in several ways, particularly with Git’s ability to stage only some of a file’s changes in preparation for a commit.
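A small experiment makes this visible. The sketch below stages a file, edits it again, and then commits; the repo and file names are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q -b master demo
cd demo
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "line 1" > file.txt
git add file.txt
git commit -q -m "Add file"

echo "line 2" >> file.txt
git add file.txt             # the staged version now has lines 1 and 2
echo "line 3" >> file.txt    # this edit exists only in the workspace

git diff --cached            # committed vs staged: shows "+line 2"
git diff                     # staged vs workspace: shows "+line 3"

git commit -q -m "Add line 2"
git show HEAD:file.txt       # the commit holds lines 1 and 2 only
git status --short           # file.txt is still reported as modified, unstaged
```

The commit captured the staged version only, so "line 3" stays in the workspace until the file is added again.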
While it is more of an advanced Git concept, Git actually tracks file content, not whole files. It recognizes the same content in multiple files, and can track content movement from one file to another. For related features, refer to the add command’s options, such as -i, -p, and -u.
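Of the add options mentioned, -i and -p are interactive, so the sketch below shows only -u, which stages changes to files Git already tracks (including deletions) and leaves new files alone. File names are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q -b master demo
cd demo
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "tracked" > tracked.txt
echo "doomed" > doomed.txt
git add .
git commit -q -m "Baseline"

echo "edited" >> tracked.txt    # modify a tracked file
rm doomed.txt                   # delete a tracked file
echo "new" > untracked.txt      # create a file Git has never seen

git add -u                      # stage changes to tracked files only

# The modification and the deletion are staged; untracked.txt is still
# listed with "??" because -u ignores files that are not yet tracked.
git status --short
```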
The Git diff command options allow comparing between the three locations (and more).
- Specifying no option compares the workspace to the staged/index version.
- Specifying --cached or --staged compares the staged/index version with a committed version.
- Additional diff options include comparing with any two files on disk, comparing two commits, and any file with any commit.
# compare unstaged changes with staged and committed
$ git diff
# compare non-pushed changes (unstaged, staged, committed) with the remote branch's committed/HEAD
# as of the last fetch, so you may want to fetch first
$ git diff origin/master
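The additional comparisons from the list (two commits, a file against an older commit, and two arbitrary files on disk) can be sketched as follows. The repo, file names, and contents are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q -b master demo
cd demo
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "Version 1"
echo "version 2" > file.txt
git add file.txt
git commit -q -m "Version 2"
echo "version 3" > file.txt      # workspace edit, not staged

# Compare two commits.
git diff HEAD~1 HEAD

# Compare the workspace copy of one file with an older commit.
git diff HEAD~1 -- file.txt

# Compare any two files on disk, tracked or not. With --no-index, git diff
# exits with status 1 when the files differ, hence the "|| true" under set -e.
echo "apple" > "$work/a.txt"
echo "banana" > "$work/b.txt"
git diff --no-index "$work/a.txt" "$work/b.txt" || true
```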
Besides experimenting with the diff command, refer to the diff command help for more info.
For beginning use of Git, this section summarizes some basic commands.
Obtaining Files from a Remote Repo
The clone command creates a new local repo from the remote repo. Use this command only once to initially pull the files and history from the remote repo.
The fetch command retrieves commits (and the file updates they contain) from the remote repo that are not yet in your local repo.
The merge command merges the fetched commits, which are now in the local repo, into your current branch and workspace.
By default, the pull command is simply a fetch followed by a merge.
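To compare fetch plus merge with pull, the following sketch sets up a stand-in remote and two clones. The names central.git, alice, and bob are hypothetical, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare -b master central.git   # stand-in remote repo

# Alice creates the first commit and pushes it.
git init -q -b master alice
cd alice
git config user.name "Alice"               # placeholder identity for the demo
git config user.email "alice@example.com"
git remote add origin "$work/central.git"
echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"
git push -q -u origin master

# Bob clones, then Alice pushes a second commit.
cd "$work"
git clone -q central.git bob
cd "$work/alice"
echo "v2" >> file.txt
git add file.txt
git commit -q -m "v2"
git push -q

# Bob updates in two explicit steps.
cd "$work/bob"
git fetch -q                   # local repo gets the commit; workspace unchanged
cat file.txt                   # still only "v1"
git merge -q origin/master     # workspace and branch now include "v2"
cat file.txt

# Alice pushes a third commit; this time Bob uses pull.
cd "$work/alice"
echo "v3" >> file.txt
git add file.txt
git commit -q -m "v3"
git push -q
cd "$work/bob"
git pull -q                    # fetch and merge in one command
cat file.txt
```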
Add / Stage
The add or stage command adds the file in its current state to the Git stage area (this includes a deleted file).
The commit command commits the staged files to the local repo.
Sharing Files to a Remote Repo
The push command updates a remote repo by copying the file updates from the local repo to the remote repo. Note that this is done on a per branch basis.
An advanced usage is push/pull with another person’s repo. Git sees it as just another remote repo. However, this feature allows collaboration before committing (via pushing) to the official remote/central repo.
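As a rough sketch of that kind of collaboration, Bob below adds Alice’s local repo as a second remote and merges a commit she has not yet pushed to the central repo. The names and paths are hypothetical, the teammate’s repo is reached by a local path rather than over a network, and the -b option assumes Git 2.28 or later.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare -b master central.git   # stand-in official repo

git init -q -b master alice
cd alice
git config user.name "Alice"               # placeholder identity for the demo
git config user.email "alice@example.com"
git remote add origin "$work/central.git"
echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"
git push -q -u origin master

cd "$work"
git clone -q central.git bob

# Alice commits an idea locally but does not push it to the central repo.
cd "$work/alice"
echo "alice's idea" >> file.txt
git add file.txt
git commit -q -m "Try an idea"

# Bob treats Alice's repo as just another remote.
cd "$work/bob"
git remote add alice "$work/alice"
git remote -v                  # lists both origin and alice
git fetch -q alice
git merge -q alice/master      # Bob now has Alice's commit

# The central repo still has only the first commit.
git -C "$work/central.git" log --oneline
```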