Managing Condor Source Trees With Git

See also: RandomGitWisdom

Add this to your path: /unsup/git/bin

Gitweb, think bonsai, is available at:

Managing HTCondor source, manual and externals trees with GIT

This document explains common source code management tasks for HTCondor, and how you can accomplish them with the git tools.

As of August 2011, HTCondor's source and manual are in one repository. The source and manual reside in /p/condor/repository/CONDOR_SRC.git ,

Preliminary setup

There are a few one-time tasks you must do before actively develop with git.

Adding git to your path

To use git, you don't need to set any environment variables, but you must add git to your path. We're using (as of January 2011) a newer version than is installed by default on the currently-supported CSL Linux machinse. Just add /unsup/git/bin to the beginning of your path. You may also choose to add /s/tcl-8.4.5/bin to your path so you can run the git GUI (aptly named git-gui).

Tell git some things about you

To make patches and diffs a bit more self-documenting, you should now tell git your email address and common name. We also fix the renamelimit option to allow git to look farther when merging old or very complex branches.

$ git config --global "Your Name"
$ git config --global ""
$ git config --global diff.renamelimit 100000
# Season the following to taste
$ git config --global core.editor "/usr/ucb/vi"

This edits the per-user git config file that is global across all repos. git config edits the ~/.gitconfig file , which is an .ini style file.

The following will make "git push" not automatically push all branches that have commits in them, which can be unexpected:

git config --global push.default nothing

On Windows you must run:

git config --global core.autocrlf false

This stops git on Windows from changing the line endings. Without this as you clone onto a windows box files are modified when the clone is made. One of the most visible effects of this is patches failing

Cloning the main HTCondor source repository

The first thing you need to do is to clone the main git source repo. This gives you a whole copy of all the HTCondor source code and its history on your machine. As of February 2, 2009, CONDOR_SRC.git is about 339 MB in size.

$ cd /p/condor/workspaces/your_login
$ git clone /p/condor/repository/CONDOR_SRC.git

Note that /p/condor/workspaces/your_login/CONDOR_SRC/src contains the latest checked out HTCondor source. The metadata for the whole tree is stored in /p/condor/workspaces/your_login/CONDOR_SRC/.git . The files under /p/condor/workspaces/your_login/CONDOR_SRC , known as your workspace, will change as you switch from branch to branch.

If your machine doesn't have AFS, you can have git use ssh to interact with the main repo. Your clone command would look like this, for <name> use a CSL maintained computer you can log into. HTCondor staff can use or Students can use first floor Linux lab machines like or

$ git clone ssh://<name>

The main source repository now includes the documentation sources as well.

Create names for the stable and development branches in your local repo

Git differentiates between branches in your local repo and those in remote repos, for us that is the central HTCondor repo. In addition to creating entirely local branches, git lets you create local branches that track remote branches. Tracking means when something changes on the remote branch you can have those changes reflected in your repository. Having a local branch that tracks a remote branch is also the best way to commit changes to a remote branch. In general, HTCondor developers should always keep local branches that track the stable and development branches in the central HTCondor repository. Assuming the stable series is 7.0, and the development is 7.1, issue the following commands:

$ cd /p/condor/workspaces/your_login/CONDOR_SRC

# Create a local V7_0-branch to track the central V7_0-branch
$ git branch V7_0-branch origin/V7_0-branch

# You will need to use your "master" branch as a V7_1-branch, due to
# git limitations

You are now good to go.  It wouldn't hurt to clean up any unreachable objects nows, it doesn't take long, and it makes the repo smaller and faster.

$ git gc --prune

Getting the latest upstream code

There are two ways to update your local repository. Which one to use depends on whether you have committed changes in the meantime.

No changes: pull

The command to update your local repository is:

git pull

It will pull down the latest repository information from the origin remote file (which points at where you initially cloned the repository from), then merge. If the new changes don't conflict with anything locally, the merge will "fast-forward" your branch reference to the new head. If new changes do conflict, it will try to do an automatic merge using a couple of different schemes, and then automatically commits the merge if it's satisfied. If you notice that your pull resulted in a merge (in the output of pull), you might want to use gitk to see if the merge did what you expected it to. If it isn't satisfied, then you have to resolve the conflict manually.

You've made changes: fetch and rebase

If you have committed local changes, then git-pull will create a merge commit, which will pollute the change list when you later push upstream. You can avoid seeing these changes with a git log --no-merges command, or you can "rebase" your local changes.

git fetch
git rebase origin/some_branch
Absolutely do not replace this character with a space!

Instead of merging, this attempts to insert the changes you pull from the origin repository before your local changes, avoiding the merge message. You need to be sure to rebase on the proper branch. You should also be sure your changes are truly local; i.e., you have not pushed any of your local changes to another repository, and that no one has pulled from you. For instance, if you are on your V7_0-branch and want to rebase against changes in origin/V7_0-branch you issue git rebase origin/V7_0-branch .

You can do

git remote set-head origin -d
to set up your repository so that git rebase origin some_branch will be an error rather than doing the rebase operation. The problem is that origin gets interpreted in this context as origin/HEAD which usually points to master . So you will be rebasing onto master, which is almost certainly not what you want, especially if the branch you are rebasing is something like V7_7_1-branch !

Working on a single person project

The most common HTCondor use case is working on a feature all by yourself. We can easily share this work later with others if need be, so don't worry about whether to choose this approach or the team approach described later.

First, you'll want to make a branch for this work, so you'll need to decide where to branch from, either the development or stable series. Let's say that you want to branch from the 7.0 branch, named V7_0-branch . You'll need to name the branch, too. This is a local name, so you need not worry about name collisions. Let's call the branch GreatFeature . For historical reasons (cough CVS), all of our released branches end in -branch, but there's no need to keep that convention for local branches.

# Checkout your local 7.0 branch, the one that tracks the central
# repo's V7_0-branch, and merge in all the latest changes from the
# central repository. The "git pull" is analogous to a "cvs update".
$ git checkout V7_0-branch
$ git pull

# Make the local branch, which by default will be off the your current
# location, which happens to be the end of the V7_0-branch
$ git branch GreatFeature

# Then checkout the local branch
$ git checkout GreatFeature

# If you forgot where you were:
$ git status

# Go larval.  Hack.

# Tell git about files you've added
$ git add great_feature.C

# What have I done?
$ git diff

# When your feature gets to a save point, do a local commit.  Note
# "-a" means all changed files.  Make sure this is what you want. VERY
# IMPORTANT: You will be prompted for a commit message.  The first
# line of the commit is frequently displayed as the "name" of this
# commit, so make the first line a good summary

$ git commit -a

# Hack some more, Commit again

$ git commit -a

# Crap.  What did I just do?  Show me the diff that I just committed:

$ git diff HEAD^

# I really messed up src/, remove all my changes

$ git checkout HEAD src/

# What was I thinking?  Remove all traces of the last commit:

$ git reset --hard HEAD^

# My favorite command
$ git blame path/to/file

# All done, ready to merge
# Final commit to working branch:

$ git commit -a

# switch to merge-to branch
$ git checkout V7_0-branch
# pull it to make it up-to-date
$ git pull
# And do the merge.  All files merged cleanly are committed to the
# local V7_0-branch
$ git merge GreatFeature

# And, while you are still on the V7_0-branch, send your changes to
# the central shared repo:

$ git push origin V7_0-branch

# Since we are finished with the GreatFeature branch, let's delete it.
# The -d means validate that the branch was merged fully into HEAD.

$ git branch -d GreatFeature

Making a single person project into a multiple person project

Taking your local project and making it available to others via the central repository involves creating a new public branch. See GitNewBranchRules for details on this.

Working on a project shared with more than one person

The second most common HTCondor use case is working with someone else on a shared branch, which happens to be what the stable and development branches are. This might be the most common use case simply because making branches has been complicated in the past.

If you are going to be using a shared branch that already exists you can skip the branch setup steps and move right on to the steps for working on the branch, which are amazingly similar to what you do when working on your own local branch.

Create the branch and do bookkeeping

Before going any further, you need to make the branch do some bookkeeping. See GitNewBranchRules for details on this.

Work on a branch someone else created

# Get the latest repository information
$ git fetch

# Make a local copy of the public branch
$ git branch V7_1-GreatFeature-branch origin/V7_1-GreatFeature-branch

# Check it out.
$ git checkout V7_1-GreatFeature-branch

# Hack hack hack

# Commit your changes, these only go to your local copy of the branch
$ git commit -a

# Hack some more

# Commit new changes

# Now you're ready to send your changes to the central repository for
# others working on the V7_1-GreatFeature-branch to see

# Make sure you are up to date on the branch, like "cvs update"
$ git pull

# Fix any conflicts and commit
$ git commit -a

# Push your changes to the central repo
$ git push origin V7_1-GreatFeature-branch

Creating a new development or stable series

The process to create a new development or stable series, i.e. new branches, is just like the process for creating branches for a share project (above). The only difference is how you decide on the branch's name. Instead of using the formula the includes some meaningful name as part of the branch name, you just update the version number. See GitNewBranchRules for details on this.

Migrating old changes from a previous CVS workspace into GIT

Migrations should have already been done, but just in case any are still lingering, this documention is now at MigratingCvsWorkspaceToGit .

Merging code from the stable branch to the development branch

Merging is a very common operation with git, and should be something that you end up doing fairly often. It is not something that you should be wary of. The steps for merging the stable branch into the development branch are essentially the same as for merging any two branches together, except it is likely one of the few times where you may run into conflicts.

To start out, be sure you have a copy of both the development branch and stable branch in your repository, steps for creating them are in Preliminary setup . The next thing you want to do is make sure you have the most recent copies of the branches you are tracking from the central repository.

# Fetch all updates for the stable, development and any other branches you are tracking from the central repository
$ git fetch

Once git fetch finishes you want to checkout the branch you are merging to, which in this case is the development branch.

# Checkout the development branch, it is the destination of the merge
$ git checkout master

Before proceeding be sure that you do not have any uncommitted changes on the master. If there are changes either copy them to a separate branch for the time being (to be described sometime) or push them to the central repository. Generally, you will not be developing much on master anyway. You will probably do development on a feature branch, which you merge into the master and push to the central repository.

Git has a number of merging algorithms, but, generally, we do not care about any except the default algorithm. There are numerous resources online and git manual pages that explain the different approaches, but the default is more than enough for us.

# Perform a merge of V7_0-branch into master, remember we're on the master
$ git merge V7_0-branch

Often the merge will have no conflicts, but lets go over what you need to do when you end up with conflicts. First, some much needed background about how git works. When you have a branch checked out in your workspace git maintains what it calls an index . The index is essentially a record of what will go into the next commit you make. You can think of it as a record of what you have changed in a workspace, including added and removed files. In CVS, the index is split over the CVS/Entries file and the arguments you pass to cvs commit . The CVS/Entires file keeps track of added and removed files while cvs commit a b c explicitly names the files, a b c, that have changed for the commit. When you work with git in day to day development you might not be aware of the index, but commands like git add and git commit -a work with it for you. Even git diff is actually giving you a diff of what is in the index and what is in your workspace. git diff HEAD gives you a diff since the last commit. Also, git status tells you about the state of the index, listing files that have changed and are included in the next commit along with files that are untracked and files that are changed but not included in the next commit. When performing a merge you want to be aware of what is in the merge's index.

$ git status

# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#       deleted:    Distfile
#       deleted:    Imakefile
#       modified:   NOTICE.txt
#       deleted:    nmi_tools/analysis/README
#       modified:   src/CVS_Tags
#       modified:   src/condor_tests/Imakefile
#       modified:   src/condor_tests/
#       new file:   src/condor_tests/
#       new file:   src/condor_tests/
#       new file:   src/condor_tests/
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#       unmerged:   nmi_glue/build/post_all
#       modified:   nmi_glue/build/post_all
#       unmerged:   nmi_glue/submit_info
#       modified:   nmi_glue/submit_info
#       unmerged:   nmi_glue/test/platform_post
#       modified:   nmi_glue/test/platform_post
#       unmerged:   src/CVS_Tags
#       modified:   src/CVS_Tags
#       unmerged:   src/condor_scripts/
#       modified:   src/condor_scripts/
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#       nmi_glue/test/post_all~HEAD

The output above has been slightly abbreviated. There are three sections and each is important to understand. The first is the status of the index. It lists all the changes the merge was able to make for you automatically. The second section contains a list of files where conflicts occurred. If you've done stable->development merges in the past you are likely familiar with some of these conflicts, namely src/CVS_Tags . The third section of the output is a list of files that the index knows nothing about. In this example it is just one file that the merge created for us.

Now it is time to go through the Changes to be committed: list and make sure you want everything in it to happen. Keep in mind that the modified: files are those where the merge was successful, but you can still check to make sure they look good.

# If you don't want to delete nmi_tools/analysis/README, it is marked
# for deletion because it existed in master but not V7_0-branch
$ git checkout master nmi_tools/analysis/README

# If you want to delete a file that is marked as a "new file"
$ git rm src/condor_tests/

Next you should process the files with conflicts, those in the Changed but not updated: list.

# Edit src/CVS_Tags and resolve the conflict
$ emacs src/CVS_Tags

# If there wasn't enough info in the file you can try
$ git diff

# Or get a copy of the original file from each branch
$ mkdir tmp71
$ git archive master src/CVS_Tags | tar xvC tmp71
$ mkdir tmp70
$ git archive V7_0-branch src/CVS_Tags | tar xvC tmp70
$ less tmp71/src/CVS_Tags

# Once you have resolved the conflicts in src/CVS_Tags add it back
# into the commit

$ git add src/CVS_Tags

Once you resolve all conflicts you'll see the once conflicted files listed as modified: in the index, check with git status . At this point you are ready to commit your merge and push it to the central repository. After running it through NMI, of course.

# Commit merge
$ git commit -m "Merged 7.0 source into 7.1"

# Make sure the new 7.1 branch builds and tests

# Push the merge into the central repository for all to see
$ git push origin master

Checking out an older revision of HTCondor

To check out an older revision, suppose HTCondor 7.1.4, one does this:

$ # git checkout -b <temporary branch name> <tag name>
$ git checkout -b V7_1_4 V7_1_4

The above command creates a temporary branch named V7_1_4 and you will automatically be sitting on that branch. You can now inspect or build the code.

To delete the branch, checkout a different branch, then:

$ git branch -D V7_1_4