Skip to content

Latest commit

 

History

History
233 lines (174 loc) · 9.39 KB

Readme.md

File metadata and controls

233 lines (174 loc) · 9.39 KB

Up To Schedule - Back To Plan for Mistakes - Forward To Mobility: Using Version Control at Work and Home

Github and Remote Version Control


Based on material by Katy Huff, Anthony Scopatz, Sri Hari Krishna Narayanan, and Matt Gidden

GitHub.com?

GitHub is a site where many people store their open (and closed) source code repositories. It provides tools for browsing, collaborating on and documenting code. Your home institution may have a repository hosting system of it's own. To find out, ask your system administrator. GitHub, much like other forge hosting services (launchpad, bitbucket, googlecode, sourceforge etc.) provides :

  • landing page support
  • wiki support
  • network graphs and time histories of commits
  • code browser with syntax highlighting
  • issue (ticket) tracking
  • user downloads
  • varying permissions for various groups of users
  • commit triggered mailing lists
  • other service hooks (twitter, etc.)

NOTE Public repos have public licenses by default. If you don't want to share (in the most liberal sense) your stuff with the world, you can make a private repo. While that often costs money on Github, they now have education discounts.

GitHub password

Setting up GitHub requires a GitHub user name and password. Please take a moment to create a free GitHub account (if you want an education discount or to start paying, you can add that to your account some other day).

git remote : Steps for Forking a Repository

From Github:

Creating a fork is producing a personal copy of someone else's project. A fork acts as a bridge between the original repository and your personal copy. You can submit Pull Requests to help make other people's projects better by offering your changes to the original project. Forking is at the core of social coding at GitHub.

So, forking a repository on Github gives you a copy of that repository in your Github account. cloning a repository gives you a copy of that repository on your local machine. In order to connect your local copy and the versions on Github, we tell your local copy that they exist, via git remote.

The git remote command allows you to add, name, rename, list, and delete repositories such as the original one upstream from your fork, others that may be parallel to your fork, and so on.

We'll be continuing our testing exercises using GitHub as the online repository, so you'll need to start off by getting a copy of that repository to work on.

Exercise : Fork Our GitHub Repository

Step 0 : Clean Up Your Local Space

We'll be interacting with remote repositories now, so let's clean up the simplestats folder on your machine.

$ cd
$ rm -r simplestats

Or if you'd like to keep it around

$ cd
$ mv simplestats old-simplestats

Step 1 : Go to our repository from your browser, and click on the Fork button. Choose to fork it to your user name rather than any organizations.

Step 2 : Clone it. From your terminal :

$ git clone https://github.com/YOU/simplestats
$ cd simplestats
$ git remote -v
origin  https://github.com/YOU/simplestats (fetch)
origin  https://github.com/YOU/simplestats (push)

Your local repository is now connected to the remote repository using the alias origin.

Step 3 : Add a connection to the common repository :

$ git remote add upstream https://github.com/UW-Madison-ACI/simplestats
$ git remote -v
origin  https://github.com/YOU/simplestats (fetch)
origin  https://github.com/YOU/simplestats (push)
upstream        https://github.com/UW-Madison-ACI/simplestats (fetch)
upstream        https://github.com/UW-Madison-ACI/simplestats (push)

Remote Naming Conventions

All repositories that are clones begin with a remote called origin, by default. The most common convention is clone from your own fork (origin). If your fork is based on someone else's project (e.g., your research group's), you should then add that repository as a remote named upstream.

git fetch : Fetching the contents of a remote

Now that you have connected your repository to the "upstream" original, it is able to pull in updates from that repository. In this case, if you want your master branch to track updates in the original simplestats repository, you simply git fetch that repository into the master branch of your current repository.

$ git fetch upstream

The fetch command alone merely pulls down information recent changes from the original master (upstream) repository. By itself, the fetch command does not change your local working copy. To update your local working copy to include recent changes in the original (upstream) repository, it is necessary to also merge.

git diff : Examine the differences

Now that you have fetched the upstream repo, you can look at the differences between that and your local copy. To see a summary of which files have changed and by how much:

$ git diff --stat upstream/master

To explore the actual changes:

$ git diff upstream/master

git pull : Pull = Fetch + Merge

The command git pull is the same as executing git fetch followed by git merge. Though it is not recommend for cases in which there are many branches to consider, the pull command is shorter and simpler than fetching and merging as it automates the branch matching. Specifically, to perform the same task as we did in the previous exercise, the pull command would be :

$ git pull origin
Already up-to-date.

When there have been remote changes, the pull will apply those changes to your local branch, unless there are conflicts with your local changes.

git push : Sending Your Commits to Remote Repositories

The git push command pushes commits in a local working copy to a remote repository. The syntax is git push [remote] [local branch]. Before pushing, a developer should always pull (or fetch + merge), so that there is an opportunity to resolve conflicts before pushing to the remote.

Exercise: Update your remote to an upstream change

Assume that your lab group collectively works on a project (like simplestats), and someone has updated the master branch (we can simulate that by a helper doing an update -- helpers?).

It is now your job to:

  • get the upstream changes
  • check what files have changed and how
  • apply them to your local repository
  • apply them to your fork

Aside: git rebase vs git merge

To incorporate upstream changes from the original master repository (in this case UW-Madison-ACI/simplestats) into your local working copy, you must do more than simply fetch the changes. After fetching the changes, your local repo know about the upstream changes, but hasn't combined them with any local changes you may have already made. There are two mechanisms for doing this, with slightly different behavior.

The role of git is to keep track of little bundles of change (each commit). In theory, it doesn't matter in what order these changes are applied, it should end up with the same version of the files. In practice, however, you may want to take some control of this. In particular, when you are combining upstream changes into a branch where you are making local changes, it is almost always better to insert all the upstream changes before your local changes, using rebase. This takes each of your commits, since the point at which the two branches began to differ, and replays them at the end of the upstream branch. If there are conflicts, you will be notified and asked to review them manually.

By contract, merge takes each commit from the upstream branch, since the point at which the two branches began to differ, and replays them at the end of your branch. Again, if there are conflicts, you will be notified and asked to review them manually.

There are lots of details to consider when choosing between rebase and merge, but the simplest guidelines are:

  • rebase when incorporating changes from an authoritative upstream repository
  • merge when incorporating changes from a feature branch or collaborator

The process of rebasing/merging may result in conflicts, so pay attention. This is where version control is both at its most powerful and its most complicated.

Assuming that you have a master branch that you are keeping in-sync with an upstream remote and a feature branch that you are keeping up-to-date with your own origin, the general, best-practice workflow is as follows:

$ git checkout master 
$ git fetch upstream         # get upstream updates to your machine
$ git merge upstream/master  # get your local master branch up-to-date
$ git push origin master     # get your remote master branch up-to-date
$ git checkout feature      
$ git rebase upstream/master # put all your feature commits *on top* of the updates
$ git push -f origin master  # forcibly update your remote feature branch

Up To Schedule - Back To Plan for Mistakes - Forward To Mobility: Using Version Control at Work and Home