Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md

CICF Week 4

The goals for the week 4 lab are to:

Create git repositories. Commit changes. Create and merge branches.
Create a GitHub account
Make a pull request

Tutorial

This tutorial will focus on one particular tool discussed in the lecture, the Git distributed version control software. Focusing on this tool makes sense since it is useful for any situation involving sharing text files between people.

We will first set up some preferences for Git. It needs to know our name and email to tag the changes we make. This is what I entered, change the name and email to be yours.

git config --global user.name "Don Brower"
git config --global user.email "[email protected]"

I think the default editor is set to be "nano", but lets make it explicit:

git config --global core.editor nano

Every Git repository has a notion of a "default branch". This is kind of the natural starting point for browsing the code. Most new git repositories use "main" as this branch name. We will make sure that is also configured for us:

git config --global init.defaultBranch main

Git tracks changes to the files inside an entire directory. Lets make a directory to work in. We will make a new Git repository in this directory for us to practice making commits.

mkdir week4-test
cd week4-test
git init

The "git init" command tells Git that we want to track changes in the week4 directory (and any subdirectories). Make a file. It is good for every project to have a README file to contain the project name and a description for it.

nano README.md

Type some text. I typed the following:

Week 4 Test

A repository to test git commands with.

Exit nano by typing CTRL-S to save and then CTRL-X to exit.

Git classifies the files in our directory as tracked or untracked. Tracked files are stored in the repository and have their changes being tracked by Git. Untracked files are not tracked by Git. We can ask Git to tell us the status of the files in our directory compared with the repository (which is currently empty).

git status

It says that we are on the branch "main", that the repository has no commits and that there is an untracked file present. The untracked file list is a good reminder that a new file has been created and we should either track it, or tell Git to ignore it. We will tell Git to track this file:

git add README.md

The git add command just tells Git that we want to include this file in our next commit, but doesn't actually save it to the repository. The git commit actually saves files to the repository.

git commit

When you run this, a text editor will appear. Git is asking you to type a message to remind you of what this change is doing. The lines starting with a hash # will be removed by Git before saving. Git puts them there as a reminder of what files this commit will be updating.

A commit message should consist of at least one line giving a brief summary of the changes. Then if more information is desired, enter a blank line and then write as much as you care to. Since this is the first commit, nothing has changed yet, so the standard practice is to use "Initial commit". Exit nano with CTRL-X and save the message. You've made a commit.

(Run git status. Has anything changed?)

We can now ask Git for the repository history with git log.

$ git log
commit c9e8d3ec39248fbb60c94c7555c61e2812f144 (HEAD -> main)
Author: Don Brower <[email protected]>
Date:   Fri Jan 27 14:18:27 2023 +0000

    Initial Commit

Every commit has a name consisting of a bunch of random-ish numbers and letters, a date, an author, and 0 or more parent commits. (It is a SHA1 hash of the contents of the commit, including the current date, committer, etc). Since this was the first commit, it doesn't have any parent commits. Most commits will have one parent. A few may have more than one parent ("merge commits").

Lets make a change to the README file. Change 'A' the 'The' and add a second line.

$ git diff
$ git add README.md
$ git commit

This makes a second commit. Notice that we need to "add" the README file, even though it was already "added" and we just changed it. The add refers to adding it to the next commit.

The "diff" is in a format called a "unified diff". Lines that were added are prefixed with a plus sign +, lines that were removed are prefixed with a minus sign -. You can see when each line in a file was changed by using git blame

git blame README.md

Let's add a second file. We will add a batch file that makes a list of numbers.

$ nano make-numbers.sh

Type in:

#!/bin/bash

seq 100 > numbers.txt

Make the file executable and then run it:

$ chmod +x make-numbers.sh
$ ./make-numbers.sh

It will create a numbers.txt file.

$ ls
$ git status

We want to track the script file and non track the output file:

$ git add make-numbers.sh
$ nano .gitignore

In this new file add a single line

numbers.txt

This will tell git to not track and to ignore this file.

$ git status

$ git status
On branch main
Changes to be committed:
  (use "git restore –staged <file>..." to unstage)
         new file:    make-numbers.sh

This shows that the git add does not immediately make the commit happen. Instead it is keeping the file in a "staging area" until the commit command is issued.

$ nano README.md

Use the numbers file for test data

$ git diff
$ git add README.md
$ git commit -m "Add make-numbers script"

Working with GitHub

Most projects are shared using a Git hosting service. These services just keep a copy of the repository. One of the biggest is GitHub, and now we will look at how it is used. There are many open source and open science libraries and applications on GitHub.

First you need to create an account on it if you don't have one already. Accounts are free. Sign up at github.com

Once you have done this we need to add the key on the VM to your GitHub account. Open firefox and sign in. Upper left menu > Settings > SSH and GPG Keys

cat ~/.ssh/id_ed25519.pub

Now lets reset and get a copy of a repository I have prepared:

$ cd
$ git clone [email protected]:dbrower/cicf-2024.git

A clone will make a copy of the repository specified onto the VM. It will also make a subdirectory called "cicf-2024" and check out the most recent commit. Notice that the local copy of the repository on the VM has a "remote" to the repository we cloned it from. This is to make it easier to move changes back to the source repository.

$ cd cicf-2024
$ git remote –v
$ git remote rename origin upstream

Lets see if the python file works.

$ python3 fib.py

It is common to make a branch for a change so that you can make more than one commit. Then later you can ask people to review the entire branch at once. The branch serves as a way to keep the commits that you are making off the main branch for the time being.

$ git checkout -b add-contact-info
$ nano README.md
$ git add README.md
$ git commit

Notice the change is not on the hosting service yet. To do that I need to copy the change there

git push upstream add-contact-info

Usually for a project you will not have access rights to add changes to someone else's repository. The solution is for you to make your own copy of the repository on the hosting service and then put your changes there. This is called "forking a repository" (in the Github parlance). View the repo on the github service and choose to fork it.

Add the fork "remote" to the checked out repository.

git remote add origin _your forked repo_

Now push your changes up to your fork of the repo:

git push origin add-contact-info

View the changes in the browser.

You can then ask owner of the original repository to merge them into their repo. On GitHub, at least, this is called a "pull request". The following command will open the web page for your forced copy of the repository and from there you can choose "Create Pull Request". Choose the "Compare & pull request" button. On the next page make sure the base branch is the main branch in the dbrower copy of the repository. (It will look slightly different from the screenshot since the screenshot is not working from a fork of the repository.)

And then select "Create pull request".

Resources

It is hard to overstate the importance and usefulness of Git in modern software development. However, the user interface leaves a lot to be desired.

Software Carpentry's Version Control with Git
Roger Dudler's git - the simple guide
Comprehensive Reference to Git
There are many Git hosting sites, these are the big three: Github, Gitlab, Bitbucket. There are also many smaller ones, such as SourceHut.
Git is by far the most used version control system, but it isn't the only one: Mercurial, Fossil, Subversion, [CVS] CVS (ancient, old, please never use)
The Biggest and Weirdest Commits in Linux Kernel Git History (2017)

Developers love to design algorithms—including ways that people should organize software development with Git.

(https://www.atlassian.com/git/tutorials/comparing-workflows)
Gitflow
Github flow
There are too many to name. Organizational needs are important, since open source software is developed differently than closed source, and mobile apps are different than webapps. (e.g. Webapps can deploy a new version continuously unlike mobile apps). Scientific software is likewise different since it is usually for a specific purpose.
Example of a GitHub Pull Request and comments

There are lots of coding standards.

The Python PEP 8 standard for the Python standard library.
Google Style Guides for many different languages.
JPL Institutional Coding Standard for the C Programming Language
The Power of 10 Rules
NASA F' Flight Software Framework

Issue tracking is basically a big list but with ability to sort based on metadata.

Github Issue tracker
The big professional grade tracker is Jira, but it is expensive.
An epic treatise on scheduling, bug tracking, and triage

Testing is a huge area to learn about.

PyTest package
Test Driven Development. Developers love to create methodologies.
Checking in on the state of TDD (2023-07-12)
Types of tests
Model Checking can verify correctness of programs to a specification. In this way it is the other side of the coin from testing. They are based on temporal logics. One example of a model checker is TLA+.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

week04-git-and-coding-standards

week04-git-and-coding-standards

README.md

CICF Week 4

Tutorial

Week 4 Test

Working with GitHub

Resources

Files

week04-git-and-coding-standards

Directory actions

More options

Directory actions

More options

Latest commit

History

week04-git-and-coding-standards

Folders and files

parent directory

README.md

CICF Week 4

Tutorial

Week 4 Test

Working with GitHub

Resources