EBC Git Tutorial
This is currently under construction
This tutorial aims at helping you to get started with version control using Git from an EBC-perspective. If you have suggestions on how to improve this tutorial, please make sure to bring this to our attention using the issue tracker. Don't be shy about this, if something does not become clear in this tutorial, it is likely not your fault, but a usability bug of this tutorial, so the issue tracker is a good place to talk about it.
The intended audience of this tutorial is students with no prior experience in version control systems, but we hope that it is also helpful for every interested reader.
There are many good reasons for using version control and we couldn't agree more with Michael Tiller's What Engineers Need to Know About Version Control. We encourage you to take a minute and read that blog post. (Also, we like the image he uses to illustrate the dangers of not using version control.) In case you feel like tl;dr (which you shouldn't), here is a quick summary of the main arguments for using version control:
- Danger: You can lose work by computer crashes or accidentally deleting/not saving. Version control helps to prevent that.
- History: Especially when writing a thesis or developing code (e.g. in Modelica or Python), many people make safety copies along the way, resulting in file names like `thesis_2016-06-17_v4_corrected_final.docx` (often followed by `thesis_2016-06-17_v4_corrected_final2.docx` ...) or `HeatingSystem_test-4_-new-control_valid.mo`. Version control helps to prevent that.
- Collaboration: When working with others, we often waste time exchanging and adapting different versions of files. Version control provides a strong platform to manage collaboration better.
We hope these reasons are motivation enough to read on and learn how version control can help us to improve our workflows.
Unfortunately, before getting our hands on an example that demonstrates how to use Git for our version control we need to understand the concept of Git's version control system and introduce a few terms in order to make sure we speak the same language when talking about Git.
If your computer is running a Windows system, you are used to files being displayed in the explorer. For a local copy of AixLib, this may look something like this:
Let's imagine all these files at a specific moment in time (a "snapshot" of the entire file system within a given parent directory) to be represented by a blue dot that looks like this:
What a version control system like Git does is that it allows us to save, organize and manage many of such snapshots within a repository. We may visualize the concept of such a repository (or repo in short) like this:
Instead of just making changes (intentional or accidental) to the one version of the files that the explorer view offers us and hoping for the best, a version-controlled repo enables (and encourages) us to save such snapshots, giving us a timeline of changes that we can inspect and go back in if we want to undo some changes. In addition, we can organize many such timelines in parallel, allowing us to work in parallel with our colleagues or try things out with the confidence of knowing that we can always go back to a stable version saved in another snapshot. In Git, each of these parallel "timelines" is called a branch. A "snapshot" is called a commit. By default, there is one master branch (newer setups and hosting platforms may name it main instead). In projects like AixLib, we try to always have a stable version of our code on the master. Thus, all development and experiments are done in parallel branches (e.g. `New feature 1` and `Wild test 17b` in the figure above). We can have a practically unlimited number of such branches in parallel, create new ones and delete others. In any case, we can always switch between branches, start new ones off our currently selected commit and merge our developments from one branch back into another (indicated by a green dot in the figure above). We will see how this works in a short time.
You can use Git for version control on your local machine to better manage your files helping with the History issue mentioned above, but in order to also address the Danger of losing your work locally and to work in Collaboration with others, we will need to exchange data with a server. Let's use a similar visualization as introduced above to illustrate this concept:
Git is a distributed version control system. This means that your local computer as well as the server and any other computers working on a given project will have a full copy of the repo on their systems. If you are used to working with SVN (another version control system, but a centralized one), this is a notable difference. In Git, you can start new branches, save your commits and checkout alternative versions all locally without being connected to the server. This is great e.g. for working on a train without Wi-Fi, but we will have to keep in mind to actively synchronize our work with the server. We will get to this.
In addition to the repositories on your local computer and on the server, the figure above shows a part called "working copy". We will use this term to refer to what you actually see in your Windows explorer, as shown in the first figure of this document. You can recognize a Git repository in your file system by a folder named `.git` in the top-level directory of the repo (you may have to tell Windows to show hidden folders for this). When you check out different branches from your local repo, the working copy will appear to "magically" change to the version that was last committed to this branch. It is absolutely important to understand how the working copy and the repository interact. Please note:
- If you make changes to your local files in your working copy, neither the repo on your local machine nor the one on the server is directly affected by this.
- Changes in your working copy have to be actively committed into your local repository.
- Changes in the working copy can be undone and any commit from the repo can be retrieved into the working copy.
Unfortunately, it will get just one step more complicated before we can start working with Git: We cannot directly commit from our working copy into the local repo. First, we have to stage all files we want to include in our commit. We are aware that this may seem unnecessary in the beginning, but you will get used to it and see that it makes sense: staging lets you bundle different changes into different commits, which makes your history clearer and easier to understand and retrace.
There are several options where the server repository may be hosted. Obviously, there is GitHub (https://github.com/). GitHub is probably the most popular service for hosting Git repos. Public repositories, such as those of open-source projects, are hosted on GitHub for free. Private repositories used to require a paid plan, but GitHub now offers them for free as well, with paid plans for additional features. We use GitHub for all our open-source projects like AixLib and TEASER (and there's always more to come!).
In addition, RWTH Aachen University runs its own Git Server called RWTH GitLab at https://git.rwth-aachen.de/. You can log in to RWTH GitLab with your TIM account, making it ideal for starting your own projects (e.g. for your thesis).
Another very important aspect about these Git servers is that they offer a web-view of your repository, integrating an issue tracker, pull requests, network-graphs and much more with your files.
If you are using your own device, you can simply download and install Git from https://git-scm.com/.
If you are using an EBC computer, please install Git using the Software-Center on your Desktop. It is listed as `Software Freedom Conservancy - git`.
Now you are almost ready to use Git for version control. For this tutorial we will use the command line. Yes, we know that some people will prefer a GUI, and we will show you how to get started with a GUI at the end of this tutorial. But to understand the concepts, and to have something to fall back on when the GUI behaves strangely, we encourage you to bear with us and the command line. In order to make your machine understand Git, you have to add your Git installation (e.g. at `C:\Program Files (x86)\Git\bin;`) to the `PATH` environment variable ("Umgebungsvariable" on a German system):
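A quick way to check whether the command line can find Git is to ask for its version. This is only a small sketch; the exact version number printed will differ from machine to machine.

```shell
# Ask Git for its version. If this prints something like "git version ...",
# the command line can find Git. An error such as "not recognized" or
# "command not found" suggests that the PATH entry is still missing.
git --version
```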
For this example, we will use the RWTH GitLab Server. But if you have no access to that server, any other Git Server is pretty similar.
After logging in to RWTH GitLab at https://git.rwth-aachen.de/ we can start a new project by clicking on the green button:
We can then fill in the project name and description (it says optional, but please always provide a description), choose a privacy setting and create the project:
At this moment, the repo is empty and exists only on the server. Using the visualization scheme from above, the situation is this:
Now we want to start working with this repository. Some Git platforms allow us to modify files on their web interface in a limited way, but usually we will want to have the repo locally and work there. Thus, we clone the repo to our local machine. To do that, we use the command `git clone <server address of our new repo>` in the command prompt (right-click on the image and choose "show image" for a larger version):
We do get a warning that we seem to have cloned an empty repository, but as this is what we expected, we will not worry about that. Instead, let's go back to the concept view to see what happened:
With `git clone https://...` we created a local repository that is linked to the server repo by the address we gave with the command. Now the repo exists twice: on the server and on our local machine. Also, the working copy that we can see in the Windows file system shows us the empty repo. We see a directory that is empty except for the `.git` folder, which indicates that this directory is in fact a Git repo:
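If you want to try cloning without access to a server, the following sketch may help. It assumes a Unix-like shell (e.g. Git Bash) and uses a local "bare" repository as a stand-in for the server repo. The directory names `server.git` and `myproject` are made up for this illustration; with a real server you would pass the repo's address to `git clone` instead.

```shell
# Assumption: a local bare repo plays the role of the (empty) server repo.
demo=$(mktemp -d)                         # throwaway directory for this experiment
git init --quiet --bare "$demo/server.git"

# Clone it just like a repo on a real server.
# Git warns that we appear to have cloned an empty repository, as expected.
git clone "$demo/server.git" "$demo/myproject"

# The clone contains nothing but the hidden .git folder.
ls -a "$demo/myproject"

# The link back to the "server" is stored as a remote called origin.
git -C "$demo/myproject" remote -v
```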
Now we can start to demonstrate a few Git features and workflows. First we'll add an empty text file to the working copy (The green symbol on the newly created text file is added by Tortoise Git, a GUI that we will talk about in a short while. Please ignore the symbol for now):
Remember that so far this change only affects the working copy. Git can detect the new file, but it has not yet done anything to save or manage it within the repo's history.
We can use `git status` to get some information about the current state of our working copy and the repo:
As a response, Git tells us that we are on branch `master`, that there are untracked files (`example1.txt`) and that nothing is currently added to commit, meaning that the stage is currently empty.
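To reproduce this state yourself, a minimal sketch could look like the following. It assumes a Unix-like shell (e.g. Git Bash) and works in a freshly created throwaway repo instead of a clone, which makes no difference for `git status`.

```shell
# Assumption: a throwaway local repo stands in for our freshly cloned project.
demo=$(mktemp -d)
cd "$demo"
git init --quiet

touch example1.txt      # create a new, empty file in the working copy

git status              # long form: example1.txt is listed under "Untracked files"
git status --short      # compact form: "??" marks an untracked file
```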
Before we can commit the file into the repo, first we will thus have to add it to the stage. On a concept level, we are trying to do this:
To add the file to the stage, we use the command `git add example1.txt`. After checking again with `git status`, we see that the file is now staged for commit.
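As a small sketch of the staging step (again in a throwaway repo and assuming a Unix-like shell), we can watch the file move from "untracked" to "staged":

```shell
# Assumption: a throwaway local repo, as in a fresh clone of our project.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
touch example1.txt

git add example1.txt            # put the new file on the stage

git status --short              # "A " in the first column marks a staged, newly added file
git diff --cached --name-only   # lists the files that the next commit will contain
```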
Now, finally, we are ready to commit the file to our local repo. Again, this is what we are trying to do:
Before we actually commit the file to the repo, let's take a second to reflect on what this will do. We will save a snapshot into the history of our project. For this example project, the requirements may not be too strict, but let's have a look at a larger project like AixLib. You can see the history of all commits made to the `master` branch at https://github.com/RWTH-EBC/AixLib/commits/master. You can browse through all code changes and retrace the history of each individual file. This may be necessary when trying to determine when and why a change was made that later turned out to cause unwanted side effects. For this, the history built by our commits is a crucial tool. With this in mind, you can see that there are two important aspects for commits:
- A good commit message
- Keeping your commits small, comprehensible and well-structured.
You have to write the commit message for your commit manually. Please write this message in the most useful way you can come up with for your collaborators and your future self. Also, it helps a great deal if you structure your commits in small units. Making many changes to your code today? Make a separate commit for each task rather than one large commit before leaving work, and it will be much easier for you and others to retrace the history.
Agreed? Ok, then let's do our first commit. To do this, we have at least two options. First, let's do the faster one. Committing in general is done by using `git commit`. We can directly add the commit message with the `-m` option. For example, we can now type `git commit -m "<My commit message>"` (e.g. `git commit -m "Add an empty example text file"`). Like so:
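Put together, the steps up to the first commit can be sketched as follows. The sketch assumes a Unix-like shell and a throwaway repo; the name and e-mail address are placeholders that Git needs to record who made the commit (on your own machine you have probably configured your real ones already).

```shell
# Assumption: a throwaway local repo; name and e-mail are placeholders.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

touch example1.txt
git add example1.txt

# Commit the staged file with the message given after -m.
git commit -m "Add an empty example text file"

git log --oneline      # history so far: one line with short hash and message
git status --short     # prints nothing: the working copy matches the last commit
```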
So far, we have made a first commit to our local repo. If you have also done these steps on your own, you can see that your repo on the server is still empty. In order to get the two repos back in sync, we will send our local changes to the server repo. In Git lingo, this is called pushing the local changes to the server. The corresponding Git command is `git push`. But in order to tell our local Git repo which server and branch to push to, we have to add two more arguments to `git push`. The first can be interpreted as the "address" where to send our package of data. Such a connection is called a remote. When cloning from a server, Git automatically sets up a remote with the name `origin` that points to the server. The second argument is the name of the branch we want to push. You can add many more remotes and juggle your data with multiple server repos, but we will not cover that here. Instead, we are satisfied to push to the server's `master` branch by using `git push origin master` for the moment. Here is our concept view of this step:
In the console, this will look like this:
As a response, we see that we successfully pushed from our local `master` to the server repo's `master`.
Checking back with the server, we see that we were indeed successful:
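Without a server at hand, the push can still be tried out. The following sketch assumes a Unix-like shell and, as an assumption for illustration only, lets a local bare repository (`server.git`) play the role of the server; the directory names and the identity are placeholders.

```shell
# Assumption: a local bare repo stands in for the server; names are placeholders.
demo=$(mktemp -d)
git init --quiet --bare "$demo/server.git"
git clone --quiet "$demo/server.git" "$demo/work"
cd "$demo/work"
# Make sure the first branch is called master as in this tutorial
# (newer Git setups may default to another name).
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

touch example1.txt
git add example1.txt
git commit --quiet -m "Add an empty example text file"

# Send the local master branch to the remote called origin.
git push origin master

# Ask the "server" which branches it knows now.
git ls-remote origin
```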
We already mentioned that, in addition to the repo itself, servers like GitHub and RWTH GitLab offer us additional services. One of the most important of these services is the issue tracker. It allows us to define workflows that are easier to manage and that facilitate collaboration as well as quality control. Often, workflows are defined in a project's Wiki. It is not a bad idea to take a minute and have a look at the workflow definitions of AixLib, the Annex 60 library, BuildingSystems, or IDEAS.
A general idea for working on existing projects (let's assume our example project is by now also quite "existing") is to first create an issue on the project's issue tracker and describe briefly what you intend to do. The system will automatically assign a number to the issue. Now we can create a branch, usually naming it after the issue and its number, and start working. In our example, let's create an issue to announce that we will add some text to our empty text file:
For the next steps, we will assume our example project to be a bit larger than it is. Let's imagine this is a large open-source project used by people who rely on a stable version in the `master` branch for their work. We have now informed them with our issue #1 that we intend to make changes to the repo. In order to keep the stable `master` running while we work hard on issue #1, we will split the development's timeline by creating a new parallel branch. In our case, we will call the new branch `issue1_text`. This is what we will try to do:
We will create a new branch and switch our working copy to it by typing `git checkout -b issue1_text`:
As a result, our local working copy will look the same in the explorer view, but the repo is now on a parallel branch. That means that no matter what we do in this branch, the empty file in the `master` will be safe and unaffected.
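As a sketch of this step (throwaway repo, Unix-like shell, placeholder identity), we can create the branch and ask Git which branches exist afterwards:

```shell
# Assumption: a throwaway local repo with one commit on master.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # name the first branch master as in this tutorial
git config user.name "Demo User"
git config user.email "demo@example.com"
touch example1.txt
git add example1.txt
git commit --quiet -m "Add an empty example text file"

# Create the new branch and switch the working copy to it in one step.
git checkout -b issue1_text

# List all local branches; the one we are currently on is marked with "*".
git branch
```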
Next, we add some text to the example file:
Again, we have to stage the changes before committing. In order to not let this become boring, we use a new command for that: `git add .`. This stages all changes in the current directory and below. Note that Git versions before 2.0 did not stage deleted files this way; since Git 2.0, deletions are staged as well. To stage all changes in the entire working copy, regardless of the directory you are currently in, use `git add --all`. But please be careful with this and do not commit changes you did not make intentionally. This is important e.g. with Modelica files if you work in Dymola. Dymola tends to add white space changes to files you did not explicitly work on. Those changes should not be committed.
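One way to avoid committing unintended changes is to review them first and to stage only the files you really worked on. The following is a sketch under stated assumptions: a throwaway repo, a Unix-like shell, and a made-up Modelica file `Other.mo` that receives an accidental white space change.

```shell
# Assumption: throwaway repo; Other.mo is a made-up file for illustration.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"
touch example1.txt
printf 'model Other end Other;\n' > Other.mo
git add .
git commit --quiet -m "Add example files"

printf 'Some text\n' > example1.txt   # the change we actually want
printf '\n' >> Other.mo               # an accidental white space change

git status --short     # both files show up as modified
git diff               # review what exactly has changed

git add example1.txt             # stage only the intended change
git diff --cached --name-only    # only example1.txt is staged for the next commit

git checkout -- Other.mo         # discard the accidental change in the working copy
```

After this, only `example1.txt` is staged, and `Other.mo` is back to its last committed state.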
And now, we will use the second way of writing our commit message. If we only run `git commit` without the `-m` option, a text editor will open (in this example a Vim-style editor). We can write after pressing e.g. `i` for insert mode, exit insert mode by pressing `ESC` and save the changes by typing `:wq` (for write and quit). Have a look:
The nice thing about this editor, apart from reminding us of earlier computing days, is that its coloring reminds us to keep the first line of the commit message below 80 characters in length, to leave the second line empty and to continue the text on the third line if we want to add more info. This is good style, as it will comply with many server websites and make browsing the history easier. Also, note that we included a reference to issue one by typing `For #1`. This will magically mention the commit in our issue tracker once we have pushed our changes. After pushing with `git push origin issue1_text`, our issue tracker mentions the commit as:
This is useful to show others following the issue that there is ongoing work here.
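If you want the same message layout without opening the editor, one option is to pass `-m` twice: Git then puts each message into its own paragraph, separated by an empty line. This is a sketch in a throwaway repo (Unix-like shell assumed); the message text "Add text to example file" is made up for illustration.

```shell
# Assumption: throwaway repo; the commit messages are examples only.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # name the first branch master as in this tutorial
git config user.name "Demo User"
git config user.email "demo@example.com"
touch example1.txt
git add example1.txt
git commit --quiet -m "Add an empty example text file"
git checkout --quiet -b issue1_text

echo "Some text" > example1.txt
git add example1.txt

# First -m: short summary line. Second -m: further info after an empty line.
git commit --quiet -m "Add text to example file" -m "For #1"

git log -1 --format=%B    # show the full message of the latest commit
```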
Let's assume we are satisfied with our work in branch `issue1_text` and want to make this development available to all users by taking the changes from the branch into the `master`. Combining the developments of two branches and continuing with one single common timeline in one single branch is called merging one branch into another. In our case, we want to merge branch `issue1_text` into the `master` branch. In a good Git workflow we often do not do this directly, but first issue a formal request for what we want to do. In projects in which two or more people collaborate, this is a good way to have some quality control. The developer of the new features or bug fixes usually issues a request, so that a second person can check the code, make comments, and finally accept or decline the request. More info on e.g. the workflow of AixLib can be found here: https://github.com/RWTH-EBC/AixLib/wiki/Contribute-to-AixLib.
The request we have been referring to is called a Pull Request on GitHub and a Merge Request on GitLab. Apart from the name difference, the procedure is quite the same. The requests can be created on the project's webpage. For AixLib on GitHub, you can have a look at the pull requests at https://github.com/RWTH-EBC/AixLib/pulls. For our example, creating a merge request can look something like this:
As shown above, we click to create a New Merge Request. On the next page, we select our source branch and the target branch we want to merge into. In our case, as mentioned above, we want to merge branch `issue1_text` into the `master` branch. Then we give a quick description of what this merge request addresses, including a reference to the issue we created before to document our intended developments. Finally, we assign one of our colleagues (in this case we assigned Peter) to check our code, give feedback and accept or decline the request.
In this case, Peter accepted the request directly. We can see the merging of our two branches (and their "timelines") visualized by clicking on Commits in GitLab's left-aligned menu and choosing the Network view in the top menu:
So now we have successfully merged the two branches on our server repo on GitLab. But we have to keep in mind, that this does not directly affect our local repo. The current state in our concept view can be shown like this:
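To get a feeling for what a merge does to the history, we can mimic it locally. This sketch is not the workflow recommended in this tutorial (there, the merge happens on the server after a review); it only illustrates the concept. It assumes a Unix-like shell, a throwaway repo, and that the merge is recorded as a separate merge commit, which we request here with `--no-ff`.

```shell
# Assumption: throwaway repo; a local merge stands in for the accepted merge request.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # name the first branch master as in this tutorial
git config user.name "Demo User"
git config user.email "demo@example.com"
touch example1.txt
git add example1.txt
git commit --quiet -m "Add an empty example text file"

# Work on the parallel branch.
git checkout --quiet -b issue1_text
echo "Some text" > example1.txt
git add example1.txt
git commit --quiet -m "Add text to example file"

# Switch back to master and merge the branch into it.
git checkout --quiet master
git merge --no-ff -m "Merge branch 'issue1_text'" issue1_text

# Draw the two "timelines" and the merge commit as a small text graph.
git log --oneline --graph
```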
In order to get the changes from the server repo to our local repo, we have to actively ask for them. Above, we pushed our local changes to the server repo. In a similar way, we will now pull the changes on the server back to our local repo. But calling `git status` again on our local repo reminds us that our checked-out working copy is still on branch `issue1_text` (you can also see this in the drawing above, where the working copy has that branch's background color):
And of course, branch `issue1_text` has not been changed on the server repo. The changes affected the server's `master` branch, into which we merged branch `issue1_text`. What we actually want now is to pull the latest changes from the server's `master` branch into our local `master` branch. To do that, we switch our local working copy back to the master branch by calling `git checkout master`:
Now our local working copy is back on the master, enabling us to pull the server's master:
Just like with the `git push` command, we use the name of the remote (by default, that is `origin`) and the name of the branch we want to pull (in our example: `master`) to construct the pull command `git pull origin master`. We can think of this command as if it were "Please `git`, `pull` the `master` branch from the remote at `origin` and merge it into the branch I currently have checked out". In action, this looks like this:
Great! Now our local repo is back in sync with the server repo and we successfully worked through a first Git example showing us the basic concepts!
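For readers without server access, the whole round trip can be sketched locally. The following makes several assumptions for the sake of illustration: a Unix-like shell, a local bare repo (`server.git`) in place of the server, and a second clone (`peter`) that merges and pushes in place of the accepted merge request. All directory names, identities and the second commit message are made up.

```shell
# Assumption: server.git stands in for the server; all names are placeholders.
demo=$(mktemp -d)
git init --quiet --bare "$demo/server.git"
git --git-dir="$demo/server.git" symbolic-ref HEAD refs/heads/master

# Our own clone: one commit on master, one on issue1_text, both pushed.
git clone --quiet "$demo/server.git" "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
touch example1.txt
git add example1.txt
git commit --quiet -m "Add an empty example text file"
git push --quiet origin master
git checkout --quiet -b issue1_text
echo "Some text" > example1.txt
git add example1.txt
git commit --quiet -m "Add text to example file"
git push --quiet origin issue1_text

# Stand-in for the accepted merge request: a second clone merges and pushes.
git clone --quiet "$demo/server.git" "$demo/peter"
cd "$demo/peter"
git config user.name "Peter"
git config user.email "peter@example.com"
git merge --quiet --no-ff -m "Merge branch 'issue1_text'" origin/issue1_text
git push --quiet origin master

# Back in our own clone: switch to master and pull the merged state.
cd "$demo/work"
git checkout master
git pull origin master
git log --oneline --graph    # our local master now contains the merge
```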
We hope this tutorial gave you an idea about how Git works. If anything is still unclear, please help us to improve this tutorial and create an issue on our issue tracker!
If you want to have a look at other tutorials, we can recommend these links:
- Set Up Git - (GitHub help page)
- Git tutorial by Roger Dudler
- Git cheat sheet (useful commands)
- Git tutorial by Atlassian
And finally, we promised you an alternative to the command line interface. Here is the link to a video tutorial to Tortoise GIT: