forked from cis-ds/course-site
-
Notifications
You must be signed in to change notification settings - Fork 0
/
git06.Rmd
74 lines (64 loc) · 2.67 KB
/
git06.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
---
title: "Using Git with Python and Jupyter Notebooks"
output:
html_document:
toc: TRUE
toc_depth: 1
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(cache=TRUE)
```
# Git and Python
The short answer is that Git works with Python just as well as R. In fact, you can use RStudio as a Python IDE to edit and run Python (`.py`) scripts. That said, there are plenty of IDEs specifically for Python that have Git integrated. Otherwise, the fallback option is to use Git from the [shell](shell.html) or whatever [Git client you chose to install](git02.html).
# Git and Jupyter Notebooks
Jupyter Notebooks present a more tricky challenge to tracking and syncing with Git. Jupyter Notebooks can be committed and synced with a GitHub repo. In fact, [GitHub will render your notebook directly on your GitHub repo page](https://github.com/blog/1995-github-jupyter-notebooks-3).
The main issue comes in how Jupyter Notebooks store their content. What you see in the Notebook Viewer is the rendered copy of the notebook. The actual notebook looks something like this:
```{r engine = 'python', eval = FALSE}
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello world\n"
]
}
],
"source": [
"print('Hello world')"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [Root]",
"language": "python",
"name": "Python [Root]"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
```
Jupyter Notebooks don't store their data as a plain-text file, but instead in this structured format. This is fine when it comes to storing and tracking code, but the notebook also includes all output. Sometimes you want to track changes to your code without storing all the output as well.
If that is the case, there are [methods for stripping the output from a Jupyter Notebook and only committing changes to the code](https://www.google.com/search?q=git+and+jupyter+notebooks). However these methods are beyond the requirements of this course. In the future you might want to explore these options. However for this course I recommend you simply track all changes, including code and output. Because there is no integration of Git within the Jupyter Notebook server, you will need to use your Git client or the shell to stage, commit, and sync changes.