-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathDownload-files.Rmd
126 lines (99 loc) · 3.92 KB
/
Download-files.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
---
title: "Work with files from a Google Drive folder"
date: '`r format(Sys.time(), "%A %B %d %Y %X %Z")`'
output: md_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r include=FALSE}
options(
gargle_oauth_cache = ".secrets",
gargle_oauth_email = "[email protected]"
)
googledrive::drive_deauth()
googledrive::drive_auth(scopes = "https://www.googleapis.com/auth/drive.readonly", email="[email protected]")
url_googledrive <- "https://drive.google.com/drive/folders/11WnXxs56jORbLkD1mFTZxwSaShex3Sse"
id_googledrive <- "11WnXxs56jORbLkD1mFTZxwSaShex3Sse"
```
# Download files
## List files in a google drive folder
All files.
```{r}
dir_files <- googledrive::drive_ls(path = url_googledrive)
dir_files$name
```
All csv files.
```{r}
dir_files <- googledrive::drive_ls(path = url_googledrive, type="csv")
dir_files$name
```
All Excel files.
```{r}
dir_files <- googledrive::drive_ls(path = url_googledrive, type="xlsx")
dir_files$name
```
All Google Spreadsheets.
```{r}
dir_files <- googledrive::drive_ls(path = url_googledrive, type="spreadsheet")
dir_files$name
```
All Google Documents.
```{r}
dir_files <- googledrive::drive_ls(path = url_googledrive, type="document")
dir_files$name
```
## Download a file
First get the files in the folder.
```{r}
a <- googledrive::drive_ls(path = url_googledrive)
```
### Download one Excel file
Download from Google Drive and save to a folder called `data`.
```{r}
file_name <- "data.xlsx"
# find the id of that file
file_id <- a$id[which(a$name==file_name)]
# download the file and save to data folder
googledrive::drive_download(file=file_id, overwrite = TRUE, path = file.path("data", file_name))
```
Read that file in using the **readxl** package.
```{r}
tmp <- readxl::read_excel(file.path(here::here(), "data", "data.xlsx"))
tmp
```
## Download one csv file
Download from Google Drive and save to a folder called `data`.
```{r}
file_name <- "Stone.csv"
file_id <- a$id[which(a$name==file_name)]
googledrive::drive_download(file=file_id, overwrite = TRUE, path = file.path("data", file_name))
```
Read that file in. The first 2 lines are metadata and are skipped.
```{r}
tmp <- read.csv(file.path(here::here(), "data", file_name), skip=2)
tmp
```
### Download csv files
Download all csv files and save to a folder called `data`.
```{r}
a <- googledrive::drive_ls(path = url_googledrive, type = "csv")
for (i in 1:nrow(a)){
googledrive::drive_download(a$id[i], overwrite = TRUE, path = file.path("data", a$name[i]))
}
```
### Download Google Spreadsheets
```{r}
a <- googledrive::drive_ls(path = url_googledrive, type = "spreadsheet")
for (i in 1:nrow(a)){
googledrive::drive_download(a$id[i], type = "csv", overwrite = TRUE, path = file.path("data", a$name[i]))
}
```
## Pushing results up to GitHub
Here I show how you might use the [**gert**](https://CRAN.R-project.org/package=gert) package. Note this works because I am doing it within RStudio and my RStudio set-up has permission to push to this repo. If you don't have RStudio set-up to push to GitHub, then you need to set that up first. Also you need to be working in the local repo of the GitHub repo that your are pushing to. This sounds more complicated than it is.
```{r eval=FALSE}
gert::git_add(file.path("data", "Stone.csv"))
gert::git_commit("Adding a file", author = "Eli Holmes <[email protected]>")
gert::git_push(remote = "origin", repo = ".")
```
You can also use these set ups to set up your code to run from GitHub via GitHub Actions. If the Google Drive is not publicly viewable, then you'll need to research how to set up authorization in a deployed application [read this](https://gargle.r-lib.org/articles/non-interactive-auth.html). But if the Google Drive is publicly viewable, then it is fairly easy to set up a Google Action that is triggered on a schedule or triggered by an outside event. [Read about how to do that here](https://docs.github.com/en/actions/learn-github-actions/events-that-trigger-workflows).