-
Notifications
You must be signed in to change notification settings - Fork 1
/
Saving_loading_if_for_apply.Rmd
95 lines (83 loc) · 2.24 KB
/
Saving_loading_if_for_apply.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
---
title: "Saving_loading_if_for_apply"
author: "Chengqi(Charley) Wang"
date: "1/29/2020"
output: html_document
---
<br/><br/>
####Load the RNA-Seq data
```{r}
##set your working directory and load the data
setwd('~/Documents/work/paper_write/genome_training/R_training/')
df <- read.table('vehicle_drug_feature_counts.txt',
header = T, sep = '\t', row.names = 1)
summary(df)
df <- df[,6:9]
```
<br/><br/>
####Save the data
```{r}
write.table(df,
file = 'vehicle_drug_feature_counts.copy.txt.txt', #either a character string naming a file or a connection open for writing. "" indicates output to the console.
row.names = T, #indicating whether the row names of df are to be written
col.names = T, #indicating whether the column names of df are to be written
sep = '\t', #the field separator string
quote = F #a logical value (TRUE or FALSE) or a numeric vector. If TRUE, any character or factor columns will be surrounded by double quotes.
)
###load again
df2 <- read.table('vehicle_drug_feature_counts.copy.txt.txt',header = T, sep = '\t', row.names = 1)
head(df2)
```
<br/><br/>
#### 'For' loop
```{r}
df <- as.matrix(df)
##check the quantile for each assay
for (i in 1:ncol(df))
{
print( colnames(df)[i] )
qu <- quantile( df[, i] )
print(qu)
}
```
<br/>
#### 'Apply' replaces 'for' loop
```{r}
qu <- apply(df, 2, function(z){quantile(z)})
print(qu)
```
<br/>
#### if...else statement
```{r}
### pick the expressed genes in first assay
exp_cutoff <- quantile(df[,1], 0.35)
unEXP_cutoff <- quantile(df[,1], 0.3)
n_remaining <- 0
exp_gene <- c() #set a empty vector for expressed genes
unexp_gene <- c() #set a empty vector for un_expressed genes
for (i in (1:length(df[,1])))
{
if (df[i, 1] > exp_cutoff )
{
exp_gene <- c(exp_gene, rownames(df)[i])
}
else if (df[i, 1] < unEXP_cutoff)
{
unexp_gene <- c(unexp_gene, rownames(df)[i])
}
else
{
n_remaining <- n_remaining + 1
}
}
print(c(length(unexp_gene),
length(exp_gene),
n_remaining))
```
<br/>
####use 'which' to replace former script
```{r}
exp_gene2 <- rownames(df)[which(df[,1] > exp_cutoff)]
length(exp_gene2)
length(exp_gene)
```