bioRxiv-analyzerR

The package extracts data from citation files downloaded from the bioRxiv citation manager. bioRxiv is a preprint server for biology research.

Prerequisites

The following packages must be installed beforehand for the program to work:

stringr
dplyr
stringi
tm
SnowballC
wordcloud
RColorBrewer
NLP
topicmodels
tidytext
reshape2
ggplot2
pals
Rcpp
igraph

This can be done by running:

install.packages(c("stringr", "dplyr", "stringi", "tm", "SnowballC", "wordcloud", "RColorBrewer", "NLP", "topicmodels", "tidytext", "reshape2", "ggplot2", "pals", "Rcpp", "igraph"))
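If some of these packages are already present, a quick check avoids reinstalling them. The snippet below is a sketch in base R; the helper `find_missing` and the variable names are illustrative and not part of this package.

```r
# Packages listed in the prerequisites above
required <- c("stringr", "dplyr", "stringi", "tm", "SnowballC", "wordcloud",
              "RColorBrewer", "NLP", "topicmodels", "tidytext", "reshape2",
              "ggplot2", "pals", "Rcpp", "igraph")

# Hypothetical helper: return the packages that are not installed yet
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

missing_pkgs <- find_missing(required)
message("Missing packages: ", paste(missing_pkgs, collapse = ", "))

# Uncomment to install only what is missing:
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```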

To extract the data from the text file

Call the extract function, passing the file name as a string.

 
df <- extract("citations.txt")

The extracted data is stored in the variable you assign it to (here, df).

To calculate the DTM (Document-Term Matrix)

 
dtm <- calculatedtm(df$Abstract)

Pass in the Abstract column of the data frame you created to calculate the DTM with common words removed.
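How calculatedtm works internally is not shown here. As a rough sketch, a document-term matrix with common words removed can be built with the tm package as follows. This assumes English stop words stand in for the package's own common-word list, and the `abstracts` vector is made up for illustration.

```r
library(tm)

# Made-up abstracts standing in for df$Abstract
abstracts <- c("The cell divides and the cell grows.",
               "Gene expression changes in the cell.")

# Build a corpus and normalise the text
corpus <- VCorpus(VectorSource(abstracts))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Rows are documents, columns are terms, cells are counts
dtm_example <- DocumentTermMatrix(corpus)
inspect(dtm_example)
```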

To calculate the frequency table

 
freqtable <- calculatefreq(dtm)

To make a frequency table, pass the DTM calculated in the previous step into the function.
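What calculatefreq does internally is not documented here. One common way to turn a document-term matrix into a frequency table is to sum each term's column and sort the totals. The sketch below uses base R and a small hand-made count matrix in place of a real DTM; all names in it are illustrative.

```r
# Hand-made document-term counts: 3 documents, 4 terms
# (matrix() fills column by column, so each row of numbers below is one term)
counts <- matrix(c(2, 0, 1,   # cell
                   1, 1, 0,   # gene
                   0, 3, 1,   # protein
                   0, 0, 1),  # rna
                 nrow = 3,
                 dimnames = list(c("doc1", "doc2", "doc3"),
                                 c("cell", "gene", "protein", "rna")))

# Total occurrences of each term across all documents, most frequent first
term_totals <- sort(colSums(counts), decreasing = TRUE)

# Two-column frequency table: the word and its total count
freq_example <- data.frame(word = names(term_totals),
                           freq = as.numeric(term_totals),
                           stringsAsFactors = FALSE)
print(freq_example)
```

With a DTM produced by tm, `as.matrix()` applied to that DTM would take the place of `counts`.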

To make a word cloud and bar plot

Both functions take the frequency table made in the previous step.

 
makewordcloud(freqtable)

makebarplot(freqtable)

To make a topic model graph

Pass in the Abstract column; the function builds the topic model graph with the number of topics set to 10.

 
maketopicmodel(df$Abstract)

To create topics

To create K topics, pass in the Abstract column and K:

 
topics <- createtopics(df$Abstract, K)

K defaults to 10.
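The implementation of createtopics is not shown here. Since topicmodels is among the prerequisites, a topic model along the following lines is plausible. This is a sketch that assumes LDA with a fixed seed, runs on made-up text, and uses K = 2 to suit the toy data; every name in it is illustrative.

```r
library(tm)
library(topicmodels)

# Made-up documents standing in for df$Abstract
texts <- c("cell growth cell division cell cycle",
           "gene expression gene regulation genome",
           "cell cycle division growth",
           "genome gene regulation expression")

# Document-term matrix for the toy corpus
corpus <- VCorpus(VectorSource(texts))
dtm_example <- DocumentTermMatrix(corpus)

# Fit an LDA model with K topics; the seed makes the run repeatable
K <- 2
lda_model <- LDA(dtm_example, k = K, control = list(seed = 1234))

# Top 3 terms per topic (a matrix with one column per topic)
top_terms <- terms(lda_model, 3)
print(top_terms)

# Most likely topic for each document
print(topics(lda_model))
```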

To make links using topics

Pass the topics into the function to build a network of linked topics.

 
text_link(topics)


MrunmayS/bioRxiv-analyzerR
