MrunmayS/bioRxiv-analyzerR

bioRxiv-analyzerR

This package extracts data from citation files downloaded from the bioRxiv citation manager. bioRxiv is a preprint server for biology research.

Prerequisites

The following packages must be installed beforehand for the program to work:

  • stringr
  • dplyr
  • stringi
  • tm
  • SnowballC
  • wordcloud
  • RColorBrewer
  • NLP
  • topicmodels
  • tidytext
  • reshape2
  • ggplot2
  • pals
  • Rcpp
  • igraph

This can be done by running:

install.packages(c("stringr", "dplyr", "stringi", "tm", "SnowballC", "wordcloud", "RColorBrewer", "NLP", "topicmodels", "tidytext", "reshape2", "ggplot2", "pals", "Rcpp", "igraph"))

To Extract the Data from the text file

Use the extract function, passing in the file name as a string.

 
df <- extract("citations.txt")
 

The extracted data is saved into the assigned variable.

To calculate the DTM (Document-Term Matrix)

 
dtm <- calculatedtm(df$Abstract)
 

Pass in the Abstract column from the data frame you created to calculate the DTM with common words removed.
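To illustrate what a document-term matrix with common words removed looks like, here is a minimal base R sketch. It is not the implementation of calculatedtm, which presumably builds on the tm package listed in the prerequisites. The toy abstracts, the short stop-word list, and the names `tokens`, `vocab`, and `toy_dtm` are all assumptions made for the example.

```r
# Toy abstracts standing in for df$Abstract (made up for illustration).
abstracts <- c(
  "The gene regulates the growth of the cell",
  "Growth of a cell depends on the gene and the protein"
)

# Tiny, hypothetical stop-word list; real lists are much longer.
stop_words <- c("the", "of", "a", "on", "and")

# Lowercase, replace punctuation with spaces, and split into words.
tokens <- strsplit(tolower(gsub("[[:punct:]]", " ", abstracts)), "\\s+")

# Drop empty strings and stop words from each document.
tokens <- lapply(tokens, function(w) w[nzchar(w) & !(w %in% stop_words)])

# Vocabulary: every distinct remaining term across all documents.
vocab <- sort(unique(unlist(tokens)))

# Count each vocabulary term per document (terms x documents), then transpose
# so that rows are documents and columns are terms.
counts <- vapply(
  tokens,
  function(w) as.integer(table(factor(w, levels = vocab))),
  integer(length(vocab))
)
toy_dtm <- t(counts)
dimnames(toy_dtm) <- list(paste0("doc", seq_along(abstracts)), vocab)

print(toy_dtm)
```

Each row is one abstract and each column is one term, so a cell holds the number of times that term appears in that abstract.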

To calculate the frequency table

 
freqtable <- calculatefreq(dtm)
 

To make a frequency table, pass the DTM calculated earlier into the function.
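As a rough sketch of the idea, a frequency table can be derived from a document-term matrix by summing each term's column and sorting the totals. This is not the code of calculatefreq. The small matrix and the names `example_dtm`, `term_counts`, and `toy_freq` are made up for illustration. A DocumentTermMatrix from tm can be turned into a plain matrix with as.matrix() before applying the same steps.

```r
# Toy document-term matrix: rows are documents, columns are terms.
# The counts are invented for this example.
example_dtm <- matrix(
  c(2, 0, 1,
    1, 4, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("gene", "cell", "protein"))
)

# Sum each column to get corpus-wide counts, most frequent first.
term_counts <- sort(colSums(example_dtm), decreasing = TRUE)

# Store the result as a two-column data frame: term and total count.
toy_freq <- data.frame(
  word = names(term_counts),
  freq = unname(term_counts),
  stringsAsFactors = FALSE
)

print(toy_freq)
```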

To make a word cloud and bar plot

Both functions use the frequency table created in the previous step.

 
makewordcloud(freqtable)

makebarplot(freqtable)
 

To make topic model graph

Pass in the Abstract column; the function builds the topic model graph using 10 topics.

 
maketopicmodel(df$Abstract)
 

To create topics

To create K topics, pass in the Abstract column and K:

 
topics <- createtopics(df$Abstract, K)
 

K defaults to 10.
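For readers who want to see what fitting K topics involves, here is a hedged sketch using LDA from the topicmodels package and the tm preprocessing functions, both listed in the prerequisites. Whether createtopics works this way is an assumption. The toy abstracts, the choice of k = 2, the seed, and the names `lda`, `top_terms`, and `doc_topics` are illustrative only.

```r
library(tm)
library(topicmodels)

# Toy abstracts standing in for df$Abstract (made up for illustration).
abstracts <- c(
  "Gene expression changes during cell growth",
  "Protein folding depends on cell temperature",
  "Gene regulation controls protein expression",
  "Neural networks model brain activity patterns"
)

# Build a corpus, then lowercase it and strip punctuation and English stop words.
corpus <- VCorpus(VectorSource(abstracts))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Document-term matrix used as input to the topic model.
dtm <- DocumentTermMatrix(corpus)

# Fit an LDA model with k topics; the seed makes the fit reproducible.
k <- 2
lda <- LDA(dtm, k = k, control = list(seed = 1234))

# Top 3 terms per topic (one column per topic).
top_terms <- terms(lda, 3)

# Most likely topic for each abstract.
doc_topics <- topics(lda)

print(top_terms)
print(doc_topics)
```

With real data, a larger K such as the default of 10 needs far more documents than this toy corpus to give meaningful topics.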

To make links using topics

Pass the topics into the function to build a network of linked topics.

 
text_link(topics)
 
