My personal curated list of useful R packages and extensions for my work as a researcher in the life sciences, with a focus on (pharmaco)genetics, clinical studies, omics, and Shiny apps.
- Inspired by: awesome-react-components and awesome-rshiny.
Packages I load at the beginning of almost every session
- tidyverse - bundle of useful packages
- janitor - tool for cleaning and examining "dirty" data
- readxl - get data out of Excel and into R quickly and easily
- glue - enhanced string manipulation
- furrr - purrr's mapping functions, but in parallel
- fst - saving and loading data extremely fast
Science is barely possible without p-values
- rstatix - (tidyverse) pipe-friendly framework for basic statistical tests
- gtsummary - elegant and flexible way to create publication-ready analytical and summary tables
- pROC - tools for visualizing, smoothing and comparing ROC curves
- quantreg - quantile regression for non-parametric data
- coin - conditional inference (permutation) tests, plus functions to transform data
- breakerofchains - run a pipe chain up to the cursor line; bind the addin to a shortcut such as 'Ctrl + Shift + B'
- haplo.stats - analysis of haplotypes
- SNPassoc - useful for small genetic association studies
- SKAT - gene-based association tests and other burden tests
- ggfastman - plotting tons of p-values as Manhattan plots
- pathfindR - pathway enrichment analysis via active subnetworks
- caret - toolset for classification and regression models
- caretEnsemble - ensembles of caret models; also allows bootstrapping
- SuperLearner - easily estimate the performance of multiple machine learning models using cross-validation
ggplot2 extensions for further, easy customisation
- ggpubr - easy-to-use functions for creating ggplot2-based, publication-ready plots, including statistical add-ons
- ggsignif - visualization of statistical differences
- ggbeeswarm - beeswarm plots, i.e. categorical scatter plots or violin/box plots with points
- cowplot - creating publication-quality figures
- ggrepel - annotations without overlap
- ggannotate - point-and-click tool to annotate plots in the last production step
- plotly - interactive plots
- ggtext - add nice text, labels and boxes to your ggplots
There are better lists, as linked above, but I use the following packages very often
- shinydashboard - easy way to create dashboards
- shinydashboardPlus - nice features beyond the original dashboard
- shinyWidgets - additional and nicer inputs
- shinyjs - useful to integrate JavaScript in shiny code
- bslib - customizing bootstrap themes for shiny
Packages that do not fit 100% into the categories above, and other command-line tools
- clinPK - functions for clinical pharmacokinetics and clinical pharmacology
- valr - compare and manipulate genome intervals
- gplots - useful for heatmaps and venn diagrams
- disgenet2r - information about the genetic basis of human diseases
- plink - whole genome association analysis toolset
- snptest - genome-wide association analysis of SNPs
- regenie - whole genome regression modelling of massive large GWAS
- vep - genetic variant annotation including effects and functions
- SnpEff - genomic variant annotation and functional effect prediction toolbox
- pypgx - pharmacogenomics profiles from NGS & SNP array data
- aldy - pharmacogenomics profiles from NGS data
- samtools - toolbox for high-throughput sequencing data
- openai - API to access GPT-3
- celltypist - automated cell type annotation for scRNA-seq
- salmon - transcript quantification from RNA-seq data
- impute - imputation server
- gatk - genome analysis toolkit
- conda - environment management
- slurm - workload management
- nfcore - easy-to-use bioinformatics analysis pipelines built using Nextflow