DenisaConstantin/directRNAseqAnalysis

directRNAseqAnalysis

These two custom scripts are designed for the analysis of data generated through ONT direct RNA sequencing.

Script 1: Data Processing and Transcript Identification

The first script, script1.sh, processes reads stored in FASTQ files using a provided reference transcript file. The script must be executed in a directory containing only the relevant input files. The run command is as follows:

./script1.sh transcript_file.fasta
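Because the script expects a directory that holds only its inputs, it can help to stage each run in a fresh folder. The following is a minimal sketch, not part of the repository: the function name `stage_run_dir`, the folder name `run1`, and the `reads/` path are placeholders, and it assumes the script itself may sit next to the input files, as the run command suggests.

```shell
#!/usr/bin/env bash
set -euo pipefail

# stage_run_dir DEST SCRIPT REFERENCE FASTQ...
# Hypothetical helper: copies the script, the reference transcript file,
# and the FASTQ files into DEST so that DEST contains nothing else.
stage_run_dir() {
    local dest="$1" script="$2" ref="$3"
    shift 3
    mkdir -p "$dest"
    cp "$script" "$ref" "$@" "$dest"/
    chmod +x "$dest/$(basename "$script")"
}

# Usage (all paths are placeholders):
#   stage_run_dir run1 script1.sh transcript_file.fasta reads/*.fastq
#   (cd run1 && ./script1.sh transcript_file.fasta)
```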

Dependencies

This script requires the following tools:

- NanoPlot for quality control (https://github.com/wdecoster/NanoPlot)
- minimap2 for read alignment (https://github.com/lh3/minimap2)
- samtools for SAM/BAM file processing (https://github.com/samtools/samtools)
- NanoCount for transcript quantification (https://github.com/a-slide/NanoCount)

Workflow

- Read Alignment: Reads are aligned to the reference transcripts using minimap2.
- File Conversion: The resulting SAM file is converted into a sorted and indexed BAM file using samtools.
- Transcript Quantification: The alignment is analyzed with NanoCount to count transcripts; genes are then identified through a series of bash commands.
- Quality Control: NanoPlot automatically assesses the quality of each read file. No quality threshold is applied, so all reads are included in the analysis.
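The workflow steps can be sketched as a sequence of commands for a single read file. This is an illustration only and is not taken from script1.sh: the `map-ont` preset, the `-N 10` setting, and every output file name are assumptions. The function prints the commands rather than running them, so it can be tried without the tools installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# plan_sample REFERENCE FASTQ
# Hypothetical dry-run helper: prints one command per workflow step.
# Presets and file names are illustrative assumptions.
plan_sample() {
    local ref="$1" fastq="$2"
    local base="${fastq%.fastq}"
    # Read alignment against the reference transcripts
    echo "minimap2 -ax map-ont -N 10 ${ref} ${fastq} > ${base}.sam"
    # SAM to sorted, indexed BAM
    echo "samtools sort -o ${base}.sorted.bam ${base}.sam"
    echo "samtools index ${base}.sorted.bam"
    # Transcript quantification
    echo "NanoCount -i ${base}.sorted.bam -o ${base}.counts.tsv"
    # Quality control report for the read file
    echo "NanoPlot --fastq ${fastq} -o ${base}_nanoplot"
}

plan_sample transcript_file.fasta sample1.fastq
```

Piping the printed lines to `bash` would execute them, provided the four tools are on the `PATH`.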

Output

The script produces a results table named Results.csv with the following columns:

- Column 1: Name of the read file
- Column 2: Number of identified transcripts
- Column 3: Number of identified genes
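One way such a row could be derived is sketched below. It rests on two assumptions that are not confirmed by the repository: that the detected transcripts are available as a plain list of transcript IDs (one per line), and that genes are looked up through the `parent=FBgn...` field that FlyBase FASTA headers carry. The function name `count_row` and the IDs in the demo are made up.

```shell
#!/usr/bin/env bash
set -euo pipefail

# count_row SAMPLE FASTA IDS
# Hypothetical helper: prints "sample,n_transcripts,n_genes".
# FASTA is assumed to use FlyBase-style headers containing "parent=<gene ID>;".
# IDS is assumed to hold one detected transcript ID per line.
count_row() {
    local sample="$1" fasta="$2" ids="$3"
    local n_tx n_genes
    n_tx=$(sort -u "$ids" | wc -l | tr -d ' ')
    n_genes=$(awk '
        NR == FNR {                          # first file: reference FASTA
            if (/^>/ && match($0, /parent=[^;]+/)) {
                id = substr($1, 2)           # transcript ID without ">"
                gene[id] = substr($0, RSTART + 7, RLENGTH - 7)
            }
            next
        }
        ($1 in gene) { print gene[$1] }      # second file: detected IDs
    ' "$fasta" "$ids" | sort -u | wc -l | tr -d ' ')
    echo "${sample},${n_tx},${n_genes}"
}

# Demo with made-up identifiers: three transcripts from two genes.
demo=$(mktemp -d)
printf '%s\n' \
    '>FBtr0000001 type=mRNA; parent=FBgn0000001; species=Dmel;' 'ACGT' \
    '>FBtr0000002 type=mRNA; parent=FBgn0000001; species=Dmel;' 'ACGT' \
    '>FBtr0000003 type=mRNA; parent=FBgn0000002; species=Dmel;' 'ACGT' \
    > "$demo/ref.fasta"
printf '%s\n' FBtr0000001 FBtr0000002 FBtr0000003 > "$demo/ids.txt"
count_row sample1.fastq "$demo/ref.fasta" "$demo/ids.txt"   # sample1.fastq,3,2
rm -rf "$demo"
```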

Script 2: Group Comparisons

The second script, script2.sh, compares transcript data between two user-provided files with the .gene.txt extension, both generated by script1.sh. The run command is as follows:

./script2.sh groupA.gene.txt groupB.gene.txt

Output

This script generates three output files:

- Transcripts unique to the first group (groupA).
- Transcripts unique to the second group (groupB).
- Transcripts common to both groups.
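A three-way split like this can be produced with the standard `comm` utility on sorted inputs. The sketch below is not script2.sh itself: it assumes each input file holds one identifier per line, and the function name and output file names are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# compare_groups FILE_A FILE_B PREFIX
# Hypothetical helper: writes three files named after PREFIX.
# Assumes one identifier per line in each input file.
compare_groups() {
    local a="$1" b="$2" prefix="$3"
    local sa sb
    sa=$(mktemp)
    sb=$(mktemp)
    # comm requires sorted input; a fixed locale keeps the sort order consistent
    LC_ALL=C sort -u "$a" > "$sa"
    LC_ALL=C sort -u "$b" > "$sb"
    LC_ALL=C comm -23 "$sa" "$sb" > "${prefix}_unique_A.txt"   # only in A
    LC_ALL=C comm -13 "$sa" "$sb" > "${prefix}_unique_B.txt"   # only in B
    LC_ALL=C comm -12 "$sa" "$sb" > "${prefix}_common.txt"     # in both
    rm -f "$sa" "$sb"
}

# Usage (file names are placeholders):
#   compare_groups groupA.gene.txt groupB.gene.txt comparison
```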

Limitations

Currently, both scripts are designed to work exclusively with reference transcript files downloaded from FlyBase.
