TUM-Core-Facility-Microbiome/tagmen

Tagmen

Steps

0) Install dependencies

Script dependencies

The Python dependencies of the scripts are listed in requirements.txt. Install them with pip:

pip install -r requirements.txt

Additional dependencies

This tool uses cd-hit-est to cluster nucleotide sequences, so make sure cd-hit is installed as well. It is available, for example, from Bioconda or through the Debian package manager.

1) Extract 15N sequences from RT files

Move all RT sample files you want to analyze into one folder.

Extract sequences:

./readtag.py <INPUT_FOLDER_RT> <OUTPUT_FOLDER_RT>

This creates the output folder, which holds the 15N.tsv files used in the next step.

2) Run CD-HIT-EST

2a) Generate a FASTA file from the 15N.tsv file

cd <OUTPUT_FOLDER_RT>
grep -v '#' 341-RT-T1_S341_L001.15N.tsv | awk '{OFS="\t"; print ">"$1"\n"$2}' > 341-RT-T1_S341_L001.15N.fasta
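
For readers who prefer to stay in Python, the conversion above can be sketched as a small function. This is only an illustration of what the grep/awk pipeline does, not part of the repository: the function name tsv_to_fasta is hypothetical, and it assumes that the first column of the 15N.tsv file is the sequence ID and the second column is the sequence, as the awk command implies.

```python
def tsv_to_fasta(tsv_path, fasta_path):
    """Write one FASTA record per data row of a 15N.tsv file.

    Assumption: column 1 is the sequence ID and column 2 the sequence,
    mirroring `print ">"$1"\n"$2` in the awk command.
    Returns the number of records written.
    """
    n_records = 0
    with open(tsv_path) as src, open(fasta_path, "w") as dst:
        for line in src:
            # Like `grep -v '#'`: drop every line that contains a '#'.
            if "#" in line:
                continue
            # Split on any whitespace, like awk's default field splitting.
            fields = line.split()
            # Unlike the awk one-liner, skip blank or incomplete rows
            # instead of emitting an empty record.
            if len(fields) < 2:
                continue
            dst.write(f">{fields[0]}\n{fields[1]}\n")
            n_records += 1
    return n_records
```

Calling tsv_to_fasta("341-RT-T1_S341_L001.15N.tsv", "341-RT-T1_S341_L001.15N.fasta") inside the RT output folder would then produce the same FASTA file name that step 2b expects.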

2b) Run CD-HIT-EST

cd <OUTPUT_FOLDER_RT>
cd-hit-est -d 0 -i 341-RT-T1_S341_L001.15N.fasta -o 341-RT-T1_S341_L001.15N.cdhit
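
The -d 0 option tells cd-hit-est to write each sequence name into the .clstr file up to the first whitespace instead of truncating it to the default description length. The command above handles a single sample; if the output folder holds several FASTA files, the same call could be repeated for each of them. The following is a minimal sketch under that assumption. The helper names cdhit_command and run_cdhit_on_folder are hypothetical, and only the flags shown in the command above are used.

```python
import subprocess
from pathlib import Path


def cdhit_command(fasta_path):
    """Build the cd-hit-est call from step 2b for one FASTA file.

    The output prefix replaces the .fasta suffix with .cdhit, so
    cd-hit-est also writes <prefix>.cdhit.clstr next to it.
    """
    fasta_path = Path(fasta_path)
    output_prefix = fasta_path.with_suffix(".cdhit")
    return [
        "cd-hit-est",
        "-d", "0",  # keep sequence names up to the first whitespace
        "-i", str(fasta_path),
        "-o", str(output_prefix),
    ]


def run_cdhit_on_folder(folder):
    """Run cd-hit-est on every *.15N.fasta file in a folder (assumed naming)."""
    for fasta in sorted(Path(folder).glob("*.15N.fasta")):
        # check=True raises CalledProcessError if cd-hit-est fails.
        subprocess.run(cdhit_command(fasta), check=True)
```

Building the argument list separately from running it makes it easy to print and inspect the command before any clustering starts.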

2c) Parse clusters

./parse_cdhit_clusters.py <OUTPUT_FOLDER_RT>/341-RT-T1_S341_L001.15N.tsv <OUTPUT_FOLDER_RT>/341-RT-T1_S341_L001.15N.cdhit <OUTPUT_FOLDER_RT>/341-RT-T1_S341_L001.15N.cdhit.clstr
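
To make the third argument less opaque, here is a rough sketch of how a cd-hit .clstr file can be read. It is not the code of parse_cdhit_clusters.py, whose output format is not described in this README. It only relies on the standard .clstr layout: a ">Cluster N" header, followed by one line per member, where the representative sequence is marked with "*". The function name parse_clstr is hypothetical.

```python
import re

# Member line, e.g. "0\t15nt, >read_1... *" or "1\t15nt, >read_2... at +/100.00%"
MEMBER_RE = re.compile(r"^\d+\s+\d+nt, >(.+?)\.\.\. (\*|at .*)$")


def parse_clstr(lines):
    """Map each cluster's representative name to the list of its members."""
    groups = []
    for line in lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            # A new cluster starts; its representative is not known yet.
            groups.append({"rep": None, "members": []})
            continue
        match = MEMBER_RE.match(line)
        if match is None or not groups:
            continue  # ignore blank or unexpected lines
        name, status = match.groups()
        groups[-1]["members"].append(name)
        if status == "*":
            groups[-1]["rep"] = name
    return {g["rep"]: g["members"] for g in groups if g["rep"] is not None}
```

Passing an open file handle, for example parse_clstr(open("341-RT-T1_S341_L001.15N.cdhit.clstr")), works because the function only iterates over lines.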

3) Extract 15N pairs from LT files

Move all LT sample files you want to analyze into one folder.

./linktag.py <INPUT_FOLDER_LT> <OUTPUT_FOLDER_LT>

4) Aggregate results

./aggregate.py <OUTPUT_FOLDER_LT>/337-LT-T1_S337_L001.15Npairs.tsv <INPUT_FOLDER_RT> <OUTPUT_FOLDER_RT>/341-RT-T1_S341_L001.15N.clusters.tsv <OUTPUT_FOLDER_AGGREGATE>
