The Global Soil Mycobiome consortium dataset

This repository contains the code associated with the paper:

Tedersoo L, Mikryukov V, Anslan S, Bahram M, Khalid AN, Corrales A, Agan A, Vasco-Palacios AM, Saitta A, Antonelli A, Rinaldi AC, Verbeken A, Sulistyo BP, Tamgnoue B, Furneaux B, Duarte Ritter C, Nyamukondiwa C, Sharp C, Marín C, Dai DQ, Gohar D, Sharmah D, Biersma EM, Cameron EK, De Crop E, Otsing E, Davydov EA, Albornoz FA, Brearley FQ, Buegger F, Gates G, Zahn G, Bonito G, Hiiesalu I, Hiiesalu I, Zettur I, Barrio IC, Pärn J, Heilmann-Clausen J, Ankuda J, Kupagme JY, Sarapuu J, Maciá-Vicente JG, Fovo JD, Geml J, Alatalo JM, Alvarez-Manjarrez J, Monkai J, Põldmaa K, Runnel K, Adamson K, Bråthen KA, Pritsch K, Tchan KI, Armolaitis K, Hyde KD, Newsham KK, Panksep K, Adebola LA, Lamit LJ, Saba M, da Silva Cáceres ME, Tuomi M, Gryzenhout M, Bauters M, Bálint M, Wijayawardene N, Hagh-Doust N, Yorou NS, Kurina O, Mortimer PE, Meidl P, Nilsson RH, Puusepp R, Casique-Valdés R, Drenkhan R, Garibay-Orijel R, Godoy R, Alfarraj S, Rahimlou S, Põlme S, Dudov SV, Mundra S, Ahmed T, Netherway T, Henkel TW, Roslin T, Fedosov VE, Onipchenko VG, Yasanthika WAE, Lim YW, Piepenbring M, Klavina D, Kõljalg U, and Abarenkov K. The Global Soil Mycobiome consortium dataset for boosting fungal diversity research // Fungal Diversity 111, 573-588 (2021) DOI:10.1007/s13225-021-00493-7.

Contents

Script	Description
01.Demultiplex.sh	Functions used to demultiplex PacBio sequencing runs
02.Extract_ITS.sh	Functions used for trimming the primers and extracting the ITS region
03.Chimera_removal.sh	Sample-wise reference-based and de novo chimera removal
04.Prepare_UNITE_data.sh	Preparation of UNITE+INSDc data for clustering
05.Clustering.sh	OTU clustering and sequence mapping
06.OTU_representative_script.R, 06.Select_new_OTU_representative.sh	Scripts for selecting alternative representative sequences
07.BLAST.sh	Functions used for taxonomic annotation
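
As a rough illustration of what "sample-wise reference-based and de novo chimera removal" can look like with VSEARCH, the sketch below prints the two commands for each sample instead of running them. It is not taken from 03.Chimera_removal.sh: the function name `chimera_cmds`, the file-naming scheme, the reference file name `unite_ref.fasta`, and the order of the two steps are all assumptions made for the example. Only the VSEARCH options `--uchime_ref`, `--db`, `--uchime_denovo`, and `--nonchimeras` are real.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the repository's 03.Chimera_removal.sh.
# Hypothetical: chimera_cmds, the output file names, unite_ref.fasta,
# and running the reference-based step before the de novo step.

# Print the two VSEARCH commands for one sample.
chimera_cmds() {
  local sample_fasta="$1"   # per-sample FASTA (de novo mode expects ;size= annotations)
  local ref_db="$2"         # reference database in FASTA format
  local base="${sample_fasta%.fasta}"

  # Step 1: reference-based chimera filtering against the reference database.
  printf 'vsearch --uchime_ref %s --db %s --nonchimeras %s\n' \
    "$sample_fasta" "$ref_db" "${base}.ref_nonchim.fasta"

  # Step 2: de novo chimera filtering on the output of step 1.
  printf 'vsearch --uchime_denovo %s --nonchimeras %s\n' \
    "${base}.ref_nonchim.fasta" "${base}.nonchim.fasta"
}

# Dry run for two hypothetical samples (file names without spaces assumed).
for f in sampleA.fasta sampleB.fasta; do
  chimera_cmds "$f" unite_ref.fasta
done
```

Printing the commands keeps the example runnable without VSEARCH installed. Once the file names match real data, the output could be piped to `bash` or to GNU parallel to process the samples.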

Data availability

The results of the analysis are available from the PlutoF data repository (DOI: 10.15156/BIO/2263453). They include an OTU table with the corresponding sample metadata and OTU (taxonomic and functional) metadata, in spreadsheet and Biological Observation Matrix (BIOM) formats.

The UNITE 9.01 beta dataset (used for reference-based chimera identification) is available at https://doi.org/10.15156/BIO/1444285

Sequence database used for BLAST-based identification at the kingdom level: https://doi.org/10.15156/BIO/1444347

Dependencies

These scripts require a shell/Linux computing environment and R version 4.0.5.

The following software was used in the analysis:

VSEARCH v.2.17.0 (Rognes et al., 2016)

cutadapt v.3.4 (Martin, 2011)

seqkit v.0.16.0 (Shen et al., 2016)

ITSxpress v.1.8.0 (Rivers et al., 2018)

ripgrep v.12.1.1

rush v.0.4.2

LIMA v.2.0.0 (PacBio)

csvtk v.0.23.0

BLAST+ v.2.11.0 (Camacho et al., 2009)

GNU bash v.5.0.17

GNU parallel v.20210422 (Tange, 2021)

GNU awk v.5.1.0

GNU find v.4.7.0

GNU sed v.4.8

R v.4.0.5 (R Core Team, 2021)

data.table v.1.14.0 (Dowle and Srinivasan, 2021)

Biostrings v.2.6.0 (Pagès et al., 2021)

plyr v.1.8.6 (Wickham, 2011)
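
Before running the scripts, it may help to confirm that the command-line tools listed above are on the PATH. The following is a minimal sketch, not part of the repository. The helper name `missing_tools` is hypothetical, and the executable names (for example `rg` for ripgrep, `blastn` for BLAST+, `Rscript` for R) are assumptions about how each tool is typically installed.

```shell
#!/usr/bin/env bash
# Sketch: report which command-line tools are missing from PATH.
# Hypothetical helper; the executable names passed below are assumptions.

# Print the name of every argument that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools vsearch cutadapt seqkit itsxpress rg rush lima \
  csvtk blastn parallel awk find sed Rscript)"

if [ -n "$missing" ]; then
  printf 'Not found on PATH:\n%s\n' "$missing"
else
  echo 'All listed tools were found on PATH.'
fi
```

The check only tests for presence. It does not compare installed versions against those listed above, and the R packages (data.table, Biostrings, plyr) would need to be checked from within R.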

Acknowledgements

The bulk of this work was supported by the Estonian Science Foundation (grants PRG632, PSG136, MOBTP198, PUT1170), Norway-Baltic financial mechanism (grant EMP442) and Novo Nordisk Fonden (Silva Nova).

Contact

For all code-related questions, please file a GitHub issue.

Please email [email protected] or [email protected] for any additional questions about the analytical methods used in this paper. All other relevant data are available from the authors upon request.
