Releases: metagenome-atlas/atlas

Use checkM2

03 Feb 13:56
c0b97a7

What's Changed

Thank you @trickovicmatija for your help.

Full Changelog: v2.13.1...v2.14.0

V2.13

25 Nov 13:04

What's Changed

  • use minimap for contigs, genecatalog and genomes in #569 #577
  • filter genomes myself in #568
    The filter function is defined in the config file:
genome_filter_criteria: "(Completeness-5*Contamination >50 ) & (Length_scaffolds >=50000) & (Ambigious_bases <1e6) & (N50 > 5*1e3) & (N_scaffolds < 1e3)"

The genome filtering is similar to that used in other publications in the field, e.g. GTDB. What is perhaps a bit different is that genomes with completeness around 50% and contamination around 10% are excluded, whereas dRep with its default parameters would include them.
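
To make the effect of the filter concrete, here is a minimal sketch that evaluates the expression above for a few genomes. It assumes the criteria are evaluated per genome with the quality statistics as variables; the genome names, the values, and the helper passes_filter are hypothetical, and this is not necessarily how atlas applies the filter internally.

```python
# Filter expression copied verbatim from the config line above.
genome_filter_criteria = (
    "(Completeness-5*Contamination >50 ) & (Length_scaffolds >=50000) & "
    "(Ambigious_bases <1e6) & (N50 > 5*1e3) & (N_scaffolds < 1e3)"
)

# Hypothetical genome quality records; all values are invented for illustration.
genomes = {
    "high_quality": {
        "Completeness": 95.0, "Contamination": 2.0, "Length_scaffolds": 3_000_000,
        "Ambigious_bases": 0, "N50": 50_000, "N_scaffolds": 120,
    },
    "borderline": {  # completeness ~50%, contamination ~10%
        "Completeness": 50.0, "Contamination": 10.0, "Length_scaffolds": 1_500_000,
        "Ambigious_bases": 100, "N50": 8_000, "N_scaffolds": 400,
    },
    "too_short": {
        "Completeness": 80.0, "Contamination": 1.0, "Length_scaffolds": 40_000,
        "Ambigious_bases": 0, "N50": 6_000, "N_scaffolds": 30,
    },
}


def passes_filter(stats, criteria=genome_filter_criteria):
    """Evaluate the criteria with the genome's statistics as variables."""
    # Builtins are disabled so only the statistic names and literals are usable.
    return bool(eval(criteria, {"__builtins__": {}}, dict(stats)))


kept = [name for name, stats in genomes.items() if passes_filter(stats)]
print(kept)
```

The borderline genome scores 50 - 5*10 = 0 on the first term and is dropped, which is the case described in the paragraph above; the short genome fails the Length_scaffolds term.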

  • use dRep again in #579
    We saw better performance using dRep. This now also scales to ~1K samples.
  • Use new DRAM version 1.4 in #564

Full Changelog: v2.12.0...v2.13.0

v2.12.0

07 Oct 12:48
3d8e200

What's Changed

  • GTDB-tk requires rule extract_gtdb to run first by @Waschina in #551
  • use Galah instead of Drep
  • use bbsplit for mapping to genomes (maybe move to minimap in future)
  • faster gene catalog quantification using minimap.
  • Compatible with snakemake v7.15

New Contributors

Full Changelog: v2.11.1...v2.12.0

Fix Enormous gene catalog

09 Sep 09:19

Due to a bug in v2.11, the gene catalog was created from all genes, not only the representatives.

If you have an oversized gene catalog, rerun:

atlas run genecatalog -R generate_orf_info

Small change in Dram environment to fix #547

Use parquet and pyfastx to handle large gene catalogs

05 Aug 20:43

What's Changed

  • Make atlas handle large gene catalogs using parquet and pyfastx (Fix #515)

Parquet files can be opened in Python with:

import pandas as pd
coverage = pd.read_parquet("working_dir/Genecatalog/counts/median_coverage.parquet")
coverage.set_index("GeneNr", inplace=True)

and in R it should be something like:

arrow::read_parquet("working_dir/Genecatalog/counts/median_coverage.parquet")

Full Changelog: v2.10.0...v2.11.0

GTDB v 207 low memory profiling

26 Jul 13:54
591446d

New Features

  • GTDB version 207
  • Low memory taxonomic annotation

Minor changes

Full Changelog: v2.9.1...v2.10.0

Go Public

04 Apr 09:43

What's Changed

  • ✨ Start an atlas project from public data in SRA Docs
  • Make atlas ready for python 3.10 #498
  • Add strain profiling using inStrain. You can run: atlas run genomes strains

New Contributors

  • @alienzj made their first contribution to fix config when run DRAM annotate in #495

Full Changelog: v2.8.2...v2.9.0

V2.8 - Toiminnot

01 Nov 13:21

This is a major update of metagenome-atlas. It was developed for the 3-day course in Finland, which is also why it has a Finnish release name.

What is new?

New binners

It integrates the bleeding-edge binners Vamb and SemiBin, which use co-binning based on co-abundance. Thank you @yanhui09 and @psj1997 for helping with this. First results are better with these binners than with the default.

See more

Pathway annotations

The command atlas run genomes produces genome-level functional annotations as well as KEGG pathways and modules. It uses DRAM from @shafferm with a hack to produce all available KEGG modules.

See more

Genecatalog

The command atlas run genecatalog now directly produces the abundance of the different genes. See more in #276

In the future, this part of the pipeline will include protein assembly to better tackle complicated metagenomes.

Minor updates

Reports are back

See for example the QC report

Update of all underlying tools

All tools used in atlas are now up to date, from the assembler to GTDB.
The one exception is BBmap, which contains a bug and ignores the minidentity parameter.

Atlas init

Atlas init correctly parses fastq files even if they are in subfolders and if paired-ends are named simply Sample_1/Sample_2. @Sofie8 will be happy about this.
Atlas log uses nice colors.
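
As an illustration of the kind of file discovery described above, the following is a minimal sketch that pairs FASTQ files named Sample_1/Sample_2 across subfolders. The function find_paired_fastq, the filename pattern, and the example file names are assumptions made for this sketch; it is not the code atlas init uses.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical naming pattern: <sample>_1.fastq.gz / <sample>_2.fastq.gz
PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.(?:fastq|fq)(?:\.gz)?$")


def find_paired_fastq(root):
    """Map each sample name to its R1/R2 files, searching subfolders too."""
    samples = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        match = PAIR_PATTERN.match(path.name)
        if path.is_file() and match:
            samples[match["sample"]]["R" + match["read"]] = path
    # Keep only samples for which both mates were found.
    return {name: reads for name, reads in samples.items() if len(reads) == 2}


# Usage with a throwaway directory tree of empty files.
with tempfile.TemporaryDirectory() as tmp:
    for rel in [
        "batch1/SampleA_1.fastq.gz",
        "batch1/SampleA_2.fastq.gz",
        "batch2/SampleB_1.fastq.gz",  # mate is missing, so SampleB is skipped
        "notes.txt",
    ]:
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    pairs = find_paired_fastq(tmp)
    paired_samples = sorted(pairs)
    print(paired_samples)
```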

Default clustering of Subspecies

The default ANI threshold for genome-dereplication was set to 97.5% to include more sub-species diversity.

See more
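
To show how an ANI threshold such as the 97.5% above changes the number of genome clusters, here is a minimal sketch using simple single-linkage grouping. This is a simplification and not dRep's actual clustering algorithm; the genome names, the ANI values, and the 95% comparison threshold are made up for illustration.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold):
    """Single-linkage grouping: genomes with ANI >= threshold share a cluster."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        if ani.get(frozenset((a, b)), 0.0) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise ANI values (as fractions).
genomes = ["MAG1", "MAG2", "MAG3"]
ani = {
    frozenset(("MAG1", "MAG2")): 0.990,
    frozenset(("MAG1", "MAG3")): 0.960,
    frozenset(("MAG2", "MAG3")): 0.965,
}

strict = cluster_by_ani(genomes, ani, threshold=0.975)
loose = cluster_by_ani(genomes, ani, threshold=0.95)
print(strict)
print(loose)
```

With the higher threshold MAG3 stays in its own cluster, so more sub-species diversity survives dereplication; with the lower threshold all three genomes collapse into one cluster.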

Python 3.8, new ruamel

03 Sep 07:16

Bug fixes Drep

03 Aug 10:09
2.6a3

Set jobs to default