11 Oct 08:24

SilasK

2871b8a

Latest

What's Changed

QC reads and assemblies are now written to the sample.tsv from the start. This should fix errors caused by partial writing to the sample.tsv (#695).
It also allows you to add external assemblies.
Singleton reads are no longer used throughout the pipeline.
This changes the default paths for raw reads and assemblies:
Assemblies: Assembly/fasta/{sample}.fasta
Reads: QC/reads/{sample}_{fraction}.fastq.gz

Seamless update: if you update Atlas and continue an old project, your old files will be copied, or the paths defined in the sample.tsv will be used.
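As a rough illustration of the new layout, the sketch below expands the two path templates quoted above for one sample and falls back to them when the sample table has no explicit entry. The fraction names ("R1", "R2"), the column name "Assembly" and the helper names are assumptions made for this example, not taken from the Atlas code.

```python
# Path templates quoted from the release notes above.
ASSEMBLY_TEMPLATE = "Assembly/fasta/{sample}.fasta"
READS_TEMPLATE = "QC/reads/{sample}_{fraction}.fastq.gz"


def default_paths(sample, fractions=("R1", "R2")):
    """Return the default assembly and QC read paths for one sample.

    The fraction names are an assumption for illustration only.
    """
    return {
        "assembly": ASSEMBLY_TEMPLATE.format(sample=sample),
        "reads": [
            READS_TEMPLATE.format(sample=sample, fraction=fraction)
            for fraction in fractions
        ],
    }


def resolve_path(sample_row, column, default):
    """Prefer a path given in the sample table row, else use the default.

    `sample_row` is a plain dict standing in for one row of sample.tsv;
    `column` is a hypothetical column name.
    """
    value = (sample_row.get(column) or "").strip()
    return value if value else default


paths = default_paths("sample1")
external = resolve_path({"Assembly": "external/sample1.fasta"}, "Assembly", paths["assembly"])
fallback = resolve_path({}, "Assembly", paths["assembly"])
```

Here `external` keeps the path listed in the table, which is how an external assembly would be picked up, while `fallback` resolves to the default location.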


17 Aug 15:10

SilasK

v2.18.0

5caa6ca

Co-binning

Co-binning with sub-groups

#683

In this new version, Atlas uses co-abundance binning by default.
Binning each sample individually is faster, but co-abundance binning quantifies the coverage of contigs across multiple samples, which provides valuable information about contig co-variation.

Note

Previously, each sample was put in its own BinGroup, optimized for single-sample binning.
Running vamb in those versions considered all samples, regardless of their BinGroup.
Hence, updating to v2.18 might cause errors if you use a sample.tsv file from an older Atlas version.
You can resolve this by assigning a unique BinGroup to each sample.
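One way to apply that fix is to rewrite the BinGroup column so that every sample gets its own group. The sketch below does this with the standard library only. It assumes the sample table is tab-separated with the sample name in the first column; the function name is hypothetical and not part of Atlas.

```python
import csv
import io


def assign_unique_bingroups(tsv_text, group_column="BinGroup"):
    """Return the sample table with each sample in its own BinGroup.

    Assumes a tab-separated table whose first column holds the sample name.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames)
    sample_column = fieldnames[0]
    if group_column not in fieldnames:
        fieldnames.append(group_column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Use the sample name itself as the group label, so it is unique.
        row[group_column] = row[sample_column]
        writer.writerow(row)
    return out.getvalue()


old_table = "sample\tBinGroup\nS1\tAll\nS2\tAll\n"
new_table = assign_unique_bingroups(old_table)
```

For a real project you would read the text from sample.tsv and write the result back, after keeping a copy of the original file.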

Link to documentation

Full Changelog: v2.17.2...v2.18.0


21 Jul 12:21

SilasK

v2.17.2

cdd2581

v2.17.2

Fixes

Ignore certificate for gtdb_v08 by @mladen5000 in #674
Fixed pandas dependency for instrain by @mladen5000 in #678
Convert mem_mb value from gb to mb for select rules by @LLansing in #681
ci: use micromamba by @SilasK in #682

Contributors

SilasK, mladen5000, and LLansing


15 Jun 13:11

SilasK

v2.17.0

da97853

Use skani for genome clustering

Skani

The tool skani claims to be better and faster than the combination of mash + FastANI used by dRep.
I implemented skani for species clustering.
We now do the species clustering in the atlas run binning step,
so you get information about the number of dereplicated species in the binning report. This allows you to run different binners before choosing the one to use for the genome annotation.
Also, the file storage was improved: all important files are now in Binning/{binner}/

My custom species clustering does the following steps:

Pre-cluster genomes with single-linkage at 92.5 ANI.
Re-calibrate checkm2 results.

If a minority of genomes in a pre-cluster use a different translation table, they are removed.
If some genomes of a pre-cluster don't use the specific completeness model, we re-calibrate completeness to the minimum value.
This ensures that a poor genome evaluated on the general model is not preferred over a better genome evaluated on the specific model.
See also https://silask.github.io/post/better_genomes/ Section 2.
Drop genomes that don't meet the filter criteria after re-calibration.

Cluster genomes with an ANI threshold (default 95%).
Select the best genome as representative based on the quality score: Completeness - 5 x Contamination.
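To make two of these steps concrete, here is a minimal sketch of single-linkage grouping at an ANI threshold and of choosing a representative by the quality score. It is not the Atlas implementation: the function names, the dictionary-based inputs and the toy ANI values are assumptions for illustration, and the checkm2 re-calibration step is left out.

```python
from itertools import combinations


def quality_score(completeness, contamination):
    """Quality score from the notes: Completeness - 5 x Contamination."""
    return completeness - 5 * contamination


def single_linkage(genomes, ani, threshold):
    """Group genomes connected by a chain of pairs with ANI >= threshold."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster root, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        # Pairs without a reported ANI are treated as unrelated.
        value = ani.get((a, b), ani.get((b, a), 0.0))
        if value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return list(clusters.values())


def pick_representatives(clusters, quality):
    """Select the genome with the highest quality score in each cluster."""
    return [max(c, key=lambda g: quality_score(*quality[g])) for c in clusters]


genomes = ["A", "B", "C", "D"]
ani = {("A", "B"): 96.0, ("B", "C"): 93.0, ("A", "C"): 91.0, ("C", "D"): 80.0}
quality = {"A": (90, 2), "B": (95, 5), "C": (70, 1), "D": (60, 0)}  # (completeness, contamination)

pre_clusters = single_linkage(genomes, ani, threshold=92.5)
representatives = pick_representatives(pre_clusters, quality)
```

In this toy input, A and C end up in the same pre-cluster even though their direct ANI is below 92.5, because both are linked through B. That chaining is what single linkage means here.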

New Contributors

@jotech made their first contribution in #667

Full Changelog: v2.16.3...v2.17.0

Contributors

jotech


17 May 13:35

SilasK

v2.16.2

196bb05

GTDB v8

Save GTDB v8 in the download folder for GTDB v8. Thanks to @strejcem.

Contributors

strejcem


12 May 13:54

SilasK

v2.16.1

af8322b

V2.16

What's Changed

fix gene_info.parquet by @SilasK in #642
docs: update gene catalog by @SilasK in #643
add minimum mapping quality in pileup by @johnne in #647
gtdb v8 by @SilasK in #648

New Contributors

@johnne made their first contribution in #647

Full Changelog: v2.15.2...v2.16.1

Contributors

johnne and SilasK


04 May 14:56

SilasK

v2.15.2

408aead

v2.15.2

What's Changed

Annotate gene catalog with Kegg, CAZy using DRAM
You can turn off GUNC

Full Changelog: v2.15.1...v2.15.2


13 Apr 22:42

SilasK

v2.15.0

8b46cdd

GUNC'n'More

What's Changed

Use Gunc
New folder organisation: the main output files for binning are in the new folder Binning
Use hdf-format for gene catalogs. This allows efficient storage and selective access to large count and coverage matrices from the genecatalog. (See docs for how to load them) #621
Semibin v. 1.5 by @SilasK in #622

Contributors

SilasK


03 Feb 13:56

SilasK

v2.14.0

c0b97a7

Use checkM2

What's Changed

Support for checkm2 by @SilasK in #607

Thank you @trickovicmatija for your help.

Full Changelog: v2.13.1...v2.14.0

Contributors

SilasK and trickovicmatija


25 Nov 13:04

SilasK

v2.13.0

42c4cbd

V2.13

What's Changed

Use minimap for contigs, genecatalog and genomes in #569 #577
Filter genomes myself in #568
The filter function is defined in the config file:

genome_filter_criteria: "(Completeness-5*Contamination >50 ) & (Length_scaffolds >=50000) & (Ambigious_bases <1e6) & (N50 > 5*1e3) & (N_scaffolds < 1e3)"

The genome filtering is similar to that of other publications in the field, e.g. GTDB. What is maybe a bit different is that genomes with completeness around 50% and contamination around 10% are excluded, whereas dRep with its default parameters would include them.
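The effect of the criteria can be checked with a small sketch that re-states the config expression as a plain Python function. The function name and the example values are hypothetical. The keys follow the column names in the config string, including its spelling of Ambigious_bases.

```python
def passes_genome_filter(genome):
    """Apply the default genome_filter_criteria to one genome (a dict)."""
    return (
        genome["Completeness"] - 5 * genome["Contamination"] > 50
        and genome["Length_scaffolds"] >= 50000
        and genome["Ambigious_bases"] < 1e6
        and genome["N50"] > 5 * 1e3
        and genome["N_scaffolds"] < 1e3
    )


# Hypothetical genomes that differ only in completeness and contamination.
base = {"Length_scaffolds": 3_000_000, "Ambigious_bases": 0, "N50": 20_000, "N_scaffolds": 150}
good = dict(base, Completeness=95, Contamination=1)         # score 90
borderline = dict(base, Completeness=50, Contamination=10)  # score 0

kept = [passes_genome_filter(g) for g in (good, borderline)]
```

The borderline genome has a quality score of 50 - 5 x 10 = 0, so it fails the first condition and is dropped, which is the case described above.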

Use dRep again in #579
We saw better performance using dRep. This now also scales to ~1K samples.
Use new Dram version 1.4 by in #564

Full Changelog: v2.12.0...v2.13.0



Releases: metagenome-atlas/atlas

All in the sample table
