Releases: jtamames/SqueezeMeta

v1.6.3

20 Sep 07:54
  • Conda installations will now prioritize conda binaries instead of the vendored ones in some cases. This will hopefully fix certain issues in which SqueezeMeta was failing on certain distributions/versions.
  • test_install.pl now performs additional tests to check that binaries can be executed in the current environment.
  • Increased speed and reduced memory usage in step 10 (read counting).
  • Fixed an error in which projects created with the sequential mode would fail to restart. Note that each sample still has to be restarted individually.
  • Fixed an error in which step 16 (DAStool bin merging) would be attempted even if the --nobins flag was provided.
  • SQMtools: fixed an error in exportPathways when the requested KEGG map had only arrows.
  • SQMtools: fixed an error in which figures would not be generated properly when count='percent' was selected if any sample had 0 reads (as could happen when analyzing subsets).
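
The zero-read case in the last item can be illustrated with a minimal Python sketch. It is not the SQMtools code (which is written in R); the function name `to_percent` and the choice to report an empty sample as all zeros are assumptions made only for this example.

```python
def to_percent(counts):
    """Convert per-sample read counts to percentages.

    counts: dict mapping a sample name to a list of read counts,
    one per feature. Samples with no reads are returned as zeros
    (an assumption of this sketch) instead of dividing by zero.
    """
    percents = {}
    for sample, values in counts.items():
        total = sum(values)
        if total == 0:
            # An empty sample would otherwise raise ZeroDivisionError.
            percents[sample] = [0.0 for _ in values]
        else:
            percents[sample] = [100.0 * v / total for v in values]
    return percents


# sampleB has no reads, as can happen after subsetting a project.
example = {"sampleA": [30, 70], "sampleB": [0, 0]}
result = to_percent(example)
```

Without the guard on `total`, the second sample would abort the whole conversion, which is the kind of failure a percentage-based plot has to avoid.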

v1.6.2post3

12 Jul 16:13
  • Update SPAdes to 3.15.5 so it works with python 3.10

v1.6.2post2

11 Jul 16:15
  • Upgrade to python 3.10 and improve conda packaging, hopefully fix #705 and be more future-proof

v1.6.2post1

03 May 18:26
  • Fix an issue in which pysam was not properly installed when installing SqueezeMeta through conda

v1.6.2

21 Mar 17:29
0647985

New features

  • Added spades-base as a possible assembler for SqueezeMeta. This will make SqueezeMeta call SPAdes with no additional flags. Flags for SPAdes can then be customized by the user by passing --assembly_options "EXTRA OPTIONS" when calling SqueezeMeta. More information can be found in the ReadMe and the PDF manual.
  • Added the utility script sqm2zip.py, which packs the essential files from a SqueezeMeta project into a single zip file.
  • SQMtools: loadSQM can now load a project directly from a zip file created by sqm2zip.py (the syntax would be loadSQM("/path/to/my_project.zip")).
  • SQMtools: SQMtools is now available in CRAN and can be installed with install.packages("SQMtools") in Windows, Mac and Linux computers.
  • These changes are meant to allow users to easily transfer their data from their clusters/workstations to their personal computers and explore their results there.
  • SQMtools: mostAbundant and mostVariable now accept the argument bycol = TRUE, which will make these functions operate on columns rather than rows.
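
The row-versus-column behaviour of bycol can be sketched in Python. The v1.6.0 notes describe mostVariable as ranking by coefficient of variation, and that is the only part taken from the source. The function names, the use of the sample standard deviation, and returning indices rather than the data itself are assumptions of this sketch, not the SQMtools interface.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return stdev(values) / m


def most_variable(matrix, n=1, bycol=False):
    """Return the indices of the n most variable rows (or columns).

    matrix: list of rows, each a list of numbers.
    bycol:  when True, rank columns instead of rows.
    """
    # Transposing turns columns into the vectors being ranked.
    vectors = list(zip(*matrix)) if bycol else matrix
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: coefficient_of_variation(vectors[i]),
        reverse=True,
    )
    return ranked[:n]


matrix = [
    [10, 10, 10],  # constant row: coefficient of variation is 0
    [1, 5, 9],     # variable row
]
top_row = most_variable(matrix, n=1)
top_col = most_variable(matrix, n=1, bycol=True)
```

Here the second row is the most variable one, while the first column (values 10 and 1) is the most variable column.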

Minor changes / bugfixes

  • We now use coverage variances in addition to average contig coverages when calling metabat2, which should improve the quality of the resulting bins.
  • Mapping results are now stored as BAM files instead of SAM files, which should reduce disk usage.
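
To see why a coverage variance adds information on top of the average coverage, consider the following Python sketch. It only illustrates the statistics involved; how SqueezeMeta or metabat2 actually compute these values is not described in these notes, and the function name `coverage_summary` is hypothetical.

```python
from statistics import mean, variance


def coverage_summary(per_base_depth):
    """Return (mean, sample variance) of the per-base depth of one contig."""
    return mean(per_base_depth), variance(per_base_depth)


# Two hypothetical contigs with the same average coverage.
even = coverage_summary([10, 10, 10, 10])    # reads spread evenly
uneven = coverage_summary([0, 0, 20, 20])    # reads piled on one half
```

Both contigs have a mean depth of 10, but only the variance tells them apart, which is the extra signal a binner can use.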

Known issues / Other announcements

  • The make_databases.pl script may spend a lot of time in the "Creating SQLite databases" step. We have included a patch to improve this, but the slowdown still happens inconsistently (taking a few hours in some systems, and several days in others). Having a lot (1-2 TB) of free disk space may help. download_databases.pl should be considered the preferred way of quickly getting reasonably up-to-date databases.
  • We are discontinuing official support for CentOS7, as its default libraries are too outdated now. We plan on supporting SqueezeMeta in Debian, WSL2-Ubuntu and (hopefully) CentOS Upstream in the not so distant future.

v1.6.1post1

07 Feb 12:34
  • Fix for yesterday's release, which did not include all the intended features.

v1.6.1

06 Feb 10:23
6643a92

New features

  • Added the seqvec2fasta function to SQMtools. It will print a named vector containing sequences (such as the ones used to store contig and ORF sequences in SQM$contigs$seqs and SQM$orfs$seqs) as a single fasta-formatted string.
  • The make_databases.pl, download_databases.pl and configure_nodb.pl scripts now perform more error checking after each database creation step, and will call test_install.pl before finishing. This should help detect the instances in which database creation was unsuccessful e.g. due to a failed download.

Minor changes / bugfixes

  • Fixed a bug in remap.pl.
  • Fixed a bug introduced in v1.6.0 in which trimmomatic was not being called even when the --cleaning flag was provided.
  • Fixed a bug in which single reads were causing problems during assembly.
  • Fixed a bug in which cover.pl was using the system's perl interpreter instead of the one in the user environment.
  • Improved SQL queries in make_databases.pl to hopefully speed up database creation.
  • Fixed an issue in which mothur dependencies were not correctly fulfilled by conda.
  • Fixed an issue in which restarting a sequential project failed at step 4.
  • Fixed several minor issues with the restart mode.
  • Fixed remove_duplicate_markers.pl so it works in the new binning structure.
  • Fixed an issue in which SPAdes was using only 400G of memory even if more was available in the system.
  • engine="data.table" and tax_mode="prokfilter" are now the default options in loadSQM.
  • Fixed an issue in which subsetSamples corrupted the binning information, making it impossible to further subset the resulting object.
  • The PDF SQMtools manual is back. Future availability will depend on whether I can keep getting R's clunky LaTeX interface to produce PDFs in which the tables are rendered correctly.

Known issues

  • The make_databases.pl script may spend a lot of time in the "Creating SQLite databases" step. We have included a patch to improve this, but the slowdown still happens inconsistently (taking a few hours in some systems, and several days in others). Having a lot (1-2 TB) of free disk space may help. download_databases.pl should be considered the preferred way of quickly getting reasonably up-to-date databases.

v1.6.0 - One egg for many baskets

10 Sep 07:39
21ce1ff

New features

  • The script restart.pl has been removed. Project restart is now achieved by calling SqueezeMeta.pl --restart -p <project_name>. The flags -step <STEP> --force-overwrite can be added to this call in order to restart the pipeline from a specific step.
  • Users can now control whether the source of bin taxonomy is the LCA algorithm from SqueezeMeta, or the taxonomic assignment performed by CheckM. This can be controlled with the flag -taxbinmode. Options are s (SqueezeMeta only, default), c (CheckM), s+c (SqueezeMeta, missing ranks will be completed with CheckM taxonomy when possible) or c+s (CheckM, missing ranks will be completed with SqueezeMeta taxonomy when possible).
  • Users can now control the minimum percentage of genes from the same taxa needed in order to taxonomically annotate a contig. This can be done with the flag -consensus .
  • sqm_longreads.pl will now consider partial hits completely contained inside a long read as valid hits. Before, partial hits were only considered valid if they occurred at the beginning or end of the reads. This has a noticeable impact in the annotation percentages. The old behaviour can be reinstated with the flags -n or -nopartialhits.
  • sqm2pavian.pl now works with results from sqm_reads.pl and sqm_longreads.pl.
  • Added the option --filter to sqm_mapper.pl. When this flag is present, the script will filter a set of input sequences, returning only the ones that did not map to the reference.
  • SQMtools: SQM objects now track the length, abundance, mapped bases, coverage and coverage per million reads of bins. The corresponding matrices can be found under the SQM$bins list. When running subsetContigs, these values will be updated taking into consideration only the contigs from each bin that were selected.
  • SQMtools: added the subsetSamples function to generate subsetted SQM objects containing only the requested samples.
  • SQMtools: added the plotBins function to generate barcharts with the distribution of bins across samples.
  • SQMtools: unmapped reads for functions are no longer tracked, since it led to inconsistent results in some cases (see #442). This also affects the tables generated by sqm2tables.py.
  • SQMtools: added the mostVariable function, which will return the most variable rows (based on their coefficient of variation) from a data.frame or matrix. The interface is otherwise similar to the mostAbundant function.
  • SQMtools: SQM objects now track the coverage per million of reads of orfs, contigs, bins and functions. Each can be accessed inside the corresponding list under the cpm name. "cpm" is also a valid count option for plotFunctions and plotBins.
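
The four -taxbinmode options listed above (s, c, s+c, c+s) can be sketched in Python as follows. The option names and their meaning come from the release notes. The rank list, the function names and the simple rank-by-rank filling are assumptions of this sketch, and the real pipeline may apply extra checks behind the words "when possible".

```python
# Hypothetical rank list, ordered from highest to lowest.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def merge_taxonomy(primary, secondary):
    """Keep every rank present in `primary`; fill missing ranks from `secondary`."""
    merged = {}
    for rank in RANKS:
        value = primary.get(rank) or secondary.get(rank)
        if value:
            merged[rank] = value
    return merged


def bin_taxonomy(sqm_tax, checkm_tax, mode="s"):
    """Pick the bin taxonomy according to a -taxbinmode style option."""
    if mode == "s":
        return dict(sqm_tax)
    if mode == "c":
        return dict(checkm_tax)
    if mode == "s+c":
        return merge_taxonomy(sqm_tax, checkm_tax)
    if mode == "c+s":
        return merge_taxonomy(checkm_tax, sqm_tax)
    raise ValueError(f"unknown mode: {mode}")


# Made-up annotations for one bin.
sqm = {"superkingdom": "Bacteria", "phylum": "Proteobacteria"}
checkm = {
    "superkingdom": "Bacteria",
    "phylum": "Proteobacteria",
    "class": "Gammaproteobacteria",
}
default_tax = bin_taxonomy(sqm, checkm)            # mode "s"
completed_tax = bin_taxonomy(sqm, checkm, "s+c")   # class filled from CheckM
```

In this example the default mode stops at phylum, while s+c also reports the class taken from the second source.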

Minor changes / bugfixes

  • SQMtools will from now on follow the same version numbers as the corresponding SqueezeMeta releases.
  • Updated DIAMOND version to 2.0.15.
  • Fixed a bug when adding taxonomic assignments to bins, in which a lack of consensus in a high level prevented looking for consensus at deeper levels.
  • Fixed a bug in which data.table may make DAStool crash if it was called with a very high number of threads.
  • Fixed a bug in which both reads of a pair were counted as mapped even if only one of them actually mapped to the reference. This had little impact in real datasets, but is corrected now.
  • Fixed a bug in which custom arguments passed to bowtie2 with -mapping_options conflicted in some cases with the --very-sensitive-local option that we use by default when calling bowtie2. --very-sensitive-local is now skipped when the user provides custom arguments to bowtie2.
  • Fixed an uncommon issue in which contigs could end up being assigned to more than one bin after restarting the pipeline.
  • Fixed a bug in sqm_longreads.pl when using several input files from the same sample.
  • loadSQM now removes redundant info from the orfs and contigs tables when loading a project into SQMtools, resulting in less memory usage.
  • Fixed a bug in which loading a project with loadSQM could randomly cause an error.
  • We no longer provide a PDF manual for SQMtools. The documentation for each function can still be accessed from the R terminal or RStudio.
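
The new bowtie2 behaviour described in this list (the default preset is skipped when custom arguments are given) can be sketched in Python. SqueezeMeta itself is written in Perl, so this is only an illustration of the rule: the function name `bowtie2_extra_args` is hypothetical, and only the option handling is shown, not a full bowtie2 command line.

```python
import shlex

# Preset that the release notes say is used by default.
DEFAULT_PRESET = ["--very-sensitive-local"]


def bowtie2_extra_args(mapping_options=None):
    """Return the extra bowtie2 arguments for a run.

    mapping_options: the string a user would pass with -mapping_options,
    or None/empty when no custom arguments are given.
    """
    if mapping_options:
        # Custom arguments replace the default preset instead of being
        # appended to it, so the two can no longer conflict.
        return shlex.split(mapping_options)
    return list(DEFAULT_PRESET)


default_args = bowtie2_extra_args()
custom_args = bowtie2_extra_args("--end-to-end -N 1")
```

Replacing rather than appending is the simplest way to avoid passing two mutually exclusive alignment modes to the mapper.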

Compatibility Changes

  • Results generated by previous versions of SqueezeMeta will not load into SQMtools 1.6.0 (which corresponds to SqueezeMeta release 1.6.0). Running 19.getcontigs.pl /path/to/project will make a project generated with SqueezeMeta v1.5 compatible with the new version of SQMtools.

v1.5.2

12 Apr 11:14

Minor changes / bugfixes

  • Fixed a bug in consensus taxonomy search during binning, in which a bin could get assigned to a low taxonomic rank even if there was no consensus at higher taxonomic ranks.
  • Updated DIAMOND version to 2.0.14. This should get rid of several cases in which search against the nr database resulted in out of memory errors.
  • Fixed a typo in the PDF manual in which Figure 6 was missing
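
The consensus fix in the first item can be pictured as a top-down walk over taxonomic ranks that stops at the first rank without agreement. The Python sketch below is a simplified illustration only: the rank list, the majority vote over contigs and the `min_fraction` threshold are assumptions, since the actual SqueezeMeta consensus algorithm is not described in these notes.

```python
from collections import Counter

# Hypothetical rank list, ordered from highest to lowest.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def consensus_taxonomy(contig_taxa, min_fraction=0.5):
    """Assign ranks to a bin from the highest rank downwards.

    contig_taxa: list of dicts (rank -> taxon), one per contig in the bin.
    The walk stops at the first rank lacking consensus, so a lower rank
    is never assigned below an unresolved higher rank.
    """
    assigned = {}
    for rank in RANKS:
        labels = [taxa[rank] for taxa in contig_taxa if taxa.get(rank)]
        if not labels:
            break
        taxon, votes = Counter(labels).most_common(1)[0]
        if votes / len(contig_taxa) < min_fraction:
            break
        assigned[rank] = taxon
    return assigned


# Made-up bin: contigs disagree at phylum level but mostly agree at genus level.
contigs = [
    {"superkingdom": "Bacteria", "phylum": "A", "genus": "X"},
    {"superkingdom": "Bacteria", "phylum": "A", "genus": "X"},
    {"superkingdom": "Bacteria", "phylum": "B", "genus": "X"},
    {"superkingdom": "Bacteria", "phylum": "B", "genus": "Y"},
]
bin_tax = consensus_taxonomy(contigs, min_fraction=0.6)
```

With a 0.6 threshold the phylum vote is split evenly, so the bin is annotated only at superkingdom level even though three of the four contigs share a genus label.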

v1.5.1

20 Jan 17:40

Minor changes / bugfixes

  • Fixes #417, in which flye was missing some necessary binaries