Farewell, SUMMARY.cp

@meren meren released this 09 Oct 17:25

Apart from fixes for many tiny bugs, we have some major changes in v1.2.0!

By the way, the anvi'o paper is out, check it out: https://peerj.com/articles/1319/

Killing the SUMMARY.cp (with fire .. and HDF5) (#202)

  • We have been storing coverage and variability information for each contig in a separate, serialized Python object file in the output directory. Although this was useful for speed, with thousands and thousands of small files it created a huge overhead on all file system operations; in practice, it was a pain to download, copy, or move anvi'o analyses. With this release we're switching to a much better solution using HDF5. From now on there will be only one file in the output directory, called "AUXILIARY-DATA.h5" by default. You can generate this new AUXILIARY-DATA.h5 file from your old SUMMARY.cp files. Please see the anvi-script-generate-auxiliary-data-from-summary-cp script to learn how to upgrade your previous analyses (it is really very easy!).

Other changes you should know about

  • We changed the way the variability data is generated (afd7a57), which will affect the variability view in the interactive interface. Instead of the previous heuristic, which summarized the single nucleotide variations in a given contig into a single number, we now use the "variation density" concept as defined in the anvi'o paper (the number of reported variable nucleotide positions per kb; see Figure 3 in the publication to see how it is employed).
  • The anvi'o interactive interface can be run in an ad hoc manner with user-provided files instead of anvi'o databases. Until now, however, this mode did not support storing state files or collections from the resulting tree. Thanks to Lois Maignien's push, we changed the way anvi-interactive works, and it now supports storing states and collections (#203).
  • Now the taxonomy is properly reported in the summary outputs (thanks to Rika Anderson's report) (#201).
  • anvi-profile -i BAM_FILE.bam --list-contigs works again (thanks to Linda Amaral Zettler's report) (#200).
  • Now we have a TAB-delimited matrix for the overall summary of bins, and there is a download link for it in the summary output. Here is how it looks in the infamous mini_test (#204):

[image: TAB-delimited summary matrix of bins for mini_test]
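
For readers who want a feel for the "variation density" number mentioned above, here is a minimal sketch of the definition given in these notes (reported variable nucleotide positions per kb). The function name and its inputs are hypothetical and only for illustration; this is not anvi'o's actual implementation.

```python
def variation_density(variable_positions, contig_length):
    """Return the number of variable nucleotide positions per kb.

    Hypothetical helper, not part of anvi'o:
      variable_positions -- iterable of positions reported as variable
      contig_length      -- contig length in nucleotides
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    # Count each position once, then scale the count to a 1,000 nt window.
    unique_positions = set(variable_positions)
    return len(unique_positions) * 1000.0 / contig_length


# Made-up example: a 2,500 nt contig with 5 reported variable positions.
positions = [12, 340, 341, 1200, 2499]
print(variation_density(positions, 2500))  # 2.0
```

Because the value is normalized by length, contigs of different sizes can be compared directly, which a raw count of variable positions would not allow.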


Please don't install the new version using the archive files below. Instead, try one of these:

Installing from source | Installing via PyPi | OS X installer | Docker image