DRAM 1.4.4-Point Release #250
rmFlynn announced in Announcements
This is the official release of DRAM1.4.4. The 1.4.0 release has significant changes that could impact your research. The 1.4.4 point release is less significant, but still important for DRAM-v and DRAM users. Please review these changes and help us validate this release!
Install / upgrade:
If DRAM was installed with Bioconda, it can be upgraded like any other Conda package. Note that the Conda package for DRAM may be delayed slightly while it is validated, but it should be available within a day or two of the release.
If you already have a DRAM environment and want to upgrade:
If you are using an old database, like in the example above, you may need to check out a special version of dram from GitHub.
To install DRAM in a new Conda environment, follow the instructions in the README.
Change log DRAM1.4.4 addendum:
Change log DRAM1.4.0:
DRAM distill now includes a new metabolism for methylation. Although this was planned for DRAM2, you can already include it in annotation and distillation, provided you follow the instructions below.
To distill with methyl, you only need to download the new FASTA file and point to it with the DRAM custom database options that were introduced in DRAM1.3. Note that to distill correctly, you must use the exact name ‘methyl’ and must use DRAM1.4.
To annotate with methyl, do something like:
To distill with methyl:
Learn more about custom databases in the Wiki.
Glycoside hydrolase subfamily calls: subfamily calls are now being incorporated into annotations through changes in the databases and code. This affects what gets pulled into the distillate and product, because these look for family-level IDs (e.g. AA1), not subfamily-level IDs (e.g. AA1_1, AA2_2).
In response, DRAM is changing the output of the dbCAN database in DRAM1.4. Raw cazyme subfamilies will be output into the cazy_id column, and the corresponding description for the cazyme family will be put into the cazy_hit column.
The distillation in DRAM1.4 will count cazymes marked at the subfamily level on the family level. This means that for cazyme family AA1 there will be 4 entries in the distillate (AA1, AA1_1, AA1_2, and AA1_3), and the sum of these four will be the total number of AA1 cazymes. In DRAM1.3 and earlier, the distillate entry for this example, AA1 with no underscore, would include cazymes that can be assigned to family AA1 but do not have a subfamily designation.
The DRAM product will also count cazymes at the family level. For the AA1 example, AA1_1, AA1_2, and AA1_3 will be counted as AA1 under the current rules for assigning cazymes to compounds.
More changes are also being made that will affect CAZY IDs in DRAM1.4. The cutoff e-value is being changed to 1e-18 to conform to best practices for the database.
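The family-level counting described above can be illustrated with a minimal Python sketch. It is not DRAM's own code: the helper names `family_of` and `count_by_family` are hypothetical, and it assumes that a subfamily ID is simply the family ID followed by an underscore and a suffix, as in the AA1 example.

```python
from collections import Counter


def family_of(cazy_id: str) -> str:
    """Return the family part of a CAZy ID, e.g. 'AA1_2' -> 'AA1'.

    Assumption: everything before the first underscore is the family.
    """
    return cazy_id.split("_", 1)[0]


def count_by_family(cazy_ids):
    """Count IDs per family, folding subfamily calls into their family."""
    return Counter(family_of(cazy_id) for cazy_id in cazy_ids)


# Illustrative IDs only; these are not taken from a real annotation file.
ids = ["AA1", "AA1_1", "AA1_2", "AA1_3", "AA1_1", "GH13_5"]
family_counts = count_by_family(ids)
print(family_counts["AA1"])   # 5
print(family_counts["GH13"])  # 1
```

Here the total for AA1 is the sum over AA1, AA1_1, AA1_2, and AA1_3, which is the behaviour the release notes describe for the family-level totals.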
DRAM1.4 also introduced a new column, cazy_best_hit, for the best hit per gene from the dbCAN database. This column holds the match to the gene that has the highest coverage and lowest full-sequence e-value as calculated by mmseqs, with priority on e-value. The cazy_best_hit column will be the only column considered downstream in the distillate and product. DRAM1.3 pulls and counts all dbCAN hits above e-value 1e-15, rather than profiling best hits.
A new column named cazy_subfamily_ec, holding EC number information from subfamilies, has been added in DRAM1.4. These EC numbers will also be used in the distillate, along with those from KEGG, as part of pathways and other tools. For now, incomplete EC numbers will be included, but not considered for the distillate. The subfamilies will be excluded from the product, in keeping with its goal of being a broader overview.
Logging is now fully implemented in DRAM1.4. Log files will be created for almost all DRAM functions. By default, the log file for annotations will appear in the annotations folder, and the log file for distillation will be in the distillation folder. You can also use the --log_file_path argument to set the log path. The log file for database processing is set by the config file, and by default it will be in the databases directory. All content that DRAM prints to the command line will also appear in the log file.
The DRAM config now stores when databases were downloaded, along with citation and version information when applicable. This information is printed to the log at the beginning of each run. The old format can still be imported if you want to keep your DRAM1.3 databases.
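The logging behaviour described above, where everything printed to the command line also lands in a log file, follows a common pattern with Python's standard `logging` module. The sketch below shows that general pattern only; the function name `setup_logger`, the logger name, and the message text are hypothetical and are not taken from DRAM's source.

```python
import logging
import os
import sys
import tempfile


def setup_logger(log_file_path: str, name: str = "dram_sketch") -> logging.Logger:
    """Create a logger that writes each message to stdout and to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False   # avoid duplicate output via the root logger
    logger.handlers.clear()    # make repeated calls idempotent
    formatter = logging.Formatter("%(asctime)s - %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_file_path),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


# Write to a temporary directory so the example leaves no files behind
# in the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "annotate.log")
logger = setup_logger(log_path)
logger.info("Annotation started")
```

A path given through an option such as --log_file_path would take the place of `log_path` in a setup like this.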
In DRAM1.4 you can set a config file to use in annotation and distillation at run time in 2 ways: (1) use --config_loc with DRAM.py or DRAM-v.py, or (2) set the environment variable DRAM_CONFIG_LOCATION. This will not store or import the config; that config will only be used for that run.
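One way to picture the two run-time options is the small Python sketch below. The release notes do not say which option wins when both are set, so the precedence shown here (the command-line value first, then the environment variable) is an assumption. The function name `resolve_config` and the example paths are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_config(
    cli_config_loc: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the config path to use for this run only, or None.

    Assumption: a value passed on the command line (as with --config_loc)
    takes priority over the DRAM_CONFIG_LOCATION environment variable.
    """
    if cli_config_loc:
        return cli_config_loc
    # An unset or empty variable means no run-time override was requested.
    return environ.get("DRAM_CONFIG_LOCATION") or None


# Hypothetical paths, used only to show the three cases.
env = {"DRAM_CONFIG_LOCATION": "/data/configs/env_config.json"}
print(resolve_config("/data/configs/cli_config.json", env))  # /data/configs/cli_config.json
print(resolve_config(None, env))                             # /data/configs/env_config.json
print(resolve_config(None, {}))                              # None
```

Returning None in the last case stands for falling back to whatever config is already stored, since neither option stores or imports anything.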
Significant bug fixes are also included in this release.
Known issues:
This discussion was created from the release DRAM 1.4.4-Point Release.