MultiQC Version 1.9
Another massive release - many thanks to all of the contributors! Keep those pull-requests and issues coming!
Dropped official support for Python 2
Python 2 had its official sunset date
on January 1st 2020, meaning that it will no longer be developed by the Python community.
Part of the python.org statement reads:
That means that we will not improve it anymore after that day,
even if someone finds a security problem in it.
You should upgrade to Python 3 as soon as you can.
Many Python packages no longer support Python 2,
and whilst the MultiQC code is currently compatible with both Python 2 and Python 3,
it is increasingly difficult to maintain compatibility with the dependency packages it
uses, such as Matplotlib, NumPy and more.
As of MultiQC version 1.9, Python 2 is no longer officially supported.
Automatic CI tests will no longer run with Python 2 and Python 2 specific workarounds
are no longer guaranteed.
Whilst it may be possible to continue using MultiQC with Python 2 for a short time by
pinning dependencies, MultiQC compatibility for Python 2 will now slowly drift and start
to break. If you haven't already, you need to switch to Python 3 now.
New MultiQC Features
- Now using GitHub Actions for all CI testing
  - Dropped Travis and AppVeyor, everything is now just on GitHub
  - Still testing on both Linux and Windows, with multiple versions of Python
  - CI tests should now run automatically for anyone who forks the MultiQC repository
- Linting with --lint now checks line graphs as well as bar graphs
- New gathered template with no tool name sections (#1119)
- Added --sample-filters option to add show/hide buttons at the top of the report (#1125)
  - Buttons control the report toolbox Show/Hide tool, filtering your samples
  - Allows reports to be pre-configured based on a supplied list of sample names at report-generation time.
- Line graphs can now have Log10 buttons (same functionality as bar graphs)
- Importing and running multiqc in a script is now a little better
  - multiqc.run now returns the report and config as well as the exit code. This means that you can explore the MultiQC run time a little in the Python environment.
  - Much more refactoring is needed to make MultiQC as useful in Python scripts as it could be. Watch this space.
- If a custom module anchor is set using module_order, it's now used a bit more:
  - Prefixed to module section IDs
  - Appended to files saved in multiqc_data
  - Should help to prevent duplicates requiring -1 suffixes when running a module multiple times
- New heatmap plot config options xcats_samples and ycats_samples
  - If set to False, the report toolbox options (highlight, rename, show/hide) do not affect that axis.
  - Means that the Show only matching samples report toolbox option works on FastQC Status Checks, for example (#1172)
- Report header time and analysis paths can now be hidden
  - New config options show_analysis_paths and show_analysis_time (#1113)
- New search pattern key skip: true to skip specific searches when modules look for a lot of different files (e.g. Picard).
- New --profile-runtime command line option (config.profile_runtime) to give analysis of how long the report takes to be generated
  - Plots of the file search results and durations are added to the end of the MultiQC report as a special module called Run Time
  - A summary of the time taken for the major stages of MultiQC execution is printed to the command line log.
- New table config option only_defined_headers
  - Defaults to true, set to false to also show any data columns that are not defined as headers
  - Useful as it allows table-wide defaults to be set with column-specific overrides
- New module key allowed for config.extra_fn_clean_exts and config.fn_clean_exts
  - Means you can limit the action of a sample name cleaning pattern to specific MultiQC modules (#905)
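
As a rough illustration of the module key for sample name cleaning patterns, here is a minimal sketch of how such a restriction could be applied. It is not MultiQC's own code: the function clean_sample_name is hypothetical, the pattern entries are invented examples, and only the truncate and remove pattern types are modelled, with simplified behaviour.

```python
def clean_sample_name(name, module, patterns):
    """Apply cleaning patterns to a sample name, honouring an optional 'module' key.

    Hypothetical helper for illustration only, not part of MultiQC.
    """
    for entry in patterns:
        # A bare string is treated as a truncate pattern that applies to every module
        if isinstance(entry, str):
            entry = {"type": "truncate", "pattern": entry}

        allowed = entry.get("module")
        if isinstance(allowed, str):
            allowed = [allowed]
        # Skip patterns that are restricted to other modules
        if allowed is not None and module not in allowed:
            continue

        kind = entry.get("type", "truncate")
        if kind == "truncate":
            # Keep everything before the first occurrence of the pattern
            name = name.split(entry["pattern"], 1)[0]
        elif kind == "remove":
            # Delete every occurrence of the pattern
            name = name.replace(entry["pattern"], "")
    return name


# Invented example patterns: the second one only applies to the samtools module
patterns = [
    ".fastq.gz",
    {"type": "remove", "pattern": "_sorted", "module": ["samtools"]},
]

cleaned_samtools = clean_sample_name("sampleA_sorted.fastq.gz", "samtools", patterns)
cleaned_fastqc = clean_sample_name("sampleA_sorted.fastq.gz", "fastqc", patterns)
print(cleaned_samtools)
print(cleaned_fastqc)
```

In this sketch the _sorted suffix is stripped only when the name is cleaned for samtools; other modules keep it.
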
New Custom Content features
- Improve support for HTML files - now just end your HTML filename with _mqc.html
  - Native handling of HTML snippets as files, no MultiQC config or YAML file required.
  - Also with embedded custom content configuration at the start of the file as an HTML comment.
- Add ability to group custom-content files into report sections
  - Use the new parent_id, parent_name and parent_description config keys to group content together like a regular module (#1008)
- Custom Content files can now be configured using custom_data, without giving search patterns or data
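
To make the HTML custom content features above more concrete, the following sketch writes a small snippet file whose name ends in _mqc.html, with configuration embedded as a leading HTML comment. The file name, IDs and descriptions are invented for illustration, and the id, section_name and description keys are assumed to be the usual custom content keys. Only parent_id, parent_name and parent_description come from these release notes.

```python
import tempfile
from pathlib import Path

# Embedded configuration goes in an HTML comment at the very start of the file.
# All values below are hypothetical examples.
snippet = """<!--
id: 'custom-html'
section_name: 'Custom HTML'
description: 'This section is created from a custom HTML file'
parent_id: 'custom_section'
parent_name: 'Custom Section'
parent_description: 'Groups related custom content together'
-->
<p>Some custom HTML content.</p>
"""

# Write the snippet into a scratch directory; the _mqc.html suffix is what matters
out_dir = Path(tempfile.mkdtemp())
html_path = out_dir / "my_snippet_mqc.html"
html_path.write_text(snippet, encoding="utf-8")
print(html_path.name)
```

Pointing MultiQC at the directory holding this file should then be enough for the snippet to be picked up, assuming the default search patterns are in use.
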
New Modules:
- DRAGEN
  - Illumina Bio-IT Platform that uses FPGA for secondary NGS analysis
- iVar
  - Added support for iVar: a computational package that contains functions broadly useful for viral amplicon-based sequencing.
- Kaiju
  - Fast and sensitive taxonomic classification for metagenomics
- Kraken
  - K-mer matching tool for taxonomic classification. Module plots bargraph of counts for top-5 hits across each taxa rank. General stats summary.
- MALT
  - Megan Alignment Tool: Metagenomics alignment tool.
- miRTop
  - Command line tool to annotate miRNAs with a standard mirna/isomir naming (mirGFF3)
  - Module started by @oneillkza and completed by @FlorianThibord
- MultiVCFAnalyzer
  - Combining multiple VCF files into one coherent report and format for downstream analysis.
- Picard - new submodules for QualityByCycleMetrics, QualityScoreDistributionMetrics & QualityYieldMetrics
  - See #1116
- Rockhopper
  - RNA-seq tool for bacteria, includes bar plot showing where features map.
- Sickle
  - A windowed adaptive trimming tool for FASTQ files using quality
- Somalier
  - Relatedness checking and QC for BAM/CRAM/VCF for cancer, DNA, BS-Seq, exome, etc.
- VarScan2
  - Variant calling and somatic mutation/CNV detection for next-generation sequencing data
Module updates:
- BISCUIT
  - Major rewrite to work with new BISCUIT QC script (BISCUIT v0.3.16+)
    - This change breaks backwards-compatibility with previous BISCUIT versions. If you are unable to upgrade BISCUIT, please use MultiQC v1.8.
  - Fixed error when missing data in log files (#1101)
- bcl2fastq
  - Samples with multiple library preps (i.e. barcodes) will now be handled correctly (#1094)
- BUSCO
  - Updated log search pattern to match new format in v4 with auto-lineage detection option (#1163)
- Cutadapt
  - New bar plot showing the proportion of reads filtered out for different criteria (eg. too short, too many Ns) (#1198)
- DamageProfiler
  - Removes redundant typo in init name. This makes referring to the module's column consistent with other modules when customising general stats table.
- DeDup
  - Updates plots to make compatible with 0.12.6
  - Fixes reporting errors - barplot total represents mapped reads, not total reads in BAM file
  - New: Adds 'Post-DeDup Mapped Reads' column to general stats table.
- FastQC
- FastQ Screen
- fgbio
  - New: Plot error rate by read position from ErrorRateByReadPosition
  - GroupReadsByUmi plot can now be toggled to show relative percents (#1147)
- FLASh
  - Logs not reporting innie and outie uncombined pairs now plot combined pairs instead (#1173)
- GATK
  - Made parsing for VariantEval more tolerant, so that it will work with output from the tool when run in different modes (#1158)
- MTNucRatioCalculator
  - Fixed misleading value suffix in general stats table
- Picard MarkDuplicates
  - Major change - previously, if multiple libraries (read-groups) were found then only the first would be used and all others ignored. Now, values from all libraries are merged and PERCENT_DUPLICATION and ESTIMATED_LIBRARY_SIZE are recalculated. Libraries can be kept as separate samples with a new MultiQC configuration option: markdups_merge_multiple_libraries: False, set under picard_config
  - Major change - Updated MarkDuplicates bar plot to double the read-pair counts, so that the numbers stack correctly. (#1142)
- Picard HsMetrics
- Picard WgsMetrics
  - Updated parsing code to recognise new java class string (#1114)
- QualiMap
- RSeqC
- RNASeQC2
  - Updated to handle the parsing metric files from the newer rewrite of RNA-SeqQC.
- Samblaster
  - Improved parsing to handle variable whitespace (#1176)
- Samtools
  - Removes hardcoding of general stats column names. This allows column names to indicate when a module has been run twice (#1076).
  - Added an observed over expected read count plot for idxstats (#1118)
  - Added additional (by default hidden) column for flagstat that displays the total number of reads in a BAM file
- sortmerna
  - Fix the bug for the latest sortmerna version 4.2.0 (#1121)
- sexdeterrmine
  - Added a scatter plot of relative X- vs Y-coverage to the generated report.
- VerifyBAMID
  - Allow files with column header FREEMIX(alpha) (#1112)
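
The Picard MarkDuplicates change above merges values across libraries and recalculates PERCENT_DUPLICATION. A minimal sketch of that idea follows, assuming the standard Picard definition of PERCENT_DUPLICATION, in which read-pair counts are doubled because each pair holds two reads. The function name and the example numbers are invented, this is not MultiQC's own implementation, and the ESTIMATED_LIBRARY_SIZE recalculation is deliberately left out.

```python
def merge_markdup_libraries(libraries):
    """Sum per-library MarkDuplicates counts and recompute PERCENT_DUPLICATION.

    Hypothetical helper for illustration only, not part of MultiQC.
    """
    count_keys = [
        "UNPAIRED_READS_EXAMINED",
        "READ_PAIRS_EXAMINED",
        "UNPAIRED_READ_DUPLICATES",
        "READ_PAIR_DUPLICATES",
    ]
    merged = {key: sum(lib[key] for lib in libraries) for key in count_keys}

    # Each read pair contains two reads, so pair counts are doubled
    examined = merged["UNPAIRED_READS_EXAMINED"] + 2 * merged["READ_PAIRS_EXAMINED"]
    duplicates = merged["UNPAIRED_READ_DUPLICATES"] + 2 * merged["READ_PAIR_DUPLICATES"]

    # Reported as a fraction, guarding against empty input
    merged["PERCENT_DUPLICATION"] = duplicates / examined if examined else 0.0
    return merged


# Two invented libraries from the same sample
libraries = [
    {"UNPAIRED_READS_EXAMINED": 100, "READ_PAIRS_EXAMINED": 450,
     "UNPAIRED_READ_DUPLICATES": 10, "READ_PAIR_DUPLICATES": 45},
    {"UNPAIRED_READS_EXAMINED": 0, "READ_PAIRS_EXAMINED": 500,
     "UNPAIRED_READ_DUPLICATES": 0, "READ_PAIR_DUPLICATES": 100},
]

merged = merge_markdup_libraries(libraries)
print(merged["PERCENT_DUPLICATION"])
```

The same doubling of read-pair counts is what lets paired and unpaired categories stack to a consistent total in a bar plot.
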
Bug Fixes:
- Added a new test to check that modules work correctly with --ignore-samples. A lot of them didn't:
  - Mosdepth, conpair, Qualimap BamQC, RNA-SeQC, GATK BaseRecalibrator, SNPsplit, SeqyClean, Jellyfish, hap.py, HOMER, BBMap, DeepTools, HiCExplorer, pycoQC, interop
  - These modules have now all been fixed and --ignore-samples should work as you expect for whatever data you have.
- Removed use of shutil.copy to avoid problems with working on multiple filesystems (#1130)
- Made folder naming behaviour of multiqc_plots consistent with multiqc_data
  - Incremental numeric suffixes now added if folder already exists
  - Plots folder properly renamed if using -n / --filename
- Heatmap plotting function is now compatible with MultiQC toolbox hide and highlight (#1136)
- Plot config logswitch_active now works as advertised
- When running MultiQC modules several times, multiple data files are now created instead of overwriting one another (#1175)
- Fixed minor bug where tables could report negative numbers of columns in their header text
- Fixed bug where numeric custom content sample names could trigger a TypeError (#1091)
- Fixed custom content bug where HTML data in a config file would trigger a ValueError (#1071)
- Replaced deprecated 'warn()' with 'warning()' of the logging module
- Custom content now supports section_extra config key to add custom HTML after description.
- Barplots with ymax set now ignore this when you click the Percentages tab.
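
The incremental numeric suffix behaviour mentioned for the multiqc_plots and multiqc_data folders can be pictured with the short sketch below. The helper next_free_dir is hypothetical and the _1, _2 suffix format is an assumption made for illustration; MultiQC's actual naming logic may differ in detail.

```python
import tempfile
from pathlib import Path


def next_free_dir(parent, base_name):
    """Return base_name, or base_name_1, base_name_2, ... if earlier names exist.

    Hypothetical helper for illustration only, not part of MultiQC.
    """
    candidate = Path(parent) / base_name
    counter = 1
    while candidate.exists():
        # Try the next numeric suffix until an unused name is found
        candidate = Path(parent) / f"{base_name}_{counter}"
        counter += 1
    return candidate


# Simulate three consecutive runs writing into the same working directory
workdir = Path(tempfile.mkdtemp())
created = []
for _ in range(3):
    plots_dir = next_free_dir(workdir, "multiqc_plots")
    plots_dir.mkdir()
    created.append(plots_dir.name)

print(created)
```

Applying the same rule to both folders keeps the plots and data directories from a given run easy to pair up.
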