Another massive release - many thanks to all of the contributors! Keep those pull-requests and issues coming!

Dropped official support for Python 2

Python 2 had its official sunset date
on January 1st 2020, meaning that it will no longer be developed by the Python community.
Part of the python.org statement reads:

That means that we will not improve it anymore after that day,
even if someone finds a security problem in it.
You should upgrade to Python 3 as soon as you can.

Very many Python packages no longer support Python 2,
and whilst the MultiQC code is currently compatible with both Python 2 and Python 3,
it is increasingly difficult to maintain compatibility with the dependency packages it
uses, such as Matplotlib, NumPy and more.

As of MultiQC version 1.9, Python 2 is no longer officially supported.
Automatic CI tests will no longer run with Python 2 and Python 2 specific workarounds
are no longer guaranteed.

Whilst it may be possible to continue using MultiQC with Python 2 for a short time by
pinning dependencies, MultiQC compatibility for Python 2 will now slowly drift and start
to break. If you haven't already, you need to switch to Python 3 now.

New MultiQC Features

Now using GitHub Actions for all CI testing
- Dropped Travis and AppVeyor, everything is now just on GitHub
- Still testing on both Linux and Windows, with multiple versions of Python
- CI tests should now run automatically for anyone who forks the MultiQC repository
Linting with --lint now checks line graphs as well as bar graphs
New gathered template with no tool name sections (#1119)
Added --sample-filters option to add show/hide buttons at the top of the report (#1125)
- Buttons control the report toolbox Show/Hide tool, filtering your samples
- Allows reports to be pre-configured based on a supplied list of sample names at report-generation time.
Line graphs can now have Log10 buttons (same functionality as bar graphs)
Importing and running multiqc in a script is now a little better
- multiqc.run now returns the report and config as well as the exit code. This means that you can explore the MultiQC run time a little in the Python environment.
- Much more refactoring is needed to make MultiQC as useful in Python scripts as it could be. Watch this space.
If a custom module anchor is set using module_order, it's now used a bit more:
- Prefixed to module section IDs
- Appended to files saved in multiqc_data
- Should help to prevent duplicates requiring -1 suffixes when running a module multiple times
New heatmap plot config options xcats_samples and ycats_samples
- If set to False, the report toolbox options (highlight, rename, show/hide) do not affect that axis.
- Means that the Show only matching samples report toolbox option works on FastQC Status Checks, for example (#1172)
Report header time and analysis paths can now be hidden
- New config options show_analysis_paths and show_analysis_time (#1113)
New search pattern key skip: true to skip specific searches when modules look for a lot of different files (eg. Picard).
New --profile-runtime command line option (config.profile_runtime) to give analysis of how long the report takes to be generated
- Plots of the file search results and durations are added to the end of the MultiQC report as a special module called Run Time
- A summary of the time taken for the major stages of MultiQC execution is printed to the command line log.
New table config option only_defined_headers
- Defaults to true, set to false to also show any data columns that are not defined as headers
- Useful as allows table-wide defaults to be set with column-specific overrides
New module key allowed for config.extra_fn_clean_exts and config.fn_clean_exts
- Means you can limit the action of a sample name cleaning pattern to specific MultiQC modules (#905)
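
To illustrate the new module key for sample name cleaning, here is a simplified model of how a module-scoped rule could behave. It is a sketch and not MultiQC's own implementation: the function name, the example rules and the sample names are hypothetical, and it assumes the truncate, remove and regex rule types with plain strings treated as truncate patterns.

```python
import re

# Hypothetical cleaning rules, shaped like entries in fn_clean_exts or
# extra_fn_clean_exts. The "module" key limits a rule to the named module(s).
CLEAN_RULES = [
    ".gz",  # a plain string is treated as a truncate pattern
    {"type": "truncate", "pattern": ".fastq"},
    {"type": "remove", "pattern": "_val", "module": "fastqc"},  # FastQC only
]


def clean_sample_name(name, module, rules=CLEAN_RULES):
    """Apply each rule in order, skipping rules scoped to other modules."""
    for rule in rules:
        if isinstance(rule, str):
            rule = {"type": "truncate", "pattern": rule}
        scope = rule.get("module")
        if scope is not None:
            scope = [scope] if isinstance(scope, str) else scope
            if module not in scope:
                continue  # rule does not apply to this module
        if rule["type"] == "truncate":
            # Keep everything before the first occurrence of the pattern
            name = name.split(rule["pattern"], 1)[0]
        elif rule["type"] == "remove":
            # Delete every occurrence of the pattern
            name = name.replace(rule["pattern"], "")
        elif rule["type"] == "regex":
            # Delete every regular-expression match
            name = re.sub(rule["pattern"], "", name)
    return name


print(clean_sample_name("sample1_val_1.fastq.gz", "fastqc"))    # sample1_1
print(clean_sample_name("sample1_val_1.fastq.gz", "cutadapt"))  # sample1_val_1
```

The same file name yields different sample names depending on the module, which is the point of scoping a pattern: the _val rule only affects the FastQC section.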

New Custom Content features

Improve support for HTML files - now just end your HTML filename with _mqc.html
- Native handling of HTML snippets as files, no MultiQC config or YAML file required.
- Custom content configuration can also be embedded at the start of the file as an HTML comment.
Add ability to group custom-content files into report sections
- Use the new parent_id, parent_name and parent_description config keys to group content together like a regular module (#1008)
Custom Content files can now be configured using custom_data, without giving search patterns or data
- Allows you to set descriptions and nicer titles for images and other 'blunt' data types in reports (#1026)
- Allows configuration of custom content separately from files themselves (tsv, csv, txt formats) (#1205)
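
As a rough sketch of the HTML support described above, the following script writes a file ending in _mqc.html with configuration embedded as an HTML comment at the start. The parent_id, parent_name and parent_description keys come from these notes; the id, section_name and description keys, all of the values, and the file name are assumptions made for illustration.

```python
import os
import tempfile

# Configuration embedded as an HTML comment at the very start of the file.
# All ids and text below are made up for this example.
HTML_SNIPPET = """<!--
parent_id: custom_section
parent_name: 'Custom Section'
parent_description: 'Content grouped together like a regular module'
id: custom_html
section_name: 'Custom HTML'
description: 'This section is created from a custom HTML file'
-->
<p>Some custom HTML content for the report.</p>
"""


def write_custom_html(directory, stem="my_notes"):
    """Write the snippet to <stem>_mqc.html and return the file path."""
    path = os.path.join(directory, stem + "_mqc.html")
    with open(path, "w") as handle:
        handle.write(HTML_SNIPPET)
    return path


out_dir = tempfile.mkdtemp()
print(write_custom_html(out_dir))
```

Running MultiQC on the directory containing this file should then pick it up without a separate config or YAML file.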

New Modules:

DRAGEN
- Illumina Bio-IT Platform that uses FPGA for secondary NGS analysis
iVar
- Added support for iVar: a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Kaiju
- Fast and sensitive taxonomic classification for metagenomics
Kraken
- K-mer matching tool for taxonomic classification. Module plots bargraph of counts for top-5 hits across each taxa rank. General stats summary.
MALT
- Megan Alignment Tool: Metagenomics alignment tool.
miRTop
- Command line tool to annotate miRNAs with a standard mirna/isomir naming (mirGFF3)
- Module started by @oneillkza and completed by @FlorianThibord
MultiVCFAnalyzer
- Combining multiple VCF files into one coherent report and format for downstream analysis.
Picard - new submodules for QualityByCycleMetrics, QualityScoreDistributionMetrics & QualityYieldMetrics
- See #1116
Rockhopper
- RNA-seq tool for bacteria, includes bar plot showing where features map.
Sickle
- A windowed adaptive trimming tool for FASTQ files using quality
Somalier
- Relatedness checking and QC for BAM/CRAM/VCF for cancer, DNA, BS-Seq, exome, etc.
VarScan2
- Variant calling and somatic mutation/CNV detection for next-generation sequencing data

Module updates:

BISCUIT
- Major rewrite to work with new BISCUIT QC script (BISCUIT v0.3.16+)
  - This change breaks backwards-compatibility with previous BISCUIT versions. If you are unable to upgrade BISCUIT, please use MultiQC v1.8.
- Fixed error when missing data in log files (#1101)
bcl2fastq
- Samples with multiple library preps (i.e. barcodes) will now be handled correctly (#1094)
BUSCO
- Updated log search pattern to match new format in v4 with auto-lineage detection option (#1163)
Cutadapt
- New bar plot showing the proportion of reads filtered out for different criteria (eg. too short, too many Ns) (#1198)
DamageProfiler
- Removes redundant typo in init name. This makes referring to the module's column consistent with other modules when customising general stats table.
DeDup
- Updates plots to make compatible with 0.12.6
- Fixes reporting errors - barplot total represents mapped reads, not total reads in BAM file
- New: Adds 'Post-DeDup Mapped Reads' column to general stats table.
FastQC
- Fixed tooltip text in Sequence Duplication Levels plot (#1092)
- Handle edge-case where a FastQC report was for an empty file with 0 reads (#1129)
FastQ Screen
- Don't skip plotting % No Hits even if it's 0% (#1126)
- Refactor parsing code. Avoids error with -0.00 %Unmapped (#1126)
- New plot for Bisulfite Reads, if data is present
- Categories in main plot are now sorted by the total read count and hidden if 0 across all samples
fgbio
- New: Plot error rate by read position from ErrorRateByReadPosition
- GroupReadsByUmi plot can now be toggled to show relative percents (#1147)
FLASh
- Logs not reporting innie and outie uncombined pairs now plot combined pairs instead (#1173)
GATK
- Made parsing for VariantEval more tolerant, so that it will work with output from the tool when run in different modes (#1158)
MTNucRatioCalculator
- Fixed misleading value suffix in general stats table
Picard MarkDuplicates
- Major change - previously, if multiple libraries (read-groups) were found then only the first would be used and all others ignored. Now, values from all libraries are merged and PERCENT_DUPLICATION and ESTIMATED_LIBRARY_SIZE are recalculated. Libraries can be kept as separate samples with a new MultiQC configuration option - picard_config: markdups_merge_multiple_libraries: False
- Major change - Updated MarkDuplicates bar plot to double the read-pair counts, so that the numbers stack correctly. (#1142)
Picard HsMetrics
- Updated large table to use columns specified in the MultiQC config. See docs. (#831)
Picard WgsMetrics
- Updated parsing code to recognise new java class string (#1114)
QualiMap
- Fixed QualiMap mean coverage calculation (#1082, #1077)
RSeqC
- Support added for output from geneBodyCoverage2.py script (#844)
- Single sample view in the "Junction saturation" plot now works with the toolbox properly (rename, hide, highlight) (#1133)
RNASeQC2
- Updated to handle parsing of metric files from the newer rewrite of RNA-SeQC.
Samblaster
- Improved parsing to handle variable whitespace (#1176)
Samtools
- Removes hardcoding of general stats column names. This allows column names to indicate when a module has been run twice (#1076).
- Added an observed over expected read count plot for idxstats (#1118)
- Added an additional (hidden by default) column for flagstat that displays the total number of reads in a BAM file
sortmerna
- Fix the bug for the latest sortmerna version 4.2.0 (#1121)
sexdeterrmine
- Added a scatter plot of relative X- vs Y-coverage to the generated report.
VerifyBAMID
- Allow files with column header FREEMIX(alpha) (#1112)
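
The Picard MarkDuplicates change above merges values from multiple libraries and recalculates PERCENT_DUPLICATION. A minimal sketch of that idea follows, using Picard's definition of the metric, where read-pair counts are doubled so that everything is counted in reads. The function name and the example numbers are hypothetical, and whether MultiQC performs the merge exactly this way is an assumption. ESTIMATED_LIBRARY_SIZE is left out because its recalculation is more involved.

```python
# Count columns from a Picard MarkDuplicates metrics file that can be summed
COUNT_FIELDS = (
    "UNPAIRED_READS_EXAMINED",
    "READ_PAIRS_EXAMINED",
    "UNPAIRED_READ_DUPLICATES",
    "READ_PAIR_DUPLICATES",
)


def merge_libraries(libraries):
    """Sum the counts across libraries, then recompute PERCENT_DUPLICATION."""
    merged = {field: sum(lib[field] for lib in libraries) for field in COUNT_FIELDS}
    # Each read pair holds two reads, so pair counts are doubled
    duplicates = merged["UNPAIRED_READ_DUPLICATES"] + 2 * merged["READ_PAIR_DUPLICATES"]
    examined = merged["UNPAIRED_READS_EXAMINED"] + 2 * merged["READ_PAIRS_EXAMINED"]
    merged["PERCENT_DUPLICATION"] = duplicates / examined if examined else 0.0
    return merged


# Made-up numbers for two libraries found in one metrics file
lib_a = {"UNPAIRED_READS_EXAMINED": 100, "READ_PAIRS_EXAMINED": 450,
         "UNPAIRED_READ_DUPLICATES": 10, "READ_PAIR_DUPLICATES": 45}
lib_b = {"UNPAIRED_READS_EXAMINED": 0, "READ_PAIRS_EXAMINED": 500,
         "UNPAIRED_READ_DUPLICATES": 0, "READ_PAIR_DUPLICATES": 150}

merged = merge_libraries([lib_a, lib_b])
print(merged["PERCENT_DUPLICATION"])  # 0.2
```

Taken separately, these two libraries have duplication rates of 0.1 and 0.3. Previously only the first would have been reported, whereas merging gives 0.2 for the sample as a whole.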

Bug Fixes:

Added a new test to check that modules work correctly with --ignore-samples. A lot of them didn't:
- Mosdepth, conpair, Qualimap BamQC, RNA-SeQC, GATK BaseRecalibrator, SNPsplit, SeqyClean, Jellyfish, hap.py, HOMER, BBMap, DeepTools, HiCExplorer, pycoQC, interop
- These modules have now all been fixed and --ignore-samples should work as you expect for whatever data you have.
Removed use of shutil.copy to avoid problems with working on multiple filesystems (#1130)
Made folder naming behaviour of multiqc_plots consistent with multiqc_data
- Incremental numeric suffixes now added if folder already exists
- Plots folder properly renamed if using -n/--filename
Heatmap plotting function is now compatible with MultiQC toolbox hide and highlight (#1136)
Plot config logswitch_active now works as advertised
When running MultiQC modules several times, multiple data files are now created instead of overwriting one another (#1175)
Fixed minor bug where tables could report negative numbers of columns in their header text
Fixed bug where numeric custom content sample names could trigger a TypeError (#1091)
Fixed custom content bug where HTML data in a config file would trigger a ValueError (#1071)
Replaced deprecated 'warn()' with 'warning()' of the logging module
Custom content now supports section_extra config key to add custom HTML after description.
Barplots with ymax set now ignore this when you click the Percentages tab.
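
The incremental folder naming mentioned above for multiqc_plots can be pictured with a short sketch. This is not MultiQC's code: the helper name is hypothetical and the _1, _2 suffix format is an assumption.

```python
import os
import tempfile


def next_free_dir(parent, name):
    """Return parent/name, or parent/name_N with the first unused suffix N."""
    candidate = os.path.join(parent, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(parent, "{}_{}".format(name, counter))
        counter += 1
    return candidate


# Simulate three runs writing into the same working directory
workdir = tempfile.mkdtemp()
for _ in range(3):
    os.mkdir(next_free_dir(workdir, "multiqc_plots"))

print(sorted(os.listdir(workdir)))
# ['multiqc_plots', 'multiqc_plots_1', 'multiqc_plots_2']
```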


MultiQC Version 1.9