A mercury resource archive contains data files that mercury can use to analyze network traffic. It is a POSIX Tape Archive, or .tar file. It may be compressed via GZIP, in which case a .gz extension is appended to the .tar extension (resulting in a .tar.gz extension). It may be encrypted using the Advanced Encryption Standard (AES) in Cipher Block Chaining (CBC) mode of operation, in which case the initial 16 bytes of the file MUST contain the CBC Initialization Vector (IV), and .enc is appended to the extension. When reading a resource archive, decryption (if any) precedes decompression, and decompression (if any) precedes archive processing. When writing a resource archive, the order of those operations is reversed.
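The read order described above (decrypt, then decompress, then process the archive) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name open_resource_archive and the decrypt callable are hypothetical, the layers are inferred from the file extension, and the AES key size and padding scheme are not specified here, so decryption is left to a caller-supplied routine.

```python
import gzip
import io
import tarfile

IV_LEN = 16  # an encrypted archive starts with the 16-byte CBC IV


def open_resource_archive(data, filename, decrypt=None):
    """Undo the layers named by the extension, outermost first.

    decrypt is a hypothetical caller-supplied callable taking
    (iv, ciphertext) and returning the plaintext bytes.
    """
    if filename.endswith(".enc"):
        if decrypt is None:
            raise ValueError("encrypted archive needs a decrypt callable")
        # Split off the IV, then decrypt the remainder.
        iv, ciphertext = data[:IV_LEN], data[IV_LEN:]
        data = decrypt(iv, ciphertext)
        filename = filename[: -len(".enc")]
    if filename.endswith(".gz"):
        # Decompression follows decryption.
        data = gzip.decompress(data)
    # Archive processing comes last; "r:" reads an uncompressed tar.
    return tarfile.open(fileobj=io.BytesIO(data), mode="r:")
```

Writing an archive would apply the same steps in the opposite order: build the tar, compress it, then encrypt it and prepend the IV.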
The following files may appear in a resource archive:
- VERSION is a text file containing a single line that represents the version of the resource archive as a fixed number (more than one) of semicolon-separated qualifiers in a specific order.
  - e.g. 2024-06-26; 2.0.lite
- fp_prevalence_tls.txt is a text file, each line of which is a string representation of a fingerprint.
- fingerprint_db.json, fingerprint_db_normal.json, and fingerprint_db_lite.json are JSON files containing a fingerprint and destination database.
- doh-watchlist.txt is a text file, each line of which contains an IPv4 or IPv6 address or a DNS name associated with a DNS over HTTPS server. DNS names MUST contain punycode representations of internationalized domain names, and not UTF-8.
- pyasn.db is a text file, each line of which contains an IP subnet and the corresponding decimal Autonomous System Number (ASN), separated by whitespace.
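The line formats above are simple enough to parse with the Python standard library. The following is a minimal sketch, not mercury's own parser: the function names are hypothetical, and the check for punycode is approximated by requiring DNS names to be pure ASCII.

```python
import ipaddress


def parse_version(line):
    """Split a VERSION line into its semicolon-separated qualifiers."""
    return [q.strip() for q in line.strip().split(";")]


def classify_watchlist_entry(entry):
    """Return 'ip' for an IPv4/IPv6 address, 'dns' for a DNS name."""
    entry = entry.strip()
    try:
        ipaddress.ip_address(entry)
        return "ip"
    except ValueError:
        pass
    # Assumption: a punycode name is pure ASCII, so reject anything else.
    if not entry.isascii():
        raise ValueError("DNS names must be punycode, not UTF-8")
    return "dns"


def parse_asn_line(line):
    """Parse one pyasn.db line into (subnet, ASN)."""
    subnet, asn = line.split()  # fields are separated by whitespace
    return ipaddress.ip_network(subnet), int(asn)
```

For example, parse_version("2024-06-26; 2.0.lite") returns ['2024-06-26', '2.0.lite'], and parse_asn_line("192.0.2.0/24 64496") returns the network 192.0.2.0/24 paired with the integer 64496 (both values come from ranges reserved for documentation).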
A resource archive MAY contain a VERSION file, and MUST contain fp_prevalence_tls.txt, doh-watchlist.txt, and pyasn.db files. A resource archive MUST contain a fingerprint_db.json file, and may contain a fingerprint_db_lite.json file.
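A check for the required files could look like the sketch below. It is an illustration only: the names REQUIRED_FILES and missing_required are hypothetical, and the input is assumed to be the list of member names already read from the archive.

```python
REQUIRED_FILES = {
    "fp_prevalence_tls.txt",
    "doh-watchlist.txt",
    "pyasn.db",
    "fingerprint_db.json",
}


def missing_required(member_names):
    """Return the required files absent from the archive, sorted by name."""
    return sorted(REQUIRED_FILES - set(member_names))
```

An empty result means every required file is present. VERSION and fingerprint_db_lite.json are optional, so they never appear in the result.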
- An archive with fingerprint_db.json whose VERSION file contains an identifier including lite, e.g. 2.0.lite:
  - The archive is a lite archive of the new format.
  - The classifier ignores the configured fp_proc_threshold and proc_dst_threshold thresholds and loads fingerprint_db.json.
- An archive with fingerprint_db.json whose VERSION file contains an identifier including full, e.g. 2.0.full:
  - The archive is a full archive of the new format.
  - The classifier ignores the configured fp_proc_threshold and proc_dst_threshold thresholds and loads fingerprint_db.json.
- An archive with fingerprint_db.json and no identifier in VERSION:
  - A regular archive of the deprecated format.
  - The classifier does not load it and disables all protocols from the libmerc config.
- Dual DB: An archive with both fingerprint_db.json and fingerprint_db_lite.json whose VERSION file contains an identifier including dual, e.g. 2.0.dual:
  - If at least one of the thresholds, fp_proc_threshold and proc_dst_threshold, is configured, the classifier loads fingerprint_db_lite.json and ignores the configured thresholds.
  - If neither of the thresholds is configured, the classifier loads fingerprint_db.json.
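The selection rules above can be summarized as a small decision function. This is a sketch of the documented behavior, not the classifier's actual code: the function name select_database is hypothetical, the identifier is detected with a plain substring match on the VERSION line, and the case of a dual identifier without fingerprint_db_lite.json is not covered by the rules, so the sketch rejects it.

```python
FULL_DB = "fingerprint_db.json"
LITE_DB = "fingerprint_db_lite.json"


def select_database(members, version_line, thresholds_configured):
    """Return the database file the classifier would load, or None.

    members: set of file names in the archive.
    version_line: content of VERSION, or None when the file is absent.
    thresholds_configured: True if fp_proc_threshold or
        proc_dst_threshold is configured.
    """
    if FULL_DB not in members:
        raise ValueError("archive lacks the required " + FULL_DB)
    version = version_line or ""
    if "dual" in version:
        if LITE_DB not in members:
            # Assumption: this case is unspecified, so reject it.
            raise ValueError("dual archive lacks " + LITE_DB)
        # Thresholds only steer the choice of file; they are not applied.
        return LITE_DB if thresholds_configured else FULL_DB
    if "lite" in version or "full" in version:
        # Configured thresholds are ignored for lite and full archives.
        return FULL_DB
    # No identifier: deprecated format, nothing is loaded.
    return None
```

A return value of None corresponds to the deprecated-format case, in which the classifier loads nothing and all protocols are disabled.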