Global Settings
The construction writes status information to `stderr`. There are four possible verbosity levels, which can be set using `Verbosity::set(size_type new_level)`:
| Level | Numerical | Description |
|---|---|---|
| `Verbosity::SILENT` | 0 | No status information (default) |
| `Verbosity::BASIC` | 1 | Basic progress information and statistics on the input and the final index |
| `Verbosity::EXTENDED` | 2 | Adds intermediate statistics for each batch |
| `Verbosity::FULL` | 3 | Adds detailed information for each batch |
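The levels in the table are cumulative: each level adds output on top of the lower ones. A minimal sketch of how such a level could gate status messages is shown below. `DemoVerbosity`, `shouldReport()`, and `report()` are hypothetical stand-ins written for this illustration and are not part of GBWT; only the numerical values come from the table.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Stand-in for illustration only; this is NOT the GBWT Verbosity class.
// The numerical values follow the table above.
struct DemoVerbosity
{
  enum : std::size_t { SILENT = 0, BASIC = 1, EXTENDED = 2, FULL = 3 };
};

// Returns true if a message that requires level `required` should be
// printed when the current level is `current`.
inline bool shouldReport(std::size_t current, std::size_t required)
{
  assert(current <= DemoVerbosity::FULL && required <= DemoVerbosity::FULL);
  return current >= required;
}

// Writes the message to stderr if the current level is high enough.
inline void report(std::size_t current, std::size_t required, const std::string& message)
{
  if(shouldReport(current, required)) { std::cerr << message << std::endl; }
}
```

With the current level at `DemoVerbosity::BASIC`, a call such as `report(DemoVerbosity::BASIC, DemoVerbosity::EXTENDED, "batch statistics")` prints nothing, while the same call at `DemoVerbosity::FULL` prints the message.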
By default, temporary files are written to the current working directory, but the directory can be changed with `TempFile::setDirectory(const std::string& directory)`.
GBWT deletes the temporary files under normal circumstances. If the program crashes (e.g. due to invalid data or running out of memory) without calling `std::exit()`, some files may remain.
The haplotype generation interface stores phasing information in temporary files in order to save memory. Typical space usage is similar to the `.vcf.gz` file, though it depends on the number of samples per file. Because the files are run-length encoded, storing many samples per file reduces disk usage while increasing memory usage. In some cases, space usage can be several times higher. For example, run-length encoding does not help with the human chromosome X, if male and female samples are randomly interleaved.
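The effect of sample order on run-length encoding can be illustrated with a toy model. The sketch below does not reproduce the actual phasing file format. It assumes that each sample contributes one value, that the value differs between two groups of samples (labeled 1 and 2 here), and that the encoder stores one entry per run of equal values. The function names are hypothetical, and strict alternation is used as a deterministic worst case for random interleaving.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the maximal runs of equal values, i.e. the number of
// (value, length) pairs a run-length encoder would store.
inline std::size_t countRuns(const std::vector<int>& values)
{
  if(values.empty()) { return 0; }
  std::size_t runs = 1;
  for(std::size_t i = 1; i < values.size(); i++)
  {
    if(values[i] != values[i - 1]) { runs++; }
  }
  return runs;
}

// n samples, with all samples of group 1 stored before group 2.
inline std::vector<int> groupedSamples(std::size_t n)
{
  assert(n % 2 == 0);
  std::vector<int> result(n, 1);
  for(std::size_t i = n / 2; i < n; i++) { result[i] = 2; }
  return result;
}

// n samples, with the two groups strictly alternating.
inline std::vector<int> alternatingSamples(std::size_t n)
{
  std::vector<int> result(n);
  for(std::size_t i = 0; i < n; i++) { result[i] = (i % 2 == 0 ? 1 : 2); }
  return result;
}
```

In this model, `countRuns(groupedSamples(1000))` is 2, while `countRuns(alternatingSamples(1000))` is 1000: the interleaved order leaves nothing for the encoder to compress.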
The naming scheme for phasing files is phasing_host_process_counter (e.g. phasing_vr-4-1-14_15606_40).
The fast merging algorithm writes the rank array to disk in a number of temporary files. Each file contains a gap-encoded sorted subset of the rank array. The total size of these files is 2-3 bytes times the total length of the sequences in the inserted (smaller) GBWT.
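The size estimate above is simple arithmetic, so the required temporary disk space can be approximated before merging. The helper below is a hypothetical sketch, not a GBWT function. It takes the total sequence length of the inserted (smaller) GBWT and returns the lower and upper bounds implied by the 2-3 bytes figure.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns (low, high) estimates in bytes for the rank array files,
// using the 2-3 bytes per position figure quoted above.
inline std::pair<std::uint64_t, std::uint64_t> rankFileBytes(std::uint64_t total_length)
{
  // Guard against overflow in the multiplication below.
  assert(total_length <= UINT64_MAX / 3);
  return std::make_pair(2 * total_length, 3 * total_length);
}
```

For example, inserting a GBWT with a total sequence length of one billion gives an estimate of roughly 2 to 3 billion bytes of temporary files.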
The naming scheme for rank array files is ranks_host_process_counter (e.g. ranks_vr-4-1-14_15606_40).
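Both naming schemes follow the same pattern: a prefix, the host name, the process identifier, and a counter, joined by underscores. The sketch below builds such a name from its parts, which can be useful when searching for leftover files after a crash. `tempFileName()` is a hypothetical helper for this illustration, not the function GBWT uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Joins the parts as prefix_host_process_counter.
inline std::string tempFileName(const std::string& prefix, const std::string& host,
                                std::size_t process, std::size_t counter)
{
  assert(!prefix.empty());
  return prefix + '_' + host + '_' + std::to_string(process) + '_' + std::to_string(counter);
}
```

Calling `tempFileName("phasing", "vr-4-1-14", 15606, 40)` yields `phasing_vr-4-1-14_15606_40`, matching the phasing file example above.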