forked from hail-is/hail
[query] fix bad bug in IndexedRVDSpec2 (hail-is#14420)
CHANGELOG: Fixes a serious, but likely rare, bug in the Table/MatrixTable reader, which has been present since Sep 2020. It manifests as many (around half or more) of the rows being dropped. This could only happen when both of the following hold:

1. You read a (matrix)table whose partitioning metadata allows rows with the same key to be split across neighboring partitions. This would likely only happen by reading data keyed by locus and alleles, and rekeying it to only locus before writing.
2. You read it with a different partitioning than the one it was written with. This would likely only happen by using the `_intervals` or `_n_partitions` arguments to `read_(matrix)_table`, or possibly `repartition`.

Please reach out to us if you're concerned you may have been affected by this.

This fixes a serious and longstanding bug in `IndexedRVDSpec2`, which appears to have been around since this code was first added in hail-is#9522 almost four years ago. It was reported in this [zulip thread](https://hail.zulipchat.com/#narrow/stream/123010-Hail-Query-0.2E2-support/topic/Number.20of.20rows.20changing.20with.20partitioning). I want to do further work to better characterize exactly what it takes to be affected by this bug, but I think you must have a table or matrixtable on disk which has duplicate keys, and moreover keys which span neighboring partitions, and then you must read the data with a different partitioner.

The root of the issue is an invalid assumption made in the code. To read data written with partitioner `p1` using new partitioner `p2`, it first computes the "intersection", or common refinement, of the two. It then assumes that each partition in the refinement overlaps exactly one partition of `p1`. But this is only true if the partitions of `p1` are themselves mutually disjoint, which is usually but not necessarily true.

For example, suppose `p1 = [ [1, 5], [5, 8] ]` is the old partitioner, and `p2 = [ [1, 4), [4, 8] ]` is the new. Note that the two input partitions are not disjoint, as the key `5` is allowed in both. The common refinement would then be `[ [1, 4), [4, 5], [5, 8] ]`. For each partition in the refinement, we want to read in the corresponding range from the appropriate input partition, then we want to group the partitions in the refinement to match the new partitioner.

The code finds "the appropriate input partition" by taking the first input partition which overlaps the refinement partition, using `lowerBoundInterval`. That works if there is only one overlapping input partition, but fails here, since the refinement partition `[5, 8]` overlaps both input partitions. So the code mistakenly reads from the input partition `[1, 5]` to produce the refinement partition `[5, 8]`, and so completely drops all rows in the input `[5, 8]`.

In practice, I think the most likely way to run into this (and the way it was found by a user) is to have a dataset keyed by `["locus", "alleles"]`, which has split multi-allelics, so there are multiple rows with the same locus. Then shorten the key to `["locus"]`, write the dataset to disk, and read it back with a different partitioning, e.g. by passing a `_n_partitions` argument to `read_table` or `read_matrix_table`.

For instance, if the partitioning was originally `[ [{1:1, ["A"]}, {1:500, ["G"]}), [{1:500, ["G"]}, {1:1000, ["C"]}] ]`, then after shortening the key it would be `[ [1:1, 1:500], [1:500, 1:1000] ]`. Notice that even though the original partitioning had no overlap, it does after shortening the key: rows with locus `1:500` and alleles less than `["G"]` are allowed in the first partition, so we have to make the right endpoint inclusive after shortening. You would then need to write this rekeyed dataset to disk and read it back with different partitioning (note that `ds.repartition` is enough to do this in the batch backend).
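The `p1`/`p2` example above can be replayed in a small, self-contained Python model. This is only a sketch under stated assumptions: keys are plain integers, `Interval`, `refine`, `first_overlapping`, and `read` are hypothetical helpers invented here, and none of it is Hail's actual Scala code or the actual fix. It contrasts looking up "the first overlapping input partition" with simply remembering which input partition each refinement piece came from.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    """Toy key interval over integers (hypothetical, not Hail's Interval)."""
    start: int
    end: int
    include_start: bool = True
    include_end: bool = True

    def contains(self, key: int) -> bool:
        above = key > self.start or (self.include_start and key == self.start)
        below = key < self.end or (self.include_end and key == self.end)
        return above and below

    def intersect(self, other: "Interval") -> Optional["Interval"]:
        # Tighter lower bound: larger start; on a tie, the exclusive one.
        lo = max((self.start, not self.include_start),
                 (other.start, not other.include_start))
        # Tighter upper bound: smaller end; on a tie, the exclusive one.
        hi = min((self.end, self.include_end), (other.end, other.include_end))
        start, inc_start = lo[0], not lo[1]
        end, inc_end = hi
        if start > end or (start == end and not (inc_start and inc_end)):
            return None
        return Interval(start, end, inc_start, inc_end)


def refine(p1: List[Interval], p2: List[Interval]) -> List[Tuple[Interval, int]]:
    """Common refinement, keeping the index of the p1 partition each piece came from."""
    pieces = []
    for i, old in enumerate(p1):
        for new in p2:
            piece = old.intersect(new)
            if piece is not None:
                pieces.append((piece, i))
    return pieces


def first_overlapping(p1: List[Interval], piece: Interval) -> int:
    """The flawed lookup: the first p1 partition that overlaps the piece."""
    return next(i for i, old in enumerate(p1) if old.intersect(piece) is not None)


def read(rows_by_partition: List[List[int]], index: int, piece: Interval) -> List[int]:
    """Read the keys of one input partition that fall inside the piece."""
    return [k for k in rows_by_partition[index] if piece.contains(k)]


p1 = [Interval(1, 5), Interval(5, 8)]                      # old partitioner
p2 = [Interval(1, 4, include_end=False), Interval(4, 8)]   # new partitioner
rows_by_partition = [[1, 3, 5], [5, 6, 8]]                 # made-up keys per input partition

pieces = refine(p1, p2)  # [1, 4), [4, 5], [5, 8]
buggy_rows = [k for piece, _ in pieces
              for k in read(rows_by_partition, first_overlapping(p1, piece), piece)]
fixed_rows = [k for piece, i in pieces
              for k in read(rows_by_partition, i, piece)]

print("buggy:", buggy_rows)
print("fixed:", fixed_rows)
```

In this toy model the buggy lookup never touches the second input partition, so keys `6` and `8` are missing from `buggy_rows`, while `fixed_rows` contains every input row. The toy reader also re-reads key `5` from the first partition in the buggy path; that is an artifact of the simplified model and not a claim about what Hail did.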
I still need to think through what holes in our testing allowed this to remain undetected for so long, and attempt to plug them. We should also plan for what to tell a user who is concerned they may have been affected by this in the past.
1 parent e27d149, commit 22114a2
Showing 107 changed files with 103 additions and 31 deletions.
3 changes: 3 additions & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
This folder comprises a Hail (www.hail.is) native Table or MatrixTable. | ||
Written with version 0.2.128-705d4033e0c9 | ||
Created at 2024/03/27 12:03:10 |
Empty file.
3 changes: 3 additions & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/cols/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
This folder comprises a Hail (www.hail.is) native Table or MatrixTable. | ||
Written with version 0.2.128-705d4033e0c9 | ||
Created at 2024/03/27 12:03:10 |
Empty file.
Binary file added
BIN
+264 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/cols/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+251 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/cols/rows/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+343 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/cols/rows/parts/part-0
Binary file not shown.
3 changes: 3 additions & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/entries/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
This folder comprises a Hail (www.hail.is) native Table or MatrixTable. | ||
Written with version 0.2.128-705d4033e0c9 | ||
Created at 2024/03/27 12:03:10 |
Empty file.
Binary file added
BIN
+345 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/entries/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+1.1 KB
...src/test/resources/sample.vcf-20-partitions-with-overlap.mt/entries/rows/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+8.34 KB
...artitions-with-overlap.mt/entries/rows/parts/part-00-cdb826da-6c5c-47b6-945b-3190a87a6a14
Binary file not shown.
Binary file added
BIN
+6.47 KB
...artitions-with-overlap.mt/entries/rows/parts/part-01-06f6a507-61e2-4bd1-a917-e1809270144c
Binary file not shown.
Binary file added
BIN
+7.66 KB
...artitions-with-overlap.mt/entries/rows/parts/part-02-881d024c-5baf-4fe6-bc8f-53eda3845bde
Binary file not shown.
Binary file added
BIN
+7.36 KB
...artitions-with-overlap.mt/entries/rows/parts/part-03-1e085a57-4dcb-4131-bc79-353324ffad47
Binary file not shown.
Binary file added
BIN
+9.12 KB
...artitions-with-overlap.mt/entries/rows/parts/part-04-d17ed9aa-6b33-4b0b-85d5-578da32f7581
Binary file not shown.
Binary file added
BIN
+8.26 KB
...artitions-with-overlap.mt/entries/rows/parts/part-05-40d512f8-23ba-485e-aefa-47eced2bfe6d
Binary file not shown.
Binary file added
BIN
+6.61 KB
...artitions-with-overlap.mt/entries/rows/parts/part-06-9b2dc9c7-c8b1-4ed4-9056-20142b5f6658
Binary file not shown.
Binary file added
BIN
+8.57 KB
...artitions-with-overlap.mt/entries/rows/parts/part-07-b9a32d97-cb10-4158-aeaa-645dcea68ca7
Binary file not shown.
Binary file added
BIN
+6.63 KB
...artitions-with-overlap.mt/entries/rows/parts/part-08-c2a0123f-a3d4-4b80-9c21-73cb2bed0b63
Binary file not shown.
Binary file added
BIN
+7.87 KB
...artitions-with-overlap.mt/entries/rows/parts/part-09-ca197aee-6bfd-4068-b771-e9ca63551a7c
Binary file not shown.
Binary file added
BIN
+6.04 KB
...artitions-with-overlap.mt/entries/rows/parts/part-10-17048169-a98b-49ee-ae4d-62641023b3ac
Binary file not shown.
Binary file added
BIN
+6.57 KB
...artitions-with-overlap.mt/entries/rows/parts/part-11-c89858f5-4d78-4739-af31-308a1c257ff4
Binary file not shown.
Binary file added
BIN
+5.75 KB
...artitions-with-overlap.mt/entries/rows/parts/part-12-3e391e78-782d-495d-a29c-cacc56e1baf8
Binary file not shown.
Binary file added
BIN
+8.17 KB
...artitions-with-overlap.mt/entries/rows/parts/part-13-62566d28-e496-4538-a325-b567be66accf
Binary file not shown.
Binary file added
BIN
+6.99 KB
...artitions-with-overlap.mt/entries/rows/parts/part-14-8ab32ab7-15cd-4302-bb45-6b3dc02db5b6
Binary file not shown.
Binary file added
BIN
+5.98 KB
...artitions-with-overlap.mt/entries/rows/parts/part-15-c4301966-4fd8-4ea0-b439-b49a693bf683
Binary file not shown.
Binary file added
BIN
+7.14 KB
...artitions-with-overlap.mt/entries/rows/parts/part-16-8d638c2e-b1a5-4507-ba00-337a02e3f431
Binary file not shown.
Binary file added
BIN
+4.92 KB
...artitions-with-overlap.mt/entries/rows/parts/part-17-0c739863-b5fe-4e33-8f47-3e2751b599df
Binary file not shown.
Binary file added
BIN
+6.33 KB
...artitions-with-overlap.mt/entries/rows/parts/part-18-35d65ae7-5d1d-43f8-bb21-e6565874975e
Binary file not shown.
Binary file added
BIN
+7.59 KB
...artitions-with-overlap.mt/entries/rows/parts/part-19-2fd81de2-5d34-43db-809d-2f1fe1e67200
Binary file not shown.
3 changes: 3 additions & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
This folder comprises a Hail (www.hail.is) native Table or MatrixTable. | ||
Written with version 0.2.128-705d4033e0c9 | ||
Created at 2024/03/27 12:03:10 |
Empty file.
Binary file added
BIN
+239 Bytes
.../test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/globals/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+36 Bytes
.../src/test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/globals/parts/part-0
Binary file not shown.
Binary file added
BIN
+254 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+239 Bytes
...src/test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/rows/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+36 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/globals/rows/parts/part-0
Binary file not shown.
Binary file added
BIN
+251 Bytes
...0-partitions-with-overlap.mt/index/part-00-cdb826da-6c5c-47b6-945b-3190a87a6a14.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-00-cdb826da-6c5c-47b6-945b-3190a87a6a14.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-01-06f6a507-61e2-4bd1-a917-e1809270144c.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-01-06f6a507-61e2-4bd1-a917-e1809270144c.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+242 Bytes
...0-partitions-with-overlap.mt/index/part-02-881d024c-5baf-4fe6-bc8f-53eda3845bde.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-02-881d024c-5baf-4fe6-bc8f-53eda3845bde.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+257 Bytes
...0-partitions-with-overlap.mt/index/part-03-1e085a57-4dcb-4131-bc79-353324ffad47.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-03-1e085a57-4dcb-4131-bc79-353324ffad47.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-04-d17ed9aa-6b33-4b0b-85d5-578da32f7581.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-04-d17ed9aa-6b33-4b0b-85d5-578da32f7581.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+236 Bytes
...0-partitions-with-overlap.mt/index/part-05-40d512f8-23ba-485e-aefa-47eced2bfe6d.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-05-40d512f8-23ba-485e-aefa-47eced2bfe6d.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-06-9b2dc9c7-c8b1-4ed4-9056-20142b5f6658.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-06-9b2dc9c7-c8b1-4ed4-9056-20142b5f6658.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+233 Bytes
...0-partitions-with-overlap.mt/index/part-07-b9a32d97-cb10-4158-aeaa-645dcea68ca7.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-07-b9a32d97-cb10-4158-aeaa-645dcea68ca7.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-08-c2a0123f-a3d4-4b80-9c21-73cb2bed0b63.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-08-c2a0123f-a3d4-4b80-9c21-73cb2bed0b63.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+246 Bytes
...0-partitions-with-overlap.mt/index/part-09-ca197aee-6bfd-4068-b771-e9ca63551a7c.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-09-ca197aee-6bfd-4068-b771-e9ca63551a7c.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+238 Bytes
...0-partitions-with-overlap.mt/index/part-10-17048169-a98b-49ee-ae4d-62641023b3ac.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-10-17048169-a98b-49ee-ae4d-62641023b3ac.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+233 Bytes
...0-partitions-with-overlap.mt/index/part-11-c89858f5-4d78-4739-af31-308a1c257ff4.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-11-c89858f5-4d78-4739-af31-308a1c257ff4.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-12-3e391e78-782d-495d-a29c-cacc56e1baf8.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-12-3e391e78-782d-495d-a29c-cacc56e1baf8.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+246 Bytes
...0-partitions-with-overlap.mt/index/part-13-62566d28-e496-4538-a325-b567be66accf.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-13-62566d28-e496-4538-a325-b567be66accf.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+244 Bytes
...0-partitions-with-overlap.mt/index/part-14-8ab32ab7-15cd-4302-bb45-6b3dc02db5b6.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-14-8ab32ab7-15cd-4302-bb45-6b3dc02db5b6.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+257 Bytes
...0-partitions-with-overlap.mt/index/part-15-c4301966-4fd8-4ea0-b439-b49a693bf683.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-15-c4301966-4fd8-4ea0-b439-b49a693bf683.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+246 Bytes
...0-partitions-with-overlap.mt/index/part-16-8d638c2e-b1a5-4507-ba00-337a02e3f431.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-16-8d638c2e-b1a5-4507-ba00-337a02e3f431.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+246 Bytes
...0-partitions-with-overlap.mt/index/part-17-0c739863-b5fe-4e33-8f47-3e2751b599df.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-17-0c739863-b5fe-4e33-8f47-3e2751b599df.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+246 Bytes
...0-partitions-with-overlap.mt/index/part-18-35d65ae7-5d1d-43f8-bb21-e6565874975e.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-18-35d65ae7-5d1d-43f8-bb21-e6565874975e.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+257 Bytes
...0-partitions-with-overlap.mt/index/part-19-2fd81de2-5d34-43db-809d-2f1fe1e67200.idx/index
Binary file not shown.
Binary file added
BIN
+181 Bytes
...s-with-overlap.mt/index/part-19-2fd81de2-5d34-43db-809d-2f1fe1e67200.idx/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+545 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/metadata.json.gz
Binary file not shown.
3 changes: 3 additions & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/rows/README.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
This folder comprises a Hail (www.hail.is) native Table or MatrixTable. | ||
Written with version 0.2.128-705d4033e0c9 | ||
Created at 2024/03/27 12:03:10 |
Empty file.
1 change: 1 addition & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/rows/metadata.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"file_version":67328,"hail_version":"0.2.128-705d4033e0c9","references_rel_path":"../references","table_type":"Table{global:Struct{},key:[locus],row:Struct{locus:Locus(GRCh37),alleles:Array[String],rsid:String,qual:Float64,filters:Set[String],info:Struct{NEGATIVE_TRAIN_SITE:Boolean,HWP:Float64,AC:Array[Int32],culprit:String,MQ0:Int32,ReadPosRankSum:Float64,AN:Int32,InbreedingCoeff:Float64,AF:Array[Float64],GQ_STDDEV:Float64,FS:Float64,DP:Int32,GQ_MEAN:Float64,POSITIVE_TRAIN_SITE:Boolean,VQSLOD:Float64,ClippingRankSum:Float64,BaseQRankSum:Float64,MLEAF:Array[Float64],MLEAC:Array[Int32],MQ:Float64,QD:Float64,END:Int32,DB:Boolean,HaplotypeScore:Float64,MQRankSum:Float64,CCC:Int32,NCC:Int32,DS:Boolean}}}","components":{"globals":{"name":"RVDComponentSpec","rel_path":"../globals/rows"},"rows":{"name":"RVDComponentSpec","rel_path":"rows"},"partition_counts":{"name":"PartitionCountsComponentSpec","counts":[18,17,17,18,17,17,17,18,17,17,17,18,17,17,17,18,17,17,17,18]},"properties":{"name":"PropertiesSpec","properties":{"distinctlyKeyed":false}}},"name":"TableSpec"} |
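The `partition_counts` and `distinctlyKeyed` fields of this `TableSpec` are the pieces most relevant to the bug: the counts presumably give the number of rows per partition, so their sum is a baseline against which a re-read with different partitioning could be compared. A minimal sketch, assuming a trimmed copy of the JSON above (the `table_type` string and other fields are omitted here for brevity):

```python
import json

# Trimmed copy of rows/metadata.json from the diff above; only the fields
# used below are kept.
metadata_text = """
{
  "components": {
    "partition_counts": {
      "name": "PartitionCountsComponentSpec",
      "counts": [18,17,17,18,17,17,17,18,17,17,17,18,17,17,17,18,17,17,17,18]
    },
    "properties": {
      "name": "PropertiesSpec",
      "properties": {"distinctlyKeyed": false}
    }
  },
  "name": "TableSpec"
}
"""

spec = json.loads(metadata_text)
counts = spec["components"]["partition_counts"]["counts"]
distinctly_keyed = spec["components"]["properties"]["properties"]["distinctlyKeyed"]

n_partitions = len(counts)
total_rows = sum(counts)

print(f"{n_partitions} partitions, {total_rows} rows, distinctlyKeyed={distinctly_keyed}")
```

With `distinctlyKeyed` set to `false`, duplicate keys are possible, which is one of the preconditions the commit message describes.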
Binary file added
BIN
+517 Bytes
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/rows/metadata.json.gz
Binary file not shown.
1 change: 1 addition & 0 deletions
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/rows/rows/metadata.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"name":"IndexedRVDSpec2","_key":["locus"],"_codecSpec":{"name":"TypedCodecSpec","_eType":"+EBaseStruct{locus:EBaseStruct{contig:+EBinary,position:+EInt32},alleles:EArray[EBinary],rsid:EBinary,qual:EFloat64,filters:EArray[EBinary],info:EBaseStruct{NEGATIVE_TRAIN_SITE:EBoolean,HWP:EFloat64,AC:EArray[EInt32],culprit:EBinary,MQ0:EInt32,ReadPosRankSum:EFloat64,AN:EInt32,InbreedingCoeff:EFloat64,AF:EArray[EFloat64],GQ_STDDEV:EFloat64,FS:EFloat64,DP:EInt32,GQ_MEAN:EFloat64,POSITIVE_TRAIN_SITE:EBoolean,VQSLOD:EFloat64,ClippingRankSum:EFloat64,BaseQRankSum:EFloat64,MLEAF:EArray[EFloat64],MLEAC:EArray[EInt32],MQ:EFloat64,QD:EFloat64,END:EInt32,DB:EBoolean,HaplotypeScore:EFloat64,MQRankSum:EFloat64,CCC:EInt32,NCC:EInt32,DS:EBoolean}}","_vType":"Struct{locus:Locus(GRCh37),alleles:Array[String],rsid:String,qual:Float64,filters:Set[String],info:Struct{NEGATIVE_TRAIN_SITE:Boolean,HWP:Float64,AC:Array[Int32],culprit:String,MQ0:Int32,ReadPosRankSum:Float64,AN:Int32,InbreedingCoeff:Float64,AF:Array[Float64],GQ_STDDEV:Float64,FS:Float64,DP:Int32,GQ_MEAN:Float64,POSITIVE_TRAIN_SITE:Boolean,VQSLOD:Float64,ClippingRankSum:Float64,BaseQRankSum:Float64,MLEAF:Array[Float64],MLEAC:Array[Int32],MQ:Float64,QD:Float64,END:Int32,DB:Boolean,HaplotypeScore:Float64,MQRankSum:Float64,CCC:Int32,NCC:Int32,DS:Boolean}}","_bufferSpec":{"name":"LEB128BufferSpec","child":{"name":"BlockingBufferSpec","blockSize":65536,"child":{"name":"ZstdBlockBufferSpec","blockSize":65536,"child":{"name":"StreamBlockBufferSpec"}}}}},"_indexSpec":{"name":"IndexSpec2","_relPath":"../../index","_leafCodec":{"name":"TypedCodecSpec","_eType":"EBaseStruct{first_idx:+EInt64,keys:+EArray[+EBaseStruct{key:+EBaseStruct{locus:EBaseStruct{contig:+EBinary,position:+EInt32}},offset:+EInt64,annotation:+EBaseStruct{entries_offset:EInt64}}]}","_vType":"Struct{first_idx:Int64,keys:Array[Struct{key:Struct{locus:Locus(GRCh37)},offset:Int64,annotation:Struct{entries_offset:Int64}}]}","_bufferSpec":{"name":"LEB128BufferSpec","child":{"name":"
BlockingBufferSpec","blockSize":65536,"child":{"name":"ZstdBlockBufferSpec","blockSize":65536,"child":{"name":"StreamBlockBufferSpec"}}}}},"_internalNodeCodec":{"name":"TypedCodecSpec","_eType":"EBaseStruct{children:+EArray[+EBaseStruct{index_file_offset:+EInt64,first_idx:+EInt64,first_key:+EBaseStruct{locus:EBaseStruct{contig:+EBinary,position:+EInt32}},first_record_offset:+EInt64,first_annotation:+EBaseStruct{entries_offset:EInt64}}]}","_vType":"Struct{children:Array[Struct{index_file_offset:Int64,first_idx:Int64,first_key:Struct{locus:Locus(GRCh37)},first_record_offset:Int64,first_annotation:Struct{entries_offset:Int64}}]}","_bufferSpec":{"name":"LEB128BufferSpec","child":{"name":"BlockingBufferSpec","blockSize":65536,"child":{"name":"ZstdBlockBufferSpec","blockSize":65536,"child":{"name":"StreamBlockBufferSpec"}}}}},"_keyType":"Struct{locus:Locus(GRCh37)}","_annotationType":"Struct{entries_offset:Int64}"},"_partFiles":["part-00-cdb826da-6c5c-47b6-945b-3190a87a6a14","part-01-06f6a507-61e2-4bd1-a917-e1809270144c","part-02-881d024c-5baf-4fe6-bc8f-53eda3845bde","part-03-1e085a57-4dcb-4131-bc79-353324ffad47","part-04-d17ed9aa-6b33-4b0b-85d5-578da32f7581","part-05-40d512f8-23ba-485e-aefa-47eced2bfe6d","part-06-9b2dc9c7-c8b1-4ed4-9056-20142b5f6658","part-07-b9a32d97-cb10-4158-aeaa-645dcea68ca7","part-08-c2a0123f-a3d4-4b80-9c21-73cb2bed0b63","part-09-ca197aee-6bfd-4068-b771-e9ca63551a7c","part-10-17048169-a98b-49ee-ae4d-62641023b3ac","part-11-c89858f5-4d78-4739-af31-308a1c257ff4","part-12-3e391e78-782d-495d-a29c-cacc56e1baf8","part-13-62566d28-e496-4538-a325-b567be66accf","part-14-8ab32ab7-15cd-4302-bb45-6b3dc02db5b6","part-15-c4301966-4fd8-4ea0-b439-b49a693bf683","part-16-8d638c2e-b1a5-4507-ba00-337a02e3f431","part-17-0c739863-b5fe-4e33-8f47-3e2751b599df","part-18-35d65ae7-5d1d-43f8-bb21-e6565874975e","part-19-2fd81de2-5d34-43db-809d-2f1fe1e67200"],"_jRangeBounds":[{"start":{"locus":{"contig":"20","position":10019093}},"end":{"locus":{"contig":"20","position":10286773}
},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":10286773}},"end":{"locus":{"contig":"20","position":10603326}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":10603326}},"end":{"locus":{"contig":"20","position":10625804}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":10625804}},"end":{"locus":{"contig":"20","position":10653469}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":10653469}},"end":{"locus":{"contig":"20","position":13071871}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":13071871}},"end":{"locus":{"contig":"20","position":13260252}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":13260252}},"end":{"locus":{"contig":"20","position":13561632}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":13561632}},"end":{"locus":{"contig":"20","position":13709115}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":13709115}},"end":{"locus":{"contig":"20","position":13798776}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":13798776}},"end":{"locus":{"contig":"20","position":14032627}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":14032627}},"end":{"locus":{"contig":"20","position":15948325}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":15948325}},"end":{"locus":{"contig":"20","position":16347823}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":16347823}},"end":{"locus":{"contig":"20","position":16410559}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":16410559}},"end":{"locus":{"contig":"20","position":17410116}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17410116}},"end":
{"locus":{"contig":"20","position":17475217}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17475217}},"end":{"locus":{"contig":"20","position":17595540}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17595540}},"end":{"locus":{"contig":"20","position":17600357}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17600357}},"end":{"locus":{"contig":"20","position":17608348}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17608348}},"end":{"locus":{"contig":"20","position":17705709}},"includeStart":true,"includeEnd":true},{"start":{"locus":{"contig":"20","position":17705709}},"end":{"locus":{"contig":"20","position":17970876}},"includeStart":true,"includeEnd":true}],"_attrs":{}} |
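The `_jRangeBounds` in this spec show the overlap the commit message talks about: each interval ends at the locus where the next one starts, and both endpoints are inclusive. A short sketch of how such shared boundaries could be detected, using only the first three intervals copied from the JSON above (`shared_boundaries` is a hypothetical helper, not part of Hail):

```python
# First three entries of "_jRangeBounds" from rows/rows/metadata.json above.
range_bounds = [
    {"start": {"locus": {"contig": "20", "position": 10019093}},
     "end": {"locus": {"contig": "20", "position": 10286773}},
     "includeStart": True, "includeEnd": True},
    {"start": {"locus": {"contig": "20", "position": 10286773}},
     "end": {"locus": {"contig": "20", "position": 10603326}},
     "includeStart": True, "includeEnd": True},
    {"start": {"locus": {"contig": "20", "position": 10603326}},
     "end": {"locus": {"contig": "20", "position": 10625804}},
     "includeStart": True, "includeEnd": True},
]


def shared_boundaries(bounds):
    """Return the keys that two neighboring partitions are both allowed to hold."""
    shared = []
    for left, right in zip(bounds, bounds[1:]):
        same_key = left["end"] == right["start"]
        if same_key and left["includeEnd"] and right["includeStart"]:
            shared.append(left["end"])
    return shared


overlaps = shared_boundaries(range_bounds)
for key in overlaps:
    locus = key["locus"]
    print(f"key {locus['contig']}:{locus['position']} is allowed in two neighboring partitions")
```

This check only looks at the simple case of neighbors touching at a single inclusive endpoint, which is the shape seen in this test resource; partitioners that overlap in other ways would need a more general comparison.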
Binary file added
BIN
+1.56 KB
hail/src/test/resources/sample.vcf-20-partitions-with-overlap.mt/rows/rows/metadata.json.gz
Binary file not shown.
Binary file added
BIN
+1.93 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-00-cdb826da-6c5c-47b6-945b-3190a87a6a14
Binary file not shown.
Binary file added
BIN
+1.89 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-01-06f6a507-61e2-4bd1-a917-e1809270144c
Binary file not shown.
Binary file added
BIN
+1.74 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-02-881d024c-5baf-4fe6-bc8f-53eda3845bde
Binary file not shown.
Binary file added
BIN
+2.05 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-03-1e085a57-4dcb-4131-bc79-353324ffad47
Binary file not shown.
Binary file added
BIN
+1.85 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-04-d17ed9aa-6b33-4b0b-85d5-578da32f7581
Binary file not shown.
Binary file added
BIN
+1.81 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-05-40d512f8-23ba-485e-aefa-47eced2bfe6d
Binary file not shown.
Binary file added
BIN
+1.81 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-06-9b2dc9c7-c8b1-4ed4-9056-20142b5f6658
Binary file not shown.
Binary file added
BIN
+1.63 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-07-b9a32d97-cb10-4158-aeaa-645dcea68ca7
Binary file not shown.
Binary file added
BIN
+1.96 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-08-c2a0123f-a3d4-4b80-9c21-73cb2bed0b63
Binary file not shown.
Binary file added
BIN
+1.78 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-09-ca197aee-6bfd-4068-b771-e9ca63551a7c
Binary file not shown.
Binary file added
BIN
+1.87 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-10-17048169-a98b-49ee-ae4d-62641023b3ac
Binary file not shown.
Binary file added
BIN
+1.58 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-11-c89858f5-4d78-4739-af31-308a1c257ff4
Binary file not shown.
Binary file added
BIN
+1.82 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-12-3e391e78-782d-495d-a29c-cacc56e1baf8
Binary file not shown.
Binary file added
BIN
+1.88 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-13-62566d28-e496-4538-a325-b567be66accf
Binary file not shown.
Binary file added
BIN
+1.88 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-14-8ab32ab7-15cd-4302-bb45-6b3dc02db5b6
Binary file not shown.
Binary file added
BIN
+1.88 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-15-c4301966-4fd8-4ea0-b439-b49a693bf683
Binary file not shown.
Binary file added
BIN
+1.8 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-16-8d638c2e-b1a5-4507-ba00-337a02e3f431
Binary file not shown.
Binary file added
BIN
+1.73 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-17-0c739863-b5fe-4e33-8f47-3e2751b599df
Binary file not shown.
Binary file added
BIN
+1.85 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-18-35d65ae7-5d1d-43f8-bb21-e6565874975e
Binary file not shown.
Binary file added
BIN
+1.98 KB
...0-partitions-with-overlap.mt/rows/rows/parts/part-19-2fd81de2-5d34-43db-809d-2f1fe1e67200
Binary file not shown.