Disq should run tests continuously on larger test data, as there are undoubtedly many code paths that don't get exercised on tiny data. BAMs/CRAMs that are a couple hundred MB in size would be a good compromise between test suite speed and the need to test on more realistic data, I think. There are some suitable BAMs checked into the GATK repo under "large": https://github.com/broadinstitute/gatk/tree/master/src/test/resources/large
As part of this ticket, we'll have to decide how to host and version this large test data. The standard solution is git lfs (which is what we use in the GATK), but if there are other good alternatives out there we should evaluate them as well.
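For reference, git lfs is configured per path pattern via `.gitattributes`. A sketch of what the setup might look like for this repo (the exact patterns are assumptions; `git lfs track "*.bam"` etc. would generate equivalent lines):

```
# .gitattributes — route large alignment files through git lfs
*.bam  filter=lfs diff=lfs merge=lfs -text
*.cram filter=lfs diff=lfs merge=lfs -text
*.crai filter=lfs diff=lfs merge=lfs -text
*.bai  filter=lfs diff=lfs merge=lfs -text
```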
+1 to maintaining a reference set of test data in git lfs, possibly in a separate repo under the disq-bio organization.
Then it would also be worthwhile to mirror these data in cloud storage at all the major providers.
I am willing to produce and help host these data after transformation into Parquet+Avro, both for comparison and for later testing of conversion, if/when that functionality migrates into Disq.
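If the data end up mirrored across several cloud providers, a checksum manifest committed alongside the files would let CI verify that every mirror serves byte-identical copies. A minimal sketch in Python (the manifest format and function names are my own, not anything Disq currently has):

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 so multi-hundred-MB BAMs need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(data_dir, manifest_path):
    """Record 'digest  filename' for every regular file directly under data_dir."""
    lines = [
        f"{sha256_of(p)}  {p.name}"
        for p in sorted(Path(data_dir).iterdir())
        if p.is_file()
    ]
    Path(manifest_path).write_text("\n".join(lines) + "\n")


def verify_manifest(data_dir, manifest_path):
    """Return the names of files that are missing or whose digest no longer matches."""
    bad = []
    for line in Path(manifest_path).read_text().splitlines():
        digest, name = line.split("  ", 1)
        p = Path(data_dir) / name
        if not p.is_file() or sha256_of(p) != digest:
            bad.append(name)
    return bad
```

A mirror check then reduces to downloading (or range-reading) the mirrored files and asserting `verify_manifest(...)` returns an empty list.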