Upgrade Pysam #265

Closed
creisle opened this issue Dec 28, 2021 · 3 comments · Fixed by #266 or #267

creisle commented Dec 28, 2021

Currently MAVIS is frozen to pysam 0.15.2 because there are some unexpected bugs in later versions. However, this is getting harder and harder to support because setuptools/pip have issues with these older versions.

I have commented on the upstream ticket (pysam-developers/pysam#527), but it was created in 2017 and there has been little to no movement.

When we run the tests with the newer versions we see the following error:

FAILED tests/end_to_end/test_convert.py::TestConvert::test_delly - assert 17396140 == (7059510 - 670)

After some preliminary debugging, it looks like the END tag of the INFO column is being ignored. In previous versions of pysam (<=0.15.2) it was used to set the "stop" field on VariantRecord. Now, however, it is simply dropped.
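To make the symptom concrete, here is a minimal sketch of reading END straight from the raw INFO text instead of relying on the parser's stop value. The helper names (`parse_info`, `record_stop`) and the sample INFO string are hypothetical and are not MAVIS or pysam code; the POS and END values are the ones reported for the failing delly record. The fallback of POS + len(REF) - 1 when END is absent follows the usual VCF convention.

```python
from typing import Dict, Union

InfoValue = Union[str, bool]


def parse_info(info_field: str) -> Dict[str, InfoValue]:
    """Split a raw VCF INFO string into a dict; flag-only keys map to True."""
    info: Dict[str, InfoValue] = {}
    if info_field in ("", "."):
        return info
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def record_stop(pos: int, ref: str, info_field: str) -> int:
    """Return END from INFO when present, else POS + len(REF) - 1."""
    end = parse_info(info_field).get("END")
    if isinstance(end, str):
        # END is kept even when it is smaller than POS (e.g. a translocation).
        return int(end)
    return pos + len(ref) - 1


if __name__ == "__main__":
    # Hypothetical INFO string; only POS and END come from the issue.
    print(record_stop(17396810, "N", "IMPRECISE;SVTYPE=BND;END=7059510"))
```

Because the value is taken from the text as written, an END smaller than POS survives intact, which is what the delly test expects.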

Adding a summary of testing here (Python 3.7 was used; I will test other versions once this one is working).

| Python version | pysam version | HTSlib disable flags | Tested | Error message |
| --- | --- | --- | --- | --- |
| 3.7 | 0.15.2 | lzma; bz2; libcurl | ✔️ | |
| 3.7 | 0.15.3 | lzma; bz2; libcurl | ✔️ | FAILED tests/end_to_end/test_convert.py::TestConvert::test_delly - assert 17396140 == (7059510 - 670) |
| 3.7 | 0.15.4 | lzma; bz2; libcurl | ✔️ | E OSError: unable to parse next record |
| 3.7 | 0.16.0 | lzma; bz2; libcurl | ✔️ | E ImportError: libchtslib.cpython-37m-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory |
| 3.7 | 0.16.0.1 | lzma; bz2; libcurl | ✔️ | E OSError: unable to parse next record; FAILED tests/end_to_end/test_convert.py::TestConvert::test_delly - assert 17396140 == (7059510 - 670) |
| 3.7 | 0.16.0.1 | bz2; libcurl | ✔️ | same as above |
| 3.7 | 0.17.0 | lzma; bz2; libcurl | ✔️ | same as above |
| 3.7 | 0.18.0 | lzma; bz2; libcurl | ✔️ | same as above |
@creisle creisle added this to the v3.0.0 milestone Dec 28, 2021
@creisle creisle self-assigned this Dec 28, 2021

creisle commented Dec 28, 2021

When I try this with pysam 0.15.4 I see a different error, this time in the Manta VCF files:

============================= test session starts ==============================
platform linux -- Python 3.7.2, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /projects/dat/workspace/creisle/mavis
plugins: cov-3.0.0
collected 11 items

tests/end_to_end/test_convert.py ...F.......                             [100%]

=================================== FAILURES ===================================
____________________________ TestConvert.test_manta ____________________________

self = <tests.end_to_end.test_convert.TestConvert object at 0x7f7e34e2c588>

    def test_manta(self):
>       result = self.run_main(get_data('manta_events.vcf'), SUPPORTED_TOOL.MANTA, False)

tests/end_to_end/test_convert.py:81: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/end_to_end/test_convert.py:41: in run_main
    main()
src/mavis/main.py:292: in main
    raise err
src/mavis/main.py:264: in main
    args.assume_no_untemplated,
src/mavis/main.py:36: in convert_main
    assume_no_untemplated=assume_no_untemplated,
src/mavis/tools/__init__.py:35: in convert_tool_output
    fname, file_type, stranded, log, assume_no_untemplated=assume_no_untemplated
src/mavis/tools/__init__.py:291: in _convert_tool_output
    rows = read_vcf(input_file, file_type, log)
src/mavis/tools/vcf.py:204: in convert_file
    for vcf_record in vfile.fetch():
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   OSError: unable to parse next record

pysam/libcbcf.pyx:4108: OSError
----------------------------- Captured stderr call -----------------------------
[W::vcf_parse] Contig '1    17051724   MantaBND:207:0:1:0:0:0:0    C   [1:234912188[GCCCCATC   36  PASS    SVTYPE=BND;MATEID=MantaBND:207:0:1:0:0:0:1;SVINSLEN=7;SVINSSEQ=GCCCCAT;BND_DEPTH=5;MATE_BND_DEPTH=4 GT:FT:GQ:PL:PR:SR 0/1:PASS:30:86,0,28:1,2:3,1   .   .   .' is not defined in the header. (Quick workaround: index the file with tabix.)
[E::bcf_hdr_parse_line] Could not parse the header line: "##contig=<ID=1    17051724   MantaBND:207:0:1:0:0:0:0    C   [1:234912188[GCCCCATC   36  PASS    SVTYPE=BND;MATEID=MantaBND:207:0:1:0:0:0:1;SVINSLEN=7;SVINSSEQ=GCCCCAT;BND_DEPTH=5;MATE_BND_DEPTH=4 GT:FT:GQ:PL:PR:SR 0/1:PASS:30:86,0,28:1,2:3,1   .   .   .>"
[E::vcf_parse] Could not add dummy header for contig '1    17051724   MantaBND:207:0:1:0:0:0:0    C   [1:234912188[GCCCCATC   36  PASS    SVTYPE=BND;MATEID=MantaBND:207:0:1:0:0:0:1;SVINSLEN=7;SVINSSEQ=GCCCCAT;BND_DEPTH=5;MATE_BND_DEPTH=4 GT:FT:GQ:PL:PR:SR 0/1:PASS:30:86,0,28:1,2:3,1   .   .   .'
=============================== warnings summary ===============================
venv3.7/lib/python3.7/site-packages/Bio/Alphabet/__init__.py:27
  /projects/dat/workspace/creisle/mavis/venv3.7/lib/python3.7/site-packages/Bio/Alphabet/__init__.py:27: PendingDeprecationWarning: We intend to remove or replace Bio.Alphabet in 2020, ideally avoid using it explicitly in your code. Please get in touch if you will be adversely affected by this. https://github.com/biopython/biopython/issues/2046
    PendingDeprecationWarning,

src/mavis/schemas/__init__.py:7
  /projects/dat/workspace/creisle/mavis/src/mavis/schemas/__init__.py:7: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    class ImmutableDict(collections.Mapping):

tests/end_to_end/test_convert.py::TestConvert::test_breakdancer
  /projects/dat/workspace/creisle/mavis/src/mavis/tools/breakdancer.py:40: FutureWarning: The default value of regex will change from True to False in a future version.
    df['num_Reads_lib'] = df['num_Reads_lib'].str.replace(bam, lib)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ============================
FAILED tests/end_to_end/test_convert.py::TestConvert::test_manta - OSError: u...
=================== 1 failed, 10 passed, 3 warnings in 2.50s ===================

Note: I install pysam with the following environment flag (from the setup.py):

export HTSLIB_CONFIGURE_OPTIONS='--disable-lzma --disable-bz2 --disable-libcurl'

Currently I am testing on the develop_v3 branch.

creisle commented Dec 28, 2021

Note: the environment options do not appear to have any effect (probably because pysam is being installed from a wheel, so it doesn't have to build from source). I will be leaving these options off for future runs.
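If the flags are ever needed again, one option is to force pip to build from source so that the configure options are actually read. The sketch below only assembles the command and environment and does not run anything. `--no-binary` is a standard pip option; the helper names are hypothetical, and the assumption that a source build honours `HTSLIB_CONFIGURE_OPTIONS` rests on the setup.py note in the earlier comment.

```python
import os
import sys
from typing import Dict, List, Optional


def source_build_command(package: str = "pysam", version: Optional[str] = None) -> List[str]:
    """Build a pip command that refuses prebuilt wheels for `package`."""
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", "--no-binary", package, requirement]


def source_build_env(disabled: List[str]) -> Dict[str, str]:
    """Copy the current environment and add the HTSlib configure options."""
    env = dict(os.environ)
    env["HTSLIB_CONFIGURE_OPTIONS"] = " ".join(f"--disable-{name}" for name in disabled)
    return env


if __name__ == "__main__":
    cmd = source_build_command(version="0.18.0")
    env = source_build_env(["lzma", "bz2", "libcurl"])
    print(" ".join(cmd))
    print(env["HTSLIB_CONFIGURE_OPTIONS"])
    # To actually install: subprocess.run(cmd, env=env, check=True)
```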

creisle commented Dec 28, 2021

For the delly error, it looks like it has something to do with this warning in the output:

[W::vcf_parse_info] INFO/END=7059510 is smaller than POS at 19:17396810

However, this should not be an issue since it is a translocation.

Looks like it may be related to this issue: samtools/bcftools#1154
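
The numbers in the failed assertion seem consistent with that warning. Below is a quick arithmetic check, assuming the 670 in the assertion is a fixed offset that the test subtracts from the record's stop. The variable and function names are illustrative.

```python
POS = 17396810       # position from the vcf_parse_info warning
END = 7059510        # INFO/END from the same warning
OFFSET = 670         # value subtracted in the test assertion (assumed fixed)
OBSERVED = 17396140  # left-hand side of the failed assertion


def implied_stop(observed: int, offset: int) -> int:
    """Recover the stop value the converter must have started from."""
    return observed + offset


if __name__ == "__main__":
    stop = implied_stop(OBSERVED, OFFSET)
    print(stop == POS)  # True
    print(stop == END)  # False
```

The implied stop equals POS rather than END, which would fit the parser discarding an END that is smaller than POS.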

creisle added a commit that referenced this issue Dec 29, 2021
This was linked to pull requests Dec 29, 2021
@creisle creisle closed this as completed Jan 20, 2022