Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding ability to provide bed file for sambamba slice #307

Merged
merged 7 commits into from
Aug 31, 2017

Conversation

godotgildor
Copy link
Contributor

I have added the ability to provide a bed file to specify regions for the sambamba slice operation. Users may specify either a bed file or the regions on the command line, but not both.

I have also added a test to the .test_suite.sh script to verify that it will slice and not return an error.

One note: the bed parser will combine regions, so if a bed file has two identical entries the sliced bam will only contain the given slice one time. This is unlike taking a region from the command line where no region combining is attempted. So if two identical regions are specified on the command line then the sliced bam would contain the given slice two times.

Copy link
Contributor

@lomereiter lomereiter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I put a few suggestions, but the only crucial bit is about array boundaries.

sambamba/slice.d Outdated
@@ -321,7 +332,13 @@ int slice_main(string[] args) {
if (args[2] == "*") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With new changes, if BED file is provided, this causes access outside array boundaries.

sambamba/slice.d Outdated

if (args.length < 3) {
if ( (bed_filename.length == 0 && args.length < 3) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more readable to first check if args.length < 2, then put args[2..$] into a variable and then check if it's empty/non-empty.

sambamba/slice.d Outdated
Region[] regions;
if (bed_filename !is null) {
auto bam_regions = parseBed(bed_filename, bam);
regions = cast(Region[])bam_regions.map!(r => Region(bam.reference(r.ref_id).name, r.start, r.end)).array;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cast() seems superfluous

.test_suite.sh Outdated
testSliceMultipleRegionsBed() {
./build/sambamba slice -L test/chr2_test_region.bed test/issue_204.bam |\
./build/sambamba view -c /dev/stdin > /dev/null
assertEquals 0 $?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps check at least the number of reads? (see test below, the test above is admittedly not a good example)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also add another BED with overlapping/repeated entries.

@pjotrp
Copy link
Member

pjotrp commented Aug 4, 2017

@godotgildor thank you for contributing to sambamba! Wish we got more of those. @lomereiter I think we can fix these ourselves.

* Making the arg checks more readable
* Avoiding accessing out of bounds array element if bedfile provided
* Updating test to include
  * Actual check that # lines matches expected
  * Bed file has overlapping region
@godotgildor
Copy link
Contributor Author

Hi, sorry it took me a while to get to the requested revisions. I think I have addressed the above concerns in the latest commit. Let me know what you think.

@lomereiter lomereiter merged commit 3609790 into biod:master Aug 31, 2017
@lomereiter
Copy link
Contributor

Hi, no hurry, you're not the only busy person here.
The PR looks good now, thanks for the contribution!

@vladsavelyev
Copy link

vladsavelyev commented Nov 6, 2017

Hi Artem - I'm wondering if you make a quick release containing this change? I understand that making a proper release requires lots of testing and other associated work, but probably you could through some kind of v0.6.7b? I'm struggling to compile sambamba on my own under Red Hat :(

@pjotrp
Copy link
Member

pjotrp commented Nov 23, 2017

Released.

@vladsavelyev
Copy link

Much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants