paste
The paste program extracts data, base by base, from multiple bigWigs simultaneously. Earlier versions of bwtool had programs to multiply or add two bigWigs together. Eventually it became clear that little is gained by adding complexity in that direction, and that bwtool would be a lot more useful just by presenting the data in a way simple enough that a small script can perform the calculation. Wig files have a vertical data format, but there is no guarantee that the data in one wig file will align with the data in another, base by base and line after line: chromosomes may be stored in any order in a wig file, and data may be missing in different regions. paste takes care of this alignment issue, and can take an arbitrary number of bigWigs. Usage:
bwtool paste - simultaneously output same regions of multiple bigWigs
usage:
bwtool paste input1.bw input2.bw input3.bw ...
options:
-header put header with labels from file or filenames
-consts=c1,c2... add constants to output lines
-skip-NA don't output lines (bases) where one of the inputs is NA
With the two main example bigWigs, using paste the default way will result in the following:
$ bwtool paste main.bigWig second.bigWig
chr 0 1 1.00 4.00
chr 1 2 2.00 2.00
chr 2 3 5.00 3.00
chr 3 4 6.00 4.00
chr 4 5 5.00 4.00
chr 5 6 3.00 3.00
chr 6 7 3.00 3.00
chr 7 8 5.00 7.00
chr 8 9 5.00 8.00
chr 9 10 5.00 7.00
chr 10 11 6.00 7.00
chr 11 12 6.00 5.00
chr 12 13 0.00 1.00
chr 13 14 2.00 2.00
chr 14 15 3.00 3.00
chr 15 16 3.00 3.00
chr 16 17 10.00 4.00
chr 17 18 4.00 4.00
chr 18 19 4.00 4.00
chr 19 20 2.00 2.00
chr 20 21 2.00 2.00
chr 21 22 2.00 2.00
chr 22 23 1.00 1.00
chr 23 24 NA 1.00
chr 24 25 NA 1.00
chr 25 26 NA 2.00
chr 26 27 NA 1.00
chr 27 28 2.00 2.00
chr 28 29 3.00 3.00
chr 29 30 4.00 4.00
chr 30 31 6.00 4.00
chr 31 32 6.00 2.00
chr 32 33 4.00 2.00
chr 33 34 4.00 2.00
chr 34 35 4.00 2.00
chr 35 36 2.00 2.00
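As an illustration of the "small script" idea mentioned above, here is a minimal Python sketch that combines the value columns of paste output line by line. It is not part of bwtool: the function name `combine_paste_lines` is hypothetical, and the sketch assumes the output is whitespace-delimited with three coordinate columns followed by one value column per bigWig, as in the example above. Any line containing an NA is reported as NA rather than computed.

```python
import io

def combine_paste_lines(lines, op):
    """Yield (chrom, start, end, value) for each data line of paste output.

    `op` is any function taking a list of floats (e.g. sum, max).
    Lines where any input is NA yield None as the value.
    This is a hypothetical helper, not something bwtool provides.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and an optional header line
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        raw = fields[3:]
        if "NA" in raw:
            yield chrom, start, end, None
        else:
            yield chrom, start, end, op([float(v) for v in raw])

# Three lines copied from the example output above.
sample = """chr 22 23 1.00 1.00
chr 23 24 NA 1.00
chr 30 31 6.00 4.00
"""

rows = list(combine_paste_lines(io.StringIO(sample), sum))
for chrom, start, end, value in rows:
    print(chrom, start, end, "NA" if value is None else f"{value:.2f}")
```

In practice the same function could read from `sys.stdin`, so that the output of bwtool paste is piped straight into the script.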
As in the window program, it is useful to adjust the number of decimals with the -decimals option. It is also useful to keep track of which column represents which bigWig by adding a header with the -header option. Finally, NA values may or may not be useful, depending on the circumstance. If they're not desired, the -skip-NA option eliminates output lines where any of the bigWigs has a missing base. All three of these options can be seen in this example:
$ bwtool paste main.bigWig second.bigWig -decimals=3 -header -skip-NA
#chrom chromStart chromEnd main.bigWig second.bigWig
chr 0 1 1.000 4.000
chr 1 2 2.000 2.000
chr 2 3 5.000 3.000
chr 3 4 6.000 4.000
chr 4 5 5.000 4.000
chr 5 6 3.000 3.000
chr 6 7 3.000 3.000
chr 7 8 5.000 7.000
chr 8 9 5.000 8.000
chr 9 10 5.000 7.000
chr 10 11 6.000 7.000
chr 11 12 6.000 5.000
chr 12 13 0.000 1.000
chr 13 14 2.000 2.000
chr 14 15 3.000 3.000
chr 15 16 3.000 3.000
chr 16 17 10.000 4.000
chr 17 18 4.000 4.000
chr 18 19 4.000 4.000
chr 19 20 2.000 2.000
chr 20 21 2.000 2.000
chr 21 22 2.000 2.000
chr 22 23 1.000 1.000
chr 27 28 2.000 2.000
chr 28 29 3.000 3.000
chr 29 30 4.000 4.000
chr 30 31 6.000 4.000
chr 31 32 6.000 2.000
chr 32 33 4.000 2.000
chr 33 34 4.000 2.000
chr 34 35 4.000 2.000
chr 35 36 2.000 2.000
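When the header is present, a downstream script can look columns up by label instead of by position. The following Python sketch shows one way to do that. The function name `read_labeled_paste` is hypothetical, and the sketch assumes a single header line beginning with `#`, whitespace-delimited fields, and no NA values (that is, output produced with -skip-NA, as above).

```python
import io

def read_labeled_paste(lines):
    """Parse paste output that includes a header line.

    Returns (labels, records), where labels are the bigWig column names and
    each record is (chrom, start, end, {label: value}).
    Hypothetical helper; assumes no NA values are present.
    """
    labels = None
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # First three header fields name the coordinate columns.
            labels = line.lstrip("#").split()[3:]
            continue
        if labels is None:
            raise ValueError("expected a header line before the data")
        fields = line.split()
        values = {name: float(v) for name, v in zip(labels, fields[3:])}
        records.append((fields[0], int(fields[1]), int(fields[2]), values))
    return labels, records

# Header and two lines copied from the example output above.
sample = """#chrom chromStart chromEnd main.bigWig second.bigWig
chr 0 1 1.000 4.000
chr 16 17 10.000 4.000
"""

labels, records = read_labeled_paste(io.StringIO(sample))
# Per-base difference between the two bigWigs, looked up by label.
diffs = [v["main.bigWig"] - v["second.bigWig"] for _, _, _, v in records]
print(labels)
print(diffs)
```

Looking values up by label keeps the script correct if the order of the input bigWigs on the command line changes.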