Andy Pohl edited this page Oct 29, 2013 · 9 revisions

The window program extracts data from a bigWig as sliding windows and prints the per-base values of each window on one line, in an easy-to-read format.

The usage:

bwtool window - slide a window across the bigWig and at each step print data
   in a format like:

   chrom<TAB>start<TAB>end<TAB>val_start,val_start+1,val_start+2,...,val_end

usage:
   bwtool window size file.bw
options:
   -step=n         skip n bases when sliding window (default 1)
   -skip-NA        don't output lines (windows) containing any NA values
   -center         print start and end coordinates of the middle of the window
                   with size step such that the start/ends are connected each
                   line (if step < size)
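
To make the -center description more concrete, here is a small sketch of the coordinate arithmetic it seems to imply. It is inferred from the examples further down this page, where a size-5 window starting at 0 with step 3 is reported as 1 to 4; it is not taken from the bwtool source. The function name center_coords is hypothetical, and the rounding used when size minus step is odd is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of bwtool): given a window start, the window
# size and the step, print the start/end that -center appears to report.
center_coords() {
    start=$1; size=$2; step=$3
    # Offset of the centered interval inside the window. Integer division is
    # an assumption; the examples on this page only cover an even (size - step).
    offset=$(( (size - step) / 2 ))
    printf '%d\t%d\n' "$(( start + offset ))" "$(( start + offset + step ))"
}

# Windows of size 5 taken every 3 bases, as in the examples on this page.
for start in 0 3 6; do
    center_coords "$start" 5 3
done
```

Because each reported interval has length step and consecutive windows start step bases apart, the intervals join end to start instead of overlapping.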

Examples

Using the same example from the aggregate page:

I can very simply ask for 5-base windows of data, with a new window starting at every base:

$ bwtool window 5 main.bigWig 
chr	0	5	1.00,2.00,5.00,6.00,5.00
chr	1	6	2.00,5.00,6.00,5.00,3.00
chr	2	7	5.00,6.00,5.00,3.00,3.00
chr	3	8	6.00,5.00,3.00,3.00,5.00
chr	4	9	5.00,3.00,3.00,5.00,5.00
chr	5	10	3.00,3.00,5.00,5.00,5.00
chr	6	11	3.00,5.00,5.00,5.00,6.00
chr	7	12	5.00,5.00,5.00,6.00,6.00
chr	8	13	5.00,5.00,6.00,6.00,0.00
chr	9	14	5.00,6.00,6.00,0.00,2.00
chr	10	15	6.00,6.00,0.00,2.00,3.00
chr	11	16	6.00,0.00,2.00,3.00,3.00
chr	12	17	0.00,2.00,3.00,3.00,10.00
chr	13	18	2.00,3.00,3.00,10.00,4.00
chr	14	19	3.00,3.00,10.00,4.00,4.00
chr	15	20	3.00,10.00,4.00,4.00,2.00
chr	16	21	10.00,4.00,4.00,2.00,2.00
chr	17	22	4.00,4.00,2.00,2.00,2.00
chr	18	23	4.00,2.00,2.00,2.00,1.00
chr	19	24	2.00,2.00,2.00,1.00,NA
chr	20	25	2.00,2.00,1.00,NA,NA
chr	21	26	2.00,1.00,NA,NA,NA
chr	22	27	1.00,NA,NA,NA,NA
chr	23	28	NA,NA,NA,NA,2.00
chr	24	29	NA,NA,NA,2.00,3.00
chr	25	30	NA,NA,2.00,3.00,4.00
chr	26	31	NA,2.00,3.00,4.00,6.00
chr	27	32	2.00,3.00,4.00,6.00,6.00
chr	28	33	3.00,4.00,6.00,6.00,4.00
chr	29	34	4.00,6.00,6.00,4.00,4.00
chr	30	35	6.00,6.00,4.00,4.00,4.00
chr	31	36	6.00,4.00,4.00,4.00,2.00

But maybe those NAs are not desired. To simplify things downstream, you can use -fill=value, or -skip-NA:

$ bwtool window 5 main.bigWig -skip-NA
chr	0	5	1.00,2.00,5.00,6.00,5.00
chr	1	6	2.00,5.00,6.00,5.00,3.00
chr	2	7	5.00,6.00,5.00,3.00,3.00
chr	3	8	6.00,5.00,3.00,3.00,5.00
chr	4	9	5.00,3.00,3.00,5.00,5.00
chr	5	10	3.00,3.00,5.00,5.00,5.00
chr	6	11	3.00,5.00,5.00,5.00,6.00
chr	7	12	5.00,5.00,5.00,6.00,6.00
chr	8	13	5.00,5.00,6.00,6.00,0.00
chr	9	14	5.00,6.00,6.00,0.00,2.00
chr	10	15	6.00,6.00,0.00,2.00,3.00
chr	11	16	6.00,0.00,2.00,3.00,3.00
chr	12	17	0.00,2.00,3.00,3.00,10.00
chr	13	18	2.00,3.00,3.00,10.00,4.00
chr	14	19	3.00,3.00,10.00,4.00,4.00
chr	15	20	3.00,10.00,4.00,4.00,2.00
chr	16	21	10.00,4.00,4.00,2.00,2.00
chr	17	22	4.00,4.00,2.00,2.00,2.00
chr	18	23	4.00,2.00,2.00,2.00,1.00
chr	27	32	2.00,3.00,4.00,6.00,6.00
chr	28	33	3.00,4.00,6.00,6.00,4.00
chr	29	34	4.00,6.00,6.00,4.00,4.00
chr	30	35	6.00,6.00,4.00,4.00,4.00
chr	31	36	6.00,4.00,4.00,4.00,2.00
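
If window output has already been written without -skip-NA, a similar filter can be applied afterwards. The following is a minimal sketch, assuming the four-column, tab-separated format shown above; drop_na_windows is a hypothetical helper name, not a bwtool feature.

```shell
#!/bin/sh
# Hypothetical helper: drop window lines whose value list (column 4) contains
# an NA entry, similar in effect to the -skip-NA output shown above.
drop_na_windows() {
    awk -F'\t' '{
        n = split($4, vals, ",")
        for (i = 1; i <= n; i++)
            if (vals[i] == "NA") next   # skip the whole window
        print
    }'
}

# Three lines taken from the unfiltered example output above.
printf 'chr\t18\t23\t4.00,2.00,2.00,2.00,1.00\nchr\t19\t24\t2.00,2.00,2.00,1.00,NA\nchr\t27\t32\t2.00,3.00,4.00,6.00,6.00\n' | drop_na_windows
```

Only the first and third lines are printed. Using -skip-NA directly is still preferable when possible, since the unwanted lines are then never written.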

It should also be mentioned that the output of the window program can be extremely large. Think for a minute about whether you have the space to store 1000 bp windows at every base, with 4-decimal precision, uncompressed, for a genome-wide bigWig. You might not. The best case is that the output is piped immediately into something else that promptly does a calculation of some sort. Otherwise, this is a good time to limit decimal places and perhaps use the -step option to make jumps of a specified amount instead of going base-by-base. For example:

$ bwtool window 5 main.bigWig -skip-NA -decimals=0 -step=3
chr	0	5	1,2,5,6,5
chr	3	8	6,5,3,3,5
chr	6	11	3,5,5,5,6
chr	9	14	5,6,6,0,2
chr	12	17	0,2,3,3,10
chr	15	20	3,10,4,4,2
chr	18	23	4,2,2,2,1
chr	27	32	2,3,4,6,6
chr	30	35	6,6,4,4,4

has trimmed things down quite a bit. At this point, it may be convenient to change the coordinates so that they reflect the region each window is being summarized into. If the idea is to pipe the window data into something that averages it and then make a new bedGraph from the averages, then using, for example, a small awk program (window_ave.awk):

$ bwtool window 5 main.bigWig -skip-NA -step=3 -decimals=0 | awk -f window_ave.awk
chr	0	5	2.8
chr	3	8	3.4
chr	6	11	3.6
chr	9	14	3.4
chr	12	17	1.6
chr	15	20	4.2
chr	18	23	2
chr	27	32	3
chr	30	35	4

is not the answer, because overlapping intervals are not allowed in the bedGraph format. The -center option helps with this problem:

$ bwtool window 5 main.bigWig -skip-NA -step=3 -decimals=0 -center | awk -f window_ave.awk
chr	1	4	2.8
chr	4	7	3.4
chr	7	10	3.6
chr	10	13	3.4
chr	13	16	1.6
chr	16	19	4.2
chr	19	22	2
chr	28	31	3
chr	31	34	4
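
The window_ave.awk script itself is not shown on this page. Below is a minimal sketch of one way to average each window, assuming the four-column format above and that NA values have already been removed (for example with -skip-NA). It is wrapped in a hypothetical shell function, window_ave, so it can be tried without bwtool. It computes the arithmetic mean of every value in the window, so its numbers need not match the listing above: for the first window (1,2,5,6,5) the mean is 3.8, whereas the listing shows 2.8, so the original script presumably does something different.

```shell
#!/bin/sh
# Hypothetical stand-in for window_ave.awk: replace the comma-separated value
# list in column 4 with its arithmetic mean.
window_ave() {
    awk -F'\t' -v OFS='\t' '{
        n = split($4, vals, ",")
        if (n == 0) next                 # no values on this line, nothing to average
        sum = 0
        for (i = 1; i <= n; i++) sum += vals[i]
        print $1, $2, $3, sum / n
    }'
}

# Two windows from the -center example (same values, centered coordinates).
printf 'chr\t1\t4\t1,2,5,6,5\nchr\t4\t7\t6,5,3,3,5\n' | window_ave
```

With the two sample lines, this prints means of 3.8 and 4.4 next to the centered coordinates, which is already in bedGraph-style four-column form.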