H_sapiens_54x_release
Instrument: PacBio RS II
Chemistry: C3
Enzyme: P5
The dataset released here contains the raw sequence data from PacBio(R) SMRT(R) Sequencing of CHM1htert, a human cell line derived from a hydatidiform mole, provided as a resource for general community exploration. One ~20 kb long-insert shotgun library was prepared from the same DNA sample. Size selection was performed using 7.5 kb and 10 kb elution cutoffs, respectively, on a BluePippin(TM) DNA size-selection system from Sage Science. The genome was sequenced using P5-C3 chemistry and 3-hour SMRT Cell acquisitions to generate ~167 Gb of sequence data.
Sequencing Data Statistics
Total number of reads: 21,856,161
Total number of post-filtered bases: 167,851,128,644 bp
Read length statistics
Half of sequenced bases in reads greater than: 10,739 bp
5% of reads longer than: 19,060 bp
Average read length: 7,680 bp
SMRTbell template statistics
Longest DNA insert sequenced: 42,774 bp
Average throughput/SMRT Cell: 608 Mb
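As a quick sanity check, the figures above can be tied together with a little arithmetic. The sketch below recomputes the average read length from the total bases and read count, and estimates fold coverage. The genome size of 3.1 Gb is an assumption (it is not stated in this release), and the function names are only illustrative.

```shell
#!/usr/bin/env bash
# Figures copied from the statistics listed above.
total_bases=167851128644
total_reads=21856161
# Assumption: approximate haploid human genome size in bp (not given in this README).
genome_size=3100000000

# Mean read length = total bases / number of reads, rounded to the nearest bp.
mean_read_length() { awk -v b="$1" -v r="$2" 'BEGIN { printf "%.0f\n", b / r }'; }

# Fold coverage = total bases / genome size, to one decimal place.
fold_coverage() { awk -v b="$1" -v g="$2" 'BEGIN { printf "%.1f\n", b / g }'; }

mean_read_length "$total_bases" "$total_reads"   # prints 7680
fold_coverage "$total_bases" "$genome_size"      # prints 54.1
```

Under that genome-size assumption, the result is consistent with the 54x in the release name.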
To access the dataset, please navigate to http://datasets.pacb.com/2014/Human54x/fast.html. To reference the blog post, please visit http://blog.pacificbiosciences.com/2014/02/data-release-54x-long-read-coverage-for.html
To download the data, use wget or curl to work through the list of files. For example, in bash, save the list to a file named file_list and download each entry with a simple loop:
for f in $(cat file_list); do wget "$f"; done
A more powerful download command is:
cat file_list | xargs -n 1 -P 4 wget --continue -P data/ #Download four files at once. Continue downloads if they are interrupted.
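The same parallel download can be wrapped in a small script that tolerates a slightly messy list file. This is only a sketch: the helper names `clean_list` and `download_all` are hypothetical, and the retry count is an arbitrary choice. It assumes file_list holds one URL per line.

```shell
#!/usr/bin/env bash
# Sketch: download every URL in a list file, four at a time, resuming partial files.

# Print the usable URLs from list file $1: strip carriage returns,
# then drop blank lines and lines starting with '#'.
clean_list() {
    tr -d '\r' < "$1" | grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#'
}

# Download all URLs from list file $1 into directory $2 (default: data).
download_all() {
    local list="$1" dest="${2:-data}"
    mkdir -p "$dest"
    # -n 1: one URL per wget call; -P 4: up to four wget processes at once.
    # --continue resumes interrupted files; --tries=3 retries transient failures.
    clean_list "$list" | xargs -n 1 -P 4 wget --continue --tries=3 -P "$dest"
}
```

After sourcing the script, run `download_all file_list data`. If a download is interrupted, running the same command again resumes the incomplete files.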
Please contact us on Twitter at @PacBio. We would appreciate it if you followed us on Twitter so that we can reply by direct message.
Visit the PacBio Developer's Network Website for the most up-to-date links to downloads, documentation and more. Terms of Use | Trademarks | Contact Us