P4 C2 384 barcoded dataset

The long read lengths of PacBio’s SMRT® Sequencing enable detection of linked mutations across multiple kilobases of sequence. This feature is particularly useful in the context of protein engineering, where large numbers of similar constructs are generated routinely to explore the effects of mutations on function and stability.

We have developed a PCR-based barcoded sequencing method to generate high-quality, full-length sequence data for batches of constructs built in a common backbone. Individual barcodes are coupled to primers targeting a common region of the vector of interest. The amplified products are pooled into a single DNA library, and the sequencing data are clustered by barcode to generate a multi-molecule consensus sequence for each construct present in the pool.
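The clustering-by-barcode step can be illustrated with a small Python sketch. This is not the pipeline used to produce the dataset. It assumes exact barcode matches at the start of each read and equal-length reads, whereas real SMRT reads contain insertions and deletions and would need alignment before a consensus is called. The function names and the example barcodes are hypothetical.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Group reads by the barcode found at their start (exact match only).

    `barcodes` maps a hypothetical barcode name to its sequence.
    Reads that match no barcode are dropped.
    """
    bins = defaultdict(list)
    for read in reads:
        for name, bc in barcodes.items():
            if read.startswith(bc):
                bins[name].append(read[len(bc):])  # strip the barcode
                break
    return dict(bins)


def majority_consensus(seqs):
    """Column-wise majority vote over equal-length sequences (toy model)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("this toy consensus needs equal-length sequences")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def consensus_by_barcode(reads, barcodes):
    """Return one consensus sequence per barcode present in the pool."""
    return {
        name: majority_consensus(seqs)
        for name, seqs in demultiplex(reads, barcodes).items()
    }
```

Calling `consensus_by_barcode(reads, barcodes)` on a pooled list of reads returns a dictionary with one consensus per barcode, so a single sequencing error in one read is outvoted by the other molecules carrying the same barcode.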

As a proof-of-concept dataset, we have generated a library of 384 randomly mutated variants of the Phi29 DNA polymerase, a 575 amino acid protein encoded by a 1.7 kb gene. These variants were amplified with a set of barcoded primers, and the resulting library was sequenced on a single SMRT Cell. The data produced sequences that were completely concordant with independent Sanger sequencing, for a 100% accurate reconstruction of the set of clones.
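As a quick check on the numbers, 575 codons × 3 nucleotides = 1,725 bp of coding sequence, which is consistent with the stated gene size of roughly 1.7 kb. The concordance figure can be thought of as the fraction of clones whose consensus sequence exactly matches its Sanger sequence. The sketch below shows one way to compute that, assuming both result sets are dictionaries keyed by a clone identifier. The function name and data layout are hypothetical and are not part of the dataset.

```python
def concordance(consensus, sanger):
    """Fraction of shared clones whose two sequences match exactly.

    Both arguments map a (hypothetical) clone ID to a sequence string.
    Only clones present in both dictionaries are compared.
    """
    shared = consensus.keys() & sanger.keys()
    if not shared:
        raise ValueError("no clones in common between the two sets")
    matches = sum(consensus[c] == sanger[c] for c in shared)
    return matches / len(shared)


# Coding length implied by the protein size (stop codon not counted).
CODING_LENGTH_BP = 575 * 3
```

Under this definition, a value of 1.0 across all 384 clones corresponds to the 100% accurate reconstruction reported above.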

A poster describing this work can be found here

The 384-barcoded data set described in the above poster can be found here

Visit the PacBio Developer's Network Website for the most up-to-date links to downloads, documentation and more.
