ice91 edited this page Aug 9, 2012 · 2 revisions

Next-generation sequencing technologies have increased the throughput of DNA sequencing at a rate much faster than the growth in computer speed predicted by “Moore’s Law.” Even loading these sequencing data into memory and processing them is a problem. There is an urgent need for de novo assemblers that can efficiently handle huge amounts of sequencing data using scalable commodity servers in the cloud. Here we present the CloudBrush project, which develops a parallel algorithm that runs on the MapReduce framework of cloud computing for de novo assembly of high-throughput sequencing data. The algorithm uses Myers’s bi-directed string graphs as its basis and consists of two main stages: graph construction and graph simplification.
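To make the two stages concrete, here is a toy, single-process sketch in Python. It is not CloudBrush's implementation: it only mimics the map and reduce steps in memory, assumes exact overlaps of a fixed length `k`, and works on the forward strand only, so it ignores the reverse complements and bi-directed edges that a real string graph needs. All function names (`map_read`, `reduce_overlaps`, `build_graph`, `merge_chains`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_read(read_id, seq, k):
    """Map step (toy): key each read by its length-k prefix and suffix."""
    yield seq[:k], ("prefix", read_id)
    yield seq[-k:], ("suffix", read_id)


def reduce_overlaps(records):
    """Reduce step (toy): within one key, a read whose suffix matches
    another read's prefix yields a directed overlap edge."""
    suffix_ids = [rid for kind, rid in records if kind == "suffix"]
    prefix_ids = [rid for kind, rid in records if kind == "prefix"]
    for a in suffix_ids:
        for b in prefix_ids:
            if a != b:
                yield a, b


def build_graph(reads, k):
    """Stage 1, graph construction: group map output by key, then reduce."""
    groups = defaultdict(list)
    for rid, seq in reads.items():
        for key, value in map_read(rid, seq, k):
            groups[key].append(value)
    edges = set()
    for records in groups.values():
        edges.update(reduce_overlaps(records))
    return edges


def merge_chains(reads, edges, k):
    """Stage 2, graph simplification (toy): merge unbranched paths.
    Pure cycles have no starting read and are skipped in this sketch."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for a, b in edges:
        out_edges[a].append(b)
        in_edges[b].append(a)

    def linked(a, b):
        # a -> b is mergeable if it is a's only out-edge and b's only in-edge
        return len(out_edges[a]) == 1 and len(in_edges[b]) == 1

    contigs, visited = [], set()
    for rid in sorted(reads):
        preds = in_edges[rid]
        if len(preds) == 1 and linked(preds[0], rid):
            continue  # this read is absorbed into its predecessor's chain
        seq, cur = reads[rid], rid
        visited.add(cur)
        while len(out_edges[cur]) == 1:
            nxt = out_edges[cur][0]
            if not linked(cur, nxt) or nxt in visited:
                break
            seq += reads[nxt][k:]  # append only the non-overlapping tail
            visited.add(nxt)
            cur = nxt
        contigs.append(seq)
    return contigs


reads = {"r1": "ATGCG", "r2": "CGTTA", "r3": "TAGGC"}
edges = build_graph(reads, k=2)
contigs = merge_chains(reads, edges, k=2)
```

With these three made-up reads, the overlaps `r1 -> r2` and `r2 -> r3` form a single unbranched path, which the simplification step merges into one contig. In a real MapReduce job the grouping done here by `defaultdict` would be performed by the framework's shuffle phase across many machines.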
