Imagine for a moment that you have a 15-million-piece puzzle that you are trying to put together. Now imagine that you bought 30 copies of the same puzzle, but the copies you bought were all on clearance because they are full of defects. On average, each puzzle piece contains one or two defects. Now imagine that you open all the puzzles, dump the pieces into a pile on the floor, and then throw away the puzzle boxes. You now have a pile of approximately 500 million puzzle pieces and no box to show you the final picture.
While this might sound like a computer scientist’s worst nightmare, it is, in actuality, a real problem facing bioinformaticians today. Of course, researchers are not buying defective puzzles and throwing away the boxes. But they are trying to assemble an individual’s complete DNA sequence, or genome. Researchers assemble a genome from millions of short DNA fragments, which we call short reads. Next-generation sequencing (NGS) machines produce these short reads from a DNA sample in a massively parallel fashion, thereby generating vast amounts of data that researchers struggle to process. Short reads vary in length, depending on the sequencing technology that produced them, but they range from approximately 100 nucleotide bases up to 1,000 bases. The problem of assembling a complete genome from these short reads in the absence of a reference genome is called de novo assembly.
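To make the assembly task a little more concrete, here is a toy sketch of the de Bruijn graph technique that many de novo assemblers build on: each read is broken into overlapping k-mers, and each k-mer’s prefix is linked to its suffix. This is purely illustrative (the reads, function name, and k value are made up for the example, and it is not a description of any particular assembler or of Pico Computing’s design), but it shows why memory blows up quickly: every read contributes many graph nodes and edges.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-base prefix to the (k-1)-base suffixes that follow it."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k across the read to extract every k-mer.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

# Three overlapping toy "short reads" drawn from the sequence ACGTGCA.
reads = ["ACGTG", "CGTGC", "GTGCA"]
graph = build_de_bruijn(reads, k=4)
for node in sorted(graph):
    print(node, "->", sorted(set(graph[node])))
```

Following the prefix-to-suffix edges from ACG spells out the original sequence ACGTGCA. In a real assembly the graph holds hundreds of millions of reads, and sequencing errors (the "defects" in the puzzle analogy) create spurious branches that must be pruned.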
Many approaches have been taken to try to solve this problem using traditional CPU clusters and compute resources. However, no matter which algorithm is used, the feedback from these different approaches always seems to be the same: the program takes too much memory and it takes too long to run.
In an effort to improve this situation, Pico Computing is asking some fundamental questions: What can we do to reduce the time and resources that de novo assembly requires? Can we do de novo assembly on FPGAs? What are the compute requirements for tackling this? How do we do this with the finite amount of memory accessible from a single FPGA? Can we partition the problem in a manner that lets us take advantage of a scalable number of FPGAs? How do we do this without having to store all 500 million puzzle pieces in an FPGA?
I’ll be investigating these questions and more in an upcoming series of blog entries, white papers, and design offerings. Stay tuned: we’re shooting for the stars, and it’s about to get interesting!