Take millions of puzzle pieces containing partial words and put them back together into full words, sentences, paragraphs and chapters until the book these random parts came from is rebuilt.
That daunting process is not unlike sequencing an organism’s genome, says University of Oregon biologist Eric A. Johnson, a member of the UO Institute of Molecular Biology. His lab developed a patent-pending technology for discovering differences between genomes called restriction-site associated DNA markers, or RAD. The lab has now shown that RAD can also be used to help put a genome sequence together.
The original RAD technique, unveiled in 2005, led to the UO spinoff company Floragenex, which uses the technology in plant genetics. More recently, Johnson and UO colleague William A. Cresko used it to identify genetic differences among populations of threespine stickleback, a fish, that evolved separately after environmental conditions isolated some of the saltwater fish in freshwater habitats.
Now, after three years of research, adapting it along the way as sequencing tools advanced, Johnson, Cresko and three UO colleagues provide a proof-of-principle paper in the April issue of PLoS One, a publication of the Public Library of Science. The National Institutes of Health-funded research documents that the new method, called RAD paired-end contigs, works and provides accurate sequencing results.
“The RAD sequence is a placeholder that identifies one small region of a genome,” Johnson said. “We showed that this technique lets us gather together appropriate nearby sequences and piece them together.” In just seconds, a section is completed, he said. In a matter of hours, he added, an entire genome’s sequence emerges.
Using the book analogy, Johnson said: “We first asked if we can piece together one short sentence at a time instead of ordering all the words in the whole book at once. Next, can we put together one paragraph at a time? That’s like going from, say, 1,000 letters of the genome in a row to 5,000 at a time. Here, we show that we can do this. We can put the book back together.”
A RAD marker in the book analogy might be a word at the start of each sentence. Using that marker as an anchor, the rest of the words in the sentence are easily separated from all the other words in the book, and then the words in the sentence are put in the right order, on and on until all the content is organized correctly, Johnson said. The end product, he added, contains few, if any, leftover or unexplained fragments, a problem with current technologies, which rely on clusters of computers with extensive memory to complete sequencing projects.
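The anchoring idea Johnson describes can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the method from the paper: the reads are made-up toy strings, the function names (`group_by_anchor`, `overlap`, `greedy_assemble`, `assemble_by_anchor`) are hypothetical, the paired-end reads are assumed to be error-free and already on the same strand, and a simple greedy overlap merge stands in for whatever assembler a real pipeline would use.

```python
from collections import defaultdict


def group_by_anchor(read_pairs, anchor_len):
    """Bucket paired-end reads by the first anchor_len bases of the forward read.

    The forward read starts at the restriction site, so its leading bases act
    as the "first word of the sentence" that identifies the locus.
    """
    groups = defaultdict(list)
    for forward, paired_end in read_pairs:
        groups[forward[:anchor_len]].append(paired_end)
    return dict(groups)


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two fragments with the largest overlap (toy assembler)."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps; remaining fragments stay separate
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


def assemble_by_anchor(read_pairs, anchor_len, min_len=3):
    """Assemble each anchor's paired-end reads independently of all other loci."""
    groups = group_by_anchor(read_pairs, anchor_len)
    return {anchor: greedy_assemble(reads, min_len) for anchor, reads in groups.items()}


# Toy data: (forward read beginning with the anchor, paired-end read nearby).
read_pairs = [
    ("AAAATG", "GATTACA"),
    ("CCCCTG", "ACCGTAA"),
    ("AAAATG", "CAGGCT"),
    ("CCCCTG", "TTGACCG"),
    ("AAAATG", "TACAGGC"),
    ("CCCCTG", "CGTAAG"),
]

contigs = assemble_by_anchor(read_pairs, anchor_len=4)
print(contigs)
```

Because each anchor's reads are assembled separately, every individual problem stays small, which is the point of the sentence-at-a-time analogy: no step has to consider all the reads in the genome at once.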
RAD technology can be applied to study the genetics of organisms for which genomes have not been completed, Johnson said.
The PLoS One paper detailed how RAD works on stickleback and the bacterium E. coli. Both have small and rather simple genomes. There already is interest in its potential application in human genome sequencing, Johnson said.
At about the same time the PLoS One paper was being published, Johnson, who also is the chief scientific officer of Floragenex, was part of a company-sponsored, one-day RAD Sequencing and Genomics Symposium in Portland, Ore., on April 19. Nine researchers from five institutions (UO, Oregon State University, University of British Columbia, University of Tennessee and University of Washington) described how they are applying RAD in their sequencing projects.
“It was quite gratifying to hear them speak about this technology and how it is working for them,” Johnson said.
RAD technology also is being used in a three-year project — funded by a $1 million grant from the W.M. Keck Foundation and led by Cresko and UO colleague Hui Zong — to identify genetic changes that occur from the formation of a single mutation to full-fledged cancer. The project could lead to a new generation of molecular diagnostics to detect cancers in their earliest stages.
Citation: Etter PD, Preston JL, Bassham S, Cresko WA, Johnson EA (2011) Local De Novo Assembly of RAD Paired-End Contigs Using Short Sequencing Reads. PLoS ONE 6(4): e18561. doi:10.1371/journal.pone.0018561
Source: University of Oregon via EurekAlert!