Honors Program: Iowa State University

Project Name: Cracking the Corn Genome Code

Page 2 of 6

GOALS

Iowa State University researchers worked on developing new methods for large-scale DNA sequence analysis with the following goals:

1) Develop better computational methods that enable significantly faster analysis.

2) Develop methods that require significantly less memory.

3) Produce software that runs on multiple processors using open-source operating systems such as Linux and freely available third-party software such as MPICH.

4) Produce software that scales from inexpensive commodity clusters to supercomputers such as the Blue Gene/L.

5) Solve challenging problems, both those that are time-consuming and those that could not previously be solved by any stand-alone software, to demonstrate capability and discover knowledge.

6) Develop Web portals to disseminate the knowledge and empower scientists in agriculture and biorenewables.

7) Distribute the software to spread the benefits and enable other research and commercial organizations.

METHODS AND SCOPE

The Iowa State team conducted research over two years to develop efficient parallel methods for large-scale DNA sequence analysis. The resulting methods significantly reduce runtime and achieve good parallel efficiency. Memory consumption is reduced to about 80 bytes per DNA base in the input, whereas existing technology requires memory that grows as much as the square of the number of sequences. Memory usage is spread uniformly across processors, which allows a much smaller memory footprint per processor.
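To make the memory comparison concrete, the following is a minimal back-of-the-envelope sketch, not the team's own code. It reads the quoted "80B" figure as 80 bytes per base. The input size (one million sequences of 500 bases each), the one byte per sequence pair assumed for the quadratic approach, and the function names are all hypothetical values chosen only for illustration.

```python
# Rough memory estimates for linear vs. quadratic scaling.
# All input sizes below are illustrative assumptions, not figures from the project.

BYTES_PER_BASE = 80  # figure quoted in the text, read here as 80 bytes per DNA base


def linear_memory_bytes(num_sequences, avg_length):
    """Memory that grows with the total number of bases in the input."""
    return BYTES_PER_BASE * num_sequences * avg_length


def quadratic_memory_bytes(num_sequences, bytes_per_pair=1):
    """Memory that grows with the square of the number of sequences.

    bytes_per_pair is a hypothetical constant; real tools differ.
    """
    return bytes_per_pair * num_sequences ** 2


def per_processor_bytes(total_bytes, num_processors):
    """Share of memory per processor when usage is spread uniformly (rounded up)."""
    return -(-total_bytes // num_processors)


if __name__ == "__main__":
    n, length = 1_000_000, 500  # hypothetical input: 1M sequences of 500 bases
    linear = linear_memory_bytes(n, length)
    quadratic = quadratic_memory_bytes(n)
    print(f"linear:    {linear / 1e9:.0f} GB in total")
    print(f"quadratic: {quadratic / 1e9:.0f} GB in total")
    print(f"linear, per processor on 1,024 processors: "
          f"{per_processor_bytes(linear, 1024) / 1e6:.1f} MB")
```

Under these assumptions the linear approach needs about 40 GB in total, or roughly 39 MB per processor on 1,024 processors, while the quadratic approach needs about 1,000 GB. The gap widens as the number of sequences grows, because doubling the input doubles the linear figure but quadruples the quadratic one.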

ACHIEVEMENTS

1) Developed a software framework called PaCE, which can run on any multiprocessor system with a freely available Message Passing Interface (MPI) implementation.

2) The software was used to produce the first assembly of gene-enriched sequences from maize, sequenced in NSF pilot projects worth more than $10 million. By making the results available roughly three and a half months ahead of competing efforts, the software was instrumental in empowering agricultural genetics scientists early and spreading the benefits of public investment.

3) The Web portal http://www.plantgenomics.iastate.edu, created to disseminate the assembly, was used by thousands of researchers both from academia and industry (including BASF and DuPont).

4) The innovations in the technology behind this effort received the best paper award in the Applications track of the International Parallel and Distributed Processing Symposium, the IEEE flagship conference in parallel processing. Four best paper awards, one in each track, were selected from 531 submissions.

5) The NSF funded Iowa State to procure a 1,024-node IBM Blue Gene/L system for further work on maize genome sequencing.

6) The software is in use by over 50 groups from 10 countries and also by DuPont subsidiary Pioneer Hi-Bred International.

7) The Iowa State team gave tutorials at several top conferences to disseminate the software and highlight its potential applications.
