WEST LAFAYETTE, Indiana, January 13, 2010 (ENS) – The soybean, one of the world’s most important sources of protein and oil, is now the first major crop legume species with a published complete draft genome sequence.
Essentially a parts list of the soybean genome, the sequence will help scientists use the plant’s genes to improve its characteristics. The soybean sequencing study appears as the cover story of today’s edition of the journal “Nature.”
Improvements through genetic modification are expected to yield beans with an oil content greater than 40 percent, producing more oil for food and biodiesel. A mutation can also be used to select for a line that will improve the ability of animals and humans to digest soybeans. In addition, the sequence has provided access to the first resistance gene for Asian soybean rust, a disease that can destroy up to 80 percent of crops.
“When soybeans were domesticated, they were selected for seed size and other traits, but there were a lot of potentially valuable genes left behind,” said agronomy professor Scott Jackson of Purdue, the corresponding author on the soybean genome paper. “There may be valuable genes associated with protein content or disease resistance in the stored lines that are not currently in the cultivated lines.”
Having the new soybean sequence as a reference is expected to increase the speed and reduce the costs of resequencing the 20,000 stored soybean lines.
The study was authored by Jeremy Schmutz of the Joint Genome Institute and the HudsonAlpha Genome Sequencing Center and 43 other researchers from 18 institutions, including the U.S. Department of Agriculture-Agricultural Research Service, Purdue University and the University of North Carolina at Charlotte. The Department of Energy, the National Science Foundation, USDA and the United Soybean Board supported the research.
Schmutz said that the soybean sequencing was the largest plant project done to date at the Institute. “It also happens to be the largest plant that’s ever been sequenced by the whole genome shotgun strategy – where we break it apart and reassemble it like a huge puzzle,” he said.
Of the more than 20 other plant genomes taken on by the Joint Genome Institute, those already sequenced include the black cottonwood (poplar) tree and the grain sorghum, both targeted because of their promise as biomass feedstocks for biofuels production.
The sequencing of the soybean genome began with the production of a physical map of the soybean genome by a research team funded by the National Science Foundation.
Production of this map was complicated by the complexities of the soybean genome, which include duplicate copies of genes that make up 70 to 80 percent of the genome’s 46,000 genes. These gene copies are scattered throughout the genome and so are particularly difficult to locate.
In addition, the soybean genome contains large numbers of transposable elements, called TEs. These are mobile DNA pieces that may impact gene expression, but TEs are difficult to distinguish from genes.
Jackson said the study, backed by the U.S. departments of Energy and Agriculture, found that the soybean has about 46,000 genes, but many of those – 70 percent to 80 percent – are duplicates. This duplication may make it difficult to target the genes necessary to improve soybean characteristics such as seed size, oil content or yield.
Adding to the difficulty, Jackson said many of the duplicated genes in the soybean genome have been shuffled, making it hard to predict where the duplicate copies of a gene might be. This complicates the genetics and breeding of soybeans.
“It really is going to change the way we ask questions about soybeans in research,” said Randy Shoemaker, a research geneticist from the U.S. Department of Agriculture’s Agricultural Research Service at Iowa State University and the paper’s co-author. “What used to take us literally years can take us weeks or months now. This is the entire genetic code in front of you.”
Having the genome in hand will allow scientists to compare different varieties of soybean plants and determine which genes are responsible for different characteristics, such as increased oil content or larger plants. One of the next steps in the research is to resequence the 20,000 soybean lines in the U.S. germplasm collection to find genes not common to all soybean cultivars.
Containing so many TEs and gene duplicates, the soybean genome is “the most complicated genome sequenced to date,” said Jackson. Some of the same features that complicated the mapping and sequencing of the genome may also complicate the targeting of soybean genes.
“If I’m selecting for a gene, I may have difficulty locating all of the necessary duplicates of that gene,” explains Jackson. “It has a lot of backup copies.”
Confident that such difficulties will be overcome, Jane Silverthorne of the National Science Foundation describes the new soybean sequence as “a valuable tool that will enable research towards a deeper understanding of the impacts of multiple genome copies on genome organization and function.”
The results of the sequencing project have already provided material for a second paper, which will appear in the journal “The Plant Cell” on January 15. Jianxin Ma of Purdue University, a member of the sequencing team, says that this second paper will explain how TEs thrive in the host genome.
“We found that some ‘dead’ TEs can actually be revivified by swapping with their active TE partners, and thus restore or even enhance their ability to proliferate using the amplification machinery encoded by their partners,” said Ma. “Although TEs are ubiquitous, what we discovered has not been seen in any other organisms.”
“The soybean genome’s billion-plus nucleotides afford us a better understanding of the plant’s capacity to turn sunlight, carbon dioxide, nitrogen and water, into concentrated energy, protein, and nutrients for human and animal use,” said Anna Palmisano, the Department of Energy’s associate director of science for biological and environmental research. “This opens the door to crop improvements that are sorely needed for energy production, sustainable human and animal food production, and a healthy environmental balance in agriculture worldwide.”
A video of Schmutz discussing the soybean genome project is online at the DOE JGI’s SciVee channel: http://www.scivee.tv/user/7476.