1000 Plant Genomes Project


The 1000 Plant Transcriptomes Initiative (1KP) was an international research effort to establish the most detailed catalogue of genetic variation in plants. It was announced in 2008 and headed by Gane Ka-Shu Wong and Michael Deyholos of the University of Alberta. The project successfully sequenced the transcriptomes (expressed genes) of 1,000 different plant species by 2014; its final capstone products were published in 2019.

1KP was a large-scale (involving many organisms) sequencing project designed to take advantage of the wider availability of high-throughput ("next-generation") DNA sequencing technologies. The similar 1000 Genomes Project, for example, obtained high-coverage genome sequences of 1,000 individual people between 2008 and 2015, to better understand human genetic variation. The initiative provided a template for further planetary-scale genome projects, including the 10KP Project (sequencing the whole genomes of 10,000 plants) and the Earth BioGenome Project (aiming to sequence, catalogue, and characterize the genomes of all of Earth's eukaryotic biodiversity).

As of 2002, the number of classified green plant species was estimated to be around 370,000, although there are probably many thousands more yet unclassified. Despite this number, very few of these species have detailed DNA sequence information to date: as of 11 April 2012, GenBank held sequence for 125,426 species, but most (>95%) had DNA sequence for only one or two genes. "...almost none of the roughly half million plant species known to humanity has been touched by genomics at any level". The 1000 Plant Genomes Project aimed to produce a roughly 100x increase in the number of plant species with available broad genome sequence.

There have been efforts to determine the evolutionary relationships between the known plant species, but phylogenies (or phylogenetic trees) built solely from morphological data, cellular structures, single enzymes, or only a few sequences (like rRNA) can be prone to error. Morphological features are especially vulnerable in two situations: when two species look physically similar though they are not closely related (as a result of convergent evolution, for example, rather than homology), and when two closely related species look very different because, for example, they change readily in response to their environment. These situations are very common in the plant kingdom. An alternative method for constructing evolutionary relationships compares changes in the DNA sequence of many genes between the different species, which is often more robust to the problem of similar-appearing species. With the amount of genomic sequence produced by this project, many predicted evolutionary relationships could be tested by sequence alignment to improve their certainty. The final analysis used 383,679 nuclear gene family phylogenies and 2,306 gene age distributions with Ks plots, which were shared in GigaDB alongside the capstone paper.
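The idea of comparing species by aligned DNA sequence can be illustrated with a minimal sketch. It computes the p-distance (the proportion of aligned, non-gap sites that differ) for every pair of species. This is not the project's analysis pipeline; the function names and the toy alignment are hypothetical, chosen only to show how sequence differences can rank how closely species are related.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, non-gap sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differing / compared


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every pair of species in an alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (invented data, not from the project).
toy_alignment = {
    "species_A": "ATGGCCATTG",
    "species_B": "ATGGCCATTA",
    "species_C": "ATGACGTTTA",
}

distances = distance_matrix(toy_alignment)
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

In practice such distances would be computed over many genes, and tree-building methods are considerably more involved than a raw distance comparison.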

The list of plant genomes sequenced in the project was not random; instead, the project focused on plants that produce valuable chemicals or other products (secondary metabolites in many cases), in the hope that characterizing the genes involved would allow the underlying biosynthetic processes to be used or modified. For example, many plants are known to produce oils (like olives), and the oils from certain plants, such as the oil palm and hydrocarbon-producing species, bear a strong chemical resemblance to petroleum products. If these plant mechanisms could be used to produce mass quantities of industrially useful oil, or modified to do so, they would be of great value. Here, knowing the sequence of the plant genes involved in the metabolic pathway producing the oil is a large first step toward such utilization. An example of engineering a natural biochemical pathway is Golden rice, whose pathway has been genetically modified so that a precursor to vitamin A is produced in large quantities, making the golden-colored rice a potential solution for vitamin A deficiency. This concept of engineering plants to do "work" is popular, and its potential would increase dramatically as a result of gene information on these 1000 plant species. Biosynthetic pathways could also be used for mass production of medicinal compounds in plants, rather than by the manual organic chemical reactions through which most are currently made.

One of the most unexpected results of the project was the discovery of multiple novel light-sensitive ion channels, now used extensively for optogenetic control of neurons. They were found through the project's sequencing and physiological characterization of opsins from over 100 algal species. The characterization of these novel channelrhodopsin sequences provided resources for protein engineers who would normally have no interest in, or ability to generate, sequence data from these many plant species. A number of biotech companies are developing these channelrhodopsin proteins for medical purposes, with many of these optogenetic therapy candidates in clinical trials to restore vision in retinal blindness. The first published results of such a treatment, for retinitis pigmentosa, came out in July 2021.

Sequencing was initially done on the Illumina Genome Analyzer GAII next-generation DNA sequencing platform at the Beijing Genomics Institute (BGI Shenzhen, China), but later samples were run on the faster Illumina HiSeq 2000 platform. The institute started with 28 Illumina Genome Analyzer machines, which were eventually upgraded to 100 HiSeq 2000 sequencers. The initial capacity of 3 Gb per run (3 billion base pairs per experiment) on each of these machines enabled fast and accurate sequencing of the plant samples.

The list of plant species to be sequenced was compiled through an international collaboration of the various funding agencies and research groups that expressed interest in certain plants. The focus was on plant species known to have useful biosynthetic capacity, to facilitate the biotechnology goals of the project, with other species selected to fill in gaps and explain some unknown evolutionary relationships in the current plant phylogeny. In addition to industrial compound biosynthetic capacity, plant species known or suspected to produce medically active chemicals (such as poppies producing opiates) were assigned a high priority, to better understand the synthesis process, explore commercial production potential, and discover new pharmaceutical options. A large number of plant species with medicinal properties were selected from traditional Chinese medicine (TCM). The completed list of selected species can be publicly viewed on the website, and the methodology and data access have been published in detail.
