Genome skimming

Genome skimming is a sequencing approach that uses low-pass, shallow sequencing of a genome (up to 5%) to generate fragments of DNA known as genome skims. These genome skims contain information about the high-copy fraction of the genome, which consists of the ribosomal DNA, the plastid genome (plastome), the mitochondrial genome (mitogenome), and nuclear repeats such as microsatellites and transposable elements. The approach employs high-throughput, next-generation sequencing technology to generate these skims. Although these skims are merely 'the tip of the genomic iceberg', phylogenomic analysis of them can still provide insights into evolutionary history and biodiversity at a lower cost and larger scale than traditional methods. Because genome skimming requires only a small amount of DNA, its methodology can be applied in fields other than genomics. Such applications include determining the traceability of products in the food industry, enforcing international regulations regarding biodiversity and biological resources, and forensics.

In addition to the assembly of the smaller organellar genomes, genome skimming can also be used to uncover conserved ortholog sequences for phylogenomic studies. In phylogenomic studies of multicellular pathogens, genome skimming can be used to find effector genes, discover endosymbionts and characterize genomic variation.

The internal transcribed spacers (ITS) are non-coding regions within the 18S-5.8S-28S rDNA of eukaryotes and are one feature of rDNA that has been used in genome skimming studies. ITS are used to distinguish species within a genus because of their high inter-species variability. They have low variability among individuals, which prevents the identification of distinct strains or individuals. They are also present in all eukaryotes, have a high evolution rate, and have been used in phylogenetic analysis between and across species.

When targeting nuclear rDNA, a minimum final sequencing depth of 100X is suggested, and sequences with less than 5X depth should be masked.
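The depth guidance above can be illustrated with a minimal Python sketch. It assumes that per-base depths are already available as a list aligned with a consensus sequence, that "final sequencing depth" is read as the mean depth over the target, and that masking means replacing a base with `N`. The function names and the example data are hypothetical and do not come from any particular tool.

```python
from statistics import mean


def mask_low_depth(consensus: str, depths: list[int],
                   min_depth: int = 5, mask_char: str = "N") -> str:
    """Replace every base whose depth is below min_depth with mask_char."""
    if len(consensus) != len(depths):
        raise ValueError("consensus and depths must have the same length")
    return "".join(
        base if depth >= min_depth else mask_char
        for base, depth in zip(consensus, depths)
    )


def meets_target_depth(depths: list[int], target: float = 100) -> bool:
    """Check whether the mean depth reaches the suggested target (assumed: mean)."""
    return mean(depths) >= target


# Hypothetical toy data: an 8-base rDNA consensus and its per-base depths.
consensus = "ACGTACGT"
depths = [120, 130, 4, 110, 2, 140, 150, 160]

masked = mask_low_depth(consensus, depths)   # positions with depth 4 and 2 become "N"
depth_ok = meets_target_depth(depths)        # mean depth here is 102
print(masked, depth_ok)
```

In practice the per-base depths would come from a read alignment; how they are computed is outside the scope of this sketch.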

The plastid genome, or plastome, has been used extensively in identification and evolutionary studies using genome skimming because of its high abundance within plants (~3-5% of cell DNA), small size, simple structure, and greater conservation of gene structure compared with nuclear or mitochondrial genes. Plastid studies have previously been limited by the number of regions that could be assessed with traditional approaches. With genome skimming, the entire plastome can be sequenced at a fraction of the cost and time required by typical sequencing approaches such as Sanger sequencing. Plastomes have been suggested as a replacement for traditional DNA barcodes in plants, such as the rbcL and matK barcode genes. Compared to the typical DNA barcode, genome skimming produces plastomes at a tenth of the cost per base. Recent uses of genome skims of plastomes have allowed greater resolution of phylogenies, higher differentiation of specific groups within taxa, and more accurate estimates of biodiversity. Additionally, the plastome has been used to compare species within a genus to examine evolutionary changes and diversity within a group.
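To see why a high-copy fraction such as the plastome can be recovered from shallow sequencing, a back-of-the-envelope depth estimate helps. The sketch below assumes reads are spread uniformly over the plastome and uses the lower bound of the ~3-5% abundance figure given above. The read count, read length, and the 150 kb plastome size are illustrative assumptions, not values from the text.

```python
def expected_organelle_depth(n_reads: int, read_length: int,
                             organelle_fraction: float,
                             organelle_size: int) -> float:
    """Expected mean depth = (bases sequenced from the organelle) / (organelle size).

    Assumes uniform coverage and that organelle_fraction of all
    sequenced bases originate from the organelle genome.
    """
    total_bases = n_reads * read_length
    return total_bases * organelle_fraction / organelle_size


# Hypothetical skim: 1 million reads of 150 bp, 3% plastid DNA,
# and an assumed plastome size of 150 kb.
depth = expected_organelle_depth(
    n_reads=1_000_000,
    read_length=150,
    organelle_fraction=0.03,
    organelle_size=150_000,
)
print(f"expected plastome depth: about {depth:.0f}X")
```

Under these assumptions, a skim that covers only a small part of the nuclear genome still yields roughly 30X over the plastome. Real depth varies with tissue, species, and library preparation.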

When targeting plastomes, it is suggested that a minimum final sequencing depth of 30X is achieved for single-copy regions to ensure high-quality assemblies. Single nucleotide polymorphisms (SNPs) with less than 20X depth should be masked.
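The two plastome thresholds can be sketched in Python as follows. The sketch assumes that single-copy regions are given as 0-based, half-open `(start, end)` intervals, that the 30X figure is checked against the mean depth over those regions, and that a low-depth SNP is masked by writing `N` at its position. All names and the toy data are hypothetical.

```python
from statistics import mean


def single_copy_depth_ok(depths: list[int],
                         single_copy_regions: list[tuple[int, int]],
                         min_mean_depth: float = 30) -> bool:
    """Check the mean depth over the single-copy regions against the threshold."""
    values = [d for start, end in single_copy_regions for d in depths[start:end]]
    return bool(values) and mean(values) >= min_mean_depth


def mask_low_depth_snps(consensus: str, snp_positions: list[int],
                        depths: list[int], min_snp_depth: int = 20) -> str:
    """Mask only SNP positions whose depth falls below min_snp_depth."""
    seq = list(consensus)
    for pos in snp_positions:
        if depths[pos] < min_snp_depth:
            seq[pos] = "N"
    return "".join(seq)


# Hypothetical toy data: 10 bases, three called SNPs, two single-copy regions.
consensus = "ACGTACGTAC"
depths = [40, 35, 18, 50, 45, 12, 60, 33, 25, 41]
snp_positions = [2, 5, 8]
single_copy_regions = [(0, 4), (6, 10)]

assembly_ok = single_copy_depth_ok(depths, single_copy_regions)
masked = mask_low_depth_snps(consensus, snp_positions, depths)
print(assembly_ok, masked)
```

Non-SNP positions are left untouched here, since the guidance above applies the 20X cutoff specifically to SNPs.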

The mitochondrial genome, or mitogenome, is used as a molecular marker in a great variety of studies because of its maternal inheritance, high copy number in the cell, lack of recombination, and high mutation rate. It is often used for phylogenetic studies because it is very uniform across metazoan groups: a circular, double-stranded DNA molecule of about 15 to 20 kilobases carrying 37 genes (2 ribosomal RNA genes, 13 protein-coding genes, and 22 transfer RNA genes). Mitochondrial barcode sequences, such as COI, NADH2, 16S rRNA, and 12S rRNA, can also be used for taxonomic identification. The increasing publication of complete mitogenomes allows robust phylogenies to be inferred across many taxonomic groups and can capture events such as gene rearrangements and the positioning of mobile genetic elements. By using genome skimming to assemble complete mitogenomes, the phylogenetic history and biodiversity of many organisms can be resolved.

When targeting mitogenomes, there are no specific suggestions for minimum final sequencing depth, because mitogenomes are more variable in size and complexity in plant species, which makes repeated sequences harder to assemble. However, highly conserved coding sequences and nonrepetitive flanking regions can be assembled using reference-guided assembly. Sequences should be masked in the same way as for plastomes and nuclear ribosomal DNA.
