UCSC Genome Browser


The UCSC Genome Browser is an online and downloadable genome browser hosted by the University of California, Santa Cruz (UCSC). It is an interactive website offering access to genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations. The Browser is a graphical viewer optimized to support fast interactive performance and is an open-source, web-based tool suite built on top of a MySQL database for rapid visualization, examination, and querying of the data at many levels. The Genome Browser Database, browsing tools, downloadable data files, and documentation can all be found on the UCSC Genome Bioinformatics website.

The UCSC Genome Browser was developed in 2000 by graduate student Jim Kent and Professor David Haussler at the University of California, Santa Cruz (UCSC), to provide public access to the draft human genome sequence produced by the Human Genome Project. On July 7, 2000, UCSC released the first working draft of the human genome online, accompanied by an initial version of the Genome Browser. This release enabled researchers worldwide to access and explore the genome data interactively. The project received early funding from the Howard Hughes Medical Institute (HHMI) and the National Human Genome Research Institute (NHGRI). In 2002, the team published a detailed description of the Genome Browser in Genome Research, outlining its MySQL-based database and web interface. The browser featured various aligned annotation tracks, including gene predictions, mRNA/EST alignments, and SNP markers, all presented in a scrollable view. Users could also add custom tracks to visualize their data alongside official annotations. In that same year, the browser expanded to include the mouse genome, facilitating comparative genomics studies. Tools like BLAT (BLAST-like alignment tool) and LiftOver were introduced to enhance sequence alignment and coordinate conversion between different genome assemblies.

Between 2004 and 2010, the UCSC Genome Browser incorporated numerous additional genomes, including those of rat, chicken, dog, and chimpanzee. The development of chain and net alignment algorithms allowed for whole-genome alignments between species, and the Conservation track visualized evolutionarily conserved elements. To accommodate the influx of data from new genomic technologies, UCSC introduced Genome Graphs in 2007–2008, enabling users to plot genome-wide datasets, such as association study p-values, across entire genomes. In 2010, the browser also implemented the BigBed and BigWig binary data formats, facilitating efficient visualization of large-scale sequencing datasets.

In 2011, UCSC launched Track Data Hubs, allowing external researchers to integrate their annotation tracks into the Genome Browser via remote URLs. This feature significantly enhanced how researchers could interact with and visualize large-scale genomic datasets. UCSC has also played a pivotal role in the ENCODE (Encyclopedia of DNA Elements) project since its launch in 2003: the browser hosted a vast array of functional genomics data generated by ENCODE, including ChIP-seq, RNA-seq, and DNase hypersensitivity assays. The browser also integrated data from the 1000 Genomes Project, providing comprehensive access to human genetic variation data. In 2013, UCSC partnered with the GENCODE project to adopt its high-quality gene annotations. In 2015, the GENCODE gene set for the GRCh38/hg38 assembly replaced UCSC's in-house track as the default gene set of the human genome browser.

Beginning in 2016, the UCSC Genome Browser expanded its capabilities by integrating clinical and variant datasets, including those from ClinVar and various cancer genomics resources. In 2017, UCSC launched the UCSC Cell Browser, a companion platform designed to handle single-cell sequencing and spatial transcriptomics datasets. The browser has also integrated data from the Genotype-Tissue Expression (GTEx) project, providing visualization resources for gene expression across various human tissues. The browser now hosts over 180 genome assemblies from more than 100 species, including the complete telomere-to-telomere human genome assembly (T2T-CHM13) released by the T2T Consortium in 2022. Funding for the UCSC Genome Browser has transitioned to rely exclusively on NIH grants, with continued support from the NHGRI. In 2022, the browser was recognized as one of the inaugural Global Core Biodata Resources, highlighting its critical role in life science research and ensuring prioritized long-term funding. As of 2025, the UCSC Genome Browser continues to serve as an essential, freely accessible tool for researchers worldwide, serving tens of thousands of users daily and regularly adding new genomic data and functionality.

In the years since its inception, the UCSC Browser has expanded to accommodate genome sequences of all vertebrate species and selected invertebrates for which high-coverage genomic sequences are available, now including 108 species. High coverage is necessary so that overlaps between sequenced fragments can guide the construction of larger contiguous regions. Genomic sequences with less coverage are included in multiple-alignment tracks on some browsers (multiple-alignment tracks are described further below), but the fragmented nature of these assemblies makes them unsuitable for building full-featured browsers. The species hosted with full-featured genome browsers are shown in the table. Updates to this section depend on new genome releases from sequencing centers; there was a two-year gap between the last two genome additions.
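The link between coverage and contiguity can be illustrated with a toy model. The sketch below is not how a real assembler works: it assumes each sequenced fragment's position on the genome is already known and represents it as a hypothetical `(start, end)` interval, whereas real assemblers must detect overlaps from the sequence itself. It only shows why more overlapping fragments yield fewer, longer contiguous regions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive


def merge_into_contigs(fragments: List[Interval]) -> List[Interval]:
    """Join fragments that share at least one base into contiguous regions.

    Fragments that merely touch (end == next start) are NOT joined,
    because without shared sequence there is nothing to anchor the join.
    """
    contigs: List[List[int]] = []
    for start, end in sorted(fragments):
        if contigs and start < contigs[-1][1]:
            # Overlaps the current contig: extend it if needed.
            contigs[-1][1] = max(contigs[-1][1], end)
        else:
            # No overlap: a new, separate contig begins here.
            contigs.append([start, end])
    return [(s, e) for s, e in contigs]


# Hypothetical low-coverage sample: gaps leave three separate pieces.
low_coverage = [(0, 100), (150, 250), (400, 500)]

# Adding fragments that bridge the gaps joins everything into one region.
high_coverage = low_coverage + [(80, 170), (230, 420)]

print(merge_into_contigs(low_coverage))
print(merge_into_contigs(high_coverage))
```

With the low-coverage input the region stays split into three pieces; the two extra bridging fragments are enough to join it into a single contiguous region.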

Apart from these 108 species and their assemblies, the UCSC Genome Browser also offers Assembly Hubs: web-accessible directories of genomic data that can be viewed on the browser and that include assemblies not hosted natively on it. There, users can load and annotate unique assemblies for which UCSC does not provide an annotation database. A full list of species and their assemblies can be viewed in the GenArk Portal, which includes 2,589 assemblies hosted across the UCSC Genome Browser database and Assembly Hubs. An example can be seen in the Vertebrate Genomes Project assembly hub.

The large amount of data about biological systems that is accumulating in the literature makes it necessary to collect and digest information using the tools of bioinformatics. The UCSC Genome Browser presents a diverse collection of annotation datasets (known as "tracks" and presented graphically), including mRNA alignments, mappings of DNA repeat elements, gene predictions, gene-expression data, disease-association data (representing the relationships of genes to diseases), and mappings of commercially available gene chips (e.g., Illumina and Agilent). The basic paradigm of display is to show the genome sequence in the horizontal dimension, and show graphical representations of the locations of the mRNAs, gene predictions, etc. Blocks of color along the coordinate axis show the locations of the alignments of the various data types. The ability to show this large variety of data types on a single coordinate axis makes the browser a handy tool for the vertical integration of the data.
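The display paradigm described above (one horizontal coordinate axis, with each annotation type drawn as blocks along it) can be sketched in a few lines. This is a minimal text-only illustration, not the browser's actual rendering code; the function names, track names, and coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive


def render_track(intervals: List[Interval], view_start: int,
                 view_end: int, width: int = 60) -> str:
    """Draw one track as a row of '#' blocks across the viewed region."""
    span = view_end - view_start
    row = ["."] * width
    for start, end in intervals:
        # Clip the feature to the visible window.
        lo = max(start, view_start)
        hi = min(end, view_end)
        if lo >= hi:
            continue  # feature lies entirely outside the window
        # Map genome coordinates to character columns.
        first = (lo - view_start) * width // span
        last = ((hi - view_start) * width - 1) // span
        for col in range(first, last + 1):
            row[col] = "#"
    return "".join(row)


def render_view(tracks: Dict[str, List[Interval]], view_start: int,
                view_end: int, width: int = 60) -> str:
    """Stack several tracks on the same coordinate axis, one per line."""
    label_width = max(len(name) for name in tracks)
    lines = []
    for name, intervals in tracks.items():
        row = render_track(intervals, view_start, view_end, width)
        lines.append(f"{name:<{label_width}} {row}")
    return "\n".join(lines)


# Hypothetical annotations on an arbitrary 1,000-base window.
example_tracks = {
    "mRNA": [(100, 300), (450, 700)],
    "gene prediction": [(90, 320), (440, 720)],
    "repeats": [(350, 400), (800, 950)],
}
print(render_view(example_tracks, 0, 1000))
```

Because every row shares the same mapping from genome position to column, features from different datasets that cover the same region line up vertically, which is what makes side-by-side comparison of data types possible.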
