Biological database
Biological databases are libraries of biological information, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis.[citation needed] They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. Information contained in biological databases includes gene function, structure, localization (both cellular and chromosomal), and clinical effects of mutations, as well as similarities of biological sequences and structures.
Biological databases can be classified by the kind of data they collect (see below). Broadly, there are molecular databases (for sequences, molecules, etc.), functional databases (for physiology, enzyme activities, phenotypes, ecology, etc.), taxonomic databases (for species and other taxonomic ranks), databases of images and other media, and specimen databases (for museum collections, etc.).
Databases are important tools that help scientists analyze and explain a host of biological phenomena, from the structure of biomolecules and their interactions, to the whole metabolism of organisms, to the evolution of species. This knowledge aids the fight against diseases, the development of medications, the prediction of certain genetic diseases, and the discovery of basic relationships among species in the history of life.
Relational database concepts from computer science and information retrieval concepts from digital libraries are important for understanding biological databases. Biological database design, development, and long-term management are a core area of the discipline of bioinformatics. Data contents include gene sequences, textual descriptions, attributes and ontology classifications, citations, and tabular data. These are often described as semi-structured data, and can be represented as tables, key-delimited records, and XML structures.[citation needed]
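The three representations named above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the accession `EX000001`, the field names, the `//` record terminator, and the helper functions are hypothetical and do not reproduce the format of any particular database.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; field names and values are illustrative only.
record = {
    "accession": "EX000001",
    "organism": "Escherichia coli",
    "description": "example gene entry",
    "sequence": "ATGGCGTAA",
}

# Tabular view: one header of column names and one row of values.
columns = list(record)
row = tuple(record.values())


def to_key_delimited(rec):
    """Render a record as 'KEY: value' lines, closed by a '//' terminator."""
    lines = [f"{key.upper()}: {value}" for key, value in rec.items()]
    return "\n".join(lines + ["//"])


def from_key_delimited(text):
    """Parse 'KEY: value' lines back into a dict, stopping at '//'."""
    rec = {}
    for line in text.splitlines():
        if line.strip() == "//":
            break
        key, _, value = line.partition(": ")
        rec[key.lower()] = value
    return rec


def to_xml(rec):
    """Render a record as an <entry> element with one child per field."""
    entry = ET.Element("entry")
    for key, value in rec.items():
        ET.SubElement(entry, key).text = value
    return ET.tostring(entry, encoding="unicode")


def from_xml(xml_text):
    """Parse an <entry> element back into a dict of field name to text."""
    entry = ET.fromstring(xml_text)
    return {child.tag: child.text for child in entry}


flat = to_key_delimited(record)
xml_text = to_xml(record)
```

Both `from_key_delimited(flat)` and `from_xml(xml_text)` recover the original dictionary, which is the sense in which the same semi-structured content can move between the three forms.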
Most biological databases are available through web sites that organise the data so that users can browse it online. In addition, the underlying data is usually available for download in a variety of formats. Biological data comes in many formats, including text, sequence data, protein structures, and links, each of which is available from particular sources.[citation needed]
Biological knowledge is distributed among countless databases. This sometimes makes it difficult to ensure the consistency of information, e.g. when different names are used for the same species or when data formats differ. As a consequence, interoperability is a constant challenge for information exchange. For instance, if a DNA sequence database stores a DNA sequence alongside the name of a species, a name change of that species may break the links to other databases that use a different name. Integrative bioinformatics is one field attempting to tackle this problem by providing unified access. One solution is for biological databases to cross-reference other databases using accession numbers, linking their related knowledge together so that the link stays the same even if a species name changes. Redundancy is another problem, as many databases must store the same information, e.g. protein structure databases also contain the sequences of the proteins they cover and their bibliographic information.
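The difference between linking by species name and linking by accession number can be shown with a small Python sketch. The two in-memory "databases", the accession `EX000001`, the species names, and the structure identifier `S1` are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical sequence database, keyed by accession number.
sequence_db = {
    "EX000001": {"species": "Oldgenus examplea", "sequence": "ATGGCGTAA"},
}

# A second, hypothetical database linked in two different ways:
# once by species name, once by the sequence database's accession number.
structure_db_by_name = {"Oldgenus examplea": {"structure_id": "S1"}}
structure_db_by_accession = {"EX000001": {"structure_id": "S1"}}


def rename_species(db, accession, new_name):
    """Simulate a taxonomic revision applied to the sequence database only."""
    db[accession]["species"] = new_name


def lookup_by_name(accession):
    """Follow the cross-reference using the species name as the join key."""
    name = sequence_db[accession]["species"]
    return structure_db_by_name.get(name)


def lookup_by_accession(accession):
    """Follow the cross-reference using the stable accession number."""
    return structure_db_by_accession.get(accession)


before_name = lookup_by_name("EX000001")
before_acc = lookup_by_accession("EX000001")

# The species is renamed in one database but not in the other.
rename_species(sequence_db, "EX000001", "Newgenus examplea")

after_name = lookup_by_name("EX000001")      # link is now broken
after_acc = lookup_by_accession("EX000001")  # link still resolves
```

After the rename, the name-based lookup finds nothing because the second database still uses the old name, while the accession-based lookup is unaffected.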
Species-specific databases are available for some species, mainly those that are often used in research (model organisms). For example, EcoCyc is an E. coli database. Other popular model organism databases include Mouse Genome Informatics for the laboratory mouse, Mus musculus, the Rat Genome Database for Rattus, ZFIN for Danio rerio (zebrafish), PomBase for the fission yeast Schizosaccharomyces pombe, FlyBase for Drosophila, WormBase for the nematodes Caenorhabditis elegans and Caenorhabditis briggsae, and Xenbase for Xenopus tropicalis and Xenopus laevis frogs.
Numerous databases attempt to document the diversity of life on Earth. A prominent example is the Catalogue of Life, first created in 2001 by Species 2000 and the Integrated Taxonomic Information System. It is a collaborative project that aims to document the taxonomic categorization of all currently accepted species in the world, providing a consolidated and consistent database for researchers and policymakers to reference. The Catalogue of Life curates up-to-date datasets from other sources such as the Conifer Database, ICTV MSL (for viruses), and LepIndex (for butterflies and moths). In total, it drew from 165 databases as of May 2022. Its operational costs are paid for by the Global Biodiversity Information Facility, the Illinois Natural History Survey, the Naturalis Biodiversity Center, and the Smithsonian Institution.
