DBpedia
DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. This structured information is made available on the World Wide Web using OpenLink Virtuoso. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.
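The semantic querying mentioned above happens over DBpedia's public SPARQL endpoint (`https://dbpedia.org/sparql`), which accepts a query as an HTTP GET parameter. A minimal sketch of assembling such a request in Python, using a real DBpedia ontology property (`dbo:birthPlace`); the specific resource IRI is just an illustrative example:

```python
from urllib.parse import urlencode

# A SPARQL query asking for the birth place of one Wikipedia-derived
# resource. IRIs follow DBpedia's standard naming scheme.
query = """
SELECT ?place WHERE {
  <http://dbpedia.org/resource/Tim_Berners-Lee>
      <http://dbpedia.org/ontology/birthPlace> ?place .
}
"""

# DBpedia's endpoint takes the query as a GET parameter; requesting
# JSON results keeps the response easy to parse programmatically.
endpoint = "https://dbpedia.org/sparql"
params = urlencode({
    "query": query,
    "format": "application/sparql-results+json",
})
request_url = f"{endpoint}?{params}"
```

Fetching `request_url` with any HTTP client would return a JSON result set; the sketch stops before the network call to stay self-contained.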
The project was heralded as "one of the more famous pieces" of the decentralized linked data effort by Tim Berners-Lee, the inventor of the World Wide Web. As of June 2021, DBpedia contained over 850 million semantic triples.
The project was started by people at the Free University of Berlin and Leipzig University in collaboration with OpenLink Software, and is now maintained by people at the University of Mannheim and Leipzig University. The first publicly available dataset was published in 2007. The data is made available under free licenses (CC BY-SA), allowing others to reuse the dataset; it does not use an open data license to waive the sui generis database rights.
Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables (the pull-out panels that appear in the top right of the default view of many Wikipedia articles, or at the start of the mobile versions), categorization information, images, geo-coordinates and links to external web pages. This structured information is extracted and put in a uniform dataset which can be queried.
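The infobox extraction described above can be illustrated with a toy sketch. Real extraction must cope with nested templates, wiki markup in values, and many edge cases; this simplified version assumes a flat `|key = value` layout, which is the common shape of infobox wikitext:

```python
import re

# A simplified slice of infobox wikitext as it appears in article source.
wikitext = """{{Infobox person
| name       = Tim Berners-Lee
| birth_date = 8 June 1955
| occupation = Computer scientist
}}"""

def extract_infobox(text):
    """Pull flat |key = value pairs out of infobox wikitext into a dict."""
    pairs = re.findall(r"^\|\s*(\w+)\s*=\s*(.+?)\s*$", text, re.MULTILINE)
    return dict(pairs)

info = extract_infobox(wikitext)
# info["name"] -> "Tim Berners-Lee"
```

Each extracted key/value pair then becomes one or more RDF triples in the uniform dataset.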
The 2016-04 release of the DBpedia data set describes 6.0 million entities, out of which 5.2 million are classified in a consistent ontology, including 1.5 million persons, 810,000 places, 135,000 music albums, 106,000 films, 20,000 video games, 275,000 organizations, 301,000 species and 5,000 diseases. DBpedia uses the Resource Description Framework (RDF) to represent extracted information and consists of 9.5 billion RDF triples, of which 1.3 billion were extracted from the English Wikipedia and 5.0 billion from other language editions.
From this data set, information spread across multiple pages can be combined. For example, book authorship can be assembled from triples extracted from the page about the work and the page about its author.
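Cross-page assembly can be sketched with plain tuples standing in for RDF triples. The data below is a hypothetical toy extract (the `dbr:`/`dbo:` prefixes mirror DBpedia's naming conventions, but the triples are illustrative, not pulled from the live dataset):

```python
# Toy triples standing in for extracts from two separate Wikipedia pages.
work_page_triples = [
    ("dbr:Weaving_the_Web", "dbo:author", "dbr:Tim_Berners-Lee"),
    ("dbr:Weaving_the_Web", "rdf:type", "dbo:Book"),
]
author_page_triples = [
    ("dbr:Tim_Berners-Lee", "dbo:birthPlace", "dbr:London"),
]

# Merging the two sets lets one query traverse both pages:
# "which works have an author born in London?"
triples = work_page_triples + author_page_triples

works_by_londoners = [
    s for s, p, o in triples
    if p == "dbo:author"
    and (o, "dbo:birthPlace", "dbr:London") in triples
]
```

The join key is simply the shared resource IRI for the author, which is what makes information scattered across pages queryable as one graph.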
One of the challenges in extracting information from Wikipedia is that the same concept can be expressed through different parameters in infoboxes and other templates, such as |birthplace= and |placeofbirth=. A query about where people were born would therefore have to search for both properties to get complete results. To address this, the DBpedia Mapping Language was developed to map such properties to a common ontology property, reducing the number of synonyms. Because of the large diversity of infoboxes and properties in use on Wikipedia, the process of developing and improving these mappings has been opened to public contributions.
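The effect of such a mapping can be sketched as a lookup table that folds several raw infobox parameter names into one ontology property. The table and the `normalize` helper below are hypothetical illustrations in the spirit of the DBpedia Mapping Language, not its actual syntax:

```python
# Hypothetical mapping table: several raw infobox parameter
# names all resolve to the same ontology property.
PROPERTY_MAP = {
    "birthplace": "dbo:birthPlace",
    "placeofbirth": "dbo:birthPlace",
    "birth_place": "dbo:birthPlace",
}

def normalize(raw_pairs):
    """Rewrite raw infobox keys to ontology properties; drop unmapped keys."""
    return {
        PROPERTY_MAP[k]: v
        for k, v in raw_pairs.items()
        if k in PROPERTY_MAP
    }

# Two articles using different parameter names now yield the same
# property, so a single query over dbo:birthPlace covers both.
a = normalize({"birthplace": "London"})
b = normalize({"placeofbirth": "London"})
```

After normalization, a query no longer needs to enumerate every parameter spelling used across Wikipedia's templates.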
Version 2014 was released in September 2014. The main change from previous versions was how abstract texts were extracted: running a local mirror of Wikipedia and retrieving rendered abstracts from it made the extracted texts considerably cleaner. A new data set extracted from Wikimedia Commons was also introduced.