Data Aggregation

Gregor Hagedorn & Andreas Plank (JKI)

KEY RESOURCES

Explore a prototype interface
Hagedorn, Gregor, Daniel Mietchen, Robert Morris, Donat Agosti, Lyubomir Penev, Walter Berendsohn, and Donald Hobern. "Creative Commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information." ZooKeys 150 (2011): 127-149. http://dx.doi.org/10.3897/zookeys.150.2189
Berendsohn, Walter, Anton Güntsch, Niels Hoffmann, Andreas Kohlbecker, Katja Luther, and Andreas Müller. "Biodiversity information platforms: From standards to interoperability." ZooKeys 150 (2011): 71-87. http://dx.doi.org/10.3897/zookeys.150.2166

ViBRANT is developing a data delivery service with an open data strategy that will allow users to deliver their data to individuals or to major global biodiversity informatics initiatives such as PESI, EoL and GBIF.

We have explored using MediaWiki, a free open-source wiki package, together with its semantic web extensions. Semantic MediaWiki turns the familiar wiki page into a powerful and flexible “collaborative database”. Data created within Semantic MediaWiki can be used seamlessly by other systems. In addition, Semantic Forms allows users to add, edit and query data using forms without needing to do any programming themselves.

The system provides a mark-up mechanism to enrich simple text data with defined semantic properties (each linked to a concept) and to combine human-readable text content with semantic mark-up (see also manuscript publishing and literature mark-up). The semantic mark-up is used for reasoning within the wiki and is also exposed as RDF (Resource Description Framework), a standard for encoding metadata and other knowledge on the Semantic Web, so that third parties can apply the full power of semantic machine reasoning.
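To make the idea of combining readable text with semantic mark-up more concrete, the following minimal Python sketch reads Semantic MediaWiki inline annotations of the form [[property::value]] (optionally [[property::value|display text]]) and separates them into the human-readable sentence and a list of subject/property/value statements. The sample sentence, the property names "part of" and "related term", and the page name "Petal" are hypothetical and chosen only for illustration; the sketch is not the parser used by Semantic MediaWiki itself.

```python
import re

# Inline annotation syntax: [[property::value]] or [[property::value|display text]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(wikitext):
    """Return the (property, value) pairs annotated in the wikitext."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in ANNOTATION.finditer(wikitext)]


def strip_markup(wikitext):
    """Return the human-readable text with the annotation syntax removed."""
    def shown(match):
        # Prefer the display text after "|" when present, else the value.
        display = match.group(3)
        return display if display is not None else match.group(2)
    return ANNOTATION.sub(shown, wikitext)


def to_triples(page_name, wikitext):
    """Pair every annotation with the page it occurs on (the subject)."""
    return [(page_name, prop, value)
            for prop, value in extract_annotations(wikitext)]


# Hypothetical wiki text for a glossary page called "Petal".
text = ("A petal is a part of the [[part of::corolla]]; "
        "compare [[related term::sepal|sepals]].")

if __name__ == "__main__":
    print(strip_markup(text))
    for triple in to_triples("Petal", text):
        print(triple)
```

The same statements are what an RDF export would carry, with page and property names replaced by URIs.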

  • The MediaWiki and server setup have been optimised for ViBRANT use.
  • The Semantic MediaWiki extension has been installed and extensively tested.
  • A prototype interface was developed at http://terms.gbif.org/ to illustrate the possibilities of a collaborative community interface for ontology standardisation.
  • An example vocabulary, TaxPub (mark-up for publishing), has been imported and annotated on the ViBRANT platform in cooperation with the ViBRANT partner Pensoft.
  • As an external vocabulary to be used in the definition of new terms, the mapping relation definitions of the Simple Knowledge Organisation System (SKOS) have been imported. In Semantic MediaWiki, external ontology vocabularies can be reused by creating a special import definition. Terms can then be related to each other internally by setting up sub-property relationships. Local term definitions within the wiki can be exported using the Resource Description Framework (RDF) export function of Semantic MediaWiki, so that they can be read, for instance, by RDF/ontology browsers.
  • To facilitate data input by biologists, web forms are provided that help users fill in appropriate data without having to know the technical background or the syntax of semantic properties. Suitable values can be offered as selectable options, or earlier input can be stored internally in semantic properties and proposed through automatic word completion as the user types.
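As an illustration of the SKOS mapping relations mentioned in the list above, the following minimal Python sketch writes mappings between a local term and an external term as RDF in Turtle syntax. The SKOS namespace and the five mapping properties are taken from the SKOS standard; the example.org URIs and the helper names are hypothetical, and the output only approximates the kind of statement an RDF export could contain, not the exact output of Semantic MediaWiki.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# The five SKOS mapping properties defined by the SKOS standard.
SKOS_MAPPING_RELATIONS = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def mapping_to_turtle(local_term, relation, external_term):
    """Return one Turtle statement linking two term URIs via a SKOS relation."""
    if relation not in SKOS_MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping relation: {relation}")
    return f"<{local_term}> skos:{relation} <{external_term}> ."


def serialise(mappings):
    """Return a small Turtle document: prefix declaration, then the statements."""
    lines = [f"@prefix skos: <{SKOS_NS}> .", ""]
    lines += [mapping_to_turtle(*mapping) for mapping in mappings]
    return "\n".join(lines)


# Hypothetical URIs: a local wiki term mapped to a term in an external vocabulary.
example_mappings = [
    ("http://example.org/wiki/Author", "closeMatch",
     "http://example.org/taxpub#author"),
]

if __name__ == "__main__":
    print(serialise(example_mappings))
```

Restricting the relation to a fixed set keeps misspelt property names out of the exported data.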

Future work

We need to find a community of interested users with whom to collaborate in integrating this approach with the Drupal-based ViBRANT Scratchpads and the identification tools. An ideal case would be a richly illustrated (or to-be-illustrated) glossary-like vocabulary that is available as open content (Creative Commons CC BY, CC BY-SA, or CC0). If you are interested in such a collaboration, please contact Gregor Hagedorn.