M4.21 - Ontology Tools: new prototype Biodiversity BioPortal

Date: 14/08/2012
Deliverable or Milestone: Milestone

Description: Present a new prototype Biodiversity BioPortal based on the NCBO BioPortal. This platform will be used for hosting vocabularies and ontologies for new data types (in addition to the DwC Occurrence and DwC Taxon).

Partner: GBIF

Deadline: June 2012

Work package: 
Reporter: Dag Endresen
Completed: Completed

Comments

Darwin Core in NCBO BioPortal

Exploring the NCBO BioPortal for publishing biodiversity RDF vocabularies.

ABSTRACT
This report describes a case study in which the Darwin Core RDF resource was loaded into the NCBO BioPortal. The objective of adding the Darwin Core RDF vocabulary to the NCBO BioPortal was to initiate mapping, and gradually build interoperability, between the biodiversity data standards and the biomedical ontologies. The Darwin Core RDF resource declares a basic ontology with deliberately very limited declarations of relationships between the concepts. The Darwin Core RDF vocabulary can be seen as an OWL Full ontology and can be explored as such using the Protégé ontology management software tool.

INTRODUCTION
The NCBO BioPortal [1] provides a software platform for publishing ontologies used in biology and biomedical research. This platform was developed for sharing ontologies expressed in the Web Ontology Language (OWL) (Noy et al., 2009; Whetzel et al., 2011).

Darwin Core provides a widely used vocabulary of terms and concepts for documenting biodiversity information resources (Wieczorek et al., 2012). Darwin Core is ratified as a data standard by the Biodiversity Information Standards (TDWG) and is documented at [2], with a normative RDF representation available at [3].

The "Semantics of biodiversity workshop" [4], organized in May 2012 at the University of Kansas, included participants from the Biodiversity Information Standards (TDWG) technical architecture group (TAG) and the Genomic Standards Consortium (GSC). This workshop proposed to establish a so-called "slice" in the NCBO BioPortal for Biodiversity KOS and ontology resources. Using the NCBO BioPortal was considered more appropriate and effective than the alternative of implementing a new and separate "Biodiversity Information Standards" BioPortal instance. Sharing Biodiversity ontologies with the Biomedical community is more efficient for cross-mapping and interoperability between these related domains of biological information resources. The "Semantics of biodiversity workshop" [4] also initiated the development of an OWL ontology for the Darwin Core terminology [6], based on the Basic Formal Ontology (BFO) [5].

RESULTS
The original Darwin Core RDF vocabulary caused a “parsing error” message when loaded into BioPortal [7,8]. All concepts included in Darwin Core are declared either as properties (rdf:Property) or as classes (rdfs:Class). Neither a domain (rdfs:domain) nor a range (rdfs:range) is declared for any of the Darwin Core concepts. Identifying and removing an undesired system character (a byte-order mark [BOM]) improved parsing when exploring the Darwin Core RDF vocabulary in Protégé, but the “parsing error” on loading into BioPortal remained. A new test, which included only a subset of the Darwin Core concepts and replaced quote characters inside the XML nodes with HTML entities (&quot;), parsed well when loaded into BioPortal. However, when the complete list of Darwin Core concepts was loaded into BioPortal with the quote marks replaced in this manner, the “parsing error” remained. An RDF/SKOS vocabulary with translations of the Darwin Core term descriptions was also loaded into BioPortal [9], but it too caused a “parsing error”.
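
The checks described above (removing a leading BOM, replacing quote characters with entities, and confirming that concepts are declared only as rdf:Property or rdfs:Class without domain or range) could be scripted along the following lines. This is only a minimal sketch using the Python standard library, not the procedure used for this report. The function names and the embedded SAMPLE fragment are hypothetical; the fragment is hand-written for illustration and is not the actual Darwin Core RDF file. The sketch also assumes that declarations are direct children of the rdf:RDF root element.

```python
import codecs
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def strip_bom(raw: bytes) -> bytes:
    """Drop a leading UTF-8 byte-order mark, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def escape_text(text: str) -> str:
    """Escape &, <, > and double quotes for use inside an XML text node."""
    return escape(text, {'"': "&quot;"})


def summarize(raw: bytes) -> dict:
    """Parse RDF/XML and count top-level property and class declarations.

    Raises xml.etree.ElementTree.ParseError if the input is not well-formed.
    """
    root = ET.fromstring(strip_bom(raw))
    props = root.findall(f"{{{RDF}}}Property")
    classes = root.findall(f"{{{RDFS}}}Class")
    # Count properties that declare rdfs:domain or rdfs:range as a child.
    constrained = [
        p for p in props
        if p.find(f"{{{RDFS}}}domain") is not None
        or p.find(f"{{{RDFS}}}range") is not None
    ]
    return {
        "properties": len(props),
        "classes": len(classes),
        "with_domain_or_range": len(constrained),
    }


# Hypothetical two-term fragment, prefixed with a BOM (illustration only).
SAMPLE = codecs.BOM_UTF8 + b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="http://rs.tdwg.org/dwc/terms/Occurrence">
    <rdfs:comment>An &quot;occurrence&quot; record.</rdfs:comment>
  </rdfs:Class>
  <rdf:Property rdf:about="http://rs.tdwg.org/dwc/terms/catalogNumber">
    <rdfs:label>Catalog Number</rdfs:label>
  </rdf:Property>
</rdf:RDF>
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
    print(escape_text('An "occurrence" record.'))
```

A well-formedness check of this kind only shows that a file is valid XML. Since the full vocabulary still failed in BioPortal after both fixes, the remaining cause presumably lay elsewhere and would not be detected by such a script.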

[1] http://bioportal.bioontology.org/
[2] http://rs.tdwg.org/dwc/terms/index.htm
[3] http://rs.tdwg.org/dwc/rdf/
[4] http://www.biocodecommons.org/workshops/sob.html
[5] http://bioportal.bioontology.org/virtual/1332
[6] not yet published
[7] http://bioportal.bioontology.org/projects/168
[8] http://bioportal.bioontology.org/virtual/3058
[9] http://bioportal.bioontology.org/virtual/3085

REFERENCES

Noy, N.F., N.H. Shah, P.L. Whetzel, B. Dai, M. Dorf, N. Griffith, C. Jonquet, D.L. Rubin, M.-A. Storey, C.G. Chute, and M.A. Musen (2009). BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research 37:W170–W173. doi:10.1093/nar/gkp440 PMCID: PMC2703982. Available at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2703982/

Whetzel, P.L., N.F. Noy, N.H. Shah, P.R. Alexander, C. Nyulas, T. Tudorache, and M.A. Musen (2011). BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Research 39:W541–W545. doi:10.1093/nar/gkr469. Available at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3125807/

Wieczorek, J., D. Bloom, R. Guralnick, S. Blum, M. Döring, R. De Giovanni, T. Robertson, and D. Vieglais (2012). Darwin Core: An Evolving Community-developed Biodiversity Data Standard. PLoS ONE 7(1):e29715. doi:10.1371/journal.pone.0029715.

Broken link

Note that the link (ref 7 above) to http://bioportal.bioontology.org/projects/168 does not resolve.