Communal Literature

David Morse & David King (OU), Guido Sautter (KIT)

Explore RefBank
King, David, David Morse, Alistair Willis, and Anton Dil. "Towards the bibliography of life." ZooKeys 150 (2011): 151-166.
Data Format Report

ViBRANT recognised the need for a bibliography of life, i.e. a freely accessible bibliography of every taxonomic paper ever published. None of the currently available aggregators was satisfactory, so we chose to extend the Plazi bibliographic tool, RefBank. There are two primary reasons for this choice. First, the original developer, Guido Sautter, is a partner in ViBRANT. Second, RefBank contains a parsing tool that turns Rod Page's "cryptic text strings" into structured references, which can easily be transformed into any of the other conventional forms (see the Data Format Report).

The bulk of RefBank's growth to date has come from ViBRANT-contributed references: 80,000 references were accumulated in the first six months of operation and another 85,000 in the second six months. Work continues within ViBRANT to extract bibliographies from published works and parse them to generate more references, so that RefBank is seeded with enough references at launch to engage users. A significant development came in Autumn 2012, when RefBank was the subject of a presentation and demonstration at TDWG 2012. Since then we have seen the addition of community-contributed references. A more formal launch of RefBank, probably in conjunction with related ViBRANT-developed tools, is planned for Summer/Autumn 2013.


Screenshot showing the results of a RefBank search for references where Linnaeus is the author.

Further work

The existing simplistic, but functional, user interface (UI) will be revised in conjunction with tighter coupling to the Scratchpads infrastructure. This work depends on development of the REFinder search tool, because REFinder can provide both a new front-end to RefBank and an indirect means of Scratchpads integration. Planning for the combined development of RefBank and REFinder is underway.

The update and correction facilities demonstrated at TDWG 2012 will be implemented in production.

A DOI and a URL to the publication itself will, where available, be added to the reference records. Access to the records, both through the UI and programmatically through the API, will be extended to support this new feature, which also opens up the possibility of cross-linking references.

The key outstanding problem is de-duplication of entries. It is a complicated issue, as can be seen in the sample RefBank screenshot with its references to the particularly popular Systema Naturae. The main techniques for addressing it are mechanisms that recognise variations in an individual author's name and variations in journal title, e.g. different abbreviations.
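
As an illustration of the kind of normalisation involved, the following Python sketch reduces author names and journal titles to comparison keys and groups references that share a key. It is a minimal sketch under stated assumptions, not RefBank's actual mechanism: the record fields (authors, year, journal), the abbreviation table and all function names are hypothetical, and the sample data would be supplied by the caller.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical abbreviation table (an assumption for illustration only);
# a real table would be far larger and curated.
JOURNAL_ABBREVIATIONS = {
    "zool j linn soc": "zoological journal of the linnean society",
}


def _fold(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def author_key(name):
    """Reduce 'Linnaeus, Carl', 'C. Linnaeus' or 'Linnaeus C' to 'linnaeus c'."""
    if "," in name:
        family, given = name.split(",", 1)
    else:
        tokens = name.split()
        if not tokens:
            return ""
        last = tokens[-1].strip(".")
        if len(tokens) > 1 and len(last) <= 2 and last.isupper():
            # Trailing initials, e.g. 'Linnaeus C'.
            family, given = " ".join(tokens[:-1]), tokens[-1]
        else:
            # Given names first, e.g. 'C. Linnaeus'.
            family, given = tokens[-1], " ".join(tokens[:-1])
    # Keep the family name plus the first initial only.
    return f"{_fold(family)} {_fold(given)[:1]}".strip()


def journal_key(title):
    """Map a known abbreviation to its full title; otherwise just fold it."""
    folded = _fold(title)
    return JOURNAL_ABBREVIATIONS.get(folded, folded)


def group_candidates(references):
    """Group references whose first author, year and journal keys agree."""
    groups = defaultdict(list)
    for ref in references:
        authors = ref.get("authors") or [""]
        key = (author_key(authors[0]), str(ref.get("year", "")),
               journal_key(ref.get("journal", "")))
        groups[key].append(ref)
    return dict(groups)
```

Groups with more than one member are only candidates for merging and would still need checking, for example by comparing titles and page ranges. The sketch does not handle Latinised versus vernacular forms of a name (Linnaeus versus Linné) or names in non-Latin scripts, which is part of what makes the problem complicated.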

We have two approaches to ensure sustainability of RefBank.

  1. The first approach addresses sustainability of the software and aims to increase its uptake. Within ViBRANT this means increasing the number of RefBank nodes. At the end of Year Two there are four nodes, compared to two at the end of Year One. We are also seeking other users of the underlying software, which could, for example, be re-purposed to contain taxonomic references.
  2. Our second approach looks to sustain the data content. We can achieve this by sharing the references. One option we are pursuing is to expose the bibliography to Mendeley, a reference manager and academic social network. A second option, also actively in progress, is to replicate the data in a BibServer, which not only gives us access to an alternative technology but also facilitates further data sharing through BibJSON, an increasingly popular format which, like BibServer, is endorsed by the Open Knowledge Foundation.
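
To give a flavour of what sharing through BibJSON might involve, the following Python sketch converts a structured reference into a BibJSON-style record. It is a sketch under stated assumptions, not the RefBank export: the input field names (authors, title, journal, volume, year, pages, doi, url) and the function name are hypothetical, and the output keys follow the BibJSON convention as we understand it (author as a list of name objects, journal as an object, identifier and link as lists). The example record is the ZooKeys paper cited at the top of this page.

```python
import json


def to_bibjson(ref):
    """Convert a reference dict (hypothetical field names) to a BibJSON-style record."""
    record = {
        "type": ref.get("type", "article"),
        "title": ref["title"],
        "author": [{"name": name} for name in ref.get("authors", [])],
        "year": str(ref["year"]),
    }
    if ref.get("journal"):
        record["journal"] = {"name": ref["journal"]}
    for field in ("volume", "pages"):
        if ref.get(field):
            record[field] = str(ref[field])
    # The DOI and URL are optional, so they are added only where available.
    if ref.get("doi"):
        record["identifier"] = [{"type": "doi", "id": ref["doi"]}]
    if ref.get("url"):
        record["link"] = [{"url": ref["url"]}]
    return record


EXAMPLE = {
    "authors": ["King, David", "Morse, David", "Willis, Alistair", "Dil, Anton"],
    "title": "Towards the bibliography of life",
    "journal": "ZooKeys",
    "volume": 150,
    "year": 2011,
    "pages": "151-166",
}

if __name__ == "__main__":
    print(json.dumps(to_bibjson(EXAMPLE), indent=2))
```

Because the record is plain JSON, the same structure could in principle be loaded into a BibServer instance or passed to any other tool that reads the format.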