Manuscript publishing

Lyubomir Penev, Pavel Stoev, Teodor Georgiev (Pensoft) & Sandy Knapp (NHM)


e-Infrastructures for data publishing in biodiversity science
Taxonomy shifts up a gear: New publishing tools to accelerate biodiversity research
Smith, Vincent, and Lyubomir Penev. "Collaborative electronic infrastructures to accelerate taxonomic research." ZooKeys 150 (2011): 1-3.
Erwin, Terry, Pavel Stoev, Teodor Georgiev, and Lyubomir Penev. "ZooKeys 150: Three and a half years of innovative publishing and growth." ZooKeys 150 (2011): 5-14.

The workflow extends the prototype developed under EDIT (grant No 018340) and illustrated in [1], Fig. 5. It has been refined in the light of that experience and realised in Drupal 7 during this year.

A single Drupal module (called “Publication”) has been prototyped to support the technical implementation of this workflow within the Scratchpads. It is available for Drupal 6 from the Scratchpad Git repository, along with the other dependencies written by the Scratchpad project. Software dependencies include the Drupal community’s Organic Groups and Content Construction Kit modules, in addition to the Scratchpad project’s Species Profile Module (SPM) and Taxonomy Tree modules.

In summary, the Publication module provides an intuitive interface that allows users to select and order content from their site and associate it with the publication, providing a many-to-many link between publication objects and other content types (e.g. Image, Bibliography). Thus, for example, a single image can be used in many publications, and a single publication can have many images. The module also supports communication between the user’s Scratchpad and the publisher, transferring the TaxPub XML representation of the manuscript to ZooKeys during submission, revision and final acceptance. TaxPub is an extension of the National Library of Medicine (NLM) / National Center for Biotechnology Information (NCBI) Journal Archiving Document Type Definition (DTD) for the mark-up of taxonomic treatments.
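The many-to-many link described above can be illustrated with a minimal Python sketch. It is not the Drupal module's implementation: the class name `PublicationLinks`, its methods and the identifiers such as `pub-A` are hypothetical and serve only to show how one item can belong to several publications while each publication keeps its own ordering.

```python
from collections import defaultdict


class PublicationLinks:
    """Many-to-many links between publications and content items.

    Hypothetical illustration only; the Drupal module keeps these
    links in its own storage, which is not reproduced here.
    """

    def __init__(self):
        self._items_by_pub = defaultdict(list)  # publication -> ordered items
        self._pubs_by_item = defaultdict(set)   # item -> publications using it

    def attach(self, publication, item):
        """Append an item to a publication, ignoring duplicates."""
        if item not in self._items_by_pub[publication]:
            self._items_by_pub[publication].append(item)
            self._pubs_by_item[item].add(publication)

    def items(self, publication):
        """Return the items of a publication in the order the author chose."""
        return list(self._items_by_pub[publication])

    def publications(self, item):
        """Return the publications that reuse an item, sorted by name."""
        return sorted(self._pubs_by_item[item])


links = PublicationLinks()
links.attach("pub-A", "image-1")
links.attach("pub-A", "biblio-7")
links.attach("pub-B", "image-1")  # the same image reused in a second publication

print(links.items("pub-A"))
print(links.publications("image-1"))
```

Keeping an index in both directions makes it cheap to answer either question: which items a publication contains, and which publications reuse a given item.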

The workflow is illustrated in the example of a new polychaete species description by [2].

Description of the workflow:

  1. An author creates a Publication project within a Scratchpad to which only a restricted set of users have access. The author(s) also provide additional information required by the article (e.g., title, authors’ details etc.).

  2. The author(s) prepare species pages (including descriptions, images, specimens etc.) within the Scratchpad. In the case of a new taxon description, the author(s) use a temporary name (a placeholder). This placeholder acts as a surrogate for the final taxon name to ensure that the new name is not disclosed until the description has been accepted by the journal. The placeholder is linked (tagged) to data on their site, and the placeholder taxon name is linked to the final name. The author(s) select data to be included in the manuscript. Additional sections are added to the manuscript using a structure that will accommodate most taxonomic descriptions and uploaded images. When the preparation stage is complete, the author(s) preview the manuscript to make sure it is satisfactory.

  3. Author(s) submit the manuscript, which creates an archive of the manuscript components. The submission process automatically generates an XML representation of the document according to the TaxPub DTD. This document is then automatically sent to the journal ZooKeys. Other destinations will be possible when other journals accept TaxPub submissions.

  4. ZooKeys organises the peer review. The reviewed paper, including the reviewers’ comments, is sent by e-mail back to the corresponding author.

  5. Author(s) revise their manuscript and supporting data on their Scratchpad in response to the reviewers’ comments (if necessary).

  6. Author(s) re-submit the manuscript, which generates an updated XML file that is automatically sent back to ZooKeys. The publisher parses the final accepted XML document, adding additional XML mark-up for nomenclatural acts required by ZooBank registration, in addition to other semantic enhancements.

  7. ZooKeys publishes the paper, adding DOIs for the paper and supplementary material. The printed published paper includes a link back to the accepted manuscript on the Scratchpad. The Scratchpad version of this article also includes link(s) to the dynamic descriptions on each taxon page, showing versions of updated descriptions if they have been edited after publication. New taxon descriptions are registered online by the journal’s editorial office. In the future, ZooBank will be able to receive an XML file from ZooKeys and create new records for published nomenclatural acts. The manuscript is submitted to PubMed / PubMedCentral for optimal distribution and archival purposes.

  8. The manuscript and all supplementary data are unlocked on the Scratchpad and made public on the day of printed publication. At this time the placeholder taxon names are automatically replaced by the final published taxon names.

By default, all Scratchpad data concerning the ZooKeys publication are kept private for steps 1 to 7 and made public at step 8, although the original taxon pages are normally public. However, the author(s) have the option to make all these data public from the outset.
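The visibility rule and the placeholder substitution in the steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Publication module's code: the stage names, the `Manuscript` class and the placeholder string `PLACEHOLDER-001` are all hypothetical, and the eight steps are collapsed into six stages.

```python
from dataclasses import dataclass, field

# Stages loosely following steps 1-8 above; the names are illustrative.
STAGES = ["draft", "submitted", "in_review", "revised", "accepted", "published"]


@dataclass
class Manuscript:
    title: str
    text: str
    placeholders: dict = field(default_factory=dict)  # placeholder -> final name
    stage: str = "draft"
    public_from_outset: bool = False

    def advance(self):
        """Move to the next stage; the last stage is terminal."""
        position = STAGES.index(self.stage)
        if position < len(STAGES) - 1:
            self.stage = STAGES[position + 1]

    @property
    def is_public(self):
        """Private until publication unless the authors opted to be public."""
        return self.public_from_outset or self.stage == "published"

    def rendered_text(self):
        """Show placeholder names until publication, final names afterwards."""
        if self.stage != "published":
            return self.text
        result = self.text
        for placeholder, final_name in self.placeholders.items():
            result = result.replace(placeholder, final_name)
        return result


ms = Manuscript(
    title="A new species",
    text="Description of PLACEHOLDER-001.",
    placeholders={"PLACEHOLDER-001": "Examplegenus examplespecies"},
)
before = (ms.is_public, ms.rendered_text())
while ms.stage != "published":
    ms.advance()
after = (ms.is_public, ms.rendered_text())

print(before)
print(after)
```

In this sketch the final name is revealed only at the last stage, even when the authors choose to make their data public from the outset, which mirrors the purpose of the placeholder described in step 2.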

On 30th June 2010, anticipating the start of ViBRANT, ZooKeys published a special issue ‘Taxonomy shifts up a gear: New publishing tools to accelerate biodiversity research’, which marked the launch of an innovative publishing model based on an XML editorial workflow and on the TaxPub XML schema. Since then, ZooKeys has been published in four formats: full-colour print version, PDF, HTML and XML. This coincided with the introduction into the editorial process of the Pensoft Mark Up Tool (PMT) [3], a program specially designed for XML tagging and semantic enhancements. The papers [4], [5], [6] and [7], which used three different types of manuscript submission, exemplified the process.

Realising the importance of the wiki environment for the popularisation and dissemination of biodiversity data, in April 2011 Pensoft took another major step towards the modernisation of its journals. Species descriptions from three papers ([8]; [9]; [10]) were integrated automatically, on the day of publication, into Species-ID, an open access wiki-based resource for biodiversity information. This was achieved with a purpose-built tool, the Pensoft Wiki Convertor (PWC, [11]), that transforms the XML versions of the papers into MediaWiki-based pages.
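The kind of transformation performed when XML is turned into wiki pages can be sketched as follows. This is not the Pensoft Wiki Convertor itself: the function name `treatments_to_wikitext` is hypothetical, the sample document is a heavily simplified fragment, and the namespace URI and element names are assumptions that should be checked against the TaxPub schema version in use.

```python
import xml.etree.ElementTree as ET

# Assumed TaxPub namespace URI; verify against the schema version in use.
TP = {"tp": "http://www.plazi.org/taxpub"}

# Simplified, made-up fragment; real TaxPub articles carry far more mark-up.
SAMPLE = """<article xmlns:tp="http://www.plazi.org/taxpub">
  <body>
    <tp:taxon-treatment>
      <tp:nomenclature>
        <tp:taxon-name>Examplegenus examplespecies</tp:taxon-name>
      </tp:nomenclature>
      <tp:treatment-sec sec-type="description">
        <title>Description</title>
        <p>Body length 5 mm.</p>
      </tp:treatment-sec>
    </tp:taxon-treatment>
  </body>
</article>"""


def treatments_to_wikitext(xml_text):
    """Return {page title: wikitext} for each taxon treatment in the document."""
    root = ET.fromstring(xml_text)
    pages = {}
    for treatment in root.iterfind(".//tp:taxon-treatment", TP):
        name = treatment.find("tp:nomenclature/tp:taxon-name", TP)
        title = "".join(name.itertext()).strip()
        lines = []
        for sec in treatment.iterfind("tp:treatment-sec", TP):
            heading = sec.findtext("title", default="").strip()
            lines.append(f"== {heading} ==")  # MediaWiki level-2 heading
            for para in sec.iterfind("p"):
                lines.append("".join(para.itertext()).strip())
        pages[title] = "\n".join(lines)
    return pages


pages = treatments_to_wikitext(SAMPLE)
for title, wikitext in pages.items():
    print(title)
    print(wikitext)
```

Each taxon treatment becomes one wiki page titled with the taxon name, and each treatment section becomes a level-2 heading followed by its paragraphs.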

A report and an open access publication ([12]) have been produced to serve as a basis for choosing an appropriate strategy for implementing different XML schemas for the mark-up of taxonomic texts within the ViBRANT project and beyond. The three reviewed schemas (TaxonX, TaxPub and taXMLit) cover the main tasks of taxonomic mark-up rather well. TaxonX is a lightweight, object-centred schema focusing on taxonomic treatments extracted from legacy literature; taXMLit is a document-centred, very detailed schema covering the mark-up of legacy literature; TaxPub is an extension to the NLM journal publishing DTD and has been created to support prospective publishing in taxonomy. All three schemas have advantages and shortcomings, which are outlined in the text; all have also passed through the stages of creation, testing and implementation in a number of use cases.

Flowchart of mark-up, publication, dissemination and use of taxonomic information

The report does not recommend any single schema or subset of the schemas. Rather, it proposes several cross-points that can be used to match common elements present in both legacy and present-day taxonomic literature. A common output, when needed, from documents marked up in the different schemas could be achieved through XSLT conversions. The most important common elements in differently tagged texts are taxonomic names, taxon treatments, nomenclatural acts and literature references, as well as the overall structure of the published document and its bibliographic metadata. The report also outlines several questions to be answered when evaluating a schema for the different goals of ViBRANT, Scratchpads and beyond.

The process of improving TaxPub was completed in 2011, when Pensoft finalised the testing and implementation of archiving of TaxPub-based articles in the PubMedCentral repository of the National Library of Medicine of the USA. This is the first case in the history of PubMedCentral in which a domain-specific XML schema (TaxPub) is used, in the form of an extension to the NLM DTD, for archiving purposes and for the visualisation of some elements within the text, such as taxon treatments. Currently the whole content of ZooKeys and PhytoKeys is being exported to and archived in PubMedCentral.

Automated export and harvesting mechanisms for bibliographic metadata are crucially important for increasing the accessibility of taxonomic literature. In 2011, Pensoft successfully implemented harvesting mechanisms, based on the OAI-PMH and POAI-MODS protocols, for CiteBank of the Biodiversity Heritage Library, Vifabio (an indexing service provided by a consortium of German libraries), Mendeley and others.
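As an illustration of what an OAI-PMH harvester does, the following Python sketch builds a `ListRecords` request and parses a response. It makes no network call, and it does not describe Pensoft's implementation: the endpoint `https://example.org/oai`, the sample response and the helper names `list_records_url` and `parse_records` are hypothetical. Only the verb, the request arguments and the namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response in the shape of an OAI-PMH ListRecords reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:article/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # A resumption token replaces every other argument except the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


first_url = list_records_url("https://example.org/oai")
records, token = parse_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)

print(first_url)
print(records)
print(next_url)
```

A harvester repeats the request with each returned resumption token until the repository sends no token, or an empty one, at which point the full record set has been retrieved.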


  1. Blagoderov, Vladimir, Irina Brake, Teodor Georgiev, Lyubomir Penev, David Roberts, Simon Rycroft, Ben Scott, Donat Agosti, Terrence Catapano, and Vincent S. Smith. "Streamlining taxonomic publication: a working example with Scratchpads and ZooKeys." ZooKeys 50 (2010): 17-28.
  2. Faulwetter, Sarah, Georgios Chatzigeorgiou, Bella S. Galil, Artemis Nicolaidou, and Christos Arvanitidis. "Sphaerosyllis levantina sp. n. (Annelida) from the eastern Mediterranean, with notes on character variation in Sphaerosyllis hystrix Claparède, 1863." ZooKeys 150 (2011): 327-345.
  3. Penev, Lyubomir, Donat Agosti, Teodor Georgiev, Terry Catapano, Jeremy Miller, Vladimir Blagoderov, David Roberts, Vincent S. Smith, Irina Brake, Simon Rycroft et al. "Semantic tagging of and semantic enhancements to systematics papers: ZooKeys working examples." ZooKeys 50 (2010): 1-16.
  4. Stoev, Pavel, Nesrine Akkari, Marzio Zapparoli, David Porco, Henrik Enghoff, Gregory Edgecombe, Teodor Georgiev, and Lyubomir Penev. "The centipede genus Eupolybothrus Verhoeff, 1907 (Chilopoda: Lithobiomorpha: Lithobiidae) in North Africa, a cybertaxonomic revision, with a key to all species in the genus and the first use of DNA barcoding for the group." ZooKeys 50 (2010): 29-77.
  5. Blagoderov, Vladimir, Heikki Hippa, and André Nel. "Parisognoriste, a new genus of Lygistorrhinidae (Diptera: Sciaroidea) from the Oise amber with redescription of Palaeognoriste Meunier." ZooKeys 50 (2010): 79-90.
  6. Brake, Irina, and Michael von Tschirnhaus. "Stomosis arachnophila sp. n., a new kleptoparasitic species of freeloader flies (Diptera, Milichiidae)." ZooKeys 50 (2010): 91-96.
  7. Taekul, Charuwat, Norman F. Johnson, Lubomír Masner, Andrew Polaszek, and Rajmohana K. "World species of the genus Platyscelio Kieffer (Hymenoptera: Platygastridae)." ZooKeys 50 (2010): 97-126.
  8. Hendrich, Lars, and Michael Balke. "A simultaneous journal / wiki publication and dissemination of a new species description: Neobidessodes darwiniensis sp. n. from northern Australia (Coleoptera, Dytiscidae, Bidessini)." ZooKeys 79 (2011): 11-20.
  9. Stoev, Pavel, and Henrik Enghoff. "A review of the millipede genus Sinocallipus Zhang, 1993 (Diplopoda: Callipodida: Sinocallipodidae), with notes on gonopods monotony vs. peripheral diversity in millipedes." ZooKeys 90 (2011): 13-34.
  10. Bantaowong, Ueangfa, Ratmanee Chanabun, Piyoros Tongkerd, Chirasak Sutcharit, Samuel James, and Somsak Panha. "New earthworm species of the genus Amynthas Kinberg, 1867 from Thailand (Clitellata: Megascolecidae)." ZooKeys 90 (2011).
  11. Penev, L., G. Hagedorn, D. Mietchen, T. Georgiev, P. Stoev, G. Sautter, D. Agosti, A. Plank, M. Balke, L. Hendrich et al. "Interlinking journal and wiki publications through joint citation: Working examples from ZooKeys and Plazi on Species-ID." ZooKeys 90 (2011): 1-12.
  12. Penev, Lyubomir, Christopher Lyal, Anna Weitzman, David Morse, David King, Guido Sautter, Teodor Georgiev, Robert Morris, Terry Catapano, and Donat Agosti. "XML schemas and mark-up practices of taxonomic literature." ZooKeys 150 (2011): 89-116.