Research Article
Corresponding author: Lyubomir Penev (l.penev@pensoft.net). Academic editor: Ellinor Michel
© 2016 Lyubomir Penev, Alan Paton, Nicola Nicolson, Paul Kirk, Richard Pyle, Robert Whitton, Teodor Georgiev, Christine Barker, Christopher Hopkins, Vincent Robert, Jordan Biserkov, Pavel Stoev.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Penev L, Paton A, Nicolson N, Kirk P, Pyle RL, Whitton R, Georgiev T, Barker C, Hopkins C, Robert V, Biserkov J, Stoev P (2016) A common registration-to-publication automated pipeline for nomenclatural acts for higher plants (International Plant Names Index, IPNI), fungi (Index Fungorum, MycoBank) and animals (ZooBank). In: Michel E (Ed.) Anchoring Biodiversity Information: From Sherborn to the 21st century and beyond. ZooKeys 550: 233–246. https://doi.org/10.3897/zookeys.550.9551
This paper describes a collaborative effort among four lead indexes of taxon names and nomenclatural acts (International Plant Names Index (IPNI), Index Fungorum, MycoBank and ZooBank) and the journals PhytoKeys, MycoKeys and ZooKeys to create an automated, pre-publication registration workflow based on a server-to-server, XML request/response model. The registration model for ZooBank uses the TaxPub schema, which is an extension to the Journal Article Tag Suite (JATS) of the National Library of Medicine (NLM). The indexing or registration model of IPNI and Index Fungorum will use the Taxonomic Concept Transfer Schema (TCS) as a basic standard for the workflow. Other journals and publishers that intend to implement automated, pre-publication registration of taxon names and nomenclatural acts can also use the open sample XML formats and the links to schemas and relevant information published in the paper.
Taxon names, nomenclatural acts, pre-publication registration, International Plant Names Index (IPNI), Index Fungorum, MycoBank, ZooBank
The process of indexing nomenclatural acts from published literature has a long tradition, in some cases dating as far back as the middle of the 19th century for different taxonomic groups. As a result, there are several nomenclatural indexes that aim to be comprehensive for their focal taxa, for example, Index Kewensis in botany, Index Fungorum or MycoBank in mycology, and Zoological Record and Index Animalium in zoology. Sherborn’s effort in Index Animalium surely stands as the giant among these efforts due to the sheer scale of described animal diversity (
Historically, these indexes have been compiled by a team of editors scanning the relevant literature. This is an inefficient process. The lists become outdated even as they are being produced, because newly described taxa are continually being added to the list. However, the increase in electronic publication of nomenclatural acts made possible by recent changes to the nomenclatural codes in zoology (
Electronic registration of nomenclatural acts in trusted online registries would have the advantage of ensuring nomenclatural novelties published according to the relevant code would be broadly disseminated and available for linkage into other systems. Registration needs to be developed in accordance with the revisions of the biological codes of nomenclature to make the most efficient use of developing web technologies. Mandatory registration would ensure that all new nomenclatural acts governed under the code were captured and treated consistently (
This paper deals with a specific and important part of the registration process: a common model for automated, pre-publication, machine-to-machine, XML-based registration, together with the associated workflow between publishers and the indexes that could act as registries. The aim is to streamline registration further and make it cost-efficient.
There are several ways in which registration (or indexing, if registration is not yet mandated by the relevant code) can best be implemented. Different options and the relationship to the publication process have been extensively reviewed by
Despite the visible progress in recent years, four major questions remain to be answered:
When exactly should the registration of a nomenclatural act take place – before or after publication?
Who should be responsible for the registration of the act – authors, registry curators or publishers?
How is registration actually effected?
Who validates the accuracy of the bibliographic metadata for any registered act?
The International Botanical Congress in Melbourne in July 2011 had a major impact on streamlining the process by amending the International Code of Nomenclature for algae, fungi, and plants (ICNafp) such that, from 1 January 2013, all new names of fungi must be registered before publication, and the identifier for each name included in the publication, in order to be validly published (
Shortly thereafter, the International Commission on Zoological Nomenclature voted in favour of a revised version of the amendment to the International Code of Zoological Nomenclature that was first proposed in 2008. The purpose of the amendment is to expand and refine the methods of publication allowed by the Code, particularly in relation to electronic publication. The amendment establishes an Official Register of Zoological Nomenclature (with ZooBank as its online version), allows electronic publication after 2011 under certain conditions, and disallows publication on optical discs after 2012. The requirements for electronic publications are that the work be registered in ZooBank before it is published, that the work itself state the date of publication and contain evidence that registration has occurred, and that the ZooBank registration state both the name of an electronic archive intended to preserve the work and the ISSN or ISBN associated with the work. Registration of new scientific names and nomenclatural acts is not required. The Commission confirmed that ZooBank was ready to handle the requirements of the amendment [
The current situation with indexing and registration in the three domains of eukaryotic organisms can be summarized as follows:
FUNGI
Post-publication Indexing in Index Fungorum (IF) and MycoBank (MB)
Pre-publication registration mandatory for fungi since 1st of January 2013
Record identifiers must be published in the protologue
Three official registries are approved: MycoBank, Index Fungorum, Fungal Names
PLANTS
Post-publication indexing is a well-established practice of the International Plant Names Index (IPNI) which covers seed plants, ferns and lycophytes but not bryophytes or algae
Pre-publication indexing and inclusion of IPNI record identifiers in protologues piloted with PhytoKeys, PLoS ONE and Kew Bulletin
ANIMALS
Post-publication indexing is a well-established practice of Zoological Record (now published by Thomson Reuters)
Pre-publication registration in ZooBank mandatory since 1st of January 2012 for e-only publications
Record identifiers should be published in the original description
Registration of many new nomenclatural acts might be a tedious and extremely time-consuming process if done “by hand”, especially in the recently introduced but increasingly submitted “turbo-taxonomic” papers, combining molecular data, concise morphological descriptions and digital imaging (
There are significant differences in the scope and number of nomenclatural acts that are tracked by the current indexes and registries (Table
Nomenclatural acts that are recorded by the indexing services and could potentially be a subject of pre-publication registration in botany, mycology and zoology.
Taxonomic / nomenclatural act | IPNI (botany: vascular plants) | Index Fungorum (mycology) | MycoBank (mycology) | ZooBank (zoology) |
---|---|---|---|---|
New taxon: | | | | |
- suprafamilial | - | + | + | |
- familial | + | + | + | + |
- infrafamilial | + | + | + | + |
- generic | + | + | + | + |
- infrageneric | + | + | + | + |
- specific | + | + | + | + |
- infraspecific | + | + | + | + |
- hybrids | + | + | + | n/a |
New replacement name | + | + | + | |
New combination | + | + | + | |
Tautonym | + | + | + | n/a |
Typifications | | | | |
- holotype | + | + | + | |
- lectotype | - | + | + | |
- neotype | - | + | + | |
- epitype | - | + | + | n/a |
In our view, the registration (or indexing in groups where registration is not yet mandated by the code) of nomenclatural acts and the quality control of the bibliographic metadata in these registries should be a primary responsibility of publishers and registry curators and, to a lesser extent, of authors. Registration of a nomenclatural act could be initiated by an author, at the pre-submission or pre-acceptance for publication stage. However, we prefer the publisher-initiated model as it avoids registry curators curating data that may never be published according to the rules of the relevant code. Such a practice may lead to “over-saturation” of the registries with names that are not validly published, causing confusion. Focusing on names accepted for publication also allows these curators more time to focus on the published act and this may allow these specialist staff to assist publication by identifying inconsistencies with the relevant code. Moreover, the publishers’ role is essential in checking and correcting the pre-publication registration details against the ultimately published information. The model presented below could easily be adapted for author initiation, though we envisage that there would be a greater curatorial overhead and a greater likelihood of errors being created. However, we accept that the model needs to be flexible and allow alternatives if it is to receive community support.
In the “journal-centric” model, the registration of taxonomic and nomenclatural acts involves two main classes of actors: (1) publishers, or editors, and (2) registry curators. The publisher takes the responsibility for initiating the registration of nomenclatural acts so that the workflow can be performed following a common stepwise model (see also Fig.
Step 1. XML message, sent from the publisher to the registry on acceptance of the manuscript, containing the type of act, taxon names, and preliminary bibliographic metadata; the registry stores the data but does not make them publicly available before the final publication date.
Step 2a. Response XML report containing the unique identifier of the act as supplied by the registry and/or any relevant error messages.
Step 2b. Error correction and de-duplication performed manually: human intervention on the registry’s side, the publisher’s side, or both.
Step 3. Inclusion of registry supplied identifiers in the published treatments (protologues, nomenclatural acts).
Step 4. Making the information in the registry publicly accessible upon publication, providing a link from the registry record to the article.
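The four steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: `MockRegistry`, `Record` and their methods are hypothetical names introduced here, not part of any registry's actual interface, and the duplicate check is deliberately simplistic.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    act_type: str
    taxon_name: str
    article_ref: str       # preliminary bibliographic metadata
    identifier: str
    public: bool = False   # embargoed until the article is published
    article_url: str = ""


@dataclass
class MockRegistry:
    """Hypothetical in-memory stand-in for a registry; not a real API."""
    records: dict = field(default_factory=dict)

    def register(self, acts, article_ref):
        """Steps 1 and 2a: store acts privately, return identifiers and errors."""
        identifiers, errors = {}, []
        existing = {(r.act_type, r.taxon_name) for r in self.records.values()}
        for act_type, taxon_name in acts:
            if not taxon_name.strip():
                errors.append(f"empty taxon name for act '{act_type}'")
            elif (act_type, taxon_name) in existing:
                # Step 2b: duplicates are only flagged; a human resolves them.
                errors.append(f"possible duplicate: {taxon_name}")
            else:
                rec_id = str(uuid.uuid4())
                self.records[rec_id] = Record(
                    act_type, taxon_name, article_ref, rec_id
                )
                identifiers[taxon_name] = rec_id
                existing.add((act_type, taxon_name))
        return identifiers, errors

    def publish(self, record_ids, article_url):
        """Step 4: release the records and link them to the article."""
        for rec_id in record_ids:
            record = self.records[rec_id]
            record.public = True
            record.article_url = article_url

    def public_names(self):
        """Names visible to the public, i.e. only those already published."""
        return sorted(r.taxon_name for r in self.records.values() if r.public)
```

In this sketch, Step 3 happens on the publisher's side between `register` and `publish`: the identifiers returned by `register` are inserted into the treatments, and `publish` is called on the day the article appears.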
The registration process should be as automated as possible. There are several reasons to maximize automation of registration, the most significant being:
Increasing cases of bulk, “turbo-taxonomic” descriptions of new taxa within a single paper, sometimes numbering in the hundreds, which create significant overhead in the authoring and editorial process.
Decreased risk of errors caused by human intervention (e.g. re-typing).
Disambiguation of the dates of acceptance and publication of a manuscript.
Efficient and accurate validation of final published data and metadata through automated export from the publisher to the registry on the day of publication.
Within the framework of the EU FP7 project pro-iBiosphere, and in close collaboration with Zoological Record, ZooBank, IPNI, MycoBank and Index Fungorum, as well as with the Global Names project (www.globalnames.org) we are developing a workflow and associated XML formats to streamline the registration of nomenclatural acts within the pre-publication process. The workflow was piloted by IPNI for higher plants and ZooBank for animals and the journals PhytoKeys and ZooKeys, respectively. The formats differ between the two main biological codes, ICNafp and ICZN, hence we describe these separately below.
The pre-publication indexing of new plant taxa and nomenclatural acts in IPNI, with inclusion of the IPNI identifiers in the protologues, has been trialled in the journal PhytoKeys since the publication of its first issue in 2010 (
The XML query is submitted to IPNI’s Application Programming Interface (API) through a POST request, and IPNI replies with the same XML in which the IPNI identifiers have been inserted automatically.
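A minimal sketch of such a request/response exchange is given below, using only the Python standard library. The endpoint URL, the function names and all XML element names are placeholders chosen for illustration; they do not reproduce the TCS vocabulary or the actual IPNI interface, which should be taken from the supplementary files and from the registry itself.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint; the real address is provided by the registry.
REGISTRY_URL = "https://registry.example.org/api/register"


def build_request(article_title, acts, url=REGISTRY_URL):
    """Build (but do not send) a POST request carrying the acts as XML.

    Element names are illustrative placeholders, not the TCS vocabulary.
    """
    root = ET.Element("registrationRequest")
    ET.SubElement(root, "article").text = article_title
    for act_type, taxon_name in acts:
        act = ET.SubElement(root, "act", {"type": act_type})
        ET.SubElement(act, "taxonName").text = taxon_name
    body = ET.tostring(root, encoding="utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def parse_response(xml_bytes):
    """Map each taxon name in the (assumed) response to its identifier."""
    root = ET.fromstring(xml_bytes)
    return {
        act.findtext("taxonName"): act.findtext("identifier")
        for act in root.iter("act")
    }
```

The request object would be sent with `urllib.request.urlopen(request)` and the body of the reply passed to `parse_response`; the identifiers returned can then be written into the protologues before publication.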
The registration workflow of Index Fungorum (IF) will adopt that of IPNI after the IF system has moved to Royal Botanic Gardens Kew to run alongside IPNI.
The following methods of the MycoBank API are enough for a straightforward implementation:
SearchMycoBankWithFilters
InsertUserProfile
UpdateUserProfile
InsertMycobankRecord
UpdateMycobankRecord
Using the combinations (1, 2, 3) and (1, 4, 5), one can implement the Upsert (Update if exists, Insert otherwise) semantics required for the common query/response registration model.
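The Upsert logic for records, combination (1, 4, 5), might look like the sketch below. Only the method names are taken from the list above; their argument lists and return values are assumptions made for this example, and `InMemoryClient` is a hypothetical stand-in used so that the code runs without access to the real service.

```python
class InMemoryClient:
    """Hypothetical stand-in for a MycoBank client.

    Method names mirror the list above; signatures are assumed.
    """

    def __init__(self):
        self.records = {}
        self._next_id = 1

    def SearchMycoBankWithFilters(self, name):
        # Assumed to return the identifiers of records matching the name.
        return [rid for rid, rec in self.records.items() if rec["name"] == name]

    def InsertMycobankRecord(self, record):
        rid = self._next_id
        self._next_id += 1
        self.records[rid] = dict(record)
        return rid

    def UpdateMycobankRecord(self, rid, record):
        self.records[rid] = dict(record)


def upsert_record(client, record):
    """Update the record if the name is already registered, insert otherwise."""
    matches = client.SearchMycoBankWithFilters(record["name"])
    if matches:
        client.UpdateMycobankRecord(matches[0], record)
        return matches[0], "updated"
    return client.InsertMycobankRecord(record), "inserted"
```

Calling `upsert_record` twice with the same name therefore leaves a single record, which is the behaviour a publisher needs when metadata sent on acceptance are corrected on the day of publication. The same pattern would apply to user profiles with methods 1, 2 and 3.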
As there are multiple fungal registries (MycoBank, Index Fungorum, Fungal Names), another approach would be to perform the registration with only one of them and rely on the synchronization mechanisms (currently being built) to propagate the information to the other databases.
As with PhytoKeys in botany, ZooKeys was the first journal to implement mandatory registration of new taxon names in zoology, starting with the publication of its first issue in 2008 (
The registration workflow and XML formats published in this article are free to use for anyone who would like to implement them. To ensure broader adoption of the registration model, the data exchanged through the workflow should be encoded in a standard. For zoology, journals should adopt the TaxPub XML schema (
The Suppl. materials
Once the editorial workflow is defined, and structured data can be produced according to these standards, journal editors should contact registries for access to their Application Programming Interfaces (APIs).
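As a rough illustration of what "structured data" means in practice for zoology, the sketch below pulls taxon names and their status out of a TaxPub-like fragment. The namespace URI and the element names follow our reading of the TaxPub schema and should be verified against the schema itself; `new_names` and `SAMPLE` are hypothetical, and "Aus bus" is a placeholder name.

```python
import xml.etree.ElementTree as ET

# Assumed TaxPub namespace URI; verify against the published schema.
TP_URI = "http://www.plazi.org/taxpub"
TP = {"tp": TP_URI}

# Hand-written fragment for illustration; not taken from a real article.
SAMPLE = (
    f'<article xmlns:tp="{TP_URI}"><tp:taxon-treatment><tp:nomenclature>'
    "<tp:taxon-name>"
    '<tp:taxon-name-part taxon-name-part-type="genus">Aus</tp:taxon-name-part> '
    '<tp:taxon-name-part taxon-name-part-type="species">bus</tp:taxon-name-part>'
    "</tp:taxon-name><tp:taxon-status>sp. n.</tp:taxon-status>"
    "</tp:nomenclature></tp:taxon-treatment></article>"
)


def new_names(taxpub_xml):
    """Return (name, status) pairs for every nomenclature block found."""
    root = ET.fromstring(taxpub_xml)
    results = []
    for nomenclature in root.iter(f"{{{TP_URI}}}nomenclature"):
        parts = nomenclature.findall("tp:taxon-name/tp:taxon-name-part", TP)
        name = " ".join(p.text.strip() for p in parts if p.text)
        status = nomenclature.findtext(
            "tp:taxon-status", default="", namespaces=TP
        )
        results.append((name, status.strip()))
    return results
```

A journal whose production system can emit markup at this level of detail already has what a registry needs to identify each act; the remaining work is agreeing on transport and authentication with the registry.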
The authors of this article, staff at the registries and at Pensoft are available to advise journals that intend to implement the automated registration process. Future changes to the automated registration workflow will be published on the Wiki page of the pro-iBiosphere project at http://wiki.pro-ibiosphere.eu/wiki/Pilot_2.
The pro-iBiosphere project (Coordination & Policy Development in Preparation for a European Open Biodiversity Knowledge Management System, Addressing Acquisition, Curation, Synthesis, Interoperability & Dissemination, Contract no. RI-312848, www.pro-ibiosphere.eu) supported Pensoft and the Royal Botanic Gardens, Kew in developing, testing and implementing the automated registration workflow. Pensoft also received financial support from the EU FP7 project ViBRANT (Virtual Biodiversity Research and Access Network for Taxonomy, www.vbrant.eu, Contract no. RI-261532) for designing the basic concept of the workflow. We also thank Nigel Robinson from Zoological Record for discussions during the early stages of the process. The work of the ZooBank team was supported by the Global Names NSF project (DBI-1062441).
XML query to IPNI
Explanation note: XML query sent from Pensoft to IPNI on the day of acceptance of the manuscript for publication [exemplified with the paper of
IPNI response XML
Explanation note: XML response of IPNI to the query in Suppl. material
TaxPub XML submitted to ZooBank
Explanation note: TaxPub XML of a ready-to-publish manuscript submitted from Pensoft to ZooBank [exemplified with the paper of
TaxPub response XML
Explanation note: TaxPub XML returned from ZooBank to Pensoft containing UUIDs of the article, authors and new taxon names [exemplified with the paper of