Citation: Virgilio M, White I, De Meyer M (2014) A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae). ZooKeys 428: 97–108. doi: 10.3897/zookeys.428.7366
Tephritidae, Africa, identification key
Tephritid fruit flies, or "true" fruit flies (Diptera, Tephritidae) include approximately 500 genera and 4800 valid species (
Currently, identification of tephritid flies is a specialized task largely performed by a restricted pool of experienced taxonomists. This pool keeps shrinking because of the well-known, general loss of taxonomic expertise on insects as well as on most other taxonomic groups (
The morphological identification of African tephritids largely depends on the use of classical single-entry (dichotomous) keys. These keys are available for most African genera (e.g.,
To mitigate some of the aforementioned issues, we developed a set of freely available multi-entry identification keys for African fruit flies. The keys provide a professional identification tool that is also accessible to non-specialised morphologists (i.e., people who might be interested in fruit fly identification such as students, technicians, agronomists, quarantine officers, ecologists, farmers, molecular biologists, etc.). Matrices containing scores for 340 characters from 400 African species belonging to the genera Bactrocera, Capparimyia, Carpophthoromyia, Ceratitis, Dacus, Neoceratitis, Perilampsis and Trirhithrum were compiled from data sets that were used within the framework of previous taxonomic revisions (
Different character sets were considered for each genus (range 11–90 characters and 22–204 character states). The complete lists of species, characters, character states and dependencies considered for each key are provided as supplementary files (SF1, SF2). Each character state was scored in LUCID as either "present and common" or "absent" (other options such as "present but rare", "common and misinterpreted" etc. were not implemented). The "not scoped" option was used to generate unfolding keys, i.e. keys with characters that are initially not shown but appear only when a pre-defined subset of species remains to be identified. We built unfolding keys whenever character scores were only available for subsets of a maximum of 5 congeneric taxa. Dependencies between characters were also generated. Positive dependencies were defined whenever a character was only meaningful in relation to a previously defined character state (e.g. in the Ceratitis key, the character "number of frontal setae" is positively dependent on the character state "frontal setae: yes"). Conversely, negative dependencies were generated to discard characters that were not meaningful after a previous character state was selected (e.g. in the Ceratitis key, the character "females, aculeus tip with small notch" is negatively dependent on the character state "sex: male"). To facilitate identification, characters were grouped into head, thorax, wings, legs and abdomen character sets. The character "sex" was always placed first, in order to reduce the character list by discarding all negative dependencies controlled by the character states "male" and "female".
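The effect of positive and negative dependencies on the list of displayed characters can be sketched as follows. This is only an illustration of the logic described above, not the way LUCID implements it: the `Dependency` class and the `visible_characters` function are hypothetical names, and only the two Ceratitis examples quoted in the text are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    character: str          # the dependent character
    controlling_state: str  # "character: state" label that controls it
    positive: bool          # True: shown only if the state is chosen
                            # False: hidden once the state is chosen


def visible_characters(characters, dependencies, chosen_states):
    """Return the characters that remain meaningful given the chosen states."""
    visible = []
    for character in characters:
        show = True
        for dep in dependencies:
            if dep.character != character:
                continue
            selected = dep.controlling_state in chosen_states
            # Positive dependency: hide until the controlling state is chosen.
            # Negative dependency: hide as soon as the controlling state is chosen.
            if dep.positive != selected:
                show = False
        if show:
            visible.append(character)
    return visible


# The two examples from the Ceratitis key quoted in the text.
CHARACTERS = [
    "sex",
    "frontal setae",
    "number of frontal setae",
    "females, aculeus tip with small notch",
]
DEPENDENCIES = [
    Dependency("number of frontal setae", "frontal setae: yes", positive=True),
    Dependency("females, aculeus tip with small notch", "sex: male", positive=False),
]

for_male = visible_characters(CHARACTERS, DEPENDENCIES, {"sex: male"})
for_female = visible_characters(
    CHARACTERS, DEPENDENCIES, {"sex: female", "frontal setae: yes"}
)
```

In this sketch, scoring "sex: male" first removes the female-only character from the list, which is the reason given above for always placing "sex" first.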
We considered that the number of morphological characters used in the largest identification keys (i.e. keys to Bactrocera/Dacus, Ceratitis, Trirhithrum) might also represent an obstacle to non-specialists. Hence, we arbitrarily defined three subsets of characters for these keys including (1) only characters of very straightforward use (included in the subset "step1: use only the most straightforward characters to get a short list of candidate species"), (2) all characters except the ones of most difficult use (subset "step 2: try identification by excluding only the most difficult characters") and (3) all characters, including "easy", "average", and "difficult" ones (subset "step 3: use also difficult characters if step 2 does not bring to species identification"). The user can thus follow a three-step identification procedure that considers characters of straightforward use first, followed by characters of increasingly difficult interpretation. This procedure should facilitate identification and reduce the risk of misidentification (particularly if a species can be identified through step 1 alone or through steps 1 and 2). We also defined a subset for species of economic importance. The use of this subset should speed up the identification of the more commonly trapped / intercepted taxa. When using this subset, identification should be carefully verified a posteriori (through the hyperlink to species description, see below), as any less common species not included in this subset might be erroneously identified as a species of economic importance (false positive).
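The three-step procedure can be pictured as filtering the same species matrix with progressively larger character subsets. The following is a minimal sketch under stated assumptions: the species names, characters, states and difficulty ratings are invented for illustration, and `characters_for_step` and `candidates` are hypothetical helpers rather than LUCID functions.

```python
# Difficulty classes admitted at each step (step 3 uses every character).
DIFFICULTY_BY_STEP = {
    1: {"easy"},
    2: {"easy", "average"},
    3: {"easy", "average", "difficult"},
}

# Hypothetical difficulty rating of each character.
DIFFICULTY = {
    "wing band": "easy",
    "scutum colour": "average",
    "aculeus tip": "difficult",
}

# Hypothetical score matrix: species -> character -> set of states present.
MATRIX = {
    "species A": {"wing band": {"broad"}, "scutum colour": {"dark"}, "aculeus tip": {"notched"}},
    "species B": {"wing band": {"broad"}, "scutum colour": {"dark"}, "aculeus tip": {"pointed"}},
    "species C": {"wing band": {"broad"}, "scutum colour": {"pale"}, "aculeus tip": {"pointed"}},
    "species D": {"wing band": {"narrow"}, "scutum colour": {"pale"}, "aculeus tip": {"pointed"}},
}


def characters_for_step(difficulty, step):
    """Characters available in the subset used at a given step."""
    allowed = DIFFICULTY_BY_STEP[step]
    return [c for c, level in difficulty.items() if level in allowed]


def candidates(matrix, observations, characters):
    """Species compatible with the observed state of every listed character."""
    keep = []
    for species, scores in matrix.items():
        if all(observations[c] in scores[c] for c in characters):
            keep.append(species)
    return keep


# States observed on the specimen (hypothetical).
observed = {"wing band": "broad", "scutum colour": "dark", "aculeus tip": "pointed"}

shortlist_by_step = {
    step: candidates(MATRIX, observed, characters_for_step(DIFFICULTY, step))
    for step in (1, 2, 3)
}
```

With these invented data the short list shrinks from three species at step 1 to two at step 2 and one at step 3. Restricting `MATRIX` to a subset of species before filtering would mimic the economic-importance subset, including its risk of false positives for species left out of the subset.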
Character and species subsets can also be ignored altogether: the user can either score any of the characters available from the full list or use the "best" option provided by the LUCID software, which should select the characters with the highest discrimination power (the "best" option can be used repeatedly after eliminating redundant characters through the "prune" option). In any case, because the keys are multi-entry, the user can always decide to skip characters, to choose multiple answers whenever uncertain about the correct score, and/or to restrict the identification to the most common species.
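How a multi-entry key tolerates skipped characters and multiple answers, and how a "most discriminating" character might be chosen, can be sketched as below. The ranking used here (the expected number of species left after scoring a character) is a simple stand-in heuristic chosen for illustration; the text does not describe how LUCID computes its "best" option. The matrix and all function names are hypothetical.

```python
# Hypothetical score matrix: species -> character -> set of states present.
MATRIX = {
    "species A": {"wing band": {"broad"}, "scutum colour": {"dark"}},
    "species B": {"wing band": {"broad"}, "scutum colour": {"dark"}},
    "species C": {"wing band": {"broad"}, "scutum colour": {"pale"}},
    "species D": {"wing band": {"narrow"}, "scutum colour": {"pale"}},
}


def filter_species(matrix, answers):
    """Keep species compatible with every answered character.

    `answers` maps a character to the SET of states the user accepts, so an
    uncertain user can give several states; skipped characters are omitted.
    """
    remaining = []
    for species, scores in matrix.items():
        compatible = all(
            character not in scores or scores[character] & accepted
            for character, accepted in answers.items()
        )
        if compatible:
            remaining.append(species)
    return remaining


def expected_remaining(matrix, species, character):
    """Expected number of species left after scoring `character`.

    Assumes the observed state is drawn in proportion to how many of the
    remaining species show it (an assumption of this sketch).
    """
    counts = {}
    for sp in species:
        for state in matrix[sp].get(character, ()):
            counts[state] = counts.get(state, 0) + 1
    if not counts:
        return float(len(species))
    total = sum(counts.values())
    return sum(n * n for n in counts.values()) / total


def best_character(matrix, species, unanswered):
    """Unanswered character expected to leave the fewest species."""
    return min(unanswered, key=lambda c: expected_remaining(matrix, species, c))


all_species = list(MATRIX)
suggestion = best_character(MATRIX, all_species, ["wing band", "scutum colour"])

# Uncertain about the wing band, so both states are accepted.
shortlist = filter_species(
    MATRIX, {"wing band": {"broad", "narrow"}, "scutum colour": {"pale"}}
)
```

Accepting both states of an uncertain character has the same effect as skipping it: no species is discarded on that character, and the remaining answers do the filtering.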
We tried to make the technical terminology used in the single-entry keys more accessible to non-specialists by adopting a consistent framework of character names and indicating in parentheses the alternative names used for the same character in the published scientific literature (as happens, for example, with the Ceratitis subapical / cubital / preapical wing band). We then embedded images that clearly illustrate the name and position of each character on the insect body, as well as images showing how the same character state looks in different species. An initial set of 2300 images was assembled from the databases of the Royal Museum for Central Africa (RMCA) and of the London Natural History Museum (NHM). Images were grouped according to species name and body part (head, thorax dorsal, thorax lateral, abdomen, wings, legs) and, when possible, assigned to each combination of character state and species name. This generated a database of approximately 28000 repeated images (for example, the same thorax image of a particular species was repeatedly used to illustrate postpronotal lobe, scutum and scutellum characters for that species). The large set of embedded images aims at clearly illustrating the morphological variability of the same character state across species. In fact, we consider that many terms used to describe morphological variation (such as "small / large", "darker / paler", "thicker / thinner", etc.), while straightforward for a tephritid taxonomist (who can rely on the experience accumulated through the examination of large numbers of specimens), are not always clear to non-specialised users. Therefore, we paid particular attention to providing multiple images that show, for example, how "narrow" a wing discal band can be before being considered "broad", or how "small" a postpronotal spot can be before being scored as "occupying most of postpronotum".
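The way a few thousand source images expand into a much larger set of repeated images can be sketched as a lookup keyed by species, character and character state. The file name, character states and the `build_image_index` function below are hypothetical; only the reuse of one thorax image for postpronotal lobe, scutum and scutellum characters comes from the text.

```python
from collections import defaultdict


def build_image_index(images, character_body_part, matrix):
    """Map (species, character, state) to the images illustrating it.

    images: iterable of (filename, species, body_part) tuples.
    character_body_part: character -> body part on which it is observed.
    matrix: species -> character -> set of states present.
    """
    by_species_part = defaultdict(list)
    for filename, species, part in images:
        by_species_part[(species, part)].append(filename)

    index = {}
    for species, scores in matrix.items():
        for character, states in scores.items():
            files = by_species_part.get((species, character_body_part[character]), [])
            if not files:
                continue  # no image available for this species and body part
            for state in states:
                # The same source image is reused for every character
                # state scored on that body part.
                index[(species, character, state)] = list(files)
    return index


# One hypothetical dorsal thorax image of one species.
images = [("species_A_thorax_dorsal.jpg", "species A", "thorax dorsal")]
character_body_part = {
    "postpronotal lobe": "thorax dorsal",
    "scutum": "thorax dorsal",
    "scutellum": "thorax dorsal",
}
matrix = {
    "species A": {
        "postpronotal lobe": {"yellow"},
        "scutum": {"striped"},
        "scutellum": {"three spots"},
    }
}

index = build_image_index(images, character_body_part, matrix)
repeated_images = sum(len(files) for files in index.values())
```

Here a single source image yields three entries in the index, one per character scored on the dorsal thorax, which is the kind of repetition that turns the initial image set into the much larger embedded database.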
Once a tentative identification is obtained (or when the list of candidate species is reduced to a few taxa), the keys make it possible to verify the correspondence between the examined voucher and (1) the species description as it appears in the published scientific literature and (2) images from the RMCA and NHM tephritid collections. Discrepancies between the examined voucher and the available images (as might result from the occurrence of multiple character states for a species) can then be checked through hyperlinks to either the species description or to all character states considered for that species in the LUCID input matrix. Information regarding the taxonomic status, geographic distribution and collection specimens of each taxon is also available through hyperlinks to the Encyclopedia of Life (EOL) and to the Belgian Biodiversity Platform (BeBIF, a section of GBIF, the Global Biodiversity Information Facility). Links to the Barcoding of Life Database website (BOLD) allow verifying the availability and geographical coverage of DNA barcodes for each species. In some cases, the available character list will not allow the unambiguous identification of a taxon (as happens, for example, with females of the subgenus Ceratitis (Pterandrus)). Under these circumstances, the direct comparison of species descriptions and distributions is the best strategy to resolve the short list of candidate taxa.
The keys can be accessed online (http://keys.lucidcentral.org/keys/v3/fruitflies/) or freely downloaded and used from a computer hard drive (supplementary files SF3-10). The first option is only recommended for a preliminary overview of the key structure, while downloading and running the keys (e.g. from a memory stick used as a removable device) should allow a faster and more effective use of the software. A quick start guide providing basic information on how the keys work is included with the downloadable version.
This work has been co-funded by the Belgian Directorate-General for Development Cooperation (through a framework agreement with the Royal Museum for Central Africa) and by the International Atomic Energy Agency (IAEA - Vienna, project "Development of a Web Based Multi Entry Key for Fruit Infesting Tephritidae", contract no. 16859). The last author gratefully acknowledges travel grants of the Research Foundation - Flanders (FWO-Vlaanderen) for study visits to the Natural History Museum (London, UK), the Plant Protection Research Institute (Pretoria, South Africa), and the International Institute of Tropical Agriculture (Cotonou, Benin) to examine specimens in preparation of the character matrices. Earlier versions of the Ceratitis and Trirhithrum keys were developed through the U.S. Agency for International Development (USAID, PCE-G-00-98-0048-00) and the U.S. Department of Agriculture (USDA) / the National Institute of Food and Agriculture (CSREES) / the Initiative for Future Agricultural and Food Systems (IFAFS) grants to Texas A&M University (00-52103-9651). We are grateful to Alain Reygel (RMCA - Tervuren) and to Georg Goergen (International Institute of Tropical Agriculture - Cotonou) for their contribution to the image dataset as well as to Myriam Vandenbosch for practical and moral support.
List of species, characters and character states considered in each identification key
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
List of positive and negative character dependencies in each identification key
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to genera
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to genera.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Capparimyia
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Capparimyia.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Carpophthoromyia
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Carpophthoromyia.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Ceratitis
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Ceratitis.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Dacus and Bactrocera
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Dacus and Bactrocera.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Neoceratitis
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Neoceratitis.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Perilampsis
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Perilampsis.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.
Key to Trirhithrum
Authors: Massimiliano Virgilio, Ian White, Marc de Meyer
Data type: multimedia
Explanation note: A set of multi-entry identification keys to African frugivorous flies (Diptera, Tephritidae): key to Trirhithrum.
Copyright notice: This dataset is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.