Morphology lies: a case-in-point with a new non-biting midge species from Oriental China (Diptera, Chironomidae)

Abstract
Morphological traits are generally indicative of specific taxa and serve as key characters in taxonomy and species delimitation. In this study, a non-biting midge species with an Einfeldia-like superior volsella proved hard to determine accurately from its morphological characteristics. Sequences of two ribosomal genes and three protein-encoding genes were compiled to reconstruct a phylogeny of related genera and to address the taxonomic issue. Phylogenetic inference clearly supports placement of the undetermined species in Kiefferulus. Therefore, a new species of Kiefferulus is described and figured based on the adult male from Oriental China. The species can easily be distinguished from other species of the genus by its Einfeldia-like superior volsella and triangular tergite IX.


Introduction
For hundreds of years, taxonomists have focused mainly on morphological characteristics for classification, taxonomy, and species identification. At its core, traditional taxonomy builds a systematic classification from similarities and differences among organisms. Linnaeus (1753) simplified and standardized nomenclature into the binomial system of genus and species. However, the resulting system rests mainly on visible characteristics as judged by taxonomists' own professional experience, which makes it unstable and difficult to test. The discovery and naming of new organisms aim to identify natural groupings using different proxies, such as morphology, genes, ecology, and behavior (Holstein and Luebert 2017). Nevertheless, the classification of insects has been based to a great extent on morphological characteristics, which means that one species is deemed to be related to another on the basis of shared characteristics of the same origin (synapomorphies).
With the burgeoning of molecular technology, there have been heated debates among scientists on whether the traditional system should be retained (Garnett and Christidis 2017; Thomson et al. 2018). Some think that the classification of complex organisms is in chaos and hampers species conservation, while others argue that taxonomy is necessary for global species conservation. After more than 250 years of the predominance of comparative morphology in species discovery, advanced methods and technology, especially molecular data, are rapidly expanding the realm of taxonomy (Padial et al. 2010). In addition, molecular information on species is increasingly registered and made available via several global initiatives, such as the National Center for Biotechnology Information (NCBI), the Barcode of Life Data System (BOLD), and different local barcode libraries (Ratnasingham and Hebert 2007). However, integrative taxonomy requires both detailed morphological description and molecular inference, which is time-consuming. In recent descriptions of new species, it has become preferable to provide both morphology and COI barcodes, but COI-based phylogenetic inference is unstable and not always convincing. Consequently, as new species descriptions accelerate, erroneous species hypotheses and unstable names become an increasing risk (Padial et al. 2010).
Chironomidae is a large family of diverse flies commonly called non-biting midges. It is the most widely distributed of all aquatic insect families, occurring in all zoogeographical regions of the world, including Antarctica (Cranston et al. 1989). It also shows adaptations to extreme niches, surviving at elevations of 5,600 m in the Himalaya Mountains (Kohshima 1984) and at depths of more than 1,000 m in Lake Baikal (Linnevich 1971).
Kiefferulus was described by Goetghebuer (1922) to accommodate Tanytarsus tendipediformis from Belgium (Chaudhuri and Ghosh 1986). However, it was later treated as a subgenus of Pentapedilum Kieffer by Edwards (1929) and of Chironomus Meigen by Townes (1945), after which Hamilton et al. (1969) restored its generic status. The male of Kiefferulus is easily recognized by its characteristic hypopygium, in particular the broadly sickle-shaped superior volsella with numerous long setae on the inner margin and long microtrichia reaching the distal part, and the strongly expanded distal part of the inferior volsella (Cranston et al. 1989).
Herein, we used sequences from two ribosomal genes (18S and 28S ribosomal DNA) and three protein-encoding genes [cytochrome oxidase I (COI), the CPSase region of carbamoylphosphate synthase-aspartate transcarbamoylase-dihydroorotase (CAD), and phosphogluconate dehydrogenase (PGD)] to explore the systematic position of the undetermined chironomid species. Based on the molecular phylogenetic analysis, it is recognized as a new species of Kiefferulus. We also discuss whether morphological traits can be used independently to define species within non-biting midges. Finally, Kiefferulus trigonum sp. nov. is presented and described.

Taxon sampling
The morphological nomenclature follows Saether (1980). The examined specimens were mounted on slides following the procedure of Saether (1969). Measurements are given as ranges, followed by the mean when four or more specimens were examined. All types are deposited in the College of Life Science, Nankai University.
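The reporting convention for measurements (range, then the mean when four or more specimens were measured) can be sketched as a small helper. This is only an illustration: the function name `format_measurement` and the input values are hypothetical and are not taken from the examined specimens.

```python
def format_measurement(values, min_for_mean=4):
    """Format measurements as 'min-max', appending ', mean' when
    at least `min_for_mean` specimens were measured."""
    lowest, highest = min(values), max(values)
    text = f"{lowest}-{highest}" if lowest != highest else f"{lowest}"
    if len(values) >= min_for_mean:
        # Mean rounded to the nearest whole unit (Python's round()).
        mean = round(sum(values) / len(values))
        text += f", {mean}"
    return text


# Hypothetical lengths in micrometres, for illustration only.
four_specimens = [38, 45, 50, 55]
two_specimens = [38, 55]
```

With the hypothetical inputs above, four values yield a range plus mean, whereas two values yield the range only.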
Digital photographs were captured with a Leica DFC420 camera mounted on a Leica DM6000 B compound microscope, using the Leica Suite software, at the NTNU University Museum, NTNU (Trondheim, Norway). Photograph postprocessing was done in Adobe Photoshop and Illustrator (Adobe Inc., California, USA).

DNA extraction, PCR amplification, sequencing, and alignment
Tissues for total genomic DNA extraction were taken from the thorax and head of adults and from the abdomen of larvae. The extraction followed the protocol of the Qiagen DNeasy Blood and Tissue kit, except that the volume of elution buffer ranged from 100 to 150 µl according to body size. After extraction, the exoskeletons were cleared and mounted under the corresponding voucher numbers. We amplified two ribosomal genes (18S and 28S) and four protein-coding gene segments: a fragment of one mitochondrial gene (COI-3P), two sections of the CPSase region of carbamoylphosphate synthase-aspartate transcarbamoylase-dihydroorotase (CAD1 and CAD4), and phosphogluconate dehydrogenase (PGD). In addition, the universal primers LCO1490 and HCO2198 were used for the standard COI barcode sequences.
Polymerase chain reaction (PCR) amplifications were done in a 25 µl volume containing either 12.5 µl 2× Es Taq MasterMix (CoWin Biotech Co., Beijing, China), 0.625 µl of each primer, 2 µl of template DNA and 9.25 µl deionized H2O, or 2.5 µl 10× Takara Ex Taq buffer (CL), 2 µl 2.5 mM dNTP mix, 2 µl 25 mM MgCl2, 0.2 µl Takara Ex Taq HS, 1 µl of each 10 µM primer, 2 µl template DNA and 14.3 µl ddH2O. PCR was performed on a PowerCycler Gradient SL (Biometra GmbH, Göttingen, Germany). For the mitochondrial gene, the program was as follows: an initial denaturation step at 95 °C for 5 min, followed by 34 cycles of 94 °C for 30 s, 51 °C for 30 s and 72 °C for 1 min, and a final extension at 72 °C for 3 min. The programs for the ribosomal genes and the nuclear protein-coding genes followed Cranston et al. (2012); alternatively, a touchdown program was used for the protein-coding genes: an initial denaturation step at 98 °C for 10 s, then 94 °C for 1 min, followed by five cycles of 94 °C for 30 s, 52 °C for 30 s and 72 °C for 2 min, seven cycles of 94 °C for 30 s, 51 °C for 1 min and 72 °C for 2 min, and 37 cycles of 94 °C for 30 s, 45 °C for 20 s and 72 °C for 2 min 30 s, with a final extension at 72 °C for 3 min. PCR products were confirmed on a 1% agarose gel and sequenced in both directions with ABI 3730 or ABI 3730XL capillary sequencers at Beijing Genomics Institute Co., Ltd, Beijing, China.
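As a quick consistency check on the recipes and the touchdown program described above, the following minimal sketch sums the component volumes of both reaction mixes and tallies the cycles and programmed hold time of the touchdown profile. The dictionary keys, constant names and function names are illustrative choices, and the hold time ignores ramping between temperatures, so the real run time on the thermocycler would be longer.

```python
# Component volumes in microlitres, as listed in the text.
ES_TAQ_MIX = {
    "2x Es Taq MasterMix": 12.5,
    "forward primer": 0.625,
    "reverse primer": 0.625,
    "template DNA": 2.0,
    "deionized H2O": 9.25,
}
EX_TAQ_MIX = {
    "10x Ex Taq buffer": 2.5,
    "2.5 mM dNTP mix": 2.0,
    "25 mM MgCl2": 2.0,
    "Ex Taq HS": 0.2,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "template DNA": 2.0,
    "ddH2O": 14.3,
}

# Touchdown program: (number of repeats, [(temperature in C, seconds), ...]).
TOUCHDOWN = [
    (1, [(98, 10), (94, 60)]),             # initial denaturation
    (5, [(94, 30), (52, 30), (72, 120)]),
    (7, [(94, 30), (51, 60), (72, 120)]),
    (37, [(94, 30), (45, 20), (72, 150)]),
    (1, [(72, 180)]),                      # final extension
]


def total_volume(mix):
    """Sum of component volumes, rounded to avoid float noise."""
    return round(sum(mix.values()), 3)


def amplification_cycles(program):
    """Count repeats of three-step (denature/anneal/extend) stages."""
    return sum(repeats for repeats, steps in program if len(steps) == 3)


def hold_time_seconds(program):
    """Total programmed hold time, excluding temperature ramping."""
    return sum(repeats * sum(sec for _, sec in steps)
               for repeats, steps in program)
```

Both mixes add up to the stated 25 µl, and the touchdown profile comprises 49 amplification cycles with 167 min of programmed hold time.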
DNA sequences were edited and assembled with BioEdit 7.0.1 (Hall 1999). When editing the raw sequences, we applied the appropriate IUPAC code for ambiguous bases but used "?" instead of the ambiguity symbol "N" in the matrix. The protein-coding genes were aligned by their amino acid sequences using MUSCLE (Edgar 2004) in MEGA7 (Kumar et al. 2016). Introns in CAD and PGD were identified according to reference sequences and the "GT-AG" rule and deleted before analysis. Sequences of the two ribosomal genes were aligned with MUSCLE, and poorly aligned positions were then removed using the Gblocks online server (http://phylogeny.lirmm.fr/phylo_cgi/one_task.cgi?task_type=gblocks) (Castresana 2000; Dereeper et al. 2008).
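Two of the editing steps above, replacing "N" with "?" and removing introns that conform to the GT-AG rule, can be illustrated with a minimal sketch. It assumes that intron coordinates have already been determined by comparison with reference sequences; the function names and the example sequence are hypothetical and do not reproduce the actual workflow in BioEdit or MEGA7.

```python
def mask_unknown(seq):
    """Replace the fully ambiguous symbol N with '?'; other IUPAC
    ambiguity codes (R, Y, ...) are kept. Output is upper case."""
    return seq.upper().replace("N", "?")


def follows_gt_ag(intron):
    """True if the candidate intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def remove_introns(seq, introns):
    """Delete introns given as 0-based, half-open (start, end) pairs.

    Coordinates are assumed to come from a comparison with reference
    sequences; each candidate must also satisfy the GT-AG rule.
    """
    parts, pos = [], 0
    for start, end in sorted(introns):
        if not follows_gt_ag(seq[start:end]):
            raise ValueError(f"segment {start}-{end} violates the GT-AG rule")
        parts.append(seq[pos:start])
        pos = end
    parts.append(seq[pos:])
    return "".join(parts)


# Hypothetical sequence: two exon fragments flanking a 12 bp intron.
EXAMPLE = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
```

Calling `remove_introns(EXAMPLE, [(6, 18)])` joins the two exon fragments, while a segment that does not begin with GT and end with AG raises an error instead of being silently deleted.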

Phylogenetic analysis
Maximum likelihood (ML) trees were constructed in raxmlGUI v1.5b2 (Silvestro and Michalak 2012), with 1000 bootstrap replicates in a rapid bootstrap analysis, using the GTR+G+I substitution model with partitions. Bayesian inference (BI) was performed in MrBayes (Nylander et al. 2004) in two parallel runs, each consisting of four chains run for six million generations, sampling one tree every 1000 generations, with a burn-in of 25%. Partitioning schemes were evaluated in PartitionFinder using the greedy search and selected according to AICc (Lanfear et al. 2012). The result was as follows: TrN+I+G for 18S, COI3P_P2; GTR+I+G for COI3P_P1, 28S; GTR+I+G for CAD4_P1, CAD1_P1, PGD_P1; GTR+I+G for CAD1_P2, CAD4_P2, PGD_P2; GTR+I+G for CAD4_P3, PGD_P3; HKY+G for COI3P_P3; HKY+I+G for CAD1_P3. Convergence was checked in Tracer v1.7 (Rambaut et al. 2014), and the analysis was terminated once effective sample sizes (ESS) exceeded 200, with the initial 25% of trees discarded as burn-in.
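The partitioning scheme and the MCMC settings above lend themselves to a simple bookkeeping check. The sketch below verifies that every data block (three codon positions for each of the four protein-coding segments, plus 18S and 28S) is assigned to exactly one subset, and counts sampled and discarded trees per run. It rests on two assumptions: the block printed as "COI3_ 2" is read as the second codon position of COI-3P, and the starting tree at generation 0 is counted among the samples. All names are illustrative.

```python
# Partitioning scheme as listed in the text: (model, data blocks).
# "COI3P_P2" is an assumed reading of the block printed as "COI3_ 2".
SCHEME = [
    ("TrN+I+G", ["18S", "COI3P_P2"]),
    ("GTR+I+G", ["COI3P_P1", "28S"]),
    ("GTR+I+G", ["CAD4_P1", "CAD1_P1", "PGD_P1"]),
    ("GTR+I+G", ["CAD1_P2", "CAD4_P2", "PGD_P2"]),
    ("GTR+I+G", ["CAD4_P3", "PGD_P3"]),
    ("HKY+G", ["COI3P_P3"]),
    ("HKY+I+G", ["CAD1_P3"]),
]


def expected_blocks():
    """Codon positions of the coding segments plus the two rRNA genes."""
    coding = ["CAD1", "CAD4", "PGD", "COI3P"]
    blocks = {f"{gene}_P{pos}" for gene in coding for pos in (1, 2, 3)}
    return blocks | {"18S", "28S"}


def assigned_blocks(scheme):
    """Set of blocks in the scheme; raises if a block occurs twice."""
    blocks = [block for _, subset in scheme for block in subset]
    if len(blocks) != len(set(blocks)):
        raise ValueError("a data block is assigned to more than one subset")
    return set(blocks)


def mcmc_tree_counts(generations, sample_every, burnin_fraction):
    """Return (sampled, discarded, retained) trees for a single run.

    Assumes the tree at generation 0 is sampled as well.
    """
    sampled = generations // sample_every + 1
    discarded = int(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded
```

Under these assumptions the seven subsets cover all 14 data blocks without overlap, and each run of six million generations yields 6001 sampled trees, of which 1500 are discarded as burn-in.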

Results
The initial sequence lengths are CAD1 909 bp, CAD4 846 bp, PGD 747 bp, 18S 933 bp, COI3P 826 bp, and 28S 743 bp (DOI: dx.doi.org/10.5883/DS-KIFFER). To reduce the effects of missing data, we trimmed the beginning and end of the protein-coding genes and deleted highly variable regions of 18S and 28S; the final concatenated alignment comprised 4335 bp (CAD1 828 bp, CAD4 760 bp, PGD 747 bp, COI3P 662 bp, 18S 852 bp, 28S 455 bp) (SI). Both ML and BI analyses recovered the same topology (Fig. 1) and agree on a simple phylogenetic scenario: the odd species conflicts with its morphotype genus, Einfeldia, but is clearly supported as a species of Kiefferulus.
The new species could not be identified using the morphological keys for adult Chironomidae (Cranston et al. 1989). A superior volsella with a large hairy base and a bare digitiform projection has been regarded as an important diagnostic character of Einfeldia sensu lato, which led to a preliminary identification as Einfeldia sp. Nevertheless, this type of superior volsella is not exclusive to Einfeldia and also occurs in Benthalia, Chironomus (including its subgenera Chironomus and Lobochironomus), Conochironomus, Glyptotendipes, and Tribelos. According to a morphological parsimony analysis, these genera sharing a similar superior volsella are not closely related (Andersen et al. 2017). The molecular phylogenies of related genera in this study and in Cranston et al. (2012) show that Conochironomus and Glyptotendipes are not closely related to Einfeldia. Consequently, generic complexes or species groups with an Einfeldia-like superior volsella do not form monophyletic clades (Fig. 1). Such convergent characters are likely to cause serious problems in phylogenetic analysis and lead to misplacement of species or genera. Generic diagnoses based on such characters have caused great confusion in adult taxonomy.
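The trimming reported at the start of this section can be tallied per gene with a short sketch that also derives column ranges for the concatenated matrix. The gene order follows the order in which the trimmed lengths are listed, which is an assumption about the actual matrix, and all names are illustrative. The per-gene trimmed lengths as printed sum to 4304 bp, so the sketch computes the total rather than assuming the stated 4335 bp.

```python
# Alignment lengths in base pairs, as printed in the text.
INITIAL_BP = {"CAD1": 909, "CAD4": 846, "PGD": 747,
              "18S": 933, "COI3P": 826, "28S": 743}
TRIMMED_BP = {"CAD1": 828, "CAD4": 760, "PGD": 747,
              "COI3P": 662, "18S": 852, "28S": 455}

# Assumed concatenation order (the order of the trimmed-length list).
ORDER = ["CAD1", "CAD4", "PGD", "COI3P", "18S", "28S"]


def removed_per_gene(initial, trimmed):
    """Number of positions removed from each gene by trimming."""
    return {gene: initial[gene] - trimmed[gene] for gene in initial}


def concatenated_ranges(trimmed, order):
    """1-based inclusive column range of each gene in the matrix."""
    ranges, start = {}, 1
    for gene in order:
        end = start + trimmed[gene] - 1
        ranges[gene] = (start, end)
        start = end + 1
    return ranges


def total_length(trimmed):
    """Total number of columns implied by the per-gene lengths."""
    return sum(trimmed.values())
```

Column ranges of this kind are what a partition file for the ML and BI analyses would need, although the ranges actually used in the study are not given in the text.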
This case is not unique in Chironomidae: the marine species Dicrotendipes sinicus Qi & Lin was initially suggested to represent a new genus within the subfamily Chironominae. However, analysis of genetic data revealed that it is nested within the genus Dicrotendipes (Qi et al. 2019).
To illustrate the systematic position of the species clearly, it was included in the molecular phylogeny of related genera. Surprisingly, the species identified morphologically as Einfeldia falls within the clade of Kiefferulus (Fig. 1). Clearly, the morphology-based identification conflicts with the molecular phylogeny. Considering the morphological homoplasy and phenotypic changes, we place the Einfeldia-like species in the genus Kiefferulus. Because the Einfeldia-type superior volsella is unique within Kiefferulus, the species is recognized as new and named Kiefferulus trigonum sp. nov.
When describing a species new to science, few taxonomists test its systematic position, because doing so is time-consuming and costly. Hierarchical classifications based on appropriate morphological characters provide the main backbone of the tree of life, while molecular data provide corroboration, resolution, and support (Scotland et al. 2003). In Chironomidae, genera defined and recognized by clear morphological characters, as in Wiederholm (1989) for adults, Andersen et al. (2013) for larvae, and Wiederholm (1986) for pupae, have not been tested with a full molecular phylogeny. Morphology alone is not enough to ensure correct placement, especially for some hyperdiverse or monotypic genera, and traditional taxonomy needs revision in light of molecular phylogeny.
Palpomere lengths (in µm): 38-55, 47; 115-153, 128; 123-163, 141; 170-245, 208. Length of 5th palpomere / 3rd palpomere 1.42-2.04, 1.61.

Remarks
Morphological characters such as the anal point narrow basally and broad distally, the superior volsella with microtrichia, and the distally constricted gonostylus, together with the molecular phylogeny, positively indicate the genus Kiefferulus. Morphologically, the new species closely resembles Einfeldia species in having a superior volsella with a pad-like, microtrichose and setose base and a finger-like projection directed inwards towards the apex of the anal point, which clearly distinguishes it from other species of Kiefferulus.