﻿A DNA barcode library for katydids, cave crickets, and leaf-rolling crickets (Tettigoniidae, Rhaphidophoridae and Gryllacrididae) from Zhejiang Province, China

﻿Abstract Barcode libraries are generally assembled with two main objectives in mind: specimen identification and species discovery/delimitation. In this study, the standard COI barcode region was sequenced from 681 specimens belonging to katydids (Tettigoniidae), cave crickets (Rhaphidophoridae), and leaf-rolling crickets (Gryllacrididae) from Zhejiang Province, China. Of these, four COI-5P sequences were excluded from subsequent analyses because they were likely NUMTs (nuclear mitochondrial pseudogenes). The final dataset consisted of 677 barcode sequences representing 90 putative species-level taxa. Automated cluster delineation using the Barcode of Life Data System (BOLD) revealed 118 BINs (Barcodes Index Numbers). Among these 90 species-level taxa, 68 corresponded with morphospecies, while the remaining 22 were identified based on reverse taxonomy using BIN assignment. Thirteen of these morphospecies were represented by a single barcode (so-called singletons), and each of 19 morphospecies were split into more than one BIN. The consensus delimitation scheme yielded 55 Molecular Operational Taxonomic Units (MOTUs). Only four morphospecies (Imax > DNN) failed to be recovered as monophyletic clades (i.e., Elimaeaterminalis, Phyllomimusklapperichi, Sinochloraszechwanensis and Xizicushowardi), so it is speculated that these may be species complexes. Therefore, the diversity of katydids, cave crickets, and leaf-rolling crickets in Zhejiang Province is probably slightly higher than what current taxonomy would suggest.


Introduction
Accurate specimen identification and species discovery are fundamental to taxonomic research and essential prerequisite for many fields of research such as ecology, biogeography, and conservation biology (Agapow et al. 2004;Collins and Cruickshank 2013). DNA barcoding using a standardized gene region (5´ region of the mitochondrial gene Cytochrome c oxidase subunit I, COI-5P) provide a powerful tool for specimen delimitation (Hebert et al. 2003). It can quickly distinguish species even with high morphological similarity, and it identifies cryptic genetic lineages within species, but it can fail if lineage sorting is incomplete (Yassin et al. 2009;Asis et al. 2016;Anderson et al. 2020). Specimen identification based on DNA barcodes does not rely on taxonomic expertise and can exclude the influence of human subjectivity in traditional morphological taxonomy. In recent years, increasing taxonomic practices have involved both morphological traits and DNA barcodes (DeSalle et al. 2005;Collado et al. 2021;Sabatelli et al. 2021). DNA barcodes have gained wide adoption for animal cryptic species recognition, species discovery, taxonomic revisions, and faunal assessments (Hebert et al. 2004, Tembe et al. 2014, Lone et al. 2020. Cryptic species generally refer to highly genetically differentiated, but morphologically indistinguishable species (Van Campenhout et al. 2015). The discovery of cryptic species was critical for assessing biodiversity (Kundu et al. 2019). In the last 20 years, numerous studies using DNA barcoding have revealed cryptic species in several insect groups, such as Lepidoptera (Schonrogge et al. 2002;Burns et al. 2008), Thysanoptera (Tyagi et al. 2017), Diptera (Gajapathy et al. 2016;Chan-Chable et al. 2019). In morphological stasis, cryptic species within a complex or sister group remain highly morphologically similar for long periods of time, even tens of millions of years (Struck et al. 2017). Cryptic species may represent morphological stasis among related species experiencing similar environment conditions, but it may also reflect frequent, recent and/ or rapid speciation (Cerca et al. 2020).
Effective identification of a query specimen through DNA barcode sequence requires reliable reference libraries of known taxa. The process of assembling comprehensive and high-quality reference libraries of DNA barcodes allows the identification of newly collected specimens and accelerates taxonomic progress. The use of DNA barcoding for specimen identification and species discovery is greatly facilitated by the Barcode of Life Data System (BOLD, http://www.boldsystems.org).
Much research has been done on Zhejiang katydid and related ensiferan groups (Wang and Tong 2014;Wu et al. 2014;Liu et al. 2018). Currently, 115 species of Tettigoniidae, 12 species of Gryllacrididae, and 18 species of Rhaphidophoridae have been recorded from Zhejiang Province, China (see Suppl. material 1). Here, we present the next step in building-up a DNA barcode reference library for the katydids, cave crickets, and leaf-rolling crickets from Zhejiang Province, China. These DNA barcodes can help greatly in flagging unusual specimens that merit more careful revision using morphological characters.

Sampling of specimens
Collections were performed throughout Zhejiang Province, China in the period of 2011-2019. Collection information (Fig. 1) can be found in the BOLD system under the public dataset DS-ZJCK. All specimens were preserved in absolute ethanol and identified by Yizheng Zhao using morphological traits, i.e., body shape, pronotum, and genitalia (Gorochov and Le 2002;Liu and Kang 2007;Guo and Shi 2012;Di et al. 2014;Jiao et al. 2014;Shi and Wang 2015;Bian and Shi 2016;Feng et al. 2016;Qin et al. 2016;Shi et al. 2016;Qin et al. 2017;Zhu and Shi 2018;Liu et al. 2019Liu et al. , 2021.

DNA extraction and COI barcode region sequencing
Total genomic DNA was extracted from hind legs of adults (N = 676) and nymphs (N = 5) using the Dneasy Blood and Tissue Kit (Tiangen Biotech, Beijing, China) according to the manufacturer's specifications. The remainder of the specimen was retained as a voucher stored at the Katydids Lab of Hebei University, China. The COI barcode region was amplified with primers COBU (5´-TYT CAA CAA AYC AYA ARG ATA TTG G-3´) and COBL (5´-TAA ACT TCW GGR TGW CCA AAR AAT CA-3´) (Pan et al. 2006). PCR amplification reactions were performed as follows. The 50 µL of PCR mix contained 25 µL of Premix Taq (TaKaRa), 5 µL of each primer, 3 µL of templated DNA and 12 µL of ddH 2 O. The PCR cycling protocol included an initial denaturation at 94 °C for 3 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 49 °C for 30 s, extension at 72 °C for 1 min, with a final extension at 72 °C for 8 min. All amplicons were sent to GENEWIZ (Tianjin, China) for bidirectional sequencing using ABI 3730XL DNA sequencers.

Data analyses
Forward and reverse sequences were trimmed, edited, and assembled to produce a consensus barcode sequence using SeqMan Pro (DNA star, Inc., Madison, Wisconsin, USA) for each specimen. All COI-5P barcode sequences were examined for potential stop codons using Editseq (DNA star, Inc., Madison, Wisconsin, USA). All sequences were aligned by employing MUSCLE (codons) algorithm (Edgar, 2004) with default parameters in MEGA ver. 7.0 (Kumar et al. 2016). The resulting alignments were cropped to a length of 658 bp. The COI-5P barcode sequences, trace files, and voucher information (i.e., collection data, photograph, taxonomic assignment) for each specimen are available in the BOLD dataset DS-ZJCK. All sequences meeting required quality criteria (> 500 bp, < 1% Ns, no stop codon or contamination flag) were assigned to a BIN by the BOLD system . Taxon ID Tree, BIN discordance, genetic distance analysis, and Barcode Gap Analysis were performed using analytical tools in BOLD ver.4 on 1 Dec., 2021. The NJ tree was generated on BOLD with the Taxon ID tree tool using a Kimura-2-Parameter model, which is the mostly applied model in DNA barcoding studies (Hebert et al. 2003). The NJ tree was visualized using FigTree ver.1.4.4 (Rambaut 2018). BIN Discordance analysis on BOLD employs the Refined Single Linkage (RESL) algorithm for assigning barcode sequences to MOTUs independent of the BIN registry . There were four possible patterns of association between Linnaean species and Barcode Index Numbers (BINs), e.g., MATCH, SPLIT, MERGE, and MIXTURE. It should be noted that the BIN system is dynamic and dependent on the underlying data. Intraspecific distances and Barcode Gap Analysis could only be calculated for the 55 non-singleton species. The Barcode Gap Analysis provides mean and maximum intraspecific variations and a minimum genetic distance to the nearest-neighbour species (i.e., minimum interspecific distance).
In addition to BIN Discordance analysis, we also used other molecular delineation methods to delineate MOTUs. To minimize the risk of oversplitting (Talavera et al. 2013), the dataset was collapsed to retain only unique haplotypes. Four species delimitation approaches were employed: Assemble Species by Automatic Partitioning (ASAP) (Puillandre et al. 2021), jMOTU (Jones et al. 2011), General Mixed Yule Coalescent (GMYC) (Fujisawa and Barraclough 2013), and bPTP (Zhang et al. 2013). ASAP analysis was performed on the Web interface (https://bioinfo.mnhn.fr/abi/public/asap/asapweb.html) applying the K2P model, using default parameters (Puillandre et al. 2021). The jMOTU analysis was performed at cutoffs from 1 to 40 bp, covering a range between 0.15% and 6.08% divergence across the 658 bp COI-5P barcode. The General Mixed Yule Coalescent (GMYC) is a likelihood method for delimiting species, which tries to find the threshold between divergence events at the species level (modelled by a Yule process) and coalescent events between lineages within species (modelled by the coalescent). Both single-threshold GMYC (sGMYC) (Pons et al. 2006) and multi-threshold GMYC (mGMYC) (Monaghan et al. 2009) were computed. The best-fit nucleotide evolution model GTR+F+G4 was chosen by ModelFinder (Kalyaanamoorthy et al. 2017) under the Bayesian Information Criterion (BIC). The Bayesian Inference (BI) tree used for GMYC analysis was constructed using BEAST (Drummond and Rambaut 2007) using the Yule model and a constant clock. We checked runs for convergence and proper sampling of parameters [effective sample size (ESS) >200] using Tracer ver.1.7.1 (Rambaut et al. 2018). The BI tree was converted to the Newick format using FigTree ver.1.4.4 (Andrew 2016). The R package SPLITS (Ezard et al. 2009) was used for sGMYC and mGMYC analyses. The bPTP analysis models species formation events based on the number of substitutions in a given branch (Zhang et al. 2013). We used the BEAST tree created above to compare the generated outputs. The bPTP analysis was run using an online web server (https:// species.h-its.org/ptp/) with default parameters except setting root tree, removing outgroup and MCMC generation = 500,000.
The results of different species delimitation methods were pairwise compared. Firstly, match ratio [2×N match /(N A +N B )] (Ahrens et al. 2016), where N match is the num-ber of molecularly delimited species using two different methods exactly matching, N A and N B is the number of delimited species by methods A and B, respectively. Secondly, taxonomic index of congruence [C tax (AB)= n(A∩B)/n(A⋃B)] (Miralles and Vences 2013), where A∩B represents the number of speciation events shared by methods A and B, and A⋃B represents the total number of speciation events inferred by method A and/or B. Thirdly, relative taxonomic resolving power index [R tax A = nA/ n(A⋃B⋃C⋃D⋃E)] (Miralles and Vences 2013), where A, B, C, D, E represent the five species delimitation methods tested, nA represents the number of speciation events inferred by method A, and the denominator represents the cumulative number of speciation events inferred by all methods. Although large R tax implies small type II error, it does not necessarily imply correct delimitations (i.e., can lead to over splitting) (Blair and Bryson 2017).

Results
The COI-5P of 681 specimens of katydids, cave crickets, and leaf-rolling crickets were sequenced. Among these specimens, 601 (88.25%) specimens were identified to 69 morphospecies (formally described species that are typically defined by distinct morphological characters) and the remaining 80 specimens were only identified at genus level (Tables 1, 2 and 3). The number of specimens per species ranged from one (14 singletons) to 58 in Gampsocleis sinensis Walker, 1869. Approximately half of these 69 morphospecies have five or more DNA barcodes. All sequences met the quality criteria (< 1% N and length > 500 bp) for BIN assignment. No insertions or deletions were observed.

Removal of problematic specimens
The preliminary "BIN Discordance" analysis (using BOLD ver.4 on 28 Dec., 2021) revealed five cases of merging, where each of the five BINs included two species from different genera or higher taxonomic taxa (Table 1). Species pairs in these five cases were distinctly morphologically different (Fig. 2). Five sequences located in apparently wrong positions on the NJ tree. To exclude contamination, DNA extraction from different leg and sequencing of these samples were repeated. Repeated experiments revealed that Conocephalus gladiatus Redtenbacher, 1891 DBTZC033-21 appeared Table 1. Results of the internal BIN discordance report for the five BINs of 83 specimens. # sequences have been resubmitted, * possibly NUMT coamplification.
to have resulted from experimental operation errors, and the remaining four cases could not be explained by contamination or lab errors. We had updated the sequences of C. gladiatus DBTZC033-21 prior to our analysis, and it clustered with other C. gladiatus specimens on the NJ tree. It sharing ADE4649 with Diestramima austrosinensis Gorochov, 1998 was the result of initial BIN assignment based on the previous incorrect sequence. Four records (Orophyllus montanus Beier, 1954 RBTC2009-18, Phryganogryllacris DBTZC097-21, Sinochlora szechwanensis Tinkham, 1945 RBTC2050-18, and Tegra novaehollandiae Haan, 1843 DBTZC057-21) were highly likely COI-5P nuclear mitochondrial pseudogene (NUMT) intrusions and were excluded from our final dataset, since they each grouped separately from other individuals of the same species in the NJ tree (Fig. 5). Subsequent analyses focused on 677 barcode sequences, which were collapsed into 360 unique haplotypes. These records belong to three families, including Gryllacrididae (N = 35), Rhaphidophoridae (N = 23), and Tettigoniidae (N = 619).

Genetic divergence
Genetic distances for the resulting sequences were calculated in the BOLD System Distance Summary and Barcode Gap Analysis tools based on the K2P model. Table 4 provides sequence divergences (K2P) for differing levels of taxonomic affinity. The maximum intraspecific genetic distances (I max ) of the 55 non-singleton species averaged 3.17% (range 0-21.64%), in which 24 species were above 2% (Table 2). Fourteen species are represented by only a single record, not allowing us to estimate intraspecific divergence. The genetic distance to the nearest neighbour (DNN) averaged 13.14% (ranging 3.31-19.38%), with the minimum nearest-neighbour distance occurring between Xizicus laminatus Shi, 2013 andXizicus howardi Tinkham, 1956 (Table 2). Not a single haplotype was shared between species within our DNA barcode library. A barcode gap was present in 51 of 55 (92.73%) non-singleton species. Intraspecific distances were inflated by the presence of very high variation within some taxa, resulting in no significant barcode gaps (Fig. 3). The maximum intraspecific distance was higher than its nearest-neighbour distance in four species, including Elimaea terminalis Liu, 1993 Fig. 3). Eighteen of 24 species with deep intraspecific divergence (K2P model, I max > 2%) were split into two or more BINs (Table 2). Interestingly, Ruspolia dubia Redtenbacher, 1891 were also split into two BINs, although the intraspecific divergence was relatively low (I max = 1.55%).

Barcode index numbers (BINs) assignment and species delimitation
For the final dataset, 677 COI-5P records were assigned to 118 BINs that belong to 90 putative taxa. Among these, 68 corresponded to morphospecies, while another 22 belonged to a unique BIN that was currently only identified at genus level and highly likely to represent an unrecognized species. Of 68 morphospecies defined by morphology, a total of 49 contained only a single BIN, while 19 were represented by multiple BINs In contrast, specimens of the remaining three genera were heterogeneous and split into two or more BINs: Elimaea BOLD:ADE1399, ADM8940, and ADB3477; Atlanticus BOLD:ADB5602, ADB6974, ADR7192, ADE2402, ADB3445, ADE1821, and ADB3462; Kuwayamaea BOLD:ADB4962, ADE2183, ADE1620, ADB6899, ADB4961, ADB5240, and ADB4960. The NJ tree was employed to assess support for detected BINs, not to reconstruct the phylogenic relationships. The NJ tree showed the majority of non-singleton species and BINs were recovered as monophyletic (Fig. 5). All BIN species represented by two or more specimens, except ADE4649, formed a monophyletic lineage. High intraspecific divergence values also reflected deep splits in the NJ tree. All non-singleton morphospecies are clearly distinguishable through COI-5P, forming non-overlapping clades except for several species with deep intraspecific divergence exceeding DNN, namely Elimaea terminalis (I max = 10.68%, DNN = 6.98), Melaneremus fuscoterminatus (I max = 14.73%, DNN = 3.78%), Sinochlora szechwanensis (I max = 8.71%, DNN = 5.68%) and Xizicus howardi (I max = 21.64%, DNN = 3.31%) (Fig. 5, Table 2). ASAP analysis identified 99 MOTUs with an asapscore of 9.00 (Fig. 5). jMOTU analysis delimited 101 at a 20 bp (3%) cut-off divergence (Fig. 5). The GMYC single-threshold method estimated 105 MOTUs, while the GMYC method with multiple thresholds delimited 132 MOTUs (Fig. 5). The bPTP analysis delimited 119 and 120 MOTUs based on the maximum likelihood and highest Bayesian supported analyses, respectively (Fig. 5).
Capnogryllacris melanocrania showed deep intraspecific divergence (I max = 5.01%), and was split into four BINs (ADF2751, AEJ4972, ADF2750, AEJ9445), and these four BINs formed nearest-neighbour clusters. All species delimitation methods treated C. melanocrania as three MOTUs (ADF2750 and AEJ9445 were placed in a single MOTU, while ADF2751 and AEJ4972 were each placed in their own MOTU),   (Table 2). Three Sinochlora species formed two clades: one was composed of specimens identified as S. szechwanensis (ADB3463) and S. sinensis, and the other was composed of S. szechwanensis (ACI0121) and S. longifissa (Fig. 4). Phyllomimus klapperichi showed deep intraspecific divergence (I max = 17.47%), and was split into three BINs (ADM7559, ADB9999, and ADB4775) (Fig. 4), reflecting three distinctly different subclusters of the P. klapperichi cluster on the NJ tree. All species delimitation methods split P. klapperichi into three MOTUs, except for mGMYC that split P. klapperichi into four MOTUs. Elimaea terminalis showed deep intraspecific divergence (I max = 10.68%) and was split into two BINs (ADB3392 and ADB3394). Two E. terminalis BINs corresponded to two clades in the NJ tree: one contained three E. terminalis ADB3392 specimens and was sister to Elimaea ADM8940, whereas the other contained three E. terminalis ADB3394 specimens, which was sister to Elimaea ADB3477 and Elimaea annamensis Hebard, 1922 (Fig. 4). All species delimitation methods treated two E. terminalis BINs as separate MOTUs. Both Xizicus howardi (I max = 21.64%) and X. szechwanensis (Max Intra-Sp = 4.8%) showed deep intraspecific divergence, and were split into four BINs (ACD5539, ADE3141, ADB5688, and AEJ3139) and two BINs (ADB3348 and ADE0823), respectively. Five Xizicus species formed three clades on the NJ tree: the first composed by specimens identified as X. concavilaminus, X. laminatus, and three X. howardi BINs (ACD5539, ADE3141, ADB5688); the second composed of all X. szechwanensis specimens and X. howardi AEJ3139, and the third composed of only a single X. biprocerus. All species delimitation methods except mGMYC revealed consistent results with BIN assignments. mGMYC split X. howardi ADB5688 as two MOTUs (Figs 4, 5). Therefore, X. howardi in Zhejiang might be a species complex of at least four species. Detailed comparative analyses of additional specimens was needed to evaluate the taxonomic status of X. howardi. Although Ruspolia dubia (I max = 1.55%) was split into two BINs (ACD5503, ADE5391), all species delimitation methods treated R. dubia as a single MOTU. Tachycines meditationis (I max = 3.31%) was split into three BINs (AEJ6894, AEK0279, AEJ9615). mGMYC, bPTP-ML, bPTP-BI approaches agreed on the subdivision of T. meditationis BINs while ASAP, jMOTU, and sGMYC analyses treated T. meditationis as a single MOTU. A BIN was assigned to the Match category when all of its specimens were assigned to single MOTU. Fiftyfive out of 90 species-level taxa were recovered by all species delineation methods, suggesting that they may be a single species. R tax values ranged from 0.71 for ASAP to 0.94 for mGMYC (  Tables 5. Calculation of match ratio, taxonomic index of congruence (C tax ), and relative taxonomic resolving power index (R tax ) for different species delimitation methods. The lower triangle shows C tax , and the upper triangle shows Match ratio.

Discussion
In the past several hundred years, species diagnostics have been traditionally based on morphological characterizations. Morphology-based specimen identification is time consuming and requires high levels of taxonomic expertise. Compared with traditional taxonomy, DNA barcoding is a fast and inexpensive method for species identification.
Numerous studies have revealed cryptic species using DNA barcodes (Kondo et al. 2016;Lassance et al. 2019;Farkas et al. 2020). However, using only DNA barcodes may lead to classification errors, and it is important to combine morphology and barcodes. The utility of DNA barcoding heavily depends on the taxonomic coverage of an associated DNA barcode reference library. Barcode libraries are generally assembled with two main objectives in mind: specimen identification and aiding species discovery/delimitation (Knebelsberger et al. 2014;Blagoev et al. 2016;Khamis et al. 2017;Ashfaq et al. 2019;Delrieu-Trottin et al. 2019;D'Ercole et al. 2021). The significant increase in studies of specific insect taxa using DNA barcodes in recent years, especially in some regions, has laid the foundations for building a comprehensive library of DNA barcodes at the continental-scale (D'Ercole et al. 2021;Dincă et al. 2021;Pesic et al. 2021). Only a few barcode studies of katydids and related ensiferan groups have been conducted in China, South Korea, Central Europe (Guo et al. 2016;Hawlitschek et al. 2017;Zhou et al. 2019;Kim et al. 2020). Our study provided 677 COI-5P barcode sequences, including 68 morphospecies and 80 specimens only identified to genus level.
BIN sharing between different species might be explained by mitochondrial introgression following hybridization, recent divergence with or without incomplete lineage sorting, inadequate taxonomy, misidentification (Geiger et al. 2021). One large-scale study for European Lepidoptera showed that more than half (58.6%) of the detected cases of non-monophyletic species are likely to be due to operational factors such as misidentification, oversplitting of species, overlooked synonymies or potential cryptic species (Mutanen et al. 2016). For the DNA barcode library of Central and Northern European Odonata, six of 31 BINs containing records of mixed taxonomic annotations conflict at generic levels, which is most likely due to misidentification, sample mix-up in the laboratory, sample number mix-up of specimens, or nomenclatural changes not applied to all affected datasets in BOLD (Geiger et al. 2021). Our previously mentioned example of BIN sharing between Conocephalus gladiatus DBTZC033-21 and Diestramima austrosinensis was caused by a sample confusion. The accuracy of DNA barcoding can be severely impacted when there are atypical NUMTs that lack the characteristic mutations (including in-frame stop codons and indels), which were difficult to identify and remove from the barcode dataset. NUMTs are rarely reported in DNA barcoding studies, despite a fairly frequent abundance across various insect groups (Hausberger et al. 2011;Jordal and Kambestad 2014;Hawlitschek et al. 2017). Our study also revealed four records shared with other species, which were highly likely the erroneous amplification of nonfunctional nuclear copies of COI-5P. The specimens of different species were admixed in a single cluster on the NJ tree, which often arises as the result of misidentification, contamination, or NUMTs (Mutanen et al. 2016). Previous studies found that NUMTs are coamplified using universal primers LCO1490/HCO2198, even across families: Anabrus simplex (Tettigoniidae) vs. Schistocerca americana (Acrididae) (Moulton et al. 2010). Many NUMTs not having stop codons or indels may represent mitochondrial heteroplasmy, but this is a highly unusual phenomenon in insects (Jordal and Kambestad 2014).
Our analyses revealed 19 of 55 non-singleton morphospecies (34.55%) with multiple BINs. Most of these intraspecific BINs formed nearest-neighbour clusters to each other, reflecting the discrimination of geographical subclades within a currently recognized species. Previous studies have shown that BINs provide a very good reflection of classical taxonomy (Hausmann et al. 2013). For example, our prior study has shown a three-quarter species-BIN correspondence in katydids from China (Zhou et al. 2019). Species with BIN splits and high divergences are likely to represent a cryptic species complex (Ashfaq et al. 2019). Likewise, high levels of 'intraspecific' barcode variation also reflect overlooked species, but there is no fixed level of divergence that indicates species status (Huemer et al. 2020). Although the presence of a barcoding gap, intraspecific variation threshold, or monophyly of each putative species are sufficient conditions to ensure specimen correct identification, these are not essential criteria (Meyer and Paulay 2005;Yang and Rannala 2017).
Applying multiple species delimitation methods to the same dataset can provide a more reliable picture of species-level clustering. We obtained more MOTUs based on both the distance-based species-delimitation (ASAP, jMOTU) and the phylogeny-based methods (GMYC and bPTP) than the number of morphospecies. Several species with deep intraspecific divergence were split into more than one MOTU, and most of these additional BINs formed nearest-neighbour subclusters on the NJ tree. It was worth exploring the large intraspecific genetic distances for the same species although they were clustered together. Inconsistencies in delimitation results occur frequently as the result of different species delimitation methods. The mGMYC analysis produced a considerably higher number of MOTUs than other methods. R tax values ranged from 0.71 for ASAP to 0.94 for mGMYC ( Table 5), suggesting that mGMYC may overestimate the number of species. The performance of phylogeny-based methods is sensitive to multiple factors, such as general phylogenetic history, sampling intensity, DNA sequence length, speciation rate, and differences of effective population size among species (Esselstyn et al. 2012). The number of species can be underestimate or overestimate with ancestral polymorphism (Esselstyn et al. 2012), but previous studies showed that the sGMYC performs better than mGMYC (Talavera et al. 2013). Three indicators (Match ratio, C tax , and R tax ) suggest that different species definition methods also diverge in terms of the location of species boundaries (Miralles and Vences 2013). Concordance among results of different species delimitation methods revealed that both P. klapperichi and E. annamensis may contain undocumented cryptic species. At present, we speculate that this is due to allopatric isolation due to the mountainous barriers between the samples. Despite the lack of clear morphological differentiation, the geographically and genetically distinct clusters suggest the existence of cryptic diversity. Therefore, our study indicates that the diversity of katydids, cave crickets, and leaf-rolling crickets in Zhejiang Province is slightly higher than the currently accepted taxonomy would suggest. The concordance among different species delimitation methods often implies higher reliability and should be used as primary taxonomic hypotheses that are subsequently tested with other types of data as part of an integrative taxonomic framework (Fujita et al. 2012;Blair and Bryson 2017).

Conclusions
Our DNA barcode library represents an important step for the molecular characterization of katydids, cave crickets, and leaf-rolling crickets in Zhejiang, China. Although some specimens still lack a Linnean name, their BIN assignments are treated as putative species in ecology, conservation biology and other biodiversity research (Sharkey et al. 2021). The number of detected BINs higher than traditionally accepted species suggests that DNA barcoding will complement morphology-based taxonomic system by revealing overlooked species complexes (Schmidt et al. 2015). The consensus delimitation scheme yielded 55 MOTUs, each of which may be a single species. Only three species (I max > DNN) failed to be identified as monophyletic (e.g., Elimaea terminalis, Sinochlora szechwanensis, and Xizicus howardi), so we speculate that these may be species complexes. If a species is split into two or more MOTUs implying cryptic diversity, then the number of katydid species in Zhejiang may be more than what is currently identified. However, prior to formal taxonomic changes, results should be subsequently tested using an integrative approach. This Barcode library was effective in assigning newly encountered specimens to either one or a few closely allied species. We expect it to be useful for future katydid taxonomic and conservation work.