DNA barcoding as a screening tool for cryptic diversity: an example from Caryocolum, with description of a new species (Lepidoptera, Gelechiidae)

Abstract
We explore the potential value of DNA barcode divergence for species delimitation in the genus Caryocolum Gregor & Povolný, 1954 (Lepidoptera, Gelechiidae), based on data from 44 European species (including 4 subspecies). Low intraspecific divergence of the DNA barcodes of the mtCOI (cytochrome c oxidase 1) gene and/or distinct barcode gaps to the nearest neighbor support species status for all examined nominal taxa. However, in 8 taxa we observed deep splits with a maximum intraspecific barcode divergence beyond a threshold of 3%, thus indicating possible cryptic diversity. The taxonomy of these taxa has to be re-assessed in the future. We investigated one such deep split in Caryocolum amaurella (Hering, 1924) and found it congruent with previously unrecognized diagnostic morphological characters and specific host-plants. The integrative species delineation leads to the description of Caryocolum crypticum sp. n. from northern Italy, Switzerland and Greece. The new species and its hitherto intermixed closest relative C. amaurella are described in detail, adults and genitalia of both species are illustrated, and a lectotype of C. amaurella is designated; a diagnostic comparison with the closely related C. iranicum Huemer, 1989 is added.


Introduction
The genus Caryocolum Gregor & Povolný, 1954 is one of the most species-rich genera of European Gelechiidae (Huemer and Karsholt 2010). Having been revised in monographic papers (Klimesch 1953, Huemer 1988), its taxonomy seemed well established. However, in the last decade new species were found in, e.g., Sicily, southern France and Greece (Bella 2008, Grange and Nel 2012, Huemer and Nel 2005, Huemer and Karsholt 2010), raising the number of described species to 51. Most of the species are considered indisputable based on their morphology and distinct biology; as far as known, these species are closely linked to Caryophyllaceae as their exclusive larval host-plant family. We investigate, for the first time in Caryocolum, the congruence of traditional morphological species delineation and molecular data from the COI barcode region for a vast majority of the European fauna, covering altogether 44 species, including four subspecies. Surprisingly, the potential for cryptic diversity proved extraordinarily high for a supposedly well-known genus and we newly describe one of the hitherto overlooked species.

Material and methods
Extensive generic descriptions and diagnoses of European species of Caryocolum have been published in several reviews, particularly Huemer and Karsholt (2010) and Huemer (1988), and are thus not repeated here.
Specimens. Our study is based on about 50 specimens of the Caryocolum amaurella (Hering, 1924) species-group and an uncounted number of European Caryocolum, exceeding 1000 specimens, of which only a part was used for genetic analysis (see below). Most of the material was traditionally set and dried or alternatively spread; a few specimens are only pinned. Genitalia preparations followed standard techniques (Robinson 1976) adapted for male genitalia of Gelechiidae and (some) female genitalia of Caryocolum by the so-called "unrolling technique" (Pitkin 1986, Huemer 1987).
DNA Barcodes. Full-length lepidopteran DNA barcode sequences are a 648 basepair long segment of the 5' terminus of the mitochondrial COI gene (cytochrome c oxidase 1). DNA samples (dried legs) were prepared according to the accepted standards. Legs from 250 specimens of Caryocolum were processed at the Canadian Centre for DNA Barcoding (CCDB, Biodiversity Institute of Ontario, University of Guelph) to obtain DNA barcodes using the standard high-throughput protocol described in deWaard et al. (2008). Sequences longer than 500 bp were included in the analysis.
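The length criterion stated above can be illustrated with a short Python sketch. It is not the procedure used at CCDB: the function name `classify_barcode`, the three category labels, and the decision to count only unambiguous A/C/G/T bases towards the length are assumptions made for illustration.

```python
FULL_LENGTH_BP = 648  # full-length lepidopteran barcode, as stated in the text
MIN_LENGTH_BP = 500   # only sequences longer than this entered the analysis


def classify_barcode(seq: str) -> str:
    """Return 'full', 'partial' or 'excluded' for one barcode sequence.

    Assumption: gaps ('-') and ambiguous bases (e.g. 'N') do not count
    towards the length; the source does not say how they were treated.
    """
    length = sum(1 for base in seq.upper() if base in "ACGT")
    if length >= FULL_LENGTH_BP:
        return "full"
    if length > MIN_LENGTH_BP:
        return "partial"
    return "excluded"
```

Under these assumptions, "full" and "partial" sequences would be analysed, whereas "excluded" sequences would be left out.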
Successfully sequenced voucher specimens are listed in Suppl. material 1. Sequences were submitted to GenBank; further details including complete voucher data and images can be accessed in the public dataset "Lepidoptera of Europe Caryocolum" dx.doi.org/10.5883/DS-LECARY in the Barcode of Life Data Systems (BOLD; Ratnasingham and Hebert 2007). Degrees of intra- and interspecific variation in the DNA barcode fragment were calculated under the Kimura 2-parameter (K2P) model of nucleotide substitution using analytical tools in BOLD systems v3.0 (http://www.boldsystems.org). A neighbour-joining tree of DNA barcode data of European taxa was constructed using MEGA 5 (Tamura et al. 2011) under the K2P model for nucleotide substitutions.
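For readers unfamiliar with the K2P model, the following minimal Python sketch shows how such a distance is obtained for one pair of aligned sequences. It uses the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` and the pairwise deletion of gapped or ambiguous sites are assumptions; the tools in BOLD and MEGA may treat such sites differently.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (pairwise deletion; an assumption, not from the source).
    Returns a proportion; multiply by 100 for a percentage.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

At the low divergences discussed here the K2P distance is only slightly larger than the uncorrected proportion of differing sites.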
Photographic documentation. Photographs of the adults were taken with an Olympus SZX 10 binocular microscope and an Olympus E 3 digital camera and processed using the software Helicon Focus 4.3 and Adobe Photoshop CS4 and Lightroom 2.3. Genitalia photographs were taken with an Olympus E1 digital camera through an Olympus BH2 microscope.

Abbreviations of institutional collections BMNH

Molecular analysis
Forty-four of 51 European species were successfully sequenced, resulting in a full-length barcode fragment for 191 specimens and more than 500 bp for a further 26 specimens (Fig. 1, Table 1, Suppl. material 1). Nine shorter sequences were not included in the analysis and sequencing of 24 specimens failed. The maximum intraspecific K2P distance varies from 0% in several species to 6.27% in C. fibigerium. Ten species have a high maximum intraspecific divergence greater than 2%. In six species (newly described species excluded) with a medium divergence greater than 3% potential cryptic diversity should be investigated. Furthermore, the intraspecific divergence of more than 3% in C. schleichi, a species separated into 3 allopatric subspecies, is beyond the variation typically found within species, supporting the status of these subspecies as valid species. The only other subspecies we have examined are nominotypical C. marmorea and the recently separated C. marmorea mediocorsa, with a very low divergence of 0.3%. Sequences of the COI barcode region of all analysed morphospecies reveal significant interspecific genetic distances with barcode gaps ranging from a minimum of 3.11% to the nearest neighbour (C. pullatella - C. marmorea) to a maximum of 6.61% (C. saginella - C. cauligenella).
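The two quantities reported for each species, maximum intraspecific distance and distance to the nearest neighbour, can be derived from a pairwise distance matrix roughly as sketched below. This is an illustration in Python, not the BOLD analysis tools used in the study; the function name `barcode_summary`, the input layout and the output field names are assumptions. The 3% threshold is the one used in the text.

```python
from itertools import combinations

DEEP_SPLIT_THRESHOLD = 3.0  # percent, threshold used in the text


def barcode_summary(labels, dist):
    """Per-species maximum intraspecific and nearest-neighbour distances.

    labels: list of species names, one per specimen
    dist:   square symmetric matrix (list of lists) of pairwise
            distances in percent, in the same order as labels
    """
    summary = {
        sp: {"max_intra": 0.0, "nn_dist": float("inf"), "nn": None}
        for sp in set(labels)
    }
    for i, j in combinations(range(len(labels)), 2):
        a, b, d = labels[i], labels[j], dist[i][j]
        if a == b:
            # same species: candidate for the maximum intraspecific distance
            summary[a]["max_intra"] = max(summary[a]["max_intra"], d)
        else:
            # different species: candidate nearest neighbour for both
            for own, other in ((a, b), (b, a)):
                if d < summary[own]["nn_dist"]:
                    summary[own]["nn_dist"] = d
                    summary[own]["nn"] = other
    for stats in summary.values():
        stats["deep_split"] = stats["max_intra"] > DEEP_SPLIT_THRESHOLD
        stats["barcode_gap"] = stats["nn_dist"] > stats["max_intra"]
    return summary
```

Species represented by a single specimen receive a maximum intraspecific distance of 0.0 in this sketch, so their intraspecific variation should be treated as not assessed rather than as low.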

Taxonomy
The Caryocolum amaurella species-group as defined by Huemer (1988) differs from other congeners mainly by the characteristic shape of the sacculus, which is unique in the genus. Until now it only included C. amaurella and C. iranicum (Huemer 1988, 1989b). Based on the DNA barcode divergence and diagnostic morphological characters combined with biological data we describe the new species C. crypticum. Due to the mix-up of C. crypticum with C. amaurella in recent identification guides the latter species is also re-described here in detail.
Caryocolum crypticum sp. n.
http://zoobank.org/5E1FB9E5-3A65-49C6-80BF-A5CA7C4FFF99
http://species-id.net/wiki/Caryocolum_crypticum
Figs 2-3, 6-7, 10-11, 14-15
Type material. Holotype: ♀ (Fig. 2)
Diagnosis. Caryocolum crypticum sp. n. is externally similar to several other species of the genus and can be best recognized by the largely unmarked forewings with cream costal and tornal spots. From its closest relatives C. amaurella and C. iranicum it differs by the rusty brown distal half of the thorax and the concolorous tegulae, the dark brown forewings with rusty brown scales, and the cream colours of the costal and tornal spots. The male genitalia of C. crypticum are very similar to those of C. amaurella but the valva is more slender and slightly longer (see Figs 6-7, 10-11 versus 8-9, 12-13). The similar C. iranicum differs by the shape of the sacculus with almost straight dorsal margin (see Huemer 1989b: Figs 14-16). However, the most striking diagnostic characters of the new species are found in the female genitalia which differ from C. amaurella particularly by the short lateral sclerites of the ductus bursae and the much longer and more slender signum hook (see Figs 14-15 versus 16-17). The female genitalia furthermore differ from C. iranicum by the weakly cup-shaped rather than funnel-shaped antrum, shorter lateral sclerites of the ductus bursae, and the shorter apophysis anterior which is almost twice the length of segment VIII in C. iranicum.
Description. Adult (Figs 2-3). Wingspan 10.5-14 mm. Segment 2 of labial palpus with a few cream-coloured scales on inner and upper surface, blackish brown on outer and lower surface; segment 3 almost black with light tip. Antenna black, indistinctly lighter ringed. Head with light yellow frons and black neck; thorax blackish brown with rusty brown posterior part; tegulae rusty brown except for blackish brown base. Forewing blackish brown, mottled with some rusty brown, particularly in proximal half; supplementary black spots in fold and in cell obscure; costal and tornal spot small, cream, separated. Hindwing light grey.
Variation. No variation observed except for size, which differs considerably in two reared specimens from Italy and Greece.
Female genitalia (Figs 14-15). Segment VIII without processes, subgenital plate sub-triangular, with numerous narrow folds, separated from sclerotized lateral plates by membranous zone; apophysis anterior about length of segment VIII; antrum short, about one quarter length of apophysis anterior, nearly cup-shaped; posterior part of ductus bursae with pair of short sclerites, extending to middle of apophysis anterior, and with two tiny sclerites anteriorly; signum with crescent-shaped base, long and slender, strongly bent hook.
Molecular data. The intraspecific divergence of the barcode region is low with mean intraspecific divergence of 0.21% and maximum intraspecific divergence of 0.31% (n=3). The distance to the nearest neighbour C. mucronatella is 5.41%, the divergence to the morphologically closest C. amaurella is 6.82%.
Etymology. The name "crypticum" refers to the cryptic morphology of the species and is derived from the latinized adjective crypticus.
Distribution. The species is known from widely separated localities in northern Italy, Switzerland and Greece, indicating a more widespread distribution in Sub-Mediterranean and Mediterranean Europe. However, the host-plants are much more widespread, ranging to northern Europe in the north and to Central Asia in the east. No sympatric occurrence with C. amaurella is reported though the two taxa can occur close to one another in the Alps.
Bionomics. The larva has been found in early spring, feeding in the stem of Silene otites (L.) Wibel (Caryophyllaceae) (Burmann 1990) and Silene nutans L. (Huemer 1989a) but detailed descriptions of feeding habits and larval morphology are missing. The adult occurs from early July (reared material dates from mid-June to mid-July) to September and it is attracted to light. C. crypticum prefers xerophilous steppes and rocky habitats with sparse vegetation. Vertical distribution: from about 500 to 1300 m, restricted to mountainous areas.
Figures 6-7. Male genitalia. 6 Caryocolum crypticum sp. n., paratype, Italy, slide GU 86/041 P.Huemer 7 C. crypticum sp. n., paratype, Italy, slide GEL 1215 P.Huemer.
Remarks. Huemer (1988) already examined females reared from Silene otites in Switzerland by Whitebread but in the absence of males considered them as deviating C. amaurella.

Caryocolum amaurella
Diagnosis. See above.
Description. Adult (Figs 4-5). Wingspan 10-14 mm. Segment 2 of labial palpus bone-white on inner and upper surface, blackish grey on outer and lower surface; segment 3 almost black with light tip. Antenna black, indistinctly lighter ringed. Head with light yellow frons and black neck; thorax and tegula black mottled with brown. Forewing blackish grey mottled with some light brown; base black; two indistinct black spots in fold; one oblique spot above it and one in cell; some white scales before and after these spots; costal and tornal spot small, white, rarely fused. Hindwing light grey.
Variation. The colour of the forewings varies from greyish to blackish. Worn specimens look lighter than fresh ones. Sometimes there are no white scales in the middle of the wing.

Molecular data. The intraspecific divergence of the barcode region is high with mean intraspecific divergence of 3.01% and maximum intraspecific divergence of 4.62% (n=9). The distance to the nearest neighbour C. mucronatella is 5.21%, the divergence to the morphologically closest C. crypticum is 6.82%. The extraordinarily high intraspecific divergence with 4 haplotypes is partially related to geographical pattern. However, we also found two haplotypes within one population in Finland and morphology does not support cryptic diversity.
Distribution. With certainty known from scattered records from northern and Central Europe and Turkey. All the specimens from north of the Alps that we have been able to cross-check are correctly attributed to C. amaurella. However, recent records from Ukraine (Bidzilya and Budashkin 2009) and Russia (southern Ural Mountains) (Junnilainen et al. 2010) have to be re-examined due to a possible mix-up with C. crypticum. Records from Switzerland are dubious, and at least in one instance refer to the new species, whereas those from France (Nel 2003) are confirmed (see Huemer and Karsholt 2010, Fig. 154c).
Bionomics. The larva has been recorded feeding on Silene viscaria (L.) Jess (= Lychnis viscaria L.) (Caryophyllaceae) (Huemer and Karsholt 2010), while the other stated host-plants, namely Silene otites (L.) Wibel (Burmann 1990) and S. nutans L. (Huemer 1989a), refer to C. crypticum. Schütze (1926, 1931) gives a detailed account of the life-history. The larva feeds in April and May in the young terminal leaves, which are, without spinning, attached to a tube where the larva is hidden. Dark frass is frequently extruded at the tip of the larval dwelling. Later it bores into the stem and the shoots often become swollen and stunted. Pupation takes place on the ground in a cocoon among debris. The adult occurs from late June to early September and it is attracted to light. C. amaurella is restricted to warm and sunny habitats such as dry meadows and pastures. Vertical distribution: from lowland localities to about 2200 m in the Alps.
Remarks. Lita amaurella was described from an unspecified number of specimens of both sexes ('♂, ♀') from Finland (Bromarf) (Hering 1924). In order to stabilize nomenclature, a male, labelled as type, in ZMUH is here designated as lectotype (see data above). Lita viscariae was described from 67 specimens reared from Silene viscaria from Eastern Germany (near Rachlau) (Schütze 1926). No type material was traced during this and earlier studies (Huemer 1988), but the original descriptions and topotypical material leave no doubt about the identity.
Turkish specimens of C. amaurella examined by us differ from European specimens of this species by the thorax with rusty brown posterior part and the rusty brown tegulae with blackish brown base, similar to C. crypticum, and they are thus hardly separable from the latter on external characters. The genitalia of both sexes of C. amaurella from Turkey agree in all details with those of European C. amaurella and, because no contradicting genetic data is currently available, we consider them as belonging to that species.
One of the examined specimens of C. amaurella from Turkey was collected in the same locality (Kizildaĝ Geçidi, prov. Erzincan) as a specimen of C. iranicum in ZMUC. The latter species, which is only known from a few specimens, differs, as stated above, in characters of the male genitalia.

Discussion
The genus Caryocolum is a rare example of European Microlepidoptera which has gained significant attention from specialists during the last decades. Several monographic papers, from Klimesch (1953-54) to Huemer and Karsholt (2010), are a sound base for a stable taxonomy and a prerequisite to test congruence of classical morphologically driven species delineation with that of molecular data. DNA barcoding has evolved as a widely accepted method for preliminary species delimitation (Monaghan et al. 2009, Hendrich et al. 2010, Kekkonen and Hebert 2014) and therefore the animal DNA barcode region seemed an appropriate genetic marker to be used for this purpose. Indeed, barcoding resulted in excellent support for all of the 44 studied species, with a distinct barcode gap to the nearest neighbour ranging from about 3% to nearly 7% interspecific divergence. Intraspecific variation shows a different pattern. The majority of species has a low (<2%) maximum intraspecific divergence and thus seems taxonomically well defined. However, a remarkable number of species (8 species, nearly one quarter of all when the 9 species with only one sample are not considered) is characterized by maximum divergence exceeding 3% (Fig. 1). Such deep intraspecific splits often suggest the possibility of cryptic diversity (for examples in Lepidoptera, see Dinca et al. 2011, Hausmann et al. 2009, Huemer and Hebert 2011, Huemer et al. 2012, Kaila and Mutanen 2012, Landry and Hebert 2013, Mutanen et al. 2012a, b, 2013, Segerer et al. 2011, Wilson et al. 2010). A morphological cross-check in one of these taxa, Caryocolum amaurella, proved the existence of a hitherto overlooked species with validity independently supported by morphology, biological data, and the DNA barcode. The potential of DNA barcoding for screening of cryptic diversity is obvious in this case, where morphological characters, particularly the normally well-separated male genitalia, are weak and thus have been neglected so far.
Although deep intraspecific splits may alternatively reflect mitochondrial introgression, historical polymorphism or Wolbachia infection (Hurst and Jiggins 2005, Funk and Omland 2003), there is a considerable possibility of further cryptic diversity in the genus. In C. schleichi it seems most appropriate that the three sequenced subspecies should be considered as different species since host-plants and genitalia morphology differ as well (see e.g. Huemer and Karsholt 2010). The subspecies of C. schleichi are geographically isolated, making their delimitation both rather artificial and very sensitive to the species concept applied (Mutanen et al. 2012c). An integrative revision of this group is in preparation by the authors. In contrast, the expected low divergence in subspecies is reflected by a very low divergence in C. marmorea and its subspecies C. marmorea mediocorsa. A first examination of samples suggests that diagnostic morphological characters are present in further taxa, namely C. fibigerium and C. peregrinella, with maximum intraspecific divergences of 6.27% and 5.69%, respectively, related to three deep phylogeographic splits in each species. Similar deep splits are observed in C. alsinella and in C. cauligenella. For all these taxa with subtle character differences a careful re-examination of morphology has to be undertaken in the future.