Species delimitation in the Grayling genus Pseudochazara (Lepidoptera, Nymphalidae, Satyrinae) supported by DNA barcodes

Abstract The Palaearctic Grayling genus Pseudochazara encompasses a number of petrophilous butterfly species, most of which are local endemics especially in their centre of radiation in SW Asia and the Balkans. Due to a lack of consistent morphological characters, coupled with habitat induced variability, their taxonomy is poorly understood and species delimitation is hampered. We employed a DNA barcoding approach to address the question of separate species status for several European taxa and provide first insight into the phylogeny of the genus. Unexpectedly we found conflicting patterns with deep divergences between presumably conspecific taxa and lack of divergence among well-defined species. We propose separate species status for Pseudochazara tisiphone, Pseudochazara amalthea, Pseudochazara amymone, and Pseudochazara kermana all of which have separate well supported clades, with the majority of them becoming local endemics. Lack of resolution in the ‘Mamurra’ species group with well-defined species (in terms of wing pattern and coloration) such as Pseudochazara geyeri, Pseudochazara daghestana and Pseudochazara alpina should be further explored using nuclear molecular markers with higher genetic resolution.


Introduction
Depending on which systematic order of classification is adhered to, the genus Pseudochazara comprises 27-32 species of Graylings (Gross 1978, Lukhtanov 2007, Savela 2015. It has a wide distribution in the Palaearctic region from North Africa to the Himalayas and Mongolia (Tennent 1996, Tshikolovets 2005, Yakovlev 2012). In addition to vague species delimitation, large intraspecific variation has resulted in the description of over 100 subspecific taxa (Lukhtanov 2007) in this intensively studied taxon.
The main reason for the extensive variation in phenotype can be linked with the specific ecological requirements of these butterflies. They are mostly petrophilous and limited to specific rock substrate to which they are perfectly adapted with their camouflaged underside wing pattern and cryptic coloration. Local adaptation to mimic the coloration of the rock substrate is, therefore, one of the main drivers for such large scale diversification (Lorković 1974, Weiss 1980, Hesselbarth et al. 1995, Tennent 1996, but see Anastassiu et al. 2009).
There has been no attempt to reconstruct the phylogeny of the genus or validate species status using molecular markers. Only the taxonomic position within subtribe Satyrina and a sister relationship to Chazara has been established (Peña et al. 2011).
In order to resolve the relationship among Pseudochazara species and re-evaluate their species status, in particular of some European taxa, we employed DNA barcoding -using a standardized gene region (5' segment of the mitochondrial gene cytochrome c oxidase subunit I = COI) which enabled us to utilize additional Pseudochazara sequences available in the Barcode of Life Database (BOLD 2015). DNA barcodes have been widely and successfully used in Lepidoptera taxonomy and species delimitation as an additional set of characters which are independent of habitat conditions (Hebert et al. 2004, Nazari and Sperling 2007, Nazari et al. 2010, Dinca et al. 2011, Yang et al. 2012, Lukhtanov and Novikova 2015, Pazhenkova et al. 2015. However, there are several limitations of this method (see e.g. Wiemers and Fiedler 2004, Brower 2006, Ritter et al. 2013, Song et al. 2008, Toews and Brelsford 2012 which should be taken into account in the interpretation of the gene tree.

Sample collection, DNA extraction, amplification, sequencing, and alignment
With the aim of achieving consistency, we adopt the nomenclature of the most recent list of Pseudochazara species by Lukhtanov (2007). Following the discovery of Pseudochazara mamurra amymone in Albania (Eckweiler 2012), we initially sampled all the Pseudochazara taxa from the Balkan Peninsula, a hotspot of Pseudochazara diversity in Europe , Gascoigne-Pees et al. 2014. We then broadened the range of our sampling adding additional species from Turkey and the Middle East, the main areas of Pseudochazara diversification. Altogether 27 specimens belonging to 10 species of Pseudochazara, for which the barcoding gene COI was successfully amplified, were included in the study (see Appendix 1). All specimens were dried prior to DNA extraction. In addition, we included COI sequences from 81 individuals belonging to 14 species from the BOLD database (BOLD 2015). Only specimens that could be unambiguously identified by the voucher photos were selected. Following the nomenclature guidelines proposed by Lukhtanov (2007) a total of 34 taxa belonging to 20 species were included in the analysis. As outgroups, we added several sequences of the closely related Satyrine genus Chazara from GenBank, based on the results of the phylogenetic study of Satyrinae by Peña et al. (2011).
Total genomic DNA was extracted from single legs, following the Mammalian tissue preparation protocol (GenElute Mammalian Genomic DNA miniprep kit from Sigma-Aldrich). For each sample a 657 bp fragment of the first subunit of the mitochondrial gene cytochrome c oxidase (COI) was amplified using primers LCO1490 and HCO2198 (Folmer et al. 1994). Amplification followed a standard protocol described in Verovnik et al. (2004). PCR products were visualized on an agarose gel to verify amplification success and sequenced by Macrogen in both directions on an Applied Biosystems 3730xl sequencer.

Phylogenetic analysis
We used Bayesian inference to reconstruct a phylogenetic tree. To achieve more clarity the tree was constructed on a subset of samples including only unique haplotypes belonging to the same taxon. A hierarchical likelihood test was employed in order to test alternative models of evolution, using JModeltest v.0.1.1 (Posada 2008). A GTR (Generalised time reversible) model of nucleotide substitution with gamma distributed rate heterogeneity and a significant proportion of invariable sites was selected in accordance with the Akaike Information Criterion. Bayesian analysis was performed with MrBayes v.3.1.2 implementing the best fit substitution model (Huelsenbeck and Ronquist 2001). Markov chain Monte Carlo search was run with four chains for 4 × 10 6 generations, taking samples every 100 generations. The approximate number of generations needed to obtain stationarity of the likelihood values (''burn-in'') of the sampled trees was estimated graphically to 2000 trees. From the remaining trees posterior probabilities were assessed for individual clades based on their observed frequencies. Trees were visualised using Figtree v.1.4.2 (Rambaut 2014). Genetic distances (p-) were calculated with MEGA 6.0 (Tamura et al. 2013). In addition, a statistical parsimony network analysis was performed with TCS 1.21 (Clement et al. 2000).

Results
No insertions or deletions were observed in the mitochondrial COI gene and therefore the alignment was unambiguous. For the COI dataset 63 unique haplotypes among 108 Pseudochazara sequences were detected. 114 (17.5%) sites were variable and 95 (14.6%) were parsimony informative. The average interspecific genetic distance was 4.9%, but in the case of P. mniszechii the intraspecific diversity ranged from 0 to 6.7% with highly distinct divergent sequences of P. mniszechii tisiphone. No evident barcoding gap was observed separating intraspecific from interspecific pairwise genetic distances ( Fig. 1). On the contrary, sharing of identical haplotypes was observed in the following taxa: P. graeca / P. mamurra amymone, P. mamurra mamurra / P. daghestana, and P. beroe aurantiaca / P. alpina. On the other hand, 82% of species comparisons showed high (≥2%) interspecific distances.
The topology of the Bayesian Inference tree of all Pseudochazara samples, including the selected outgroup species (Fig. 5), confirms the monophyly of the genus. High posterior probability values support a basal position of P. atlantis, the only species of the genus present in (and confined to) North Africa. This is somewhat surprising as Figure 1. Frequency distribution of pairwise intra-and interspecific p-distances of the COI sequences in the genus Pseudochazara. No "barcoding gap" exists between these two data series. P. anthelea and P. thelephassa are considered to be morphologically the most distinct and separate species within the genus (Gross 1978). P. atlantis has tentatively been placed into two groups, the 'mamurra' species group (Brown 1976), based on androconia shape, and the 'pelopea' species group (Wakeham-Dawson and Dennis 2001), on account of the shape of male genitalia. P. atlantis is also distinctive according to the TCS analysis and forms a separate network. In addition, the second basal split within Pseudochazara is well supported, and, apart from some single species clades, three species groups tentatively named as the 'pelopea', 'hippolyte' and 'mamurra' clades received high support. We present the results for these clades separately:

'Pelopea' group
This group, which forms a distinct network in the TCS analysis ( Fig. 2), includes two species, P. pelopea and P. mniszechii. However, there is no genetic differentiation between them, with P. pelopea persica and P. pelopea caucasica intermixed with P. mnisze-chii. Two well supported clades pertain to geographically isolated subspecies of P. pelopea, the Levant region (nominotypic P. pelopea pelopea) and Kopet Dhag in NE Iran (P. pelopea tekkensis). Both subspecies are morphologically distinct from P. pelopea persica, in particular the latter, with much wider and more pronounced orange submarginal bands on their forewings. P. pelopea tekkensis is considered a separate species by Nazari (2003). P. mniszechii is also polyphyletic due to the separate position of the subspecies tisiphone from the southern Balkans, which is clearly not closely related, and belongs to the 'hippolyte' group.

'Hippolyte' group
The 'hippolyte' clade sensu stricto includes the widely distributed P. hippolyte complex which has a vast range from southern Spain to central China (Tshikolovets 2011) together with a number of local endemics from the southern Balkan Peninsula: P. cingovskii in the Republic of Macedonia, P. orestes from north-eastern Greece and the neighbouring part of Bulgaria, P. mniszechii tisiphone from north-western Greece and southern Albania and P. euxina from the Crimean Peninsula. Both, the haplotype net- work analysis (Fig. 3) and the phylogeny (Fig. 5) show that P. mniszechii tisiphone is not a subspecies of P. mniszechii despite superficial resemblance in wing patterns and coloration. In fact, it is closely related to two other local endemics from the Balkan Peninsula, P. cingovskii and P. orestes. The presence of P. mniszechii tisiphone in the western part of Turkey, near Bursa (Hesselbarth et al. 1995) remains to be verified. The single haplotype of P. euxina is nestled among samples of P. hippolyte, so our preliminary results do not support its current status as a separate species. Within this clade P. hippolyte williamsi from southern Spain appears basally, however with low posterior probability and it is not monophyletic. All other described subspecies (P. hippolyte pallida, P. hippolyte doerriesi, P. hippolyte mercurius) are less distinct from the nominotypical subspecies, with two Central Asiatic subspecies (P. hippolyte pallida, P. hippolyte mercurius) sharing haplotypes.  The sister relationship of P. thelephassa and P. anthelea, which is indicated by genital morphology (the presence of a distinct costal process on the dorsal side of the valve) and wing pattern (the presence of a well-defined black area in the forewing discal cell) Figure 5. Phylogeny of Pseudochazara species derived from the barcoding gene COI using Bayesian inference analysis. Values on major branches are Bayesian posterior probabilities. Branches with support lower than 50% were collapsed manually. Branch names combine taxon name and sample ID (see Appendix 1). Nomenclature follows Lukhtanov (2007). (Aussem 1980b, Hesselbarth et al. 1995, Wakeham-Dawson and Dennis 2001, could not be corroborated as P. anthelea appears to be a sister clade to the 'hippolyte' group sensu strictu with high posterior probability. P. kanishka from Tajikistan is a sister species of the anthelea-hippolyte clade, while P. thelephassa is sister taxon to the antheleahippolyte-kanishka clade, however, with low support. These results concur with wing pattern, i.e. a well-defined black area in the forewing discal cell, also present in specimens of P. kanishka. It is important to note that the average genetic distance between two geographically separated subspecies, P. anthelea anthelea from Asia Minor and neighbouring islands, and P. anthelea amalthea from the Balkan Peninsula was 1.5%. This result is indicative for differentiation into distinct species as predicted by Kudrna et al. (2011).
In the TCS analysis, this group is split into 3 networks: a) the hippolyte clade sensu stricto (Fig. 3), b) P. anthelea, and c) P. thelephassa.

'Mamurra' group
The only two entirely Central Asian species available for analysis, P. turkestana and P. lehana, form a well-supported clade together with the 'mamurra' group, indicating their close relationship, but with a separate network for each in the TCS analysis. All other sequences form a single network (Fig. 4). Although the species sampling in Central Asia is incomplete, there is no evidence of a deep split between Asiatic and European/African taxa as predicted by Wakeham-Dawson and Dennis (2001). The 'mamurra' group is monophyletic, and includes several well-defined species (in terms of wing patterns, androconia and genitalia) with identical or very similar haplotypes. The following taxa could not be distinguished based on COI haplotypes as they do not form separate monophyletic clades: P. mamurra, P. beroe, P. geyeri, P. daghestana, P. alpina, and P. lydia. Only a single sequence was obtained for P. geyeri and P. lydia, so their position within this group is tentative. However, it is clear that P. lydia is closely related to P. mamurra with which it shares similarities e.g. the shape of the androconia (Wakeham-Dawson 2005). P. alpina shares the haplotype with P. beroe and they appear closely related, however, this is again based on the inclusion of a single sequence.
Within the 'mamurra' group the only well supported clade includes the taxa P. schahkuhensis, P. mamurra kermana, P. graeca and P. mamurra amymone. While P. schahkuhensis is sympatric in part of its range with P. mamurra, all other taxa have geographically isolated ranges. P. graeca and P. mamurra amymone are present in the southern part of the Balkan Peninsula with partial range overlap (Pamperis 2009). Both species are clearly morphologically distinct, but genetically not identifiable in COI haplotypes. Clearly this relationship puts in question the status of P. mamurra amymone as a subspecies of P. mamurra. The same conclusion can be drawn for P. mamurra kermana from Iran (Kerman province), which is also well placed within this clade as a sister species to both southern Balkan Peninsula taxa.

Discussion
Our study supports the monophyly of the genus Pseudochazara with high posterior probability values of the COI gene tree. Within the genus, however, two conflicting patterns appear with, unexpectedly, deep divergences between presumably conspecific taxa on the one hand and lack of divergence among well-defined species on the other. This is to some extent concordant with similar studies in related genera in the subfamily Satyrinae (Kodandaramaiah and Wahlberg 2009, Nazari et al. 2010, Kreuzinger et al. 2014. The basal position of P. atlantis from North-western Africa as sister group to all remaining Pseudochazara species falls into the first category. Based on distinct male genitalia morphology and wing shape/patterns P. anthelea and P. thelephassa were considered to form the basal split within the genus (Gross 1978, Aussem 1980b, Hesselbarth et al. 1995, Wakeham-Dawson and Dennis 2001. The basal position of P. atlantis is difficult to explain in terms of biogeography, as it indicates a North African origin of the genus, which has its centre of divergence much further eastwards in the Middle East (Hesselbarth et al. 1995, Tshikolovets 2011. P. atlantis is an alpine species distributed only in the Atlas Mountains of Morocco (Tennent 1996), therefore its isolation from the main distribution of the genus could possibly have preceded the last land bridge connections with Europe at the end of the Miocene (Garcia-Castellanos et al. 2009). Hence, its basal position could be an artefact of long-branch attraction (Bergsten 2005) and/or incomplete sampling of the entirely Asiatic species. Therefore, confirmation with additional genetic markers and additional sampling is required.
Another unexpected result is a deep split between P. mniszechii and P. mniszechii tisiphone, species which are very similar in wing patterns/coloration and considered conspecific in current literature (Hesselbarth et al. 1995, Kudrna et al. 2011, Tshikolovets 2011, Eckweiler 2012) and databases (Lukhtanov 2007, Savela 2015, Fauna Europaea 2016. Based on the COI gene tree P. tisiphone Brown, 1980 (stat. n.) is a separate species closely related to two local endemics from the southern part of the Balkan Peninsula, P. orestes and P. cingovskii. Actually P. tisiphone was originally described as a subspecies of P. cingovskii (Brown 1980) and its close relationship was hypothesised also by Wakeham-Dawson and Dennis (2001) based on the similarity of the male genitalia. The low level of genetic differentiation between P. tisiphone, P. orestes, and P. cingovskii indicates a relatively recent speciation, however, we are inclined towards supporting their separate species status based on constant differences in wing patterns/coloration and also their ecological specialization (Pamperis 2009, Verovnik et al. 2013. A split between P. anthelea anthelea from Asia Minor and P. anthelea amalthea from the Balkan Peninsula has been suggested based on minor differences in male genitalia and consistent differences in female wing coloration between both taxa (Olivier 1996, Wakeham-Dawson andDennis 2001). They are considered separate morphospecies by Kudrna et al. (2011). We can agree with separate species status as the split between the two taxa is much older compared to almost no differentiation in three morphologically and ecologically well defined species: P. tisiphone, P. orestes, and P. cingovskii. Following this reasoning, P. pelopea tekkensis from NE Iran could also be considered a distinct species, however, inclusion of more samples is needed to confirm this status.
Given the high resolution of the basal clades within the COI gene tree, the lack of differentiation between taxa within the 'mamurra' and 'pelopea' group was unexpected. In particular, species like P. geyeri and P. daghestana are among the most easily recognisable species in the genus with uniform and very distinct wing patterns/coloration. There are several possible hypotheses to explain this lack of differentiation: -Incomplete lineage sorting: recent speciation could result in unresolved relationships among these closely related species; however, well-defined species borders in terms of constant wing pattern differentiation coupled with broad overlaps in species ranges challenges this hypothesis. -Recent gene flow: gene flow between closely related taxa is a known phenomenon (Descimon and Mallet 2009) and masks relationships among species especially with mitochondrial DNA (Gompert et al. 2008). The species involved have broadly overlapping ranges and could sometimes be found syntopic (Aussem 1980c, Hesselbarth et al. 1995, so hybridization is possible. Actually hybridization is documented even among the most distantly related species such as P. anthelea and P. geyeri (Aussem, 1980c). Nuclear markers with higher genetic resolution (e.g. microsatellites, SNPs) would be required to study the contact zones between these taxa to confirm ongoing gene flow. It must be noted that partial exclusion is evident when two or more Pseudochazara species are syntopic, as one is always dominant, while the others appear in very low frequencies (Hesselbarth et al. 1995). -Pseudogenes or Wolbachia infections: both are common in invertebrates, particularly in arthropods (Bensasson et al. 2011, Gerth et al. 2014, Leite 2012, Ritter et al. 2013. As the vast majority of the haplotypes in the 'mamurra' and 'pelopea' clades originate from the BOLD database it is impossible to check or correct for this potential error. The most enigmatic taxon among the 'mamurra' group is P. mamurra amymone from northern Greece and Albania (Eckweiler 2012. Apart from the author's original description (Brown 1976) little has been published regarding this elusive taxon for a long time. Failed attempts to locate the vaguely described type locality (Cuvelier 2010) have led to several misleading hypotheses, resulting in speculation that it may even be a rare hybrid between P. tisiphone and P. anthelea Dennis 2001, Kudrna et al. 2011). Somewhat surprisingly, the COI gene tree suggests it has a close relationship with P. graeca, another species from the southern Balkan Peninsula. These two taxa have distinct and constant wing patterns and differ in their habitat requirements, with P. mamurra amymone inhabiting steep and hot rocky gorges at lower elevations (Gascoigne-Pees et al. 2014) while P. graeca is predominantly a montane (high elevation) species endemic to Greece (Anastassiu et al. 2009). Thus, despite paraphyly of P. amymone Brown, 1976 (stat. n.) in relation to P. graeca, we believe they both represent valid species within the 'mamurra' group. Consequently P. kermana Eckweiler, 2004 (stat. n.), sister species to P. amymone and P. graeca combined, should also be elevated to species rank, although additional populations of P. mamurra in Iran should be examined to confirm this status. Alternatively, all the taxa within the 'mamurra' group, including the monophyletic P. schakuhensis, a sister species to the amymone-graeca-kermana clade, should be treated as a single very polymorphic species, a rather more destructive approach given the current taxonomy.
Although we are aware of the pitfalls of using single gene trees in the interpretation of phylogenetic patterns (Nichols 2001), we believe that strongly supported basal branching and splits between taxa, considered conspecific, represent valid insights into speciation in the Pseudochazara genus and together with distinct morphology and ecology allows species delimitation. Hence, we propose separate species status for the following taxa: P. tisiphone, P. amalthea, P. amymone, and P. kermana. This has important conservation implications, as most of these species are local endemics and therefore potentially threatened . Wider taxon sampling and inclusion of nuclear markers would undoubtedly help to a better understanding of the taxonomy of this fascinating butterfly genus.