New record of Microtusmystacinus in eastern Kazakhstan: phylogeographical considerations

Abstract The Eastern European vole (Microtusmystacinus) is an arvicoline rodent distributed across northern and eastern Europe, the Balkans, Turkey, Armenia, NW and N Iran, Russia as far east as the Tobol River in W Siberia, and W and N Kazakhstan. We present a novel records from eastern Kazakhstan (the village of Dzhambul – 49°14'21.3"N, 86°18'29.9"E and the village of Sekisovka – 50°21'9.18"N, 82°35'46.5"E) based on mtDNA and we discuss implications of this findings on biogeography of eastern Kazakhstan populations. Marine Isotope Stage 11 is considered an important period for the diversification of the arvalis species group. In the context of our study, it is important to analyse genetically discontinuous Siberian populations, and the current distribution of Microtusmystacinus in new localities in eastern Kazakhstan.

The distribution and habitat preferences of the Eastern European vole are relatively well known due to the intensive attention devoted to it (see Kryštufek and Vohralík 2005;Musser and Carleton 2005;Shenbrot and Krasnov 2005;Kryštufek 2017, and references therein). It prefers to live in places with high and dense herbaceous or grassy vegetation, hedgerows, and stands of reeds and it avoids short-grass meadows and dry areas (Kryštufek and Vohralík 2005;Aulagnier et al. 2009;Kryštufek 2017). The distribution range of the Eastern European vole, to date, extends from southern Finland, the Baltic eastwards to western Siberia with patches in the southern Urals, the Novosibirsk suburbs to the southwest margin of Lake Baikal and Buryatia, the southern Caucasus, northern Iran to Turkey, connecting to Greece and the majority of the Balkan Peninsula to Ukraine (Baskevich 1996;Gileva et al. 1996;Yakimenko and Kryukov 1997;Musser and Carleton 2005;Shenbrot and Krasnov 2005;Pavlova and Tchabovsky 2011;Ghorbani et al. 2015;Baskevich et al. 2016;Kryštufek 2017;Moroldoev et al. 2017).
When considering the distribution of M. mystacinus within Kazakhstan, there are records from the western or north-western parts. The easternmost record is from the Karabalyk district (Kovalskaya 1994;Meyer et al. 1996). Here, we report an additional record of M. mystacinus from eastern Kazakhstan and comment on it from a phylogeographic point of view.

Materials and methods
A survey of small mammals conducted in eastern Kazakhstan provided the surprising discovery of three specimens of M. mystacinus, that are characterized here based on molecular methods. The first sample (Kazakhstan 1) was collected in July 2006 on pasture land near the village of Dzhambul (GPS coordinates: 49°14'21.3"N, 86°18'29.9"E) by FS and two more specimens (Kazakhstan 2, 3) were collected in September 2017 near a pond not far from the village Sekisovka (GPS coordinates: 50°21'9.18"N, 82°35'46.5"E) by AM and JV.
DNA extraction was carried out using the Genomic DNA Mini Kit -tissue (Geneaid, New Taipei, Taiwan). We amplified the mitochondrial gene cytochrome b (cyt b hereinafter) using universal primers L14724, L15162, H15149 and H15915 (Irwin et al. 1991). Amplification conditions for cyt b consisted of 37 thermal cycles, an initial denaturation step at 94 °C for 3 min, denaturation at 94 °C for 30 seconds, annealing at 50 °C for 1 min, extension at 72 °C for 1.5 min and final extension at 72 °C for 10 min. Sequences were obtained using the Sanger sequencing (Sanger et al. 1977) services at laboratory SEQme s.r.o. (Dobříš, Czech Republic).
We obtained 1137 base pairs long sequences that satisfied the quality of base pairs (GenBank access number LT970847-LT970849). These were compared using available sequences from GenBank, specifically with 250 specimens that comprise all available sequences of M. mystacinus (under names M. levis, M. rossiameridionalis and M. mystacinus), and representative sequences of particular clades in M. arvalis and M. obscurus associated with previous studies (Baker et al. 1996a, b;Haynes et al. 2003;Fink et al. 2004;Jaarola et al. 2004;Triant and DeWoody 2007;Bužan et al. 2010;Thanou et al. 2012;Tougard et al. 2013;Stojak et al. 2016;Mahmoudi et al. 2017). Several more sequences (M. kirgisorum, accession number AY513809, AY513810; M. socialis, accession number AY513830, AY513831; and M. transcaspicus, accession number KX581067-KX581075) were downloaded from GenBank as potentially outgroups. The obtained sequences were aligned using the ClustalW algorithm implemented in GENEIOUS v.10.0.5 (Kearse et al. 2012). We employed a likelihood (ML) and Bayes-ian inference method (BI) for phylogenetic analyses. Likelihood phylogenetic analyses were conducted using the PhyML plugin for GENEIOUS. Final Bayesian phylogenetic analyses were conducted in BEAST 2.4.5.0 , where phylogenetic relationships were reconstructed under the Yule speciation process (Steel and McKenzie 2001) with the GTR model of evolution detected in JModelTest 2.1.7 (Nylander 2004) under the Akaike Information Criterion (AIC). The nucleotide data were run for 30 000 000 generations with a sampling frequency of every 1000 th generation; with final burn-in set at 20%. Time estimations were also computed in BEAST2  for the topology detected by the Bayesian phylogenetic analysis. We adopted one fossil calibration point (0.475±0.025 Mya for the origin of M. arvalis: Miesenheim I; Tougard et al. 2013) to estimate divergence time in studied taxa and to compare estimations with Mahmoudi et al. (2017) (which are based on the following proposed molecular clock rate, 3.27×10 -7 mutations/site/year for M. arvalis; Martínková et al. 2013). The split time with 95% highest posterior density was applied to a relaxed-clock model assuming a constant population size. The convergence and stability of estimated parameters was checked using TRACER 1.6 (Rambaut et al. 2017) and the maximum clade credibility trees were obtained with TREEANNOTA-TOR 2.4.5.0, and visualized in FIGTREE 1.4.3 (Rambaut 2009).
Some analyses were applied for M. mystacinus only. Specifically, haplotype characteristics were identified using DnaSP version 5.0 (Rozas et al. 2003) and the degree of diversification was estimated based on average pairwise distances using the Kimura two-parameters model of substitutions in MEGA5 (Tamura et al. 2011). The detailed haplotype network was conducted in POP ART 1.7 using the median-joining method (Bandelt et al. 1999).

Results and discussion
The obtained sequences of 1137 base pairs from three specimens exhibited close relationships with available cyt b sequences of Microtus mystacinus, in all comparisons. Specifically, they were nested inside this species, so our study identified this species in eastern Kazakhstan (see also below). All sequences of M. mystacinus form a sister group to the M. obscurus + M. arvalis , in accordance with previous comprehensive studies (e.g., Haynes et al. 2003;Fink et al. 2004;Jaarola et al. 2004;Triant and DeWoody 2007;Tougard et al. 2013;Stojak et al. 2015Stojak et al. , 2016Mahmoudi et al. 2017).
Considering the intraspecific structure in Microtus mystacinus, we can distinguish two deep lineages (Iran, abbreviated as IR) and the rest of populations mostly from Europe, additionally divided into several sub-lineages (TU, EU, GK), concordantly in ML and BI phylogenetic trees and the haplotype network (see Figure 1). This structure, specifically groups IR, TU, and EU, were identified firstly by Mahmoudi et al. (2017). TU lineage consists of Turkish and Armenian samples (without specimen Armenia 1), EU lineage of samples from the majority of Europe, mainly from Ukraine and Romania except for specimens from Greece, which comprise GK lineage, as well as samples from eastern Kazakhstan and the specimen 1 from Armenia. This pattern indicates a complex diversification of M. mystacinus across its former and current distribution. In general, Microtus mystacinus exhibited rather low intraspecific cyt b distances (except for the Iranian subset) and the obtained interspecific cyt b distances (see Table 1 As the intraspecific divergence for Microtus mystacinus and its cryptic diversity was intensively discussed by Mahmoudi et al. (2017), we would like to note only that the genetic distances cannot be presented as an absolute criterion for deciding whether two operational taxonomic units are distinct species (for detail see Groves et al. 2017), and in the case of species within the arvalis-group, some currently recognized species with rather low genetic distances exhibit infertile hybrids or hybrids with a reduced fertility (Meyer et al. 1985;Golenishchev et al. 2000;Jaarola et al. 2004).
The estimated clade divergence times varied substantially according to the calibration used (see Table 2). In summary, our estimations are more similar with other esti-   mates based on fossil calibration points (albeit slightly higher) than with estimations based on mutation rates (see Table 2). Focusing on the most studied species, M. arvalis, we estimate its time to the most recent common ancestor (TMRCA) as 0.490 Mya, Tougard et al. (2008)  It is not easy to judge which values are realistic, but our estimates seem to be compatible with other phylogenetic studies (e.g., Mazurok et al. 2001;Bannikova et al. 2010) and the fossil record (e.g., Cuenca-Bescós et al. 2001;Markova et al. 2012). Based on this compatibility, we adhere to the values of our estimations. In any case, it would be worth to compare different calibrations methods under different calibrations points and proposed mutations rates in future (e.g., methods of Baker et al. 1996a;Jaarola and Searle 2002), and also to consider the potential biases of the fossil record (e.g., incomplete nature, process of geological dating, reliability of species identification; cf. Ho 2007). Evolution and diversification of arvicoline rodents, including the arvalis-group, has been closely related to Quaternary climatic oscillations and the associated abiotic and biotic environmental factors (e.g., Horáček and Ložek 1988;Horáček 1990;Chaline et al. 1999;Stojak et al. 2016;Tougard 2017 and references therein). For the arvalis-group, interglacial periods are considered to be periods of species expansions and glacials as periods of retractions with potential survival of particular species in refu-gia (e.g., Golenishchev et al. 2000;Tougard et al. 2008;Stojak et al. 2015;Stojak et al. 2016). Golenishchev et al. (2000) considered one of the ancient alpine glaciations as responsible for disrupting the geographic range of M. arvalis and M. obscurus, whereas Tougard et al. (2008) considered interglacials as the agents of speciation. Based on our time estimations, the diversification of M. mystacinus + (M. arvalis + M. obscurus) group has happened within the last 0.79 Mya, thus comprising several interglacial and glacial periods (Gates 1993;Sirocko et al. 2007;Mahmoudi et al. 2017).
In our data, we observed synchronous, deep intraspecific divergences in all three species around 0.49-0.41 Mya (see Figure 2; in M. mystacinus we operated with separate timelines for the Iranian lineage (IR) and the remainder (sub-lineages TU, EU, GK) because the Iranian populations are divergent from the others; pairwise distance shows significant variation, see Table 1). This interval corresponds to the Holstein interglacial period (considering the stratigraphy of Western Europe) that is considered to be equivalent to Marine Isotope Stage (MIS) 11 (Sirocko et al. 2007; see Figure 2). The influence of the Holstein on the arvalis-group diversification can be explained by two historical scenarios. First, the preceding period, MIS 12, was characterized by a pronounced cold period (around 0.460 Mya), during which the earliest pan-Eurasian mammoth fauna associated with tundra-steppe habitats (called mammoth steppe, see Guthrie 2001) was formed. Second, the warmest phase of MIS 11 is the phase with the highest temperatures in the last 500 thousand years, persisting, persisting two times longer than the Eemian interglacial and three times longer than the Holocene (Sirocko et al 2007). Interglacial conditions may have disrupted the mammoth steppe biome due to an increase in precipitation, temperature, and associated forest expansions (for Late Quaternary see Řičánková et al. 2018). Tougard et al. (2008) recognized that the evolutionary history of temperate small mammals is much more complex than previously suggested. Individual species responded to various factors in multiple ways, and at different times during the Pleistocene (Lorenzen et al. 2011). Therefore, we tend to be reserved about whether observed pulses in diversification could be interpreted as expansion alongside some geographical/biotope barriers or fragmentation of some particular populations.
To conclude, our study proved an additional occurrence of Microtus mystacinus in Kazakhstan. The studies of Kovalskaya (1994), Meyer et al. (1996) and Okulova et al. (2014) specified the distribution of this species from western or northwestern parts of Kazakhstan, with the easternmost observation from the Karabalyk district (Kovalskaya 1994). Other localities of this species are known around Novosibirsk, several hundred kilometres away from the Kazakhstani border (Pavlova and Tchabovsky 2011). Although our material is not suitable to establish the full distribution range in Kazakhstan, it enables us to extend the range of this species further south.
The distribution of M. mystacinus could be partly human-induced, as documented by Tiunov et al. (2013) when regarding the railway across Siberia and the Far East of Russia (e.g., Olkhon Island, Pavlova and Tchabovsky 2011;Buryatia, Moroldoev et al. 2017). If we consider this possibility, the locality near Sekisovka is approx. 30 km distant from the nearest railway from Ust-Kamenogorsk to Ridder, but our second locality (near Dzhambul) is more than 150 km distant from the nearest  Table 2 for time estimates.
i. railway at Zyryanovsk (built after 1930; according to official web page of KTZ -КАЗАКСТАН TEMIP ЖОЛЫ). In Russian territory, this species shows pathways of invasion around the Transbaikalia railway and the surrounding agricultural landscape (e.g., Tiunov et al. 2013, Moroldoev et al. 2017. As the Kazakhstani specimens are significantly divergent from other available sequences (approx. 100 kya), we could consider the distribution of M. mystacinus in Kazakhstan as natural, but additional evidence is welcomed. Based on the presented network-phylogenetic relationship of samples it seems that a potential route of colonization for Kazakhstan populations could have originated somewhere between the Balkans and sites north of the Black and Caspian seas, whereas populations in Turkey and parts of Armenia were colonized from a southern route.

MIS
Our study is the first genotyping of M. mystacinus from the eastern part of its distribution, where its' occurrence is more discontinuous. In the context of our study, it is important to analyse genetically these Baikal and Far Eastern populations, and further map out the extent of M. mystacinus occurrence in East Kazakhstan.