A revision of the geographical distributions of the shrews Crocidura tanakae and C. attenuata based on genetic species identification in the mainland of China

Abstract The Taiwanese gray shrew (Crocidura tanakae) and Asian gray shrew (C. attenuata) are so similar in size and morphology that the taxonomic status of the former has changed several times since its description; C. tanakae has also been regarded as an endemic species of Taiwan Island. In recent years, molecular identification has led to several reports of C. tanakae being distributed in the mainland of China. In this study, we determine the geographical distribution of C. attenuata and C. tanakae based on more than one hundred specimens collected from 2000 to 2018 over a wide area covering the traditional ranges of the two species in the mainland of China, and show a substantial revision of their distributions. Among 110 individuals, 33 C. attenuata and 77 C. tanakae were identified using the Cytb gene and morphology. Our results show that (1) C. attenuata and C. tanakae are distributed sympatrically in the mainland of China; (2) contrary to previous reports, the distribution range of C. attenuata is restricted and much smaller than that of C. tanakae in the mainland of China; (3) according to the present data, Hainan Island, like Taiwan Island, is inhabited by C. tanakae only.


Introduction
The Taiwanese gray shrew (Crocidura tanakae Kuroda, 1938) and Asian gray shrew (C. attenuata Milne Edwards, 1872) are distinct species with very similar morphological characters and measurements, such that the taxonomic status of C. tanakae has been changed several times by taxonomists. Crocidura tanakae was originally described from Taiwan as a new species by Kuroda (1938); however, because it could not be distinguished from C. attenuata in morphological characters and measurements, C. tanakae was thereafter regarded as a synonym or subspecies, C. a. tanakae, by many authors (Ellerman and Morrison-Scott 1951; Jameson and Jones 1977; Corbet and Hill 1992; Hutterer 1993; Fang et al. 1997). Motokawa et al. (2001) recognized the distinct taxonomic position of C. tanakae based on chromosomal data, and regarded it as an endemic species of Taiwan Island.
In recent years, the application of molecular identification techniques has led to reports of C. tanakae populating the mainland of China. Esselstyn et al. (2009) and Esselstyn and Oliveros (2010) genetically identified specimens collected in Vietnam and the Hunan and Guizhou Provinces of China and found that most of their specimens belonged to C. tanakae; only a few were attributed to C. attenuata. Bannikova et al. (2011) and Abramov et al. (2012) reported that C. tanakae was also found in Vietnam and Laos, and that it was a widespread species in Vietnam, whereas C. attenuata inhabited only the north and east of the Red River. Chinese scientists recently reported that C. tanakae was collected from the mainland of China, including Mount Emei of Sichuan Province, Mount Fanjing of Guizhou Province, Pingbian and Funing of Yunnan Province and Xingshan of Hubei Province (Cheng et al. 2017; Chen et al. 2018; Lei et al. 2019). However, these reports only provided data for several distribution areas and were not sufficient to generalise the overall distributions of the two species in the mainland of China. The current IUCN distribution maps of C. attenuata and C. tanakae presented in Figure 1 are revised by this study.
During our field surveys in the mainland of China from 2000 to 2018, we accumulated more than one hundred specimens of C. attenuata and C. tanakae from 19 areas, which extends the few localities covered by the aforementioned reports. A re-evaluation of the geographical distributions of the two species is important to a range of studies and practical needs, such as zoogeography, geophylogeny, agricultural animal management, and health and epidemic prevention. Here we report the wide geographical distributions of C. attenuata and C. tanakae in the mainland of China.

Phylogenetic analyses
Cytb gene sequences were aligned using BioEdit v.7.2.5 (Hall 1999). Each specimen was identified to species molecularly by BLAST searches against GenBank, and the identification was confirmed by constructing a maximum likelihood (ML) phylogenetic tree in MEGA 5 (Tamura et al. 2011) under the TN93+G model. We used the Akaike Information Criterion (AIC) in jModelTest 1.0 (Posada 2008) to select the best-fit model of sequence evolution for the locus alignment. Bootstrap support was obtained using a rapid bootstrapping algorithm with 1000 replicates. We calculated the Kimura 2-parameter (K2P) genetic distance of Cytb between the two species.
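The K2P distance mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this study, which does not state how pairwise distances were computed or how gaps were handled. The function name `k2p_distance` and the pairwise-deletion treatment of gaps and ambiguous bases are assumptions made for illustration only. With P the proportion of transitional differences and Q the proportion of transversional differences, the standard Kimura (1980) formula is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q).

```python
import math

# A transition is a change within purines (A<->G) or within pyrimidines (C<->T);
# any other substitution is a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (pairwise deletion, an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)
```

A between-species distance such as the one reported in the Results would typically be an average of such pairwise values over all cross-species pairs; that averaging step is likewise an assumption here rather than something the text specifies.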
We also included Cytb sequence data from several earlier studies (Ohdachi et al. 2004, 2006; Bannikova et al. 2006, 2009, 2011; Jenkins et al. 2009, 2013; Esselstyn and Oliveros 2010; Abramov et al. 2012; Chen et al. 2016) to place the shrews from the type locality and Vietnam into a phylogenetic context; the sequence information is shown in Suppl. material 1, Table S2. Suncus murinus was selected as the outgroup (Suppl. material 1, Table S2). GenBank accession numbers for the original sequences used in this study are MK765682-MK765791 (Suppl. material 1, Table S1).

Morphological analyses
To attribute these genetic lineages to taxonomically correct species names, we photographed the dorsal, ventral and lateral views of the skull and the lateral view of the mandible of C. attenuata from its type locality, Baoxing (Moupin), Sichuan. We also photographed the corresponding teeth and marked the characteristic features of this species on the pictures. We repeated the same procedure with the only sample of C. tanakae from the same locality (Baoxing) for interspecific comparison.
We conducted a morphological investigation of the sampled specimens to identify the two species, taking three external measurements: total body length (TBL), head and body length (HBL) and ear length (EL); and 10 skull measurements: greatest length of skull (GLS), cranial base length (GBL), median palatal length (MPL), length of teeth row (LUTR), greatest palatal breadth (GPB), breadth of occipital condyles (BOC), greatest breadth of braincase (BBC), interorbital breadth (IOB), height of the braincase (HB) and length of mandible (LM), according to Yang et al. (2005, 2007) and Jenkins et al. (2009). The skull measurements were taken with a digital vernier caliper (0.01 mm). Juveniles and sub-adults, identified by incomplete fusion of the cranial sutures (Motokawa et al. 1997, 2003) and by a histogram of HBL used as an indicator for age identification of small mammals (Li et al. 1989, 1990; Yang 1990), were excluded from the analysis.
We calculated the mean and standard deviation of the external and skull morphological indices. Pairwise differences between the two species were tested by independent sample t-tests or Mann-Whitney U tests, depending on whether the Kolmogorov-Smirnov test indicated a normal distribution. Principal component analysis (PCA) was used to assess how well the overall variation in the skull characters supported the two groupings. These analyses were performed using SPSS Statistics 24.0 (SPSS, Chicago, IL, USA).
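The choice between the two tests can be sketched as follows. This is a minimal illustration using only the Python standard library, not a reproduction of the SPSS analysis: the function names (`average_ranks`, `mann_whitney_u`, `pooled_t`, `compare_species`) are hypothetical, the normality decision is passed in as a flag rather than recomputed, and only the test statistics are returned, without P values.

```python
import math
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic (the smaller of U1 and U2)."""
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[:len(x)])                # rank sum of the first sample
    u1 = r1 - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return min(u1, u2)


def pooled_t(x, y):
    """Independent two-sample t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def compare_species(x, y, is_normal):
    """Pick the test from a normality decision made elsewhere (assumed flag)."""
    if is_normal:
        return "t-test", pooled_t(x, y)
    return "Mann-Whitney U", mann_whitney_u(x, y)
```

In this sketch each measurement would be passed through `compare_species` separately, with `is_normal` set from the Kolmogorov-Smirnov result for that measurement; obtaining P values would additionally require the t and U reference distributions, which a statistics package provides.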

Results
We obtained 1140 bp of mitochondrial DNA sequences from 110 individuals in this study. The ML tree indicated that the specimens we collected were divided into two lineages: one clustered with the C. attenuata sequences downloaded from GenBank that originated from its type locality, i.e., Baoxing of Sichuan Province, and the other clustered with the C. tanakae sequences downloaded from GenBank that originated exclusively from its type locality, i.e., Taiwan Island (Fig. 2). The K2P distance of Cytb between these two lineages was 12.3%. Together with the results of the BLAST searches against GenBank, a total of 33 specimens of the C. attenuata lineage and 77 specimens of the C. tanakae lineage collected in this study were genetically identified by Cytb, and their distribution localities are plotted in Figure 3. The distribution localities of C. tanakae recently reported in the mainland of China were also added to the figure.
By investigating our samples of the C. attenuata lineage from Baoxing, Sichuan, we found some morphological features consistent with the holotype: the superior articular facets are more angular in dorsal view and the basioccipital region is narrow and ridged, particularly anterior to the position of the basioccipital suture in C. attenuata (Fig. 4). On the upper premolar (P4) the protocone is variably positioned relative to the paracone; the posterolingual border of the tooth is not so rounded; and the posterior border of the tooth is deeply concave. The posterobuccal crest of the paracone of the second upper molar (M2) forms a smooth W-shaped loph in unworn dentition (Fig. 5).
A total of 90 adult individuals were retained after age identification, including 26 C. attenuata and 64 C. tanakae. The external measurements and two skull measurements (BOC and GPB) departed from a normal distribution according to the Kolmogorov-Smirnov test (P<0.05), so we used the Mann-Whitney U test for interspecific comparisons; for the other measurements, which were normally distributed (P>0.05), the parametric independent sample t-test was used (Suppl. material 1, Table S3). Descriptive statistics for external and craniodental measurements of the two species and literature measurements (including the holotype) are given in Table 1; they were broadly consistent with the variation range and limits recorded in the literature except for IOB. Crocidura attenuata was a little larger than C. tanakae in GBL, MPL and BBC. Although there were significant differences (P<0.05) in some morphological indices between the two species (Table 2), their ranges of measurements overlapped greatly. In the PCA of external and skull measurements, three principal components were extracted, capturing 70.07% of the total variation. The five indices with the highest correlations with the first axis (PC1) were GBL, GLS, LUTR, BBC and LM (Table 3). The scatter plot of the samples on the first two principal component axes showed a great overlap between the two species in external and skull indices (Fig. 6), indicating that morphological indices cannot accurately identify the two species.
Among the localities of our field surveys, C. tanakae was recorded at almost all sites investigated (Fig. 3), whereas C. attenuata was found only in the following six provinces: Sichuan Province (Baoxing), Fujian Province (Mount Wuyi), Hubei Province (Shennongjia), Guangdong Province (Nanling), Jiangxi Province (Mount Jinggang), and Zhejiang Province (Jinhua).

Discussion
This study indicates that C. attenuata and C. tanakae are sympatrically distributed not only in continental Indochina (Jenkins et al. 2009, 2013; Bannikova et al. 2011; Abramov et al. 2012) but also in the mainland of China. The distribution of C. attenuata is apparently limited to only two ranges, i.e., Baoxing of Sichuan to Shennongjia of Hubei and Nanling of Guangdong to Jinhua of Zhejiang; the natural range of this species is much smaller than that of C. tanakae, which is distributed almost all over the south of mainland China.
Note that the map of C. attenuata (Fig. 1, left) presented by the IUCN is erroneous because C. tanakae has frequently been misidentified as C. attenuata in the mainland of China. The IUCN map mistakenly shows the mixed distributions of both C. attenuata and C. tanakae; the presented distributions of C. attenuata on Taiwan and Hainan Islands are erroneous for the same reason. For the distribution map of C. tanakae (Fig. 1, right), the range is not definitively established because few districts have been surveyed, and information from more recent records has yet to be included.
Based on the morphological features we found among our samples and the results of their comparison with type materials of C. attenuata and C. tanakae (Jenkins et al. 2009, 2013), we consider that the specimens of the C. attenuata lineage should be attributed to C. attenuata, and the other lineage to C. tanakae. Wang (2003) divided C. attenuata into three subspecies in China, including the Himalayan subspecies (C. a. rubricosa Anderson, 1877) distributed in northwestern Yunnan (Gongshan), the South China subspecies (C. a. attenuata Milne-Edwards, 1872) distributed in other parts of mainland China and the Taiwan subspecies (C. a. tanakae Kuroda, 1938) distributed on Taiwan Island. It is clear that C. a. tanakae is actually a valid distinct species (Motokawa et al. 2001), but the other two subspecies still need taxonomic validation by detailed analysis to exclude the possibility of misidentification of C. tanakae specimens. The same taxonomic challenge exists for C. a. grisea Howell, 1926, the subspecies distributed in Fujian Province (Smith and Xie 2009). All these subspecies are uncertain because the authors may well have included specimens of C. tanakae mixed with C. attenuata samples.
There are many research reports listing C. attenuata in the mainland of China. For example, Zhang et al. (1987) investigated C. attenuata (attenuate in the original paper) as a host animal of epidemic hemorrhagic fever, Wu (2002) reported population density fluctuation in C. attenuata, and many reports on animal diversity and pathogen host studies involved C. attenuata. Gu et al. (2007) reported epidemiological surveillance of leptospirosis in Anhui Province and the first discovery of a pathogenic strain in the kidney of C. attenuata, and Wu et al. (2008) made a preliminary comparative anatomical study of the digestive tracts of C. attenuata and Apodemus agrarius. Because C. tanakae might have been misidentified as C. attenuata in these reports, and our present study demonstrates that C. tanakae is much more widely distributed in the mainland of China, the species "C. attenuata" described in these reports may in fact be C. tanakae or may at least include C. tanakae. The results of these studies therefore need re-evaluation.