Using the combined gene approach and multiple analytical methods to improve the phylogeny and classification of Bombus (Hymenoptera, Apidae) in China

Abstract Bumble bees are vital to our agro-ecological system, with approximately 250 species reported around the world in the single genus Bombus. However, the health of bumble bees is threatened by multiple factors: habitat loss, climate change, pesticide use, and disease caused by pathogens and parasites. It is therefore vitally important to have a fully developed phylogeny for bumble bee species as part of our conservation efforts. The purpose of this study was to explore the phylogenetic relationships of the dominant bumble bees on the Tibetan plateau and in northern China as well as their placement and classification within the genus Bombus. The study used combined gene analysis consisting of sequence fragments from six genes, 16S rRNA, COI, EF-1α, Argk, Opsin and PEPCK, and the phylogenetic relationships of 209 Bombus species were explored. Twenty-six species, including 152 gene sequences, were collected from different regions throughout China, and 1037 gene sequences representing 183 species were obtained from GenBank or BOLD. The results suggest that the 209 analyzed species belong to fifteen subgenera and that most of the subgenera in Bombus are monophyletic, which is in accordance with conventional morphology-based classifications. The phylogenetic trees also show that nearly all subgenera easily fall into two distinct clades: short-faced and long-faced. The study is the first to investigate the phylogenetic placement of Bombus turneri (Richards), Bombus opulentus Smith, Bombus pyrosoma Morawitz, Bombus longipennis Friese, Bombus minshanensis Bischoff, and Bombus lantschouensis Vogt, all of which are widely distributed throughout different regions of China. The knowledge and understanding gained from the findings can provide a molecular basis to accurately classify Bombus in China and to define strategies to conserve biodiversity and promote pollinator populations.


introduction
Bumble bees belong to the genus Bombus, which has been classified in the tribe Bombini of the subfamily Apinae of the family Apidae. Four sister tribes including Bombini, Apini (e.g., honey bees), Meliponini (e.g., stingless bees), and Euglossini (e.g., orchard bees) belong to the corbiculate clade within the family Apidae (Winston and Michener 1977). With other species of bees and pollinators, bumble bees provide pollination services to vegetable crops in large greenhouses and to a great diversity of plants in the wild, and contribute substantially to the agriculture economy (Velthuis and Van Doorrn 2006;Carolan et al. 2012;Williams et al. 2012a) and biological diversity of ecosystems (Wang and Li 1998;Sun et al. 2003;He and Liu 2004;Goulson et al. 2008;An et al. 2011;Vergara and Fonseca-Buendía 2012). To date, there are approximately 250 known species of bumble bees in the world, with approximately 125 species documented in China (Cameron et al. 2007;Williams et al. 2010;An et al. 2011). However, with degradation of the ecological environment, human activity, pathogen infection and exposure to pesticides, populations of bumble bees are declining in China, especially in northwest China (Yang 1999;Xie et al. 2008), with some species even becoming extinct in certain areas (Rasmont et al. 2005;Williams 2005;Colla and Packer 2008;Goulson et al. 2008). Therefore, it is very important to know the distribution and phylogenetic evolution of bumble bee species in these regions, to be able to design effective conservation strategies for their protection.
The taxonomic status of closely related bumble bee taxa is often unclear. In the early twentieth century, we relied on morphological characters to classify Bombus. However, because of highly variable color patterns and the presence of convergent evolution in morphology, it is difficult to accurately identify the species within Bombus based only on morphological features (Hines and Williams 2012). Subsequently, male genitalia have been used to distinguish the different subgenera; they are more reliable than color pattern in classifying subgenera, although they do not unambiguously distinguish between species under certain circumstances (Stephen 1957;Thorp et al. 1983). With the development of molecular techniques, identification based on molecular markers has become a powerful tool in the phylogenetic analysis and placement of species of Bombus (Michener 2007). However, it is critical to choose appropriate genetic markers in molecular phylogenetic reconstructions. Over the years, the genes Cytochrome Oxidase subunit I (COI) and Cytochrome b (Cytb) have been used in phylogenetic studies on insects (Boehme et al. 2010;Wang and Qiao 2010). A specific COI fragment that is 648 bp long with enough genetic information and base variation has been used to effectively distinguish species (Hebert et al. 2003(Hebert et al. , 2004. COI has been widely applied to species identification and phylogenetics in insects (Pedersen 1996(Pedersen , 2002. Further, the mitochondrial gene 16S rRNA and nuclear genes elongation factor-1 alpha F2 copy (EF-1α), long-wavelength rhodopsin copy 1 (Opsin), arginine kinase (Argk), and phosphoenolpyruvate carboxykinase (PEPCK) have also been used for phylogenetic analysis (Kawakita et al. 2003(Kawakita et al. , 2004Cameron et al. 2007). The mitochondrial gene 16S rRNA is a useful marker for examining the phylogenetic position of some insects (Yoshizawa and Johnson 2003), and is the most informative for phylogenetic analysis of closely related species (Whitfield and Cameron 1998). EF-1α, which can promote aminoacyl-tRNA to combine with ribosomes, is often recognized as a good molecular marker to resolve the classification of insects at the phylogenetic level of family or genus (Cho et al. 1995;Mardulyn and Cameron 1999;Rokas et al. 2002;Danforth et al. 2004). The Opsin gene belongs to the family of the light absorption receptor proteins. It can distinguish evolutionary divergence in hymenopteran insects, including Cynipidae and Halictidae (Rokas et al. 2002;Danforth et al. 2004), and can be used to deduce the relationships between these and the corbiculate Apinae (Mardulyn and Cameron 1999;Mardulyn 2001, 2003;Michel-Salzat and Whitfield 2004). ArgK is a kind of phosphate kinase which distributes broadly in the tissue of insects and is also a relatively conserved nuclear gene. The PEPCK sequence contains two parts, the high variation intron and the conserved exon, which are largely applied in the classification of the order Lepidoptera (Friedlander et al. 1996).
While advances in molecular marker techniques have led to significant improvements in population genetic analysis, the standard mitochondrial barcode fragment or nuclear genes are sometimes not informative enough to help understand the genetic variability of species. When multiple genes are combined for phylogenetic analysis, a much clearer view of the phylogeny among closely related species can be generated. Kawakita et al. (2004) elucidated the phylogeny of 65 species of bumble bees through the use of three nuclear genes, and analyzed their geographic distribution and character evolution. Later, Cameron et al. (2007) made a robust phylogeny with a comprehensive analysis of 219 species of bumble bees from all over the world, including some species from China, which utilized mitochondrial gene 16S rRNA and four nuclear gene sequences (Opsin, EF-1α, Argk, and PEPCK). Their results suggested that, overall, Bombus is monophyletic, with the subgenera grouped into two distinct clades, the short-faced and the longfaced, the first including a diverse New World clade. Their paper systematically analyzed the relationships among all subgenera and provided a foundation for the phylogeny of Bombus, although there remained some species that could not be included.
To improve our understanding of the phylogenetic relationships of Bombus in China, we conducted a phylogenetic analysis of 209 species by combining sequence fragments of two mitochondrial genes (COI and 16S rRNA) and four nuclear genes (EF-1α, Opsin, ArgK, and PEPCK). We obtained 152 gene sequences from 26 species recently collected from different regions of China. An additional 1093 gene sequences representing 183 additional species of Bombus were obtained from GenBank or BOLD. Among the 26 recently-collected species, B. pyrosoma Morawitz and B. lantschouensis Vogt are two common native species and important pollinators, characterized by having more workers in the colony, by ease of rearing in an indoor environment, and by a widespread distribution in China (Peng et al. 2009). Through this study, we have gained an insight into Chinese bumble bee species distributions and phylogenetic relationships, which in turn could be applied to our efforts of biodiversity conservation to promote pollinator populations.

Bumble bee collection
Bumble bee specimens were collected with nets and as a random sample at any given locality in the Sichuan, Inner Mongolia, Qinghai, Anhui, Gansu provinces and Beijing, China, between 2006 and 2012, and after capture were transferred directly into 100% ethanol. The samples were kept at -20 °C for subsequent analysis and voucher specimens are deposited at the Institute of Apicultural Research, Chinese Academy of Agricultural Science, Beijing, China. Exact collection localities are listed in Table 1

Morphology
All species were identified according to the morphological characters of bumble bees as described by Williams (1998). Subgenera and species were authenticated by the characters of the genitalia and other key taxonomic characters such as body size, color pattern, and leg structure (Williams et al. 2009;). The detailed morphological classification is presented in Suppl. material 1.

Genomic DNA extraction
For the extraction of nucleic acid, the muscle tissue of each individual bee's thorax was cleanly cut off with scissors, immediately put into an aseptic tube and ground in liquid nitrogen with a pestle. DNA was extracted from bee muscle tissue using a Wizard ® Genomic DNA Purification Kit (A1120, Promega). DNA extracts were kept at -20 °C until needed as a DNA template for the PCR.

PCR amplification and sequencing
The specific primers used to amplify the two mitochondrial genes (COI and 16S rRNA) and four nuclear genes (Opsin, EF-1α, Argk, and PEPCK) are shown in Table 2. PCR reactions were performed using a Mastercycler 5333 (Eppendorf ) in 25 μL PCR Mix (2×), 2 μL template genomic DNA (about 50 ng), 1 μL of each primer (forward and reverse), 21 μL ddH 2 O, with a final volume of 50 μL. PCR parameters for amplification were as follows: initial denaturation at 94 °C for 3 min, followed by 35 cycles denaturation at 94 °C for 1 min, annealing at 50-60 °C for 1 min, elongation at 72 °C for 1 min and final elongation at 72 °C for 10 min. The annealing temperatures for each gene were: 50 °C for PEPCK and Argk, 53 °C for EF-1α, 55 °C for 16S rRNA, 56 °C for COI, and 60 °C for Opsin. PCR products were electrophoresed in 1.2% agarose gel containing 0.5 μg/ml GoldView (GV) and visualized under UV light. PCR products were purified and then sent to Invitrogen for sequencing. After manual editing and error checking, we then performed a BLAST database search in GenBank to identify and include the closest matches of the same sequence for Bombus taxa. We obtained 152 valid sequences belonging to 26 Bombus species. The sequences used in this analysis have been deposited in GenBank. The list of sequences with their codes and the respective GenBank accession numbers can be found in Table 1.

Sequence analysis and construction of the phylogenetic tree
Altogether, 1245 gene sequences were used to conduct the phylogenetic analysis. One hundred and fifty-two (152) sequences from 26 bumble bee species collected during this study (Tab. 1) and 1037 sequences from 183 bumble bee species retrieved from GenBank or BOLD (see Cameron et al. 2007) were used to construct the phylogenetic tree. The Apini Apis mellifera Linnaeus and Apis dorsata Fabricius, the  (2004) Cameron et al. (2007).
The sequence data were aligned by ClustalX using default settings and visually checked using BioEdit (V7.0.9.0). We referred to Cameron et al. (2007), who had submitted the sequences to GenBank, and downloaded sequences of 16S rRNA, ArgK, EF-1α, Opsin, PEPCK and COI of bumble bees and outgroups from GenBank or BOLD. Phylogenetic analysis was conducted in MEGA 6.0 (Tamura et al. 2013).
Phylogenetic relationships were estimated by Bayesian analysis, maximum likelihood (ML) analysis, maximum parsimony (MP) analysis and Neighbor Joining (NJ) analysis, separately. Model selection for each gene was based on the Akaike Information Criterion (AIC) in Modeltest (Posada and Crandall 1998) and MrModeltest (Nylander 2004); the best model parameters for each gene partition were GTR+I+G. Bayesian analysis was performed using MrBayes v. 3.2.6 (Ronquist et al. 2012). Two independent Markov Chain Monte Carlo (MCMC) runs were conducted for 10 million generations, sampling every 1000 generations. Tracer v.1.6 was used to establish the convergence between two runs (Rambaut et al. 2014). Burn-in samples, which were the first 25% of the yielded trees, were discarded, then the remaining trees were used to generate a majority-rule consensus tree with posterior probabilities (PP). ML analysis were conducted using the GTRGAMMAI model of RAxML v.7.2.6. Node support was assessed via 1000 bootstrap replicates (Stamatakis 2006). MP analysis was performed using PAUP* v.4.0a165 (Swofford 2002). The tree bisection reconnection (TBR) branch swapping algorithm was used, and 100 random addition replicates were performed using heuristic strategy. Support values were assessed under the heuristic search with TBR and 100 jackknife replicates each with 100 random addition searches. NJ analysis was performed using MEGA v. 6.0 (Tamura et al. 2013). The phylogenetic trees were displayed and edited utilizing Figtree v.1.4.0.

Phylogenetic analysis
The results of the phylogenetic analysis of 209 Bombus species and ten outgroup species showed the same topology structure in two trees, which is similar to results in Cameron et al. (2007). The Bombus genus was divided into two distinct clades by Bayesian Inference (BI), ML and MP: the short-faced and the long-faced (Figs 2, 3). The morphological differences between the short-faced and long-faced clades are based on characters of the head including the length of the tongue, and on the presence/absence of a midbasitarsal spine. The short-faced species are generally short-tongued without a mid-basitarsal spine, while the long-faced species are long-tongued with a mid-basitarsal spine. The subgenera Mendacibombus, Confusibombus, Bombias, and Kallobombus are separated into two clades, which is consistent with previous studies (Williams 1994;Pedersen 2002;Kawakita et al. 2003Kawakita et al. , 2004Cameron et al. 2007

s. str.
Our phylogeny is consistent with the studies reported by Cameron et al. (2007) in terms of the number and variety of subgenera in the short-faced and long-faced clades and the relationships among the different subgenera. We added six new species of Bombus from China into the trees. As a result, there were some differences in the subgenera Psithyrus, Thoracobombus, Melanobombus and Bombus s. str.; B. lantschouensis is sister to B. lucorum and B. patagiatus; B. longipennis and B. affinis Cresson were not well supported by BI and ML methods (PP = 0.49; bootstrap values = 33) in the Bombus s. str. clade, and B. minshanensis together with B. lantschouensis and B. lucorum + B. patagiatus formed a branch in the Bombus s. str. clade (Fig. 2).
Besides 20 species of bumble bees in our samples which were also included in the phylogenetic trees of Cameron et al. (2007), we replaced the original sequences in Cameron et al.'s study with new sequences generated from this study and reconstructed the phylogenetic trees (Figs 2, 3) study, but grouped with B. lantschouensis in our study. These differences may be due to the combined gene approach or to the addition of new species sequences in our study, and further research is needed to clarify this problem.
Based on the sequences of five genes (16S rRNA, Argk, EF-1α, Opsin, and PEP-CK), we analyzed the relationships between the same 20 species and built one phylogenetic tree using the ML analysis (Fig. 4). In general, the results suggested that most species of Bombus were stable in genetic evolution and that their taxonomic positions showed no significant change with the variation of distribution areas. The results showed that nearly all specimens of the same species formed one clade in the phylogenetic tree (Fig. 4). We compared all gene sequences of the six species from Cameron et al. (2007) to our samples and found that there were some sequence variations between both groups of samples, which may reflect the adaptation to different geographical environments and evolutionary pathways in certain bumble bee species.
To ensure the accuracy in the classification of species using the combined gene approach, we utilized six genes to analyze the relationships among species. In Cameron et al. (2007), B. ruderarius (Müller) and B. velox (Skorikov) were sister species supported by PP = 0.52, but in our analysis B. ruderarius and B. velox are sister species supported by PP = 0.99 and bootstrap values = 61 in the BI and ML analyses, respectively. In Cameron et al. (2007) ) (

s. str.
and their relationship with B. patagiatus is distant (Fig. 2). Although most species are strongly supported by bootstrap values in the BI and ML phylogenetic trees based on the six combined genes, there are certain species which are not well supported. For example, B. longipennis and B. affinis are sister species in the tree but the support values are low (PP = 0.49; bootstrap values = 33). Bombus longipennis, B. affinis, and B. franklini are sister clades, and their support values are also low (PP = 0.27; bootstrap values = 38) (Fig. 2). As shown in Fig. 3, the phylogenetic trees obtained by the MP method showed that there are two distinct clades (short-faced and long-faced) including nearly all subgenera of Bombus. This, in general, is consistent with the results obtained using the BI and ML methods, while there were still some variations among subgenera in the topology structure of the phylogenetic trees, and most of the support values on the branches were very low. The phylogenetic tree obtained by the NJ method is different from those resulting from the other three methods (Fig. 5). There are not two distinct clades, and the phylogenetic relationships of some species are not consistent with morphology (Figs 1-3 (Fig. 5).
The support values were also low for many bumble bee species in the NJ phylogenetic tree. These results suggest that the BI and ML methods were better than MP and NJ to analyze the phylogenetic relationships among a large number of samples.
Furthermore, based on the monophyletic groups of bumble bees in the phylogenetic trees of Cameron et al. (2007), morphology, and the important behavioral and ecological characters of bumble bees, Williams et al. (2008) simplified the subgeneric classification of bumble bees from 38 to 15 subgenera. The results of our BI, ML, and MP analyses are consistent with what Williams proposed (Figs 2, 6). Moreover, we constructed the phylogenetic tree by BI based on the combined six genes (Fig. 6), and most clades were well supported by posterior probabilities except the one formed by Melanobombus and its sister clade Sivircobombus + Cullumanobombus (PP = 0.44). It may be that these 15 subgenera are in close proximity in a molecular evolutionary sense, and we need to distinguish them using other information. These results suggest that molecular methods can determine the taxonomic status of the majority of species in Bombus, and that it is consistent with morphological identification, but there are a few species in the phylogenetic tree for which the posterior probability and bootstrap values are a little low, and their classification may need to be further supported by combining other criteria with morphology.

Distinguishing bumble bee species
There are many Bombus species distributed in diverse regions all over the world. Previous studies revealed that color pattern and the characters of the male genitalia could clearly distinguish the subgenera of Bombus (Franklin 1954;Hines et al. 2006). There have been some problems in some cryptic species complexes; for instance, according to the color pattern, it is easy to consider B. cryptarum, B. lucorum, and B. magnus Vogt as one species, but the chemical and molecular evidence suggests that they are three distinct species (Carolan et al. 2012). Molecular methods are a powerful tool for inference of phylogenetic relationships (Ratnasingham and Hebert 2013). Because the evolutionary rates of single genes are different, each genetic marker has its advantages and disadvantages in phylogenetic analysis and a single gene cannot always clearly resolve the classification of species. Hines et al. (2006) found that combined genes can obtain stronger support values in some nodes compared to individual genes. In the present study, we also explored the power of combining multiple genetic markers which are conserved in evolution and accurately infer phylogenetic relationships of species (Cameron and Mardulyn 2003;Hebert et al. 2003;Magnacca and Brown 2012;Isaka and Sato 2014;Kjer et al. 2014;Schmidt et al. 2015). When multiple genes were combined, we could generate a clearer phylogeny to accurately determine the taxonomy of species by relying on the ability of the individual genes to reconstruct a known phylogeny and a set of genes to accurately infer the phylogenetic relationships of species. Although it is possible to resolve the taxonomy of species within Pyrobombus with combined multiple genes, it is still difficult to distinguish among all Bombus species.
China has the largest diversity of Bombus species in the world (Williams et al. 2010). However, it has been an increasing challenge to effectively protect and utilize the abundant resource of bumble bees in China. Our results significantly enhance our understanding of the taxonomic status and distribution of Bombus in China and provide an important foundation in further revealing the evolutionary history of Bombus and strengthening the protection of bumble bees as a resource. Some species readily reproduce, so they have been successfully commercially reared for pollination in greenhouses. For example, B. terrestris (Linnaeus) (subgenus Bombus s. str.) has been reared commercially in Europe and B. impatiens Cresson (subgenus Pyrobombus) has been reared commercially in North America. Although we could introduce these Bombus species to our country for pollination of crops in large greenhouses, there is the potential that they could bring new pathogens that would threaten the native species. As a result, the most practical solution would be to identify native species that can be readily reared for large-scale production. Our results showed that B. longipennis, B. minshanensis, B. lantschouensis, and B. terrestris belong to the subgenus Bombus s. str. and have a close phylogenetic relationship within the subgenus. These species are widely distributed throughout China, so they may represent an ideal option for commercial rearing in China. Our future goal is to distinguish all species of Bombus completely and accurately using a combination of different methods, thereby leading to a better understanding of the distribution and evolutionary history of bumble bees in China and improving our strategies of biodiversity conservation to promote pollinator populations.