DNA barcoding and the differentiation between North American and West European Phormia regina (Diptera, Calliphoridae, Chrysomyinae)

Abstract Phormia regina (the black fly) is a common Holarctic blow fly species which serves as a primary indicator taxon to estimate minimal post mortem intervals. It is also a major research model in physiological and neurological studies on insect feeding. Previous studies have shown a sequence divergence of up to 4.3% in the mitochondrial COI gene between W European and N American P. regina populations. Here, we DNA barcoded P. regina specimens from six N American and 17 W European populations and confirmed a mean sequence divergence of ca. 4% between the populations of the two continents, while sequence divergence within each continent was a ten-fold lower. Comparable mean mtDNA sequence divergences were observed for COII (3.7%) and cyt b (5.3%), but mean divergence was lower for 16S (0.4–0.6%). Intercontinental divergence at nuclear DNA was very low (≤ 0.1% for both 28S and ITS2), and we did not detect any morphological differentiation between N American and W European specimens. Therefore, we consider the strong differentiation at COI, COII and cyt b as intraspecific mtDNA sequence divergence that should be taken into account when using P. regina in forensic casework or experimental research.


Keywords
Black fly, COI,COII,cytb,16S,28S,ITS2 introduction Forensic entomology uses the larval and pupal developmental stages of insects sampled on a corpse to estimate a minimum post-mortem interval (PMImin) of the corpse (Amendt et al. 2004(Amendt et al. , 2007. This requires i) detailed and accurate knowledge of the developmental rate of the species of forensic interest under different temperature conditions (Charabidze 2012), and ii) identification tools by which the different immature insect stadia can be identified (Catts 1992). Blowflies (family Calliphoridae) are among the most common insects found on dead bodies shortly after death. The species differ in their developmental times and have therefore a high potential for the accurate estimation of the PMImin. Unfortunately, several forensically important blow fly species can hardly be distinguished morphologically, especially in the larval and pupal stages (e.g. Catts 1992). To improve the success and reliability of identifications, a number of molecular techniques and tools have been explored to identify forensically important species (Wells and Stevens 2008, reviewed in Jordaens et al. in press).
Currently, the most popular molecular method for organismal identification is DNA barcoding, which was promoted by Hebert et al. (2003a, b) as a standardized molecular identification tool for all animals. It refers to establishing species-level identifications by sequencing a fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene, the "DNA barcode", into a taxonomically unknown specimen and performing comparisons with a reference library of barcodes of well-identified species. COI barcodes (and other fragments of COI) indeed have been successfully applied in the identification of many calliphorid species (e.g. Wallman and Donnellan 2001, Wells and Sperling 2001, Nelson et al. 2007, Wells and Williams 2007, Harvey et al. 2008, Desmyter and Gosselin 2009, DeBry et al. 2013). Yet, COI fails to unambiguously discriminate among several calliphorid species pairs (e.g. Nelson et al. 2007, see also the Discussion) and the use of alternative identification tools (e.g. other genes) could be necessary to acquire correct identifications.
Phormia regina is a highly mobile species that is abundant in North American areas with cool spring and fall temperatures and in warmer areas, but then at higher altitudes (Hall 1948, Brundage et al. 2011). The developmental time of P. regina seems highly variable and could be influenced by a number of environmental variables (Kamal 1958, Greenberg 1991, Anderson 2000, Byrd and Allen 2001, Nabity et al. 2007, Núñez-Vázquez et al. 2013. Using amplified fragment length polymorphisms (AFLP), Picard and Wells (2009) studied the population genetic structure of N American P. regina and found that the N American populations were panmictic but with significant temporal genetic differences within populations, even over short periods of time. They therefore suggested that part of the variation in developmental times and growth curves that was observed in laboratory studies is not only due to local environmental (i.e. laboratory) conditions, but also to differences in the genetic composition of the laboratory stocks. This finding is important for forensic sciences since it shows that forensically relevant ecological data from one population (i.e. from a forensic case) cannot be extrapolated to other populations (i.e. to other forensic cases). Interestingly, Desmyter and Gosselin (2009) found a 4.2% sequence divergence at a 304 bp COI fragment between N American and W European specimens. Subsequently, Boehme et al. (2012) found a similar sequence divergence (range: 3.5%-4.31%) at the COI barcodes between N American and W European P. regina specimens.
Because high COI sequence divergences are often indicating species level differentiation (e.g. Hebert et al. 2003a, b), the strong COI differentiation between N American and W European P. regina specimens calls for a taxonomic re-assessment. We therefore studied DNA sequence variation in mitochondrial and nuclear DNA, and examined morphological differentiation between N American and W European populations of P. regina to i) provide additional DNA barcodes for P. regina, ii) examine molecular differentiation between N American and W European specimens in other genes, and iii) assess whether the COI differentiation is correlated with morphological differentiation. The taxonomy of P. regina is then re-evaluated in the light of these results.

Specimen collection and morphological examination
Sixty-one adult individuals of P. regina were captured at several localities in N America (Indiana, Texas, Virginia, Washington, Wyoming) and W Europe (Belgium, France, Germany) and stored in > 70% ethanol (Appendix 1 -Supplementary Table  1). The individuals were qualitatively scored for the color of 11 external characters (Table 2). In addition, we dissected the male copulatory organs of five W European and five N American individuals to study the general shape of the penis, cerci and surstyli ( Figure 1).

DNA sequence analysis
DNA was extracted from on one or two legs. The remaining parts of the vouchers are kept at the NICC (National Institute of Criminalistics and Criminology -Brussels, Belgium) as pinned material. Genomic DNA was extracted using the NucleoSpin Tissue kit (Macherey-Nagel). A fragment of 721 bp from the 5'-end of the COI gene, including the standard barcode region (Hebert et al. 2003a,b), was amplified using primer pair TY-J-1460 and C1-N-2191 (Sperling et al. 1994, Wells andSperling 2001). Five other DNA markers were sequenced for a more limited set of samples (Appendix 1 -Supplementary Table 1). Fragments of the mitochondrial 16S ribosomal RNA (16S), cytochrome c oxidase subunit II (COII), and cytochrome b (cytb) genes, and of the nuclear ribosomal internal transcribed spacer 2 (ITS2) and fragment D1-D2 of the 28S ribosomal RNA (28S) were amplified using primer pairs 16Sf.dip/16Sr.dip (Kutty  Each 25 µl PCR reaction was prepared using 1 × PCR buffer, 0.2 mM dNTPs, 0.4 µM of each primer, 2.0 mM MgCl 2 , 0.5 U of Taq DNA polymerase (Platinum ® , Invitrogen), 2-4 µl DNA template (DNA was stored in 100 µl of elution buffer) and enough mQ-H 2 O to complete the total PCR reaction volume. The thermal cycler program consisted of an initial denaturation step of 4 min at 94 °C, followed by 30-40 cycles of 45-60 s at 94 °C, 30-60 s at a fragment depending annealing temperature and 90 s at 72 °C; with a final extension of 7 min at 72 °C. The annealing temperatures were 45 °C for COI and COII, 48 °C for 16S and cytb, 50 °C for ITS-2 and 55 °C for 28S. PCR products were cleaned using the NucleoFast96 PCR® kit (Macherey-Nagel) and bidirectionally sequenced on an ABI 3130 Genetic Analyzer (Applied Biosystems) using the BigDye® Terminator Cycle Sequencing Kit v3.1. Together with the P. regina specimens we also collected several ProtoPhormia terraenovae specimens that were also sequenced to increase the number of material for comparison (Appendix 1 -Supplementary Table 1). Sequences were assembled in SeqScape v2.5 (Applied Biosystems) and deposited in GenBank under accession numbers KF908069-KF908124 (COI), KF908126- Phormiini and its sister clade Chrysomyini form the Chrysomyinae (Singh and Wells 2011a, b). We therefore downloaded from GenBank (and for all genes) all available sequences (at 11 July 2013) of the Phormiini (genera Phormia, Proto-Phormia and Protocalliphora) and of the Chrysomyini (genera Chloroprocta, Chrysomya, Cochliomyia, Compsomyiops, Hemilucilia, Paralucilia and Trypocalliphora) to allow comparison with closely related taxa (Table 1). Sequences were aligned in MAFFT v7 (Katoh and Standley 2013). Sequences with > 5 ambiguous positions were discarded and each dataset was trimmed to equal sequence length (Table 3). The 16S dataset was trimmed at 251 bp and at 350 bp to yield a higher number of Chrysomyinae haplotypes for the latter dataset (i.e. 22 vs. 42 unique haplotypes; six species in the ingroup for both datasets). Alignments are available as fasta files in the online Appendix 2 text file. Unique sequences (haplotypes) were selected in DAMBE5 (Xia 2013). Nucleotide sequence divergences within and between species (based on the haplotypes) were calculated using the uncorrected p-distances in MEGA v5.05 (Tamura et al. 2011). For these calculations we excluded haplotypes that were not identified to the species level (one Protocalliphora sp. for COI) or that were most likely identification errors (for details see the Results). MEGA v5.05 was also used to construct Neighbour-Joining (NJ) trees (Saitou and Nei 1987) using the p-distances with complete deletion of positions with ambiguities and alignment gaps (indels). Relative branch support was evaluated with 1000 bootstrap replicates (Felsenstein 1985). In all analyses, several Lucilia spp. or Calliphora spp. sequences from GenBank were added as outgroups, and for COI we also used L. sericata NICC0390 as outgroup (GenBank accession number KF908125). Author names of all species are provided in Table 1.

Morphology
We did not detect morphological differences between N American and W European P. regina specimens in the 11 external color characters that we scored (Table 2). Also the male copulatory organs of W European and N American P. regina specimens were indistinguishable (Figure 1).

DNA sequence analysis
Basic information of the different datasets can be found in Table 3. There was only high bootstrap support for the monophyly of Chrysomyinae, Phormiini or Chrysomyini with 28S and a sister group relationship of P. regina and Pr. terraenovae with ITS2. Yet, for all fragments, except for 28S, there was high bootstrap support for the monophyly of P. regina (Figures 2-4 and Appendix 1 - Supplementary Figures 1-3). COI: The COI NJ-tree showed two supported clades within P. regina ( Figure  2). One clade (EU = Europe) comprised six haplotypes from Europe (23 specimens sequenced), while the other clade (NA = North America) comprised 14 haplotypes from N America (27 specimens sequenced). The seven NA haplotypes available in GenBank clustered within the NA clade. The mean p-distance between the EU and NA P. regina haplotypes was 0.04 ± 0.007 (Table 3). Sequence divergence in P. regina within each continent was approximately a ten-fold lower, viz. EU: 0.003 ± 0.001 -NA: 0.004 ± 0.001.
COII: The two EU and seven NA haplotypes of P. regina (from 30 specimens) formed two strongly supported clades (Figure 3) separated by mean p-distance of 0.037 ± 0.008 (Table 3). The three COII sequences from GenBank (from NA specimens) had the same haplotype as our NA specimens. Sequence divergence in P. regina within each continent was approximately a ten-fold lower, viz. EU: 0.002 ± 0.002 -NA: 0.004 ± 0.002 (Table 3). The mean p-distance between the 14 Chrysomya taxa was 0.059 ± 0.007. We considered C. megacephala_FJ153270 and C. rufifacies_ FJ839395 as misidentifications, and C. rufifacies_AY842670_AY842671 to be different from the other C. rufifacies individuals given the high sequence divergence (viz. mean p-distance = 0.10 ± 0.013). The mean p-distance between Co. macellaria and Co. hominivorax was 0.048 ± 0.009. The mean intra-and interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.014 ± 0.014 and 0.046 ± 0.005, respectively (Table 3).
Cyt b: The three EU and seven NA haplotypes of P. regina (from 17 specimens) formed two strongly supported clades ( Figure 4) with a mean p-distance of 0.053 ± 0.009 between these two clades (Table 3). There were no cytb sequences of Phormia in GenBank. Sequence divergence in P. regina within each continent was approximately a ten-fold lower, viz. EU: 0.002 ± 0.007 -NA: 0.005 ± 0.002 (Table 3). The mean pdistance between the three Chrysomya species was 0.046 ± 0.005. The mean intra-and  interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.003 ± 0.002 and 0.079 ± 0.007, respectively (Table 3).
16S: For the 350 bp dataset, the three NA 16S haplotypes (from 15 specimens) (mean within NA p-distance = 0.004 ± 0.003; Table 3) formed a well-supported clade, and formed a monophyletic group with the single EU haplotype ( Supplementary Figure 1A). The mean p-distance between the NA and EU haplotypes was 0.006 ± 0.003. The mean p-distance between C. megacephala and C. rufifacies was 0.040 ± 0.009. The mean intra-and interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.014 and 0.023 ± 0.004.
For the 251 bp dataset, all eleven NA specimens had the same haplotype with a p-distance of 0.004 to the EU haplotype (four specimens) (Supplementary Figure 1B). The mean p-distance between C. megacephala and C. rufifacies was 0.059 ± 0.012. The mean intra-and interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.028 ± 0.009 and 0.038 ± 0.006, respectively (Table 3).
ITS2: Excluding indels, all P. regina specimens (36 specimens) had the same haplotype ( Supplementary Figure 2), except for P. regina NICC0302 that had a C instead of a T at position 219 of the alignment (p-distance = 0.003). Phormia regina NICC0640 had a deletion at position 201, and P. regina NICC0048 had an insertion of a G at position 270 of the alignment. Both specimens were from the same locality (Liège -Belgium) in W Europe. The p-distance between Co. hominivorax and Co. macellaria was 0.008 ± 0.001, that between H. segmentaria and H. semidiaphana was 0.106 ± 0.018, and the mean p-distance among 16 Chrysomya species was 0.085 ± 0.010. The mean intra-and interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.008 ± 0.005 and 0.085 ± 0.011, respectively (Table 3).
28S: All 37 P. regina specimens had the same haplotype, except for P. regina JQ246614 from N America that had an AG insertion at positions 460-461 of the alignment (Supplementary Figure 3). One haplotype of Pr. terraenovae (three specimens with GenBank accession numbers AJ300142, JQ307780 and JQ246615) only differed by two indels from haplotype JQ246614 of P. regina (at positions 408 and 460-461) (the other Pr. terraenovae haplotype differed at more positions). The mean p-distance between Co. macellaria and Co. hominivorax was 0.005, that between Pro. azurea and Pro. sialia was zero [an indel at position 439 (A) in Pro. azurea) of the alignment], and that between H. semidiaphana and H. segmentaria was 0.013. The mean p-distance among the six Chrysomya species was 0.006 ± 0.002. The mean intra-and interspecific p-distances among all Chrysomyinae species (excluding P. regina) were 0.003 ± 0.004 and 0.007 ± 0.002, respectively (Table 3). (2009) and Boehme et al. (2012) found a mean sequence divergence of approximately 4% within a 304 bp and the barcoding COI region between N American and W European P. regina, respectively. We confirmed this COI divergence with newly sequenced material. Such a strong divergence at COI is common among insect species (e.g. Park et al. 2011a, b, Webb et al. 2012, Ng'endo et al. 2013. Moreover, we here show a similar degree of divergence at two other mtDNA genes, viz. COII (3.7%) and cytb (5.3%). The 'within-continent' divergence in P. regina was very low (0.2-0.5% for the three genes) and comparable to the intraspecific differentiation of other Chrysomyinae (0.5% for COI, 1.4% for COII, 0.3% for cytb). Hence, the high between-continent mtDNA differentiation, and low within-continent mtDNA divergence may hint at a taxonomic difference between the N American and W European populations. In order to evaluate this suggestion, we included all publicly available GenBank sequences from species of the subfamily Chrysomyinae for the four mtDNA and two nDNA gene fragments that we sequenced. The combined study of mtDNA and nDNA has proven valuable to disentangle the taxonomy of other calliphorid species (e.g. Nelson et al. 2007, Sonet et al. 2012).

Desmyter and Gosselin
On the one hand, our results show that the mean p-distance of other intrageneric interspecific comparisons (COI: 5-6.8%, COII: 4.8-5.9%, cytb: 4.6%, 16S (251 bp): 5.9%), or among other Chrysomyinae species in general (COI: 6.6%, COII: 4.6%, cytb: 7.9%, 16S (251 bp): 3.8%), are higher than the mean p-distances between N American and W European P. regina at the four mtDNA fragments (COI: 4%, COII: 3.7%, cytb: 5.3%, 16S: 0.6%). For cytb the NA-EU differentiation in P. regina is higher than that observed within other Chrysomyinae species (0.3%) yet still below the minimum interspecific p-distance (7.3%). On the other hand, for COI and COII, the NA-EU differentiation in P. regina is higher than the intraspecific differentiation in other Chrysomyinae species and well within the range of interspecific p-distances within Chrysomyinae. Yet, the low interspecific p-distance between some Chrysomyinae species may be due to misidentifications or may be the result of a natural process (e.g. hybridization, incomplete lineage sorting). Likewise, the high intraspecific variation within some species may be indicative of cryptic diversity (see further).
North American and W European P. regina were not differentiated at both nDNA fragments, and at the mtDNA 16S (< 1%), whereas interspecific p-distances in Chrysomyinae in general are substantial for ITS2 (8.5%) and 16S (3.8%). Moreover, the NA-EU differentiation in P. regina at these genes was even lower than the minimum intraspecific differentiation within other Chrysomyinae. This suggests that the variation at these genes in P. regina is intraspecific variation. Finally, we could neither detect color differences in 11 external characters, nor in the general shape of the male copulatory organs between N American and W European specimens. Evidently, a statistical analysis of more specimens (from a wider range of the species' distribution) is necessary to reliably assess within and among population variation at these (and eventually other) morphological characters. For the time being, we consider the high differentiation at COI, COII and cytb, but the low (16S, nDNA) or lack of (morphological) differentiation, as indicative of substantial intraspecific mtDNA sequence divergence, rather than as a species-level differentiation.
Our findings may have important implications for the use of P. regina in forensic and other scientific fields. Indeed, it has been suggested that the high variation in developmental times and growth curves of P. regina (e.g. Byrd and Allen 2001 and references therein) is partly due to differences in the population genetic structure (Picard and Wells 2009) and that therefore ecological data obtained from one population should not be generalized or extrapolated to other populations (Byrne et al. 1995). Interestingly, Marchenko (2001) reports a mean accumulated degreedays (from egg to adult) of 148 °C (lower development temperature: 11.4 °C) for Russian/Lithuanian P. regina, whereas a mean accumulated degree-days of 162 °C (lower development temperature: 11.16 °C) was found for N American P. regina (Yves Braet, unpublished preliminary results). Hence, the strong mtDNA divergence between N American and W European P. regina requires a sound comparison of the ecology of populations from both continents, especially since P. regina is a key species in the study of the physiology and neurology of insect feeding (e.g. Haselton et al. 2009, Larson and Stoffolano 2011, Ishida et al. 2012. Moreover, if locally diverged populations differ in their developmental biology, then this may affect the estimate of PMImin. Intraspecific mtDNA divergence in other Chrysomyinae species is sometimes also high, viz. 4.3% for COI in C. megacephala, and 2.2%, 2.6% and 3.7% for COII in C. megacephala, C. semimetallica and C. rufifacies, respectively. Whereas these high intraspecific divergences may be due to hybridization/introgression or incomplete lineage sorting, they may also point to misidentifications. Obviously these issues are problematic if DNA barcoding of animals is only based on COI, as advocated by Hebert et al. (2003a, b). For instance, three C. megacephala specimens (KC135924, KC139925, KC135926) have a remarkably high p-distance of 8% with the other C. megacephala haplotypes and it would be advisable to re-identify these specimens. Also C. semimetallica shows much more intraspecific sequence variation (mean p-distance = 0.011 ± 0.003) as compared to other Chrysomyinae species but at the same time the species has a low mean interspecific p-distance with C. albiceps (p-distance = 0.017 ± 0.004).
Although there is no doubt that COI is a useful tool for the identification of forensically important Chrysomyinae species (Wells and Sperling 2001, Nelson et al. 2007, Wells and Williams 2007, Desmyter and Gosselin 2009, Boehme et al. 2012) not all species can be identified with COI. For instance, there is very low mean interspecific p-distance of 0.006 ± 0.002 between C. megacephala (excluding the three aforementioned haplotypes), C. cabrerai, C. saffranea and C. pacifica (the first two even share a haplotype) (see also Harvey et al. 2008). Therefore, other genes (or gene fragments) might help to overcome the shortcomings of the sole use of COI as molecular identification tool. We here showed that also COII may be a good DNA barcode marker in the Chrysomyinae. Indeed, the mean interspecific p-distance at COII is 4.6%, whereas the mean intraspecific distance is much lower (1.4%). Yet, the amount of Chrysomyinae COII data that is currently available in public libraries such as GenBank (194 sequences representing 108 haplotypes from 20 species), is rather limited compared to the amount of COI data (339 sequences representing 180 haplotypes from 36 species) (Table 1). Moreover, the problems inherent to misidentifications and introgression also apply to COII (or any other DNA marker). For instance, C. megacephala FJ153270 shares a haplotype within the C. rufifacies clade, and C. rufifacies FJ839395 shares a haplotype within the C. megacephala clade. Also other species share haplotypes such as C. semimetallica and C. latifrons. The other two mtDNA fragments (cytb and 16S) cannot yet be evaluated as DNA barcode markers because of insufficient sequence data (cytb: 32 sequences representing 20 haplotypes of five species; 16S: 39 sequences representing nine haplotypes of six species) ( Table 1), but both have been shown to discriminate sufficiently between other dipteran species of forensic interest (Vincent et al. 2000, Li et al. 2010. So far, the forensically important species within the Chrysomyinae belong to the genera Chrysomya, Cochliomyia, Paralucilia, ProtoPhormia and Phormia. A number of COI reference datasets of these species are available (e.g. Wallman and Donnellan 2001, Wells and Sperling 2001, Nelson et al. 2007, Wells and Williams 2007, Harvey et al. 2008, Desmyter and Gosselin 2009, Boehme et al. 2012) and they seem to work well to identify most forensically important species. Yet, it is important to also include species without a clear forensic interest in (local) reference databases because this will improve the assessment of species boundaries which, in turn, may help to reach a stable taxonomy.
In conclusion, we observed substantial differentiation between N American and W European P. regina at the mtDNA genes COI, COII and cytb, but not at the 16S rDNA and the nDNA genes ITS2 and 28S. Moreover, we neither detected any morphological differentiation between specimens from both continents. We therefore consider the strong mtDNA divergence between specimens from both continents as intraspecific variation. This differentiation has to be taken into account when using P. regina in forensic casework or physiological studies. Finally, the use of COII as a DNA barcode marker in the Chrysomyinae seems to perform as good as the standard COI barcode region.