The use of DNA barcoding to monitor the marine mammal biodiversity along the French Atlantic coast

Abstract In the last ten years, 14 species of cetaceans and five species of pinnipeds have stranded along the Atlantic coast of Brittany in the North West of France. All species included, an average of 150 animals strand each year in this area. Based on reports from the stranding network operating along this coast, the most common stranding events comprise six cetacean species (Delphinus delphis, Tursiops truncatus, Stenella coeruleoalba, Globicephala melas, Grampus griseus, Phocoena phocoena) and one pinniped species (Halichoerus grypus). Rare stranding events include deep-diving or exotic species, such as arctic seals. In this study, our aim was to determine the potential contribution of DNA barcoding to the monitoring of marine mammal biodiversity as performed by the stranding network. We sequenced more than 500 bp of the 5’ end of the mitochondrial COI gene of 89 animals of 15 different species (12 cetaceans and three pinnipeds). Except for members of the Delphininae, all species were unambiguously discriminated on the basis of their COI sequences. We then applied DNA barcoding to identify some “undetermined” samples. Again with the exception of the Delphininae, this was successful using the BOLD identification engine. For samples of the Delphininae, we sequenced a portion of the mitochondrial control region (MCR), and using a non-metric multidimensional scaling plot and posterior probability calculations we were able to putatively assign each sample to a species. We then showed, in the case of the harbour porpoise, that COI polymorphism, although lower than that of MCR, could also be used to assess intraspecific variability. All these results show that the use of DNA barcoding in conjunction with a stranding network could clearly increase the accuracy of the monitoring of marine mammal biodiversity.

Keywords
DNA barcoding, COI, control region, marine mammals, cetaceans, pinnipeds, biodiversity monitoring, stranding network

Introduction
The aim of DNA barcoding is to concentrate the efforts of molecular taxonomists on a single part of the mitochondrial genome, chosen because it presents portions conserved across taxa that are appropriate for primer design, while including polymorphism among and within species (Hebert et al. 2003, 2004). This DNA sequence, the 5' end of the gene coding for subunit I of cytochrome c oxidase (COI), is sufficiently diverse to allow the specific identification of a great majority of animal species. Numerous studies have proven the success of this approach in the animal kingdom, using various sources of tissue samples (e.g. Lambert 2005, Dawnay et al. 2007, Hajibabaei et al. 2007, Borisenko et al. 2008, Ward et al. 2009, Shokralla et al. 2010). Today (June 2013), a database accessible at http://www.boldsystems.org groups DNA barcode sequence data for more than 133,000 animal species, and offers a powerful identification tool for new specimens (Ratnasingham and Hebert 2007).
DNA barcoding also has some inherent limitations (Valentini et al. 2009): it is based on a single locus on the mitochondrial genome, which is only maternally inherited (Hartl and Clark 2007), can show heteroplasmy (Kmiec et al. 2006, Vollmer et al. 2011) and may exist as nuclear copies. Some of these limitations have been well documented (Ballard and Whitlock 2004, Toews and Brelsford 2012). The use of DNA barcoding for species delimitation also requires that interspecific divergence be higher than intraspecific divergence. Although this has been shown to be true in numerous taxonomic groups, counter-examples also exist (Amaral et al. 2007, Wiemers and Fiedler 2007, Viricel and Rosel 2012). In the present study, we assess the contributions that DNA barcoding could provide to the monitoring of marine mammal biodiversity along the coasts of Brittany, in the northwest of France. For almost 20 years, the stranding network has been collecting data and, when possible, samples each time a marine mammal stranding is reported. Field correspondents are organized in a geographical area covering the entire Brittany coast. The network is coordinated regionally by Océanopolis (Brest, France), and nationally by Pelagis (La Rochelle, France).
DNA barcoding could be useful for the monitoring of marine mammal strandings at different levels. First, it can confirm the quality and the reproducibility of the species identifications made by the field correspondents. Besides common species, which are often encountered and easily identified, exotic or deep-living species represent rare stranding events. In such cases, DNA barcoding could provide a confirmation or an additional degree of precision of taxonomic determination (Thompson et al. 2012). Second, DNA barcoding can help to refine species identifications in those cases where the taxonomic identification was made only to the genus or family level. This is often due to incomplete or highly degraded carcasses. DNA barcoding is also a valuable and cost-effective alternative to collecting the head or skull of the animals. Third, genetic data collected for DNA barcoding generally include intraspecific variation, which allows downstream population-level analyses including the detection of genetic structure and, in some cases, the monitoring of population movements. A long-term use of the barcoding approach would therefore clearly increase the significance and the precision of marine mammal stranding monitoring. Migration or movement of populations or groups of a particular species can be highlighted, thus revealing, for example, the environmental changes leading to these movements (Pauls et al. 2012).
We evaluated the usefulness of DNA barcoding in the monitoring of marine mammal biodiversity along the coasts of Brittany at three levels: by confirming the taxonomic identification performed by field correspondents, by identifying degraded carcasses or parts of carcasses, and by determining intraspecific variations for two species commonly found off Brittany, the harbour porpoise and the grey seal. For this last part of our study, we also compared COI and the mitochondrial control region in terms of their effectiveness in species identification.

Collection of data and samples
The CRMM (Centre de Recherche sur les Mammifères Marins, La Rochelle, France), now the Joint Service Unit PELAGIS (UMS 3462, University of La Rochelle-CNRS), created the French marine mammal stranding recording program in the early 1970s. The network comprises about 260 field correspondents, who are members of organizations or volunteers (Peltier et al. 2013).
Since 1995, the LEMM (Laboratoire d'Etude des Mammifères Marins, Océanopolis, Brest, France) has coordinated this network at a regional scale in Brittany, North West of France. Data are collected from the Brittany coastlines, analyzed, and then added to the central database maintained in La Rochelle. The Brittany coasts have been divided into 18 sections covering the whole coastline (Jung et al. 2009). In each of these areas, correspondents are trained in the analysis of stranded marine mammals. Taxonomic identification and characteristic measurements are performed following a standard procedure. The LEMM therefore compiles standardized data on a large proportion of cetaceans stranded on the Brittany coasts on a yearly basis. Whenever possible, skin, blubber, muscle and teeth samples are also collected in the field from each stranded animal. Samples are then kept in absolute ethanol or dry at -20 °C until analysis. Some harbour porpoise samples, described in Appendix 1 and in Alfonsi et al. (2012), came from animals stranded or by-caught in the Bay of Biscay (Atlantic coast of France).

Genomic DNA extraction, amplification and sequencing of COI and MCR (mitochondrial control region)
Genomic DNA was extracted from blood samples or from muscle and skin tissues using a standardized protocol and the DNeasy Blood and Tissue kit (Qiagen), following the instructions of the manufacturer. The quality and the concentration of all the DNA extracts were estimated by agarose gel electrophoresis and by spectrophotometry using a Nanodrop 1000 (Thermo Scientific).
A 736 base-pair (bp) fragment of the 5' region of the COI gene (position 5352 to 6087 of the complete mitochondrial genome of the harbour porpoise, GenBank acc. no. AJ554063) was amplified using two newly designed primers, LCOIea (5'-tcggccattttacctatgttcata-3') and HBCUem (5'-ggtggccgaagaatcagaata-3'). The 50 µl PCR final volume included approximately 50 ng of genomic DNA and 25 pmol of each primer in 1× Hotgoldstar master mix (Eurogentec) with a final MgCl2 concentration of 2.5 mM. After an initial denaturation step of 10 min at 95 °C, the thermocycle profile consisted of 32 cycles for cetaceans or 35 cycles for pinnipeds at 95 °C for 30 s, 53 °C for 30 s and 72 °C for 60 s, with a final extension at 72 °C for 10 min.
For some animals, we also amplified and sequenced another part of the mitochondrial genome including the control region (MCR). For cetaceans, the primers and reaction conditions are described in Alfonsi et al. (2012). For pinnipeds, two newly designed primers, LMCRHgem (5'-tcatacccattgccagcattat-3') and HMCRHgem (5'-taccaaatgcatgacaccacag-3'), amplified a 693 bp fragment from position 16160 to 55 of the Halichoerus grypus complete mitochondrial genome sequence (GenBank acc. no. X72004). PCR reaction conditions were the same as described above for pinnipeds, with the hybridization temperature set to 53 °C. PCR products were purified using the "MinElute PCR Purification Kit" and sequenced by a commercial sequencing facility (Macrogen, Korea).
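The fragment coordinates quoted above can be checked with simple arithmetic. The following Python sketch is only an illustration, not part of the original workflow: it computes the length of a fragment from 1-based inclusive coordinates, allowing for a fragment that wraps past the origin of a circular mitochondrial genome, as the pinniped MCR fragment does. The genome length of 16797 bp used for the grey seal is an assumption: it is the value that makes the reported 693 bp consistent with positions 16160 to 55, and it is not stated in the text.

```python
from typing import Optional


def fragment_length(start: int, end: int, genome_length: Optional[int] = None) -> int:
    """Length of a fragment given 1-based, inclusive coordinates.

    If end < start, the fragment is assumed to wrap past the origin of a
    circular genome, and genome_length is then required.
    """
    if end >= start:
        return end - start + 1
    if genome_length is None:
        raise ValueError("genome_length is required when the fragment wraps past the origin")
    # Bases from start to the last position, plus bases from position 1 to end.
    return (genome_length - start + 1) + end


# COI fragment on the harbour porpoise genome (no wrap-around).
print(fragment_length(5352, 6087))  # 736

# MCR fragment on the grey seal genome; 16797 is an ASSUMED genome length.
ASSUMED_GREY_SEAL_GENOME_LENGTH = 16797
print(fragment_length(16160, 55, ASSUMED_GREY_SEAL_GENOME_LENGTH))  # 693
```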
Electropherograms were analyzed and edited manually using the Sequence scanner software (Applied Biosystems), and alignments were produced using CLUSTAL W (Thompson et al. 1994) with default settings in Bioedit (Hall 1999). All sequences were analyzed using the Barcode of Life Data Systems (BOLD) interface (accessible at http://www.boldsystems.org), and were also compared to GenBank data using BLAST (Benson et al. 2010).
DNA sequences and specimen information have been added to two BOLD projects. The first project includes specimens for which the species had been identified without doubt using classical morphological identification, and is referred to as IMMB (Identified Marine Mammals in Brittany). The IMMB project is a part of the campaign "barcoding mammals of the world". The second project, UMMB (Unidentified Marine Mammals in Brittany), includes specimens only identified to the genus or to higher taxonomic levels. This second project is a part of the campaign "barcoding application".
Genetic distances (intraspecific, interspecific and minimal distance to the nearest neighbour) were calculated using the Kimura 2-parameter (K2P) model (Kimura 1980) and the MUSCLE alignment algorithm on the BOLD user interface or using the software MEGA5 (Tamura et al. 2011). Neighbour-Joining trees based on the K2P model were built using the BOLD user interface. DnaSP v5.10 was used to calculate haplotype and nucleotide diversities (Librado and Rozas 2009). We used non-metric multidimensional scaling (nMDS) to represent MCR distances graphically and to discriminate closely related species within the Stenella-Tursiops-Delphinus complex (LeDuc et al. 1999, McGowen 2011, Perrin et al. 2013). Distance matrices were computed with the K2P model using DNAdist (Felsenstein 1989) and were then analyzed by nMDS using Statistica (Statsoft 2005). Posterior probabilities were calculated by linear discriminant analysis (LDA) on the coordinates given by the nMDS. Phylogenetic relationships among COI sequences of harbour porpoise were depicted using a median-joining network of haplotypes built with Network v4.6 (www.fluxus-engineering.com).
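For readers unfamiliar with the K2P model, the distance between two aligned sequences can be sketched in a few lines of Python. This is a minimal illustration of the standard Kimura (1980) formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences; it is not the implementation used by BOLD, MEGA5 or DNAdist. Skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption of this sketch, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError if the sequences are too
    # divergent for the model (non-positive argument).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because transitions and transversions are weighted differently, the K2P distance is slightly larger than the raw proportion of differing sites, and the correction grows with divergence.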

Results
From 2003 to 2012, 1530 marine mammal strandings were recorded along the coastline of Brittany (Table 1). Fourteen species of cetaceans and five species of pinnipeds were identified. The most frequent cetaceans were six species indigenous to Brittany waters, viz. five members of the Delphinidae (Delphinus delphis, Tursiops truncatus, Stenella coeruleoalba, Globicephala melas, Grampus griseus), and the harbour porpoise (Phocoena phocoena). Two members of the Ziphiidae (Hyperoodon ampullatus and Ziphius cavirostris), three other species of Delphinidae (Lagenorhynchus acutus, Orcinus orca and Stenella frontalis), one species of Physeteridae (Physeter macrocephalus) and two mysticete species (Balaenoptera acutorostrata and Balaenoptera physalus) were rare stranding events. Halichoerus grypus was by far the most commonly encountered pinniped, well ahead of Phoca vitulina and some uncommon arctic seals (Phoca hispida, Cystophora cristata and Phoca groenlandica). Between 9 and 12 different marine mammal species stranded each year (Figure 1).
Members of the stranding network are trained to identify the stranded animals. Nevertheless, 258 animals (16.8% of the strandings) were not characterized to the species level, generally because of an advanced state of decomposition of the animal body, sometimes in conjunction with bad field-work conditions.

COI sequencing and analysis from different marine mammal samples
DNA was extracted from 92 stranded animals, i.e. from dead cetaceans and pinnipeds, but also from 40 grey seals stranded alive, which were treated in the care center of Océanopolis (Brest, France) and from which a small blood sample was taken and kept at -20 °C. All the samples came from animals stranded on the coasts of Brittany, except for one grey seal (Hgc406), which stranded alive in Spain in 2009 and was transported to the care center (Figure 2). Our sampling included 12 species of cetaceans and three species of pinnipeds (Table 2). Two species were very common, the harbour porpoise (29 samples) and the grey seal (44 samples), thus allowing intraspecific distance analyses. A COI amplicon was recovered from 89 samples, and good-quality sequences of more than 500 bp were obtained for all of them (GenBank accession numbers KF281608-KF281697). The sequence alignment used in the analyses was 507 bp long. About 32% of the positions were polymorphic in the cetaceans and 13.1% in the pinnipeds (Table 3). The maximal intraspecific distance was 0.46% for the grey seal and 0.83% for the harbour porpoise. The COI sequences of three species of the Delphininae (Stenella frontalis, Stenella coeruleoalba and Delphinus delphis) showed very low interspecific distances (0.84% between D. delphis and the nearest species S. frontalis, and 1.18% between the two Stenella species). All other interspecific distances were above 3.9% for pinnipeds and above 6% for cetaceans. The Neighbour-Joining (NJ) tree built on the BOLD interface using K2P distances (Figure 3) confirms that, except for the Delphininae, all the cetacean and pinniped species analyzed are distinguished unambiguously.
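The comparison made above, between the maximal intraspecific distance and the distance to the nearest neighbouring species, can be sketched as follows. This is an illustrative Python outline, not the BOLD procedure: the function names and the toy sequences are hypothetical, and a simple proportion of differing sites (p-distance) stands in for the K2P distance to keep the example self-contained.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of differing sites (a stand-in for the K2P distance)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def barcode_gap(
    records: List[Tuple[str, str]],
    distance: Callable[[str, str], float] = p_distance,
) -> Dict[str, Tuple[float, float]]:
    """For each species, return (max intraspecific distance,
    min distance to a sequence of any other species)."""
    max_intra = {species: 0.0 for species, _ in records}
    min_inter = {species: float("inf") for species, _ in records}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra[sp1], d)
        else:
            min_inter[sp1] = min(min_inter[sp1], d)
            min_inter[sp2] = min(min_inter[sp2], d)
    return {sp: (max_intra[sp], min_inter[sp]) for sp in max_intra}


# Hypothetical toy alignment, for illustration only.
toy = [
    ("species_A", "AAAAAAAAAA"),
    ("species_A", "AAAAAAAAAC"),
    ("species_B", "CCCCCAAAAA"),
]
for species, (intra, inter) in barcode_gap(toy).items():
    # A species is cleanly separated when its nearest-neighbour distance
    # exceeds its maximal intraspecific distance.
    print(species, intra, inter, inter > intra)
```

In the data reported above, the Delphininae are the case where this check is least clear-cut: the 0.84% distance between D. delphis and S. frontalis is of the same order as the 0.83% maximal intraspecific distance observed in the harbour porpoise.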

Taxonomic identification of undetermined samples
We then determined COI sequences from 10 cetacean samples whose species could not be determined accurately using morphological characters (Figure 4), either because only parts of the animal were recovered (Figure 4A) or because of the highly degraded state of the carcasses (Figure 4C). Good-quality COI sequences were obtained from all these samples, and three of them were identified unambiguously using the BOLD identification engine: Ms250511 was identified as a Balaenoptera physalus, Ds160111 as a Grampus griseus and Ds290811 as a Phocoena phocoena. The other seven samples were Delphininae, as confirmed by COI sequences. Yet neither the BOLD identification engine nor a BLAST search on GenBank allowed a more precise determination. We therefore sequenced MCR, which is more variable than COI, from six unidentified samples. BLAST searches on GenBank confirmed the COI results: all these samples were Delphininae, but a more precise identification could not be achieved. We constructed an nMDS plot of the distances between MCR sequences of S. coeruleoalba, S. frontalis and D. delphis taken from GenBank: for S. coeruleoalba, we used sequences AM498725, AM498723, AM498721, AM498719, AM498717, AM498715, AM498713, AM498711, AM498709, AM498707 (Mace et al. unpublished), for D. delphis FM211560, FM211553, FM211545, FM211535, FM211527, FM211519, FM211511, FM211503, FM211495 (Mirimin et al. 2009) and DQ520121. The three species were clearly discriminated by the nMDS (Figure 5). The posterior probabilities are given in Appendix 2. This analysis suggests that five of our unidentified samples could belong to D. delphis, and one to S. coeruleoalba.

Intraspecific variation of COI and MCR in harbour porpoise and grey seal
For the intraspecific analysis of the harbour porpoise, we included 17 additional samples of animals stranded or by-caught in the Bay of Biscay (Appendix 1, Alfonsi et al. 2012). In total, we compared 35 sequences of grey seals and 45 of harbour porpoises. As expected, MCR sequences were more polymorphic than COI: in harbour porpoise, 3.8% of the MCR positions were polymorphic vs. 1.30% in COI, while 4.73% of the MCR positions in the grey seal were polymorphic vs. 0.75% in COI (Table 4). Hence, MCR was about 3× more polymorphic than COI in harbour porpoises and about 6× in grey seals. Haplotype and nucleotide diversities were also higher for MCR than for COI. The haplotype network of the COI sequences in harbour porpoises clearly differentiated two haplogroups (Figure 6), which correspond perfectly to those described for MCR in Alfonsi et al. (2012).
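The three summary statistics compared above (proportion of polymorphic positions, haplotype diversity and nucleotide diversity) can be sketched in Python as follows. This is a minimal illustration using the standard definitions, h = n/(n-1) (1 - Σ p_i²) for haplotype diversity and the mean number of pairwise differences per site for nucleotide diversity; it does not reproduce DnaSP, and it assumes gap-free aligned sequences. The function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List


def _check(seqs: List[str]) -> None:
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")


def polymorphic_site_fraction(seqs: List[str]) -> float:
    """Fraction of alignment columns showing more than one state."""
    _check(seqs)
    variable = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    return variable / len(seqs[0])


def haplotype_diversity(seqs: List[str]) -> float:
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    _check(seqs)
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs: List[str]) -> float:
    """Mean number of pairwise differences per site."""
    _check(seqs)
    n = len(seqs)
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total / n_pairs / len(seqs[0])


# Hypothetical toy alignment, for illustration only.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
print(polymorphic_site_fraction(toy))
print(round(haplotype_diversity(toy), 4))
print(round(nucleotide_diversity(toy), 4))
```

Applying the same functions to COI and MCR alignments of the same individuals would give the kind of side-by-side comparison summarized in Table 4.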

Discussion
Stranding networks collect opportunistic data that are ecologically significant (Borsa 2006, Jung et al. 2009, Peltier et al. 2013), although, among other parameters, data quality control may deserve special attention (Evans and Hammond 2004). Stranding networks can also collect skin and muscle samples that can be used for genetic analysis, therefore contributing to the construction of biological sample banks, which are of high value when working with marine mammals.
The aim of this study was to evaluate the feasibility of a routine use of DNA barcoding in a stranding network, and to determine what gains this use could bring in terms of data relevance. The Brittany stranding network is a part of the French stranding network, and has to analyze an average of around 150 marine mammal strandings per year, with a high species biodiversity (19 species during 2003-2012).

Figure 6. Haplotype network established from the COI sequences of 45 harbour porpoises stranded along the Atlantic coast of France (Appendix 1). Numbers on a line connecting two haplotypes correspond to the sequence position of the mutation differentiating these haplotypes. Two mitochondrial haplogroups appear (black circles and grey circles), which group the same individuals as the haplogroups alpha and beta determined using MCR polymorphisms and described in Alfonsi et al. (2012).

Can COI be used as an appropriate species identification tool for marine mammals in the frame of a stranding network?
We obtained DNA sequences of good quality for almost all the samples studied, whatever their origin, their collectors, or even their state of degradation. This is consistent with the numerous molecular genetic studies that have used samples taken from stranded cetaceans or pinnipeds (e.g. Gaspari et al. 2006, Amaral et al. 2007, Fontaine et al. 2007, Mirimin et al. 2009, 2011, Alfonsi et al. 2012). Viricel and Rosel (2012) previously demonstrated that COI sequences allow cetacean species to be identified, except for a few closely related Delphinidae species (see also Amaral et al. 2007). As expected, our NJ tree matched the overall classification, and the distance-based analysis correctly identified the sequences to the species level for all cetaceans except within the Delphininae. The three species of pinnipeds analyzed were also unambiguously distinguished on the basis of their COI sequences.
Our study therefore confirms the quality of the functioning and organization of the stranding network as a whole, from the field work carried out by the correspondents to the preservation of the samples. All the samples analyzed by DNA barcoding led to correct identification of the expected species, with no exceptions.
We obtained good-quality COI sequences for 10 unidentified animals, some of which were in a highly degraded state. This showed that DNA barcoding can help to identify such specimens, which represent more than 16% of the stranded animals in the period 2003-2012. Hence, a routine use of DNA barcoding would noticeably decrease the proportion of unidentified animals.

The case of the Delphininae
Within the Delphininae, species are difficult to discriminate (Amaral et al. 2007, Viricel and Rosel 2012). In particular, Delphinus delphis, Stenella coeruleoalba and Stenella frontalis show very low interspecific COI distances, which do not allow the species to be distinguished accurately. Other mitochondrial loci, such as MCR and cyt b, are not very effective in this matter either (Amaral et al. 2007, Viricel and Rosel 2012). This is attributed to recent and rapid radiation events in the subfamily, and it leads to problematic results in molecular taxonomic studies (Kingston et al. 2009, Amaral et al. 2012, Viricel and Rosel 2012, Perrin et al. 2013). In our case, these three species produced COI and MCR sequences that did not allow samples to be associated with species names, whether with the identification engine on BOLD, with a distance tree or with a BLAST search on GenBank. nMDS of genetic distances is known to uncover sample clustering (e.g. Geffen et al. 2004, Maltagliati et al. 2006, Alfonsi et al. 2012, Weckworth et al. 2012). Accordingly, nMDS clustering of MCR sequence distances of Delphinus delphis, Stenella coeruleoalba and Stenella frontalis, chosen randomly on GenBank among Atlantic samples, showed that individuals of the three species formed separate groups. Moreover, each individual had a high posterior probability of belonging to the right group, except for one sample (i.e. 97.0% of the assignments were successful), so that all our unidentified samples could be putatively identified to the species level, based on the nMDS plot and its posterior probabilities.

Can DNA barcoding increase the accuracy of the data listed by the stranding network?
DNA barcoding is informative for animals that belong to species that infrequently strand along the coasts of Brittany, which can be species living far offshore or in deep water, but also exotic species. Such species can be more difficult for the field correspondents to identify simply because of their scarcity. Along the coast of Brittany, we observed a Stenella frontalis, an inhabitant of the temperate to tropical Atlantic Ocean, and three species of arctic seals (Phoca hispida, Cystophora cristata and Phoca groenlandica). It is likely that other members of such rare species are listed among the "undetermined" species, just because their morphological characteristics are less well known by field correspondents. Additionally, a species that rarely strands along the French coast may be mistakenly identified as its more common sister species. This issue can be illustrated by the case of the two pilot whale species: Globicephala melas, the long-finned pilot whale, commonly strands along the French Atlantic coast, while only a few stranding events of G. macrorhynchus, the short-finned pilot whale, have been reported (the Bay of Biscay is the northern limit of the geographical range of G. macrorhynchus). The two species have overlapping morphological characters, which adds to the difficulty of detecting rare stranding events of G. macrorhynchus based on morphological data only (Viricel and Sabatier unpublished data). A systematic use of DNA barcoding when morphological taxonomic characteristics are not straightforward would clearly lower the percentage of exotic animals not listed. The existence of natural interspecific hybrids between the two Globicephala sister species (Miralles et al. 2013), as between other cetacean species (e.g. Bérubé and Aguilar 1998, Willis et al. 2004), further reinforces the interest of such monitoring based on molecular data.
It is important to note that a main limitation of DNA barcoding is the use of a single locus, which leads to some problematic species identifications, such as within the Delphininae, and also to an inability to detect hybrids without complementary genetic studies. This limitation may well be removed in the near future thanks to next-generation sequencing, which allows the accumulation of large amounts of DNA sequence data in a cost-effective manner. Multi-locus barcoding, including mitochondrial and nuclear polymorphic loci, will certainly represent a next step for the barcoding community.
A routine use of DNA barcoding could also allow marine mammal biodiversity to be monitored at the intraspecific level. For instance, global climate change has effects on genetic diversity that must be studied and quantified (Pauls et al. 2012), in particular in the marine realm. Knowledge of the existence of distinct genetic groups or populations, of the history of their formation and of their movements is of prime importance for the ecological understanding of natural populations, and also for the conservation efforts dedicated to them. Around the coast of Brittany, different species of marine mammals have shown variations in abundance in the last decades (Vincent et al. 2005, Jung et al. 2009). Using samples from the French Stranding Network and MCR polymorphisms, we have recently shown that two previously separated, genetically distinct populations of harbour porpoises are now admixing along the Atlantic coast of France (Alfonsi et al. 2012). These results were unexpected according to previous work (Rosel 2006, Fontaine et al. 2007). In this study, we show that this genetic clustering would also have been detected using COI polymorphisms, thus reinforcing the interest of a routine use of DNA barcoding in conjunction with the stranding network.

Contributions of our study to the Barcoding of Life Database
This project is part of the collaboration between the Laboratory BioGeMME of the "Université de Bretagne Occidentale" (Brest, France), Océanopolis, a public-private company (http://www.oceanopolis.com), the "Parc naturel marin d'Iroise" (http://www.parc-marin-iroise.com) and the French Stranding Network, coordinated by Pelagis, Université de La Rochelle, France. All the specimens and sequence data described in this manuscript are deposited in BOLD under the institution called "Oceanopolis-BioGeMME" in two projects, UMMB and IMMB. Our mixed institution became the first contributor to BOLD for the Cetacea, as well as for the Phocidae. These two BOLD projects will be made publicly available, and all the sequences will be published on GenBank.