Evaluating the diversity of Neotropical anurans using DNA barcodes

Abstract This study tested the effectiveness of COI barcodes for the discrimination of anuran species from the Amazon basin and other Neotropical regions. Barcodes were determined for a total of 59 species, with a further 58 species being included from GenBank. In most cases, distinguishing species using the barcodes was straightforward. Each species had one or more distinct COI barcodes, with intraspecific distances ranging from 0% to 9.9%. However, relatively high intraspecific divergence (11.4–19.4%) was observed in some species, such as Ranitomeya ventrimaculata, Craugastor fitzingeri, Hypsiboas leptolineatus, Scinax fuscomarginatus and Leptodactylus knudseni, which may reflect errors of identification or the presence of a species complex. Intraspecific distances recorded in species for which samples were obtained from GenBank (Engystomops pustulosus, Atelopus varius, Craugastor podiciferus, and Dendropsophus labialis) were greater than those between many pairs of species. Interspecific distances ranged from 11% to 39%. Overall, the clear differences observed between most intra- and interspecific distances indicate that the COI barcode is an effective tool for the identification of Neotropical species in most of the cases analyzed in the present study.


Introduction
Many amphibian groups are morphologically homogeneous and tend to lack clear diagnostic traits. As a result, despite a number of recent advances, the taxonomy of amphibians remains poorly resolved in general (see e.g. Darst and Cannatella 2004; Faivovich et al. 2005; Frost et al. 2010; Grant et al. 2006; Roelants et al. 2007; Vences et al. 2003). In particular, the intrageneric diversity of amphibians appears to be underestimated in most cases (e.g., Bossuyt et al. 2004; Crawford et al. 2010; De la Riva et al. 2000; Fouquet et al. 2007; Vieites et al. 2009). In this context, the accelerating global decline and changes in amphibian populations (Hoffmann et al. 2010; McCallum 2007; Stuart et al. 2004; Narins et al. 2014), together with the cryptic diversity reported for several taxa (Fouquet et al. 2007; Crawford et al. 2013), imply that many still undescribed species may be disappearing from the Neotropical region before they have even been identified (Collins 2010).
The increasing availability of molecular data has reinforced the conclusion that morphological evolution in amphibians is often cryptic, resulting in a revitalization of amphibian taxonomy (e.g. Real et al. 2005; Vieites et al. 2009; Rowley et al. 2010; Stuart et al. 2006; Funk et al. 2012; Xia et al. 2012; Crawford et al. 2013). Rapidly evolving genes may overwrite the evidence of ancient affinities, but are extremely useful for understanding recent divergence among closely-related species. Mitochondrial DNA (mtDNA) has been widely used in phylogenetic studies of animals because it evolves much more rapidly than nuclear DNA, resulting in the accumulation of differences between closely-related species (Brown et al. 1979; Moore 1995; Mindell et al. 1997). Taxonomic reviews at the species level now almost always include some form of analysis of mtDNA divergence. For example, a number of species of the genus Rana have been recognized in recent years based on molecular methods (Newman et al. 2012) and through comparisons with other amphibian species (Channing et al. 2013; Hasan et al. 2014; Biju et al. 2014).
Short DNA sequences from a standardized region of the genome can provide a DNA "barcode" for the identification of species (Hebert et al. 2003), and may provide a substitute for more traditional molecular approaches, which have been used for the identification of amphibian taxa for some time (Larson and Chippindale 1993). A 648-bp region of the mitochondrial Cytochrome Oxidase I (COI) gene is commonly used as a barcode for the identification of animal species, given that it is easily sequenced and provides excellent resolution for the identification of taxa, especially when combined with the analysis of other traits (Pereyra et al. 2016). This is supported by the considerable sequence divergence found by Hebert et al. (2003) between 13,000 pairs of closely-related animal species. A single, short DNA sequence may nevertheless produce inconclusive results, which reinforces the need to analyze more than one sequence (Blotto et al. 2012; Pereyra et al. 2016). The usefulness of COI as a DNA barcode has been evaluated in Malagasy mantellids and North American plethodontid salamanders (Vences et al. 2005a), Holarctic amphibians (Smith et al. 2008), and Asiatic salamanders of the family Hynobiidae (Xia et al. 2012). In the Neotropical zone, COI has been tested in amphibians from Panama and the Guianan Shield (Crawford et al. 2010, 2013; Hawkins et al. 2007). Variations in the performance of COI as a DNA barcode have raised doubts about the effectiveness of the approach for the identification of species (Vences et al. 2005b). The main limitation on the use of COI in amphibians is the lack of a universal primer for the PCR-mediated amplification of the DNA of different species (Vences et al. 2012). In many cases, the overlap found between intraspecific and interspecific distances reduces the reliability of species identification (Vences et al. 2005a; Hawkins et al. 2007). Given this, Vences et al. (2005b) recommended the use of 16S rRNA as a DNA barcode, rather than COI.
Using a combination of primers, Smith et al. (2008) successfully identified 94% of Holarctic amphibians from COI sequences, and showed that the overlap between intra- and interspecific distances was the result of hybridization, the presence of species complexes or taxonomic problems. In many cases, there was no overlap in these distances. Overall, then, the COI barcode presented the same problems encountered in the analysis of any other group of animals (Smith et al. 2008; Crawford et al. 2010; Hawkins et al. 2007; Vences et al. 2012).
In this context, the present study evaluated the potential of the mitochondrial COI gene as a barcode, used in combination with other traits, for the identification of Neotropical amphibians from the Amazon basin and other regions of South America. In particular, the study compares the molecular classification of the specimens with the traditional taxonomy of the group.

Study area and samples
In order to establish a reference site for the evaluation of a barcoding approach for Amazonian vertebrates, a field survey was conducted in the BX044 polygon in the southwestern Amazon basin, an area considered to be of the highest importance for the conservation of the biome's biological diversity (Pronabio 2002). The polygon covers an area of 5270 km² and is located between latitudes 08°02'52" and 08°54'46" S, and longitudes 60°50'24" and 62°10'13" W, within the Madeira-Tapajós interfluve (Fig. 1). This interfluve is poorly studied and has few protected areas, with no more than six percent of its total area located within conservation units of any kind (Ferreira et al. 2001). Nevertheless, it encompasses a unique complex of habitats including open forests, savanna, forest-savanna transition, and gallery forests (Pereira et al. 2004). This mosaic of habitats reflects the position of the study area within the ecotone marking the transition between the Amazonian Hylea and the Cerrado savannas of central Brazil (Nascimento et al. 1988; Stotz et al. 1997). Specimens were collected in January 2004 at 74 sites located along the Maderinha, Roosevelt, and Jatuarana rivers, and their tributaries. The habitats sampled were open and dense savanna, gallery and flooded forests, rainforest, and rice fields. The specimens were euthanized with a lethal dose of lidocaine (Brasil 1979). A total of 76 specimens representing 33 species was collected, and 37 sequences were obtained from 17 species, which represent one third of the total number of species analyzed in the present study. The sample was augmented by tissue samples (41 specimens representing 37 species) obtained from other institutions in Brazil and other countries. In addition to these samples, the COI sequences of a number of other amphibian species (see Suppl. material 1) with large sample sizes were obtained from GenBank, to provide a better visualization of the variation in the COI gene in these organisms.

Specimen identification
Following the extraction of tissue samples, the specimens collected during the present study were preserved for identification at the Goeldi Museum in Belém, Brazil, where the identifications were confirmed by M.S.H. Hoogmoed. The accuracy of COI as a barcode for the identification of species was assessed based on the most recent classification of the amphibians (Frost 2016).

Molecular methods
Total DNA was extracted from either muscle or liver tissue by the SDS-proteinase K/phenol-chloroform extraction method (Sambrook and Russell 2001). A partial 680-bp fragment of the COI gene was amplified using the 5´-CCTGCAGGAGGAGGAGAYCC-3´ and 5´-AGTATAAGCGTCTGGGTAGTC-3´ primers (Palumbi 1996). The 25 µL polymerase chain reaction (PCR) mixture contained 0.4-1.2 µL of the DNA template, 2.5 µL of 10X PCR buffer, 0.5 µL of each primer (10 pM/µL), 0.6-2.0 µL of MgCl2, 1 µL of dNTPs, and 0.15 µL of Taq DNA polymerase. The PCR conditions consisted of 3 min at 94 °C, followed by 35 (or 34) cycles of 50 sec at 94 °C, 50 sec at 55 °C (or 57 and 60 °C), and 50 sec at 72 °C, with a final extension at 72 °C for 5 min. The DNA was sequenced in both directions using the primers described above in a MegaBace (GE Healthcare) automatic DNA sequencer, using the DYEnamic ET Dye Terminator kit (GE Healthcare).
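As a minimal sketch of the arithmetic behind the reaction mixture, the volumes listed above can be summed and the remainder made up to 25 µL. The protocol does not name the diluent, so treating the remainder as water is an assumption, and the template and MgCl2 volumes below are example values picked from the stated ranges.

```python
# Reagent volumes (µL) for one 25 µL reaction, taken from the protocol above.
# The template and MgCl2 values are example picks from the stated ranges.
TOTAL_VOLUME_UL = 25.0
reagents_ul = {
    "DNA template": 1.0,      # stated range: 0.4-1.2
    "10X PCR buffer": 2.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "MgCl2": 1.5,             # stated range: 0.6-2.0
    "dNTPs": 1.0,
    "Taq polymerase": 0.15,
}


def water_volume(reagents, total):
    """Volume needed to bring the mix up to the total (water is an assumption)."""
    used = sum(reagents.values())
    if used > total:
        raise ValueError("reagent volumes exceed the reaction volume")
    return round(total - used, 2)


print(water_volume(reagents_ul, TOTAL_VOLUME_UL))
```

With these example values the reagents sum to 7.15 µL, leaving 17.85 µL to be made up.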
The sequences obtained were aligned and edited in BIOEDIT v. 7.0.5.3 (Hall 1999). Possible base saturation was assessed by plotting transitions and transversions (Ti-Tv) against the Kimura 2-parameter (K2P) distance (Kimura 1980). This analysis was run in DAMBE v. 5.3.105 (Xia 2013).
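For reference, the Kimura (1980) two-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The following is a minimal illustrative sketch of that formula, not the implementation used by DAMBE or MEGA; the function name and the handling of gaps and ambiguity codes (pairwise deletion) are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguity codes (pairwise deletion, an assumption here).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

The saturation plot described above then amounts to plotting the observed transition and transversion counts of each sequence pair against this distance.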
Pairwise comparisons of COI sequences were conducted for three categories: (i) individuals of the same species, (ii) individuals of the same genus (excluding those of the same species), and (iii) individuals of the same family (excluding those of the same genus). The frequency distribution of intra- and interspecific genetic distances was calculated using MEGA 5 (Tamura et al. 2011), as was a neighbor-joining (NJ) tree based on the K2P model (Kimura 1980). The robustness of the nodes of this tree was estimated by a bootstrap analysis, with 1000 pseudo-replications.
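The three comparison categories can be sketched as a simple partition of all specimen pairs. The record layout, identifiers and function names below are hypothetical and serve only to illustrate the nesting of the categories; the analysis itself was run in MEGA 5.

```python
from itertools import combinations


def comparison_category(rec_a, rec_b):
    """Return the lowest taxonomic level shared by two specimens, or None."""
    if rec_a["species"] == rec_b["species"]:
        return "within_species"
    if rec_a["genus"] == rec_b["genus"]:
        return "within_genus"    # same genus, different species
    if rec_a["family"] == rec_b["family"]:
        return "within_family"   # same family, different genus
    return None                  # pairs from different families are not compared


def group_pairs(records):
    """Assign every pair of specimens to one of the three categories."""
    groups = {"within_species": [], "within_genus": [], "within_family": []}
    for a, b in combinations(records, 2):
        category = comparison_category(a, b)
        if category is not None:
            groups[category].append((a["id"], b["id"]))
    return groups


# Hypothetical specimen records (identifiers are invented for the example).
records = [
    {"id": "s1", "family": "Hylidae", "genus": "Dendropsophus", "species": "Dendropsophus minutus"},
    {"id": "s2", "family": "Hylidae", "genus": "Dendropsophus", "species": "Dendropsophus minutus"},
    {"id": "s3", "family": "Hylidae", "genus": "Dendropsophus", "species": "Dendropsophus labialis"},
    {"id": "s4", "family": "Hylidae", "genus": "Scinax", "species": "Scinax fuscomarginatus"},
    {"id": "s5", "family": "Bufonidae", "genus": "Rhinella", "species": "Rhinella marina"},
]
groups = group_pairs(records)
```

Storing the full binomial in the species field avoids treating identical epithets from different genera as the same species.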
The variability of the COI gene between populations of the same species was also tested using the K2P model, for which the species were selected based on the largest possible sample size (number of specimens) in GenBank (A. varius, C. podiciferus, D. labialis and E. pustulosus) and the availability of accurate information on their geographic origin. Additional species were included in this analysis (see Suppl. material 1).

Results
COI sequences were recovered from 75% (83/110) of the specimens analyzed. Full-length PCR products (640 bp) were amplified from all of these specimens (see Suppl. material 1). Of the 111 species analyzed, sequences of 56 were obtained during the present study and 58 from GenBank (sequences of Dendropsophus minutus, Rhinella marina, and Osteocephalus taurinus were obtained from both sources). Altogether, 410 sequences were analyzed, of which 78 were obtained in the present study and 332 from GenBank. No evidence of base saturation was found (Fig. 2).

Species identification
The COI barcode correctly identified the species of 94% of the specimens examined (93 of 109 species). The COI sequences obtained for the 36 species represented by two or more specimens were more similar to one another than to those of any other species. In addition, with a few notable exceptions, which are discussed below, the differences in COI sequences between closely-related species were greater than those within species. The mean K2P distance within species was 3.0% (Fig. 3), whereas that between species was 10.3%.
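The criterion that conspecific sequences are closer to one another than to those of any other species can be sketched as a nearest-neighbor check on a pairwise distance matrix. This is an illustration under stated assumptions, not the procedure used to obtain the figures above: the function name, the species labels and the toy distances are all hypothetical.

```python
def nearest_neighbor_matches(labels, dist):
    """For each specimen, report whether its closest other specimen is conspecific."""
    results = []
    for i, species in enumerate(labels):
        others = [j for j in range(len(labels)) if j != i]
        nearest = min(others, key=lambda j: dist[i][j])
        results.append(labels[nearest] == species)
    return results


# Toy symmetric K2P distance matrix for four specimens of two species
# (values are invented for the example).
labels = ["sp_A", "sp_A", "sp_B", "sp_B"]
dist = [
    [0.00, 0.02, 0.15, 0.16],
    [0.02, 0.00, 0.14, 0.15],
    [0.15, 0.14, 0.00, 0.03],
    [0.16, 0.15, 0.03, 0.00],
]
matches = nearest_neighbor_matches(labels, dist)
```

A species represented by a single specimen can never pass this check, which is why such a test only makes sense for species with two or more specimens.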
In most cases, the neighbor-joining (NJ) tree reflected a relatively reduced differentiation within species in comparison with between-species divergence (Fig. 4). Most of the terminal groups include specimens of the same species or genus with bootstrap values of over 85, except for Ranitomeya, Scinax, Leptodactylus, Osteopilus, and Hypsiboas, which all rendered relatively low bootstrap values. Also, in the Cophomantinae subfamily, the COI barcode generated contradictory clusters, such as Bokermannohyla alvarengai being sister group of Hypsiboas albopunctatus, Dendropsophus minutus and Hypsiboas multifasciatus, and Aplastodiscus callipygius and Dendropsophus cachimbo, and Aplastodiscus albosignatus.
Interspecific divergence varied considerably (Table 1). The distances between most species (5826 comparisons) were within the 9.9-39% range, whereas a few species (60 comparisons) were in the 0-9.9% range. The distances between populations of Atelopus varius, Craugastor podiciferus, Dendropsophus labialis, and Engystomops pustulosus exceeded the intraspecific distances observed in other species.

Discussion
A single mitochondrial DNA barcode, derived from the COI gene, correctly identified 93 of the 109 Neotropical amphibian species analyzed in the present study. Similar barcodes (sequences) were not observed in different species, and lower distances (generally 0.0-9.9%) were observed within species than between them. The ranges of values recorded in the present study were consistent with those recorded in previous amphibian studies (Table 2). However, relatively high intraspecific variation was recorded between populations in some species, such as E. pustulosus (0.0-11.4%), C. podiciferus (4.1-11.4%), and D. labialis (0.2-9.0%). This indicates the possible presence of additional cryptic species, and supports the development of a standard screening threshold of sequence differentiation that would contribute to the more systematic and effective identification of new animal species.
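A screening threshold of this kind could be applied as in the following sketch, which flags species whose maximum intraspecific distance exceeds a cut-off. The ranges are those quoted above; the cut-off of 9.9% is a hypothetical value chosen for illustration (the upper end of the general within-species range), not a threshold proposed by the present study.

```python
# (minimum %, maximum %) intraspecific K2P distances quoted in the text above.
intraspecific_ranges = {
    "E. pustulosus": (0.0, 11.4),
    "C. podiciferus": (4.1, 11.4),
    "D. labialis": (0.2, 9.0),
}

# Hypothetical cut-off, used here for illustration only.
SCREENING_THRESHOLD = 9.9


def flag_for_review(ranges, threshold):
    """Return the species whose maximum within-species distance exceeds the threshold."""
    return sorted(name for name, (_, high) in ranges.items() if high > threshold)


flagged = flag_for_review(intraspecific_ranges, SCREENING_THRESHOLD)
print(flagged)
```

Under this cut-off, E. pustulosus and C. podiciferus would be flagged as candidates for further taxonomic scrutiny, whereas D. labialis would not.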

Dendropsophus minutus
The high COI divergence rates recorded in the present study were nevertheless similar to those recorded in pulmonate snails (Thomaz et al. 1996) and lizards (Harris et al. 2004). In order to evaluate the relative divergence of this gene, Vences et al. (2005a) compared substitution rates in COI with those of two other mitochondrial genes commonly used in studies of amphibians (Cytb and ND4), and concluded that molecular evolution in COI is relatively fast, resulting in considerable variability in comparison with either of the other two genes.
The neighbor-joining tree indicated that most of the species and genera analyzed in the present study form relatively cohesive units. However, the data available on Dendropsophus minutus (Hawkins et al. 2007; present study) indicate that this form may include more than one species, and a similarly complex situation was observed in the Atelopus species (Lotters et al. 2011). The greatest intraspecific distances were recorded in Ranitomeya ventrimaculata (12.9%), Leptodactylus knudseni, Hypsiboas leptolineatus (13.3%), and Scinax fuscomarginatus (10.9%). A similar degree of divergence was found in R. ventrimaculata by Symula et al. (2003) and Brown et al. (2011). Likewise, Kok and Kalamandeen (2008) have suggested that L. knudseni may represent a species complex. The status of H. leptolineatus and S. fuscomarginatus is less clear, especially given the taxonomic complexity of Scinax, which has a large number of known species, a conservative morphology, and a number of undescribed species (Nunes et al. 2012; Duellman et al. 2016).
The greatest intrageneric distances were recorded in Hypsiboas (18.2%), Craugastor (19.7%), and Osteopilus (20.2%). The considerable distances between some Craugastor species indicate the existence of a species complex, as indicated previously for Craugastor podiciferus by Streicher et al. (2009). With respect to Osteopilus septentrionalis, which is widely distributed in Cuba, a similar pattern was observed in the Cyt b gene (Heinicke et al. 2011). According to these authors, this may be related to ancient marine incursions, which would have isolated different lineages.
The general polytomy observed in the present study may have been the result of the phylogenetic divergence at the family and genus levels, and the relatively reduced number of terminal taxa. This may also be reflected in the considerable variation in the bootstrap values, from 0% to 92%, found in some clades.
The amplification of the COI gene is straightforward in most vertebrates (Clare et al. 2007; Hajibabaei et al. 2006b; Hebert et al. 2004b; Ward et al. 2005). In the present study, however, difficulties were encountered due to the use of universal primers, as reported previously by Vences et al. (2005a, 2012). In those studies, several modifications were needed to achieve successful COI amplification, such as PCR purification and cloning, optimization of the annealing temperature, and others. Thus, it may be necessary to formulate a cocktail of primers, with differentiated amplification protocols and annealing temperatures appropriate to the different amphibian species groups, genera or families (Clare et al. 2007; Vences et al. 2012). However, for other groups, such as Asian salamanders, Xia et al. (2012) concluded that the high sequencing success rate (89%) was due to the reduced variation in the priming regions.
The results of the present study support the use of COI sequences as a DNA barcode to help identify Neotropical amphibian species, in particular to detect the presence of cryptic forms. However, it will still be necessary to identify the factors determining the relatively high rates of divergence observed within the populations of some of the species analyzed in the present study. It will also be important to compile a database of sequences for different molecular markers, in order to better evaluate intra- and interspecific patterns of variability (Richardson 2012; Luquet et al. 2015; Chambers and Hebert 2016), in addition to updating the identification of specimens in the collections.