(C) 2013 Angeliki Laiou. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Laiou A, Mandolini LA, Piredda R, Bellarosa R, Simeone MC (2013) DNA barcoding as a complementary tool for conservation and valorisation of forest resources. In: Nagy ZT, Backeljau T, De Meyer M, Jordaens K (Eds) DNA barcoding: a practical tool for fundamental and applied biodiversity research. ZooKeys 365: 197–213. doi: 10.3897/zookeys.365.5670
Since prehistoric times, humans have used forests as a reservoir of food, drugs and handicraft materials. Today, the use of botanical raw material to produce pharmaceuticals, herbal remedies, teas, spirits, cosmetics, sweets, dietary supplements, special industrial compounds and crude materials constitutes an important global resource in terms of healthcare and economy. In recent years, DNA barcoding has been suggested as a useful molecular technique to complement traditional taxonomic expertise for fast species identification and biodiversity inventories. In this study, in situ application of DNA barcodes was tested on a selected group of forest tree species with the aim of contributing to the identification, conservation and trade control of these valuable plant resources.
The “core barcode” for land plants (rbcL, matK, and trnH-psbA) was tested on 68 tree specimens (24 taxa). Universality of the method, ease of data retrieval and correct species assignment using sequence character states, presence of DNA barcoding gaps and GenBank discrimination assessment were evaluated. The markers showed different prospects of reliable applicability. RbcL and trnH-psbA displayed 100% amplification and sequencing success, while matK did not amplify in some plant groups. The majority of species had a single haplotype. The trnH-psbA region showed the highest genetic variability, but in most cases the high intraspecific sequence divergence revealed the absence of a clear DNA barcoding gap. We also faced an important limitation because the taxonomic coverage of the public reference database is incomplete. Overall, species identification success was 66.7%.
This work illustrates current limitations in the applicability of DNA barcoding to taxonomic forest surveys. These difficulties call for improved technical protocols and for a greater number of sequences and taxa in public databases.
DNA barcoding, Forest Biodiversity, Medicinal and Aromatic plants, Conservation
Forests figure prominently among the world’s most important ecosystems. The importance of trees in sustaining biodiversity and habitat stability, as well as in providing a large variety of environmental services, is widely acknowledged. Nevertheless, increasing human impact, recent environmental degradation and ongoing climate change are among the main factors affecting forest communities, especially at local and regional scales within the Mediterranean basin (
Temperate and boreal forests are a traditional source not only of timber but also of many products that have been extracted from forests for millennia, including resin, tannin, fodder, litter, medicinal plants, fruits, nuts, roots, mushrooms, seeds, honey, ornamentals and exudates. Today there is an institutional rediscovery of the value of forest products and services other than timber, and the total value of Non-Wood Goods (NWGs) reported in Europe has almost tripled since 2007 (
Besides wood trade, Mediterranean woody flora includes numerous valuable species used as ornamentals or for secondary products processing and marketing (edibles, industrial and medicinal compounds). The option of stimulating the production of non-timber forest products has long been considered promising (
Molecular technology is considered a reliable alternative tool for the identification of plant species (e.g.
Based on the relative ease of amplification, sequencing and multi-alignment, and on the amount of variation displayed (sufficient to discriminate among sister species without intraspecific variation compromising their correct assignment), three plastid loci are currently used in plants: rbcL (a universal but slowly evolving coding region), matK (a relatively fast evolving coding region) and trnH-psbA (a rapidly evolving intergenic spacer) (
Tree taxa have peculiar biological, evolutionary and taxonomic features that are likely to constitute a challenge to species recognition through DNA barcodes, viz. the generally low mutation rate of the plastid DNA, their ability to hybridize, and their narrowly defined species limits (
Sixty-eight trees belonging to 24 species (ten genera, nine families) were sampled in the wild (Italy, Greece and adjacent areas) and/or in Botanic Gardens (Table 1). Plants were identified directly in the field. Herbarium specimens and lyophilized green tissues of the collected material were vouchered and preserved at the Mediterranean Forest DNA bank of the University of Tuscia (www.Medna-bank.eu).
Table 1. Sample list.
Family | Species | Relevance | No. of samples |
---|---|---|---|
Pinaceae | Cedrus atlantica | Ornamental/afforestation | 3 |
Pinaceae | Cedrus deodara | Ornamental/afforestation | 3 |
Pinaceae | Cedrus libani | Ornamental/afforestation/conservation | 3 |
Rosaceae | Crataegus monogyna | Medicinal/ornamental | 3 |
Rosaceae | Crataegus oxyacantha | Medicinal/ornamental | 2 |
Rosaceae | Crataegus azarolus | Food industry/conservation | 4 |
Rosaceae | Sorbus aria | / | 3 |
Rosaceae | Sorbus aucuparia | Ornamental/conservation | 2 |
Rosaceae | Sorbus domestica | Medicinal/food industry | 3 |
Rosaceae | Sorbus torminalis | Valuable wood industry | 3 |
Sapindaceae | Aesculus hippocastanum | Medicinal/ornamental | 3 |
Sapindaceae | Aesculus indica | / | 3 |
Oleaceae | Fraxinus ornus | Medicinal/food industry | 5 |
Oleaceae | Fraxinus angustifolia | / | 3 |
Oleaceae | Fraxinus excelsior | / | 2 |
Adoxaceae | Sambucus nigra | Medicinal | 5 |
Adoxaceae | Sambucus ebulus | / | 2 |
Adoxaceae | Sambucus racemosa | / | 1 |
Passifloraceae | Passiflora incarnata | Medicinal/ornamental | 2 |
Passifloraceae | Passiflora edulis | Food industry | 1 |
Lythraceae | Punica granatum | Medicinal/food industry/ornamental | 4 |
Rhamnaceae | Ziziphus jujuba | Medicinal/food industry | 3 |
Aquifoliaceae | Ilex aquifolium | Medicinal/ornamental/conservation | 4 |
Aquifoliaceae | Ilex latifolia | / | 1 |
DNA extractions were performed with the DNeasy Plant Minikit (QIAGEN), following the manufacturer’s instructions. The universal applicability of the technical analyses was considered a prerequisite for exploring the DNA barcoding potential in a practical floristic case study: uniform PCR procedures were thus performed for all taxa and barcoding loci. Genomic DNAs (ca. 40 ng) were amplified with RTG PCR beads (GE Healthcare) in 25 μl final volume according to the manufacturer’s protocol. Thermocycling conditions were as follows: 94 °C for 3 min, followed by 35 cycles of 94 °C for 30 s, 53 °C for 40 s and 72 °C for 40 s, with a final extension step of 10 min at 72 °C. Primers for the investigated barcoding regions are shown in Table 2. MatK1F/2R oligos were used in Cedrus (
Table 2. Primer list.
Marker region | Direction | Primer sequence | Reference |
---|---|---|---|
rbcL | Fw | ATGTCACCACAAACAGAAAC | Kress et al. (2005) |
rbcL | Rev | TCGCATGTACCTGCAGTAGC | Kress et al. (2005) |
trnH-psbA | Fw | CGCGCATGGTGGATTCACAATCC | Shaw et al. (2007) |
trnH-psbA | Rev | GTTATGCATGAACGTAATGCTC | Shaw et al. (2007) |
matK_Kim | Fw | CGTACAGTACTTTTGTGTTTACGAG | Kim (unpublished) |
matK_Kim | Rev | ACCCAGTCCATCTAAATCTTGGTTC | Kim (unpublished) |
matK1F/2R | Fw | GAACTCGTCGGATGGAGTG | |
matK1F/2R | Rev | TAAACGATCCTCTCATTCACGA | |
Sequences were aligned with MEGA5 (
The species discrimination power of the investigated loci was also assessed using the genetic distance approach, to evaluate whether the amount of variation displayed was sufficient to discriminate sister species without intraspecific variation compromising their correct assignment. This approach underlies the “barcoding gap” definition, i.e. the assumption that the amount of sequence divergence within species is smaller than that between species. Uncorrected p-distance matrices of sequence divergences within and among congeneric species were calculated with MEGA5 for each gene fragment and for the two joined markers (rbcL + trnH-psbA). All the species presenting a minimum interspecific distance value higher than their maximum intraspecific distance were considered successfully discriminated (
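The distance criterion can be illustrated with a minimal Python sketch. It is not the MEGA5 procedure used in the study: the function names, the toy sequences and the choice to skip sites carrying a gap or an ambiguous base in either sequence (pairwise deletion) are assumptions made for illustration only.

```python
from itertools import combinations

SKIPPED = set("-N?")  # assumption: gap/ambiguous sites are ignored pairwise


def p_distance(a, b):
    """Uncorrected p-distance: differing sites / compared sites."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in SKIPPED or y in SKIPPED:
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def barcoding_gap(alignment, species):
    """Return (max intraspecific, min interspecific, gap) for one species.

    alignment maps each congeneric species to its aligned sequences.
    A value is None when it cannot be determined (single accession,
    or no congeneric species in the alignment).
    """
    own = alignment[species]
    intra = [p_distance(a, b) for a, b in combinations(own, 2)]
    inter = [
        p_distance(a, b)
        for other, seqs in alignment.items()
        if other != species
        for a in own
        for b in seqs
    ]
    max_intra = max(intra) if intra else None
    min_inter = min(inter) if inter else None
    if max_intra is None or min_inter is None:
        return max_intra, min_inter, None
    return max_intra, min_inter, min_inter - max_intra


# Hypothetical toy alignment (not data from this study)
toy = {
    "sp_A": ["ACGTACGTAC", "ACGTACGTAC"],
    "sp_B": ["ACGTACGTAT", "ACGAACGTAT"],
}
gap_a = barcoding_gap(toy, "sp_A")
gap_b = barcoding_gap(toy, "sp_B")
```

In this toy case sp_A shows a positive gap and would count as discriminated, whereas sp_B has equal intra- and interspecific distances and would not.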
Finally, we simulated a barcode identification scenario using each sequence as an unknown query and GenBank (http://www.ncbi.nlm.nih.gov) as the global reference database. The NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/taxonomy) was screened to assess whether the investigated species were represented in GenBank for the markers under study. The identification ability of every single marker was evaluated using the megaBLAST algorithm (http://blast.ncbi.nlm.nih.gov) with default parameters, adjusted to retrieve 5000 sequences. A query sequence was considered successfully identified if the top Bit-score obtained in GenBank matched the name of the species (
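The decision rule applied to the BLAST output can be sketched as follows. This is only an illustration under stated assumptions: the function name, the outcome labels and the example hits are hypothetical, and the hit list is assumed to have already been parsed into (species name, Bit-score) pairs.

```python
def classify_query(true_species, hits):
    """Classify one query from its (species_name, bit_score) hits.

    Outcome labels (hypothetical):
      'correct'   - the top Bit-score belongs only to the true species
      'ambiguous' - the top Bit-score is shared with other species
      'wrong'     - the top Bit-score belongs to other species only
      'no hit'    - nothing was retrieved
    """
    if not hits:
        return "no hit"
    top_score = max(score for _, score in hits)
    top_species = {name for name, score in hits if score == top_score}
    if top_species == {true_species}:
        return "correct"
    if true_species in top_species:
        return "ambiguous"
    return "wrong"


# Hypothetical hit lists (placeholder names and scores, not GenBank results)
unique_top = classify_query("sp_A", [("sp_A", 1200.0), ("sp_B", 1180.0)])
shared_top = classify_query("sp_A", [("sp_A", 900.0), ("sp_B", 900.0)])
other_top = classify_query("sp_A", [("sp_B", 800.0), ("sp_A", 790.0)])
```

Only the first outcome corresponds to a successful identification under the criterion stated above; a shared top score is kept as a separate category because it cannot be resolved to a single species.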
Optimal amplification rates were obtained with rbcL and trnH-psbA, which produced clear, single-banded PCR products from all 68 investigated samples (136 sequences; 100% efficiency). MatK was not consistently amplified in the Pinaceae and Rosaceae (44.1% of the investigated dataset) and thus it was not included in further analyses. All rbcL electropherograms were easily read and analysed. Conversely, the very long poly-nucleotide repeats in the trnH-psbA regions of Sambucus sp. made subsequent traces barely readable. Consequently, in this genus the entire sequences were completed by joining partial bidirectional reads (
The alignment-free method implemented in BLASTClust produced for each marker the haplotypes shown in Table 3. Based on the uniqueness of sequence character states, trnH-psbA generated a total of 43 haplotypes, 35 of which could be ascribed to single species. Common haplotypes were displayed by 14 individuals of the following species pairs, thus preventing their discrimination: Fraxinus angustifolia–Fraxinus excelsior (three samples), Crataegus monogyna–Crataegus oxyacantha (four samples), Sorbus aucuparia–Sorbus domestica (two samples), Ilex aquifolium–Ilex latifolia (five samples). Consequently, trnH-psbA discriminated 79.4% of the investigated plants, corresponding to 66.7% of the species in the total dataset, or 63.6% when considering only those genera in which at least one species pair was sampled.
Table 3. Haplotypes generated by BLASTClust in the investigated dataset with both markers and their combination. Shaded: species where unique haplotypes (either single or in combination) were detected.
Species | Samples | Unique haplotypes (rbcL) | Unique haplotypes (trnH-psbA) | Unique haplotypes (combined) | Inter-species shared haplotypes (rbcL) | Inter-species shared haplotypes (trnH-psbA) | Inter-species shared haplotypes (combined) |
---|---|---|---|---|---|---|---|
Cedrus atlantica | 3 | 2 | 2 | 2 | / | / | / |
Cedrus deodara | 3 | 1 | 1 | 1 | / | / | / |
Cedrus libani | 3 | 1 | 1 | 1 | / | / | / |
Crataegus monogyna | 3 | / | / | / | 1 | 1 | 1 |
Crataegus oxyacantha | 2 | / | 1 | 1 | 1 | 1 | 1 |
Crataegus azarolus | 4 | / | 2 | 2 | 1 | / | / |
Sorbus aria | 3 | 1 | 3 | 3 | / | / | / |
Sorbus aucuparia | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
Sorbus domestica | 3 | / | 1 | 1 | 1 | 1 | 1 |
Sorbus torminalis | 3 | 1 | 1 | 1 | / | / | / |
Aesculus hippocastanum | 3 | 1 | 2 | 2 | / | / | / |
Aesculus indica | 3 | 1 | 3 | 3 | / | / | / |
Fraxinus ornus | 5 | 2 | 4 | 5 | 1 | / | / |
Fraxinus angustifolia | 3 | / | 1 | 1 | 1 | 1 | 1 |
Fraxinus excelsior | 2 | / | / | / | 1 | 1 | 1 |
Sambucus nigra | 5 | 1 | 4 | 4 | 1 | / | / |
Sambucus ebulus | 2 | 1 | 2 | 2 | 1 | / | / |
Sambucus racemosa | 1 | 1 | 1 | 1 | / | / | / |
Passiflora incarnata | 2 | 2 | 2 | 2 | / | / | / |
Passiflora edulis | 1 | 1 | 1 | 1 | / | / | / |
Punica granatum | 4 | 1 | 1 | 1 | n.d. | n.d. | n.d. |
Ziziphus jujuba | 3 | 1 | 1 | 1 | n.d. | n.d. | n.d. |
Ilex aquifolium | 4 | / | / | / | 1 | 1 | 1 |
Ilex latifolia | 1 | / | / | / | 1 | 1 | 1 |
Total | 68 | 19 | 35 | 36 | 12 | 8 | 8 |
RbcL displayed much lower sequence differentiation (a total of 31 haplotypes, 12 of which were shared between species). No haplotypes were shared among species from different genera. The two-marker combination did not markedly improve the discrimination efficacy displayed by trnH-psbA alone.
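The haplotype-sharing criterion can be sketched in a few lines of Python. Note that this sketch swaps in exact string identity for the BLASTClust clustering actually used, so it is a simplification; the function name and the toy records are hypothetical. A species counts as discriminated only if none of its haplotypes also occurs in another species.

```python
from collections import defaultdict


def discriminated_species(records):
    """records: list of (species, sequence) pairs.

    Returns (set of discriminated species, fraction of all species).
    Identical sequences are treated as one haplotype, which is a
    simplification of the clustering step described in the text.
    """
    owners_by_haplotype = defaultdict(set)
    for species, sequence in records:
        owners_by_haplotype[sequence.upper()].add(species)

    all_species = {species for species, _ in records}
    confounded = set()
    for owners in owners_by_haplotype.values():
        if len(owners) > 1:  # haplotype shared between species
            confounded |= owners

    resolved = all_species - confounded
    return resolved, len(resolved) / len(all_species)


# Hypothetical toy records (not sequences from this study)
toy_records = [
    ("sp_A", "ACGT"),
    ("sp_A", "ACGA"),  # private haplotype, but sp_A also shares ACGT
    ("sp_B", "ACGT"),
    ("sp_C", "TTGA"),
]
resolved, rate = discriminated_species(toy_records)
```

Here sp_A is not counted as discriminated even though it has one private haplotype, which mirrors how species with both unique and shared haplotypes are treated in Table 3.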
In this study, the two potential DNA barcodes displayed different levels of intra- and inter-specific distances. With rbcL, all intra-specific uncorrected p-distances were zero, except in Cedrus atlantica (0.0014), Sorbus aria (0.0014), Sorbus aucuparia (0.0028), Crataegus monogyna (0.0028), and Sambucus ebulus (0.004). Zero inter-specific distances were detected between individuals belonging to Sorbus aucuparia and Sorbus domestica, among the three Crataegus species, the three Fraxinus species, between Sambucus nigra and Sambucus ebulus, and between the two Ilex species. Conversely, no intraspecific sequence variation was found at trnH-psbA in Cedrus deodara, Cedrus libani, Sorbus torminalis, Crataegus monogyna, Crataegus oxyacantha, Fraxinus angustifolia, Sambucus racemosa, Passiflora edulis, Punica granatum, Ziziphus jujuba and the two Ilex species. Inter-specific genetic differences produced by this marker exhibited values higher than zero (0.0018–0.0298) only in five species belonging to Cedrus, Aesculus and Passiflora genera, and in Fraxinus ornus and Sambucus racemosa.
The values of the maximum intra- and minimum interspecific sequence divergence of the two combined barcoding loci are shown in Table 4 (all inter-specific distances involve congeneric species). In agreement with data based on the single markers, non-overlapping intra- and interspecific distances were observed in a few species groups. As such, barcoding gaps were observed in Cedrus deodara and Cedrus libani, Sorbus torminalis, and the two Aesculus species. All remaining taxa displayed equal (e.g. in Cedrus atlantica) or higher values of intra- than interspecific divergence (e.g. in Passiflora incarnata, Fraxinus ornus, Sorbus aria). Several species showed sequences involving zero interspecific divergence (e.g. Sorbus domestica, Sorbus aucuparia, Fraxinus excelsior, Fraxinus angustifolia, Sambucus nigra, Sambucus ebulus, Crataegus spp.). The lack of additional conspecific samples did not allow a comparison with the high levels of inter-specific divergences shown by two species (Passiflora edulis and Sambucus racemosa). These results suggest that there is a barcoding gap in only five out of 19 analyzed species, corresponding to 26.3% of our dataset (taxa with only one individual/species or one species/genus excluded).
Table 4. Values of maximum intra- and minimum interspecific uncorrected p-genetic distances resulting from the combination of rbcL + trnH-psbA sequences, and relative barcoding gaps calculated in 24 forest tree taxa; n.d. = not determined; * = no sister species included in the dataset; ** = taxa with single accession. Shaded: species where a barcoding gap was detected.
Species | Samples | Max. intraspecific distance | Min. interspecific distance | Barcoding gap |
---|---|---|---|---|
Cedrus atlantica | 3 | 0.0015 | 0.0015 | 0 |
Cedrus deodara | 3 | 0 | 0.0015 | 0.0015 |
Cedrus libani | 3 | 0 | 0.0023 | 0.0023 |
Sorbus aria | 3 | 0.002898554 | 0.000950571 | -0.0019 |
Sorbus aucuparia | 2 | 0.0058 | 0 | -0.0058 |
Sorbus domestica | 3 | 0.0009 | 0 | -0.0009 |
Sorbus torminalis | 3 | 0 | 0.0009 | 0.0009 |
Crataegus azarolus | 3 | 0.0009 | 0 | -0.0009 |
Crataegus monogyna | 2 | 0.0019 | 0 | -0.0019 |
Crataegus oxyacantha | 4 | 0 | 0 | 0 |
Aesculus hippocastanum | 3 | 0 | 0.0064 | 0.0064 |
Aesculus indica | 3 | 0 | 0.0064 | 0.0064 |
Fraxinus ornus | 5 | 0.00568 | 0.00284 | -0.0028 |
Fraxinus angustifolia | 3 | 0.0036 | 0 | -0.0036 |
Fraxinus excelsior | 2 | 0 | 0 | 0 |
Sambucus nigra | 5 | 0.0017 | 0 | -0.0017 |
Sambucus ebulus | 2 | 0.0101 | 0 | -0.0101 |
Sambucus racemosa** | 1 | n.d. | 0.0142 | n.d. |
Passiflora incarnata | 2 | 0.02397 | 0.01588 | -0.0081 |
Passiflora edulis** | 1 | n.d. | 0.0158 | n.d. |
Punica granatum* | 4 | 0 | n.d. | n.d. |
Ziziphus jujuba* | 3 | 0 | n.d. | n.d. |
Ilex aquifolium | 4 | 0 | 0 | 0 |
Ilex latifolia** | 1 | n.d. | 0 | n.d. |
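As a cross-check, the barcoding gap column and the 26.3% figure can be recomputed from the distances listed in Table 4. The sketch below only re-enters the table values; the variable names are arbitrary and `None` stands for "n.d.".

```python
# (species, max intraspecific distance, min interspecific distance)
TABLE4 = [
    ("Cedrus atlantica", 0.0015, 0.0015),
    ("Cedrus deodara", 0.0, 0.0015),
    ("Cedrus libani", 0.0, 0.0023),
    ("Sorbus aria", 0.002898554, 0.000950571),
    ("Sorbus aucuparia", 0.0058, 0.0),
    ("Sorbus domestica", 0.0009, 0.0),
    ("Sorbus torminalis", 0.0, 0.0009),
    ("Crataegus azarolus", 0.0009, 0.0),
    ("Crataegus monogyna", 0.0019, 0.0),
    ("Crataegus oxyacantha", 0.0, 0.0),
    ("Aesculus hippocastanum", 0.0, 0.0064),
    ("Aesculus indica", 0.0, 0.0064),
    ("Fraxinus ornus", 0.00568, 0.00284),
    ("Fraxinus angustifolia", 0.0036, 0.0),
    ("Fraxinus excelsior", 0.0, 0.0),
    ("Sambucus nigra", 0.0017, 0.0),
    ("Sambucus ebulus", 0.0101, 0.0),
    ("Sambucus racemosa", None, 0.0142),
    ("Passiflora incarnata", 0.02397, 0.01588),
    ("Passiflora edulis", None, 0.0158),
    ("Punica granatum", 0.0, None),
    ("Ziziphus jujuba", 0.0, None),
    ("Ilex aquifolium", 0.0, 0.0),
    ("Ilex latifolia", None, 0.0),
]

# Gap = min interspecific - max intraspecific, only where both are known
gaps = {
    species: min_inter - max_intra
    for species, max_intra, min_inter in TABLE4
    if max_intra is not None and min_inter is not None
}
with_gap = sorted(species for species, gap in gaps.items() if gap > 0)
success_rate = 100 * len(with_gap) / len(gaps)
```

With these values, 19 species can be evaluated and five of them show a positive gap, which reproduces the 26.3% reported in the text.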
The NCBI Taxonomy database screening revealed that all the species in our dataset were represented by rbcL and trnH-psbA marker sequences in the database, except for Aesculus indica, Cedrus libani (neither marker), Crataegus azarolus and Sorbus domestica (only rbcL present).
When BLASTed to GenBank, all our rbcL sequences were identified by the reference sequences at the genus level (87.5% of total taxa), or even at the species level (41.6%). Genus misidentification occurred in the three Crataegus species, for which genera Cotoneaster, Pyrus, Pyracantha, Amelanchier, Chaenomeles (all belonging to the Rosaceae family) and Crataegus were also the best match. In contrast, correct genus and species identifications were obtained for Ilex aquifolium, Passiflora incarnata and Passiflora edulis, Punica granatum, Ziziphus jujuba, Sambucus nigra, Sorbus torminalis, Cedrus atlantica and Cedrus deodara.
TrnH-psbA was outperformed by rbcL, since none of the Sorbus sequences (four species) matched the right genus, and only eight species (33.3%) were correctly identified (Fraxinus ornus, Passiflora incarnata, Punica granatum, Ziziphus jujuba, Sambucus racemosa, Cedrus atlantica and Cedrus deodara). All other samples shared the highest score with other species (e.g. Aesculus hippocastanum with Aesculus turbinata, Fraxinus excelsior with Fraxinus angustifolia, Sambucus nigra with Sambucus racemosa, Crataegus monogyna with several other species), or even hit the wrong species (e.g. Ilex aquifolium, Sambucus ebulus, Crataegus oxyacantha). The four taxa not represented in GenBank (Cedrus libani, Aesculus indica, Crataegus azarolus and Sorbus domestica) were assigned to the correct genus. As a final result, only 11 species were correctly identified by the two-locus combination, corresponding to 55% of the investigated species having a reference in GenBank (45.8% of the total species set).
A summary of the correct species identifications achieved with the three discrimination methods used in the present study is shown in Table 5. Thirteen species (54.2% of our dataset) were identified by at least two methods. Only two species (Cedrus deodara and Sorbus torminalis) were identified with the three methods, whereas the absence of conspecific GenBank references prevented the same full identification for Cedrus libani and Aesculus indica. In contrast, six species (corresponding to three species pairs and totalling 25% of our dataset) appeared unidentifiable with any method: Crataegus monogyna, Crataegus oxyacantha, Sorbus aucuparia, Sorbus domestica, Fraxinus angustifolia, Fraxinus excelsior. Two species (Crataegus azarolus and Sorbus aria) were discriminated only by means of sequence specificity but received no support from either of the other two approaches (the former was absent in GenBank).
Table 5. Summary of the species identification success achieved with rbcL + trnH-psbA and the three discrimination methods in the present study: occurrence of unique haplotypes in the total species set, genetic distances among and within congeneric species, correct species match in the GenBank database. √ = correct identification; - = non-confident/wrong identification; n.d. = not determined (no intra- or interspecific samples investigated); a = species absent in GenBank with either one or both markers.
Species | Haplotype specificity | Min. inter- > max. intraspecific distance | GenBank correct match |
---|---|---|---|
Cedrus atlantica | √ | - | √ |
Cedrus deodara | √ | √ | √ |
Cedrus libani | √ | √ | a |
Crataegus monogyna | - | - | - |
Crataegus oxyacantha | - | - | - |
Crataegus azarolus | √ | - | a |
Sorbus aria | √ | - | - |
Sorbus aucuparia | - | - | - |
Sorbus domestica | - | - | a |
Sorbus torminalis | √ | √ | √ |
Aesculus hippocastanum | √ | √ | - |
Aesculus indica | √ | √ | a |
Fraxinus ornus | √ | - | √ |
Fraxinus angustifolia | - | - | - |
Fraxinus excelsior | - | - | - |
Sambucus nigra | √ | - | √ |
Sambucus ebulus | √ | - | - |
Sambucus racemosa | √ | n.d. | √ |
Passiflora incarnata | √ | - | √ |
Passiflora edulis | √ | n.d. | √ |
Punica granatum | √ | n.d. | √ |
Ziziphus jujuba | √ | n.d. | √ |
Ilex aquifolium | - | - | √ |
Ilex latifolia | - | n.d. | - |
Efficacy | 66.7% | 26.3% | 55% |
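The efficacy row and the method-overlap counts can be recomputed directly from the entries of Table 5. The sketch below simply re-encodes the table; the constant names are arbitrary, and the denominators follow the text (all 24 species for haplotypes, species with determinable distances for the gap, species with a GenBank reference for the database match).

```python
Y, N, ND, A = "yes", "no", "n.d.", "absent"

# species: (haplotype specificity, barcoding gap, GenBank correct match)
TABLE5 = {
    "Cedrus atlantica": (Y, N, Y),
    "Cedrus deodara": (Y, Y, Y),
    "Cedrus libani": (Y, Y, A),
    "Crataegus monogyna": (N, N, N),
    "Crataegus oxyacantha": (N, N, N),
    "Crataegus azarolus": (Y, N, A),
    "Sorbus aria": (Y, N, N),
    "Sorbus aucuparia": (N, N, N),
    "Sorbus domestica": (N, N, A),
    "Sorbus torminalis": (Y, Y, Y),
    "Aesculus hippocastanum": (Y, Y, N),
    "Aesculus indica": (Y, Y, A),
    "Fraxinus ornus": (Y, N, Y),
    "Fraxinus angustifolia": (N, N, N),
    "Fraxinus excelsior": (N, N, N),
    "Sambucus nigra": (Y, N, Y),
    "Sambucus ebulus": (Y, N, N),
    "Sambucus racemosa": (Y, ND, Y),
    "Passiflora incarnata": (Y, N, Y),
    "Passiflora edulis": (Y, ND, Y),
    "Punica granatum": (Y, ND, Y),
    "Ziziphus jujuba": (Y, ND, Y),
    "Ilex aquifolium": (N, N, Y),
    "Ilex latifolia": (N, ND, N),
}


def efficacy(column):
    """Return (successes, species that could be assessed) for one method."""
    values = [row[column] for row in TABLE5.values()]
    assessed = [v for v in values if v in (Y, N)]
    return values.count(Y), len(assessed)


haplotype, distance, genbank = efficacy(0), efficacy(1), efficacy(2)
percentages = [round(100 * ok / n, 1) for ok, n in (haplotype, distance, genbank)]

# Species confirmed by at least two, or by all three, methods
at_least_two = [sp for sp, row in TABLE5.items() if row.count(Y) >= 2]
all_three = [sp for sp, row in TABLE5.items() if row.count(Y) == 3]
```

The resulting percentages match the Efficacy row (66.7%, 26.3% and 55%), and the overlap counts match the thirteen and two species mentioned in the text.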
In our dataset, the rbcL + trnH-psbA combination showed the highest amplification and sequencing success (100%), whereas matK showed a much lower success (55.9%). Specifically, the currently most widely adopted primer set for angiosperms (matK_Kim) failed to amplify the Rosaceae, and the matK1F/2R primers, suggested for the Pinaceae, failed to amplify Cedrus sp. In addition, matK presented severe difficulties in the amplification and/or sequencing steps in the genera Berberis (Berberidaceae), Vitex (Lamiaceae), Cercis (Leguminosae) and Ginkgo (Ginkgoaceae) in the ongoing continuation of this work. The lack of universality of matK was already reported by e.g.
In contrast, trnH–psbA provided better discrimination than matK in many diverse tree genera such as Alnus (
The BLASTClust analysis yielded a 66.7% species discrimination, which is slightly lower than, but still in line with, the general limit acknowledged for land plants when markers from a single genetic linkage group are used (ca. 70%;
The highest identification success was achieved with the analysis based on the uniqueness of sequence character states, where some parts of the haplotypes (especially some trnH-psbA indels) appeared diagnostic for certain species. However, more data are required to confirm these diagnostic sequence features. If confirmed, these features may be important in view of the generally low interspecific divergences we observed. Conversely, the analysis of barcoding gaps suggests that such a discrimination approach may yield a lower efficiency, at least with trnH-psbA, since the uncorrected p-distance analysis removed all indels. Further complications were the high intraspecific divergences (e.g. in Cedrus atlantica) and the sharing of haplotypes among congeneric species (e.g. in Sorbus, Crataegus, Fraxinus, Sambucus). All these results challenge the application of DNA barcoding with rbcL + trnH-psbA in the taxa investigated here, all the more so because GenBank also showed a low identification efficiency and sometimes led to erroneous identifications, most often due to the limited number of available reference sequences and their sometimes very high intraspecific divergences.
DNA barcoding substantially improves our capacity to document existing biodiversity. It is also a powerful research complement for human socio-economics, safety, trade control, fraud discovery and the detection of forgeries in commercial plant products (
The Mediterranean woody flora comprises numerous valuable species used as ornamentals or for secondary products processing and marketing (edibles, essential oils, medicinal compounds). Field identification, authentication and certification of germplasm and raw materials are a major concern. As such, our results on Cedrus support previous findings that members of Pinaceae can be efficiently barcoded with rbcL + trnH-psbA (at least at a regional scale;
On the other hand, we confirm the difficulties previously encountered in barcoding Fraxinus (
Recently, research interest in DNA barcoding of regional floras of biological and/or economic relevance has grown markedly. In the present work, we lay the foundations for DNA barcoding applications in important woody plant genera of the Mediterranean basin, such as Cedrus, Aesculus, Ilex, Passiflora, Punica, Sambucus, Sorbus, Ziziphus. All these genera include valuable taxa for multiple natural and economic purposes, and our results complement similar DNA barcoding investigations performed on Euro-Mediterranean forested land in recent years (