DNA Barcoding of Portuguese Lacewings (Neuroptera) and Snakeflies (Raphidioptera) (Insecta, Neuropterida)

Abstract The orders Neuroptera and Raphidioptera include the species of insects known as lacewings and snakeflies, respectively. In Portugal, these groups account for over 100 species, some of which are very difficult to identify by morphological analysis. This work is the first to sample and DNA sequence lacewings and snakeflies of Portugal. A reference collection was built with captured specimens that were identified morphologically. DNA barcode sequences of 658 bp were obtained from 243 specimens of 54 species. The results showed that most species can be successfully identified through DNA barcoding, with the exception of seven species of Chrysopidae (Neuroptera). Additionally, the first published distribution data are presented for Portugal for the neuropterans Gymnocnemiavariegata (Schneider, 1845) and Myrmecaelurus (Myrmecaelurus) trigrammus (Pallas, 1771).


Introduction
Neuropterida is a superorder of insects which encompasses the orders Neuroptera, Raphidioptera and Megaloptera. The present work focuses on DNA barcoding of the first two orders in Portugal, while DNA barcoding of Megaloptera in the country was addressed in Ferreira et al. (2019).
The order Neuroptera includes the holometabolous insects commonly known as lacewings. With at least 6000 species worldwide, more than 300 of which occur in Europe, Neuroptera accounts for most of the diversity of the Neuropterida (Aspöck 2002b;Aspöck et al. 2015). For the almost 200 species known in the Iberian Peninsula, around half have been recorded in Portugal, spanning 10 families (Aspöck et al. 2001;Letardi and Almeida 2013;Monserrat and Triviño 2013;Oliveira and Ferreira 2020;this work).
The monophyly of the three orders of Neuropterida (Megaloptera as a sister group of Neuroptera + Raphidioptera) has been solidly established. Nonetheless, taxonomy of the groups is incompletely resolved and internal relationships are not yet established, despite recent studies, especially in the case of Neuroptera (Aspöck 2002b;Winterton et al. 2010;Wang et al. 2017;Engel et al. 2018). Most notably, recent evidence has been mounting for the integration of Ascalaphidae as a subfamily of Myrmeleontidae Machado et al. 2019;Vasilikopoulos et al. 2020).
DNA barcoding was proposed in 2003 as a method to rapidly and accurately identify species (Hebert et al. 2003;Hebert and Gregory 2005). This method relies on the existence of comprehensive databases of short DNA sequences (the DNA barcodes), which are attributed to previously identified specimens and used for comparison with DNA barcode sequences obtained from unidentified specimens or even environmental samples. For insects, the typical DNA barcode consists of a 658 bp sequence of the cytochrome c oxidase subunit I (COI) (Folmer et al. 1994), also known as the "Folmer region". DNA barcoding has been used in studies involving Neuropterida, namely in the construction and analysis of DNA barcode databases for the fauna of certain regions, including Central Europe (Morinière et al. 2014) and Beijing, China (Yi et al. 2018), in the description of new species (Pantaleoni and Badano 2012;Badano et al. 2016), and to resolve taxonomic questions (Price et al. 2015). It is important to accurately identify species, especially the ones with agricultural applications, such as those in Chrysopidae and Hemerobiidae, as misidentifications may compromise bio-logical control. Hitherto, DNA barcoding studies of Neuroptera and Raphidioptera in Portugal were non-existent, despite the considerable number of species known to occur in the country.
In this work, we present a contribution to the DNA barcode library for the Portuguese species of Neuroptera and Raphidioptera representing about 50% of known species in the country, alongside new and interesting distributional data. While most species were found to be identifiable through the use of the obtained DNA barcodes, this was not true for some cases in Chrysopidae. This work was conducted within the frame of the InBIO Barcoding Initiative, which aims at producing a comprehensive DNA barcode database for the Portuguese terrestrial invertebrate biodiversity.

Sampling of specimens
Specimens were collected during field expeditions throughout continental Portugal, from 2006 to 2019, and stored in 96% ethanol at the InBIO Barcoding Initiative reference collection (Vairão, Portugal). Specimens were captured during direct searches of the environment or lured by light trapping, the latter with UV LEDs or mercury vapour lamps. Morphological identification was done based on the most recent literature on Iberian Neuroptera and Raphidioptera Acevedo 2012a, b, 2013;Monserrat 2014aMonserrat , b, c, 2016aMonserrat et al. 2014;Monserrat and Papenberg 2015), and using an Olympus SZX2-ILLT Stereozoom microscope when necessary. From each specimen, one tissue sample (a leg) was removed and stored in 96% ethanol for DNA extraction.

DNA extraction, amplification and sequencing
For each species, we selected six specimens for DNA sequencing based on their location of capture, attempting to maximize the geographical coverage of the study. For species with less than six specimens, all were selected for sequencing.
DNA was extracted from most tissue samples using the EasySpin Genomic DNA Microplate Tissue Kit. For specimens belonging to species of smaller sizes (such as those from the Hemerobiidae and Coniopterygidae families), the QIAmp DNA Micro Kit was used, as it is designed to extract higher concentrations of genetic material from samples with small amounts of DNA.
Amplification of the DNA was performed using three different primer pairs, that amplify three overlapping fragments of the same 658 bp region of the COI mitochondrial gene. Initially, we used two primer pairs, LCO1490 (Folmer et al. 1994) + Ill_C_R (Shokralla et al. 2015) and Ill_B_F (Shokralla et al. 2015) + HCO2198 (Folmer et al. 1994) (henceforth referred to as LC and BH, respectively) to amplify two overlapping fragments of 325 bp and 418 bp, respectively. Following publication of the third primer pair, BF2 + BR2 (422 bp fragment), by Elbrecht and Leese (2017), this started to be used instead of Ill_B_F + HCO2198 due to higher amplification efficiency.
PCRs were performed in 10 µl reactions, containing 5 µl of Multiplex PCR Master Mix (Qiagen, Hilde, Germany, 0.3 (BF2-BR2) -0.4 mM of each primer, and 1-2 µl of DNA, with the remaining volume in water. For DNA amplification, an initial denaturation at 95 °C for 15 min was performed followed by 5 cycles at 95 °C for 30 sec, 47 °C for 45 sec, 72 °C for 45 sec (only for LC and BH); then 40 cycles at 95 °C for 30 sec, 51 °C for 45 sec (48 °C for 60 sec for BF2 + BR2), 72 °C for 45 sec; and a final elongation step at 60 °C for 10 min. DNA amplification was performed in T100 Thermal Cycler (Bio-Rad, California, USA).
All PCR products were analysed by agarose gel electrophoresis and samples selected for sequencing were then organised for assignment of sequencing 'indexes'. One of two types of index were used for each run. For Illumina indexes, samples were pooled into one plate, as described in Shokralla et al. (2015). When using custom indexes (designed based on (Meyer and Kircher (2010)) no pooling was required. The latter allow for a maximum of 1920 unique index combinations. A second PCR was then performed where the 'indexes' and Illumina sequencing adapters were attached to the DNA extract. The index PCR was performed in a volume of 10 µl, including 5 µL of Phusion High-Fidelity PCR Kit (New England Biolabs) or KAPA HiFi PCR Kit (KAPA Biosystems, USA), 0.5 µL of each 'index' and 2 µL of diluted PCR product (usually 1:4). This PCR reaction is only of 10 cycles and performed at an annealing temperature of 55 °C. The amplicons were purified using AMPure XP beads (New England Biolabs) before quantification using NanoDrop 1000 (Thermo Scientific). This step allows for a normalization of concentrations between samples before the final quantification step with a qPCR using the KAPA Library Quantification Kit Illumina Platforms (KAPA Biosystems, USA) (Paupério et al. 2018).
Sequencing was performed at the CIBIO facilities on an Illumina MiSeq benchtop system, using a V2 MiSeq sequencing kit (2× 250 bp).

Bioinformatic processing and data analysis
Sequences were filtered and processed with OBITools (Boyer et al. 2014) and the fragments were assembled into their consensus 658 bp-long sequences using Geneious 9.1.8 (https://www.geneious.com). The obtained DNA sequences were then compared against the BOLD database (Ratnasingham and Hebert 2007) using the built-in identification engine, based on the BLAST algorithm. Sequences were submitted to the BOLD database and the Barcode Index Numbers (BIN) for every sequence were retrieved and analysed (Suppl. material 1: Table S1).
All DNA barcode sequences were aligned in Geneious 9.1.8. with the CLUSTALW ( Thompson et al. 1994) plugin. Nucleotide composition of all sequences, as well as intra and interspecific p-distances were calculated in MEGAX (Kumar et al. 2018). Neighbour-joining trees were constructed in PAUP* 4.0a167 (Swofford 2003), with 1000 bootstrap replicates, as a simple way of visualizing genetic distance between sequences, while detecting possible misidentifications and incongruences. First, a tree with all obtained DNA barcode sequences of Neuroptera and Raphidioptera was constructed. For this, the outgroup sequences IBIMP001-19 and AGRID020-10 from the BOLD database (of Sialis fuliginosa Latreille, 1803 and Agriotes proximus Schwarz, 1891, respectively) were used to root the tree. These outgroups refer, respectively, to a species of Megaloptera (the third order within the Neuropterida) and a species of Coleoptera, the closest order to Neuropterida (Wang et al. 2017). Additionally, a NJ tree was constructed for Chrysopidae and Hemerobiidae, utilizing the sequences FBNE073-11 and FBNE001-11 (of Osmylus fulvicephalus (Scopoli, 1763) and Sisyra nigra (Retzius, 1783), respectively) as outgroups. The latter set of outgroups was used for family-level trees as representative of Osmylidae Linnaeus, 1758 and Sisyridae Banks, 1905. An analysis of the data with the Automatic Barcode Gap Discovery (ABGD) method (Puillandre et al. 2012) was performed at the dedicated website (https://bioinfo. mnhn.fr/abi/public/abgd/abgdweb.html), as a test of the existence of a barcoding gap between species, which is fundamental to species identification using DNA barcodes (Hebert et al. 2003(Hebert et al. , 2004.

Results
DNA barcode sequences of 658 bp were obtained for 243 specimens of Neuropterida, representing 54 of the 104 species known to occur in continental Portugal ( Fig. 1; Suppl. material 1: Table S1). These species are representative of 9 of 10 families of Neuroptera, and one of two families of Raphidioptera recorded in the country. These sequences represent 21 new species of Neuroptera and one of Raphidioptera for the BOLD database. Furthermore, of the already available sequences in BOLD only six originate from continental Portugal (accessed on 19/01/2021).

Neuroptera Linnaeus, 1758
Neuroptera specimens were collected from 67 sampling locations, in 12 districts ( Fig. 1 and Suppl. material 1: Table S1). From the 51 species, 12 were captured only once and are therefore represented by a single DNA barcode sequence in the dataset. Two of the species were hitherto without published records in scientific literature for the country: Gymnocnemia variegata and Myrmecaelurus trigrammus (Suppl. material 1: Table S1), despite being widespread in the whole Euro-Mediterranean area and their presence well known in Spain.
The ABGD method yielded partitions generally congruent with morphological identification. Nonetheless, some exceptions were noted. Regarding the Chrysopidae, the ABGD analysis yielded 15 partitions (P = 0.0055) (Fig. 3). While congruent with the NJ analysis (by considering the aforementioned polyphyletic groups of species as two separate species), it also grouped the DNA barcoding sequences of Pseudomallada prasinus and Pseudomallada abdominalis (Brauer, 1856), which NJ analysis separates into three clades (in congruence with three detected morphospecies; see Discussion), into one single "species". In the Hemerobiidae family, the ABGD analysis recovered only eight partitions (P = 0.0492), grouping Wesmaelius subnebulosus (Stephens, 1836) and Wesmaelius nervosus (Fabricius, 1793) (Fig. 4). Similar to the other two methods used, BIN allocation using BOLD Systems yielded congruent results for most species, with some particular cases of incongruence. In Ascalaphidae, the sequences belonging to the two species of Libelloides were grouped under the same BIN. For Chrysopidae, the BIN framework clustered sequences simi-  larly to ABGD, except for two sequences of Pseudomallada prasinus (INV10273 and INV07344), which were assigned BINs different from each other and the other sequences for the species, as well as one sequence from both Pseudomallada genei and Pseudomallada venosus which were not grouped in the same BIN as the other sequences of the same species (Fig. 3 and Suppl. material 1: Table S1). The sequences of Hemerobiidae yielded 10 BINs, one more than the number of morphologically identified species, as sequences of Sympherobius pygmaeus are in two BINs (Fig. 4).

Raphidioptera Latreille, 1810
DNA barcode sequences were obtained for eight specimens of Raphidioptera, accounting for three of the six species known to occur in Portugal.
Average nucleotide composition of all DNA barcode sequences of Raphidioptera was calculated as 37.2% thymine (T), 18.1% cytosine (C), 29.6% adenine (A) and 15.1% guanine (G). Base frequencies analysis revealed GC-contents of 33% for the DNA barcode fragment. Genetic distances between species ranged from 12.3% between A. maculicollis and H. castellana to 15.9% between H. laufferi and H. castellana. Intraspecific distances ranged from 0.2% in H. castellana to 1.2% in A. maculicollis (Suppl. material 2: Table S2). The NJ tree constructed with the calculated genetic distances recovered all species as monophyletic (Fig. 2). Analysis with the BOLD BIN system yielded three BINs, congruent with the morphological identification. Similarly, three partitions were recovered from ABGD analysis.
The eight specimens of Raphidioptera were captured in six sampling locations in Bragança and Leiria ( Fig. 1 and Suppl. material 1: Table S1)

Discussion
In this work, DNA barcode sequences and their respective analyses, as well as novel distributional data are provided based on 235 specimens of 51 species of Neuroptera and 8 specimens of 3 species of Raphidioptera. This is the first study focusing on DNA barcoding for these orders in Portugal.
The main goal of this work was to compile a DNA barcode reference collection for the Portuguese species of Neuroptera and Raphidioptera. About 50% of the faunal diversity of the groups is represented in the collection, and DNA barcode sequences were added to the BOLD database for species hitherto unrepresented. The analyses conducted suggest that most of the encompassed species can be identified with the COI gene-based DNA barcodes. This is the case for the Ascalaphidae, Berothidae, Mantispidae, Myrmeleontidae and Nemopteridae families. For the other families, Chrysopidae and Hemerobiidae, further scrutiny is necessary.
Interestingly, despite the congruence of taxonomy and the obtained DNA barcodes for the families Ascalaphidae and Myrmeleontidae, the genetic distances and phylogenetic tree (Fig. 2) show the latter group as paraphyletic. These results may provide further evidence for the integration of current Ascalaphidae species into the family Myrmeleontidae, a taxonomic change that has seen growing support in recent years Machado et al. 2019;Vasilikopoulos et al. 2020) Regarding the Chrysopidae, the results show four groups of species with conflicting results between morphological identification, NJ and ABGD analysis, and BIN attribution. The first consists of the DNA barcode sequences belonging to P. flavifrons and P. picteti, whose sequences were recovered as a single clade (NJ) and placed by ABGD analysis into a single group. Despite possessing distinctive morphological characteristics these are closely-related species with high degree of morphological variation (Aspöck et al. 2001;Monserrat 2016b;Duelli et al. 2017). The obtained results suggest that P. picteti and P. flavifrons share mitochondrial haplotypes, which may be due to incomplete lineage sorting or mitochondrial genome capture as a result of introgressive hybridization.
The morphospecies P. venosus and P. genei were recovered as monophyletic and ABGD considered each of the species as single units, although two different BINs were attributed to each species.
The Pseudomalla "prasinus" species complex, where P. prasinus and P. abdominalis are included, is the third group with conflicting results between NJ, ABGD and morphological analysis, and has been a subject of interest and contention in Neuropterology for over a century (McLachlan 1886). Recent molecular genetics works have supported the existence of a species complex (Duelli et al. 2017), showing cryptic diversity in the group. One of the prasinoid morphotypes is known as "marianus" and was previously considered as a valid species. The specimens INV10273 and INV07344 were identified as Pseudomallada marianus, by utilizing the key available for the Iberian Peninsula in Monserrat (2016b). Duelli and Obrist (2019) established the synonymy of P. marianus with P. prasinus, previously proposed by Hölzel (1973), based on Central European specimens. In the former, authors state that the morphological characters previously attributed to the "marianus" morph (i.e., larger size and bundled egg placement) are the ones that define the "real" P. prasinus. As such, smaller specimens belonging to the "prasinus" species complex can't yet be identified conclusively to species level until the prasinoid morphotypes are well-defined and described as a species (Duelli and Obrist 2019). However, the implications of this work on the Iberian Peninsula's specimens of the "prasinus" species complex are not clear and require further research. In the present work, the NJ analysis was congruent with the morphological identification based on the characteristics described in Monserrat (2016b) since it separately grouped INV7344 and INV10273, which were identified as the "marianus" morphotype, but failed to retrieve P. prasinus as monophyletic. In contrast, the ABGD analysis grouped all specimens of P. prasinus and P. abdominalis. Additionally, the intraspecific distance between DNA barcode sequences of P. prasinus (2.25%) was higher than expected relative to the other species in our dataset. Our results, albeit limited, provide additional support to the existence of cryptic diversity in P. prasinus. Identification through DNA barcoding may prove problematic until the taxonomy of the group is better resolved, and will likely benefit of the use of other DNA markers.
A more complex situation is that of Chrysoperla carnea, C. lucasina, C. pallida, C. agilis and C. mediterranea, in which all obtained sequences are grouped by NJ, ABGD and BIN analysis in a single unit. The five species belong to the so-called C. carnea species complex (Thierry et al. 1998;Henry et al. 2002Henry et al. , 2013. So far, the most reliable way to identify the species in this group is by their substrate-borne vibrational songs, produced by tremulation (Henry and Wells 2015). Even though these are not used for attraction of mates at long-distances as in many other animals, these signals are produced for close-range recognition of sexually receptive mates (Henry et al. 2002(Henry et al. , 2003(Henry et al. , 2012. The obtained results for the species of the group are congruent with previous studies (Lourenço et al. 2006;Morinière et al. 2014) and might be a result of the pre-copulatory reproductive isolation and the recent and rapid speciation of this group of species (Henry et al. 2013). Considering the obtained data and the available literature, a COI-based DNA barcode is not a feasible tool for species identification in this species complex.
The analysis of the sequences obtained from Dilaridae specimens yielded the highest intra and interspecific genetic distances of all studied species. The intraspecific genetic diversity in Dilar meridionalis was 3.67% (N = 4), while the genetic distance between the D. meridionalis and D. saldubensis was 17.7%. Since previous works on DNA barcoding of Neuroptera have poorly (Yi et al. 2018) or not represented (Morinière et al. 2014) the family at all, further sampling and sequencing would be needed to access the validity of DNA barcoding based on the COI gene for identification of species in this family.
The two species of Wesmaelius were separated in the NJ analysis as by morphology, though ABGD failed to recover two distinct groups. Furthermore, both the ID engine and BIN analysis in BOLD systems clearly separated the species and grouped the sequences in BINs with other sequences available in the BOLD database of the same two species. Considering these results, we suggest that COI DNA barcode sequences may be used in the identification of these two species.
Another species that presents more than one BIN is Sympherobius pygmaeus. The genetic diversity observed is congruent with previous work (Morinière et al. 2014) and further research is needed to verify if it is a case of cryptic diversity.
In our dataset, all species of Raphidioptera showed relatively low intraspecific divergence when compared with the respective interspecific distances. Despite the low number of DNA barcode sequences available and the absence of three of the six species in the dataset, the obtained results suggest that a DNA barcoding approach using a COI gene fragment may be used to discern between species of Portuguese Raphidioptera. This assumption is reinforced by the fact that all six species in the country belong to six different genera and are, as such, predicted to show relatively high interspecific distances between them.
For the large majority of encompassed species, DNA barcoding appears to be a reliable method of identification. While DNA barcoding cannot replace morphological taxonomy experts entirely, especially in taxa where the taxonomy still needs revision, it can aid in species identification in cases where morphology cannot be used. For example, in diet analyses, where only small body parts (or none at all) can be retrieved, using DNA barcoding may be the only method suitable for species identification, allowing the understanding of species interactions and their roles in the ecosystems.
Currently 73 species of Neuropterida present in Portugal have DNA barcoding data available, comprising the 54 species encompassed in this work and the 19 already available in the BOLD database from other countries. Nonetheless, 29 species known to occur in Portugal remain without DNA barcode available and further efforts are needed to fill this gap.

Conclusion
This work provides novel data on the DNA barcoding and geographical distribution of Neuroptera and Raphidioptera species in Portugal. Our results suggest that DNA barcoding using COI Folmer region may be used to identify the great majority of species of Neuroptera and Raphidioptera species recorded in the country. It is not, however, suitable for identification of several species of the Chrysopidae family. In total, there were 22 cases where the first publicly available DNA barcode sequence for a species was obtained but further sampling and sequencing efforts are still needed for many. The completion of DNA barcode databases is an ongoing effort and, in the cases of Neuroptera and Raphidioptera, still require much work, including in Europe, where several species are not yet sequenced. The future, however, looks bright as international initiatives are promoting and aiding in the development of DNA barcode sequences databases for particular regions worldwide (Letardi 2019).