When DNA barcoding and morphology mesh: Ceratopogonidae diversity in Finnmark, Norway

Abstract DNA barcoding in Ceratopogonidae has been restricted to interpreting the medically and veterinary important members of Culicoides Latreille. Here the technique is utilised, together with morphological study, to interpret all members of the family in a select area. Limited sampling from the county of Finnmark in northernmost Norway indicated the presence of 54 species, including 14 likely new to science, 16 new to Norway, and one new to Europe. No species were previously recorded from this county. Only 93 species were known for all of Norway before this survey, indicating how poorly studied the group is. We evaluate and discuss morphological characters commonly used in identification of biting midges and relate species diagnoses to released DNA barcode data from 223 specimens forming 58 barcode clusters in our dataset. DNA barcodes and morphology were congruent for all species, except in three morphological species where highly divergent barcode clusters indicate the possible presence of cryptic species.


Introduction
The Ceratopogonidae (biting midges) are generally small flies with a nearly worldwide distribution; the family includes 6,180 extant species in 111 genera (Borkent 2014a) but undoubtedly, many more undescribed species await discovery. Immatures are found in a wide array of aquatic, semiaquatic and moist terrestrial habitats. Female adults of many species in early lineages of the family suck blood from vertebrates or are ectoparasites on larger insects (e.g. wings of Odonata and Lepidoptera, caterpillars, phasmids). More derived lineages are predators of primarily nematocerous Diptera (e.g. Chironomidae) (Downes and Wirth 1981). Adults of both sexes imbibe nectar and/or honey dew and some are important pollinators of plants such as cocoa (Glendinning 1972). Numerous species of Leptoconops Skuse, Forcipomyia Meigen and Culicoides Latreille are pests of humans and livestock, having irritating bites and transmitting a wide array of viruses, protozoa and nematodes, including some important diseases (Borkent 2005).
Although Ceratopogonidae are common in almost all aquatic and semi-aquatic habitats, many species are small and members of some genera can be notoriously difficult to identify. The family is particularly poorly known taxonomically in Norway, in part due to very limited collecting and a general lack of experts over many years. Presently in Europe, the family has approximately half as many species as the Chironomidae, while in Norway this percentage is considerably lower (15%), as only 93 species of ceratopogonids have been recorded (Soot-Ryen 1943, Mehl 1996, Thunes et al. 2004, and Szadziewski et al. 2012). Worldwide, however, there are about as many species of Ceratopogonidae (6,180) known as of Chironomidae (6,235) (Borkent 2014a, Patrick Ashe pers. comm.). As this limited study of the ceratopogonid fauna of the far north shows, there are many more species of Ceratopogonidae actually present in Norway, with the strong expectation of further species both there and in more southerly habitats once these are systematically collected. Further to this, and particularly pertinent to studies of northern faunas, there have been only a few taxonomic studies comparing Old and New World Ceratopogonidae (Borkent and Bissett 1990; Borkent and Grogan 1995) and we therefore are uncertain about the true identity of some of these. There is an especially strong need to compare species of Forcipomyia, Atrichopogon Kieffer, Dasyhelea Kieffer, Culicoides, and Brachypogon Kieffer, all genera with numbers of species in the far north and which likely are more broadly distributed in the Holarctic than presently recognized.
Of all biting flies, the immatures of Ceratopogonidae are by far the most poorly known, with only limited regional keys to some larvae and pupae of some genera. To a distressing degree, the larvae of the subfamily Ceratopogoninae are morphologically similar and difficult to identify. The pupae are rich in characters and have been recently revised by Borkent (2014b).
DNA barcoding is defined as the use of short standardized sequences to identify specimens to species (Hebert et al. 2003). As a natural consequence, DNA barcodes can also be used to analyze species boundaries through genetic comparisons between similar taxa and provide an objective dataset to be used in the definition of species in addition to morphology, ecology and other species-specific characteristics. The 5' end of the mitochondrial gene cytochrome c oxidase subunit one (COI) is, since Hebert et al. (2003), regarded as the standard barcode region for animals and has been fairly widely used in Diptera (e.g. Ekrem et al. 2010; Renaud et al. 2012; Meiklejohn et al. 2012; Nagy et al. 2013). This marker is also used in the establishment of the Barcode Index Number (BIN) System, a DNA based registry for all animal species using operational taxonomic units as presumptive species (Ratnasingham and Hebert 2013). The use of COI-barcodes (or other molecular markers) to interpret species of Ceratopogonidae has barely begun and has focused on distinguishing those species of Culicoides implicated in the spread of diseases of domestic animals (e.g. Ander et al. 2013; Augot et al. 2013) as well as their hosts and parasites (Santiago-Alarcon et al. 2012). Our broader use here is the first to examine all the species of Ceratopogonidae at a given locality. Being a study of a high latitude fauna, the work provides ample opportunities to make future comparisons with the ceratopogonid fauna from elsewhere and especially from other localities in the northern Holarctic Region.
Neither Mehl (1996), in his overview of Norwegian Culicoides, nor the Norwegian Biodiversity Information Centre (NBIC's "Artsobservasjoner" and "Artskart") had registered any Ceratopogonidae species from the county of Finnmark prior to our work.

Material and methods
Specimens were collected through a survey focusing on selected aquatic insect groups in Finnmark, the northernmost county of mainland Norway. More than 100 different sites were visited in three main trips during the season from June 11 to September 9, 2010 (Ekrem et al. 2012). Since the Ceratopogonidae were not a target group during the sampling, it is likely only a fraction of the existing species have been collected. The majority of Ceratopogonidae were retrieved from eight Malaise traps and only seven additional sites were sampled with sweep nets, dip nets or light trap (Fig. 1). All sample sites are described in Ekrem et al. (2012).
DNA barcodes were initially used to explore the unknown diversity of Ceratopogonidae from Finnmark. Several specimens of each morphotype were selected under a stereomicroscope and sampled for DNA analysis, typically by removing 1-3 legs. Tissues were shipped to the Biodiversity Institute of Ontario (BIO), Canada for sequencing of partial COI gene sequences. Mainly adult flies of both sexes were sequenced, but two larvae and one pupa were also included. COI amplification and sequencing followed standard protocols at the Canadian Centre for DNA Barcoding, BIO, including bi-directional Sanger sequencing. A list of barcoded material and all reference numbers are given in the Appendix; protocols, sequences, metadata and photographs of all specimens are available through the public project "Ceratopogonidae of Finnmark" [FICER] in the Barcode of Life Data Systems 3.0 (BOLD), (www.boldsystems.org, Ratnasingham and Hebert 2007).
Since slide mounting is generally needed for morphological species identification of biting midges, selected specimens (representing both sexes when available) from each cluster were slide mounted in Euparal©. The remaining un-mounted midges are preserved in 96% ethanol and stored in a -20 °C freezer. All specimens are deposited in the collection of the NTNU University Museum in Trondheim, Norway.
DNA barcodes from each genetic cluster (produced by the neighbor joining algorithm on Kimura 2-parameter genetic distances in BOLD) were compared with all COI sequences in BOLD and GenBank through the BOLD identification engine and GenBank's MegaBLAST algorithm (Morgulis et al. 2008), respectively. All instances which produced an identification different from our morphological identification are discussed in the taxonomic treatments below. We used MEGA 5.2 (Tamura et al. 2011) to generate the taxon ID-tree based on the neighbor joining algorithm from aligned COI sequences using partial deletion for areas with gaps and 1000 bootstrap replicates. The taxon ID-tree is not a phylogenetic hypothesis of the included taxa, but a graphic representation of barcode clusters based on genetic Kimura 2-parameter distances. Alignment was performed on protein sequences and was trivial as there were no observed indels and very high similarity at the amino acid level. Tools present in BOLD were used to produce a genetic distance summary and to perform a barcode gap analysis. All analyses were done using Kimura 2-parameter genetic distances (Kimura 1980). Species were identified using taxonomic literature as referenced below (under each genus or species). Sources for Ceratopogonidae records in Norway were Soot-Ryen (1943), Mehl (1996), Hagan et al. (2000), Thunes et al. (2004), and Szadziewski et al. (2012). Comments on European distribution of Ceratopogonidae are based on data published in Fauna Europaea (Szadziewski et al. 2012); for North America we relied on the summary distributions given by Borkent and Grogan (2009) and kept updated by the second author.
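For readers unfamiliar with the Kimura 2-parameter (K2P) distance used throughout, the following is a minimal sketch of how it can be computed for one pair of aligned sequences. It is an illustration only, not the BOLD or MEGA implementation: the function name `k2p_distance` is hypothetical, and sites with gaps or ambiguous bases are simply skipped for the pair being compared, which is an assumption made here for brevity. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln((1 - 2P - Q) * sqrt(1 - 2Q)).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence
    has a gap or an ambiguous base are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (C<->T) in ten compared sites.
print(round(k2p_distance("ACGTACGTAC", "ACGTACGTAT"), 4))
```

Multiplying the returned value by 100 gives the percentage distances reported in the Results.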

Results
DNA barcodes were obtained from 223 specimens representing 54 morphological species (Table 1, Fig. 2). Thirty-eight species were represented by more than one specimen from Finnmark and showed a mean intraspecific Kimura 2-parameter distance of 1.6%. Maximum observed intraspecific distance for the complete dataset was considerably higher (11.9%) than the minimum observed interspecific divergence (5.8%). However, at least three morphological species contained multiple BINs (well separated barcode clusters) where cryptic species-level diversity may be present. Dasyhelea (Dicryptoscena) modesta (Winnertz, 1852) contains two BINs with mean intraspecific distance 4.9%, maximum intraspecific distance 11.9% and distance to nearest neighbor of a different morphospecies 16.9%. Dasyhelea (Dasyhelea) malleola Remm, 1962 contains four BINs with mean intraspecific distance 2.2%, maximum intraspecific distance 5.1% and distance to nearest neighbor of a different morphospecies 15.1%. Brachypogon (Isohelea) nitidulus (Edwards, 1921) contains two BINs with mean intraspecific distance 3.2%, maximum intraspecific distance 6.0% and distance to nearest neighbor of a different morphospecies 17.2%. Treating the multiple clusters of these three morphospecies as presumptive (cryptic) species, the maximum intraspecific distance for the whole dataset is 4.0% compared to 5.8% minimum interspecific distance, giving an overall barcode-gap of almost 2%.
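The barcode-gap arithmetic above can be sketched in a few lines. This is a hedged illustration, not the BOLD barcode gap tool: the function `barcode_gap`, the species labels and the small distance matrix are all hypothetical, and only the two summary values (4.0% maximum intraspecific and 5.8% minimum interspecific distance) are taken from the text.

```python
from itertools import combinations


def barcode_gap(species, dist):
    """Return (max intraspecific, min interspecific, gap) in the units of dist.

    species: list of species labels, one per specimen.
    dist: square matrix (list of lists) of pairwise distances.
    Hypothetical helper for illustration only.
    """
    max_intra = None
    min_inter = None
    for i, j in combinations(range(len(species)), 2):
        d = dist[i][j]
        if species[i] == species[j]:
            max_intra = d if max_intra is None else max(max_intra, d)
        else:
            min_inter = d if min_inter is None else min(min_inter, d)
    if max_intra is None or min_inter is None:
        raise ValueError("need both within- and between-species pairs")
    return max_intra, min_inter, min_inter - max_intra


# Toy data: four specimens, two made-up species, distances in percent.
species = ["sp_A", "sp_A", "sp_B", "sp_B"]
dist = [
    [0.0, 4.0, 5.8, 7.1],
    [4.0, 0.0, 6.5, 6.9],
    [5.8, 6.5, 0.0, 1.2],
    [7.1, 6.9, 1.2, 0.0],
]
max_intra, min_inter, gap = barcode_gap(species, dist)
print(max_intra, min_inter, round(gap, 1))
```

A positive gap means every within-species distance is smaller than every between-species distance. With the unsplit morphospecies (11.9% maximum intraspecific against 5.8% minimum interspecific distance) the same subtraction is negative, i.e. there is no gap.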
There are two additional morphospecies where the Refined Single Linkage (RESL) analysis in BOLD (Ratnasingham and Hebert 2013) produces multiple BINs but where we suspect no more than one species: Brachypogon (Isohelea) sociabilis (Goetghebuer, 1920) has four BINs, a mean intraspecific distance of 1.71%, maximum intraspecific distance of 4.0% and distance to nearest neighbor 13.5%. Bezzia rhynchostylata Remm, 1974 has three BINs, a mean intraspecific distance of 2.4%, maximum intraspecific distance 3.8% and distance to nearest neighbor 17.2%. Both morphology and comparatively low intraspecific distance in these species suggest that the RESL algorithm overestimates presumptive species (as BINs) for these taxa.
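How a single morphospecies can fall into several clusters is easy to see with plain single-linkage clustering at a fixed distance threshold. The sketch below is a simplified stand-in and not the RESL algorithm itself; the function name, the thresholds and the toy distance matrix are all hypothetical and chosen only to show the effect of the cut-off.

```python
def single_linkage_clusters(dist, threshold):
    """Group specimens so that any two linked by a chain of pairwise
    distances <= threshold end up in the same cluster.

    dist: square matrix (list of lists) of pairwise distances.
    Returns a sorted list of clusters, each a list of specimen indices.
    Hypothetical helper; a simplified stand-in for BIN assignment.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy morphospecies: two tight pairs separated by roughly 4% (made-up values).
dist = [
    [0.0, 0.5, 3.8, 4.0],
    [0.5, 0.0, 3.9, 3.7],
    [3.8, 3.9, 0.0, 0.6],
    [4.0, 3.7, 0.6, 0.0],
]
print(single_linkage_clusters(dist, threshold=2.0))
print(single_linkage_clusters(dist, threshold=4.0))
```

With the stricter threshold the four specimens split into two clusters; with the looser one they merge into a single cluster. The same sensitivity to the cut-off is why moderate intraspecific divergence can yield several BINs for what morphology suggests is one species.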
We also compared our DNA barcodes with the partial COI gene sequences Ander et al. (2013) provided for 37 named Culicoides species from Sweden. All Culicoides species we collected in Finnmark, except for C. minutissimus (Zetterstedt, 1855), are [...] Ander et al.'s (2013) and not to Wenk et al.'s (2012) interpretation of the species. Voucher material for the COI-sequences published by Wenk et al. (2012) and Ander et al. (2013) was requested from the respective authors, but unfortunately not made available for examination. Thus, we were unable to confirm if the identifications correspond to our morphological interpretation of C. salinarius.
Five of the sample sites collected 92% of the investigated specimens and all but one species were found at the five sites FinLoc65, FinLoc05, FinLoc08, FinLoc85, and FinLoc42 (Fig. 1, Ekrem et al. 2012). The most productive location in terms of Ceratopogonidae material was one locality in the eastern part of the county (FinLoc65, Malaise 7) in which 72% of all specimens treated were sampled and 41 of 55 species were found. The other Malaise traps collected from 1.4% to 8.4% of the specimens and 3-13 species, while the light trap at the research station (FinLoc85) collected 4.7% of the specimens and four species, including three species not collected elsewhere.
Table 1. Distribution of Ceratopogonidae in Finnmark based on the revised Strand-system (Økland 1981). Species marked with an asterisk (*) are also known from North America (Borkent and Grogan 2009). Division of Finnmark in four regions according to the revised "Strand-system" (Økland 1981).
Figure 2. Taxon-ID tree of the studied Ceratopogonidae specimens based on the neighbor joining algorithm from aligned COI sequences using partial deletion for areas with gaps and 1000 bootstrap replicates in MEGA 5.2. All included sequences were longer than 500 bp. Bootstrap values shown on branches supported by more than 90% of the bootstrap replicates.

Taxonomic discussion
The "ID" referred to below is the individual DNA barcode specimen ID and serves as a link between the DNA barcode in BOLD and the voucher specimen. The "FinLoc" number denotes the specific collecting sites shown in Figure 1.

Atrichopogon
We collected adults of five species of Atrichopogon representing three subgenera.

Atrichopogon (Atrichopogon) hirtidorsum Remm, 1961
All three females of A. (Atrichopogon) hirtidorsum key to A. fossicola in Goetghebuer (1934) (A. fossicola is listed as a synonym of A. fuscus) and to A. hirtidorsum in Remm (1961) based on the length of the scutal bristles.

Atrichopogon (Atrichopogon) infuscus Goetghebuer, 1929
The single male of A. (Atrichopogon) infuscus keys to A. infuscus both in Goetghebuer (1934) and Remm (1961). The available descriptions of A. infuscus and A. hirtidorsum are very basic and we have not examined types of these species. Thus, more detailed taxonomic revision of these species may change the identity of our examined specimens.

Forcipomyia
Within the genus Forcipomyia we found 18 species distributed in five subgenera (Table 1). For identifying the subgenera we used the key and definitions in Wirth and Ratanaworabhan (1978), Debenham (1987), and a key to the subgenera restricted to Fennoscandia and northern Europe (Borkent unpublished). Alwin and Szadziewski (2013) recently published a key to the subgenera present in Poland, which confirms the subgeneric identifications here. Identification of Forcipomyia at the species level is mostly based on the key and figures in Remm (1962); however, additional literature is used in individual cases (see below).

Forcipomyia (Euprojoannisia) sp. 6ES nr. palustris
Forcipomyia sp. 6ES nr. palustris is morphologically similar to F. palustris but differs subtly in the male genitalia: the gonocoxal apodemes are narrower apically, with very short lateral projections, and posteriorly the ventral prong is more slender and elongate.

Forcipomyia (Forcipomyia) sp. 2ES bipunctata group
The females of F. sp. 2ES have lanceolate setae on all tibiae and elongated seminal capsules. They seem to belong within the bipunctata group (Szadziewski et al. 2007). For species determination an association with male specimens is necessary. Whether these two specimens belong to one or two species is not clear. More material and associations are necessary for accurate determination. Material examined. 2♀♀ (ID: FiCer54, FiCer96) 07 and 08 September 2010, FinLoc85, light trap.

Forcipomyia (Forcipomyia) sp. 3ES bipunctata group
The single female, F. sp. 3ES, with lanceolate setae on mid and hind tibia, fits within the bipunctata group (Szadziewski et al. 2007). The larger setae on fore tibia are missing (broken) and could be lanceolate or not. This specimen has a wing length of 1.7 mm, like the largest species of this group, F. (Forcipomyia) ciliata (Winnertz, 1852).

Forcipomyia (Forcipomyia) sp. 1ES
The single female specimen of F. sp. 1ES is genetically relatively close to F. hygrophila but easy to distinguish morphologically (e.g. by the shape of the palpus) (Fig. 3).

Forcipomyia (Forcipomyia) tenuis (Winnertz, 1852)
Forcipomyia tenuis has not been recorded from Scandinavia before, but is known from many other European countries.

Forcipomyia (Synthyridomyia) acidicola (Tokunaga, 1937)
Three species of the subgenus F. (Synthyridomyia) are known from Europe. The single female specimen from Finnmark fits the diagnosis of the subgenus (Wirth and Ratanaworabhan 1978) and Tokunaga's (1937) description of the species. Forcipomyia acidicola has been previously recorded in Norway (Thunes et al. 2004).

Forcipomyia (Synthyridomyia) knockensis Goetghebuer, 1938
The identification of this species is based on the key of Remm (1962) and the redescription by Szadziewski (1983).

Dasyhelea
The Polish species of this genus have recently been revised by Dominiak (2012), whose treatment included 30 of the 63 known European species (Dominiak and Szadziewski 2010).

Dasyhelea (Dasyhelea) bensoni Edwards, 1933
Dasyhelea bensoni is not included in Dominiak's (2012) key but the species is discussed within the description of D. pallidiventris (Goetghebuer, 1931) in her work. An allocation of the two Finnmark females to the species is not definite since the palpal setae are missing on both specimens and no associated males have been collected. Dasyhelea (Dasyhelea) bensoni has been previously recorded in Norway.

Dasyhelea (Dasyhelea) malleola Remm, 1962 (2 clusters)
There are two clusters of D. (Dasyhelea) specimens, both including males and females, which key out to D. (Dasyhelea) malleola and fit within the description for the species provided by Dominiak (2012). Whether or not these specimens are members of one or two species requires more material and a Holarctic revision of the genus. Dasyhelea malleola has been previously recorded in Norway. Between the males, no significant differences could be observed. The females, however, differ in the shape of the posterior portion of sternite 9 (projecting anteriorly): the subgenital plate is elongate and vase-shaped in FiCer44 (Fig. 5) and widened in FiCer66, FiCer120, FiCer191 and FiCer243 (Fig. 5 and figure 39 in Dominiak (2012)). The spermatheca of specimen FiCer44 lacked pores and the extension was narrow; the spermathecae of specimens FiCer191, FiCer120, FiCer243, and FiCer66 had pores and a thicker extension. Since we have only a single male and female in the cluster of D. cf. malleola, more material is needed to confirm whether the differences are consistent between the two forms.

Dasyhelea (Dicryptoscena) modesta (Winnertz, 1852)
Several specimens, both males and females, could be assigned to D. modesta (Winnertz, 1852). They fit Dominiak's (2012) interpretation of the species. The genetic distances for COI within the species cluster, however, can be as much as 10%, indicating the possibility of more than one species under this name.

Culicoides
Within the genus Culicoides we found seven species representing five subgenera. For identification, the keys and descriptions of Glukhova (2005) were used and, in addition, the key and descriptions in Campbell and Pelham-Clinton (1959) and Delécolle (1985) were consulted.

Culicoides (Beltramyia) salinarius Kieffer, 1914
The males of this species key to C. salinarius in Delécolle (1985). The wings of these specimens have a single pale spot over r-m and CuA2 is dark, also features of females of this species.

Ceratopogonini
Genus Brachypogon
The Brachypogon species collected in Finnmark all belong to the subgenus B. (Isohelea).

Brachypogon (Isohelea) nitidulus (Edwards, 1921)
As mentioned above, there are two clearly divergent clusters of DNA barcodes from specimens identified as B. nitidulus, with a maximum Kimura 2-parameter distance of 5.98% (Figure 2). Specimens from the two clusters were collected at the same time and place and no morphological distinction is observed. We suspect that the fairly large observed COI divergence indicates possible cryptic species in this group. Brachypogon nitidulus has been previously recorded in Norway. The male specimens of the cluster with FiCer04 have a relatively stout palpal segment 3, the males of cluster with FiCer05 have a more slender palpal segment 3.

Genus Ceratopogon
All identifications are based on the generic revision by Borkent and Grogan (1995).

Ceratopogon abstrusus Borkent & Grogan, 1995
Ceratopogon abstrusus was described by Borkent and Grogan (1995) from the Nearctic with a wide range from Alaska to northern Greenland and was referred to by them as "the most broadly distributed of all Ceratopogon species". The record from Finnmark is the first for the Palearctic (other than northern Greenland).
A single pupa was collected in a drift sample (see Fig. 9 in Ekrem et al. 2012).
The pupa from Finnmark is the first record of this genus for Norway. Three European species of Probezzia are Holarctic in distribution (Wirth 1971). The specimen was identified to genus using the key to genera by Borkent (2014b).

Genus Palpomyia

Palpomyia puberula Remm, 1976
The examined female keys to and fits the description of Palpomyia puberula in Remm (1976).

Palpomyia serripes (Meigen, 1818)
The examined males and females key to and fit the description of P. serripes in Remm (1976). The species seems to have a "north-south" rather than a circumpolar distribution.

Ceratopogonidae gen. sp. 1ES
The larvae belong to either Bezzia or Palpomyia. For further identification association with the adult is required.

Discussion
Our relatively cursory sampling of Ceratopogonidae revealed a startling 54 species within nine genera. Of these, 40 could be identified to previously named species, and 14 are apparently either undescribed or are close to previously known species. Considering that no Ceratopogonidae have been previously recorded from Finnmark, this is a substantial increase in numbers and reflects the poorly sampled and interpreted state of this diverse and common family in northern Norway. There are several impediments to our understanding of this group in Finnmark. For example, much of our collecting, especially with hand nets, was not focused on Ceratopogonidae, which often require a less delicate sweeping mode than is best for Chironomidae. Most of our specimens were collected with Malaise traps, especially with the trap at locality FinLoc65. Even with these considerable limitations, we uncovered a substantial diversity. Certainly, with further concerted sampling in Finnmark, we would expect to find a significantly more diverse fauna than reported here.
A second impediment to understanding Ceratopogonidae in Finnmark, Norway and Europe in general is the existence of major gaps in taxonomic revisions. For most genera, there are no inclusive European keys based on examination of types and comparative material, and most current revisions are regional or country specific. Even the continent-wide threat of Bluetongue and the Schmallenberg virus, resulting in millions of euros in losses to livestock, has failed as an incentive to produce a comprehensive taxonomic analysis of the species of Culicoides, some of which act as vectors of these diseases. Further to this, very few revisions have compared Palaearctic and Nearctic species, which is especially important for northern taxa, and this has made an understanding of the distributions of many species uncertain. In some instances, it is very likely that some Palaearctic and Nearctic species, presently with different names, are actually conspecific.
To complete comprehensive revisions, authors should check all available types. This too is an impediment to our understanding of a number of genera. Many species names are floating because no one has examined the types since they were first described (in some cases over 150 years ago!).
Much of this reflects the general state of support for taxonomy, which is generally poor to non-existent. In the meantime, the Ceratopogonidae are a case in point for the value of future studies. Many species live in peripheral aquatic habitats (edges of streams, ponds and marshes) or in very small water bodies (springs, small pools), habitats that are often under extreme threat on our planet. A better understanding of the fauna of these habitats would reinforce the concept that they need to be protected.
One advantage of the present study, despite the lack of some species names, is that every investigated specimen is DNA barcoded and kept as a voucher in a public collection. This makes it possible to include them in further taxonomic studies, and to associate other life stages at a later point in time when obtained. For morphological species that are represented by more than one barcode cluster (such as Dasyhelea modesta or Brachypogon nitidulus), detailed reexamination of vouchers will be required to discover possible morphological traits that may distinguish new taxa. Moreover, as Anderson et al. (2013) found for the chironomid genus Micropsectra, detailed comparison of multiple life stages, ecology and nuclear molecular markers should clarify whether some of the highly divergent barcode clusters obtained in our study actually represent different biological species.
Overview of sequenced Ceratopogonidae specimens and species from Finnmark.