How reliably can northeast Atlantic sand lances of the genera Ammodytes and Hyperoplus be distinguished? A comparative application of morphological and molecular methods

Abstract
Accurate stock assessments for each of the dominant species of sand lances in the northeast Atlantic Ocean and adjacent areas are not available due to the lack of a reliable identification procedure; therefore, appropriate measures of fisheries management or conservation of sand lances cannot be implemented. In this study, detailed morphological and molecular features are assessed to discriminate between four species of sand lances belonging to the genera Ammodytes and Hyperoplus. Morphological characters described by earlier authors as useful for identification of the genera are confirmed, and two additional distinguishing characters are added. A combination of the following morphological characters is recommended to distinguish between the genera Hyperoplus and Ammodytes: the protrusibility of the premaxillae, the presence of hooked ends of the prevomer, the number of dermal plicae, and the pectoral-fin length as a percentage of the standard length. The discriminant function analysis revealed that morphometric data are not very useful to distinguish the species of each of the two genera. The following meristic characters improve the separation of Hyperoplus lanceolatus from Hyperoplus immaculatus: the number of lower arch gill rakers, total number of gill rakers, numbers of caudal vertebrae and total vertebrae, and numbers of dorsal-fin and anal-fin rays. It is confirmed that Ammodytes tobianus differs from Ammodytes marinus by its belly scales that are organised in tight chevrons, scales which are present over the musculature at the base of the caudal fin, as well as by the lower numbers of dermal plicae, dorsal-fin rays, and total vertebrae. In contrast to the morphological data, mitochondrial COI sequences (DNA barcodes) failed to separate unambiguously the four investigated species.
Ammodytes tobianus and Hyperoplus lanceolatus showed an overlap between intraspecific and interspecific K2P genetic distances and cannot be reliably distinguished using the common DNA barcoding approach. Ammodytes marinus and Hyperoplus immaculatus exhibited gaps between intraspecific and interspecific K2P distances of 2.73 and 3.34% respectively, indicating that their DNA barcodes can be used for species identification. As an alternative, short nuclear Rhodopsin sequences were analysed and one diagnostic character was found for each of the species Ammodytes marinus, Hyperoplus lanceolatus, and Hyperoplus immaculatus. Ammodytes tobianus can be characterised by the lack of species-specific mutations when compared to the other three species. In contrast to COI, the short nuclear sequences represent a useful alternative for rapid species identification whenever an examination of morphological characters is not available.


Introduction
Sand lances of the family Ammodytidae are small fishes that live primarily in marine and adjacent brackish waters with sandy substrates of the northern hemisphere, where they are able to quickly dive into the substrate to escape predation (Randall and Ida 2014, Orr et al. 2015). These fishes are characterised by elongated and subcylindrical bodies and possess relatively low elongated dorsal and anal fins without spines, which are separated from the forked caudal fin (e.g. Reay 1986). The number of principal caudal rays is reduced and there is no pelvic fin in most species (e.g. Ida et al. 1994). Sand lances have an increased number of vertebrae in which the number of pre-caudal vertebrae is higher than the number of caudal vertebrae. The lower jaws project beyond the upper jaws. Small and unobtrusive scales are present (e.g. Reay 1986) and the body is often covered in oblique skinfolds (so-called plicae).
The family Ammodytidae comprises 31 species in seven genera (e.g. Randall and Ida 2014, Orr et al. 2015) of which the two genera Ammodytes and Hyperoplus are distributed circumboreally (Ida et al. 1994). Five species of sand lances belonging to three genera occur in northeast Atlantic waters (Sparholt 2015). This includes the Common sand eel Ammodytes tobianus Linnaeus, 1758 and the Lesser sand eel A. marinus Raitt, 1934, currently recognised together with four further species in the genus Ammodytes (Orr et al. 2015). Additionally, both species of the genus Hyperoplus, Corbin's sand eel Hyperoplus immaculatus (Corbin, 1950) and the Greater sand eel H. lanceolatus (Le Sauvage, 1824), can be found in the eastern north Atlantic area (Reay 1986), as well as Gymnammodytes semisquamatus (Jourdain, 1879). The latter can morphologically be distinguished from the species mentioned above by having a branched lateral line, a body not covered in oblique plicae (Cameron 1959), and scales that are loosely scattered and restricted to the posterior third of the body (Reay 1986), whereas the genera Hyperoplus and Ammodytes exhibit plicae along the body and an unbranched lateral line.
In identification keys these two genera are often distinguished by showing clear protrusible premaxillae and no vomerine teeth (Ammodytes) or no clear protrusible premaxillae and a pair of vomerine teeth (Hyperoplus, e.g. Reay 1986). Hyperoplus lanceolatus can be separated from H. immaculatus by the occurrence of a conspicuous dark spot on either side of the snout below the anterior nostril. This spot is lacking in H. immaculatus. Ammodytes tobianus is generally distinguished from A. marinus by its characteristic belly scales that are organised in tight chevrons and having scales present over the musculature at the base of the caudal fin, whereas these features are not present in A. marinus (Reay 1986).
However, the distinguishing features mentioned above are not easy for the untrained eye to observe when comparative material of different species is not available. Furthermore, accurate species identification is difficult, especially for juvenile individuals, and even sub-adult and adult sand lances are difficult to identify (Sparholt 2015) if the identification procedure is restricted to the few morphological characters mentioned above. In this context, Naevdal and Thorkildsen (2002) mentioned the difficulties regarding morphological separation of some of the five species of sand lances found in the northeast Atlantic and suggested a method for successful species identification on the basis of allozyme variation. DNA restriction fragment patterns have also been proposed to distinguish between some of the Atlantic sand eel species (Mitchell et al. 1998) as an alternative to morphological characters.
The difficult identification of sand lance species contributed to the current situation that accurate stock assessments are not available separately for each of the species in the North Sea and adjacent areas (see Sparholt 2015). However, sand lances here are subject to large-scale, industrial exploitation for fish meal and oil production and are also a major prey for many predators such as piscivorous fish, birds, and mammals (e.g. Reay 1986). It is known that exploitation of sand lances affects the food availability for these predators and that the abundance of sand lances is sensitive to recruitment variation (Sparholt 2015). Sand lances are divided into seven stock components for stock assessments in the North Sea based on the most abundant species A. marinus. With this approach, the stock situation of the single species cannot be evaluated, as it does not consider that sand lances represent a mix of different species. Clearly, another drawback is that an evaluation of the conservation status of the single species of sand lances is not possible (Thiel et al. 2013).
Molecular-based identification methods of fish species have been developed over the last decades (for an overview see Teletchea 2009). In this context, DNA barcoding is the most popular and effective technique; it uses partial cytochrome c oxidase subunit I (COI) sequences for a standardised and routine identification of specimens to species level (Hebert et al. 2003). For a successful application of DNA barcoding as a tool for specimen identification, reliable sequence reference libraries such as the Barcode of Life Database (BOLD, Ratnasingham and Hebert 2007) were developed. Newly generated DNA barcodes can be uploaded and analysed together with data already available on BOLD in order to provide taxonomic identification. Additionally, barcode sequences are automatically analysed on BOLD and a Barcode Index Number (BIN) is assigned according to the calculated sequence clusters (Ratnasingham and Hebert 2013). Taxonomic conflicts become apparent if sequences assigned to the same species name are found within different BIN clusters.
For fish, the species discrimination success of DNA barcoding was demonstrated in many studies including freshwater as well as marine faunas from many regions all over the world (e.g. Ward et al. 2005, Hubert et al. 2008, Ward et al. 2008a, April et al. 2011, Mabragaña et al. 2011, Costa et al. 2012, Zhang and Hanner 2012, Keskin and Atar 2013, McCusker et al. 2013, Geiger et al. 2014, Knebelsberger and Thiel 2014, Knebelsberger et al. 2015). DNA barcodes have also been successfully used to identify fish larvae (Pegg et al. 2006, Victor et al. 2009, Hubert et al. 2010, Kim et al. 2010) and fins, and can provide evidence for cryptic diversity (Hubert et al. 2010, Ward et al. 2008b, Puckridge et al. 2013, Geiger et al. 2014, Knebelsberger et al. 2015). For the North Sea and adjacent areas, two DNA barcoding studies revealed successful differentiation of all investigated species Thiel 2014). Altogether, 105 species belonging to 88 genera were analysed. Most of the genera were represented by only one species. As an exception, the genus Pomatoschistus was represented by five closely related species.
One of these studies already provided DNA barcodes for the two sand lance species A. marinus and H. immaculatus and demonstrated a clear separation of these two species. A former study from continental Portugal Atlantic waters included DNA barcodes for H. lanceolatus but other species of sand lances were missing (Costa et al. 2012). Studies including congeneric species of the genus Ammodytes revealed inconsistencies between morphological and DNA barcode-based identification: for two species from the northwest Atlantic Ocean, namely A. americanus and A. dubius, barcoding fails to separate these species, which may be caused by inadequate taxonomy (McCusker et al. 2013). Inadequate taxonomy may also concern the two species A. personatus and A. hexapterus from the north Pacific (Turanov and Kartavtsev 2014). In both cases the taxonomic status of the species is questionable and may require comprehensive taxonomic revision. In order to examine the application of DNA barcoding for the identification of sand lances from the North Sea area, all closely related species from this region must be included. This concerns A. marinus and H. immaculatus as well as A. tobianus and H. lanceolatus. For the latter two species reliable COI data from the North Sea are still missing.
This paper presents the first comprehensive study combining morphological and molecular methods for the discrimination of four species of sand lances belonging to the genera Ammodytes and Hyperoplus occurring in the northeast Atlantic Ocean and adjacent waters. The suitability of two morphological types of parameters (meristic characters and morphometric measurements) and two genetic approaches (mitochondrial COI (DNA barcoding region) and partial nuclear Rhodopsin DNA sequences) for accurate species identification is examined. A detailed and accurate species identification matrix is presented, based on the integration of morphological and molecular traits.

Material
In this study 85 specimens representing two species of genus Ammodytes and two species of genus Hyperoplus were sampled from the North and the Baltic Seas (Suppl. material 1 and 2, Figure 1). For the molecular analysis 70 samples were collected from the North Sea during several cruises conducted by the Thünen Institute of Sea Fisheries (Hamburg, Germany) and the research vessel of the Senckenberg Institute (Wilhelmshaven, Germany). Tissue samples were taken from each of the 70 specimens and preserved in 96% ethanol for molecular analysis at Senckenberg's German Center for Marine Biodiversity Research (DZMB, Wilhelmshaven, Germany). Specimens were preserved in 70% ethanol. The remaining 15 individuals belonging to the species A. tobianus were used for morphological analyses only and collected from the Baltic Sea during three different cruises conducted by the German Oceanographic Museum (Stralsund, Germany). Immediately after catch, specimens were preserved in 4% formaldehyde solution. All 85 voucher specimens were databased and morphologically investigated at the Zoological Museum of the Center of Natural History of the University of Hamburg (ZMH, Hamburg, Germany). Finally, the material was stored for future reference in the ZMH fish collection. All COI sequences and related metadata belonging to the 70 voucher specimens from the North Sea are available on the Barcode of Life Data System (www.barcodinglife.org; Ratnasingham and Hebert 2007). DNA barcodes of eight specimens of H. immaculatus and 22 specimens of A. marinus were obtained from the BOLD project "Barcoding North Sea Fish I" (BNSFI). Newly generated barcodes belonging to five specimens of A. marinus, six of A. tobianus, and 29 of H. lanceolatus were uploaded to the BOLD project "Barcoding North Sea Sand eels" (BNSSE). In addition to COI, nuclear Rhodopsin DNA sequences were generated from all 70 North Sea specimens (Suppl. material 1).
For comparison, published Rhodopsin data was downloaded from GenBank (A. tobianus: AY141306; H. lanceolatus: EU492010 and EU492011).

Morphological analyses
Meristic parameters (Table 1) were analysed on the left-hand side of the specimens and supplemented with right-side counts when the left side was damaged. Counts of dorsal, ventral and principal caudal-fin rays as well as of vertebrae were taken from radiographs (Figure 2) made by an X-ray imaging system (Faxitron LX-60). The first caudal vertebra was defined as the first centrum with a long haemal spine, and the centrum fused to the hypural plate was counted as the last vertebra. Counts of dorsal-fin rays were made using the method of Nizinski et al. (1990). Counting dorsal-fin rays began with the first visible ray and excluded the one or two anterior rayless pterygiophores. However, these counts included the last two rays that were each supported by a pterygiophore. Counts of anal-fin rays included all rays visible from the outside. Gill rakers were counted on the lower and upper arch separately. Gill rakers of the lower arch included the raker at the junction between upper and lower parts of the arch. Dermal plicae included those anterior and posterior to the lateral-line pores.
Morphometric measurements (Table 2) were taken by vernier calipers to one tenth of a millimetre. Measurements were done following Hubbs and Lagler's (1958) method, with the following changes: standard length (SL) was measured from the front of the upper lip in the median plane to the midbase of the caudal fin (end of hypural plate). The front of the upper lip was used as the anterior point of all other horizontal measurements. Head length (HL) was measured from the front of the upper lip to the posterior end of the opercular membrane. Body depth was measured twice, as the depth at the beginning of the base of the dorsal fin (BDD) and as the depth at the beginning of the base of the anal fin (BDA). Body width was measured as the maximum width at the beginning of the base of the dorsal fin (BWD). Orbit diameter (OD) is the maximum fleshy diameter. Interorbital width (IW) is the least fleshy width. Caudal-peduncle depth (CPD) is the smallest depth, and caudal peduncle length (CPL) the horizontal
Table 1. Data of estimated morphological characters of four species of sand lances of the genera Ammodytes and Hyperoplus. If possible, each meristic character and morphometric measurement is presented with its range (before the semicolon), mean value with standard deviation and the number of specimens analysed (in brackets). Morphometric measurements are given as proportion of SL.

Statistical treatment of morphological data
All morphological data were statistically processed, involving ranges, means, and standard deviations. Morphological data of all specimens that had a complete suite of meristic and morphometric character data were used to conduct two multiple discriminant function analyses (DFA) to determine if the four species of sand lances could be differentiated based on meristic and/or morphometric parameters using XLSTAT (version 2013.0.04, Addinsoft), a statistical analysis add-in for Microsoft Excel®. DFA was used to demonstrate the degree of separation in multivariate space defined by the main patterns of morphological variation among species, which are described via the discriminant functions. It also shows which characters contribute most to the differentiation. The standardised discriminant function coefficients represent the contributions of every variable to the discriminatory power of the function. Hence, the larger the standardised coefficient, the larger the weight of the variable in the function. Both discriminant analyses were conducted for 76 individuals (22 A. marinus, 20 A. tobianus, 8 H. immaculatus, 26 H. lanceolatus). Morphological variables without any variation (e.g. principal caudal-fin rays (CR)), variables that are composites of other variables (e.g. total vertebrae (TV)), and qualitative variables (e.g. premaxillae clearly protrusible (PCP)) were excluded from the DFA procedures. The first DFA was performed for the following eight quantitative meristic characters: dermal plicae, dorsal-fin rays, anal-fin rays, pectoral-fin rays, upper arch gill rakers, lower arch gill rakers, precaudal vertebrae, and caudal vertebrae.
The second DFA was conducted for the following 19 morphometric parameters: body depth at dorsal-fin origin, body depth at anal-fin origin, body width at dorsal-fin origin, head length, snout length, orbit diameter, interorbital width, upper jaw length, caudal peduncle depth, caudal peduncle length, prepectoral length, predorsal length, preanal length, pectoral-fin length, dorsal-fin base length, anal-fin base length, caudal-fin length, dorsal-fin height, and anal-fin height.
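The idea of a discriminant function as a weighted sum of characters can be illustrated with a deliberately reduced sketch. It is not the multi-group XLSTAT procedure used in this study: it handles only two groups and two variables (Fisher's linear discriminant), the weights are unstandardised, and the function names and all numeric values are hypothetical placeholders rather than data from Table 1.

```python
from statistics import mean


def fisher_weights(group_a, group_b):
    """Two-group, two-variable Fisher discriminant: w = Sw^-1 (mean_a - mean_b)."""

    def centroid(rows):
        return mean(r[0] for r in rows), mean(r[1] for r in rows)

    def scatter(rows, c):
        # Within-group sums of squares and cross-products around the centroid.
        sxx = sum((x - c[0]) ** 2 for x, _ in rows)
        sxy = sum((x - c[0]) * (y - c[1]) for x, y in rows)
        syy = sum((y - c[1]) ** 2 for _, y in rows)
        return sxx, sxy, syy

    ca, cb = centroid(group_a), centroid(group_b)
    sa, sb = scatter(group_a, ca), scatter(group_b, cb)
    sxx, sxy, syy = (sa[i] + sb[i] for i in range(3))  # pooled scatter matrix
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = ca[0] - cb[0], ca[1] - cb[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] applied to the centroid difference.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def score(weights, row):
    """Project one specimen onto the discriminant function."""
    return weights[0] * row[0] + weights[1] * row[1]


# Hypothetical (variable 1, variable 2) values for two groups; illustration only.
group_a = [(10, 30), (11, 28), (12, 31)]
group_b = [(20, 29), (21, 31), (22, 30)]
w = fisher_weights(group_a, group_b)
scores_a = [score(w, r) for r in group_a]
scores_b = [score(w, r) for r in group_b]
```

In this toy case the two groups do not overlap along the discriminant axis, and the first variable receives the larger absolute weight because it carries most of the between-group difference.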

Sequence alignment and data analyses
Forward and reverse sequences of COI and Rhodopsin were assembled and edited using Geneious (version 7.1.9, http://www.geneious.com). Consensus sequences were submitted to GenBank (for accession numbers see Suppl. material 1). Variance in sequence length, base composition, number of invariable sites and the presence of stop codons were analysed using Geneious. The nc and mt sequences were aligned independently using MUSCLE (Edgar 2004) with default settings as implemented in MEGA version 6.06 (Tamura et al. 2013). Primer sequences were cut from the alignment. The Rhodopsin gene sequence alignment was checked by eye for species-specific diagnostic characters. For COI, Kimura-2-parameter (K2P) distances were calculated in MEGA, as K2P is used as the standard model for barcoding analyses and enables direct comparison with other studies. A Neighbour-Joining (NJ) topology (Saitou and Nei 1987) was built in MEGA using the "pairwise deletion" option for the treatment of gaps and missing data, so that all sites are retained initially and excluded only from the pairwise comparisons in which they are missing. Node support for the NJ topology was evaluated by a non-parametric bootstrap analysis (Felsenstein 1985) with 10,000 replicates. In order to quantify the distinctness between species at the barcode locus, genetic distances were used to calculate the difference between the maximum intraspecific genetic distance and the minimum distance to the nearest neighbour (barcode gap). For the calculation of genetic distances at genus and family level, we used BOLD's "Distance Summary" tool, choosing the K2P distance model and the MUSCLE (Edgar 2004) alignment algorithm. On BOLD, DNA barcodes were automatically assigned to operational taxonomic units (OTUs), generated through Refined Single Linkage (RESL) analyses (Ratnasingham and Hebert 2013). Finally, a unique alphanumeric code is assigned to each of the OTUs, constituting the so-called barcode index number (BIN).
It has been shown that BINs are highly congruent with existing species assignments (Ratnasingham and Hebert 2013). Here, the 'BIN Discordance Report' analysis tool was applied to analyse our dataset together with public sequences on BOLD, and to get hints on cryptic diversity (species) or to identify cases of haplotype sharing between species. Furthermore, BOLD's "Diagnostic Characters" sequence analysis tool was applied to the COI dataset choosing MUSCLE (Edgar 2004) alignment algorithm. Sequences were grouped by species names in order to categorise consensus bases by their diagnostic potential.

General results of morphological analysis
Meristic characters and morphometric measurements of the four examined species of sand lances are given in Table 1. The number of individuals per analysed character ranged from 24 to 27 for A. marinus, from 20 to 21 for A. tobianus, from 28 to 29 for H. lanceolatus, and was eight individuals for H. immaculatus (Table 1).
The data of the present study confirmed that the two genera Ammodytes and Hyperoplus can be distinguished by qualitative meristic characters, i.e. by having clearly protrusible premaxillae (PCP) and no vomerine teeth (VTP) (Ammodytes) or no clearly protrusible premaxillae and two vomerine teeth (Hyperoplus) (Table 1). Hyperoplus can also be separated from Ammodytes by its significantly higher number of dermal plicae (DP). It is also possible to distinguish Hyperoplus from Ammodytes by its clearly lower pectoral-fin length (PFL) and, less markedly, by its greater mean snout length (SNL), since no sexual dimorphism has been reported for the last two characters in both genera.
Hyperoplus lanceolatus can be separated from H. immaculatus by the presence of a conspicuous dark spot on either side of snout (DSSS) which is lacking in H. immaculatus (Table 1). Furthermore, H. lanceolatus differs from H. immaculatus by its lower numbers of total and lower arch gill rakers (GR, LR), total and caudal vertebrae (TV, CV), as well as dorsal and anal-fin rays (DR, AR).
Ammodytes tobianus can be distinguished from A. marinus by having belly scales that are organised in tight chevrons (BSTC) and having scales present over the musculature at the base of the caudal fin (SBCF) and in the midline anterior to dorsal fin (SADF), whereas these characters are not present in A. marinus (Table 1). Ammodytes tobianus differs from A. marinus also by its lower numbers of dermal plicae (DP), dorsal-fin rays (DR), and precaudal and total vertebrae numbers (PV, TV).

Discriminant Function Analysis with meristic characters
DFA based on meristic characters provided three significant functions (Box-Test with χ² = 790.916 and p < 0.0001; Wilks' lambda = 0.0003 and p < 0.0001). These three functions explain 100% of the total variation in the data. The first two functions explain 91.755% of the total variation in the data (Table 2), which is sufficient for the further detailed analysis. The third discriminant function explains 8.245% of total variation. Individual specimens are projected onto the first two discriminant functions in Figure 3. Because all four species were clearly separated in the discriminant space defined by the first two functions, the third function was not used. The first discriminant function explains 71.438% of total variation (Figure 3). Ammodytes tobianus and A. marinus cannot be so clearly separated by the first discriminant function.
From the standardised coefficients (Table 2), the two characters that have the greatest influence on the first discriminant function (characters most discriminatory) are the dermal plicae (DP) and lower arch gill rakers (LR). The number of lower arch gill rakers (LR) is higher in H. immaculatus in comparison with the other three species, which have overlapping numbers of LR. The second discriminant function accounts for 20.317% of total variation. Ammodytes tobianus and A. marinus are clearly discriminated by this function, whereas the species pairs A. tobianus and H. immaculatus, A. marinus and H. lanceolatus, as well as H. immaculatus and H. lanceolatus are discriminated to a lesser extent. Ammodytes tobianus and H. lanceolatus, and A. marinus and H. immaculatus, cannot be clearly separated by the second discriminant function. The contrasts between the numbers of dorsal-fin rays (DR) and the numbers of precaudal vertebrae (PV) of the species are mainly responsible for this discrimination. DR is lowest in A. tobianus and highest in H. immaculatus (Table 1). PV is lowest in A. tobianus and highest in A. marinus and H. immaculatus.

Discriminant Function Analysis with morphometric measurements
Three significant DFA functions were estimated based on morphometric measurements (Box-Test with χ² = 944.979 and p < 0.0001; Wilks' lambda = 0.003 and p < 0.0001). Together these functions explain 100% of the total variation in the data. The first two functions explain 93.144% of the total variation in the data (Table 3), which is sufficient for the further detailed analysis. The third discriminant function explains 6.856% of total variation. Figure 4 presents the individual specimens projected onto the first two discriminant functions. Because all four species were clearly separated in the discriminant space defined by the first two functions, the third function was not used. The first discriminant function explains 78.576% of total variation (Figure 4). The species pairs of A. marinus and A. tobianus as well as of H. immaculatus and H. lanceolatus cannot be separated by the first discriminant function.
The two measurement characters that have the greatest weight on the first discriminant function are pectoral-fin length (PFL) and the snout length (SNL) (Table 3). Both species of the genus Ammodytes have a greater PFL than both species of the genus Hyperoplus (Table 1). In contrast, both Hyperoplus species have a greater SNL than both Ammodytes species. PFL and SNL are relatively similar for the species of the same genus. The second discriminant function accounts for 14.568% of total variation. In particular, the species within each of the genera Ammodytes and Hyperoplus, namely A. tobianus and A. marinus as well as H. immaculatus and H. lanceolatus, are separated by this function (Figure 4).
Upper jaw length (UJL) and caudal peduncle depth (CPD) are the two measurements, for which no sexual dimorphism is known, and that have the greatest weight on the second discriminant function (Table 3).

Mt DNA barcoding
Mitochondrial DNA barcodes were obtained for 70 specimens belonging to four species of the family Ammodytidae investigated in this study (Suppl. material 1). The DNA sequences did not show any ambiguous base calls (Ns) or stop codons, and no insertions or deletions were found within the sequence alignment. Sequence length ranged from 619 to 652 bp (mean and standard deviation: 650.5 ± 5.7 bp). The average base composition was 22.8% adenine (A), 29.7% cytosine (C), 18.3% guanine (G) and 29.3% thymine (T); GC content was 48%. The sequence alignment showed 588 identical sites. The NJ analysis of the K2P distances revealed well supported monophyletic clusters for the species A. marinus and H. immaculatus with bootstrap values of 97 and 100, respectively (Figure 5). In contrast, A. tobianus and H. lanceolatus sequences were grouped together in one monophyletic cluster with a bootstrap support of 100. Within this cluster the sequences of A. tobianus were grouped together without bootstrap support, indicating that there is no sharing of haplotypes between these two species. The analysis of the K2P genetic distances revealed an overlap between intraspecific (range: 0.0-0.77%; mean and standard deviation: 0.22 ± 0.17%) and interspecific distances (0.15-7.27%; 4.73 ± 1.7%). The overlap was caused by the two species A. tobianus and H. lanceolatus: in A. tobianus, the minimum distance to the nearest neighbour species was even lower than the maximum intraspecific distance, whereas both values were equal in H. lanceolatus (Table 4). In contrast, the species A. marinus and H. immaculatus exhibited barcode gaps of 2.73% and 3.34% respectively, which indicates an unambiguous separation from the other species. At genus and family level, the genetic distances between species of the same genus varied between 4.46 and 7.09%, and the distances between species belonging to different genera ranged from 0.15 to 7.27%.
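The barcode gap used above (minimum distance to the nearest neighbour species minus maximum intraspecific distance) can be computed from any pairwise distance matrix. The sketch below is one way to do this bookkeeping; the function name, the input layout (one species label per specimen plus a symmetric matrix) and the toy distances are hypothetical and are not the values of Table 4. It assumes at least two species are present.

```python
def barcode_gaps(labels, dist):
    """Return {species: (max_intra, min_nearest_neighbour, gap)}.

    labels: species name for each specimen, in matrix order.
    dist:   symmetric matrix (list of lists) of pairwise distances in %.
    A gap <= 0 means intra- and interspecific distances overlap or touch.
    """
    n = len(labels)
    result = {}
    for species in sorted(set(labels)):
        intra = [dist[i][j] for i in range(n) for j in range(i + 1, n)
                 if labels[i] == species and labels[j] == species]
        inter = [dist[i][j] for i in range(n) for j in range(n)
                 if labels[i] == species and labels[j] != species]
        max_intra = max(intra, default=0.0)  # singletons have no intra distance
        min_inter = min(inter)
        result[species] = (max_intra, min_inter, min_inter - max_intra)
    return result


# Hypothetical toy matrix for four specimens of two species (illustration only).
toy_labels = ["species A", "species A", "species B", "species B"]
toy_dist = [
    [0.0, 0.5, 3.0, 3.2],
    [0.5, 0.0, 3.1, 3.3],
    [3.0, 3.1, 0.0, 0.2],
    [3.2, 3.3, 0.2, 0.0],
]
toy_gaps = barcode_gaps(toy_labels, toy_dist)
```

With this definition, a species whose nearest-neighbour distance is below its maximum intraspecific distance, as reported here for A. tobianus, receives a negative gap.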

BIN report
The BIN discordance report tool on BOLD assigned three different BIN numbers to the 70 COI haplotypes. The BIN BOLD:ACF3320 was found to be "concordant" and exclusively comprised 32 specimens of the species Ammodytes marinus, of which five individuals were not provided by this study. The "discordant" BIN BOLD:AAC5676 comprised 57 specimens, 14 identified as Ammodytes tobianus and 43 as Hyperoplus lanceolatus. From the former species eight specimens and from the latter 14 specimens were not provided by our study but also support the findings of this study. The third BIN BOLD:AAJ2299 was also specified as discordant and comprised ten specimens, eight (in our study) identified as Hyperoplus immaculatus and two identified as Ammodytes marinus. The two A. marinus entries may represent cases of misidentification as 32 A. marinus individuals were grouped together in BIN BOLD:ACF3320.
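The concordant/discordant labelling can be mimicked with a simple tally of species names per BIN. This is only a sketch of the bookkeeping and not BOLD's own report tool; the function name and record layout are assumptions, while the BIN identifiers and specimen counts are those given in the paragraph above.

```python
from collections import Counter, defaultdict


def bin_report(records):
    """Tally species per BIN; a BIN holding more than one species name is discordant."""
    by_bin = defaultdict(Counter)
    for bin_id, species in records:
        by_bin[bin_id][species] += 1
    return {
        bin_id: ("concordant" if len(counts) == 1 else "discordant", dict(counts))
        for bin_id, counts in by_bin.items()
    }


# Specimen counts per BIN as reported in the text (own and public BOLD records).
records = (
    [("BOLD:ACF3320", "Ammodytes marinus")] * 32
    + [("BOLD:AAC5676", "Ammodytes tobianus")] * 14
    + [("BOLD:AAC5676", "Hyperoplus lanceolatus")] * 43
    + [("BOLD:AAJ2299", "Hyperoplus immaculatus")] * 8
    + [("BOLD:AAJ2299", "Ammodytes marinus")] * 2
)
report = bin_report(records)
```

The tally flags BOLD:AAC5676 and BOLD:AAJ2299 as discordant, but it cannot tell haplotype sharing from misidentification; that judgement rests on the additional evidence discussed in the text.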

Diagnostic characters
The analysis revealed four diagnostic characters for the species A. marinus and 16 for H. immaculatus (results not shown). The two species A. tobianus and H. lanceolatus did not show any diagnostic characters on species level. Consequently, only two of the four investigated species can be identified using diagnostic characters on the basis of COI barcode sequences.
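A strict reading of "diagnostic character" is an alignment position at which all sequences of one species share a base that occurs in no sequence of any other species. The sketch below applies that definition; it is an assumption made for illustration, since BOLD's "Diagnostic Characters" tool sorts consensus bases into finer categories, and the function name and toy sequences are hypothetical.

```python
def diagnostic_sites(alignment):
    """Return {species: [(1-based position, base), ...]} of strictly diagnostic sites.

    alignment: {species: [aligned sequences of equal length]}.
    A site is diagnostic for a species if that species is fixed for one base
    and no sequence of any other species carries that base at the same site.
    """
    length = len(next(iter(alignment.values()))[0])
    result = {}
    for species, seqs in alignment.items():
        sites = []
        for pos in range(length):
            own = {s[pos] for s in seqs}
            if len(own) != 1:
                continue  # polymorphic within the species
            others = {s[pos] for other, other_seqs in alignment.items()
                      if other != species for s in other_seqs}
            if own.isdisjoint(others):
                sites.append((pos + 1, next(iter(own))))
        result[species] = sites
    return result


# Hypothetical four-base alignment for three species (illustration only).
toy_alignment = {
    "sp1": ["ACGT", "ACGT"],
    "sp2": ["ATGA", "ATGA"],
    "sp3": ["ACGA", "ACGA"],
}
toy_sites = diagnostic_sites(toy_alignment)
```

In the toy alignment sp1 and sp2 each have one diagnostic site, whereas sp3 has none and can only be recognised by its combination of bases at the two variable positions.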

Nc DNA analysis
The nc Rhodopsin sequence alignment showed a length of 464 bp after primer trimming. The complete fragment could be amplified and sequenced for all 70 specimens used for the mt DNA barcode analysis. The number of variable sites was very low and the alignment could be easily evaluated by eye. One diagnostic character was found for each of the species A. marinus, H. lanceolatus, and H. immaculatus (Table 5). Ammodytes tobianus showed no species-specific mutation but could be characterised by a combination of all three variable sites (Table 5, underlined bases). The Rhodopsin sequences from GenBank were compared with our sequences; the two H. lanceolatus sequences (GenBank accessions: EU492010, EU492011) showed concordant results. In the case of the A. tobianus sequence (GenBank accession: AY141306) no data were available for site 460 but the two other sites were in agreement with our results.

Identification of genera and species using morphological characters
The primary objective of this study was to contribute to robust genus- and species-level identifications, combining morphological and molecular methods, of four closely related species of sand lances of the genera Ammodytes and Hyperoplus occurring in the northeast Atlantic Ocean and adjacent waters. The detailed morphological analyses confirmed findings described by other authors (e.g. Duncker and Mohr 1939, Reay 1986): the genus Ammodytes can be distinguished by two morphological characters from the genus Hyperoplus. Ammodytes has clearly protrusible premaxillae and no vomerine teeth. In contrast, Hyperoplus has no clearly protrusible premaxillae and a pair of vomerine teeth. It should be noted here that Kayser (1961) found that Hyperoplus has no real vomerine teeth, but anterior hooked ends of the prevomer instead.
Subsequently, Ida et al. (1994) pointed out that the tip of the prevomer in Ammodytes is straight, not protruded from the roof of the mouth, whereas in Hyperoplus the tip of the prevomer curved downwards, protruding from the roof of the mouth. According to Wiecaszek et al. (2007), the genus Ammodytes also has a longer lower jaw when compared to the length of pectoral-fin, while this relationship is reversed in Hyperoplus.
This study adds three more characters helpful in distinguishing between both genera of sand lances based on the four species considered. Firstly, the number of dermal plicae is significantly higher in Hyperoplus compared to Ammodytes. Secondly, Hyperoplus has a lower pectoral-fin length in relation to standard length (SL) than Ammodytes. Goltberg (1910) also reported a lower value of pectoral-fin length expressed as a proportion of head length for H. lanceolatus than for A. tobianus. Thirdly, Hyperoplus has a larger mean snout length in % SL than Ammodytes. However, the last mentioned character is less recommended for practical taxonomical assignments, since its ranges overlap between the genera to a relatively large extent. Therefore, a combination of the following four characters remains, which seems to be useful to distinguish between the genera Hyperoplus and Ammodytes: protrusibility of premaxillae, presence of the hooked ends of prevomer, number of dermal plicae, and pectoral-fin length in % SL.
As indicated by the results of the discriminant function analysis, morphometric measurements do not appear to be characters of first choice for distinguishing the two species within each of the two genera, since the species could not be discriminated by the first discriminant function.
According to the results presented here, six meristic characters (the number of lower arch gill rakers, the total number of gill rakers, the number of caudal vertebrae, the number of total vertebrae, and the number of dorsal-fin and anal-fin rays) are more useful than morphometric measurements to distinguish between H. immaculatus and H. lanceolatus. The use of these additional characters would support and refine the current methods to separate H. lanceolatus from H. immaculatus. Searching only for the occurrence of a conspicuous dark spot on either side of the snout below the anterior nostril could be unsuccessful in the case of preserved specimens.
In the case of A. tobianus and A. marinus, these results support the information on useful distinguishing characters between both species reported for instance by Reay (1986): A. tobianus differs from A. marinus by its belly scales that are organised in tight chevrons, scales which are present over the musculature at the base of the caudal fin, as well as by lower numbers of dermal plicae, dorsal-fin rays and vertebrae. It should be mentioned that our analyses also included A. tobianus from the Baltic Sea, for which no meristic or morphometric data had been published, except for the number of vertebrae and pectoral-fin length (Wiecaszek et al. 2007).

Discrimination of genera and species based on molecular data
The successful discrimination of the two sand lance species A. marinus and H. immaculatus by DNA barcoding has already been demonstrated in earlier work and could be confirmed by the present study. An additional three specimens of A. marinus were added to the dataset and the NJ analysis revealed well-supported monophyletic species clusters for A. marinus and H. immaculatus, indicating an unambiguous separation of these two species. Successful species discrimination can also be demonstrated by the presence of gaps between intra- and interspecific genetic distances (Meyer and Paulay 2005), which in the case of A. marinus and H. immaculatus were 2.73 and 3.34, respectively. The BIN analysis performed on BOLD revealed two separate species BINs: one concordant BIN exclusively contained sequences which were taxonomically annotated as A. marinus, and a second discordant BIN contained all specimens of H. lanceolatus and two further entries annotated as A. marinus. These two individuals were provided by other sources and may represent cases of misidentification, as all other A. marinus entries appeared in the concordant BIN.
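The idea of a gap between intra- and interspecific distances can be sketched in a few lines of Python. This is an illustration only and not the procedure used in this study. It assumes that the gap for a species is the smallest distance to any specimen of another species minus the largest distance among its own specimens, so that a positive value indicates a gap. The function name `barcode_gap`, the labels and the distance values are hypothetical.

```python
def barcode_gap(labels, dist, species):
    """Return min interspecific minus max intraspecific distance for one species.

    labels:  list of species labels, one per specimen
    dist:    symmetric pairwise distance matrix (list of lists), same order
    species: label of the species to evaluate
    Requires at least one specimen of another species in the dataset.
    """
    intra, inter = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == species and labels[j] == species:
                intra.append(dist[i][j])
            elif species in (labels[i], labels[j]):
                inter.append(dist[i][j])
    # A species represented by a single specimen has no intraspecific distance
    max_intra = max(intra, default=0.0)
    return min(inter) - max_intra


# Hypothetical distances (in %) for two specimens each of species "X" and "Y"
labels = ["X", "X", "Y", "Y"]
dist = [
    [0.0, 0.5, 3.0, 3.5],
    [0.5, 0.0, 4.0, 3.5],
    [3.0, 4.0, 0.0, 1.0],
    [3.5, 3.5, 1.0, 0.0],
]
print(barcode_gap(labels, dist, "X"))
print(barcode_gap(labels, dist, "Y"))
```

A value at or below zero would indicate overlapping intra- and interspecific distances, which is the situation in which distance-based barcode identification becomes ambiguous.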
Surprisingly, the two species A. tobianus and H. lanceolatus, which belong to different genera, cannot be clearly separated on the basis of genetic distances, as the lowest distance (K2P) between these two species was only 0.15% and within-species variation was found to be 0.15% and 0.62%, respectively. In the NJ dendrogram both species appeared together in a well-supported clade and were also found within the same BIN cluster when analysed together with data on BOLD. However, A. tobianus and H. lanceolatus do not show haplotype sharing, as A. tobianus sequences appeared together in a separate cluster. The two species may therefore be separated by applying tree-based approaches like GMYC or model-based ones like ABGD.
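For readers unfamiliar with the K2P (Kimura 2-parameter) distance, the following Python sketch shows how it is computed for a single pair of aligned sequences, using the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It is a simplified illustration and not the software used in this study. The function name `k2p_distance`, the choice to skip sites with gaps or ambiguous bases, and the toy sequences are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch). Raises ValueError if the sequences
    differ in length, share no comparable sites, or are too divergent for
    the logarithm to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical 10 bp sequences differing by one transition (A -> G)
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(f"{100 * d:.2f}%")
```

The corrected distance is slightly larger than the raw proportion of differing sites, because the model accounts for multiple substitutions at the same site.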
In contrast to the barcoding results, the genera Ammodytes and Hyperoplus can undoubtedly be separated by morphological character traits as discussed above. DNA barcoding failure between closely related congeneric species is usually more common than between species belonging to different genera (e.g. McCusker et al. 2013, Knebelsberger et al. 2015). For congeneric species of the genus Ammodytes, inconsistencies between morphological data and DNA barcodes have already been demonstrated. For instance, A. americanus DeKay, 1842 and A. dubius Reinhardt, 1837 from the northwest Atlantic Ocean could not be separated by DNA barcoding, possibly because of inadequate taxonomy (McCusker et al. 2013), which may also concern the two species A. personatus Girard, 1856 and A. hexapterus Pallas, 1814 from the north Pacific (Turanov and Kartavtsev 2014).
In the present work, inadequate taxonomy, erroneous species designation or identification error can be excluded as possible explanations for the failure of DNA barcoding to separate A. tobianus unambiguously from H. lanceolatus. In addition, true biological phenomena such as hybridisation or incomplete lineage sorting seem unlikely, as no interspecific haplotype sharing was found. In cases where mitochondrial COI sequences fail to distinguish between species, nuclear DNA markers may be tested as an alternative. In fish, the nuclear Rhodopsin gene has already been proposed as a supplementary marker for species identification (Sevilla et al. 2007). However, most studies demonstrated reduced species discrimination success using nuclear Rhodopsin sequences compared to COI barcodes (Collins et al. 2012, Behrens-Chapuis et al. 2015). In our study, the analysis of a short nuclear Rhodopsin gene fragment revealed diagnostic nucleotides for the species Ammodytes marinus, H. lanceolatus and H. immaculatus. The species Ammodytes tobianus can be characterised by the lack of species-specific mutations compared to the other three species. Consequently, all four species of sand lances can be identified using the diagnostic character approach in combination with nuclear Rhodopsin sequences. In contrast, COI provided diagnostic characters only for the two species A. marinus and H. immaculatus. Ammodytes tobianus and H. lanceolatus cannot be characterised by this approach.
Our study clearly demonstrated that nuclear Rhodopsin constitutes a preferable alternative marker to discriminate successfully between the four investigated species of sand lances.
Finally, it should be pointed out that the present results are not meant to provide a phylogenetic reconstruction with regard to the genera Ammodytes and Hyperoplus, since the latter requires a more detailed study of more species of both genera, as well as other members of the group. However, accurate identification of these sand lance species is the basis to assess the status of their stocks and to implement appropriate measures of fisheries management or conservation, and as such, the aim of successfully identifying the NE Atlantic species has been accomplished.

Conclusion
With this study, a robust genus- and species-level discrimination of the four most abundant and closely related species of sand lances of the genera Ammodytes and Hyperoplus in the NE Atlantic Ocean and adjacent waters has been provided. It is expected that these results will facilitate the accurate identification of A. marinus, A. tobianus, H. immaculatus, and H. lanceolatus by combining morphological and molecular methods.

Table S1
Authors: Ralf Thiel, Thomas Knebelsberger Data type: specimen data Explanation note: Supplementary metadata for specimens used for both morphological and genetic analyses; Museum and Sample IDs are specimen identifiers, BOLD Process IDs are unique codes automatically generated for each record on BOLD, GenBank Accession NOs represent sequence identifiers. Copyright notice: This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.

Table S2
Authors: Ralf Thiel, Thomas Knebelsberger Data type: specimen data Explanation note: Museum IDs and collection data for specimens of Ammodytes tobianus used for morphological analyses only. Copyright notice: This dataset is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). The Open Database License (ODbL) is a license agreement intended to allow users to freely share, modify, and use this Dataset while maintaining this same freedom for others, provided that the original source and author(s) are credited.