Reidentification of Decapterus macarellus and D. macrosoma (Carangidae) reveals inconsistencies with current morphological taxonomy in China

Abstract
Decapterus macarellus and D. macrosoma are economically important pelagic fish species that are widely distributed in tropical and subtropical seas. The two species are often misidentified because of their morphological similarity and the way they are described in the Chinese literature on fish identification. In this study, D. macarellus and D. macrosoma samples were collected in the Eastern Indian Ocean and the South China Sea and reidentified using morphological and DNA barcoding techniques. The characteristics that distinguish the two species primarily include the scute coverage of the straight portion of the lateral line (the most indicative characteristic for classification), the shape of the predorsal scaled area and its location relative to the middle axis of the eye, and the shapes of the posterior margin of the maxilla and the posterior margin of the operculum. The results revealed a large number of misidentified sequences among the homologous cytochrome c oxidase subunit I (COI) sequences of the two species in the NCBI database and suggested that the genus Decapterus may include cryptic species. In terms of genetic structure, the Sundaland has not blocked genetic exchange between D. macarellus populations in the South China Sea and the Eastern Indian Ocean, giving rise to a high level of genetic diversity. In this study, we corrected the Chinese classification standards for D. macarellus and D. macrosoma and the erroneous reference sequences in the NCBI database, thereby providing accurate reference points for the future exploration of cryptic species in the genus Decapterus.


Introduction
Fish species of the genus Decapterus in the family Carangidae are pelagic fish widely distributed in tropical and subtropical waters around the world and are generally of high economic value. Fishes of the genus Decapterus present one free finlet behind the second dorsal fin and the anal fin and varying degrees of scute coverage along the straight-line portion of the lateral line but no coverage along the curved portion of the lateral line. These characteristics make the fishes easily distinguishable from other species of the family Carangidae (Smith-Vaniz 1999). Currently, the genus Decapterus includes 11 species worldwide: D. akaadsi Abe, 1958; D. koheru (Hector, 1875); D. kurroides Bleeker, 1855; D. macarellus (Cuvier, 1833); D. macrosoma Bleeker, 1851; D. maruadsi (Temminck & Schlegel, 1843); D. muroadsi (Temminck & Schlegel, 1843); D. punctatus (Cuvier, 1829); D. russelli (Rüppell, 1830); D. tabl Berry, 1968; and D. smithvanizi Kimura, Katahira & Kuriiwa, 2013 (Kimura et al. 2013). Decapterus macrosoma (shortfin scad) and D. macarellus (mackerel scad) are morphologically similar and thus often confused with each other. In the Chinese literature on fish morphological classification, the morphological descriptions of D. macrosoma and D. macarellus are largely incorrect (Zhu et al. 1962, 1963, 1979, 1985; Cheng and Zheng 1987; Meng et al. 1995); for example, "D. macarellus shows a convex posterior end of maxilla, and the majority of the rear straight-line portion of the lateral line is covered with scutes" and "D. macrosoma shows a truncate posterior end of maxilla, and scutes cover the rear half of the straight-line portion of the lateral line". These descriptions contradict those from international studies, particularly those of type specimen morphology (Cuvier and Valenciennes 1833; Bleeker 1851; Nakabo 2013). Thus, in this study, samples of D. macarellus and D. macrosoma were collected from surveys of the fishery resources in the South China Sea and the Eastern Indian Ocean and were morphologically reidentified.
The mitochondrial cytochrome c oxidase subunit I (COI) gene fragment varies little within species but significantly between species; this fragment can be amplified via polymerase chain reaction (PCR) using universal primers and standardized experimental procedures and is thus employed for DNA barcoding, which has been widely accepted and utilized (Hebert et al. 2003) for identifying species (Li et al. 2019a; Xu et al. 2019), discovering new species and new records (Chao et al. 2019; Wu et al. 2020), identifying cryptic species (Cheng and Sha 2017; Delrieu-Trottin et al. 2018), identifying ichthyoplankton species (Hubert et al. 2015; Li et al. 2017), and detecting invasive species (Hernández-Triana et al. 2019), among other purposes. Therefore, in this study, we employed DNA barcoding to genetically compare D. macarellus and D. macrosoma and then aligned the sequences with homologous sequences retrieved from GenBank for further analysis. The barrier formed by the Sundaland has caused the differentiation of various fish species, e.g., Pampus chinensis (Euphrasen, 1788) (Li et al. 2019b), between the Indian and Pacific Oceans. The question of whether the geographical barrier formed by the Sundaland has also driven species differentiation in the genus Decapterus is addressed in this study based on the samples collected during surveys of the South China Sea and the Eastern Indian Ocean.
In summary, we aimed to reevaluate D. macarellus and D. macrosoma by combining morphological analysis with molecular genetics to discern the major diagnostic morphological characteristics and correct DNA barcoding for identification and to provide a timeline for the differentiation of the two species. The findings of this study can provide a scientific reference for the classification of fishes in China and the identification of Carangidae fishes and a theoretical basis for the protection, utilization, development and management of Decapterus species germplasm resources.

Sample collection
Decapterus macarellus and D. macrosoma samples were collected from the South China Sea (10°N, 110°30'E) and the Eastern Indian Ocean (2°N, 88°E) in July and October 2019, respectively (Fig. 1); both species were collected from the South China Sea with light purse seining, whereas D. macarellus samples were collected from the Eastern Indian Ocean using lightnet lifting. Morphological identification of all samples was conducted with reference to Nakabo (2013) and Yamada et al. (2009). From the samples, 24 individuals of D. macarellus (A1~A24) and 21 individuals of D. macrosoma (B1~B21) from the South China Sea, in addition to 24 individuals of D. macarellus from the Eastern Indian Ocean, were randomly selected; the dorsal muscle was excised from each and preserved in 95% alcohol for use in subsequent molecular genetic analysis.

Morphological analysis
Morphological measurements and descriptions of the fish samples followed the methods of Kimura and Suzuki (1981) and Xu and Huang (1983). The countable characteristics included spines and rays in the dorsal fin, rays in the pectoral fin, spines and rays in the pelvic fin, spines and rays in the anal fin, rays in the caudal fin, scutes, and vertebrae (counted from X-ray images). The measurable characteristics included body length and fork length, which were measured using a Vernier caliper with an accuracy of 0.1 mm. The major morphological diagnostic characteristics included the location on the top of the head reached by the scaled area, the distribution of scutes in the straight-line portion of the lateral line, the morphological characteristics of the scutes, the shape of the posterior margin of the maxilla, and the shape of the posterior margin of the operculum.

Molecular analysis
Genomic DNA was extracted from specimens of both Decapterus species with a Qiagen DNeasy Kit and stored at 4 °C. Using universal primers for the mitochondrial COI gene fragment (F2: 5'-TCGACTAATCATAAAGATATCGGCAC-3'; R2: 5'-ACTTCAGGGTGACCGAAGAATCAGAA-3') (Ward et al. 2005), the targeted fragment was amplified in a 25 μL PCR system consisting of 17.5 μL of ddH2O, 0.15 μL of Taq DNA polymerase, 2.5 μL of dNTPs (2 mM), 2 μL of 10× Taq buffer (with Mg2+), 1 μL each of the forward and reverse primers (2 mM), and 1 μL of the genomic DNA template. The following conditions were applied: 4 min of predenaturation at 94 °C, followed by 28 cycles of 94 °C for 45 s, 50 °C for 40 s, and 72 °C for 40 s, with a final extension at 72 °C for 10 min. A negative control was included to detect DNA contamination. The PCR products (3 μL) were analyzed using 1.5% agarose gel electrophoresis (U = 5 V/cm) and were later submitted to Personal Biotechnology Co., Ltd., for purification and bidirectional sequencing.
To ensure the accuracy of the DNA barcoding for the two Decapterus species, we retrieved all homologous COI gene sequences of the two species from GenBank (Table 1) to facilitate subsequent comparative analyses. All the obtained sequences were processed and aligned using DNASTAR software (Madison, WI, USA) to ensure consistency. Using Decapterus maruadsi and Trachurus japonicus as outgroups, a neighbor-joining (NJ) tree of all the sequences was constructed based on the Kimura two-parameter (K2P) model in MEGA 5.0 software (Tamura et al. 2011), and the genetic distances within and among groups were calculated. All the sequences were searched against the NCBI database using BLAST to validate the accuracy of the sequences of the two Decapterus species investigated in this study according to the following criteria: a pairwise sequence similarity ≥ 98% indicated the same species, a pairwise sequence similarity = 92~98% indicated the same genus, and a pairwise sequence similarity = 85~92% indicated the same family (Li et al. 2017). Due to a lack of fossil records for fishes from the genus Decapterus, it is impossible to precisely determine the timing of their differentiation. In this study, the divergence time of investigated fishes was estimated based on a nucleotide site divergence rate of 1.2% per million years (Bermingham et al. 1997).
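The similarity criteria above can be written as a small decision rule. The following is a minimal sketch rather than the procedure actually used in this study: the function name `classify_similarity` is hypothetical, and because the text gives the ranges as 92~98% and 85~92% without stating which end is inclusive, the lower bound of each range is assumed to be inclusive here.

```python
def classify_similarity(percent_identity: float) -> str:
    """Map a pairwise BLAST percent identity to the taxonomic level it supports.

    Thresholds follow the criteria quoted in the text (Li et al. 2017).
    Assumption: the lower bound of each range is inclusive.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 98.0:
        return "same species"
    if percent_identity >= 92.0:
        return "same genus"
    if percent_identity >= 85.0:
        return "same family"
    # Below 85% the criteria in the text make no statement.
    return "unresolved"


if __name__ == "__main__":
    # Illustrative values only, not results from this study.
    for identity in (99.4, 94.8, 88.0, 80.0):
        print(f"{identity:.1f}% -> {classify_similarity(identity)}")
```

Under this rule, a best hit below 98% but at or above 92% supports assignment to a genus without naming the species.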
To determine whether the Decapterus species from the two sides of the Sundaland have differentiated, we assessed the genetic diversity and genetic structure of D. macrosoma and D. macarellus based on the acquired COI sequences. Specifically, diversity parameters and unrooted minimum spanning tree (MST) data were analyzed using ARLEQUIN software (Excoffier et al. 2005); the MST was constructed with the MINSPNET algorithm with manual correction.
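For readers unfamiliar with the diversity parameters, the two standard definitions can be sketched as follows: haplotype diversity h = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences, and nucleotide diversity π as the mean number of pairwise differences per site. This is an illustrative sketch on aligned, gap-free sequences; the function names are hypothetical, and ARLEQUIN's own handling of gaps, missing data, and variance estimation is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # identical sequences are treated as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise nucleotide differences per site (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    # Toy alignment for illustration only, not data from this study.
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]
    print(f"h  = {haplotype_diversity(toy):.4f}")
    print(f"pi = {nucleotide_diversity(toy):.4f}")
```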
Combining the findings of previous studies (Zhu et al. 1962, 1985; Meng et al. 1995; Nakabo 2013) with observations of the morphological characteristics of the samples in this study, the major diagnostic characteristics of D. macarellus and D. macrosoma can be summarized as follows: (1) in D. macrosoma, the straight-line portion of the lateral line begins below rays 13~14 of the second dorsal fin, the majority of it (approximately the rear 3/4) is covered with scutes, and the scutes show no particular external characteristics; in contrast, in D. macarellus, the straight-line portion of the lateral line begins below rays 12~13 of the second dorsal fin, only the rear half is covered with scutes, and the highest scute is approximately half the eye diameter; (2) the predorsal scaled area of D. macrosoma does not reach the middle axis of the eye and presents an "m" shape, whereas the predorsal scaled area of D. macarellus reaches or extends past the middle axis of the eye, taking on a "∩" shape; (3) the posterior end of the maxilla of D. macrosoma is truncated, and the operculum has a straight posterior margin, whereas the posterior end of the maxilla of D. macarellus is convex and round, and the operculum has an oblique posterior margin.

Molecular analysis
The 652 bp COI gene fragments from both D. macarellus and D. macrosoma were amplified using the F2 and R2 primers, and D. macarellus exhibited a higher level of genetic diversity than D. macrosoma. The haplotype diversity (h) and the nucleotide diversity (π) were 0.862 ± 0.067 and 0.0037 ± 0.0023, respectively, for D. macarellus from the Eastern Indian Ocean; 0.797 ± 0.086 and 0.0030 ± 0.0019, respectively, for D. macarellus from the South China Sea; and 0.486 ± 0.124 and 0.0008 ± 0.0007, respectively, for D. macrosoma from the South China Sea. The MST constructed based on the COI sequences of the two fish species (Fig. 2) showed that the two species were distinct, with a significant mutation distance. However, the genetic structure of D. macarellus did not correspond to the geographical locations of the individuals in the South China Sea and the Eastern Indian Ocean. There were only two shared haplotypes, one of which was clearly an ancestral haplotype; all other haplotypes were unique to one sea or the other.
After annotating and aligning all the sequences retrieved from GenBank and obtained in this study, a 534 bp target fragment was obtained that hosted 142 mutation sites, including 24 single-nucleotide polymorphisms, 118 parsimony-informative sites, and no insertions/deletions. The A+T content was 51.7%, slightly higher than the G+C content, revealing an AT preference. The NJ tree was constructed using all studied sequences with D. maruadsi and T. japonicus as outgroups (Fig. 3). Eight groups were obtained, with genetic distance among groups ranging from 0.031 (between Groups 5 and 6) to 0.198 (between Groups 3 and 8) (Table 4) and genetic distance within groups of 0-0.009, consistent with the ten-fold rule between species and genera (Ward et al. 2005), which confirmed that each group is a valid species. After realignment, we found that Group 1 corresponded to D. macrosoma, Group 2 to Decapterus sp. 2, Group 3 to Decapterus sp. 1, Group 4 to D. macarellus, Group 5 to D. russelli, Group 6 to D. maruadsi, Group 7 to T. japonicus, and Group 8 to Selar crumenophthalmus, indicating that most of the barcoding of D. macarellus and D. macrosoma was correct. Notably, for Groups 2 and 3, the highest similarity of the alignment with sequences from the GenBank database was below 95%, which enabled us to assign the species to the genus Decapterus but not to identify the species.
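The genetic distances reported above are K2P distances. As a reminder of what that model computes, the standard formula is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The sketch below applies this formula to one pair of aligned sequences. It is illustrative only: the function name `k2p_distance` is hypothetical, sites with gaps or ambiguous bases are simply skipped (an assumption), and MEGA's own treatment of such sites and of group averages is not reproduced.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the model to return a finite distance.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from this study.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition
    print(k2p_distance("AAAAAAAAAA", "CAAAAAAAAA"))  # one transversion
```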
Based on a 1.2% nucleotide divergence rate per million years, we estimated the divergence time of the species (Table 4). The results showed that the genetic divergence time of the eight species was in the range of 2.58-16.50 million years, spanning the early Miocene Epoch to the late Pliocene Epoch. The earliest differentiation appeared between S. crumenophthalmus and Decapterus sp. 1, and the latest differentiation appeared between D. russelli and D. maruadsi.
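The reported range is consistent with dividing the smallest and largest among-group distances (0.031 and 0.198) by the rate of 0.012 per million years. The sketch below shows that arithmetic; the function name is hypothetical, and whether any further correction was applied to the values in Table 4 is not stated in the text.

```python
DIVERGENCE_RATE_PER_MYR = 0.012  # 1.2% per million years, as stated in the text


def divergence_time_myr(distance: float, rate: float = DIVERGENCE_RATE_PER_MYR) -> float:
    """Divergence time in million years, assuming time = distance / rate."""
    if distance < 0:
        raise ValueError("genetic distance must be non-negative")
    if rate <= 0:
        raise ValueError("divergence rate must be positive")
    return distance / rate


# Smallest and largest among-group distances quoted in the text.
for label, d in (("Groups 5 and 6", 0.031), ("Groups 3 and 8", 0.198)):
    print(f"{label}: {divergence_time_myr(d):.2f} million years")
```

With these inputs, the two estimates are approximately 2.58 and 16.50 million years, matching the ends of the reported range.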

Discussion
Biodiversity is an important material basis and condition for human survival and sustainable development and usually encompasses species diversity, genetic diversity, ecosystem diversity, and landscape diversity. To study biodiversity, we must first accurately identify the existing species; only then do follow-up studies make sense. For example, both D. macrosoma and D. macarellus are economically important species in China, but for historical reasons, the domestic literature on the identification of these two species has been confused, with the species descriptions from China contradicting those from the international literature. In this study, using samples collected in the Eastern Indian Ocean and the South China Sea, we re-examined the two Decapterus species from the perspectives of morphology and molecular genetics and provided their major morphological diagnostic characteristics and correct DNA barcoding.
The comparison of countable and measurable characteristics between the two species showed that most of these characteristics are identical or overlap substantially, making it impossible to distinguish the two species on that basis, whereas some directly observable morphological characteristics allow differentiation of the two species (Cuvier and Valenciennes 1833; Bleeker 1851; Nakabo 2013) (Table 3). These characteristics include the scute coverage of the straight-line portion of the lateral line (the most indicative identification characteristic), the shape of the predorsal scaled area and its location relative to the middle axis of the eye, and the shapes of the posterior end of the maxilla and the posterior margin of the operculum, among others, indicating that there are appropriate morphological characteristics that enable rapid and correct classification of the two Decapterus species. Therefore, correction of the relevant Chinese literature is needed, supporting the significance of the present study.
The DNA barcoding technique has been repeatedly applied for species identification and has successfully revealed the "cryptic biodiversity" in many taxa (Seidel et al. 2009). In this study, we employed DNA barcoding to reevaluate homologous sequences of D. macrosoma and D. macarellus and, regrettably, found many errors in

Table 3. Comparison of major morphological diagnostic characteristics of D. macarellus and D. macrosoma.

| Characteristic | D. macarellus | D. macrosoma |
| --- | --- | --- |
| Straight-line portion of the lateral line covered with scutes | posterior end, approximately 1/2 | majority in the rear, approximately 3/4 |
| External morphological characteristics of scutes | the highest scute is approximately half the eye diameter | no particular external characteristics |
| Whether the predorsal scaled area reaches the middle of the eye | reaching or extending past | not reaching |
| Shape of the predorsal scales | "∩" | "m" |
| Shape of the posterior end of the maxilla | convex and round | truncated |
| Shape of the posterior margin of the operculum | oblique | straight |

Table 4. Genetic distance of COI gene among (below the diagonal) and within (on the diagonal) groups, and the divergence time between groups (above the diagonal).

We estimated the timing of divergence within the genus Decapterus to be in the early Miocene Epoch to the late Pliocene Epoch based on the COI nucleotide site divergence rate, which provides a rough timeline for the evolution of species in the family Carangidae. The species in Carangidae originated through differentiation via geographical isolation and adaptive evolution during the diffusion process (Cheng et al. 2011). These two evolutionary processes complemented and interacted with each other, such that the species in Decapterus gradually adapted to the surrounding environment and ultimately formed the current geographical distribution pattern.
Decapterus macarellus shows higher genetic diversity than D. macrosoma and additional mutation characteristics, suggesting that it has higher adaptability, most likely related to its wider distribution. At the level of the COI gene, the genetic differentiation observed in P. chinensis (Li et al. 2019b) was absent in D. macarellus from the South China Sea and the Eastern Indian Ocean, indicating that the Sundaland did not block genetic exchange, a result possibly related to the sensitivity of the molecular marker applied in this study and the long-distance migration of the species. We found a large number of unique haplotypes of D. macarellus in the two seas, and in the future, we will use more sensitive molecular markers to detect the genetic structure and adaptive evolution of this species in the two seas.
Currently, the shortage of experienced taxonomists capable of completing and updating the descriptions and cataloging work of biodiversity is a major challenge for the scientific community. Species classified by external morphological characteristics are referred to as morphospecies (Primack 2010). It is impossible to correctly classify D. macrosoma and D. macarellus in China based on morphological characteristics; however, no misidentified sequences corresponding to the morphological classification results were detected among the DNA barcoding data in the NCBI database (among which a large number of sequences have been submitted by Chinese investigators from samples collected from various Chinese waters). This is most likely due to the maturation and streamlining of DNA barcoding technology, which enables investigators to readily obtain targeted sequences that can be aligned with reference sequences in the database, leading investigators to overlook the importance of morphology-based classification and instead rely only on data from others.
Initially, species classification primarily depended on the experience of the taxonomist and the accuracy of the literature. However, taxonomists do not necessarily have a background in genetics, whereas geneticists lack expertise in species identification and are unaware of the classification characteristics of the species, resulting in a rift between the two methods. Only by combining the two methods and using DNA barcoding technology as a new identification method enabling the disciplines to complement each other is it possible to classify species rapidly and accurately based on correctly identified morphological characteristics. For example, by combining morphological characteristics and DNA barcoding technology, Li et al. (2019a) accurately classified the Pampus species of the world, proposed classification keys for Pampus species, and accurately described the distribution of seven Pampus species. Using the same strategy, Li et al. (2018) revealed that the originally described Gymnothorax reticularis is actually G. minor, which is widely distributed in China's coastal areas, whereas G. reticularis is not present in China and is only distributed from the Indian Ocean to the Red Sea. Chen et al. (2018) found that the originally described Platyrhina tangi is actually P. sinensis, which is present in the coastal area of Zhoushan, China. Therefore, only after correctly identifying a species is it possible to accurately determine the distribution and niche of the species, such that the accuracy of other, related studies can be ensured.
In summary, when identifying fish species, marine biologists need to understand the research status of different taxonomic categories of the fish at home and abroad to ensure the validity of morphological classification. The findings of this study have implications for the classification and evolution of fish species in the genus Decapterus and for the conservation of species diversity.

Conclusion
Decapterus macarellus and D. macrosoma were collected in the Eastern Indian Ocean and the South China Sea and reidentified using morphological and DNA barcoding techniques. The results showed that the morphological diagnostic characteristics of the two species primarily include the scute coverage of the straight portion of the lateral line (the most indicative characteristic for classification), the shape of the predorsal scaled area and its location relative to the middle axis of the eye, and the shapes of the posterior margin of the maxilla and the posterior margin of the operculum. Molecular analysis revealed that both species have high genetic diversity, and no genetic differentiation was detected between D. macarellus from the South China Sea and the Eastern Indian Ocean. By comparing the COI sequences obtained in this study with the homologous sequences downloaded from GenBank, we speculated that the genus Decapterus may include cryptic species and corrected a number of erroneous reference sequences in the NCBI database.