Phylogeographic, morphometric and taxonomic re-evaluation of the river sardine, Mesobola brevianalis (Boulenger, 1908) (Teleostei, Cyprinidae, Chedrini)

Abstract The river sardine, Mesobola brevianalis (Boulenger, 1908), is the type species of Mesobola Howes, 1984. Standard phylogenetic analyses of partial sequences of the cytochrome oxidase I gene of individuals from populations across southern Africa that are currently identified as Mesobola brevianalis showed that these populations represent four genetically distinct allopatric lineages. Furthermore, Engraulicypris sardella (Günther, 1868), the type species of Engraulicypris Günther, 1894, was convincingly nested amongst these clades. These findings support synonymisation of Engraulicypris and Mesobola syn. n.; restoration of Engraulicypris gariepinus (Barnard, 1943), stat. rev. for the lower Orange River population; description of two new species, Engraulicypris ngalala sp. n. and Engraulicypris howesi sp. n. from the Rovuma and Kunene river systems, respectively; affirmation of the synonymy of Engraulicypris brevianalis (Boulenger, 1908), comb. n. sensu stricto and Engraulicypris whitei van der Horst, 1934; and restoration of Engraulicypris bredoi Poll, 1945, stat. rev. and Engraulicypris spinifer Bailey & Matthes, 1971, stat. rev. from Mesobola. Discriminant function analysis of a truss network of five traditional morphometric measurements and 21 morphometric measurements that characterised the shape of the fishes was used to seek morphological markers for the genetically distinct populations. Only Engraulicypris gariepinus was morphometrically distinctive, but live colouration differed between the lineages. Detailed taxonomic descriptions and an identification key for the species are provided.


Specimens
Specimens identified as Mesobola brevianalis (Boulenger, 1908) were collected from twelve river systems in ten African countries (Fig. 1, Tables 1, 2). The fish were collected under permit by various methods, including by hand, seine netting and electrofishing. Specimens were killed with an overdose of clove oil in water and, when possible, photographs were taken of the left side of the fish to record its live colouration. The specimens were then fixed in 10% formalin, and specimens collected in the same event were placed together in a container with a waterproof label bearing the date, location, details of the capture and preservation methods, and the sample and specimen numbers (Tables 1, 2). In the laboratory, the fixed specimens were transferred through a series of dilutions up to 70% ethanol for long-term preservation.
When a fresh or ethanol-preserved fish was selected for genetic analysis, the entire caudal fin, or a muscle tissue sample taken between the end point of its dorsal fin and the beginning of its caudal fin, was placed in 95% ethanol in a separate microcentrifuge tube. The tissue samples and the whole specimens were catalogued into the South African Institute for Aquatic Biodiversity (SAIAB), Grahamstown.

Phylogenetic relationships
The relationships of the sampled populations identified as M. brevianalis and representatives of its near relatives in the Chedrini (Tang et al. 2010; Liao et al. 2012) were estimated using phylogenetic analysis of mtDNA sequences. Each tissue sample used for DNA extraction (Table 1) was dried completely before being placed in a new microcentrifuge tube. DNA was extracted using the DNeasy® blood and tissue kit (Qiagen, Valencia, CA) and the NucleoSpin® Tissue kit (Macherey-Nagel GmbH & Co. KG) following the manufacturer's instructions for animal tissue isolation, except that the incubation period was 12 h to allow for complete tissue digestion and the final dilution step was performed with 50 µl (rather than 200 µl) nuclease-free distilled water during extraction with the DNeasy® kit to provide a higher concentration of DNA. The concentration and purity of each DNA extract were determined using a NanoDrop 2000 Spectrophotometer. The DNA concentration, A260, A280, 260/280 and 260/230 values were documented to ensure that the DNA was sufficiently concentrated and pure.
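The spectrophotometric checks described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the example reading and the acceptance thresholds (260/280 of at least 1.7 and 260/230 of at least 1.8) are assumptions chosen for illustration, and the conversion of about 50 ng/µl of double-stranded DNA per A260 unit assumes a 1 cm path length. The elution factor assumes that the same DNA yield is recovered in the smaller volume.

```python
def dsdna_ng_per_ul(a260, path_length_cm=1.0):
    """Estimate dsDNA concentration (about 50 ng/µl per A260 unit at 1 cm)."""
    return a260 * 50.0 / path_length_cm


def purity_ratios(a260, a280, a230):
    """Return the two absorbance ratios commonly used to judge purity."""
    return {"260/280": a260 / a280, "260/230": a260 / a230}


def passes_purity(ratios, min_260_280=1.7, min_260_230=1.8):
    """Rule-of-thumb check; both thresholds are illustrative assumptions."""
    return (ratios["260/280"] >= min_260_280
            and ratios["260/230"] >= min_260_230)


def elution_concentration_factor(reference_volume_ul=200.0, used_volume_ul=50.0):
    """Fold increase in concentration when eluting in a smaller volume,
    assuming the same total DNA yield in both cases."""
    return reference_volume_ul / used_volume_ul


# Hypothetical reading for one extract (not data from the study).
reading = {"a260": 1.2, "a280": 0.6, "a230": 0.5}
ratios = purity_ratios(**reading)
concentration = dsdna_ng_per_ul(reading["a260"])
factor = elution_concentration_factor()
```

Under these assumptions, eluting in 50 µl rather than 200 µl would concentrate the extract four-fold.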
A 658 basepair (bp) fragment of the protein-coding Cytochrome Oxidase 1 (COI) mitochondrial gene was amplified using the LCO1490 and HCO2198 primer set (Folmer et al. 1994). The PCR conditions for this gene fragment were an initial stage of 94°C for 1 min, 45°C for 1.5 min and 72°C for 1.5 min, followed by 40 cycles of 94°C for 1 min, 50°C for 1.5 min and 72°C for 1 min, and a final elongation stage at 72°C for 5 min. The PCR products were electrophoretically separated on a 1% agarose gel at 80 V for 30 min. Attempts to amplify the protein-coding Recombination Activating Gene 1 (RAG1) nuclear gene failed, and although the 28S rRNA nuclear gene was amplified, it (predictably) showed no informative variation within Mesobola.
Sequencing by capillary electrophoresis was conducted by Macrogen Inc. (Seoul, South Korea) using the amplification primers. The forward and reverse nucleotide sequences were aligned using the ClustalX multiple sequence alignment module (Larkin et al. 2007) within the BioEdit sequence alignment software (Hall 2004) to form consensus sequences, which were deposited in GenBank (https://www.ncbi.nlm.nih.gov/Genbank) (Table 1).
All of the sequences were aligned using ClustalX (Larkin et al. 2007) and saved in a Nexus-format file. MrModelTest (Nylander 2004) was used to assess the model of best fit for the sequences using the Akaike Information Criterion (Akaike 1973). The TrN+I+G model was selected and used to build a Bayesian inference tree in MrBayes (Huelsenbeck and Ronquist 2001) using a total of ten million generations (until the split frequency was below 0.05), with a tree sampled every 1000 generations. After examining the trace file, the first 20% of the sampled trees were discarded as burn-in. The Bayesian inference trees were viewed and annotated using TreeView (Page 1996).
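The sampling and burn-in arithmetic described above can be illustrated with a short sketch. It is not the MrBayes implementation: the function name is hypothetical, and the sketch assumes that generation 0 is also sampled, which gives one sample more than the number of generations divided by the sampling interval.

```python
def discard_burn_in(samples, burn_in_fraction=0.2):
    """Return the samples that remain after dropping the first fraction."""
    if not 0.0 <= burn_in_fraction < 1.0:
        raise ValueError("burn_in_fraction must be in [0, 1)")
    n_discard = int(len(samples) * burn_in_fraction)
    return samples[n_discard:]


generations = 10_000_000   # total chain length stated in the text
sample_every = 1_000       # sampling interval stated in the text

# Assumption: the starting state (generation 0) is sampled as well.
sampled_generations = list(range(0, generations + 1, sample_every))

# Drop the first 20% of sampled trees, as described in the text.
kept = discard_burn_in(sampled_generations, burn_in_fraction=0.2)
```

Under this assumption the chain yields 10,001 sampled trees, of which the first 2,000 are discarded, so the summary tree would rest on samples taken from generation 2,000,000 onwards.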

Morphological characterization
The morphology and live colouration of representatives of each clade were examined in detail for diagnostic traits; measures follow Howes (1980, 1984). Preserved specimens were each placed into a black- or white-based container (to provide contrast) filled with 70% ethanol and a photograph was taken of the left side using a Canon 550D SLR camera (18.1 megapixels) and 50 mm fixed macro lens. A scale bar was included in each photograph to calibrate measurements. Each specimen was then labelled with waterproof paper bearing its specimen number and photograph number, placed in a separate vial for further reference, and returned to its collecting lot. The available type specimens of M. brevianalis and its synonyms, and of E. sardella, were also examined using photographs supplied by the Natural History Museum, London (BMNH).
Based on these results, morphometric analysis of selected specimens (Table 2) from each major clade found in the phylogenetic analysis was used to find morphological features suitable for identification. The photographs were imported into the imaging software AnalySIS Docu (Olympus Soft Imaging Systems: http://www.soft-imaging.net/) to measure six standard linear measurements: standard length (SL), orbit length, snout-to-orbit distance, and the lengths of the dorsal, anal and pelvic fins. A box truss network (Strauss and Bookstein 1982) of 21 measurements was used to capture the shape of each fish, based on ten landmark points (Fig. 2) that lay in areas of strong skeletal support, where distortion of soft tissue was likely to be minimal. All measurements were entered into a spreadsheet with each specimen's collection number, geographical origin (country, river, river system) and nomenclatural status (e.g. holotype, syntype). Measuring and transcription errors were sought using scatter plots and corrected. The measurement data were log-transformed to rectilinearise allometric variation (Strauss and Bookstein 1982), and a principal component analysis was used to seek morphological groups in the samples. A discriminant function analysis was performed to pinpoint diagnostic measurements of taxa defined by the genetic analysis. Both analyses were done using the Statistica 12 (http://www.statsoft.com/Products/STATISTICA-Features/Version-12) software package.
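The truss measurements, log-transformation and principal component analysis described above can be sketched in a few lines. This is an illustration only and does not reproduce the Statistica analysis: the landmark layout (five dorsal landmarks A to E paired with five ventral landmarks J to F), the coordinates, the specimen sizes and the noise level are all hypothetical, and NumPy is used in place of the software named in the text. With that assumed layout, a box truss yields 5 vertical, 4 dorsal, 4 ventral and 8 diagonal distances, i.e. 21 measurements from ten landmarks.

```python
import numpy as np


def box_truss_pairs(dorsal, ventral):
    """Landmark pairs of a box truss: each cell has four sides and two
    diagonals, and neighbouring cells share a vertical side."""
    pairs = list(zip(dorsal, ventral))                 # vertical sides
    for i in range(len(dorsal) - 1):
        pairs += [
            (dorsal[i], dorsal[i + 1]),                # dorsal side
            (ventral[i], ventral[i + 1]),              # ventral side
            (dorsal[i], ventral[i + 1]),               # diagonal
            (ventral[i], dorsal[i + 1]),               # diagonal
        ]
    return pairs


def truss_lengths(landmarks, pairs):
    """Euclidean distance between each pair of landmark coordinates."""
    return np.array([
        np.linalg.norm(np.subtract(landmarks[a], landmarks[b]))
        for a, b in pairs
    ])


def pca(data):
    """PCA on the covariance matrix; returns explained-variance
    proportions, eigenvectors (columns) and specimen scores."""
    centred = data - data.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    order = np.argsort(eigvals)[::-1]                  # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs, centred @ eigvecs


# Hypothetical landmark coordinates (mm) for a reference specimen.
base = {
    "A": (0.0, 1.0), "B": (10.0, 4.0), "C": (25.0, 6.0),
    "D": (35.0, 4.0), "E": (48.0, 2.0),
    "J": (2.0, -2.0), "I": (12.0, -4.0), "H": (25.0, -5.0),
    "G": (36.0, -3.0), "F": (48.0, -2.0),
}
pairs = box_truss_pairs("ABCDE", "JIHGF")

# Synthetic specimens: the reference shape at different sizes plus noise.
rng = np.random.default_rng(0)
rows = []
for scale in (0.6, 0.8, 1.0, 1.2, 1.4, 1.6):
    specimen = {name: scale * np.array(xy) + rng.normal(0.0, 0.05, 2)
                for name, xy in base.items()}
    rows.append(truss_lengths(specimen, pairs))

log_data = np.log10(np.array(rows))                    # log-transform
explained, axes, scores = pca(log_data)
```

Because the synthetic specimens differ mainly in scale, the first axis takes up nearly all of the variance and its coefficients share one sign, which is the pattern usually read as a general size axis.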

Taxonomy
Type specimens and their metadata were housed in the South African Institute for Aquatic Biodiversity, Grahamstown (SAIAB), the Albany Museum, Grahamstown (AMGT) and the Natural History Museum (BMNH), London. Photographs of the holotypes of M. brevianalis and E. sardella were received from the BMNH as the specimens were too fragile to transport. Catalogued SAIAB specimens of undescribed species were selected for description based on their physical condition (e.g. fin rays and scales intact) and whether they had associated genetic sequences.
Specimens were photographed with a scale bar. Measurements were made on each specimen with standard unbranded electronic digital callipers. The holotype photographs were measured using AnalySIS Docu software, but measurements that involved the width of the specimen including body width or inter-orbit length could not be measured or included in the description.
Meristic data, including fin ray counts, were gathered using a Leica Zoom 2000 microscope. Scale counts were made on a maximum of only three specimens because this required dyeing specimens with Alizarin Red for an average of five to ten minutes and then placing them directly into Acid Blue dye for a further five to ten minutes, after which visualising the scales was still very difficult. Because the dye did not wash out well, scale counts were not made on type specimens. Vertebra counts were made on X-rays of some specimens, including all holotypes except for the holotype of E. sardella, for which no X-ray was available. A single specimen from each population was cleared and stained using standard methods (Taylor and van Dyke 1985), preserved in 70% glycerol, and dissected to count the gill rakers on both the ceratobranchial and epibranchial of the first gill arch.
The data were used to populate a character database in the DELTA software package (Dallwitz 1980, Dallwitz et al. 1993), which was used to generate the species descriptions and key.

Phylogenetic relationships
The Bayesian phylogenetic analysis with a maximum-likelihood model showed that the biogeographically disparate populations identified as M. brevianalis represent independent evolutionary clades (support values = 100% in all cases) with relative branch lengths (i.e. numbers of base substitutions per site) indicating larger average evolutionary divergence between the clades than within them (Fig. 3). These clades were collectively paraphyletic with respect to E. sardella (Fig. 3), but the monophyly of the whole ensemble received bootstrap support of 96%. Support for relationships between the independent clades was weak, possibly suggesting a relatively rapid radiation, with the strongest evidence (p = 0.755) supporting a biogeographically plausible sister-group relationship between E. sardella from Lake Malawi and the population from the neighbouring Rovuma River system (Fig. 3). The Malawi Rift Basin began to form ~8.6 mya, in the Late Miocene (Delvaux 1995; Danley et al. 2012), cutting across the headwaters of the Palaeo-Rovuma River. This would provide a first approximation for the time of vicariance of these two clades.
The sister group to Mesobola remains uncertain for the same reasons that affected the study by Tang et al. (2010), which used four genes and many more taxa: limited taxon sampling within the African radiation of Chedrini and the involvement of genera like Raiamas and Opsaridium that are potentially polyphyletic and not represented by their type species. The average evolutionary divergence between taxa is represented as number of base substitutions per site (Fig. 3).

Morphological identification
Although the phylogenetic analysis showed distinct populations within Mesobola brevianalis sensu lato, these could not be detected in a principal component analysis of the morphometric data. The first eigenvector summarised 89% of the variance and its coefficients were all fairly similar in magnitude and uniform in sign (Table 2), indicating that it summarised a general effect in the data, i.e. size, as is usual with morphometric analyses of organisms when variation in the sizes of specimens outweighs their variation in shape. Being orthogonal to the first axis, the remaining axes summarised variation in shape and allometry independent of gross differences in size. A plot of the second and third axes (Fig. 4) showed that populations from the Kunene River and eastern South Africa (including the syntypes of E. whitei) overlapped entirely in that morphospace, and partially overlapped those of the Rovuma and Orange rivers, which were mutually distinct. This supported the synonymization of M. brevianalis and E. whitei, which both occupy the Limpopo River system, and explains why most of the populations have not yet been recognised as distinct taxa. The second axis summarised 2.6% of the variance and differentiated the Rovuma and Orange River populations by emphasising truss measurements DE, DF, CD, DG and dorsal fin length (Table 3), which described the shapes of the caudal peduncle and the dorsal fin (Fig. 2). The third axis summarised 2.2% of the variance in morphology and emphasised eye length and the truss measurements AB, AJ and BJ (Table 3), which all described the head (Fig. 2), but did little to separate the populations further (Fig. 4). The remaining 24 axes collectively summarised only 6.1% of the variation and did not describe patterns that related to the populations.
Discriminant function analysis of the morphology of the genetically well-supported Mesobola populations and E. sardella successfully assigned most specimens to their population of origin (Table 4; Fig. 5), although Mesobola brevianalis sensu stricto and E. howesi overlapped substantially in morphospace, at least on the first two canonical axes (Fig. 5). The first canonical axis tended to have negative weights for measurements along the body axis and positive weights for those across the body axis (Table 3; Fig. 5), thus describing the elongation of the body. The second axis contrasted measurements involving the dorsal fin with those of the caudal peduncle (Table 3; Fig. 5), while the third axis showed no clear morphological pattern in its weights (Table 3).
(Boulenger, 1908), comb. n. We also restore two other species currently placed in Mesobola but originally placed in Engraulicypris by their authors (Eschmeyer et al. 2016): Engraulicypris bredoi Poll, 1945, stat. rev. and Engraulicypris spinifer Bailey & Matthes, 1971, stat. rev. The species-level paraphyly in the phylogeographical analysis (Fig. 3) can be resolved by recognising the independent populations as species. In South Africa, specimens from the eastern populations of Mesobola grouped with specimens from the type locality of E. brevianalis (Fig. 3) and were somewhat phylogenetically intermingled with specimens from the western populations from which M. whitei was collected. These two species are therefore either synonymous or show incomplete lineage sorting or hybridization. The lower Orange River population can be recognised by restoring E. gariepinus stat. rev. from synonymy with M. brevianalis. Engraulicypris bredoi and E. spinifer occur in Lake Albert and the Malagarasi River system, respectively (Lévêque et al. 1991), and are therefore unlikely to represent the Kunene and Rovuma River populations, for which there are thus currently no names available.
Boulenger, 1908) Diagnosis.
With the synonymisation of Mesobola and Engraulicypris, Günther's (1894) diagnosis of Engraulicypris must be modified to include the species assigned to Mesobola. Engraulicypris is a genus of moderately small African chedrin barbs (sensu Tang et al. 2010; Liao et al. 2011, 2012) identified by a lack of a scaly lobe at the base of the pelvic or pectoral fin; a large mouth reaching the anterior border of the orbit or beyond; a dorsal fin originating behind the midpoint of standard length, more or less above the origin of the anal fin; a pectoral fin not reaching the origin of the anal fin; and body colouration lacking vertical bars or bands. Osteological characters are discussed by Liao et al. (2011, 2012) for Mesobola and by Liao et al. (2012) for Engraulicypris.
Live colouration. (Fig. 6). Body without vertical bars or bands.
Etymology. Engraulicypris alludes to the anchovy-like form (eggraulis, -eos; Greek) of these relatives of the carp (kyprinos; Greek).

Descriptions
Distribution. Southern and Eastern Africa.
Engraulicypris brevianalis (Boulenger, 1908), comb. n.
Neobola brevianalis Boulenger, 1908
Diagnosis. Caudal fin membrane clear towards vivid yellow at fork; anal fin extending two thirds of length of caudal peduncle; caudal peduncle moderately long; operculum entirely (not partially) shiny; body midline silver (not black); iris dark to light grey (not white); head with tubercles along lower jaw and lower head in breeding males; snout rounded (not pointed), darker dorsally; pelvic fin melanophores absent.
Morphology. (Figs 6-8; Table 5). Maximum SL 75 mm. Body elongated; somewhat fusiform; laterally compressed. Maximum body depth at middle pelvic and pectoral fin origin. Pre-dorsal profile straight or slightly convex behind head. Head length 20% SL; with tubercles along lower jaw and lower head. Snout rounded; short; 30% of head length. Mouth terminal; slightly crescent-shaped with long anterior side; reaching anterior border of orbit. Nostrils large; level with dorsal margin of eye; separated from orbit by less than one orbit radius. Tubular anterior naris short; adjacent to open posterior naris. Eye lateral; visible from above and below (more prominent); diameter 35% of head length. First gill arch with 8+3 gill rakers on cerato- and epibranchial arms, respectively. Gill rakers long; pointed; widely-spaced. Pharyngeal bones in three rows. Pharyngeal teeth 4,3,2-2,3,4; robust and long; falcate.
Modal fin formulae in Table 5. Fins large in relation to body size. Dorsal fin closer to caudal fin than tip of snout; more or less above origin of anal fin; length 17% SL; posterior margin straight; rays soft; anterior-most branched fin ray longest. Pectoral fins largest; reaching 1/2 to 3/4 distance to base of pelvic fin; fin lacking lobe at base. Pelvic fins reaching 2/3 distance to base of anal fin; relatively small; pointed; fin lacking a basal lobe. Anal fin moderately long; extending 2/3 length of caudal peduncle; last unbranched ray longest. Ano-genital opening at anterior of base of anal fin. Caudal peduncle moderately long. Caudal fin forked; lobes with slightly concave interior and extending into point; upper lobe shorter.
Scales small to medium relative to body size; in regular rows; cycloid, slightly elongate; radially striate. Base of anal fin lacking sheath scales. Lateral line present; complete; dipping sharply towards ventral at tip of pectoral fin; joining midline at posterior of caudal peduncle; scale count 53-57 (n = 2) along lateral line, 18 around caudal peduncle.
Live colouration. (Fig. 6). Body silver, without vertical bars or bands. Dorsum pale brown with small dark brown melanophores, midline silver. Snout darker dorsally. Operculum entirely metallic silver. Iris dark to light grey. Dorsal fin membrane clear; rays clear with olive melanophores; fading towards tips. Caudal fin membrane clear, vivid yellow at fork; rays light olive; rays lighter towards tips; melanophores small, dark, fading towards rear. Anal fin rays clear; membrane clear; dark spotting above origin; melanophores dark olive fading towards tips. Pectoral fin membranes clear; rays clear; first ray with few dark melanophores. Pelvic fin rays clear; membrane clear.
Biology. Pelagic species preferring close proximity to substrate and seeking out slacker areas such as backwater, eddies and pools below riffles. Occurs in shoals and prefers well-aerated, open water in flowing rivers (Skelton 2001), favouring the upper stratum (Engelbrecht and Mulder 1999). Feeds from water column on planktonic crustaceans and insects (e.g. midges and ants) (Skelton 2001). Caught at night with light. Breeding occurs in early summer (Skelton 2001). Found in dams where appears to propagate successfully with little predation and moves around in rivers according to seasonal flows. Appears to migrate up streams in spring to breed where it is found in tributaries.
Remarks. The specimen (SAIAB 66270) used by Liao et al. (2012) to represent a DNA sequence of M. brevianalis is from the Usuthu River (Table 1) and does belong to that species (Fig. 3).

Engraulicypris gariepinus Barnard, 1943, stat. rev.
Engraulicypris gariepinus Barnard, 1943 Diagnosis. Caudal fin membrane clear to pale orange towards midline; anal fin extending over three quarters of length of caudal peduncle; caudal peduncle short; operculum entirely (not partially) shiny; body midline silver (not black); iris dark to light grey (not white); head with tubercles along lower jaw and lower head in breeding males; snout rounded, with dense dark spotting on tip; pelvic fin melanophores absent.
Morphology. (Figs 6-8; Table 6). Maximum SL 46 mm. Body elongated; somewhat fusiform; laterally compressed. Maximum body depth before pelvic fin. Pre-dorsal profile straight or slightly convex behind head. Head length 21% SL; with tubercles along lower jaw and lower head. Snout rounded; short; 32% of head length. Mouth terminal; slightly crescent-shaped with long anterior side; reaching anterior border of orbit. Nostrils large; level with dorsal margin of eye; separated from orbit by less than one orbit radius. Tubular anterior naris short; adjacent to open posterior naris. Eye lateral; visible from above and below (more prominent); diameter 32% of head length. First gill arch with 7+3 gill rakers on cerato- and epibranchial arms, respectively. Gill rakers long; pointed; widely-spaced. Pharyngeal bones in three rows. Pharyngeal teeth 4,3,2-2,3,4; robust and long; falcate.
Live colouration. (Fig. 6). Body without vertical bars or bands. Dorsum transparent pale brown with melanophores concentrated around dorsal fin; midline silver. Snout with dense dark spotting on tip. Operculum entirely metallic silver. Iris dark to light grey. Dorsal fin membrane clear; rays clear; melanophores fading towards tips. Caudal fin membrane clear to pale orange towards midline; rays dark grey, lighter towards tips; melanophores small, dark, fading towards rear. Anal fin rays clear; membrane clear; pale orange spotting above origin; melanophores few to absent. Pectoral fin membranes clear; rays clear; first ray with few dark melanophores. Pelvic fin rays clear; membrane clear.
Preserved colouration. (Fig. 7). Body and head orange with small dark brown spotting along dorsal surface, midline and above anal fin. Scales on dorsal surface lightly pigmented. Ventral scale pigmentation less intense than dorsal. Dorsal surface of head lightly pigmented. Melanophores small, dark; grouped on rear of head, below orbit, and on lips and snout; along midline, increasing in intensity to caudal fin; brownish on dorsal surface, darkening between origin of pectoral and dorsal fin; forming small dark line above anal fin. Membranes between fin rays clear. Pelvic fin clear membranes and rays.
Etymology. 'Gariepinus' refers to the Gariep, a San name for the Orange River that means 'Great water'.
Type locality. Orange River and Fish River, Namibia (Barnard 1943).
Biology. This shoaling fish favours open, shallow water, normally occurring in slack pools and particularly below riffles. Populations found in the lower Orange and Fish Rivers are limited by the Augrabies and Fish River Falls. They are thought to feed mainly on small autochthonous invertebrates (planktonic crustaceans or insects), and are caught in large numbers where they occur. They are restricted to turbid waters, which provide protection from visual predators (R. Bills, pers. obs.).
Remarks. The two syntypes of E. gariepinus Barnard, 1943 were originally stored in the South African Museum, but were moved to the Albany Museum, Grahamstown, South Africa (AMG 106 and 1009) (Eschmeyer 2014). The Albany Museum fish collection has now been moved to SAIAB and these specimens have not been traced (I.R. Bills, pers. obs.). There is no 'exceptional need' (ICZN, Articles 75.2 and 75.3) for a neotype, since there is only one species of Mesobola in the topotypical river system, and the species is sufficiently physically distinctive that even if another species was introduced, they would be easy to distinguish on the basis of published descriptions.
Diagnosis. Anal fin extending over three quarters of length of caudal peduncle; caudal peduncle short; operculum entirely (not partially) shiny; body midline silver (not black); iris dark to light grey (not white); head with tubercles along lower jaw and lower head in breeding males; snout rounded; pelvic fin melanophores absent.
Modal fin formulae in Table 7. Fins large in relation to body size. Dorsal fin closer to caudal fin than tip of snout; more or less above origin of anal fin; length 14% of SL; posterior margin straight; rays soft; anterior-most branched fin ray longest. Pectoral fins largest; reaching 1/2 to 3/4 distance to base of pelvic fin; fin lacking lobe at base. Pelvic fins reaching 2/3 distance to base of anal fin; relatively small; pointed; fin lacking a basal lobe. Anal fin moderately long; extending 2/3 length of caudal peduncle; last unbranched ray longest. Ano-genital opening at anterior of base of anal fin. Caudal peduncle moderately long; depth half of length. Caudal fin forked; lobes pointed; upper lobe shorter.
Preserved colouration. (Fig. 7). Body and head orange with small dark brown spots along dorsal surface, midline and above anal fin. Scales on dorsal surface lightly pigmented. Ventral scale pigmentation less intense than dorsal. Dorsal surface of head lightly pigmented. Melanophores small, dark; grouped on rear of head, below orbit, and on lips and snout; along midline, increasing in intensity to caudal fin; browner on dorsal surface, darkening between origin of pectoral and dorsal fin; forming small dark line above anal fin. Operculum with silver sheen. Side of body with silver sheen extending from pectoral fin to anal fin origin. Membranes between fin rays white to clear towards end. Pelvic fin clear membranes and rays. Dorsal, caudal and pectoral fin membranes white to clear; rays with small, widely-spaced, melanophores fading towards edges; rays pale brown to clear.
Biology. Very little is known of the biology of this species. Individuals appear to favour turbid, rocky, river regions where they can gather in pockets of recirculating currents. The holotype and some paratypes were collected in the shallow, turbid Olushandja Dam in the Namibian upper reaches of the system. They feed on drifting invertebrate larvae and adults and plankton.
Diagnosis. Operculum shiny only on ventral posterior edge and small area at posterior edge of orbit (not entire area); body midline black (not silver); head with tubercles along lower jaw and lower head in breeding males; snout rounded (not pointed); iris white to light grey (not dark grey) with a few melanophores; pelvic fin melanophores present, dark and widely dispersed.
Modal fin formulae in Table 8. Fins large in relation to body size. Dorsal fin closer to caudal fin than tip of snout; more or less above origin of anal fin; length 14% of
stocks may differ ecologically because Lake Chiuta offers a lacustrine pelagic and benthic prey community (copepods, etc.) that is not found in the Rovuma River channel, where fish would predominantly have access to invertebrate drift.