Comparison of seven complete mitochondrial genomes from Lamprologus and Neolamprologus (Chordata, Teleostei, Perciformes) and the phylogenetic implications for Cichlidae

Abstract In this study, mitochondrial genomes (mitogenomes) of seven cichlid species (Lamprologus kungweensis, L. meleagris, L. ornatipinnis, Neolamprologus brevis, N. caudopunctatus, N. leleupi, and N. similis) are characterized for the first time. The newly sequenced mitogenomes contained 37 typical genes [13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), and 22 transfer RNA genes (tRNAs)]. The mitogenomes were 16,562–16,587 bp in length with an A + T content of 52.1–58.8%. The cichlid mitogenomes had a comparable nucleotide composition, with A + T content higher than G + C content. The AT-skews of most mitogenomes were slightly positive and the GC-skews were negative, indicating a higher occurrence of C than G. Most PCGs started with the conventional start codon, ATN. There was no essential difference in the codon usage patterns of these seven species. Based on Ka/Ks, the fastest-evolving gene was atp8, whereas p-distance indicated that the fastest-evolving gene was nad6. Phylogenetic analysis revealed that L. meleagris did not cluster with Lamprologus species, but with species from the genus Neolamprologus. The novel information obtained about these mitogenomes will contribute to elucidating the complex relationships among cichlid species.


Introduction
Cichlids (Teleostei: Perciformes: Cichlidae) are widely distributed across the Neotropics, Africa, the Middle East, Madagascar, as well as southern India and Sri Lanka (Smith et al. 2008; López-Fernández et al. 2010). They stand out as one of the most species-diverse groups of acanthomorphs. Kullander (1998) divided the family Cichlidae into eight subfamilies: Astronotinae, Cichlasomatinae, Cichlinae, Etroplinae, Geophaginae, Heterochromidinae, Pseudocrenilabrinae, and Retroculinae. A ninth subfamily, the Ptychochrominae, was later recognized by Sparks and Smith (2004). Cichlids have gained recognition as a prominent model group for the study of evolutionary biology due to their numerous species, diverse genetics, distinct evolutionary lineages, and significant ecological and morphological divergences (Kocher 2004; Schwarzer et al. 2015; Reis et al. 2016; Nam et al. 2021).
African cichlids (subfamily Pseudocrenilabrinae) comprise more than 2000 species (Brawand et al. 2014; Astudillo-Clavijo et al. 2022). Biologists have long been fascinated by the diversity of cichlids in the East African cichlid radiation (EAR), which has promoted high levels of endemism in Lakes Tanganyika, Malawi, and Victoria (Kornfield and Smith 2000). Lake Tanganyika is a large, deep tropical Rift Valley lake with an age of 9–12 million years (Irisarri et al. 2018). It harbors the most diverse cichlid fauna in terms of morphology, ecology, and behavior, including several mouth-brooding and substrate-spawning lineages (Takahashi 2003; Salzburger 2009). The cichlid fauna of Lake Tanganyika is dominated by lamprologine cichlids, which have colonized most lacustrine habitats but most often inhabit the littoral zone (Sturmbauer et al. 2010). Although classified as a single tribe, lamprologine cichlids exhibit significant diversity in morphology, ecology, and behavior. Lamprologus kungweensis, Lamprologus meleagris, Lamprologus ornatipinnis, Neolamprologus brevis, Neolamprologus caudopunctatus, Neolamprologus leleupi, and Neolamprologus similis are among the smallest species within the lamprologine cichlids, small enough to live inside the empty shells of gastropod mollusks (Sturmbauer et al. 2010). These species are regarded as highly valuable ornamental species in the aquatic trade industry due to their ease of maintenance and handling in aquariums (Nam et al. 2021).
The genera Lamprologus and Neolamprologus can be difficult to distinguish due to their similar morphology, ecology, and behavior. As discussed by Stiassny (1991), meristic and morphometric measurements, osteology, and dentition were insufficient to differentiate between the species, as many of these traits were homoplastic. Furthermore, there might be instances of ancient ancestral polymorphism, introgressive hybridization, or a lack of diagnostic synapomorphic characters among certain species within these two genera, further complicating their classification (Sturmbauer et al. 2010; Gante et al. 2016). Therefore, additional methods, such as molecular analysis, might be required for more accurate classification.
Mitochondria are organelles found in most eukaryotic cells that play a critical role in energy production (Hebert et al. 2010). The mitochondrial genome (mitogenome) of acanthomorph fishes is usually a circular, double-stranded molecule that ranges from 16 to 23 kbp in size. It typically contains 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and one control region (CR) (Iwasaki et al. 2013). Mitogenomes are characterized by a high evolutionary rate, matrilineal inheritance, low molecular weight, simple structure, and ease of amplification, which make them reliable markers for studying phylogenetics (Ye et al. 2022; Wang et al. 2023). Mitogenome components, such as nad2 or rrnL, are widely used for phylogenetic analyses (Sturmbauer et al. 2010; Schwarzer et al. 2015). Although partial mitochondrial sequences can offer some insights into evolutionary relationships, they are limited in their ability to provide a comprehensive understanding due to the absence of information such as gene rearrangement, genetic code changes, replication, and transcriptional regulation patterns. Therefore, complete mitogenome sequences can be more beneficial, as they can provide improved resolution and sensitivity for investigating evolutionary relationships (Li et al. 2019; Fiteha et al. 2023; Wang et al. 2023).
In this study, we report the complete mitogenome organizations and characteristics of seven species (L. kungweensis, L. meleagris, L. ornatipinnis, N. brevis, N. caudopunctatus, N. leleupi, and N. similis). We also performed a phylogenetic analysis of the seven complete mitogenomes obtained in this study together with the published complete cichlid mitogenomes. We hope that our study can enable better comprehension of cichlid biodiversity and expand genetic resources for future cichlid comparisons.

Sample collection and DNA extraction
The seven species are commonly sold as ornamental fish and can be found in many pet markets. Specimens were obtained from the Qiqiaoweng pet market in Nanjing, Jiangsu province, China. The specimens were identified using morphological characteristics described in FishBase (https://www.fishbase.de/). No fish were sacrificed during this study. The fish were reared at the Laboratory of Animal Molecular Evolution, Nanjing Forestry University. Total genomic DNA was extracted from the fin of each specimen using a FastPure Cell/Tissue DNA Isolation Mini Kit (Vazyme, Nanjing, China) and stored at -80 °C for future use.

Genome sequencing, assembly, and annotation
Seven complete mitogenomes were sequenced on an Illumina platform (Personalbio, Nanjing, China) using total genomic DNA. The genomic DNA was used to generate an Illumina library with an insert size of 400 bp. The clean data were then assembled in Geneious Prime 2022 software, using Lamprologus signatus (MZ427900.1) as a template. The assembled mitogenomes were manually revised using DNAstar v. 7.1 (Madison, WI, USA).

Phylogenetic analysis
Phylogenetic analysis was conducted using the sequences of 13 PCGs and two rRNA genes from the complete mitogenomes of 105 species, including seven species from this study (Suppl. material 1). Channa andrao and Hyphessobrycon sweglesi were selected as outgroups, while the remaining species belonged to the family Cichlidae. Trees were inferred using maximum likelihood (ML) and Bayesian inference (BI) methods in the PhyloSuite v. 1.2.3 software package (Zhang et al. 2020; Xiang et al. 2023). All genes were aligned using MAFFT v. 7.313, and the best-fit substitution model and partitioning scheme were determined using ModelFinder. ML phylogenies were inferred using IQ-TREE under the edge-linked partition model with 5000 ultrafast bootstrap replicates (Minh et al. 2013; Nguyen et al. 2015). BI phylogenies were inferred using MrBayes v. 3.2.7a with a partition model (Ronquist et al. 2012). The analysis consisted of two parallel runs of 2,000,000 generations each, and the initial 25% of sampled data was discarded as burn-in. The trees were visualized and edited using iTOL v. 6 (Letunic and Bork 2021).
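The concatenation step that precedes a partitioned analysis can be illustrated with a minimal Python sketch. It is not the PhyloSuite implementation; the function name `concatenate`, the toy taxa, and the toy sequences are hypothetical and serve only to show how per-gene alignments become one supermatrix with partition coordinates.

```python
def concatenate(alignments: dict) -> tuple:
    """Join per-gene alignments into a supermatrix with partition ranges.

    alignments maps gene name -> {taxon: aligned sequence}.
    Taxa missing from a gene are padded with gaps.
    """
    taxa = sorted({t for gene in alignments.values() for t in gene})
    matrix = {t: [] for t in taxa}
    partitions = {}
    start = 1
    for gene, seqs in alignments.items():
        lengths = {len(s) for s in seqs.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned")
        length = lengths.pop()
        for t in taxa:
            matrix[t].append(seqs.get(t, "-" * length))
        partitions[gene] = (start, start + length - 1)  # 1-based, inclusive
        start += length
    return {t: "".join(parts) for t, parts in matrix.items()}, partitions


# Hypothetical toy alignments, not data from this study.
toy_alignments = {
    "cox1": {"sp_A": "ATGGCA", "sp_B": "ATGGCC"},
    "rrnS": {"sp_A": "GCTA", "sp_B": "GCTT"},
}
supermatrix, partitions = concatenate(toy_alignments)
```

The partition ranges returned here correspond to the per-gene blocks to which a partitioned model assigns separate substitution models.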

Nucleotide composition
The nucleotide composition of the seven newly sequenced Lamprologus and Neolamprologus mitogenomes was biased toward A and T (Table 2). The AT-skews were slightly positive, while all GC-skews were markedly negative. The analysis revealed a clear preference for C over G, along with a minor preference for A over T, across the entire genome (Table 2).
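The skew statistics discussed above are commonly defined as AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C). The following minimal Python sketch assumes these standard definitions; the function name `composition_stats` and the toy sequence are hypothetical and are not taken from the sequenced mitogenomes.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Return A+T content (%) and AT-/GC-skew for one strand of a sequence."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[b] for b in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are ignored
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    return {
        "at_content": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t),  # positive: more A than T
        "gc_skew": (g - c) / (g + c),  # negative: more C than G
    }


# Toy sequence (hypothetical).
stats = composition_stats("AAATTCCCGA")
```

A positive AT-skew together with a negative GC-skew, as reported for the seven mitogenomes, corresponds to an excess of A over T and of C over G on the analyzed strand.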

Protein-coding genes
In the seven newly sequenced mitogenomes, nad6 was encoded on the L-strand, while the other PCGs were encoded on the H-strand. The average A + T content of the PCGs ranged from 53.0% (N. leleupi and N. brevis) to 54.7% (L. meleagris). In six species, the total length of the 13 PCGs was 11,466 bp, while in the remaining species, N. leleupi, it was slightly shorter at 11,421 bp. The reason for this difference was that the cox1 gene in N. leleupi had a mutation causing a premature stop codon compared to other species, resulting in a reduction of 45 bp in length (Tables 1, 2).
Most of the PCGs in the seven newly sequenced mitogenomes began with the start codon ATG, except for cox1, which started with GTG. Most PCGs terminated with the codon TAA or an incomplete codon (TA− / T−−), with the exception of nad1, which ended with TAG (Table 2). Cichlid species are relatively conservative in their use of start codons, and their preferences are generally consistent with those of the seven newly sequenced species, the only exception being the occurrence of the rare start codon ATC in cox1 and nad3. All cichlids share the stop codons TAA, TAG, AGA, and incomplete codons (TA− / T−−) (Fig. 3). Relative synonymous codon usage (RSCU) was calculated to identify the predominant synonymous codons (Grantham et al. 1980). The comparative analysis based on RSCU of all PCG codons showed that the codon usage patterns of these seven species were similar (Fig. 4). Codons encoding Ile and Leu2 occurred at high frequency, while those encoding Cys, Met, and Ser1 were infrequent.
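RSCU is usually defined as the observed count of a codon divided by the count expected if all synonymous codons of its family were used equally. The Python sketch below assumes that definition and the vertebrate mitochondrial genetic code (NCBI translation table 2). The function names are hypothetical, and splitting Leu and Ser into their two codon blocks is an assumption made here to mirror the Leu1/Leu2 and Ser1/Ser2 distinction in the text.

```python
from collections import Counter
from itertools import product

# NCBI translation table 2 (vertebrate mitochondrial); codon bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def family(codon: str) -> str:
    """Synonymous family key; Leu and Ser are split into their two codon blocks."""
    aa = CODON_TABLE[codon]
    return aa + codon[:2] if aa in "LS" else aa


def rscu(cds: str) -> dict:
    """Compute RSCU for every sense codon of an in-frame coding sequence."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # Stop codons and codons with ambiguous bases are not counted.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(family(codon), []).append(codon)
    values = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        for c in members:
            # Observed count divided by the count expected under equal usage.
            values[c] = counts[c] * len(members) / total if total else 0.0
    return values


# Toy coding sequence (hypothetical): three CTA codons and one CTC codon.
toy = rscu("CTACTACTACTC")
```

An RSCU value above 1 marks a codon used more often than expected under equal usage within its family, and a value below 1 marks an under-used codon.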

Evolutionary analyses
The selection pressure was analyzed by calculating the ratio of Ka/Ks across Lamprologus and Neolamprologus for each aligned PCG (Fig. 5) (Yang and Nielsen 2002). Among the 13 PCGs, atp8 showed the largest Ka/Ks value, which suggested greater amino acid variation in the encoded protein. This suggests that the atp8 gene might have evolved faster than other PCGs due to relaxed selection pressure (Hassanin et al. 2005). The faster evolution of the atp8 gene could result in greater amino acid diversity, indicating its potential as an effective marker for population classification. The Ka/Ks values for all PCGs were lower than 1, suggesting that purifying selection was likely the main driver of mitochondrial PCG evolution (Hurst et al. 2002). Besides the Ka/Ks analysis, the degree of divergence in Lamprologus and Neolamprologus was assessed by analyzing the overall p-distance between nucleotide sequences of the 13 PCGs + two rRNA genes (Fig. 6). The p-distance results indicated that the fastest-evolving gene was nad6, which was inconsistent with the Ka/Ks results. However, the divergence observed in this gene might not be directly comparable with the selection signal, since selection acts over a contemporary period.
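The p-distance used above is the proportion of aligned sites at which two sequences differ. A minimal Python sketch of that calculation follows; the function name `p_distance` and the toy sequences are hypothetical, and skipping gaps and ambiguous bases is an assumption of this sketch. Ka/Ks estimation requires a codon substitution model and is not sketched here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in valid and y in valid:  # skip gaps and ambiguous bases
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned pair (hypothetical): 2 differences over 9 comparable sites.
d = p_distance("ATGCCGTA-A", "ATGACGTTCA")
```

Because p-distance counts all nucleotide differences, synonymous changes contribute as much as nonsynonymous ones, which is one reason a gene can rank differently by p-distance and by Ka/Ks.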

Ribosomal RNA genes, transfer RNA genes, and control regions
The size of the rrnS genes ranged from 943 bp (L. meleagris, N. brevis, and N. caudopunctatus) to 946 bp (N. similis), while the size of the rrnL genes in the seven species ranged from 1,652 bp (N. similis) to 1,671 bp (N. brevis) (Table 1). The two rRNA genes were located between trnF and trnL2, with trnV separating them. The A + T content of the rRNAs ranged from 53.0% to 54.1% (Table 2).
The sizes of the tRNA genes ranged from 66 bp (trnY of N. caudopunctatus) to 74 bp (trnK). The combined length of the 22 tRNA genes varied between 1,552 bp (N. caudopunctatus) and 1,554 bp (L. kungweensis, L. ornatipinnis, and N. similis). The A + T contents of tRNA genes ranged from 54.7% to 55.8% among the seven species analyzed in this study (Table 2).
As in other fish mitogenomes, the CRs were located between trnF and trnP in all seven species. The sizes of the CRs ranged from 884 bp (N. similis) to 891 bp (L. kungweensis). The A + T contents of the PCG, tRNA, and rRNA sequences were similar to that of the entire mitogenomes, whereas the CR sequences had a higher A + T content (62.5–63.8%) (Table 2).

Phylogenetic analysis
To elucidate the phylogenetic inter-relationships within the family Cichlidae and the genera Lamprologus and Neolamprologus, concatenated nucleotide sequences of 13 PCGs + two rRNAs from 103 cichlid species were obtained. Additionally, Channa andrao and Hyphessobrycon sweglesi from two other families were used as outgroups. BI and ML analyses generated the same topology at most nodes (Fig. 7).
Specifically, the seven complete mitogenomes from this study, covering two genera, clustered well in the phylogenetic trees, and within the family Cichlidae, the subfamilies Etroplinae and Ptychochrominae were monophyletic across analyses. They diverged from species in other subfamilies early in the evolutionary history of cichlid fishes. This result was similar to a previous molecular phylogenetic study (Astudillo-Clavijo et al. 2022). Thirty-one species from the subfamilies Astronotinae, Cichlasomatinae, Cichlinae, Geophaginae, and Retroculinae were clustered into one branch, indicating that these five subfamilies were closely related. Moreover, 67 Pseudocrenilabrinae species also formed a monophyletic clade. Pseudocrenilabrinae tribes and their interrelationships were for the most part well supported, as reported by Astudillo-Clavijo et al. (2022). Due to the addition of the seven newly sequenced mitogenomes, three sister pairs (N. brichardi + N. leleupi, N. caudopunctatus + L. meleagris, and L. signatus + L. kungweensis) were newly identified, as shown in Fig. 7. Lamprologus meleagris did not cluster with Lamprologus species, but with species from the genus Neolamprologus. Previous studies have identified such taxonomic issues in the genera Lamprologus and Neolamprologus (Schelly et al. 2006; Sturmbauer et al. 2010). Sturmbauer et al. (2010) suggested that a viable solution might be to re-assign the genus name Lamprologus to most Neolamprologus species. Our results also support this scenario. However, the species from the genera Lamprologus and Neolamprologus used in this study were limited, making it impossible to perform a more detailed analysis. Therefore, to better understand the relationships between members of these two genera, it will be beneficial to include more species in future studies.
In conclusion, our study expanded the available mitogenome data for Cichlidae and showed that mitogenome sequences are efficient molecular markers for studying the phylogenetic relationships within Cichlidae. However, analyses of nuclear genes are lacking. We will address this limitation in future studies.

Figure 1. The gene maps of the seven newly sequenced mitogenomes. Different gene types are shown in different colors.

Figure 2. A + T content vs AT-skew and G + C content vs GC-skew in the 103 mitogenomes of family Cichlidae.Values are calculated on H-strands for full-length mitogenomes.

Figure 3. Start codon and stop codon usage for the mitochondrial genome protein-coding genes of 103 cichlid species.

Figure 4. The codon distribution and RSCU of the seven newly sequenced mitogenomes.

Figure 5. Ka/Ks values for the 13 PCGs. Pale pink box plots, five species of genus Neolamprologus; orange box plots, four species of Lamprologus; blue box plots, nine species of Lamprologus and Neolamprologus. The band inside the box represents the median; lower and upper hinges correspond to the 25th and 75th percentiles; circles, to outliers.

Figure 6. Genetic p-distances for nucleotide sequences of 13 PCGs and two rRNAs. Pale pink box plots, five species of Neolamprologus; orange box plots, four species of Lamprologus; blue box plots, nine species of genera Lamprologus and Neolamprologus.

Figure 7. Phylogenetic tree of 103 cichlid species and two outgroups based on 13 PCGs. Numbers at nodes represent the posterior probability and bootstrap values for BI and ML analysis, respectively. "-" indicates that the clade was not supported by BI or ML analysis.

Table 2. Base compositions of the complete genomes, PCGs, rRNAs, tRNAs, and CRs of the seven newly sequenced mitogenomes.