Characterization of the complete mitochondrial genome of Parabreviscolex niepini Xi et al., 2018 (Cestoda, Caryophyllidea)

Abstract Parabreviscolex niepini is a recently described caryophyllidean monozoic tapeworm from schizothoracine fish on the Tibetan Plateau. In the present study, the complete mitochondrial genome of P. niepini is determined for the first time. The mitogenome is 15,034 bp in length with an A+T content of 59.6%, and consists of 12 protein-encoding genes, 22 tRNA genes, two rRNA genes, and two non-coding regions. The secondary structures of the tRNAs exhibit the conventional cloverleaf form, except for trnS1(AGN) and trnR, which lack DHU arms. The anti-codon of trnS1(AGN) in the mitogenome of P. niepini is TCT. The two major non-coding regions, 567 bp and 1428 bp in size, are located between trnL2 and cox2, and between trnG and cox3, respectively. The gene order of P. niepini is consistent with that of other caryophyllideans. Phylogenetic analysis based on mitogenomic data indicates that P. niepini is closely related to the tapeworms Breviscolex orientalis and Atractolytocestus huronensis.


Introduction
The Caryophyllidea is an ancient group of tapeworms, consisting of four families, 42 genera, and approximately 190 species parasitic in cypriniform and siluriform fishes in most zoogeographical regions (Scholz and Oros 2017). Some caryophyllideans, especially those in cyprinids (e.g. Khawia sinensis Hsü, 1935), cause severe fish diseases. Their simplified morphology and the limited number of morphological characters make species identification and taxonomic classification problematic. Recent research found that the present classification of caryophyllideans does not reflect their natural phylogenetic relationships (Brabec et al. 2012; Xi et al. 2018). Further studies are needed to reconstruct the taxonomic system. Its maternal inheritance and rapid evolution make mitochondrial DNA a powerful marker for phylogenetic studies and species identification in tapeworms (e.g. Brabec et al. 2012; Li et al. 2017).
Parabreviscolex Xi, Oros, Chen & Xie, 2018 is a recently erected genus in the family Capingentidae Hunter, 1930 (Cestoda: Caryophyllidea), with the type species Parabreviscolex niepini Xi, Oros, Chen & Xie, 2018 from schizothoracine fish on the Tibetan Plateau (Xi et al. 2018). The historical uplift of the Tibetan Plateau has caused significant differentiation of the Tibetan biotas, resulting in many endemic species. The evolution and adaptation processes of those species have attracted much attention. In this study, the complete mitogenome of P. niepini was sequenced, which may provide useful information for better understanding the evolution and taxonomy within caryophyllideans.

Specimen collection and DNA extraction
Specimens of Parabreviscolex niepini were collected from the schizothoracine fish Schizopygopsis younghusbandi Regan, 1905 in the Yarlung Tsangpo River at Linzhi (29°39'N, 94°21'E), Tibet, China. The specimens were fixed in 100% ethanol and stored at the Freshwater Fisheries Research Center, Chinese Academy of Fishery Sciences. Total genomic DNA was extracted using a TIANamp Micro DNA Kit (Tiangen Biotech, Beijing, China), according to the manufacturer's instructions. DNA was stored at -20 °C for further molecular analyses.

Sequence annotation and analyses
The amplified fragments were quality-proofed and searched with BLASTN (Altschul et al. 1990) to confirm that they were the actual target sequences. The complete mitochondrial genomic sequence of P. niepini was assembled manually in a stepwise manner using the DNAstar v7.1 program (Burland 2000). To determine the gene boundaries, it was aligned against the reference mitogenomic sequence of Atractolytocestus huronensis Anthony, 1958 (KY486754) using the program MAFFT 7.149 (Katoh and Standley 2013) integrated with Geneious (Kearse et al. 2012). The mitogenome was annotated and characterized mainly following previous descriptions (Zhang et al. 2017a, b; Zou et al. 2017; Li et al. 2018). Protein-coding genes (PCGs) were found by searching for ORFs (employing genetic code 9, the echinoderm and flatworm mitochondrial code) and checking the nucleotide alignments against the reference genome in Geneious. All tRNAs were identified and confirmed with the ARWEN (Laslett and Canback 2008) and MITOS (Bernt et al. 2013) web servers. Similarly, rrnL and rrnS were initially found using MITOS, and their boundaries were determined from alignments with the reference genome in Geneious. The NCBI submission file and tables with statistics for mitogenomes were generated using a GUI-based program, MitoTool (Zhang 2016b). Tandem Repeats Finder (Benson 1999) was employed to find tandem repeats in the non-coding regions. A nucleotide composition table was then used to draw the line graph of A+T content in ggplot2 (Hadley 2009). Codon usage and relative synonymous codon usage (RSCU) for the twelve PCGs of P. niepini were computed and sorted using MitoTool, and finally imported into ggplot2 to draw the RSCU figure. Scatter diagrams for the principal component analysis (PCA) and nucleotide skews were also drawn in ggplot2. Input files for the PCA of codon usage pattern, as well as analyses of amino acid usage pattern and nucleotide skews, were generated by MitoTool.
PASW 18.0 (Allen and Bennett 2010) was used to conduct a principal component analysis and generate data for the scatter diagram.

Phylogeny and gene order
Phylogenetic analyses were undertaken using nucleotide sequences of all 36 genes of the newly sequenced mitogenome of P. niepini and 36 selected cestode mitogenomes available in GenBank (Suppl. material 2). The mitogenomic sequences of Khawia sinensis (NC_034800/KR676560) and Caryophyllaeus brachycollis Janiszewska, 1953 (NC_035430/KT028770) from the common carp, sequenced and deposited in GenBank by the same researchers (Feng et al. 2017), were reassigned herein as Khawia sp. 1 and Khawia sp. 2, respectively, because the species identifications were questionable. Caryophyllaeus brachycollis mainly infests the cyprinid genera Barbus and Abramis in European countries, while its occurrence in China is rare (Barčák et al. 2014). We consider that the researchers misidentified the two common tapeworms Khawia sinensis and Khawia japonensis (Yamaguti, 1934) from the common carp.
Two trematode species, Dicrocoelium dendriticum (Rudolphi, 1819) (NC_025280) and Dicrocoelium chinensis Tang & Tang, 1978 (NC_025279), were used as outgroups. The nucleotide sequences for all 12 PCGs, two rRNAs, and 22 tRNAs were extracted from GenBank files. The PCGs were translated into amino acid sequences (employing genetic code 9) using MitoTool, and aligned in batches with MAFFT integrated into another GUI-based program, BioSuite (Zhang 2016a), using codon-alignment mode. RNAs were aligned in structural alignment mode using the Q-INS-i algorithm incorporated into the MAFFT-with-extensions software. BioSuite was then used to concatenate these alignments and to remove ambiguously aligned fragments from the concatenated alignments using another plug-in program, Gblocks 0.91b (Talavera and Castresana 2007). Phylogenetic analyses were conducted using maximum likelihood (ML) and Bayesian inference (BI) methods. Selection of the most appropriate evolutionary model for the dataset was carried out using ModelFinder (Kalyaanamoorthy et al. 2017). Based on the Bayesian information criterion, GTR+I+G was chosen as the optimal model for both analyses. ML analysis was performed in raxmlGUI (Silvestro and Michalak 2011) using the ML + rapid bootstrap (BP) algorithm with 1000 replicates. BI analysis was performed in MrBayes 3.2.6 (Ronquist et al. 2012) with default settings and 6 × 10^6 Metropolis-coupled MCMC generations.
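The model selection step can be illustrated with a minimal Python sketch of the Bayesian information criterion, BIC = k ln(n) − 2 lnL, where k is the number of free parameters, n the number of alignment sites, and lnL the log-likelihood. This is not ModelFinder itself; the function names, the candidate log-likelihoods, the parameter counts, and the alignment length below are all hypothetical values chosen for illustration only.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates: dict, n_sites: int) -> str:
    """Return the name of the candidate model with the lowest BIC.

    `candidates` maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))


# Hypothetical values, NOT taken from the study.
candidates = {
    "JC": (-10500.0, 1),
    "HKY+G": (-10200.0, 6),
    "GTR+I+G": (-10100.0, 10),
}
print(best_model(candidates, n_sites=5000))
```

In this toy comparison the gain in log-likelihood of the parameter-rich model outweighs its BIC penalty, which is the situation in which a model such as GTR+I+G is preferred.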

Selection analyses
To determine lineage-specific positively selected sites in individual mitochondrial PCGs, the branch-site model implemented in CodeML within the PAML package (Yang 2007) was used. The resultant ML and/or BI tree (unrooted tree with outgroups removed) was employed for the analysis. The alternative model, MA, fixes ω at 1 for each branch except for the specified branch leading to P. niepini (foreground branch), wherein ω is presumed to be greater than 1. The first null model, MAnull, fixes ω at 1 for every branch in the tree, whereas the second null model, M1a, fixes ω at 1 for every branch except for the foreground branch, where ω is assumed to be in the range 0 to 1. The null and alternative models were compared via a likelihood ratio test (LRT), and positive selection was confirmed when P<0.05. Comparing MA to MAnull can estimate positive selection, while comparing MA to M1a can identify instances of relaxation of selective constraints as well as positive selection (Láruson 2017). The posterior probability (≥ 95%) from the Bayes Empirical Bayes (BEB) method was used to identify positively selected sites (Yang 2005).
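The likelihood ratio test described above can be sketched in a few lines of Python. The sketch assumes that the test statistic 2(lnL_alternative − lnL_null) is compared against a χ² distribution with one degree of freedom, a common convention for the MA versus MAnull comparison; the function names and the two log-likelihood values are hypothetical and are not results of this study. For one degree of freedom the χ² survival function has the closed form erfc(√(x/2)), so no external library is needed.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> tuple:
    """Return (test statistic, P value) for a nested model comparison with df = 1."""
    # A negative difference can arise from optimisation noise; clamp it to zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf_df1(stat)


# Hypothetical log-likelihoods, e.g. as read from two CodeML output files.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-997.0)
print(f"2*delta lnL = {stat:.2f}, P = {p_value:.4f}")
print("significant at P<0.05" if p_value < 0.05 else "not significant")
```

With these illustrative numbers the statistic is 6.0, which exceeds the 5% critical value of about 3.84, so the null model would be rejected.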

Genome organization and base composition
The closed-circular mitochondrial genome of Parabreviscolex niepini is 15,034 bp in size (GenBank accession number: MG674140). The mitogenome is composed of 12 protein-encoding genes (PCGs), 22 tRNA genes, two rRNA genes, and two non-coding regions (NCRs), and it lacks the atp8 gene (Fig. 1). As is common in flatworms, all genes are transcribed from the same strand (Le et al. 2002). Eight overlapping regions and 16 intergenic regions were found in the genome (Table 1). In accordance with other caryophyllidean species, the A+T content of the whole genome (59.6%) and of its elements is lower than in the segmented cestodes (Fig. 2d). The mitogenome of P. niepini exhibits G-skew and T-skew, which is also the case in other cestodes (Fig. 2a). However, the unsegmented cestodes appear to exhibit less mutation bias than segmented cestodes (lower GC-skew and higher AT-skew values, Fig. 2a).
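For readers unfamiliar with the composition statistics reported here, the following Python sketch computes A+T content and nucleotide skews under the usual definitions, AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C). A negative AT-skew corresponds to a T-skew and a positive GC-skew to a G-skew. The function name and the short test sequence are hypothetical; the study itself generated these values with MitoTool.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Return A+T content (%) and AT/GC skews for a nucleotide sequence.

    Ambiguous characters (anything other than A, C, G, T) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")

    def skew(x: int, y: int) -> float:
        # Defined as 0 when neither base is present, to avoid division by zero.
        return (x - y) / (x + y) if (x + y) else 0.0

    return {
        "at_content": 100.0 * (a + t) / total,
        "at_skew": skew(a, t),
        "gc_skew": skew(g, c),
    }


# Toy sequence for illustration only: 2 A, 3 T, 4 G, 1 C.
stats = composition_stats("AATTTGGGGC")
print(stats)
```

The toy sequence has more T than A and more G than C, so it shows the same direction of skew (T-skew and G-skew) as described for the mitogenome.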
Codon usage, RSCU, and codon family proportion (corresponding to the amino acid usage) of P. niepini were investigated (Suppl. material 5). The four most abundant codon families (Phe, Val, Leu2, and Gly) encompass 38.41% of all codon families. Among these codon families, G+T-rich codons are favored over synonymous codons with lower G+T content in P. niepini (Suppl. material 5). This G+T preference corresponds well with the relatively high G+T content (Suppl. material 3) as well as the G and T preference in the skewness analysis for PCGs (Suppl. material 2). Additionally, the principal component analysis (PCA) suggested that the overall amino acid usage patterns of the unsegmented cestodes (except for Khawia sinensis KY486753) were apparently different from those of segmented cestodes (Fig. 2c). Notably, in contrast to segmented cestodes, which have notably heightened A+T content at the third codon position, these unsegmented cestodes (except Khawia sinensis KY486753) exhibit lower and/or similar A+T content compared to other elements of the mitogenome (Fig. 2d).
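As an illustration of how RSCU values of this kind are obtained, the sketch below counts codons in a coding sequence and divides each observed count by the count expected if all synonymous codons of that amino acid were used equally. The codon table is built from the NCBI translation table 9 (echinoderm and flatworm mitochondrial code). Two simplifying assumptions should be noted: the function name and input sequence are hypothetical, and synonymous codons are grouped by amino acid, so Leu and Ser are each treated as a single family rather than being split into Leu1/Leu2 and Ser1/Ser2 as in the codon family analysis above.

```python
from collections import Counter, defaultdict

# NCBI translation table 9, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE_9 = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for an in-frame coding sequence.

    Stop codons, incomplete trailing codons and codons with ambiguous bases
    are ignored. Amino acids absent from the sequence are omitted.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if CODON_TABLE_9.get(c, "*") != "*")

    families = defaultdict(list)
    for codon, aa in CODON_TABLE_9.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0:
            continue
        expected = total / len(members)  # count under equal synonymous usage
        for codon in members:
            values[codon] = counts[codon] / expected
    return values


# Toy example: three Phe codons (TTT, TTT, TTC).
example = rscu("TTTTTTTTC")
print({codon: round(value, 3) for codon, value in example.items()})
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, which is how a preference for G+T-rich codons would appear in such a table.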

Transfer and ribosomal RNA genes
The two rRNA genes, rrnL and rrnS, are 953 and 707 bp in size, with 59.6% and 60.4% A+T content, respectively (Suppl. material 3). All 22 commonly found tRNAs are present in the mitochondrial genome of P. niepini, ranging from 58 bp (trnQ and trnR) to 66 bp in size (trnN, trnW, trnT, and trnL1), and adding up to 1381 bp in total concatenated length (Table 1 and Suppl. material 2). All of the secondary structures (predicted by MITOS and ARWEN) exhibit the conventional cloverleaf structure, except for trnS1 (AGN) and trnR, which lack DHU arms. The unorthodox trnS1 (AGN) and trnR have also been reported in the Caryophyllidea (Li et al. 2017) and the Anoplocephalidae (Guo 2016). Additionally, the anti-codon of trnS1 (AGN) in the mitogenome of P. niepini is TCT, in contrast to other eucestodes, which use GCT, except for Khawia sinensis (KY486753) (Suppl. material 4).

Phylogeny and gene order
The phylogenetic topologies constructed using the BI and ML methods show concordant branches and high statistical support. All bootstrap support values (BS) are higher than 68 and Bayesian posterior probabilities (BPP) are higher than 0.96. Parabreviscolex niepini exhibits the closest phylogenetic relationship with Breviscolex orientalis Kulakovskaya, 1962 and Atractolytocestus huronensis with robust support (Fig. 4). Moreover, the similarity of the codon usage pattern (Fig. 2b) lends further support to the phylogenetic affinity of P. niepini, B. orientalis, and A. huronensis. The mitochondrial gene arrangement of P. niepini (Fig. 1) is consistent with the pattern found in the Caryophyllidea, and clearly differs from that of the segmented tapeworms (fig. S3 of Li et al. 2017).

Branch-site analysis
Based on the criteria of a posterior probability ≥ 95% in the BEB analysis and a significant likelihood ratio test (LRT) (P<0.05), the branch-site model tests identified the amino acid positions V(6) and H(49) of P. niepini cytb as being under positive selection (Suppl. material 6, Table 2). Moreover, several sites in nad4, nad5, and cox3 were identified as exhibiting relaxed selective pressure (Table 2).

Discussion
The different evolutionary rates of individual genes render phylogenetic analyses of cestodes complicated or unreliable for some taxa; however, complete mtDNA data are considered to provide the best estimate of interrelationships (Waeschenbach et al. 2012). So far, the amount of mitogenome data available has been limited. In this study, we sequenced and characterized the sixth caryophyllidean mitogenome. The phylogenetic analysis conducted here placed caryophyllideans in the basal clade of eucestodes, and supported the position of unsegmented tapeworms as the earliest diverging group. In the caryophyllidean clade, the family Lytocestidae was found to be a polyphyletic group, with the lineages Khawia spp. and Atractolytocestus huronensis recovered as distantly related. Atractolytocestus huronensis clustered robustly with Parabreviscolex niepini and Breviscolex orientalis of the family Capingentidae. Thus, further study is needed to recircumscribe the Lytocestidae. The mitogenome of P. niepini showed the characters consistently found in unsegmented tapeworms by Li et al. (2017), and differed significantly from those of the segmented tapeworms in codon usage and gene order. The unsegmented caryophyllideans consist of two clades according to their fish hosts, cypriniform and siluriform (Xi et al. 2018). The tapeworms sequenced in the present study were all collected from cypriniform fishes; however, specimens from catostomid fishes have never been reported. Further studies are required to determine the similarity of the mitogenomes of caryophyllideans from catostomid fish.

Conclusions
In this study, the complete mitogenome of the tapeworm Parabreviscolex niepini from the schizothoracine fish Schizopygopsis younghusbandi was sequenced, annotated, and characterized. The analysis of mitogenome organization indicated that it follows a pattern similar to that of the caryophyllidean mitogenomes deposited in the GenBank database. Phylogenetic analysis based on mitogenomic data further confirmed the taxonomic validity of P. niepini and its closest evolutionary relationship with Breviscolex orientalis and Atractolytocestus huronensis.