The complete mitochondrial genome of Xizicus (Haploxizicus) maculatus revealed by Next-Generation Sequencing and phylogenetic implication (Orthoptera, Meconematinae)

Abstract Xizicus Gorochov, 1993, the quiet-calling katydid, is a diverse genus with 68 species worldwide, more than 45 of which occur in China, and it has undergone numerous taxonomic revisions with contradictory conclusions. In this study, the complete mitochondrial genome of Xizicus (Haploxizicus) maculatus, collected from Hainan for the first time, was sequenced using Next-Generation Sequencing (NGS) technology. The mitogenome is 16,358 bp long and shows the typical gene arrangement, base composition, and codon usage found in related species. The overall base composition of the mitochondrial genome is 37.0 % A, 32.2 % T, 20.2 % C, and 10.6 % G. All 13 protein-coding genes (PCGs) begin with a typical ATN initiation codon. Nine of the 13 PCGs have a complete termination codon, but the remaining four genes (COI, COIII, ND5, and ND4) terminate with an incomplete T. Phylogenetic analyses were carried out on the concatenated dataset of 13 PCGs and two rRNAs of the Tettigoniidae species available in GenBank. Both Bayesian inference and Maximum Likelihood analyses recovered each subfamily as a monophyletic group. Regardless of the position of Lipotactinae, the relationships among the subfamilies of Tettigoniidae were as follows: ((((Tettigoniinae, Bradyporinae) Meconematinae) Conocephalinae) Hexacentrinae). The tree topologies showed that Xizicus (Haploxizicus) maculatus is closer to Xizicus (Xizicus) fascipes than to Xizicus (Eoxizicus) howardi.


Introduction
The insect mitochondrial genome (mitogenome) is a small (15-20 kb), circular, double-stranded DNA molecule containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes (Boore 1999; Cameron 2014), and at least one large non-coding region involved in the control of replication and transcription (Clayton 1992; Fernandez-Silva et al. 2003). Recently, Next-Generation Sequencing (NGS) technology combined with bioinformatic annotation, which can generate mitogenomes without PCR, has led to a rapid increase in the number of sequenced mitogenomes (Cameron 2014). In recent years, an increasing number of complete mitochondrial genomes of Orthoptera have been sequenced (Yang et al. 2016; Zhou et al. 2014). Up to now, 24 Tettigoniidae (Orthoptera) mitogenomes have been reported or registered in GenBank, including only five Meconematinae mitogenomes.
Xizicus Gorochov, 1993 is a diverse genus in Meconematinae with 68 species in the world. According to the Orthoptera Species File (OSF) website (Cigliano et al. 2018), it is currently divided into seven subgenera by traditional taxonomy based on comparative morphology: Xizicus s. str., Axizicus, Eoxizicus, Furcixizicus, Haploxizicus, Paraxizicus, and Zangxizicus. However, the defining characteristics and taxonomic status of these subgenera remain highly controversial, and the assignment of some species has changed frequently through numerous taxonomic revisions with contradictory conclusions (Gorochov 1998; Jiao et al. 2013; Wang et al. 2014). Integrative taxonomy, which combines multiple kinds of data and complementary perspectives (e.g. morphology, ecology and DNA sequences), has been recognized as a particularly efficient approach to species delimitation and genus-level classification (Cruz-Barraza et al. 2012; Heneberg et al. 2015; Korshunova et al. 2017). The mitogenome is considered a powerful marker and is extensively applied to resolve metazoan phylogenetic relationships at both deep and shallow taxonomic levels (Cameron 2014; Zhou et al. 2017).
To date, the mitogenomes of Xizicus (Xizicus) fascipes and Xizicus (Eoxizicus) howardi have been sequenced (Yang et al. 2012; Liu 2017). Xizicus (Haploxizicus) maculatus (Xia & Liu, 1993) is a representative species of the subgenus Haploxizicus, distributed in Hunan, China, and was originally described by Xia and Liu (1993) within the genus Xiphidiopsis. Later, Gorochov (1998) proposed the genus Axizicus and transferred Xiphidiopsis maculatus to this new genus. Wang et al. (2014) subsequently transferred it to the new subgenus Haploxizicus, mainly on the basis of the vertex disc with four longitudinal bands and the simple cercus, and this classification was adopted by the Orthoptera Species File website (Cigliano et al. 2018). The type locality of X. (H.) maculatus is Cili County, Hunan Province. The specimen used in this study is the first collected from Hainan Province. In this study, we provide a thorough description of the complete mitochondrial genome of X. (H.) maculatus and compare its relative synonymous codon usage with that of species from two other subgenera. Additionally, phylogenomic analyses were conducted on the mitochondrial genome data of Tettigoniidae available in GenBank, with the aim of investigating the phylogenetic position of X. (H.) maculatus and better understanding the phylogenetic relationships within Tettigoniidae.

Taxon sampling and sequencing
The specimen of X. (H.) maculatus was collected at Jianfengling in Hainan, China in June 2017 and stored in 100 % ethanol at −4 °C. Genomic DNA was extracted from the leg muscle tissue of a single adult male specimen using a DNeasy Blood & Tissue Kit (Qiagen, USA) according to the manufacturer's instructions and sent to a company for library preparation and sequencing (Genesky Biotechnologies Inc., Shanghai). The library was prepared using a TruSeq DNA Sample Preparation Kit (Vanzyme, China) and sequenced with 150 bp paired-end reads on the Illumina HiSeq 2500 sequencing platform (Illumina, USA).

De novo assembly and annotation of the Xizicus (Haploxizicus) maculatus mitogenome
A total of 13,799,778 raw reads were generated on the Illumina HiSeq 2500 platform. The raw paired-end reads were filtered to obtain high-quality clean reads using CLC Genomics Workbench 8 (CLC Bio, Aarhus, Denmark) with default parameters. The filtered reads were then assembled with MITObim v1.8 (Hahn et al. 2013) and Mira 4.0.2 (Chevreux et al. 2004), using the mitochondrial genome of X. (X.) fascipes (JQ326212) as the reference. In total, 32,608 clean mitochondrial reads yielded an average coverage of 255.4X. The complete mitochondrial genome sequence was annotated in Geneious v10.1.2 (Biomatters Ltd., Auckland, New Zealand) by comparison with the mitochondrial genome of X. (X.) fascipes (JQ326212). The tRNA genes were predicted using the online software MITOS (Bernt et al. 2013). The relative synonymous codon usage (RSCU) values of the PCGs were analysed with MEGA v7 (Kumar et al. 2016).
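The reported coverage can be related to the read count and genome length through the usual depth relation, depth = (number of reads × mean read length) / genome length. The Python sketch below is illustrative only: the mean read length after filtering is not stated in the text, so the second function back-calculates the value implied by the reported figures. Both function names are hypothetical.

```python
def mean_depth(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average sequencing depth: total read bases divided by genome length."""
    return n_reads * mean_read_len / genome_len


def implied_read_len(depth: float, n_reads: int, genome_len: int) -> float:
    """Mean read length implied by a reported depth (inverse of mean_depth)."""
    return depth * genome_len / n_reads


# Values reported in the text
N_READS = 32_608        # clean mitochondrial reads
GENOME_LEN = 16_358     # mitogenome length in bp
REPORTED_DEPTH = 255.4  # reported average coverage

# Upper bound on depth if every read kept its full 150 bp (assumption: no trimming)
max_depth = mean_depth(N_READS, 150, GENOME_LEN)

# Mean read length that would reproduce the reported coverage
read_len = implied_read_len(REPORTED_DEPTH, N_READS, GENOME_LEN)

print(f"{max_depth:.1f}X at 150 bp; implied mean read length {read_len:.1f} bp")
```

Under these assumptions the reported coverage corresponds to a mean read length somewhat below 150 bp, which would be expected after quality filtering.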

Phylogenetic analyses
Phylogenetic analyses were performed on the concatenated dataset of PCGs and rRNAs from the newly sequenced mitogenome and 24 Tettigoniidae species downloaded from GenBank, with two Phaneropteridae taxa (Ducetia japonica and Phyllomimus sinicus) selected as outgroups. Each protein-coding gene was aligned on the basis of its amino acid alignment using MEGA v7.0 (Kumar et al. 2016), and the resulting alignments were then concatenated. Bayesian inference (BI) was performed with MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003) under the partitioned models chosen by PartitionFinder 2 (Lanfear et al. 2017). The BI analysis was run for 10,000,000 generations with four Markov chains, and trees were sampled every 1000 generations after a burn-in step. Confidence values for the BI tree were expressed as Bayesian posterior probabilities in percentages. A maximum likelihood (ML) tree was constructed using RAxML 8.0 (Stamatakis 2014), with the optimal partitions and best models also selected by PartitionFinder 2. The robustness of the ML results was tested by bootstrap analysis with 1000 replicates in RAxML, and the bootstrap support values were printed on the best ML tree.
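The concatenation step can be sketched as follows. This is a minimal Python illustration, not the workflow actually used in the study: the function name, the gap-padding of taxa that lack a gene, and the toy sequences are all assumptions. It joins per-gene alignments into one supermatrix and records the 1-based boundaries of each gene, which is the kind of block information that partitioning tools take as input.

```python
from typing import Dict, List, Tuple


def concatenate(
    alignments: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    alignments maps gene name -> {taxon: aligned sequence}.
    Returns the concatenated sequences and 1-based (gene, start, end) partitions.
    Taxa missing from a gene are padded with gaps (an assumption of this sketch).
    """
    taxa = sorted({t for aln in alignments.values() for t in aln})
    matrix: Dict[str, List[str]] = {t: [] for t in taxa}
    partitions: List[Tuple[str, int, int]] = []
    pos = 0
    for gene, aln in alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        n = lengths.pop()
        for t in taxa:
            matrix[t].append(aln.get(t, "-" * n))
        partitions.append((gene, pos + 1, pos + n))
        pos += n
    return {t: "".join(parts) for t, parts in matrix.items()}, partitions


# Toy input (hypothetical sequences, not real data)
toy = {
    "COI": {"taxonA": "ATGAAA", "taxonB": "ATGAAG"},
    "ND1": {"taxonA": "ATTTAA"},
}
supermatrix, partitions = concatenate(toy)
for gene, start, end in partitions:
    print(f"{gene} = {start}-{end}")
```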

Genome organization
The complete mitogenome of X. (H.) maculatus is 16,358 bp in length and has been deposited in GenBank under accession no. MG779499. The X. (H.) maculatus mtDNA is larger than those of other Meconematinae species, which range from 16,044 bp (Zhou et al. 2017) to 16,166 bp (Yang et al. 2012).
The mitochondrial genome structure is detailed in Table 1. It contains the typical gene content of metazoan mitogenomes: 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes, and one control region (Table 1, Fig. 1). Gene order and arrangement are identical to those of the X. (X.) fascipes mitogenome. The X. (H.) maculatus mitochondrial genes are separated by a total of 65 bp of intergenic spacer sequence, spread over ten regions ranging in size from 1 to 17 bp. There are 13 gene overlaps totaling 48 bp; the longest overlaps, of 8 bp each, are located between tRNA Trp and tRNA Cys and between tRNA Tyr and COI. The overall base composition of the whole mitochondrial genome is 37.0 % A, 32.2 % T, 20.2 % C, and 10.6 % G, showing an obvious bias against G and an AT bias (69.2 %) that is slightly lower than in X. (X.) fascipes (70.2 %) and X. (E.) howardi (71.0 %).
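The composition figures above can be checked with a few lines of Python. The sketch below recomputes the A+T content from the reported percentages and, as an addition not reported in the text, the AT-skew and GC-skew, two commonly used summary statistics defined as (A − T)/(A + T) and (G − C)/(G + C). The function name is hypothetical.

```python
def composition_stats(a: float, t: float, c: float, g: float) -> dict:
    """Summarise base composition from percentages (or raw counts) of A, T, C, G."""
    return {
        "AT_content": a + t,
        "AT_skew": (a - t) / (a + t),  # > 0 means more A than T
        "GC_skew": (g - c) / (g + c),  # < 0 means less G than C
    }


# Percentages reported in the text for the whole mitogenome
stats = composition_stats(a=37.0, t=32.2, c=20.2, g=10.6)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With the reported values, the negative GC-skew is one way to quantify the bias against G noted in the text.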

Protein-coding genes
The total length of all 13 PCGs was 11,224 bp, and the overall A+T content of X. (H.) maculatus PCGs was 68.3 %. The initiation codons of all PCGs were the typical ATN. Nine PCGs had complete termination codons, whereas COI, COIII, ND5, and ND4 had an incomplete termination codon T (Table 1). Incomplete termination codons are common in metazoan mitochondrial genomes and become functional when post-transcriptional polyadenylation converts them into a complete stop codon (Ojala et al. 1981).
The four most frequently used amino acids in X. (H.) maculatus were Leu (16.1 %), Ser (8.9 %), Phe (8.7 %), and Ile (8.4 %), in proportions similar to those observed in other Tettigoniidae species. All codons were present in the protein-coding genes of this mitogenome. Excluding incomplete termination codons, there were 3,735 codons in the X. (H.) maculatus protein-coding genes. Codon usage in X. (H.) maculatus appeared to be typical of insect mitochondrial sequences. The RSCU analysis indicated that codons with A or T at the third position were always overused relative to their synonymous codons in Xizicus (Fig. 2). This pattern of codon usage is consistent with the overall nucleotide bias.
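For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the mean count of all codons encoding the same amino acid, so values above 1 indicate overuse. The Python sketch below illustrates the calculation on a toy sequence. It is not the MEGA implementation: the function names are hypothetical, and only two two-codon families (Phe and Ile under the invertebrate mitochondrial code) are included for brevity.

```python
from collections import Counter
from typing import Dict, List


def count_codons(cds: str) -> Counter:
    """Count in-frame codons, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts: Counter, families: Dict[str, List[str]]) -> Dict[str, float]:
    """RSCU = observed count / mean count over the synonymous codon family."""
    out: Dict[str, float] = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined
        expected = total / len(codons)
        for c in codons:
            out[c] = counts[c] / expected
    return out


# Subset of synonymous families (invertebrate mitochondrial code)
FAMILIES = {"Phe": ["TTT", "TTC"], "Ile": ["ATT", "ATC"]}

# Toy coding sequence (hypothetical, not taken from the mitogenome)
toy_cds = "TTT" * 3 + "TTC" + "ATT" * 2 + "ATC" * 2
values = rscu(count_codons(toy_cds), FAMILIES)
for codon, value in sorted(values.items()):
    print(codon, round(value, 2))
```

In the toy sequence, TTT is used three times and TTC once, so TTT receives an RSCU of 1.5 and TTC 0.5, while the evenly used Ile codons both score 1.0.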

Ribosomal and transfer RNA genes
The tRNA genes ranged from 63 to 71 bp in length, and the relative location of each tRNA is shown in Table 1. All tRNA genes had the typical cloverleaf secondary structure except for tRNA Ser(AGN). The secondary structure of tRNA Ser(AGN) was identical to that of X. (X.) fascipes and showed a lengthened anticodon stem (9 bp) with a bulging nucleotide in the middle, an unusual 6 bp T-stem, a mini DHU arm (2 bp), and no connector nucleotides (Yang et al. 2012).
The lrRNA and srRNA were 1,304 bp and 785 bp in length, respectively. They were located between tRNA Leu(CUN) and the A+T-rich region and were separated by tRNA Val.

Non-coding regions
The control region was 1,570 bp in length, was located between srRNA and tRNA Ile in the X. (H.) maculatus mitogenome, and was composed of 64.4 % A and T nucleotides (Fig. 1; Table 1). The control region is characterized by a high AT content and is thought to be involved in the regulation of mtDNA transcription and replication (Bae et al. 2004). Its size varies among species not only because of high rates of nucleotide substitution, insertion, or deletion, but also because of differences in the length of the tandem repeat unit and the number of tandem repetitions (Yang et al. 2012). Sequence analysis revealed four tandem repeats ranging in size from 26 to 162 bp, which together contributed 738 bp to the length of the region.
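The idea of a tandem repeat scan can be illustrated with a short Python sketch. The tool used for this analysis is not named in the text, so the function below is an assumption for illustration only: it reports exact, perfectly adjacent copies and allows no mismatches, which is simpler than dedicated repeat finders. The last lines recompute the share of the control region occupied by repeats from the reported lengths.

```python
from typing import List, Tuple


def find_tandem_repeats(
    seq: str, min_unit: int = 2, max_unit: int = 200, min_copies: int = 2
) -> List[Tuple[int, int, int]]:
    """Return (start, unit_length, copies) for exact tandem repeats.

    start is 0-based. At each position the unit giving the longest repeated
    span is kept, and the scan resumes after that span.
    """
    seq = seq.upper()
    hits: List[Tuple[int, int, int]] = []
    i = 0
    while i < len(seq):
        best = None
        longest_unit = min(max_unit, (len(seq) - i) // min_copies)
        for unit in range(min_unit, longest_unit + 1):
            motif = seq[i:i + unit]
            copies = 1
            # Extend while the next block of the same size repeats the motif
            while seq[i + copies * unit:i + (copies + 1) * unit] == motif:
                copies += 1
            if copies >= min_copies and (
                best is None or copies * unit > best[1] * best[2]
            ):
                best = (i, unit, copies)
        if best:
            hits.append(best)
            i += best[1] * best[2]
        else:
            i += 1
    return hits


# Toy sequence (hypothetical): three copies of ACGT flanked by other bases
repeats = find_tandem_repeats("CCACGTACGTACGTGA")
print(repeats)

# Share of the control region made up of tandem repeats (values from the text)
repeat_fraction = 738 / 1570
print(f"{repeat_fraction:.1%} of the control region")
```

From the reported lengths, tandem repeats account for roughly 47 % of the control region.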

Phylogenetic relationships
Bayesian and maximum likelihood analyses produced identical topologies using the best-fit partitioning scheme and site-homogeneous models, except for the position of Lipotactes tripyrga (Lipotactinae) and the relationships among genera within Meconematinae (Fig. 3). Bayesian inference recovered each of Tettigoniinae, Bradyporinae, Conocephalinae, and Meconematinae as a strongly supported monophyletic group (PP ≥ 0.93), whereas the monophyly of Conocephalinae and Meconematinae (including only the tribe Meconematini) was not well supported in the ML topology. Lipotactes tripyrga was basal in the ML tree, whereas it formed the most basal clade together with Hexacentrinae in the BI analysis. The relationship of Lipotactinae to the other subfamilies could not be resolved, possibly because only one taxon was included in the analyses.
In the present study, regardless of the position of Lipotactinae, the relationships among the subfamilies of Tettigoniidae were as follows: ((((Tettigoniinae, Bradyporinae) Meconematinae) Conocephalinae) Hexacentrinae). The relationships among subfamilies within Tettigoniidae are sensitive to the tree reconstruction method and the molecular marker used. A previous phylogenetic study based on multiple molecular markers (28S rDNA, 18S rDNA, COII, Histone and Wingless genes) produced strikingly different results, placing Hexacentrinae and Meconematinae as more 'advanced' groups sister to the clade consisting of Tettigoniinae and Lipotactinae, and Conocephalinae as the more 'primitive' group diverging at an earlier node (Mugleston et al. 2013). The present topologies placed Tettigoniinae at an apical node as sister to Bradyporinae, with this pair then grouping with Meconematinae. This is congruent with the results obtained using site-homogeneous models in both ML and BI analyses (Zhou et al. 2017) but differs from the tree obtained under the site-heterogeneous CAT-GTR model (Zhou et al. 2017). Recent studies suggest that the model used in mitochondrial phylogenetic reconstruction can affect the resulting phylogeny when the genomic data show compositional heterogeneity among lineages and saturation, because accelerated substitution rates cause homoplasy (Li et al. 2015; Zhou et al. 2017).