Research Article
Corresponding author: Mingsheng Yang (yms-888@163.com). Academic editor: Thomas Simonsen
© 2019 Mingsheng Yang, Bingyi Hu, Lin Zhou, Xiaomeng Liu, Yuxia Shi, Lu Song, Yunshan Wei, Jinfeng Cao.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Yang M, Hu B, Zhou L, Liu X, Shi Y, Song L, Wei Y, Cao J (2019) First mitochondrial genome from Yponomeutidae (Lepidoptera, Yponomeutoidea) and the phylogenetic analysis for Lepidoptera. ZooKeys 879: 137-156. https://doi.org/10.3897/zookeys.879.35101
The complete mitochondrial genome (mitogenome) of Yponomeuta montanatus is sequenced and compared with other published yponomeutoid mitogenomes. The mitogenome is circular, 15,349 bp long, and includes the typical metazoan mitochondrial genes (13 protein-coding genes, two ribosomal RNA genes, and 22 transfer RNA genes) and an A + T-rich region. All 13 protein-coding genes use a typical ATN start codon, with the single exception of cox1, which uses CGA across yponomeutoid mitogenomes. Comparative analyses further show that the secondary structures of tRNAs are conserved, including the loss of the dihydrouridine (DHU) arm in trnS1 (AGN), but that considerable nucleotide variation has occurred, mainly in the DHU arms and pseudouridine (TψC) loops. A + T-rich regions exhibit substantial length variation among yponomeutoid mitogenomes; conserved sequence blocks are recognized, but some of them are not present in all species. Multiple phylogenetic analyses confirm the position of Y. montanatus in Yponomeutoidea. However, the superfamily-level relationships recovered herein within the Macroheterocera clade of Lepidoptera differ considerably from those recovered in previous mitogenomic studies, underscoring the need for extensive phylogenetic investigation once more mitogenomes become available for this clade.
Mitogenome evolution, next-generation sequencing, protein-coding genes
The mitochondrial genome (mitogenome) is a circular and double-stranded molecule that usually encodes 37 genes (13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs)), and an A + T-rich region (
Lepidoptera is the second largest insect order after Coleoptera, with more than 157,000 extant species in 43 superfamilies (
Mitogenomic data from the major Yponomeutoidea lineages would play an important role in better understanding the evolution of the superfamily, and even of Lepidoptera as a whole. In the present study, we sequenced the complete mitogenome of Yponomeuta montanatus Moriuti, 1977, the first mitogenome from the family Yponomeutidae. Detailed comparative analyses were then conducted based on this and all other published yponomeutoid mitogenomes. In addition, extensive phylogenetic analyses using three datasets and three tree-construction methods were performed to test the phylogenetic implications of the Y. montanatus mitogenome for Lepidoptera phylogeny. This study contributes to a further understanding of mitogenome evolution and phylogeny in Yponomeutoidea and Lepidoptera.
Adult Y. montanatus specimens were sampled by light trap at Mountain Jigongshan, Henan, China in May 2018. Fresh specimens were stored in 95–100% ethanol in the field and then maintained at –80 °C until used for DNA extraction. Dry specimens were identified based on the morphological description and illustrations provided by
Next-generation sequencing methods were used to obtain the complete mitogenome sequence of Y. montanatus. Briefly, total genomic DNA was first quantified and fragmented to an average size of 400 bp using a Covaris M220 system with the Whole Genome Shotgun method (Covaris, Woburn, MA, USA). A library was then constructed using the TruSeq DNA PCR-Free Sample Preparation Kit (Illumina, USA). Finally, an Illumina HiSeq 2500 was used for sequencing with a 251 bp paired-end strategy.
A total of 3,707,876 raw paired reads were retrieved for Y. montanatus. FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) was used for quality control (avg. Q20 > 95.1%, avg. Q30 > 88.65%). After processing with AdapterRemoval v. 2 (
The MITOS webserver was employed to annotate the complete mitogenome sequence with the invertebrate genetic code (
To investigate phylogenetic implications of the Y. montanatus mitogenome in Lepidoptera phylogeny, a total of 33 mitogenomes representing 15 lepidopteran superfamilies with mitogenome available (Suppl. material
Maximum likelihood (ML) analyses were conducted using two methods. The raxmlGUI version 1.539 interface (
Bayesian inference (BI) analysis was performed using MrBayes v. 3.1.2 (
The complete mitogenome of Y. montanatus (GenBank accession number: MK256747) is circular, double-stranded, and 15,349 bp long (Fig.
Feature | Strand | Location | Size (bp) | Start codon | Stop codon | Anticodon | Intergenic nucleotides |
trnM | J | 1–67 | 67 | | | CAT | 0 |
trnI | J | 68–136 | 69 | | | GAT | –3 |
trnQ | N | 134–202 | 69 | | | TTG | 49 |
nad2 | J | 252–1265 | 1014 | ATT | TAA | | –2 |
trnW | J | 1264–1331 | 68 | | | TCA | –8 |
trnC | N | 1324–1385 | 62 | | | GCA | 11 |
trnY | N | 1397–1460 | 64 | | | GTA | 2 |
cox1 | J | 1463–2998 | 1536 | CGA | TAA | | –5 |
trnL2 (UUR) | J | 2994–3059 | 66 | | | TAA | 0 |
cox2 | J | 3060–3744 | 685 | ATG | T | | –3 |
trnK | J | 3742–3812 | 71 | | | CTT | –1 |
trnD | J | 3812–3876 | 65 | | | GTC | 0 |
atp8 | J | 3877–4035 | 159 | ATT | TAA | | –7 |
atp6 | J | 4029–4706 | 678 | ATG | TAA | | –1 |
cox3 | J | 4706–5497 | 792 | ATG | TAA | | 2 |
trnG | J | 5500–5565 | 66 | | | TCC | 0 |
nad3 | J | 5566–5919 | 354 | ATT | TAA | | 2 |
trnA | J | 5922–5984 | 63 | | | TGC | –1 |
trnR | J | 5984–6050 | 67 | | | TCG | 8 |
trnN | J | 6059–6123 | 65 | | | GTT | –1 |
trnS1 (AGN) | J | 6123–6188 | 66 | | | GCT | 0 |
trnE | J | 6189–6250 | 62 | | | TTC | –1 |
trnF | N | 6250–6315 | 66 | | | GAA | 22 |
nad5 | N | 6338–8050 | 1713 | ATT | TAA | | 12 |
trnH | N | 8063–8129 | 67 | | | GTG | 0 |
nad4 | N | 8130–9468 | 1339 | ATG | T | | 0 |
nad4l | N | 9469–9756 | 288 | ATG | TAA | | 7 |
trnT | J | 9764–9828 | 65 | | | TGT | 0 |
trnP | N | 9829–9894 | 66 | | | TGG | 2 |
nad6 | J | 9897–10430 | 534 | ATT | TAA | | 9 |
cob | J | 10440–11591 | 1152 | ATG | TAA | | –2 |
trnS2 (UCN) | J | 11590–11657 | 68 | | | TGA | 35 |
nad1 | N | 11693–12631 | 939 | ATG | TAA | | 1 |
trnL1 (CUN) | N | 12633–12699 | 67 | | | TAG | 0 |
rrnL | N | 12700–14073 | 1374 | | | | 0 |
trnV | N | 14072–14135 | 64 | | | TAC | –1 |
rrnS | N | 14135–14903 | 769 | | | | 0 |
A + T-rich region | | 14904–15349 | 446 | | | | |
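The size and intergenic-nucleotide columns of the table above follow directly from the feature coordinates. The sketch below illustrates that arithmetic for the first eight features only, assuming 1-based inclusive coordinates; the names `FEATURES`, `feature_sizes`, and `junction_gaps` are illustrative and are not part of the annotation pipeline used in the study.

```python
# Coordinates copied from the first eight rows of the annotation table
# (assumed 1-based and inclusive). All names here are illustrative.
FEATURES = [
    ("trnM", 1, 67),
    ("trnI", 68, 136),
    ("trnQ", 134, 202),
    ("nad2", 252, 1265),
    ("trnW", 1264, 1331),
    ("trnC", 1324, 1385),
    ("trnY", 1397, 1460),
    ("cox1", 1463, 2998),
]


def feature_sizes(features):
    """Length of each feature from its inclusive start and end positions."""
    return {name: end - start + 1 for name, start, end in features}


def junction_gaps(features):
    """Nucleotides between consecutive features.

    Positive values are intergenic spacers, negative values are overlaps,
    and zero means the two features are directly adjacent.
    """
    gaps = []
    for (name, _, end), (next_name, next_start, _) in zip(features, features[1:]):
        gaps.append((name, next_name, next_start - end - 1))
    return gaps


sizes = feature_sizes(FEATURES)
gaps = junction_gaps(FEATURES)

for upstream, downstream, gap in gaps:
    kind = "overlap" if gap < 0 else "spacer" if gap > 0 else "adjacent"
    print(f"{upstream}-{downstream}: {gap:+d} ({kind})")
```

Under these assumptions, the trnQ-nad2 junction yields a 49 bp spacer and the trnW-trnC junction an 8 bp overlap, matching the corresponding table entries.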
As in other insect mitogenomes (
Taxon | Size (bp) | A + T (%) | PCGs: no. of codons | PCGs: A + T (%) | rrnS: size (bp) | rrnS: A + T (%) | rrnL: size (bp) | rrnL: A + T (%) | tRNAs: size (bp) | tRNAs: A + T (%) | A + T-rich region: size (bp) | A + T-rich region: A + T (%) | GenBank accession no. |
Yponomeutidae | | | | | | | | | | | | | |
Yponomeuta montanatus | 15,349 | 81.08 | 3,727 | 79.6 | 769 | 85.7 | 1,374 | 85.1 | 1,453 | 80.8 | 446 | 96.2 | MK256747 |
Praydidae | | | | | | | | | | | | | |
Prays oleae | 16,499 | 81.8 | 3,720 | 79.1 | 773 | 85 | 1,372 | 85 | 1,486 | 81.3 | 1,483 | 96.3 | KM874804 |
Plutellidae | | | | | | | | | | | | | |
Plutella xylostella | 16,014 | 81 | 3,731 | 79.4 | 783 | 86.1 | 1,382 | 85.1 | 1,465 | 81.2 | 888 | 93.1 | KM023645 |
Plutella xylostella | 16,179 | 81.4 | 3,729 | 79.4 | 783 | 86.1 | 1,415 | 84.9 | 1,468 | 81.3 | 1,081 | n.a. | JF911819 |
Lyonetiidae | | | | | | | | | | | | | |
Leucoptera malifoliella | 15,646 | 82.5 | 3,719 | 80.7 | 770 | 87.1 | 1,351 | 85.5 | 1,488 | 83.7 | 733 | 95.3 | JN790955 |
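The A + T percentages in the table above, and the AT-skew and GC-skew values reported in the supplementary material, are simple functions of base counts. The sketch below shows one common way to compute them, assuming the conventional definitions AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C). The function name `base_composition` and the toy sequence are hypothetical; the study's own software may differ.

```python
from collections import Counter


def base_composition(seq):
    """Return A + T content (%) and AT/GC skews for a nucleotide sequence.

    Assumes the conventional skew definitions:
    AT-skew = (A - T) / (A + T), GC-skew = (G - C) / (G + C).
    Characters other than A, C, G, T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_percent": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }


# Hypothetical toy sequence (6 A, 10 T, 3 G, 1 C), not real mitogenome data.
toy = "AAAAAATTTTTTTTTTGGGC"
stats = base_composition(toy)
print(stats)
```

A negative AT-skew, as in this toy sequence, indicates more T than A on the strand examined.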
The total length of the 13 PCGs in the Y. montanatus mitogenome is 11,183 bp, accounting for approximately 72.9% of the whole mitogenome (Table
Most PCGs in yponomeutoid mitogenomes use the conventional ATN start codon (Table
To investigate evolutionary patterns of all PCGs, nucleotide diversity and the ratio of Ka to Ks were calculated for each PCG. As shown in Figure
The Y. montanatus mitogenome contains 22 tRNAs ranging in length from 62 bp (trnC, trnE) to 71 bp (trnK) (Fig.
Putative secondary structures of tRNAs from Yponomeuta montanatus mitogenome. The tRNAs are labeled with the abbreviations of their corresponding amino acids. The tRNA arms are illustrated as for trnV. Dashes indicate the Watson-Crick base pairs; dots indicate the wobble GU pairs; and the other non-canonical pairs are not marked. The nucleotides marked indicate the variable sites among published yponomeutoid mitogenomes.
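The pairing notation used in the tRNA figure (dashes for Watson-Crick pairs, dots for wobble GU pairs, no mark for other non-canonical pairs) can be expressed as a small classification rule. The following is a minimal sketch under that reading of the legend; the function names `classify_pair` and `stem_symbols` are hypothetical, and DNA-style T is treated as U.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(base1, base2):
    """Classify a base pair as 'watson-crick', 'wobble' or 'non-canonical'."""
    # Gene sequences are written with T, so map T to U before comparing.
    pair = frozenset(b.upper().replace("T", "U") for b in (base1, base2))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "non-canonical"


def stem_symbols(strand_5p, strand_3p):
    """Annotate a stem with '-' (Watson-Crick), '.' (wobble) or ' ' (other).

    strand_3p must already be written in pairing order, so that position i
    of strand_5p pairs with position i of strand_3p.
    """
    if len(strand_5p) != len(strand_3p):
        raise ValueError("both strands of a stem must have the same length")
    marks = {"watson-crick": "-", "wobble": ".", "non-canonical": " "}
    return "".join(
        marks[classify_pair(x, y)] for x, y in zip(strand_5p, strand_3p)
    )


# Hypothetical four-pair stem: G-C, C-G, G-U (wobble), U-A.
print(stem_symbols("GCGU", "CGUA"))
```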
Comparative tRNA analyses among yponomeutoid mitogenomes found that each tRNA structure is highly conserved, including the loss of the DHU arm in trnS1 (AGN). However, substantial nucleotide variation exists, most of which occurred in the DHU arm and TψC loops (Fig.
Similar to other yponomeutoid mitogenomes, two rRNA genes, rrnS and rrnL, were recognized in the Y. montanatus mitogenome (Fig.
In the Y. montanatus mitogenome, a total of 36 overlapping nucleotides were identified across 13 gene junctions, with individual overlaps ranging from one to eight bp in length (Table
A The overlapping region between atp8 and atp6. The nucleotides colored red indicate the sequence of the overlapping region; the nucleotides with a green underline indicate the partial sequence of the atp8 gene, and the nucleotides with a blue underline indicate the partial sequence of the atp6 gene B The intergenic region between nad6 and cob. The microsatellite (TA)n is marked red C The intergenic region between trnQ and nad2 D The intergenic region between trnS2 and nad1. The nucleotides colored red indicate the conserved motif sequence E Schematic illustration of the A + T-rich region from all yponomeutoid mitogenomes. The conserved motif ATAG (colored red) and the subsequent poly-T stretch (colored green), and the conserved motif ATTTA (colored blue) and the subsequent (TA)n sequence (colored orange), are emphasized. Dots indicate omitted sequences, and the number of dots is not proportional to the number of nucleotides in the corresponding part.
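The two conserved blocks highlighted in panel E (ATAG followed by a poly-T stretch, and ATTTA followed by a (TA)n repeat) can be located in a sequence with a simple pattern search. The sketch below is illustrative only: the minimum run lengths (`MIN_T_RUN`, `MIN_TA_REPEATS`), the requirement that the tail follow the motif immediately, and the toy sequence are all assumptions rather than criteria stated in the study.

```python
import re

# Assumed thresholds for what counts as a poly-T stretch or a (TA)n repeat.
MIN_T_RUN = 6
MIN_TA_REPEATS = 3

PATTERNS = {
    "ATAG+polyT": re.compile(r"ATAG(T{%d,})" % MIN_T_RUN),
    "ATTTA+(TA)n": re.compile(r"ATTTA((?:TA){%d,})" % MIN_TA_REPEATS),
}


def find_motifs(seq):
    """Return (label, start, end, tail) for each motif hit, sorted by start.

    Positions are 0-based with an exclusive end, as in Python slicing.
    'tail' is the poly-T or (TA)n sequence that follows the core motif.
    """
    seq = seq.upper()
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((label, match.start(), match.end(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence, not taken from any real A + T-rich region.
toy = "CC" + "ATAG" + "T" * 9 + "GGC" + "ATTTA" + "TA" * 4 + "CC"
hits = find_motifs(toy)
for label, start, end, tail in hits:
    print(label, start, end, len(tail))
```

Searching only one strand is a simplification; in practice the reverse complement would also need to be scanned, since the motifs may be annotated on either strand.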
In addition to the A + T-rich region, a total of 162 intergenic nucleotides were identified across 13 gene junctions in the Y. montanatus mitogenome, with individual spacers ranging from one to 49 bp (Table
As in other yponomeutoid mitogenomes, the A + T-rich region of the Y. montanatus mitogenome is located between the rrnS and trnM genes (Fig.
The insect mitochondrial A + T-rich region usually shows a structured base composition, most notably in the presence of conserved sequence blocks responsible for mitogenome replication and transcription (
To investigate the phylogenetic implications of the Y. montanatus mitogenome for Yponomeutoidea and Lepidoptera, we reconstructed the superfamily-level relationships within Lepidoptera using three inference methods and three different datasets.
As shown in Figures
Regarding the phylogenetic pattern of other superfamilies, mostly identical results were obtained by different analyses, which are also similar to other mitogenome-based studies (
We sincerely thank the anonymous reviewers for their comments on this manuscript. This work was supported by the National Natural Science Foundation of China (31702046), the Scientific and Technological Innovation Talent Project of Henan Province (19HASTIT015), and the Science and Technology Program of Henan Province (182106000047).
Supplementary Tables S1–S8
Data type: molecular data
Explanation note: Table S1. List of Lepidoptera species used in phylogenetic analyses. Table S2. The best scheme and substitution models for the PCG123R dataset. Table S3. The best scheme and substitution models for the PCG123 dataset. Table S4. AT-skew and GC-skew of the Yponomeuta montanatus mitogenome. Table S5. A + T content (%) in three codon positions in mitochondrial protein-coding genes of reported yponomeutoid mitogenomes. Table S6. Codon usage in mitochondrial protein-coding genes of Yponomeuta montanatus. Table S7. Start and stop codons of mitochondrial protein-coding genes of four yponomeutoid species. Table S8. Evolutionary rates of mitochondrial protein-coding genes among reported yponomeutoid mitogenomes.
Supplementary Figures S1–S3
Data type: molecular data
Explanation note: Figure S1. ML trees inferred with RAxML based on the PCG123 (A) and PCGAA (B) datasets. Numbers on nodes represent bootstrap support values. Figure S2. ML trees inferred with IQ-TREE based on the PCG123 (A) and PCGAA (B) datasets. Numbers on nodes represent bootstrap support values. Figure S3. BI trees inferred with MrBayes based on the PCG123 (A) and PCGAA (B) datasets. Numbers on nodes represent posterior probabilities.