Research Article |
Corresponding author: Yan Dong ( dongyan_bio@126.com ) Academic editor: Pavel Stoev
© 2016 Yan Dong, Lixin Zhu, Yu Bai, Yongyue Ou, Changbao Wang.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Dong Y, Zhu L, Bai Y, Ou Y, Wang C (2016) Complete mitochondrial genomes of two flat-backed millipedes by next-generation sequencing (Diplopoda, Polydesmida). ZooKeys 637: 1-20. https://doi.org/10.3897/zookeys.637.9909
|
A lack of mitochondrial genome data from myriapods is hampering progress across genetic, systematic, phylogenetic and evolutionary studies. Here, the complete mitochondrial genomes of two millipedes, Asiomorpha coarctata Saussure, 1860 (Diplopoda: Polydesmida: Paradoxosomatidae) and Xystodesmus sp. (Diplopoda: Polydesmida: Xystodesmidae) were assembled with high coverage using Illumina sequencing data. The mitochondrial genomes of the two newly sequenced species are circular molecules of 15,644 bp and 15,791 bp, within which the typical mitochondrial genome complement of 13 protein-coding genes, 22 tRNAs and two ribosomal RNA genes could be identified. The mitochondrial genome of A. coarctata is the first complete sequence in the family Paradoxosomatidae (Diplopoda: Polydesmida) and the gene order of the two flat-backed millipedes is novel among known myriapod mitochondrial genomes. Unique translocations have occurred, including inversion of one half of the two genomes with respect to other millipede genomes. Inversion of the entire side of a genome (trnF-nad5-trnH-nad4-nad4L, trnP, nad1-trnL2-trnL1-rrnL-trnV-rrnS, trnQ, trnC and trnY) could constitute a common event in the order Polydesmida. Last, our phylogenetic analyses recovered the monophyletic Progoneata, subphylum Myriapoda and four internal classes.
gene order, mitochondrial genome, Myriapoda , next-generation sequence, Polydesmida
The Myriapoda comprise more than 18,000 species worldwide and are a diverse and ecologically important group of terrestrial arthropods (Miyazawa et al. 2014). Four groups have been united as Myriapoda: the Chilopoda (centipedes) and Diplopoda (millipedes), containing the vast majority of species, and the poorly investigated Pauropoda and Symphyla. An ancient group, the myriapods inhabited Pangea, and as a result they occur on all continents today except for Antarctica (
Extant myriapods have not been the subject of extensive molecular phylogenetic study and studies that have been done tend to focus on relationships at the phylum level. Comparative morphological, molecular and higher-level systematic evidence has largely confirmed the monophyly of myriapods (
Mitochondrial genomes are used extensively to study phylogeography and phylogenetic relationships (
Gene order in the centipede Cermatobius longicornis and the millipede Prionobelum sp. is identical to that of Limulus polyphemus (Arthropoda: Xiphosura) (
Compositional heterogeneity and accelerated substitution rates have proven to be major sources of systematic bias in mtDNA based phylogeny (
Prior to our study, the mt genomes of one flat-backed millipede representing the family Xystodesmidae: Appalachioria falcifera (Keeton, 1959) was sequenced. In this species, the entire side of the mt genome is inverted and all genes are on a single strand. Whole genome shotgun reads sequenced with the Illumina sequencing platform were used to obtain two complete mt genomes from the millipedes Asiomorpha coarctata and Xystodesmus sp. These species are representatives of the families Paradoxosomatidae and Xystodesmidae and further our understanding of how gene rearrangement occurred in the Polydesmida. Phylogenetic analysis including myriapods, three other arthropod classes and outgroups (species of Onychophora and Priapulida) were also performed to explore the internal relationships within the Myriapoda using sequence alignments from mitochondrial genes.
Specimens of Asiomorpha coarctata and Xystodesmus sp. were collected from Langya Mountain, Chuzhou, Anhui, China (32°16'N, 118°16'E) and stored at the Molecular Biology Laboratory, Chuzhou University, Chuzhou, Anhui, China (MBLCZU). Species identification was performed by Dong Y (first author) and Qian CY (Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China). Voucher specimens (MBLCZU000145, MBLCZU000146) were deposited at the MBLCZU. Total genomic DNA was extracted from one individual representing each species using the DNeasy tissue Kit (Qiagen China, Shanghai).
Standard PCR reactions to amplify three different fragments of mtDNA (cox1, cob and nad5) were undertaken for each sample. Primers are listed in Suppl. material
For Illumina sequencing, double index sequencing libraries with average insert sizes of around 300 bp were prepared. The libraries were sequenced as 250 bp paired-end runs on an Illumina Hi-Seq 2000 (about 2 Gb raw data each species). The resulting bait sequences (cox1, cob and nad5) were subsequently employed as references in the manner detailed below. De novo assemblies were conducted with Geneious v8.1 using the Map to Reference program with the following settings applied: medium-low sensitivity/fast; iterate up to five times; with a maximum of 2% mismatches, a maximum gap size of 3 bp and requiring a minimum overlap of 100 bp; do not trim. After manual inspection, the longest contigs resulting from the respective assemblies were then aligned and ensured for correct translation frames with MEGA v5.0, together with reference protein-coding gene sequences from seven millipedes (Narceus annularus, Thyropygus sp., Antrokoreana gracilipes, Appalachioria falcifera, Abacion magnum, Brachycybe lecontii, Prionobelum sp.). The contig ends of A. coarctata and Xystodesmus sp. overlapped.
The assembled consensus sequence of each millipede mtDNA was further annotated and analyzed. Preliminary annotation using MITOS webserver provided overall information on mt genomes (
Phylogenetic trees were reconstructed focused on the Arthropoda defining Onychophora and Priapulida as outgroups in the analyses, including 31 taxa (Suppl. material
The optimal partition strategy and models were selected by PartitionFinder v1.1.1. We created an input configuration file that contained 13 pre-define partitions by gene. We used the ‘greedy’ algorithm with branch lengths estimated as ‘unlinked’ and Akaike Information Criterion (AIC) to search for the best-fit scheme (Suppl. material
Maximum likelihood phylogenetic analysis searches were carried out with RAxML through a web portal (http://phylobench.vital-it.ch/raxml-bb/index.php). Bootstrap values, indicating the robustness of the internal nodes of the gene trees, were set at 100 replicates. Bayesian analyses of nucleotide and amino acid data sets were performed with MrBayes v3.1.2 (
The complete mt genome of Asiomorpha coarctata (GenBank accession KU721885) is 15,644 bp long and was assembled with coverage between 54–1062 reads; the genome of Xystodesmus sp. (GenBank accession KU721886) is 15,791 bp in length and has a coverage within 86–1392 reads. The sizes of these two mt genomes are within the range reported for those of other myriapods ranging from 14,487–16,833 bp (GenBank accessions NC_016676 and NC_021403).
The two genomes encode 13 protein-coding regions, 22 tRNA and two rRNA genes, consistent with metazoan mitochondrial DNA structure, but contain a number of unique features (Figure
Mitochondrial genomes of the two millipedes sequenced in this study. A Asiomorpha coarctata B Xystodesmus sp. Circular maps were drawn with Geneious v9.1.2. Arrows indicate the orientation of gene transcription. Abbreviations of gene names are: atp6 and atp8 for ATP synthase subunits 6 and 8; cox1–3 for cytochrome oxidase subunits 1–3; cob for cytochrome b, nad1–6 and nad4L for NADH dehydrogenase subunits 1–6 and 4L; and lrRNA and srRNA for large and small rRNA subunits. tRNA genes are indicated with their one-letter corresponding amino acids. CR for control region. The GC content was plotted using a green sliding window and the AT content was blue.
Feature | From | To | Length (nt) | Codons | Spacer/Overlap(-) | |
---|---|---|---|---|---|---|
Start Stop | ||||||
cox1 | 1 | 1533 | 1533 | ATG | TAG | 11 |
cox2 | 1545 | 2222 | 678 | ATG | TAG | 8 |
trnK | 2231 | 2297 | 67 | CTT | 1 | |
trnD | 2299 | 2363 | 65 | GTC | 0 | |
atp8 | 2364 | 2522 | 159 | ATG | TAA | -7 |
atp6 | 2516 | 3182 | 667 | ATG | TA | 0 |
cox3 | 3183 | 3967 | 785 | ATG | TA | 0 |
trnG | 3968 | 4031 | 64 | TCC | 0 | |
nad3 | 4032 | 4381 | 334 | GTG | TA | 0 |
trnA | 4382 | 4443 | 62 | TGC | 0 | |
trnR | 4444 | 4509 | 66 | TCG | 2 | |
trnN | 4512 | 4579 | 68 | GTT | 0 | |
trnS1 | 4580 | 4638 | 69 | GCT | 0 | |
trnE | 4639 | 4703 | 65 | TTC | 0 | |
nad6 | 4704 | 5176 | 473 | ATG | AGT | 0 |
cytb | 5177 | 6297 | 1121 | ATG | TA | 0 |
trnS2 | 6298 | 6358 | 61 | TGA | 0 | |
CR1 | 6359 | 7321 | 963 | 0 | ||
rrnS | 7322 | 8097 | 776 | 0 | ||
trnV | 8098 | 8162 | 65 | TAC | 0 | |
rrnL | 8163 | 9402 | 1240 | 0 | ||
trnL1 | 9403 | 9465 | 63 | TAG | 0 | |
trnL2 | 9469 | 9533 | 65 | TAA | 3 | |
nad1 | 9534 | 10460 | TTG | TAA | 0 | |
trnP | 10461 | 10524 | 64 | CCA | 0 | |
nad4L | 10525 | 10806 | 62 | ATG | TAG | 0 |
nad4 | 10800 | 12143 | 1344 | GTG | TAG | 0 |
trnT | 12144 | 12206 | 63 | TGT | 0 | |
CR2 | 12207 | 12393 | 187 | 0 | ||
trnH | 12394 | 12458 | 65 | GTG | 1 | |
nad5 | 12460 | 14160 | 1701 | ATG | TAG | 3 |
trnF | 14164 | 14231 | 68 | GAA | 0 | |
trnY | 14232 | 14299 | 68 | GTA | 1 | |
trnQ | 14301 | 14371 | 71 | TTG | 0 | |
trnC | 14372 | 14436 | 65 | GCA | 0 | |
trnI | 14437 | 14503 | 67 | GAT | 0 | |
trnM | 14504 | 14570 | 67 | CAT | 0 | |
nad2 | 14571 | 15575 | 1005 | TTG | GGA | 0 |
trnW | 15576 | 15644 | 69 | TCA | 0 | |
Total | 15644 | -7/30 |
Feature | From | To | Length (nt) | Codons | Spacer/Overlap(-) | |
---|---|---|---|---|---|---|
Start Stop | ||||||
cox1 | 1 | 1533 | 1533 | ATG | TAG | 6 |
cox2 | 1540 | 2217 | 678 | ATG | TAG | 3 |
trnK | 2221 | 2287 | 67 | CTT | 0 | |
trnD | 2288 | 2351 | 64 | GTC | 0 | |
atp8 | 2352 | 2513 | 162 | ATG | TAA | -7 |
atp6 | 2507 | 3175 | 669 | ATG | TA | 0 |
cox3 | 3179 | 3963 | 785 | ATG | TA | 0 |
trnG | 3964 | 4029 | 66 | TCC | 0 | |
nad3 | 4030 | 4380 | 351 | GTG | TAG | 1 |
trnA | 4382 | 4445 | 64 | TGC | 4 | |
trnR | 4450 | 4514 | 65 | TCG | 3 | |
trnN | 4518 | 4579 | 68 | GTT | 0 | |
trnS1 | 4580 | 4640 | 61 | GCT | 2 | |
trnE | 4643 | 4710 | 68 | TTC | 2 | |
nad6 | 4713 | 5181 | 469 | ATG | T | 0 |
cytb | 5182 | 6303 | 1122 | ATG | TAA | 0 |
trnS2 | 6304 | 6360 | 57 | TGA | 0 | |
CR1 | 6361 | 7392 | 963 | 0 | ||
rrnS | 7393 | 8172 | 780 | 0 | ||
trnV | 8173 | 8238 | 66 | TAC | 0 | |
rrnL | 8239 | 9465 | 1227 | 0 | ||
trnL1 | 9466 | 9530 | 65 | TAG | 51 | |
trnL2 | 9582 | 9650 | 69 | TAA | 3 | |
nad1 | 9651 | 10575 | 925 | TTG | T | 0 |
trnP | 10576 | 10640 | 65 | TGG | 2 | |
nad4L | 10643 | 10924 | 282 | ATG | TAG | 0 |
nad4 | 10918 | 12264 | 1347 | GTG | TAA | 1 |
trnT | 12266 | 12327 | 62 | TGT | 0 | |
CR2 | 12328 | 12535 | 208 | 0 | ||
trnH | 12536 | 12599 | 64 | GTG | 1 | |
nad5 | 12602 | 14299 | 1698 | ATG | TAG | 3 |
trnF | 14303 | 14369 | 67 | GAA | 0 | |
trnY | 14370 | 14434 | 65 | GAT | 15 | |
trnQ | 14450 | 14515 | 66 | TTG | 2 | |
trnC | 14518 | 14582 | 65 | GCA | 3 | |
trnI | 14586 | 14650 | 65 | GAT | 0 | |
trnM | 14651 | 14716 | 66 | CAT | 0 | |
nad2 | 14717 | 15722 | 1006 | TTG | T | 0 |
trnW | 15723 | 15790 | 68 | TCA | 1 | |
Total | 15791 | -7/103 |
Some millipede mt genomes that have been sequenced (e.g. Antrokoreana gracilipes) include two major non-coding regions and others contain a single non-coding region, such as Narceus annularus, Prionobelum sp. and Appalachioria falcifera (Figure
The largest non-coding region (CR1, 963 bp) is located between trnS2 and rrnS in A. coarctata (Table
There are 22 potential tRNA genes in Asiomorpha coarctata and Xystodesmus sp., respectively (Figure
Gene order arrangements were compared with mt genome organization in other myriapods (Figure
Comparison of gene arrangements in mtDNA of the arthropod ground pattern. Gene segments are not drawn to scale. Genes shaded gray have different relative positions compared to the ground pattern. Underlining indicates the gene is encoded on the opposite strand, and arrows indicate translocation of trnT. CR: putative control region. Gene arrangements of two diplopods, N. annularus and Thyropygus sp. are similar and represented as one.
Bayesian inference and maximum likelihood phylogenetic analysis were performed using conserved blocks of amino acid and nucleotide data sets (Figure
Phylogenetic tree of the Arthropoda, including Myriapoda, Hexapoda, Crustacea and Chelicerata and outgroups reconstructed based on protein-coding genes from mtDNA genomes. Each group of four numbers indicates node confidence values (from top left): Bayesian posterior probabilities in percent (BPP) in amino acid and nucleotide datasets; maximum likelihood bootstrapping values (MLBP) in amino acid and nucleotide datasets.
Within the Diplopoda, the Polydesmida clade, including Appalachioria falcifera, Asiomorpha coarctata, and Xystodesmus sp. is well supported. A clade consisting of Polydesmida plus Colobognatha (Brachycybe lecontii) is also recovered. All millipedes in this study except Prionobelum sp. recovered as a monophyletic group, Helminthomorpha. The grouping with Helminthomorpha and Pentazonia (the basal lineage Prionobelum sp.) is supported.
Four myriapod clades are resolved, with Chilopoda (BPP = 1.00 and 1.00; MLBP = 99 and 98) as the basal group. A monophyletic Progoneata (“Symphyla + Pauropoda” + Diplopoda) is recovered as the sister group of the Chilopoda (BPP = 1.00 and 0.99; MLBP = 89 and 80). Our Bayesian inference analysis recognizes a monophyletic Myriapoda with strong support, while the maximum likelihood analysis is only weakly supported.
Next-generation sequencing such as the 454 pyrosequencing, Solexa, and SOLiD provided by Roche, Illumina and Applied Biosystems, has the ability to generate a large number of sequences within a very short time compared to Sanger’s method (
We found that the genome size was larger in A. coarctata (15,644 bp) and Xystodesmus sp. (15,791 bp) than other millipede mt genomes (14747–15282 bp; Antrokoreana gracilipes, GenBank accession NC_010221; Appalachioria falcifera, GenBank accession NC_021933). Intergenic spacer length variation may have arisen through retention of partial duplication, or incomplete multiple deletions of redundant genes (
Most of the tRNAs appear to be truncated and lack one of the arms found in the canonical tRNAs of pauropods and symphylans (
Compared with other millipedes, the mt gene order in A. coarctata and Xystodesmus sp. is very similar to that in Appalachioria falcifera. These three species belong to the order Polydesmida (Figure
Gene arrangement in three flat-backed millipedes is similar to that in other Helminthomorpha sequenced previously in which nad6 + cob placements occurred. We agree with
This phylogenomic study provides a strongly supported phylogenetic framework for the monophyletic origin of the Myriapoda and the monophyly of the Progoneata and extant myriapod subgroups. Among the Myriapoda, the union of Diplopoda with Pauropoda as a monophyletic group (= Dignatha) was once widely accepted (
The Polydesmida and Helminthomorpha are recovered as monophyletic. Gene order in Prionobelum sp. which is the basal lineage of the millipede mt genome is assumed to represent the millipede or myriapod ground pattern (
Polydesmida has been considered the sister group of Juliformia and a ‘ring-forming’ clade from the perspective of morphology (
Many internal relationships of the Diplopoda remain unresolved and several groups are paraphyletic. The Bayesian inference tree seems to have better support at shallower nodes with amino acid data sets, whereas the maximum likelihood tree has better support at deeper nodes. The maximum likelihood tree has very low support at most nodes and confuses the relationships of taxa that are confidently placed in monophyletic groups by other studies (
The authors declare no competing interests. Special thanks to Pavel Stoev and three reviewers for constructive comments that improved this paper. This research was supported by the National Natural Science Foundation of China Project (31401971), Anhui Provincial Natural Science Foundation (1408085QC67), and Natural Science Foundation of the Higher Education Institutions of Anhui Province (KJ2013Z245).
Supplementary tables
Data type: molecular data