Ling Zhao, Jiufeng Wei, Wanqing Zhao, Chao Chen, Xiaoyun Gao, Qing Zhao
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
http://zoobank.org/B80A1A95-A189-4CC1-9D5F-7F0C80CD9FF0
Pentatoma rufipes (Linnaeus, 1758) is an important agroforestry pest widely distributed in the Palaearctic region. In this study, we sequence and annotate the complete mitochondrial genome of P. rufipes and reconstruct the phylogenetic trees for Pentatomoidea using existing data for eight families published in the National Center for Biotechnology Information database. The mitogenome of P. rufipes is 15,887 bp long, comprising 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and a control region, with an A+T content of 77.7%. The genome structure, gene order, nucleotide composition, and codon usage of the mitogenome of P. rufipes were consistent with those of typical Hemiptera insects. Among the protein-coding genes of Pentatomoidea, the evolutionary rate of ATP8 was the fastest, and COX1 was found to be the most conserved gene in the superfamily. Substitution saturation assessment indicated that neither transition nor transversion substitutions were saturated in the analyzed datasets. Phylogenetic analysis using the Bayesian inference method showed that P. rufipes belonged to Pentatomidae. The node support values based on the dataset concatenated from protein-coding and RNA genes were the highest. Our results enrich the mitochondrial genome database of Pentatomoidea and provide a reference for further studies of phylogenetic systematics.
Zhao L, Wei J, Zhao W, Chen C, Gao X, Zhao Q (2021) The complete mitochondrial genome of Pentatoma rufipes (Hemiptera, Pentatomidae) and its phylogenetic implications. ZooKeys 1042: 51–72. https://doi.org/10.3897/zookeys.1042.62302
Introduction
The mitochondrion is a semi-autonomous organelle with its own genetic material, known as the mitochondrial genome (mitogenome) (Nass and Nass 1963). The mitogenome is widely used in molecular evolution, phylogenetic analysis, molecular ecology, biogeography, and population genetics because of its small size, stable genetic composition, and maternal inheritance (Ballard and Whitlock 2004; Simon and Hadrys 2013; Cameron 2014; Yuan and Guo 2016). Insects, as the most diverse, numerous, and widely distributed animals on Earth, are a focus of mitogenome research (Boore 1999). To date, mitogenome research has been extensive and covers all orders of insects (Cameron 2014). Insect mitogenomes are covalently closed, double-stranded, circular DNA molecules (14–20 kbp long) that usually contain a control region and 37 genes: 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, and two ribosomal RNA genes (12S rRNA and 16S rRNA) (Boore 1999; Cameron and Whiting 2008; Cameron 2014). In most known insects, the structure of the mitogenome is stable and the gene arrangement is relatively conserved, consistent with the composition and arrangement of the most typical insect mitogenome, that of Drosophila yakuba Burla (Clary and Wolstenholme 1985).
Pentatomoidea, one of the most commonly encountered groups in Hemiptera, includes 1,410 genera and 8,042 species distributed worldwide (Rider et al. 2018). Pentatomoid insects have diverse feeding habits, although the majority are herbivorous. Some, such as Dolycoris baccarum (Linnaeus) and Halyomorpha halys Stål, cause huge economic losses. In addition, some pentatomoid insects are predatory, including most species of Asopinae, and a few groups, such as members of Canopidae and Megarididae, are suspected to be fungus feeders (Rider et al. 2018; Zhao et al. 2018). The classification of the superfamily Pentatomoidea has long been contentious, and scholars hold differing opinions. For example, Schaefer (1993) divided Pentatomoidea into 16 families, whereas Henry (1997) recognized 17 families, placing Eumenotidae and Thyreocoridae at the family level. Grazia et al. (2008) supported the monophyly of Pentatomoidea and most of the included families based on morphological characters and molecular markers (16S rRNA, 18S rRNA, 28S rRNA, and COI); Wu et al. (2016) reconstructed the phylogenetic relationships of 16 families within Pentatomoidea using 18S and 28S rDNA sequences and showed that Cydnidae and Tessaratomidae might be polyphyletic; and Lis et al. (2017), using combined 28S + 18S rDNA sequences, questioned the monophyly of the “cydnoid” complex of pentatomoid families (Cydnidae, Parastrachiidae, Thaumastellidae, and Thyreocoridae) and demonstrated the polyphyly of Cydnidae. Recently, taxonomists reorganized the families, genera, and species of Pentatomoidea and divided the superfamily into 18 families (Rider et al. 2018). With the development of next-generation sequencing (NGS), an increasing number of pentatomoid mitogenome sequences have been obtained, making it possible to resolve the phylogenetic relationships within the superfamily at the genetic level (Yuan et al. 2015b; Bai et al. 2018; Zhao et al. 2018).
Furthermore, Wu et al. (2017) confirmed the monophyly of Scutelleridae (based on 18S + 28S rDNAs + 13 PCGs), and Liu et al. (2019) reconstructed the phylogeny of Pentatomomorpha and Pentatomoidea based on PCGRNA and PCG12RNA. However, despite the abundance of species in the superfamily, only 97 species have complete or nearly complete mitogenomes published in the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/2020.07); these represent only eight families. Moreover, the phylogenetic position of Pentatoma species has not been discussed, apart from the description of the Pentatoma semiannulata (Motschulsky) mitogenome by Wang et al. (2021). Therefore, more mitogenome sequences of Pentatoma species are needed to better understand the phylogenetic relationships of the genus.
Pentatoma rufipes (Linnaeus, 1758) (Hemiptera, Heteroptera, Pentatomidae) is a medium-sized to large, dark brown insect with reddish-orange spots and bright orange legs (Hsiao 1977; Bantock and Botting 2013). These insects are widely distributed in the Palearctic region (Ling and Zheng 1987; Fan and Liu 2012). They can damage oak, poplar, elm, hawthorn, apricot, pear, and other trees, and they constitute an important agricultural and forestry pest (Hsiao et al. 1977; Powell 2020). There are also records of P. rufipes preying on Zygaena filipendulae L. (Lepidoptera, Zygaenidae) (Hamilton and Heath 1976). Previous studies on P. rufipes mostly focused on its physiological and morphological characteristics (Ling and Zheng 1987; Neupert et al. 2009), with limited molecular data on the mitochondrial COI and COII genes (Bu et al. 2005; Liang 2009), along with some studies identifying biological characteristics and potential control strategies (Peusens and Beliën 2012; Powell 2020).
In this study, we sequenced and annotated the mitogenome of P. rufipes and analyzed it in detail, including the genome structure, nucleotide composition, and codon usage, and constructed RNA secondary structures. In addition, we combined the complete mitogenome of P. rufipes with the existing data for the eight families of Pentatomoidea to explore the phylogenetic position of P. rufipes.
Materials and methods
Sample collection
Adult Pentatoma rufipes specimens were collected in Baiji Hill (Tonghua City, Jilin Province, China; 41°58.14'N, 126°06.58'E) on 24 July 2015. All samples were immediately placed in absolute ethanol and stored in a freezer at –20 °C until DNA extraction. Specimen identification was performed by Qing Zhao. The voucher specimen is maintained at the Institute of Entomology of Shanxi Agricultural University (voucher number: SXAU 007; Taigu, China). The complete mitogenome of P. rufipes has been submitted to GenBank (accession number: MT861131).
DNA extraction and sequencing
Total genomic DNA was extracted from the thoracic muscle of adult samples using the Genomic DNA Extraction Kit (Sangon Biotech, Shanghai, China). The mitogenome was sequenced using the whole-genome shotgun method on the Illumina MiSeq platform (Personalbio, Shanghai, China), with 400-bp inserts in paired-end mode. A5-miseq v. 20150522 (Coil et al. 2015) and SPAdes v. 3.9 (Bankevich et al. 2012) were used to assemble the data.
Genome annotation and sequence analysis
After assembly, the complete mitogenome was manually annotated using Geneious v. 8.1.4 software (Kearse et al. 2012). A reference sequence of Eurydema gebleri Kolenati for annotation was identified with the basic local alignment search tool (BLAST) in the NCBI database. The boundaries of the PCGs were determined using Open Reading Frame Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) on the NCBI website. MEGA v. 7.0 (Kumar et al. 2016) was used to translate the PCGs in order to verify the start codons, stop codons, and amino acid sequences and to ensure the accuracy of the sequences. We annotated tRNA sequences using tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE/) (Lowe and Eddy 1997) and MITOS (http://mitos.bioinf.uni-leipzig.de/index.py/) (Bernt et al. 2013) with the invertebrate mitochondrial code. The boundaries of the rRNA genes were determined according to the positions of adjacent genes and published rRNA gene sequences from Pentatomidae insects in GenBank (Boore 2006). The codon usage, base composition, and amino acid composition of the mitogenome were analyzed using MEGA v. 7.0. The skew of the nucleotide composition was calculated with the formulas AT-skew = (A – T) / (A + T) and GC-skew = (G – C) / (G + C) (Perna and Kocher 1995).
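The two skew formulas above can be illustrated with a minimal Python sketch. This is not the MEGA workflow used in the study; the function name `nucleotide_skews` and the toy sequence are assumptions made only for illustration.

```python
from collections import Counter


def nucleotide_skews(sequence: str) -> dict:
    """Return A+T content, AT-skew and GC-skew for a DNA sequence."""
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous symbols such as N or gaps are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_content": (a + t) / total,
        # AT-skew = (A - T) / (A + T); GC-skew = (G - C) / (G + C)
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
    }


# Toy sequence (hypothetical, not taken from the P. rufipes mitogenome)
example = nucleotide_skews("AAAATTTGCC")
```

A positive AT-skew means more A than T on the analyzed strand, and a negative GC-skew means more C than G.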
Phylogenetic analyses
In this study, we selected the mitogenomes of P. rufipes, representative species from eight other Pentatomoidea families, and two Coreoidea species (outgroup) to analyze the phylogenetic position of P. rufipes and the phylogenetic relationships within Pentatomoidea. All species included in this analysis are listed in Table 1. The nucleic acid sequences of the 13 PCGs were extracted using Geneious v. 8.1.4. All PCGs were translated into their amino acid sequences and aligned using MUSCLE with default parameters in MEGA v. 7.0 (Edgar 2004). The tRNA and rRNA genes were also aligned using the MUSCLE algorithm in MEGA v. 7.0. The resulting alignments were concatenated into a combined matrix.
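List of species used to construct the phylogenetic tree.

| Classification status | Family | Species | Accession number |
|---|---|---|---|
| Outgroup (Coreoidea) | Coreidae | Hydaropsis longirostris | EU427337 |
| Outgroup (Coreoidea) | Coreidae | Anoplocnemis curvipes | NC_035509 |
| Ingroup (Pentatomoidea) | Acanthosomatidae | Acanthosoma labiduroides | JQ743670 |
| Ingroup (Pentatomoidea) | Acanthosomatidae | Sastragala edessoides | JQ743676 |
| Ingroup (Pentatomoidea) | Acanthosomatidae | Anaxandra taurina | NC_042801 |
| Ingroup (Pentatomoidea) | Cydnidae | Macroscytus gibbulus | NC_012457 |
| Ingroup (Pentatomoidea) | Cydnidae | Adrisa magna | NC_042429 |
| Ingroup (Pentatomoidea) | Cydnidae | Scoparipes salvazai | NC_042800 |
| Ingroup (Pentatomoidea) | Dinidoridae | Cyclopelta parva | KY069962 |
| Ingroup (Pentatomoidea) | Dinidoridae | Megymenum gracilicorne | NC_042810 |
| Ingroup (Pentatomoidea) | Pentatomidae | Halyomorpha halys | NC_013272 |
| Ingroup (Pentatomoidea) | Pentatomidae | Eurydema gebleri | NC_027489 |
| Ingroup (Pentatomoidea) | Pentatomidae | Graphosoma rubrolineatum | NC_033875 |
| Ingroup (Pentatomoidea) | Pentatomidae | Gonopsis affinis | NC_036745 |
| Ingroup (Pentatomoidea) | Pentatomidae | Dinorhynchus dybowskyi | NC_037724 |
| Ingroup (Pentatomoidea) | Pentatomidae | Plautia fimbriata | NC_042813 |
| Ingroup (Pentatomoidea) | Pentatomidae | Pentatoma rufipes | MT861131 |
| Ingroup (Pentatomoidea) | Plataspidae | Coptosoma bifaria | EU427334 |
| Ingroup (Pentatomoidea) | Plataspidae | Megacopta cribraria | NC_015342 |
| Ingroup (Pentatomoidea) | Scutelleridae | Cantao ocellatus | NC_042803 |
| Ingroup (Pentatomoidea) | Scutelleridae | Eurygaster testudinaria | NC_042808 |
| Ingroup (Pentatomoidea) | Tessaratomidae | Dalcantha dilatata | JQ910981 |
| Ingroup (Pentatomoidea) | Tessaratomidae | Eusthenes cupreus | NC_022449 |
| Ingroup (Pentatomoidea) | Tessaratomidae | Tessaratoma papillosa | NC_037742 |
| Ingroup (Pentatomoidea) | Urostylididae | Urostylis flavoannulata | NC_037747 |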
To determine whether the sequences contained phylogenetic information, we tested nucleotide substitution saturation and plotted transition and transversion substitutions against the TN93 distance for all datasets using DAMBE v. 4.5.32 (Xia and Xie 2001; Xia and Lemey 2009) before reconstructing the phylogenetic trees. The optimal substitution model for each dataset was selected using PartitionFinder v. 1.1.1 (Lanfear et al. 2012). Phylogenetic analyses were conducted using the Bayesian inference method in MrBayes v. 3.2.5 (Ronquist et al. 2012) under the GTR+G+I substitution model, with four independent Markov chains run for 10,000,000 generations and stopped when the average standard deviation of split frequencies was below 0.01. The first 25% of trees were discarded as burn-in, and the remaining trees were used to construct a 50% majority-rule consensus tree (Zhao et al. 2018). The phylogenetic trees were constructed using three types of datasets: (1) all codon positions of the 13 PCGs (PCG); (2) the 13 PCGs, excluding the third codon position (PCG12); and (3) the 13 PCGs, 22 tRNA genes, and two rRNA genes (PCGRNA).
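As an illustration of how a PCG12-style matrix can be derived, the sketch below removes third codon positions from in-frame, codon-aligned gene alignments and concatenates the genes per taxon. It is a simplified stand-in for the software actually used in the study; the function names, the nested-dictionary input layout, and the toy alignments are all assumptions.

```python
def drop_third_codon_positions(aligned_cds: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame, codon-aligned sequence."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("alignment length must be a multiple of three")
    return "".join(
        base for index, base in enumerate(aligned_cds) if index % 3 != 2
    )


def concatenate_pcg12(gene_alignments: dict) -> dict:
    """Build a supermatrix from {gene: {taxon: aligned sequence}}.

    Only taxa present in every gene alignment are kept, and genes are
    joined in the insertion order of the outer dictionary.
    """
    taxa = sorted(set.intersection(*(set(a) for a in gene_alignments.values())))
    return {
        taxon: "".join(
            drop_third_codon_positions(alignment[taxon])
            for alignment in gene_alignments.values()
        )
        for taxon in taxa
    }


# Toy alignments (hypothetical sequences, two genes and two taxa)
toy = {
    "COX1": {"sp1": "ATGAAA", "sp2": "ATGAAG"},
    "ATP8": {"sp1": "TTGCCC", "sp2": "TTGCCT"},
}
matrix = concatenate_pcg12(toy)
```

In the toy data the two taxa differ only at third positions, so their PCG12 rows become identical, which shows why excluding third positions removes the fastest-evolving sites from the matrix.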
Results
Genomic features
The mitochondrial genome of Pentatoma rufipes is 15,887 bp long and contains a control region and 37 genes comprising 13 PCGs, 22 tRNA genes, and two rRNA genes (Fig. 1; Table 2). Among these genes, 14 are located on the minority strand (N-strand), including four PCGs (ND5, ND4, ND4L, and ND1), eight tRNA genes (trnQ, trnC, trnY, trnF, trnH, trnP, trnL1 (CUN), and trnV), and two rRNA genes (12S rRNA and 16S rRNA), whereas the remaining 23 genes are encoded on the majority strand (J-strand). The mitogenome is compact, with a total of nine gene overlaps ranging in length from 1 to 8 bp; the longest overlap is between trnW and trnC. Furthermore, there are 16 intergenic spacers ranging from 1 bp to 23 bp and comprising 116 bp in total; the longest spacer falls between trnS2 and ND1.
Mitochondrial genome map of Pentatoma rufipes. Arrows indicate the orientation of gene transcription. Protein-coding and ribosomal genes are shown with standard abbreviations.
https://binary.pensoft.net/fig/552981
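Organization of the mitochondrial genome of Pentatoma rufipes.

| Gene | Strand | Position | Anticodon | Size (bp) | Start codon | Stop codon | Intergenic nucleotides* |
|---|---|---|---|---|---|---|---|
| trnI | J | 1–67 | GAT | 67 | | | |
| trnQ | N | 65–134 | TTG | 70 | | | –3 |
| trnM | J | 137–205 | CAT | 69 | | | 2 |
| ND2 | J | 206–1189 | | 984 | ATT | TAA | 0 |
| trnW | J | 1198–1265 | TCA | 68 | | | 8 |
| trnC | N | 1258–1321 | GCA | 64 | | | –8 |
| trnY | N | 1331–1397 | GTA | 67 | | | 9 |
| COX1 | J | 1407–2943 | | 1537 | TTG | T | 9 |
| trnL2 (UUR) | J | 2944–3008 | TAA | 65 | | | 0 |
| COX2 | J | 3009–3687 | | 679 | ATA | T | 0 |
| trnK | J | 3688–3761 | CTT | 74 | | | 0 |
| trnD | J | 3761–3822 | GTC | 62 | | | –1 |
| ATP8 | J | 3823–3981 | | 159 | TTG | TAA | 0 |
| ATP6 | J | 3975–4649 | | 675 | ATG | TAA | –7 |
| COX3 | J | 4652–5440 | | 789 | ATG | TAA | 2 |
| trnG | J | 5446–5510 | TCC | 65 | | | 5 |
| ND3 | J | 5511–5864 | | 354 | ATC | TAA | 0 |
| trnA | J | 5873–5943 | TGC | 71 | | | 8 |
| trnR | J | 5960–6024 | TCG | 65 | | | 16 |
| trnN | J | 6033–6101 | GTT | 69 | | | 8 |
| trnS1 (AGN) | J | 6101–6170 | ACT | 70 | | | –1 |
| trnE | J | 6171–6238 | TTC | 68 | | | 0 |
| trnF | N | 6237–6301 | GAA | 65 | | | –2 |
| ND5 | N | 6301–8007 | | 1707 | ATT | TAA | –1 |
| trnH | N | 8009–8076 | GTG | 68 | | | 1 |
| ND4 | N | 8079–9410 | | 1332 | ATG | TAA | 2 |
| ND4L | N | 9404–9691 | | 288 | ATT | TAA | –7 |
| trnT | J | 9694–9758 | TGT | 65 | | | 2 |
| trnP | N | 9759–9820 | TGG | 62 | | | 0 |
| ND6 | J | 9823–10299 | | 477 | ATG | TAA | 2 |
| CYTB | J | 10304–11440 | | 1137 | ATG | TAA | 4 |
| trnS2 (UCN) | J | 11456–11524 | TGA | 69 | | | 15 |
| ND1 | N | 11548–12477 | | 930 | ATA | TAA | 23 |
| trnL1 (CUN) | N | 12472–12539 | TAG | 68 | | | –6 |
| 16S rRNA | N | 12540–13816 | | 1277 | | | 0 |
| trnV | N | 13817–13886 | TAC | 70 | | | 0 |
| 12S rRNA | N | 13887–14707 | | 821 | | | 0 |
| CR | | 14708–15887 | | 1180 | | | 0 |

* Numbers correspond to nucleotides separating a gene from the upstream one; negative numbers indicate that adjacent genes overlap.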
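The intergenic values in the table follow directly from the gene coordinates: the start of a gene minus the end of the upstream gene minus one. The short Python sketch below reproduces this for a few rows of the table; the function name and the tuple layout are assumptions made for illustration.

```python
def intergenic_nucleotides(features):
    """Return (gene, gap) pairs for consecutive features.

    Each feature is (name, start, end) with 1-based inclusive coordinates.
    A positive gap is a spacer, a negative gap is an overlap with the
    upstream gene, and 0 means the genes are directly adjacent.
    """
    gaps = []
    for (_, _, prev_end), (name, start, _) in zip(features, features[1:]):
        gaps.append((name, start - prev_end - 1))
    return gaps


# Coordinates copied from the ND2 to COX1 rows of the table above
features = [
    ("ND2", 206, 1189),
    ("trnW", 1198, 1265),
    ("trnC", 1258, 1321),
    ("trnY", 1331, 1397),
    ("COX1", 1407, 2943),
]
gaps = intergenic_nucleotides(features)
```

For these rows the sketch returns 8, –8, 9, and 9, matching the tabulated values, including the 8-bp overlap between trnW and trnC.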
Nucleotide composition and codon usage
The base content and skewness of the genes in the P. rufipes mitogenome are shown in Table 3. The base composition of the entire sequence is in the order A (42.0%) > T (35.7%) > C (12.4%) > G (9.9%), with a bias toward A + T. This bias was observed in all genetic elements, with an A + T content of 77.1% in PCGs, 77.7% in tRNAs, 79.8% in rRNAs, and 78.7% in the control region. The complete genome also shows a positive AT-skew (0.08) and a negative GC-skew (−0.11), indicating a greater abundance of A than T and of C than G.
Nucleotide composition and skewness of the Pentatoma rufipes mitochondrial genome.

| Feature | Length (bp) | A% | C% | G% | T% | A+T% | AT-skew | GC-skew |
|---|---|---|---|---|---|---|---|---|
| Whole genome | 15737 | 42.0 | 12.4 | 9.9 | 35.7 | 77.7 | 0.08 | –0.11 |
| PCGs | 11046 | 34.2 | 11.1 | 11.8 | 42.9 | 77.1 | –0.11 | 0.03 |
| PCG-J | 6800 | 37.2 | 12.6 | 11.7 | 38.5 | 75.7 | –0.02 | –0.04 |
| PCG-N | 4246 | 29.4 | 8.8 | 11.9 | 49.9 | 79.3 | –0.26 | 0.15 |
| tRNA genes | 1460 | 39.7 | 10.0 | 12.3 | 38.0 | 77.7 | 0.02 | 0.10 |
| tRNA genes-J | 936 | 40.6 | 11.0 | 11.1 | 37.3 | 77.9 | 0.04 | 0.01 |
| tRNA genes-N | 524 | 38.0 | 8.2 | 14.4 | 39.3 | 77.3 | –0.02 | 0.27 |
| rRNA genes | 2053 | 35.6 | 7.6 | 12.6 | 44.2 | 79.8 | –0.11 | 0.25 |
| Control region | 1142 | 38.3 | 13.6 | 7.6 | 40.4 | 78.7 | –0.03 | –0.28 |
The bias in nucleotide composition is also reflected in codon usage. The relative synonymous codon usage (RSCU) values for the P. rufipes mitogenome are summarized in Figure 2 and Table 4. Figure 3 shows the amino acid composition of the P. rufipes mitogenome. The most common amino acids are Phe, Leu, Ile, and Met, and their most abundant codons (UUU for Phe, UUA for Leu2, AUU for Ile, and AUA for Met) are all composed of A and/or U. For each amino acid, the most commonly used codons end in A or U (NNA and NNU), reflecting the skew of the nucleotide composition toward A + T. In addition, for most amino acids the most frequently used codon does not strictly correspond to the tRNA anticodon.
Amino acid composition in the Pentatoma rufipes mitogenome. Codon families are provided on the x-axis. Numbers of codons of each amino acid are provided on the y-axis.
https://binary.pensoft.net/fig/552984
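Codon number and RSCU in the Pentatoma rufipes mitochondrial PCGs.

| Amino acid | Codon | N | RSCU | N+ | RSCU+ | N– | RSCU– |
|---|---|---|---|---|---|---|---|
| Phe | UUU | 260 | 1.7 | 118 | 1.49 | 142 | 1.92 |
| | UUC | 46 | 0.3 | 40 | 0.51 | 6 | 0.08 |
| Leu2 | UUA | 440 | 4.92 | 238 | 4.89 | 202 | 4.95 |
| | UUG | 16 | 0.18 | 6 | 0.12 | 10 | 0.24 |
| Leu1 | CUU | 47 | 0.53 | 19 | 0.39 | 28 | 0.69 |
| | CUC | 1 | 0.01 | 1 | 0.02 | 0 | 0 |
| | CUA | 30 | 0.34 | 26 | 0.53 | 4 | 0.1 |
| | CUG | 3 | 0.03 | 2 | 0.04 | 1 | 0.02 |
| Ile | AUU | 382 | 1.83 | 255 | 1.8 | 127 | 1.91 |
| | AUC | 35 | 0.17 | 29 | 0.2 | 6 | 0.09 |
| Met | AUA | 274 | 1.83 | 179 | 1.86 | 95 | 1.78 |
| | AUG | 25 | 0.17 | 13 | 0.14 | 12 | 0.22 |
| Val | GUU | 80 | 1.99 | 33 | 1.43 | 47 | 2.72 |
| | GUC | 5 | 0.12 | 1 | 0.04 | 4 | 0.23 |
| | GUA | 68 | 1.69 | 51 | 2.22 | 17 | 0.99 |
| | GUG | 8 | 0.2 | 7 | 0.3 | 1 | 0.06 |
| Ser2 | UCU | 95 | 2.11 | 31 | 1.24 | 64 | 3.18 |
| | UCC | 9 | 0.2 | 6 | 0.24 | 3 | 0.15 |
| | UCA | 111 | 2.46 | 76 | 3.04 | 35 | 1.74 |
| | UCG | 1 | 0.02 | 0 | 0 | 1 | 0.05 |
| Pro | CCU | 74 | 2.31 | 48 | 2.04 | 26 | 3.06 |
| | CCC | 13 | 0.41 | 9 | 0.38 | 4 | 0.47 |
| | CCA | 41 | 1.28 | 37 | 1.57 | 4 | 0.47 |
| | CCG | 0 | 0 | 0 | 0 | 0 | 0 |
| Thr | ACU | 60 | 1.47 | 42 | 1.33 | 18 | 1.95 |
| | ACC | 11 | 0.27 | 5 | 0.16 | 6 | 0.65 |
| | ACA | 91 | 2.23 | 78 | 2.48 | 13 | 1.41 |
| | ACG | 1 | 0.02 | 1 | 0.03 | 0 | 0 |
| Ala | GCU | 61 | 1.88 | 42 | 1.77 | 19 | 2.17 |
| | GCC | 11 | 0.34 | 9 | 0.38 | 2 | 0.23 |
| | GCA | 54 | 1.66 | 44 | 1.85 | 10 | 1.14 |
| | GCG | 4 | 0.12 | 0 | 0 | 4 | 0.46 |
| Tyr | UAU | 170 | 1.85 | 67 | 1.7 | 103 | 1.96 |
| | UAC | 14 | 0.15 | 12 | 0.3 | 2 | 0.04 |
| His | CAU | 59 | 1.66 | 45 | 1.58 | 14 | 2 |
| | CAC | 12 | 0.34 | 12 | 0.42 | 0 | 0 |
| Gln | CAA | 47 | 1.84 | 35 | 2 | 12 | 1.5 |
| | CAG | 4 | 0.16 | 0 | 0 | 4 | 0.5 |
| Asn | AAU | 179 | 1.8 | 114 | 1.74 | 65 | 1.91 |
| | AAC | 20 | 0.2 | 17 | 0.26 | 3 | 0.09 |
| Lys | AAA | 102 | 1.79 | 73 | 1.9 | 29 | 1.57 |
| | AAG | 12 | 0.21 | 4 | 0.19 | 8 | 0.43 |
| Asp | GAU | 63 | 1.88 | 38 | 1.81 | 25 | 2 |
| | GAC | 4 | 0.12 | 4 | 0.19 | 0 | 0 |
| Glu | GAA | 73 | 1.7 | 56 | 1.9 | 17 | 1.26 |
| | GAG | 13 | 0.3 | 3 | 0.1 | 10 | 0.74 |
| Cys | UGU | 42 | 1.71 | 12 | 1.6 | 30 | 1.76 |
| | UGC | 7 | 0.29 | 3 | 0.4 | 4 | 0.24 |
| Trp | UGA | 91 | 1.88 | 68 | 1.97 | 23 | 1.64 |
| | UGG | 6 | 0.12 | 1 | 0.03 | 5 | 0.36 |
| Arg | CGU | 13 | 0.96 | 2 | 0.23 | 11 | 2.32 |
| | CGC | 2 | 0.15 | 1 | 0.11 | 1 | 0.21 |
| | CGA | 35 | 2.59 | 30 | 3.43 | 5 | 1.05 |
| | CGG | 4 | 0.3 | 2 | 0.23 | 2 | 0.42 |
| Ser1 | AGU | 40 | 0.89 | 14 | 0.56 | 26 | 1.29 |
| | AGC | 5 | 0.11 | 3 | 0.12 | 2 | 0.1 |
| | AGA | 96 | 2.13 | 69 | 2.76 | 27 | 1.34 |
| | AGG | 4 | 0.09 | 1 | 0.04 | 3 | 0.15 |
| Gly | GGU | 64 | 1.32 | 28 | 0.9 | 36 | 2.06 |
| | GGC | 6 | 0.12 | 2 | 0.06 | 4 | 0.23 |
| | GGA | 102 | 2.1 | 82 | 2.65 | 20 | 1.14 |
| | GGG | 22 | 0.45 | 12 | 0.39 | 10 | 0.57 |

N, N+, and N– are, respectively, the number of codons used in all protein-coding genes, in the protein-coding genes on the majority strand (J-strand), and in the protein-coding genes on the minority strand (N-strand). Values in bold type indicate the most commonly used codon for each amino acid. Underlined codons indicate the cognate codon of the tRNA for each amino acid.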
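RSCU is conventionally defined as the observed count of a codon divided by the mean count of all synonymous codons in its family. The sketch below applies that definition to counts from the N column of the table; it is an illustrative calculation, not the MEGA routine used in the study, and the function name is an assumption. The tabulated Leu values appear consistent with Leu1 and Leu2 being pooled into a single six-codon family, so the second example makes that assumption explicit.

```python
def rscu(codon_counts: dict) -> dict:
    """RSCU for one synonymous family: count / mean count across the family."""
    total = sum(codon_counts.values())
    if total == 0:
        return {codon: 0.0 for codon in codon_counts}
    mean = total / len(codon_counts)
    return {codon: count / mean for codon, count in codon_counts.items()}


# Counts copied from the N column of the table above
phe = rscu({"UUU": 260, "UUC": 46})

# Assumption: Leu1 and Leu2 are treated as one six-codon family
leu = rscu({"UUA": 440, "UUG": 16, "CUU": 47, "CUC": 1, "CUA": 30, "CUG": 3})
```

Rounded to two decimals, these calls give 1.70 and 0.30 for Phe and 4.92 for UUA, in line with the tabulated RSCU values.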
PCG regions
Most P. rufipes PCGs use an ATN start codon (five with ATG, three with ATT, two with ATA, and one with ATC), except for COX1 and ATP8, which start with TTG. The COX1 and COX2 sequences terminate with a single T, and the stop codon for the remaining genes is TAA. The A + T content (77.1%) of the 13 PCGs exceeds the G + C content (22.9%), and the AT-skew is moderately negative (absolute value: 0.1–0.2).
In addition, we calculated the synonymous substitution rate (Ks), the non-synonymous substitution rate (Ka), and the Ka/Ks ratio for each of the 13 PCGs from pentatomoid insects and compared their evolutionary rates (Fig. 4). The evolutionary rate of ATP8 was the fastest, followed by that of ND6, and COX1 was the most conserved gene, with the slowest rate. The evolutionary rates of the other genes were in the order ND2 > ND4 > ND5 > ND4L > ATP6 > ND3 > ND1 > COX2 > COX3 > CYTB. Moreover, the Ks values of the 13 PCGs were all greater than the Ka values, so every Ka/Ks ratio was <1, which indicates that the genes are subject to purifying selection.
Evolutionary rates of 13 PCGs in Pentatomoidea. Rate of non-synonymous substitutions (Ka), rate of synonymous substitutions (Ks) and ratio of rate of non-synonymous substitutions to rate of synonymous substitutions (Ka/Ks) are calculated for each PCG.
https://binary.pensoft.net/fig/552985
tRNA genes, rRNA genes, and the control region
We detected 22 tRNA genes, which together can carry all 20 amino acids, in the mitogenome of P. rufipes. There are two tRNAs each for leucine and serine: trnL1 (CUN) and trnL2 (UUR), and trnS1 (AGN) and trnS2 (UCN), respectively. The anticodons of the two trnL genes are TAA and TAG, and the anticodons of the two trnS genes are ACT and TGA. The 22 tRNA genes span 1,481 bp in total and range from 62 to 74 bp in length. Although trnS1 lacks a dihydrouridine arm, all other tRNA genes have the classic cloverleaf secondary structure. In addition to the typical base pairs (A-U and G-C), some G-U wobble pairs, which can form stable bonds, appear in these secondary structures; atypical U-U and U-C pairings are also found (Fig. 5).
Predicted secondary structure of tRNA genes in the Pentatoma rufipes mitogenome.
https://binary.pensoft.net/fig/552986
The two P. rufipes rRNA genes (12S rRNA and 16S rRNA) are encoded on the N-strand. The 16S rRNA gene, which is 1,277 bp long, is located between trnL1 (CUN) and trnV, and there is no overlap between 16S rRNA and these two tRNA genes. The 12S rRNA gene (821 bp) is located between trnV and the control region, as in the published pentatomid mitogenomes. The base content of the rRNA genes is in the order T (44.2%) > A (35.6%) > G (12.6%) > C (7.6%). The AT-skew is negative, and the GC-skew is positive. The complete secondary structures of the 12S rRNA and 16S rRNA genes are shown in Figures 6 and 7, respectively.
Predicted secondary structure of the 16S rRNA in the Pentatoma rufipes mitogenome.
https://binary.pensoft.net/fig/552988
The control region of the P. rufipes mitogenome is located between the 12S rRNA gene and trnI. It is 1,180 bp long, making it the longest noncoding region in the mitogenome, and has an A + T content of 78.7%. The AT-skew and GC-skew of the control region are –0.03 and –0.28, respectively, indicating that T is more abundant than A and C is more abundant than G.
Saturation test
To eliminate the negative effect of the substitution saturation in the phylogenetic analysis, saturation tests on the three data sets were conducted. Nucleotide sequence substitution saturation is usually determined by analyzing the relationship between the transition and transversion values against the corresponding corrected genetic distance. In all tests, the Xia saturation index (Iss) was below the critical values for a symmetric (Iss.cSym) and asymmetric (Iss.cAsym) topology (Fig. 8). The values for base transition and transversion were linearly associated with the corrected genetic distance, indicating that the nucleotide sequences of these three datasets were not saturated, making them suitable for constructing phylogenetic trees.
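The quantities plotted in a saturation test are the numbers of transitions and transversions between pairs of aligned sequences. The sketch below counts them for one pair; it is only a simplified illustration and does not reproduce the DAMBE analysis, the TN93 distance correction, or the Xia index. The function name and the toy sequences are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a: str, seq_b: str):
    """Count transitions and transversions between two aligned sequences.

    Sites with gaps or ambiguous symbols in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy aligned pair (hypothetical sequences)
ts, tv = count_substitutions("ACGTAC-A", "GCTTAT-C")
```

When such counts keep increasing roughly linearly with the corrected distance, as reported here, the data are regarded as unsaturated; a plateau would instead suggest that multiple hits are masking the signal.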
Substitution patterns of the PCGRNA, PCG, and PCG12 matrices. The graphs show transitions and transversions plotted against increasing TN93 distance. A: PCGRNA saturation plot. B: PCG saturation plot. C: PCG12 saturation plot.
We reconstructed the phylogenetic trees of eight families in Pentatomoidea from three datasets (PCGRNA, PCG, and PCG12) using the Bayesian inference method. The topologies of the trees were similar; in particular, the PCG and PCG12 trees showed similar family-level relationships (Figs 9–11). Among the three trees, the Bayesian posterior probabilities of the tree based on the PCGRNA dataset were the highest. Phylogenetic analysis based on the PCGRNA dataset showed that P. rufipes and Dinorhynchus dybowskyi Jakovlev were closely related, that these two species together formed the sister group of E. gebleri, and that P. rufipes and Graphosoma rubrolineatum (Westwood) were the most distantly related. The results based on the PCG dataset were somewhat different: in this analysis, P. rufipes and E. gebleri were the most closely related species, and together they formed the sister group of D. dybowskyi.
Phylogenetic tree inferred from the PCG12 dataset using Bayesian inference. The numbers on the branches indicate Bayesian posterior probabilities.
https://binary.pensoft.net/fig/552992
Discussion and conclusions
In this study, we sequenced the complete mitogenome of P. rufipes using NGS technology, revealing a mitogenome that is 15,887 bp long and contains 37 genes. The order of the 37 genes is consistent with that of other published hemipteran mitogenomes (Hua et al. 2009; Zhang et al. 2018; Zhao et al. 2019). There are three conspicuous overlapping regions in the mitogenome of P. rufipes. The longest overlap, located between trnW and trnC, is 8 bp long, and the overlapping bases are AAGCTTTA. This overlap is also found in most species of Pentatomidae (Yuan et al. 2015a; Zhao et al. 2019). The other two gene pairs, ATP8/ATP6 and ND4/ND4L, each overlap by 7 bp, and the overlapping bases are ATGATAA in both cases, which is consistent with other hemipteran insects (Zhang et al. 2019; Zhao et al. 2020). The longest spacer region falls between trnS2 and ND1, which is consistent with the findings of other studies (Hua et al. 2008; Zhao et al. 2019). The difference in mitogenome size between P. rufipes and other species of Hemiptera is due to differences in the length of the noncoding region.
The composition of the four bases in the P. rufipes mitogenome is highly unbalanced (A>T>C>G). The nucleotide composition shows an obvious A + T preference, and the entire genome shows a positive AT-skew and a negative GC-skew. These features of base composition are common to all sequenced species of Pentatomidae. Such compositional bias is generally considered to be caused by asymmetric mutation and selection pressure on the four bases (Brown et al. 2005). Consistent with most species of Hemiptera, the PCGs of this species use the common triplet ATN as the start codon and TAA or a single T as the stop codon (Hua et al. 2008; Zhao et al. 2019).
The secondary structures of the P. rufipes tRNAs are conserved, and trnS1 lacks the DHU arm; these features are characteristic of metazoan mitochondrial genomes (Wolstenholme 1992). In addition to the typical Watson-Crick pairings (G-C and A-U), there are also some atypical pairings such as U-G, U-C, and U-U. Some scholars have proposed that tRNAs with non-Watson-Crick pairs can be converted into fully functional tRNAs through post-transcriptional mechanisms (Chao et al. 2008; Pons et al. 2014). The rRNA secondary structure of this species is also conserved. The 12S rRNA sequence includes three domains and the 16S rRNA sequence includes six domains (domain III is absent), which is similar to other pentatomoid insects.
Our phylogenetic results show some topological differences from those of other studies, and we infer the following possible reasons. First, the number and identity of the sampled taxa differ. In this study, Pentatoma was relatively closely related to Plautia and distant from Graphosoma. However, when the phylogenetic tree was constructed with Pentatoma semiannulata, Pentatoma was closer to Graphosoma (Wang et al. 2021). Second, the choice of outgroup also affects the topology of the phylogenetic tree. Our phylogeny differs from those of Zhao et al. (2017) and Liu et al. (2019) because the three studies chose different species as outgroups. Third, different molecular markers might also affect the inferred phylogenetic relationships. Grazia et al. (2008) supported the monophyly of Pentatomoidea and most of the included families based on morphological characters and molecular markers (16S rRNA, 18S rRNA, 28S rRNA, and COI). Lis et al. (2012) constructed phylogenetic trees similar to ours using 12S and 16S rDNA datasets. Tian et al. (2011) (based on Hox genes), Liu et al. (2019) (based on PCGRNA and PCG12RNA), and Li et al. (2005) (based on 18S rDNA and COX1 sequences) also put forward their own views on the phylogenetic relationships of the superfamily. Across our three topologies, the Bayesian posterior probabilities of the tree based on the PCGRNA dataset were higher than those of the trees based on the PCG data, indicating that the inclusion of tRNA and rRNA genes improves node support, which is consistent with the findings of Cameron et al. (2007, 2009).
In summary, the mitogenome of P. rufipes has a typical sequence structure, and its gene content, nucleotide composition, codon usage, RNA structures, and evolutionary rates of PCGs are similar to those of other published pentatomid mitogenomes. The mitochondrial genome of P. rufipes reveals the phylogenetic position of Pentatoma, indicating that the mitogenome can be used to reveal phylogenetic relationships at different taxonomic levels in insects. However, more insect mitogenomes should be sequenced to provide further insight into the phylogenetic relationships of species from different taxa.
Acknowledgments
This research was supported by National Science Foundation of China [no. 31872272 and 31501876]; Shanxi Scholarship Council of China [no. 2020-064 and 2020-065], and Shanxi Graduate Innovation Project of Shanxi Province [no. 2020SY215]. The authors have declared that no competing interests exist.
References
Bai J, Xu S, Nie Z, Wang Y, Zhu C, Wang Y, Min W, Cai Y, Zou J, Zhou X (2018) The complete mitochondrial genome of Huananpotamon lichuanense (Decapoda: Brachyura) with phylogenetic implications for freshwater crabs. 646: 217–226. https://doi.org/10.1016/j.gene.2018.01.015
Ballard JWO, Whitlock MC (2004) The incomplete natural history of mitochondria. 13: 729–744. https://doi.org/10.1046/j.1365-294X.2003.02063.x
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA (2012) SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. 19: 455–477. https://doi.org/10.1089/cmb.2012.0021
Bantock T, Botting J (2013) British bugs: an online identification guide to UK Hemiptera. http://www.britishbugs.org.uk/index.html
Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, Pütz J, Middendorf M, Stadler PF (2013) MITOS: improved de novo metazoan mitochondrial genome annotation. 69: 313–319. https://doi.org/10.1016/j.ympev.2012.08.023
Brown TA, Cecconi C, Tkachuk AN, Bustamante C, Clayton DA (2005) Replication of mitochondrial DNA occurs by strand displacement with alternative light-strand origins, not via a strand-coupled mechanism. 19: 2466–2476. https://doi.org/10.1101/gad.1352105
Boore JL (1999) Animal mitochondrial genomes. 27: 1767–1780. https://doi.org/10.1093/nar/27.8.1767
Boore JL (2006) The use of genome-level characters for phylogenetic reconstruction. 21: 439–446. https://doi.org/10.1016/j.tree.2006.05.009
Bu Y, Zheng ZM, Guo K (2005) Sequence analysis of mtDNA-COII gene and molecular phylogeny on five species of Pentatoma (Hemiptera: Pentatomidae). 27: 90–96.
Cameron SL (2014) Insect mitochondrial genomics: implications for evolution and phylogeny. 59: 95–117. https://doi.org/10.1146/annurev-ento-011613-162007
Cameron SL, Whiting MF (2008) The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. 408: 112–123. https://doi.org/10.1016/j.gene.2007.10.023
Cameron SL, Lambkin CL, Barker SC, Whiting MF (2007) A mitochondrial genome phylogeny of Diptera: whole genome sequence data accurately resolve relationships over broad timescales with high precision. 32: 40–59. https://doi.org/10.1111/j.1365-3113.2006.00355.x
Cameron SL, Sullivan J, Song H, Miller KB, Whiting MF (2009) A mitochondrial genome phylogeny of the Neuropterida (lace-wings, alderflies and snakeflies) and their relationship to the other holometabolous insect orders. 38: 575–590. https://doi.org/10.1111/j.1463-6409.2009.00392.x
Chao JA, Patskovsky Y, Almo SC, Singer RH (2008) Structural basis for the coevolution of a viral RNA–protein complex. 15: 103–105. https://doi.org/10.1038/nsmb1327
Clary DO, Wolstenholme DR (1985) The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. 22: 252–271. https://doi.org/10.1007/BF02099755
Coil D, Jospin G, Darling AE (2015) A5-miseq: an updated pipeline to assemble microbial genomes from Illumina MiSeq data. 31: 587–589. https://doi.org/10.1093/bioinformatics/btu661
Edgar RC (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5: e113. https://doi.org/10.1186/1471-2105-5-113
Fan ZH, Liu GQ (2012) Bifurcipentatoma, a new genus of Pentatomini with descriptions of two new species from China (Hemiptera: Heteroptera: Pentatomidae). 28: 14–28. https://doi.org/10.11646/zootaxa.3274.1.2
Grazia J, Schuh RT, Wheeler WC (2008) Phylogenetic relationships of family groups in Pentatomoidea based on morphology and DNA sequences (Insecta: Heteroptera). 24: 932–976. https://doi.org/10.1111/j.1096-0031.2008.00224.x
Hamilton I, Heath J (1976) Predation of Pentatoma rufipes (L.) (Hemiptera: Pentatomidae) upon Zygaena filipendulae (L.) (Lepidoptera: Zygaenidae). The Irish Naturalists’ Journal 18: 337.
Henry TJ (1997) Phylogenetic analysis of family groups within the infraorder Pentatomomorpha (Hemiptera: Heteroptera), with emphasis on the Lygaeoidea. 90: 275–301. https://doi.org/10.1093/aesa/90.3.275
Hsiao TY (1977) A Handbook for the Determination of the Chinese Heteroptera. Science Press, Beijing, 127–128.
Hua J, Li M, Dong P, Cui Y, Xie Q, Bu W (2008) Comparative and phylogenomic studies on the mitochondrial genomes of Pentatomomorpha (Insecta: Hemiptera: Heteroptera). 9: 1–15. https://doi.org/10.1186/1471-2164-9-610
Hua J, Li M, Dong P, Cui Y, Xie Q, Bu W (2009) Phylogenetic analysis of the true water bugs (Insecta: Hemiptera: Heteroptera: Nepomorpha): evidence from mitochondrial genomes. 9: 1–11. https://doi.org/10.1186/1471-2148-9-134
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A (2012) Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. 28: 1647–1649. https://doi.org/10.1093/bioinformatics/bts199
Kumar S, Stecher G, Tamura K, Medicine E (2016) MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. 33: 1–11. https://doi.org/10.1093/molbev/msw054
Lanfear R, Calcott B, Ho SYW, Guindon S (2012) PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. 29: 1695–1701. https://doi.org/10.1093/molbev/mss020
Li HM, Deng RQ, Wang JW, Chen ZY, Jia FL, Wang XZ (2005) A preliminary phylogeny of the Pentatomomorpha (Hemiptera: Heteroptera) based on nuclear 18S rDNA and mitochondrial DNA sequences. 37: 313–326. https://doi.org/10.1016/j.ympev.2005.07.013
Liang QY (2009) Molecular phylogeny analysis of 9 Species of Pentatoma (Hemiptera: Pentatomidae) inferred from COI gene sequence. 31: 105–114.
Ling ZP, Zheng LY (1987) A study on the systematics of Pentatoma Olivier (Heteroptera: Pentatomidae). 9: 39–50.
Liu YQ, Song F, Jiang P, Wilson JJ, Cai WZ, Li H (2018) Compositional heterogeneity in true bug mitochondrial phylogenomics. 118: 135–144. https://doi.org/10.1016/j.ympev.2017.09.025
Liu YQ, Li H, Song F, Zhao YS, Wilson JJ, Cai WZ (2019) Higher-level phylogeny and evolutionary history of Pentatomomorpha (Hemiptera: Heteroptera) inferred from mitochondrial genome sequences. 44: 810–819. https://doi.org/10.1111/syen.12357
Lis JA, Lis P, Ziaja DJ, Kocorek A (2012) Systematic position of Dinidoridae within the superfamily Pentatomoidea (Hemiptera: Heteroptera) revealed by the Bayesian phylogenetic analysis of the mitochondrial 12S and 16S rDNA sequences. 3423: 61–68. https://doi.org/10.11646/zootaxa.3423.1.5
Lis JA, Ziaja DJ, Lis B, Gradowska P (2017) Non-monophyly of the “cydnoid” complex within Pentatomoidea (Hemiptera: Heteroptera) revealed by Bayesian phylogenetic analysis of nuclear rDNA sequences. 75: 481–496.
Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. 25: 955–964. https://doi.org/10.1093/nar/25.5.955
Nass MM, Nass S (1963) Intramitochondrial fibers with DNA characteristics. I. Fixation and electron staining reactions. 19: 593–611. https://doi.org/10.1083/jcb.19.3.593
Neupert S, Russell WK, Russell DH, López JD, Predel R, Nachman RJ (2009) Neuropeptides in Heteroptera: identification of allatotropin-related peptide and tachykinin-related peptides using MALDI-TOF mass spectrometry. 30: 483–488. https://doi.org/10.1016/j.peptides.2008.11.009
Perna NT, Kocher TD (1995) Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. 41: 353–358. https://doi.org/10.1007/BF01215182
Peusens G, Beliën T (2012) Life cycle and control of the forest bug Pentatoma rufipes L. 
in organically managed pear orchards.77: 663–666.PonsJBauzà-RibotMMJaumeDJuanC (2014) Next-generation sequencing, phylogenetic signal and comparative mitogenomic analyses in Metacrangonyctidae (Amphipoda: Crustacea). BioMed Central 15: e566. https://doi.org/10.1186/1471-2164-15-566PowellG (2020) The biology and control of an emerging shield bug pest , Pentatomarufipes (L.) (Hemiptera: Pentatomidae).22: 298–308. https://doi.org/10.1111/afe.12408RiderDASchwertnerCFVilímováJRédeiDKmentPThomasDB (2018) Higher systematics of the Pentatomoidea. In: Mcpherson JE (Ed.) Invasive Stink Bugs and Related Species (Pentatomoidea) CRC Press, London, 25–193. https://doi.org/10.1201/9781315371221-2RonquistFTeslenkoMVan Der MarkPAyresDLDarlingAHöhnaSLargetBLiuLSuchardMAHuelsenbeckJP (2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space.61: 539–542. https://doi.org/10.1093/sysbio/sys029SchaeferCW (1993) The Pentatomomorpha (Hemiptera: Heteroptera): an annotated outline of its systematics history.90: 105–122.SimonSHadrysHA (2013) A comparative analysis of complete mitochondrial genomes among Hexapoda.69: 393–403. https://doi.org/10.1016/j.ympev.2013.03.033TianXXieQLiMGaoCCuiYXiLBuW (2011) Phylogeny of pentatomomorphan bugs (Hemiptera-Heteroptera: Pentatomomorpha) based on six Hox gene fragments.2888: 57–68. https://doi.org/10.11646/zootaxa.2888.1.5WangJJiYTLiHSongFZhangLSWangMQ (2021) Characterization of the complete mitochondrial genome of Pentatomasemiannulata (Hemiptera: Pentatomidae).6: 750–752. https://doi.org/10.1080/23802359.2021.1875912WuYZYuSSWangYHWuHYLiXRMenXYZhangYWRédeiDXieQBuWJ (2016) The evolutionary position of lestoniidae revealed by molecular autapomorphies in the secondary structure of rrna besides phylogenetic reconstruction (Insecta: Hemiptera: Heteroptera).4: 750–763. 
https://doi.org/10.1111/zoj.12385WuYZRédeiDEgerJr JWangYHWuHYCarapezzaAKmentPCaiBSunXYGuoPLLuoJYXieQ (2017) Phylogeny and the colourful history of jewel bugs (Insecta: Hemiptera: Scutelleridae).34: 502–516. https://doi.org/10.1111/cla.12224XiaXHXieZ (2001) DAMBE: software package for data analysis in molecular biology and evolution.92: 371–373. https://doi.org/10.1093/jhered/92.4.371XiaXHLemeyP (2009) Assessing substitution saturation with DAMBE. In: LemeyPSalemiMVandammeAM (Eds) The Phylogenetic Handbook: a Practical Approach to Phylogenetic Analysis and Hypothesis Testing., 615–630. https://doi.org/10.1017/CBO9780511819049.022YuanMGuoZ (2016) Research progress of mitochondrial genomes of Hemiptera Insects.46: 151–166. https://doi.org/10.1360/N052015-00229YuanMLZhangQLGuoZLWangJShenYY (2015a) Comparative mitogenomic analysis of the superfamily Pentatomoidea (Insecta: Hemiptera: Heteroptera) and phylogenetic implications. BMC Genomics 16: e460. https://doi.org/10.1186/s12864-015-1679-xYuanMLZhangQLGuoZLWangJShenYY (2015b) The complete mitochondrial genome of Corizustetraspilus (Hemiptera: Rhopalidae) and phylogenetic analysis of Pentatomomorpha. PLoS ONE 10: e0129003. https://doi.org/10.1371/journal.pone.0129003ZhangDLLiMLiTYuanJJBuWJ (2018) A mitochondrial genome of Micronectidae and implications for its phylogenetic position.119: 747–757. https://doi.org/10.1016/j.ijbiomac.2018.07.191ZhangDLGaoJLiMYuanJLiangJYangHBuW (2019) The complete mitochondrial genome of Tetraphlepsaterrimus (Hemiptera: Anthocoridae): Genomic comparisons and phylogenetic analysis of Cimicomorpha.130: 369–377. https://doi.org/10.1016/j.ijbiomac.2019.02.130ZhaoLWeiJLuYWangBHaoSZhaoQ (2019) The complete mitochondrial genome of Menidaviolacea (Hemiptera: Pentatomidae) and its phylogenetic implication.4: 1953–1954. 
https://doi.org/10.1080/23802359.2019.1617055ZhaoQWangJWangMQCaiBZhangHFWeiJF (2018) Complete mitochondrial genome of Dinorhynchusdybowskyi (Hemiptera: Pentatomidae: Asopinae) and phylogenetic analysis of Pentatomomorpha species. Journal of Insect Science 18: e44. https://doi.org/10.1093/jisesa/iey031ZhaoQCassisGZhaoLHeYFZhangHFWeiJF (2020) The complete mitochondrial genome of Zicronacaerulea (Linnaeus) (Hemiptera: Pentatomidae: Asopinae) and its phylogenetic implications.4747: 547–561. https://doi.org/10.11646/zootaxa.4747.3.8ZhaoWQZhaoQLiMWeiJFZhangXHZhangHF (2017) Characterization of the complete mitochondrial genome and phylogenetic implications for Eurydema maracandica (Hemiptera: Pentatomidae).2: 550–551. https://doi.org/10.1080/23802359.2017.1365649ZhaoWQZhaoQLiMWeiJFZhangXHZhangHF (2019) Comparative mitogenomic analysis of the Eurydema genus in the context of representative Pentatomidae (Hemiptera: Heteroptera) taxa.19: 1–12. https://doi.org/10.1093/jisesa/iez122