Research Article
Corresponding author: Tianxiang Gao (gaotianxiang0611@163.com)
Academic editor: Maria Elina Bichuette
© 2022 Lu Liu, Qi Liu, Tianxiang Gao.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Liu L, Liu Q, Gao T (2022) Genome-wide survey reveals the phylogenomic relationships of Chirolophis japonicus Herzenstein, 1890 (Stichaeidae, Perciformes). ZooKeys 1129: 55-72. https://doi.org/10.3897/zookeys.1129.91543
Fish are the largest vertebrate group, comprising more than 30 000 species of important ecological and economic value, yet the genomes of fewer than 3% of fish species have been published. Herein, the genome of a fish, Chirolophis japonicus, was sequenced using next-generation sequencing. Approximately 595.7 megabase pairs of the C. japonicus genome were assembled (49 901 contigs with a GC content of 42.61%), leading to the prediction of 46 729 protein-coding gene models. A total of 554 136 simple sequence repeats were identified in the whole genome of C. japonicus, and dinucleotide microsatellite motifs were the most abundant, accounting for 59.49%. Phylogenomic analysis of 16 genomes based on 694 single-copy genes suggests that C. japonicus is closely related to Anarrhichthys ocellatus, Cebidichthys violaceus, and Pholis gunnellus. The results provide more thorough genetic information on C. japonicus and a theoretical basis and reference for further genome-wide analysis.
Chirolophinae, draft genome, genome assembly, genome evolution, next-generation sequencing, Stichaeidae, Zoarcales
Chirolophis Swainson, 1839 belongs to the family Stichaeidae of the order Perciformes, which is widely distributed in cold and temperate areas of the Pacific Ocean and along the Atlantic coasts of Europe (
Genome-based phylogenetic studies have provided new opportunities for exploring the phylogeny of fishes. With the development of molecular biology and sequencing technology, a growing number of species, ranging from model fishes to many commercial species, have had their genomes sequenced and published. As of 7 July 2022, nearly 9900 eukaryotic species had published genomes in the NCBI database (https://www.ncbi.nlm.nih.gov/genome/). Genome survey sequencing (GSS) is considered useful for providing basic genome information. Besides efficiently identifying genome-wide simple sequence repeats (SSRs), it can predict putative gene functions and target potential exon-intron boundaries. A series of advances has been made in the study of phylogenomic relationships of organisms, such as plants (
In the order Perciformes, the genomes of only three species, Anarrhichthys ocellatus (Ayres, 1855), Cebidichthys violaceus (Girard, 1854), and Pholis gunnellus (Linnaeus, 1758), have been published so far (
In the present study, we performed a genomic survey of C. japonicus using next-generation sequencing technology for the first time, investigated its genomic features, and reconstructed its phylogenomic relationships using single-copy orthologous genes. The draft genome assembly of C. japonicus can provide useful information for taxonomic studies, studies of adaptive evolutionary mechanisms, and phylogenetic studies, help clarify the genomic evolution of Chirolophis, and provide a molecular basis for further work on C. japonicus.
In this study, a male specimen of C. japonicus with body length 186 mm and body weight 225 g was collected from coastal waters of Qingdao (35°40'N, 119°30'E), China in July 2021 (Fig.
The total cell DNA was extracted using the phenol-chloroform method (
Quality control was performed on the raw data from the Illumina sequencing platform using FastQC v. 0.11.9 (
The gene predictors Augustus v. 3.3.3 (
In this study, the MIcroSAtellite identification tool (MISA) v. 2.1 was used to identify simple sequence repeats (SSRs) in the draft genome of C. japonicus (
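The idea behind perfect-SSR detection can be illustrated with a minimal Python sketch. It is not MISA itself, and the minimum repeat thresholds in `MIN_REPEATS` are assumed values chosen for illustration; they are not settings reported in this study. The function names `is_primitive` and `find_ssrs` are hypothetical.

```python
# Minimum number of repeats per motif length (1-6 bp).
# ASSUMPTION: illustrative thresholds, not the settings used in this study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits, i = [], 0
    while i < len(seq):
        for k in sorted(min_repeats):
            motif = seq[i:i + k]
            # Skip truncated motifs, ambiguous bases, and non-primitive motifs.
            if len(motif) < k or set(motif) - set("ACGT") or not is_primitive(motif):
                continue
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_repeats[k]:
                hits.append((i, motif, n))
                i += n * k - 1  # jump to the last base of this SSR
                break
        i += 1
    return hits

# Toy sequence with one mono-, one di-, and one trinucleotide repeat.
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCA" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy)
```

Grouping the hits by motif length (`len(motif)`) gives the per-class counts from which percentages such as those reported for mono- to hexanucleotide motifs are derived.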
A total of 15 genomes of other bony fish were downloaded from the NCBI database (Table
Species | Biosample | Bioproject | References |
---|---|---|---|
Anarrhichthys ocellatus | SAMN10245424 | PRJNA496475 | |
Archocentrus centrarchus | SAMN09948522 | PRJNA489129 | |
Cebidichthys violaceus | SAMN06857690 | PRJNA384078 | |
Chirolophis japonicus | This study | | |
Cyclopterus lumpus | SAMN12629502 | PRJNA625538 | |
Gasterosteus aculeatus | SAMN15223905 | PRJNA707557 | |
Gymnodraco acuticeps | SAMEA104242997 | PRJEB37639 | |
Liparis tanakae | SAMN10970109 | PRJNA523297 | |
Micropterus salmoides | SAMN15299117 | PRJNA687018 | |
Myoxocephalus scorpius | SAMEA4028818 | PRJEB12469 | |
Pholis gunnellus | SAMEA7522838 | PRJEB45449 | |
Pseudoliparis sp. | SAMN10662039 | PRJNA512070 | |
Seriola lalandi | SAMN04902367 | PRJNA319656 | |
Taurulus bubalis | SAMEA7522994 | PRJEB45317 | |
Toxotes jaculatrix | SAMN18445299 | PRJNA723051 | |
Ophiodon elongatus | SAMN13559843 | PRJNA595583 | |
Raw genome sequencing data have been deposited in the Sequence Read Archive under accession number SRR21530970. These data can be accessed through BioProject PRJNA879413 at NCBI.
In this study, a total of 65.4 Gb of clean reads was obtained by next-generation sequencing on an Illumina NovaSeq 6000 platform. The Q20 value, Q30 value, and GC content were 98.17%, 94.83%, and 43.14%, respectively. The K-mer analysis with a depth of 71 estimated the genome size of C. japonicus at 596 Mb, with a heterozygosity rate of 0.50% and a repeat content of 30.30% (Table
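As a rough illustration of how a K-mer depth distribution yields a genome size estimate, the sketch below applies the commonly used relation genome size ≈ total number of K-mers / peak K-mer depth. This is an assumed, simplified procedure and not necessarily the tool or model used in this study; the function name `estimate_genome_size`, the `min_depth` cutoff, and the toy histogram are all hypothetical.

```python
def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size from a K-mer depth histogram.

    histogram maps K-mer depth -> number of distinct K-mers seen at that depth.
    ASSUMPTION: depths below min_depth are treated as sequencing errors and ignored.
    """
    usable = {d: c for d, c in histogram.items() if d >= min_depth}
    # The main peak approximates the sequencing depth of unique genomic K-mers.
    peak_depth = max(usable, key=usable.get)
    # Total K-mer occurrences = sum over depths of (depth * distinct K-mers).
    total_kmers = sum(d * c for d, c in usable.items())
    return total_kmers / peak_depth, peak_depth

# Toy histogram: an error peak at depth 1 and a main peak at depth 20.
toy_histogram = {1: 5000, 18: 100, 19: 300, 20: 500, 21: 300, 22: 100}
size, peak = estimate_genome_size(toy_histogram)
```

In this toy case the 26 000 usable K-mer occurrences divided by the peak depth of 20 give an estimate of 1300 positions. Real estimates additionally model heterozygous and repeat peaks, which this sketch ignores.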
The genome of C. japonicus, sequenced from a male specimen on an Illumina NovaSeq 6000 platform, was assembled using MaSuRCA. The assembly spans 595.7 Mb with a GC content of 42.61% (Table
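Summary figures such as contig number, assembly span, and GC content can be recomputed from an assembly FASTA file. The following is a minimal sketch under stated assumptions: the helper names `read_fasta` and `assembly_stats` are hypothetical, the input is a toy example rather than the actual assembly, and GC content is computed over unambiguous bases (A, C, G, T) only.

```python
def read_fasta(lines):
    """Yield one sequence string per record from FASTA-formatted lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)

def assembly_stats(contigs):
    """Return contig count, total length, and GC percentage of an assembly."""
    contigs = [c.upper() for c in contigs]
    total = sum(len(c) for c in contigs)
    gc = sum(c.count("G") + c.count("C") for c in contigs)
    # ASSUMPTION: ambiguous bases such as N are excluded from the GC denominator.
    acgt = gc + sum(c.count("A") + c.count("T") for c in contigs)
    return {"contigs": len(contigs), "length": total,
            "gc_percent": round(100 * gc / acgt, 2)}

# Toy FASTA with two contigs; the second contains ambiguous bases.
toy_fasta = [">contig1", "ATGC", "GGCC", ">contig2", "AATTNN"]
stats = assembly_stats(read_fasta(toy_fasta))
```

For a real file, the same functions could be applied to an open file handle, for example `assembly_stats(read_fasta(open(path)))`, where `path` is a placeholder for the assembly location.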
Category | Database | Number of genes | Percent (%) |
---|---|---|---|
Protein-coding gene models | | 46 729 | |
Annotated | InterPro | 37 169 | 79.54 |
Annotated | eggNOG | 37 742 | 80.98 |
Annotated | GO | 9353 | 20.02 |
Annotated | KEGG_KO | 17 747 | 37.98 |
Annotated | Pfam | 26 530 | 56.77 |
Annotated | KOG | 35 440 | 75.84 |
Annotated | CAZymes | 765 | 1.64 |
Assembly BUSCO coverage | | | 88.9 |
The complete mitogenome of C. japonicus is 16 522 bp long with a GC content of 45.97%. It consists of two ribosomal RNA genes (rnl and rns), 20 tRNA genes, and 13 protein-coding genes (PCGs) without introns (Fig.
A total of 46 729 protein-coding genes was predicted by a combination of different software, including Augustus v. 3.3.3 (
A total of 554 136 SSRs were identified in the genome of C. japonicus, including 166 077 mononucleotide (29.97%), 329 685 dinucleotide (59.49%), 37 615 trinucleotide (6.79%), 17 896 tetranucleotide (3.23%), 1568 pentanucleotide (0.28%), and 1322 hexanucleotide (0.24%) microsatellite motifs (Fig.
The distributions and frequencies of microsatellite motifs of Chirolophis japonicus: (a) mononucleotide, (b) dinucleotide, (c) trinucleotide, (d) tetranucleotide, (e) pentanucleotide, and (f) hexanucleotide microsatellite motifs.
In the present study, the phylogenomic relationship of a total of 16 bony fish (Table
A maximum likelihood (ML) phylogenomic tree of Chirolophis japonicus based on amino acid sequences of 694 single-copy genes. Chirolophis japonicus is in bold. Maximum likelihood bootstrap values (90%) of each clade are indicated along branches. A scale bar in the upper right indicates substitutions per site.
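The selection of single-copy genes for such an analysis can be sketched as a simple filter: keep only those orthogroups in which every species contributes exactly one gene. The example below is illustrative only; the orthogroup identifiers, gene names, and the function `single_copy_orthogroups` are hypothetical, and the orthology inference itself (done by dedicated software) is assumed to have been carried out beforehand.

```python
def single_copy_orthogroups(orthogroups, species):
    """Return IDs of orthogroups with exactly one gene in every listed species.

    orthogroups maps orthogroup ID -> {species name -> list of gene IDs}.
    """
    return [
        og_id
        for og_id, members in orthogroups.items()
        if all(len(members.get(sp, [])) == 1 for sp in species)
    ]

# Toy input with three species (the study compares 16 genomes).
species = ["C_japonicus", "P_gunnellus", "A_ocellatus"]
toy_orthogroups = {
    "OG1": {"C_japonicus": ["g1"], "P_gunnellus": ["g7"], "A_ocellatus": ["g9"]},
    "OG2": {"C_japonicus": ["g2", "g3"], "P_gunnellus": ["g8"], "A_ocellatus": ["g10"]},
    "OG3": {"C_japonicus": ["g4"], "P_gunnellus": ["g11"]},  # missing in one species
}
selected = single_copy_orthogroups(toy_orthogroups, species)
```

Only `OG1` passes the filter here: `OG2` contains a duplicated gene and `OG3` lacks one species. The retained orthogroups would then be aligned and concatenated before tree inference.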
The maximum likelihood (ML) phylogenetic tree based on the amino acid sequences of 13 protein-coding genes (PCGs): ATP6, ATP8, COX1, COX2, COX3, CYTB, ND1, ND2, ND3, ND4, ND4L, ND5, and ND6. ML support values greater than 60% are given at the corresponding clades. A scale bar in the upper left indicates substitutions per site.
Currently, more than 30 000 species of fishes, including bony, jawless, and cartilaginous fishes, live on Earth, some with great ecological and economic value. In 2002, the first fish genome, that of Fugu rubripes (also known as “torafugu”), was published, which provided a framework for future studies of fish genomes (
In the present study, the genome of a fish not previously sequenced, Chirolophis japonicus, was sequenced. The genome size was estimated to be 596 Mb based on the K-mer analysis, and the genome assembled using MaSuRCA spanned 595.7 Mb (Table
At present, phylogenomic analysis has become an important method for studying the evolutionary relationships of organisms, such as plants (
Microsatellite DNA markers show many advantages, such as codominance, extensive distribution, abundant polymorphism, and convenient analysis, and are considered an effective tool in genetic analysis and evolutionary research (
Finally, the genome assembly of C. japonicus can help us understand the genome evolution of Chirolophis and teleosts, as well as provide a molecular basis for breeding and cultivation.
We sincerely thank the reviewers for their constructive comments. We would like to thank Yuan Zhang and Chenghao Jia for assistance in sample collection and sorting. This work was supported by the Zhejiang Provincial Key Research and Development Program (2021C02047) and the Doctoral Research Foundation of Shandong Jiaotong University (Grant/Award Number: BS201902051).
K-mer analyses (K = 71) of Chirolophis japonicus. The X-axis and Y-axis represent the K-mer depth and the frequency at the corresponding depth, respectively.
Data type: image
Maximum likelihood phylogenomic tree of Chirolophis japonicus
Data type: phylogenomic tree
Explanation note: Maximum likelihood phylogenomic tree of Chirolophis japonicus based on amino acids of 694 single-copy genes.