VOG Alignment Dataset
The newly developed gene order alignment method has been applied to a comprehensive comparison of all KEGG organisms and viruses, and the precomputed VOG-based alignment results are made available here. For a given organism or virus, the following interface searches the precomputed alignment datasets for locally similar gene orders represented as VOG sequences.
Examples: 2811091 (Cotonvirus japonicus), klm (Klebsiella sp. M5al)
Gene order alignment for improving viral gene annotation
As of January 2025, only about 8% of viral genes are annotated with KOs, compared with over 50% of genes in cellular organisms. There appear to be candidate genes that could be annotated, but their sequence similarity is not high enough for the KO assignment tools used internally.
Thus, efforts have been initiated to include conserved gene orders for KO assignments. The current procedure is the following.
- All viral proteins are examined for classification into Viral Ortholog Groups (VOGs). Roughly 90% of 680 thousand proteins are assigned VOG identifiers.
- All proteins of KEGG organisms are examined to determine whether they can be considered to belong to any VOG. About 1.3% of 50 million proteins are given VOG identifiers.
- Each cellular organism genome is compared against all virus genomes to detect locally similar gene orders (represented as VOG sequences) by the newly developed gene order alignment tool. About 68% of over 10,000 genomes share synteny regions of 3 or more genes with viruses.
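The algorithm behind the gene order alignment tool is not described here, so the following is only a minimal sketch of the general idea of the last step. It assumes that each genome is reduced to a sequence of VOG identifiers (with None for genes that have no VOG) and substitutes a standard Smith-Waterman local alignment for the actual tool. The function name, the scoring values, and the identifiers such as "VOG_A" are hypothetical placeholders.

```python
from typing import List, Optional, Tuple

Gene = Optional[str]  # a VOG identifier, or None when the gene has no VOG


def local_gene_order_alignment(
    a: List[Gene],
    b: List[Gene],
    match: int = 2,      # assumed score for two genes sharing a VOG
    mismatch: int = -1,  # assumed penalty for genes with different/no VOG
    gap: int = -1,       # assumed penalty for skipping a gene
) -> Tuple[int, List[Tuple[int, int]]]:
    """Smith-Waterman local alignment over two VOG sequences.

    Returns the best local score and the (index_in_a, index_in_b)
    pairs of genes that share a VOG within the best-scoring region.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = a[i - 1] is not None and a[i - 1] == b[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the local score drops to zero.
    pairs: List[Tuple[int, int]] = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        same = a[i - 1] is not None and a[i - 1] == b[j - 1]
        if score[i][j] == score[i - 1][j - 1] + (match if same else mismatch):
            if same:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Hypothetical gene orders: a cellular genome segment and a virus genome.
host = ["VOG_X", None, "VOG_A", "VOG_B", "VOG_C", None]
virus = ["VOG_Y", "VOG_A", "VOG_B", "VOG_C", "VOG_Z"]

best_score, shared = local_gene_order_alignment(host, virus)
# A region counts as shared synteny here if it contains 3 or more VOG matches.
is_synteny_region = len(shared) >= 3
```

In this toy input the three consecutive genes VOG_A, VOG_B, VOG_C appear in the same order in both sequences, so the region meets the threshold of 3 or more genes mentioned above. A real comparison would also need to handle gene orientation and circular genomes, which this sketch ignores.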