Genome similarity
In the genome alignment tool, the genome is characterized by the sequence of KOs, and the similarity of genomes is obtained by comparing KO sequences. Here, by contrast, the genome of an organism or a virus is characterized by the composition (distinct set) of KOs or modules, and a simple measure of genome similarity is introduced to rapidly identify similar genomes and organism groups. Three types of similarity measures are defined as shown below.
Search KEGG organisms and viruses with similar KO composition
Search KEGG organisms and viruses with similar module composition
similarity = match / (num1 + num2 - match)
similarity1 = match / num1
similarity2 = (match / num1 + match / num2 ) / 2
where
num1 = number of distinct KOs/modules in genome 1
num2 = number of distinct KOs/modules in genome 2
match = number of matching KOs/modules in genomes 1 and 2
The second type, similarity1, may indicate whether and to what extent a shorter query genome is embedded in a longer genome.
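The three measures can be sketched in a few lines of Python by treating each genome as a set of identifiers. This is only an illustration, not the tool's actual implementation: the function names, the KO identifiers in the example (arbitrary placeholders), and the choice to return 0.0 for an empty genome are all assumptions.

```python
from typing import Set


def similarity(genome1: Set[str], genome2: Set[str]) -> float:
    """First type: match / (num1 + num2 - match)."""
    match = len(genome1 & genome2)
    denom = len(genome1) + len(genome2) - match
    # Assumption: two empty genomes are given similarity 0.0.
    return match / denom if denom else 0.0


def similarity1(genome1: Set[str], genome2: Set[str]) -> float:
    """Second type: match / num1 (fraction of genome 1 found in genome 2)."""
    if not genome1:
        return 0.0  # assumption: avoid division by zero
    return len(genome1 & genome2) / len(genome1)


def similarity2(genome1: Set[str], genome2: Set[str]) -> float:
    """Third type: (match / num1 + match / num2) / 2."""
    if not genome1 or not genome2:
        return 0.0  # assumption: avoid division by zero
    match = len(genome1 & genome2)
    return (match / len(genome1) + match / len(genome2)) / 2


# Placeholder KO sets: a short query fully contained in a longer target.
query = {"K00001", "K00002"}
target = {"K00001", "K00002", "K00003", "K00004"}

print(similarity(query, target))   # 0.5
print(similarity1(query, target))  # 1.0
print(similarity2(query, target))  # 0.75
```

In this example num1 = 2, num2 = 4 and match = 2. similarity1 reaches 1.0 because every KO of the shorter query is present in the longer genome, while the other two measures are lowered by the size difference.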
Organism group similarity
An organism group is a collection of genomes and can be characterized by the combined and distinct set of KOs. The third type of similarity measure, similarity2, is used to compare KEGG organism groups: the six top-level groups of animals, plants, fungi, protists, bacteria and archaea, and the second-level groups. The resulting dendrograms are shown below:
- 1st level organism groups (coloring by six groups)
- 2nd level organism groups (coloring by six groups)
- 2nd level eukaryote groups (coloring by six groups)
- 2nd level prokaryote groups (coloring by kingdom)
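The group comparison described above can be sketched as follows: the KO sets of the member genomes are combined into one distinct set per group, and the groups are then compared pairwise with similarity2. The group names, member genomes and KO identifiers are hypothetical placeholders. How the dendrograms are built from the pairwise values is not stated here, so no clustering step is shown.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def group_ko_set(genomes: Iterable[Set[str]]) -> Set[str]:
    """Combined and distinct set of KOs over all genomes in a group."""
    combined: Set[str] = set()
    for kos in genomes:
        combined |= kos
    return combined


def similarity2(set1: Set[str], set2: Set[str]) -> float:
    """Third type: (match / num1 + match / num2) / 2."""
    if not set1 or not set2:
        return 0.0  # assumption: empty groups get similarity 0.0
    match = len(set1 & set2)
    return (match / len(set1) + match / len(set2)) / 2


# Hypothetical groups, each a list of per-genome KO sets.
groups = {
    "group_a": [{"K00001", "K00002"}, {"K00002", "K00003"}],
    "group_b": [{"K00002", "K00003", "K00004"}, {"K00004", "K00005"}],
}

combined: Dict[str, Set[str]] = {
    name: group_ko_set(members) for name, members in groups.items()
}

# Pairwise similarity2 between every pair of groups.
pairwise: Dict[Tuple[str, str], float] = {
    (a, b): similarity2(combined[a], combined[b])
    for a, b in combinations(sorted(combined), 2)
}

for pair, value in pairwise.items():
    print(pair, round(value, 4))
```

Here group_a combines to 3 distinct KOs and group_b to 4, with 2 in common, so similarity2 is (2/3 + 2/4) / 2.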
Virus similarity
Since the KO assignment rate is very low for viral proteins, computationally generated VOGs may be used to measure similarity among viruses. Here the 30% level VOG (VOG30) is used.
Search viruses and KEGG organisms with similar VOG composition
When virus groups are compared by the combined and distinct set of VOGs, they are found to be very different even at the Family level as shown below.
- Virus groups at the Family level (coloring by realms)
Metagenome similarity
The KEGG Metagenome dataset at GenomeNet is annotated with K numbers by GhostKOALA. Although the annotation quality is low, the KO composition may still be used to uncover organism groups in the metagenome.
Search metagenomes and KEGG organisms with similar KO composition
Last updated: April 24, 2026
