
KO Database of Molecular Functions

The KO (KEGG Orthology) database is a database of molecular functions represented in terms of functional orthologs. A functional ortholog is manually defined in the context of KEGG molecular networks, namely, KEGG pathway maps, BRITE hierarchies and KEGG modules. For example, when a pathway map is drawn, each box is given a KO identifier (called a K number), and experimentally characterized genes and proteins in specific organisms are used to find orthologs in other organisms. The granularity of "function" is context-dependent: the resulting KO grouping may correspond to a group of highly similar sequences from a limited set of organisms, or it may be a more divergent group.

The KO system is a network-based classification of KOs shown below:
KEGG Orthology (KO)
It consists of six top categories (09100 to 09160) for KEGG pathway maps and one top category (09180) for BRITE hierarchies, as well as one top category (09190) for those KOs that are not yet included in either of them. The category numbers for these top categories and the second-level categories under metabolism (09101 to 09112) are used to define color coding of functions (see KEGG Color Codes).
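
As a rough illustration of the category scheme described above, the following Python sketch sorts a five-digit category code into one of the three kinds of top category. The function names are hypothetical, and the explicit list of the six pathway codes is an assumption based on the stated range (09100 to 09160) with the metabolism second-level codes (09101 to 09112) excluded.

```python
# Top-level KO category codes, following the description in the text.
# ASSUMPTION: the six pathway top categories are exactly these codes.
PATHWAY_TOP = {"09100", "09120", "09130", "09140", "09150", "09160"}
BRITE_TOP = "09180"          # BRITE hierarchies
NOT_INCLUDED_TOP = "09190"   # KOs not yet in pathway maps or BRITE


def classify_top_category(code: str) -> str:
    """Return which kind of top category a five-digit code belongs to."""
    if code in PATHWAY_TOP:
        return "pathway"
    if code == BRITE_TOP:
        return "brite"
    if code == NOT_INCLUDED_TOP:
        return "not included"
    raise ValueError(f"not a top category code: {code!r}")


def is_metabolism_second_level(code: str) -> bool:
    """True for the second-level metabolism categories 09101 to 09112."""
    return len(code) == 5 and code.isdigit() and 9101 <= int(code) <= 9112
```

A color-coding scheme could then be keyed on these codes, for example by looking up a color for each top category and each second-level metabolism category.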

Major efforts have been made to associate each KO entry with experimental evidence of functionally characterized sequence data, now shown in the SEQUENCE subfield of the REFERENCE field. Similar efforts have also been made for EC numbers in Enzyme Nomenclature. The addendum category of the GENES database, which allows functionally characterized individual protein sequences to be included in KEGG, has played a major role in these efforts.

KO Assignment and KEGG Mapping

Genome annotation in KEGG contains two unique aspects, ortholog annotation and network reconstruction, as summarized below.

Ortholog annotation (KO assignment)
  • Molecular functions are stored in the KO (KEGG Orthology) database containing orthologs of experimentally characterized genes/proteins.
  • Genome annotation in KEGG assigns KO identifiers (K numbers) to individual genes in the genome.
Network reconstruction (KEGG mapping)
  • Functional orthologs are defined in the context of KEGG pathway maps and other molecular networks, which are all created as networks of K number nodes.
  • The genome annotation procedure to convert a gene set in the genome to a K number set leads to automatic reconstruction of KEGG pathways and other networks by the process called KEGG mapping, enabling interpretation of high-level functions.
The following interface provides access to some of the KEGG mapping functions (see also KEGG Annotation).

Enter K numbers      (Example) K00161 K00162 K00163 K00627 K00382
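
The two steps summarized above (converting a gene set to a K number set, then locating those K numbers in networks of K number nodes) can be sketched in Python as follows. This is a minimal illustration, not the KEGG implementation: the function names, the gene identifiers, and the toy networks are all hypothetical, and the only format assumed is that a K number is the letter K followed by five digits.

```python
import re

# A K number is the letter K followed by five digits, e.g. K00161.
K_NUMBER = re.compile(r"\bK\d{5}\b")


def parse_k_numbers(text):
    """Extract K numbers from free text, keeping first-seen order, no repeats."""
    return list(dict.fromkeys(K_NUMBER.findall(text)))


def genes_to_k_numbers(annotation):
    """Convert a gene -> K number annotation into a K number set.

    Genes without an assignment (None) are skipped.
    """
    return {ko for ko in annotation.values() if ko is not None}


def map_to_networks(k_numbers, networks):
    """Return, for each network, the sorted K number nodes present in the query."""
    found = set(k_numbers)
    mapped = {}
    for name, nodes in networks.items():
        hits = sorted(found & nodes)
        if hits:
            mapped[name] = hits
    return mapped


# HYPOTHETICAL data: network names and memberships are made up for illustration.
toy_networks = {
    "toy_network_A": {"K00161", "K00162", "K00627", "K00382"},
    "toy_network_B": {"K00163", "K00627", "K00382"},
    "toy_network_C": {"K99999"},
}

query = parse_k_numbers("K00161 K00162 K00163 K00627 K00382")
mapped = map_to_networks(query, toy_networks)
```

Here `mapped` contains entries for `toy_network_A` and `toy_network_B` only, since none of the query K numbers is a node of `toy_network_C`.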

KOALA and BlastKOALA

The KO assignment of the GENES database is performed by both automatic and manual versions of the KOALA (KEGG Orthology And Links Annotation) tool, which utilizes the SSDB database containing SSEARCH computation results for all pairwise genome comparisons. BlastKOALA is a web server for automatic KO assignment using a similar algorithm and BLAST search on the fly.

                KOALA                             BlastKOALA
Purpose         Internal GENES annotation         Genome annotation service
Search program  SSEARCH                           BLASTP
Scoring         Weighted sum of SW scores         Weighted sum of BLAST bit scores
                (KOALA scoring)                   (modified KOALA scoring)
Database        Entire GENES database sequences   Non-redundant pangenome sequences

KOALA scoring includes: SW (Smith-Waterman) score, best-best flag, overlap of alignment, ratio of query and DB sequences, taxonomic category and Pfam domains.
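
The general idea of a weighted sum of scores can be illustrated with the Python sketch below. It is not the KOALA algorithm: the actual weighting scheme is not given here, so the weight function, its factor of 2.0 for best-best hits, the field names, and the example hits are all hypothetical. The sketch only shows how per-hit scores might be weighted, summed per candidate KO, and compared.

```python
from collections import defaultdict


def toy_weight(hit):
    """HYPOTHETICAL weight: alignment overlap fraction, doubled for best-best hits."""
    weight = hit["overlap"]
    if hit["best_best"]:
        weight *= 2.0  # arbitrary illustrative factor, not a KOALA parameter
    return weight


def assign_ko(hits, weight_fn=toy_weight):
    """Sum weighted scores per candidate KO and return (best KO, all totals).

    Each hit is a dict with the keys "ko", "score", "overlap" and "best_best".
    Hits whose "ko" is None (database sequences without a KO) are ignored.
    """
    totals = defaultdict(float)
    for hit in hits:
        if hit["ko"] is None:
            continue
        totals[hit["ko"]] += weight_fn(hit) * hit["score"]
    if not totals:
        return None, {}
    best = max(totals, key=totals.get)
    return best, dict(totals)


# HYPOTHETICAL search hits for one query sequence.
example_hits = [
    {"ko": "K00161", "score": 100.0, "overlap": 1.0, "best_best": True},
    {"ko": "K00162", "score": 150.0, "overlap": 1.0, "best_best": False},
    {"ko": "K00161", "score": 50.0, "overlap": 0.5, "best_best": False},
    {"ko": None, "score": 300.0, "overlap": 1.0, "best_best": False},
]

best_ko, ko_totals = assign_ko(example_hits)
```

With these made-up numbers, K00161 accumulates a higher total than K00162 even though its best single score is lower, which is the kind of effect a weighted sum over several hits is meant to capture.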


Reference
  1. Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M., and Tanabe, M.; KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, D457-D462 (2016).
  2. Kanehisa, M., Sato, Y., and Morishima, K.; BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J. Mol. Biol. 428, 726-731 (2016).

Last updated: July 1, 2018