

Linking genomes to pathways by ortholog annotation


KEGG Orthology (KO) System

The KEGG pathway maps, BRITE functional hierarchies, and KEGG modules are represented in a generic way so that they apply to all organisms. The KEGG Orthology (KO) system is the basis for this representation. It consists of manually defined ortholog groups (KO entries) for all proteins and functional RNAs that correspond to KEGG pathway nodes, BRITE hierarchy nodes, and KEGG module nodes. Once genes are assigned KO identifiers (K numbers) by the genome annotation procedure described below, the organism-specific pathway maps, BRITE functional hierarchies, and KEGG modules are generated automatically (see KEGG mapping for details).
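As a rough illustration of how K number assignments turn a generic representation into an organism-specific one, the sketch below uses invented data. The pathway names, the gene identifiers, and the grouping of K numbers into pathways are all hypothetical, and the function `organism_pathways` is not part of any KEGG tool.

```python
from collections import defaultdict

# Toy data, invented for illustration only.
# Generic reference: pathway name -> K numbers that appear as its nodes.
REFERENCE_PATHWAYS = {
    "pathway_A": {"K00161", "K00162", "K00627", "K00382"},
    "pathway_B": {"K00163", "K00627"},
}

# Annotation result for one organism: gene -> assigned K number (None = unassigned).
GENE_TO_KO = {
    "geneX_0001": "K00161",
    "geneX_0002": "K00162",
    "geneX_0003": "K00627",
    "geneX_0004": None,
    "geneX_0005": "K00627",
}


def organism_pathways(reference, gene_to_ko):
    """Return pathway -> {K number -> genes} for the nodes this organism has."""
    # Invert the annotation so each K number lists the genes carrying it.
    ko_to_genes = defaultdict(list)
    for gene, ko in gene_to_ko.items():
        if ko is not None:
            ko_to_genes[ko].append(gene)

    result = {}
    for pathway, kos in reference.items():
        present = {
            ko: sorted(ko_to_genes[ko])
            for ko in sorted(kos)
            if ko in ko_to_genes
        }
        if present:  # skip pathways with no annotated node in this organism
            result[pathway] = present
    return result


if __name__ == "__main__":
    for pathway, nodes in organism_pathways(REFERENCE_PATHWAYS, GENE_TO_KO).items():
        print(pathway, nodes)
```

The generic reference is kept separate from the per-organism annotation, so the same reference can be reused for every genome once its genes carry K numbers.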


Genome Annotation in KEGG

Genome annotation in KEGG is essentially cross-species annotation that assigns K numbers to orthologous genes in all available genomes. It is currently done as follows.
  1. Experimental evidence on known functions of genes and proteins is organized in the KO database, which is created together with the KEGG PATHWAY, KEGG BRITE, and KEGG MODULE databases.
  2. Gene catalogs of complete genomes are generated from RefSeq and other public resources, and stored in the KEGG GENES database.
  3. Sequence similarity scores and best hit relations are computed from GENES by pairwise genome comparisons using SSEARCH, and stored in the KEGG SSDB database.
  4. For each gene in a genome, a GFIT (Gene Function Identification Tool) table is created that details the best-hit genes, including paralogs, in all other genomes.
  5. In the past, K numbers were assigned manually from GFIT tables with the GFIT tool, which is integrated with other tools, including the gene cluster tool for checking the consistency of operon-like structures and the ortholog table for checking the completeness of pathway modules and complexes.
  6. The KOALA (KEGG Orthology And Links Annotation) tool was developed in 2008 to computerize KEGG annotators' knowledge of using GFIT tables. KOALA processes all the GFIT tables at once and makes computational K number assignments.
  7. GFIT tables are continuously updated, and KOALA's computational assignments are automatically reflected for a selected set of well-curated K numbers (about one half of the KO database) in a newly determined genome, and also in the existing genomes that meet various other criteria.
  8. KOALA's computational assignments are repeated every two to three days, and a summary of discrepancies between its assignments and the current annotations is presented. Discrepancies are examined by annotators with the manual version of KOALA and GFIT tools.
  9. Annotation results can be mapped to KEGG pathways, BRITE hierarchies, and KEGG modules for inferring systemic functions of individual organisms, groups of organisms (e.g., pangenomes), and combinations of organisms (e.g., host-pathogen and human-microbiome relationships).
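Step 3 above relies on best hit relations between genomes. The sketch below shows one generic way to derive best hits and bidirectional best hits from pairwise similarity scores. It is an illustration under stated assumptions, not the procedure used to build SSDB or the GFIT tables: the gene names and scores are invented, the function names are hypothetical, and ties are resolved by keeping the first hit seen.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target gene.

    scores: {(query_gene, target_gene): similarity_score} for one direction
    of a pairwise genome comparison. On ties, the first hit seen is kept.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(a_to_b, b_to_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_to_b)
    reverse = best_hits(b_to_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Invented scores between hypothetical genomes A and B.
A_TO_B = {("a1", "b1"): 300, ("a1", "b2"): 120, ("a2", "b2"): 250, ("a3", "b2"): 260}
B_TO_A = {("b1", "a1"): 300, ("b2", "a3"): 260, ("b2", "a2"): 250}

if __name__ == "__main__":
    print(best_hits(A_TO_B))
    print(bidirectional_best_hits(A_TO_B, B_TO_A))
```

In this toy data, `a2` and `a3` both have `b2` as their best hit, but only `a3` is the best hit of `b2` in return. A one-directional hit of this kind is the sort of paralog case that a per-gene table of best hits makes visible.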
Read-only versions of the KEGG annotation tools are available for public view.
  • KOALA - linked from each KEGG ORTHOLOGY entry page
  • GFIT - linked from each KEGG GENES entry page
Each KEGG MODULE entry now contains a link to the ortholog table, which is very useful for checking completeness and consistency among related genomes.
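A completeness check of this kind can be sketched as a small presence/absence table. The example assumes a simplified module made of ordered steps, each satisfied by any one of a set of alternative K numbers. The step grouping, organism names, and function names are hypothetical, and real module definitions may be more complex than this.

```python
# Hypothetical module: each step lists alternative K numbers; any one of them
# is taken to satisfy the step. The grouping is invented for illustration.
MODULE_STEPS = [
    {"K00161", "K00163"},
    {"K00162"},
    {"K00627"},
    {"K00382"},
]

# Invented annotation results: organism -> set of K numbers assigned in its genome.
GENOME_KOS = {
    "org1": {"K00161", "K00162", "K00627", "K00382"},
    "org2": {"K00163", "K00627", "K00382"},
    "org3": {"K00382"},
}


def ortholog_table(steps, genome_kos):
    """Return organism -> list of booleans, one per module step."""
    return {
        org: [bool(step & kos) for step in steps]
        for org, kos in genome_kos.items()
    }


def incomplete_genomes(table):
    """Return organism -> 1-based numbers of the steps with no K number."""
    return {
        org: [i + 1 for i, present in enumerate(row) if not present]
        for org, row in table.items()
        if not all(row)
    }


if __name__ == "__main__":
    table = ortholog_table(MODULE_STEPS, GENOME_KOS)
    for org, row in table.items():
        print(org, " ".join("+" if present else "-" for present in row))
    print(incomplete_genomes(table))
```

Reading the table by row shows whether a module is complete in one genome. Reading it by column shows whether a step is consistently present among related genomes.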

Taxonomy Mapping

Each KO entry contains a link to KEGG taxonomy mapping, which indicates the presence or absence of the orthologous gene in all KEGG organisms under a taxonomic classification.
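The idea of presence or absence under a taxonomic classification can be sketched as a count per taxon. The organisms, lineages, and the function `presence_by_taxon` below are hypothetical and only illustrate the aggregation. They do not reproduce the actual KEGG taxonomy mapping display.

```python
from collections import Counter

# Invented lineages: organism -> (top-level group, lower-level group).
LINEAGE = {
    "org1": ("Bacteria", "group_p"),
    "org2": ("Bacteria", "group_q"),
    "org3": ("Archaea", "group_r"),
    "org4": ("Eukaryotes", "group_s"),
}

# Invented result: organisms in which one particular K number has been assigned.
ORGANISMS_WITH_KO = {"org1", "org3"}


def presence_by_taxon(lineage, organisms_with_ko, rank=0):
    """Return taxon -> (organisms with the KO, all organisms) at the given rank."""
    total = Counter()
    present = Counter()
    for org, taxa in lineage.items():
        taxon = taxa[rank]
        total[taxon] += 1
        if org in organisms_with_ko:
            present[taxon] += 1
    return {taxon: (present[taxon], total[taxon]) for taxon in sorted(total)}


if __name__ == "__main__":
    for taxon, (n_present, n_total) in presence_by_taxon(LINEAGE, ORGANISMS_WITH_KO).items():
        print(f"{taxon}: {n_present}/{n_total}")
```

Changing `rank` moves the summary to a finer level of the classification without touching the underlying presence data.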

Last updated: October 7, 2013
KEGG GenomeNet Kanehisa Laboratories