
KEGG Overview

1. Genomes to Biological System

KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from genomic and molecular-level information. It is a computer representation of the biological system, consisting of molecular building blocks of genes and proteins (genomic information) and chemical substances (chemical information) that are integrated with the knowledge on molecular wiring diagrams of interaction, reaction and relation networks (systems information). It also contains disease and drug information (health information) as perturbations to the biological system.

[Figure: KEGG overview]

The KEGG database has been in development by Kanehisa Laboratories since 1995, and is now a prominent reference knowledge base for integration and interpretation of large-scale molecular data sets generated by genome sequencing and other high-throughput experimental technologies.

2. The KEGG Database

KEGG is an integrated database resource consisting of the seventeen main databases shown below. They are broadly categorized into systems information, genomic information, chemical information and health information, and are further subcategorized by the color coding of web pages.

Category              Database         Content                                    Color
Systems information   KEGG PATHWAY     KEGG pathway maps                          kegg3
                      KEGG BRITE       BRITE functional hierarchies
                      KEGG MODULE      KEGG modules of functional units
Genomic information   KEGG ORTHOLOGY   KEGG Orthology (KO) groups                 kegg4
                      KEGG GENOME      KEGG organisms with complete genomes       kegg1
                      KEGG GENES       Gene catalogs of complete genomes
                      KEGG SSDB        Sequence similarity database for GENES
Chemical information  KEGG COMPOUND    Metabolites and other small molecules      kegg2
                      KEGG GLYCAN      Glycans
                      KEGG REACTION    Biochemical reactions
                      KEGG RPAIR       Reactant pair chemical transformations
                      KEGG RCLASS      Reaction class defined by RPAIR
                      KEGG ENZYME      Enzyme nomenclature
Health information    KEGG DISEASE     Human diseases                             kegg5
                      KEGG DRUG        Drugs
                      KEGG DGROUP      Drug groups
                      KEGG ENVIRON     Crude drugs and health-related substances
The chemical information category is collectively called KEGG LIGAND.
The health information category is collectively called KEGG MEDICUS.

These databases contain various data objects for the computer representation of biological systems. Each database entry is therefore called a KEGG object and is identified by a KEGG object identifier, which generally consists of a database-dependent prefix and a five-digit number (see: KEGG objects).

Release  Database        Object identifier
1995     KEGG PATHWAY    map number
         KEGG GENES      locus_tag / GeneID
         KEGG ENZYME     EC number
         KEGG COMPOUND   C number
1998     KEGG REACTION   R number
2000     KEGG GENOME     organism code / T number
2002     KEGG ORTHOLOGY  K number
2003     KEGG GLYCAN     G number
2004     KEGG RPAIR      RP number
2005     KEGG BRITE      br number
         KEGG DRUG       D number
2007     KEGG MODULE     M number
2008     KEGG DISEASE    H number
2010     KEGG ENVIRON    E number
         KEGG RCLASS     RC number
2014     KEGG DGROUP     DG number
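
The prefix-plus-number scheme in the identifier table above can be illustrated with a small lookup. This is a minimal sketch, not part of any KEGG software: the function name classify_kegg_id is hypothetical, and it assumes that each "X number" in the table means the literal prefix X followed by exactly five digits. Identifiers that do not follow this pattern (EC numbers, locus tags, GeneIDs, organism codes) are deliberately not handled.

```python
import re

# Prefix -> database, transcribed from the identifier table above.
# Assumption: an "X number" is the prefix X followed by exactly five digits.
PREFIX_TO_DATABASE = {
    "map": "KEGG PATHWAY",
    "br": "KEGG BRITE",
    "C": "KEGG COMPOUND",
    "R": "KEGG REACTION",
    "T": "KEGG GENOME",
    "K": "KEGG ORTHOLOGY",
    "G": "KEGG GLYCAN",
    "RP": "KEGG RPAIR",
    "D": "KEGG DRUG",
    "M": "KEGG MODULE",
    "H": "KEGG DISEASE",
    "E": "KEGG ENVIRON",
    "RC": "KEGG RCLASS",
    "DG": "KEGG DGROUP",
}

# One or more letters followed by exactly five digits.
_ID_PATTERN = re.compile(r"([A-Za-z]+)(\d{5})")


def classify_kegg_id(identifier):
    """Return (database, five_digit_string) or None if the pattern does not apply."""
    match = _ID_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    prefix, digits = match.groups()
    database = PREFIX_TO_DATABASE.get(prefix)
    if database is None:
        return None
    return database, digits


if __name__ == "__main__":
    for example in ("C00031", "RP00001", "map00010", "1.1.1.1"):
        print(example, "->", classify_kegg_id(example))
```

Because the whole alphabetic prefix is matched before the lookup, two-letter prefixes such as RP, RC and DG are not confused with the one-letter prefixes R and D.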

3. KEGG Molecular Networks

The most distinctive data objects in KEGG are the molecular networks: molecular interaction, reaction and relation networks representing systemic functions of the cell and the organism. Experimental knowledge on such systemic functions is captured from the literature and organized in the following forms:
  • Pathway map - in KEGG PATHWAY (see: Pathway maps)
  • Functional hierarchy (ontology) - in KEGG BRITE (see: Brite hierarchies)
  • Membership (logical expression) - in KEGG MODULE
  • Membership (simple list) - in KEGG DISEASE
These databases constitute the reference knowledge base for biological interpretation of genomes and high-throughput molecular datasets through the process of KEGG mapping (see: KEGG mapping).

The concept of mapping was first introduced in KEGG in 1995 for linking genomes to metabolic pathways (metabolic reconstruction) using the EC number. Once EC numbers were assigned to enzyme genes in the genome, organism-specific pathways could be generated automatically by matching against the enzyme (EC number) networks of the KEGG reference metabolic pathways. The EC number is no longer used as an identifier in KEGG; the KEGG Orthology (KO) system is now the basis for genome annotation and KEGG mapping.

Period     Identifier   Reference knowledge                Assignment
1995-1999  EC number    Metabolic pathways                 Domain based
2000-2002  Ortholog ID  Metabolic and regulatory pathways  Domain based
2003-      KO           Pathways and BRITE hierarchies     Gene based
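
The mapping step described above (assign identifiers to genes, then match them against the nodes of a reference pathway) can be sketched as a simple set operation. This is only an illustration under stated assumptions, not KEGG's actual procedure: the function names map_genome_to_pathway and pathway_coverage are hypothetical, and the gene names and the reference pathway are toy data that are not taken from KEGG. The same logic applies whether the identifiers are EC numbers or KOs.

```python
def map_genome_to_pathway(gene_to_ids, reference_nodes):
    """Return {identifier: [genes]} for identifiers found in the reference pathway.

    gene_to_ids     -- dict mapping a gene name to the set of identifiers assigned to it
    reference_nodes -- set of identifiers that make up the reference pathway
    """
    mapped = {}
    for gene, identifiers in gene_to_ids.items():
        for identifier in identifiers:
            # Keep only identifiers that occur as nodes of the reference pathway.
            if identifier in reference_nodes:
                mapped.setdefault(identifier, []).append(gene)
    return {identifier: sorted(genes) for identifier, genes in mapped.items()}


def pathway_coverage(mapped, reference_nodes):
    """Fraction of reference nodes with at least one gene assigned."""
    if not reference_nodes:
        return 0.0
    return len(mapped) / len(reference_nodes)


# Toy data (hypothetical): a three-node reference pathway and three annotated genes.
toy_reference = {"1.1.1.1", "2.7.1.1", "4.1.2.13"}
toy_genome = {
    "geneA": {"2.7.1.1"},
    "geneB": {"2.7.1.1", "3.1.3.11"},  # the second identifier is not in the reference
    "geneC": {"4.1.2.13"},
}

if __name__ == "__main__":
    organism_pathway = map_genome_to_pathway(toy_genome, toy_reference)
    print(organism_pathway)
    print(pathway_coverage(organism_pathway, toy_reference))
```

Reference nodes that receive no gene (here "1.1.1.1") are simply absent from the result, which is what makes the generated pathway organism-specific.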


Last updated: July 1, 2014
Copyright 1995-2014 Kanehisa Laboratories