KEGG Taxonomy Files

The KEGG database uses the NCBI taxonomy to classify cellular organisms and viruses. For cellular organisms, the three- or four-letter KEGG organism codes are classified somewhat differently in each of the following Brite hierarchy files.
08601  KEGG organisms
08610  KEGG organisms in the NCBI taxonomy
08611  KEGG organisms in taxonomic ranks (fixed levels of phylum, class, order, family, genus and species)
The 08601 file is manually created to define the order of organism codes, with hsa (Homo sapiens) at the top. The 08610 file is computationally generated from the abbreviated lineage of the NCBI taxonomy, keeping the order of organism codes defined in 08601. In addition to organism codes, the 08610 file contains taxonomy IDs that are linked to GENES Addendum (ag) entries. The 08611 file is also computationally generated, and it organizes the KEGG organisms into fixed levels of taxonomic ranks. Its entries carry the attributes "NCBI assembly level", "KEGG Genome Browser" and "KEGG reference genome".
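As a rough illustration of how such a hierarchy file might be read, the following Python sketch parses a small hand-written sample. It assumes a layout in which each line starts with a capital letter giving its depth (A for the top level, B for the next, and so on), followed by a label, and in which lines starting with other characters are headers or delimiters. The function name parse_hierarchy and the sample text are hypothetical and are not taken from the actual files.

```python
import re

# Hand-written sample in the assumed level-letter layout; NOT real file contents.
SAMPLE = """\
A Eukaryotes
B  Animals
C    Mammals
D      hsa  Homo sapiens
D      mmu  Mus musculus
A Prokaryotes
B  Bacteria
C    Enterobacteria
D      eco  Escherichia coli
"""


def parse_hierarchy(text):
    """Return (path, code, name) for every entry that has no children."""
    entries = []
    for line in text.splitlines():
        # Skip blank lines and anything not starting with a level letter.
        if not line or not ("A" <= line[0] <= "Z"):
            continue
        depth = ord(line[0]) - ord("A")
        # Drop simple markup such as <b>...</b> around category labels.
        label = re.sub(r"<[^>]+>", "", line[1:]).strip()
        if label:
            entries.append((depth, label))

    leaves = []
    path = []
    for i, (depth, label) in enumerate(entries):
        del path[depth:]  # climb back up to the parent of this line
        next_depth = entries[i + 1][0] if i + 1 < len(entries) else -1
        if next_depth > depth:
            path.append(label)  # a deeper line follows, so this is a category
        else:
            code, _, name = label.partition(" ")
            leaves.append((tuple(path), code, name.strip()))
    return leaves


for path, code, name in parse_hierarchy(SAMPLE):
    print(" > ".join(path), "|", code, "|", name)
```

Because the leaves are returned in file order, the sketch preserves whatever ordering of organism codes the file defines.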

For viruses, the taxonomy IDs of KEGG Viruses (GENOME vtax category and GENES vg category) are classified according to the NCBI taxonomy, which is based on the ICTV taxonomy, with the Baltimore classification at the top level added by KEGG.
08620  KEGG viruses in the NCBI taxonomy
08621  KEGG viruses in taxonomic ranks (fixed levels of realm, kingdom, phylum, class, order, family, genus and species)
Both of these Brite hierarchy files are computationally generated, and the lowest-level taxonomy IDs are linked to GENOME vtax entries. The 08620 file shows the taxonomy IDs in the full lineage of the NCBI virus taxonomy, while the 08621 file is organized in the fixed levels of taxonomic ranks from realm to species.
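The difference between a full lineage and fixed rank levels can be sketched in Python as follows. This is only one plausible way to do the projection and is not a description of how KEGG generates its files. The function to_fixed_ranks is hypothetical, and the example lineage was typed by hand for illustration, so it may not match the current NCBI release.

```python
# Fixed rank levels for viruses, as listed for the 08621 file.
VIRUS_RANKS = ("realm", "kingdom", "phylum", "class",
               "order", "family", "genus", "species")


def to_fixed_ranks(lineage, ranks=VIRUS_RANKS):
    """Keep only the listed ranks, in order; absent ranks map to None."""
    by_rank = {}
    for rank, name in lineage:
        by_rank.setdefault(rank, name)  # first occurrence of a rank wins
    return [(rank, by_rank.get(rank)) for rank in ranks]


# Hand-typed full lineage (illustrative only), including intermediate ranks
# such as suborder and subfamily that the fixed levels do not keep.
full_lineage = [
    ("realm", "Riboviria"),
    ("kingdom", "Orthornavirae"),
    ("phylum", "Pisuviricota"),
    ("class", "Pisoniviricetes"),
    ("order", "Nidovirales"),
    ("suborder", "Cornidovirineae"),
    ("family", "Coronaviridae"),
    ("subfamily", "Orthocoronavirinae"),
    ("genus", "Betacoronavirus"),
    ("subgenus", "Sarbecovirus"),
]

for rank, name in to_fixed_ranks(full_lineage):
    print(rank, "=", name)
```

Intermediate ranks are dropped, and a rank missing from the input (here, species) is kept as an empty slot so that every entry has the same number of levels.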

KEGG Taxonomy Browser

KEGG Taxonomy Browser is implemented as the Brite hierarchy viewer for the taxonomy files shown above. The 08611 file for cellular organisms and the 08621 file for viruses are used by default. The browser has a zooming capability to adjust the bottom level of the taxonomic tree, for example, to family or class in eukaryotes and to species or genus in prokaryotes.
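The idea of adjusting the bottom level of the tree can be sketched in Python as collapsing every organism under its ancestor at a chosen rank. This is a minimal sketch and not the browser's implementation. The function zoom is hypothetical, and the three lineages were typed by hand for illustration.

```python
# Fixed rank levels for cellular organisms, as listed for the 08611 file.
ORGANISM_RANKS = ("phylum", "class", "order", "family", "genus", "species")

# Hand-typed lineages aligned with ORGANISM_RANKS (illustrative only).
LEAVES = [
    ("hsa", ("Chordata", "Mammalia", "Primates",
             "Hominidae", "Homo", "Homo sapiens")),
    ("mmu", ("Chordata", "Mammalia", "Rodentia",
             "Muridae", "Mus", "Mus musculus")),
    ("eco", ("Pseudomonadota", "Gammaproteobacteria", "Enterobacterales",
             "Enterobacteriaceae", "Escherichia", "Escherichia coli")),
]


def zoom(leaves, bottom_rank, ranks=ORGANISM_RANKS):
    """Group organism codes under their lineage truncated at bottom_rank."""
    cut = ranks.index(bottom_rank) + 1
    groups = {}
    for code, lineage in leaves:
        groups.setdefault(lineage[:cut], []).append(code)
    return groups


# Zooming out to class merges hsa and mmu; zooming in to family separates them.
print(zoom(LEAVES, "class"))
print(zoom(LEAVES, "family"))
```

A different bottom rank could be chosen for each top-level group by calling zoom separately on the eukaryote and prokaryote entries.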

Taxonomy Mapping

Taxonomy mapping is a method for integrating various biological data, especially genomic features with organism-level features. The following tool displays the taxonomic distributions of KOs (K numbers) and modules (M numbers) as genomic features, optionally combined with user-defined data, such as phenotypic features, using the Join operation of KEGG Mapper.
Select Taxonomy file
Enter K/M numbers
   M00595 K16952 M00596
Enter user-defined data (KEGG organism codes and attributes), or upload a file
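The kind of join described above can be sketched in Python as follows. This is a simplified illustration and not the KEGG Mapper implementation. The functions parse_user_data and taxonomy_mapping are hypothetical, user-defined data is assumed to be given as whitespace-separated lines of organism code and attribute, and the feature presence set and attribute values are invented placeholders rather than real KEGG data.

```python
def parse_user_data(text):
    """Parse 'code attribute' lines into a dict (assumed input layout)."""
    data = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            data[parts[0]] = parts[1].strip()
    return data


def taxonomy_mapping(org_to_group, feature_orgs, user_data):
    """Per taxonomic group: organism count, feature count, joined attributes."""
    table = {}
    for org, group in org_to_group.items():
        row = table.setdefault(
            group, {"total": 0, "with_feature": 0, "attributes": {}})
        row["total"] += 1
        if org in feature_orgs:
            row["with_feature"] += 1
        if org in user_data:
            row["attributes"][org] = user_data[org]  # join on organism code
    return table


# Organism code -> taxonomic group (hand-typed for illustration).
ORG_TO_GROUP = {
    "hsa": "Mammalia",
    "mmu": "Mammalia",
    "eco": "Gammaproteobacteria",
}
# Invented placeholder: organisms assumed to carry some K or M number.
FEATURE_ORGS = {"eco"}
# Invented placeholder attributes standing in for phenotypic features.
USER_TEXT = "hsa phenotype_A\neco phenotype_B\n"

result = taxonomy_mapping(ORG_TO_GROUP, FEATURE_ORGS, parse_user_data(USER_TEXT))
for group, row in result.items():
    print(group, row)
```

Each output row pairs a genomic feature count with the organism-level attributes that fall in the same taxonomic group, which is the combination the mapping is meant to expose.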


Virus Taxonomy Mapping

Select Taxonomy file
Enter K numbers and/or a vg identifier
Example: Coronavirus spike proteins
   K24152 K24324 K24325 K19254
Example: Comparison of VOC and KO
   vg:1486428 K23381
Enter user-defined data (tax id and attributes), or upload a file
Example: Human diseases caused by coronaviruses

Last updated: July 14, 2023