====================================
README for the "kegg" main directory
====================================

KEGG is an integrated database resource consisting of 17 main databases,
broadly categorized into systems information, genomic information, and
chemical information. The database name and the corresponding subdirectory
name on this FTP site are as follows:

--------------------------------------------------------------------------------
Category and database   Subdirectory  Content
--------------------------------------------------------------------------------
Systems information
  KEGG PATHWAY          pathway, xml  KEGG pathway maps
  KEGG BRITE            brite         BRITE functional hierarchies
  KEGG MODULE           module        KEGG modules of functional units
  KEGG DISEASE          medicus       Human diseases
  KEGG DRUG             medicus       Approved drugs
  KEGG ENVIRON          medicus       Crude drugs and health-related substances
Genomic information
  KEGG ORTHOLOGY (KO)   genes         KEGG Orthology (KO) groups
  KEGG GENOME           genes         KEGG organisms with complete genomes
  KEGG GENES            genes         Gene catalogs with manual/koala annotation
  KEGG DGENES           genes         Gene catalogs with automatic annotation
  KEGG SSDB                           Sequence similarity database for GENES
Chemical information
  KEGG COMPOUND         ligand        Metabolites and other small molecules
  KEGG GLYCAN           ligand        Glycans
  KEGG REACTION         ligand        Biochemical reactions
  KEGG RPAIR            ligand        Reactant pair chemical transformations
  KEGG RCLASS           ligand        Reaction class
  KEGG ENZYME           ligand        Enzyme nomenclature
--------------------------------------------------------------------------------

- KEGG DISEASE, DRUG, and ENVIRON are part of KEGG MEDICUS.
- KEGG COMPOUND, GLYCAN, REACTION, RPAIR, RCLASS, and ENZYME are collectively
  called KEGG LIGAND.
- KEGG SSDB contains computed sequence similarity scores and best-hit
  relations for all gene pairs and genome pairs in KEGG GENES. It is not
  included in the FTP distribution.

This FTP site is updated once a week, on Wednesday JST.


Update history
--------------

April 1, 2013
- The .pos file has been removed.
- The K number annotation has been removed from the fasta sequence file.

February 25, 2013
- A new file with extension .kff (KEGG feature format) is available for each
  KEGG organism in the genes/organisms/ subdirectory. It contains a summary
  of gene features and associated KEGG annotations (K number assignments).
- With this new file, there will be two changes in the genes subdirectory in
  March 2013.
  (1) The .kff file is a replacement for the current .pos (gene position)
      file, which will be removed from the genes subdirectory.
  (2) The K number annotation currently given in the definition line of the
      fasta-format amino acid sequence file will be removed, so that this
      file will be updated less frequently, only when the original database
      is updated.

August 6, 2012
- KEGG OC (Ortholog Cluster), computationally generated from KEGG SSDB, is
  now updated on a regular basis.

June 12, 2012
- Due to the rapid increase in the number of new genomes, the single tarball
  genes.tar.gz introduced on March 12, 2012 is discontinued. Instead, each
  organism directory has been reorganized and each entry/sequence file has
  become separately downloadable.

March 12, 2012
- The entire set of KEGG GENES entry files is made available as a single
  tarball (genes.tar.gz). The GENES and GENOME nucleotide sequence files
  (genes.nuc.gz and genes.genome.gz) are deleted from the genes/fasta
  subdirectory. They may be reconstructed from the fasta files for
  individual organisms.

March 6, 2012
- The KEGG organism alias used as a file name, such as H.sapiens and E.coli,
  in KEGG GENES and KEGG GENOME is discontinued. Instead, the T number
  identifier is used.

February 10, 2012
- KEGG EGENES (EST datasets) is no longer included in the genes directory.

July 1, 2011
- The new KEGG FTP site is open.


(c) Kanehisa Laboratories