GenomeNet - DBGET Overview

DBGET Overview

1. Web of Molecular Biology Databases

DBGET is the backbone retrieval system for all GenomeNet databases, including a number of molecular biology databases that are mirrored at GenomeNet. DBGET is based on a flat-file view of molecular biology databases, in which a database is considered a collection of entries. Because each entry is given a unique entry name (or accession number) within its database, any entry in any molecular biology database can be retrieved uniformly by the combination of the database name and the entry name:

database:entry

In KEGG an organism is a collection of genes, which may also be considered as a flat-file database. Any gene or gene product (protein or RNA) in KEGG can thus be specified by the combination of the organism name and the gene name:

organism:gene
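
The two naming forms above share one shape: a name, a colon, and an entry name. A minimal Python sketch of splitting and validating such identifiers might look like the following. The names `EntryId`, `parse_entry_id`, and `format_entry_id` are hypothetical helpers introduced for illustration; they are not part of DBGET, and the sample identifiers are not checked against any real database.

```python
from typing import NamedTuple


class EntryId(NamedTuple):
    """A parsed identifier: a database (or organism) name plus an entry (or gene) name."""
    database: str
    entry: str


def parse_entry_id(identifier: str) -> EntryId:
    """Split 'database:entry' at the first colon and require both parts to be non-empty."""
    database, sep, entry = identifier.strip().partition(":")
    if not sep or not database or not entry:
        raise ValueError(f"expected 'database:entry', got {identifier!r}")
    return EntryId(database, entry)


def format_entry_id(entry_id: EntryId) -> str:
    """Rebuild the 'database:entry' string from its two parts."""
    return f"{entry_id.database}:{entry_id.entry}"


if __name__ == "__main__":
    # Illustrative identifiers only (assumption): one database:entry, one organism:gene.
    for text in ("compound:C00031", "hsa:7157"):
        parsed = parse_entry_id(text)
        print(parsed.database, parsed.entry)
```

Splitting at the first colon only is a design choice of this sketch, so that any later colon stays inside the entry name.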

When two data entries are related in any way, it is customary to incorporate cross-reference information in the molecular biology databases. Examples include links between sequence data and literature data or between amino acid sequence data and nucleotide sequence data. The link information between two entries is a binary relation represented by:

database1:entry1 --> database2:entry2

LinkDB is a collection of all such direct links in the GenomeNet databases as well as indirect links that are computationally obtained by combining multiple links and/or using links in reverse directions.
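
To illustrate how indirect links can be derived from direct ones, the sketch below treats each direct link as a directed edge and searches for a chain of links between two entries, optionally allowing links to be followed in reverse. This is only an illustration of the idea, not a description of how LinkDB is actually implemented; the entry names and the functions `build_graph` and `find_link_path` are hypothetical.

```python
from collections import deque

# Hypothetical direct links (source, target); the entry names are made up.
DIRECT_LINKS = [
    ("db1:e1", "db2:e2"),
    ("db2:e2", "db3:e3"),
    ("db4:e4", "db3:e3"),
]


def build_graph(links, use_reverse=False):
    """Map each entry to the set of entries it links to directly."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, set()).add(target)
        if use_reverse:
            # Also allow the link to be followed from target back to source.
            graph.setdefault(target, set()).add(source)
    return graph


def find_link_path(links, start, goal, use_reverse=False):
    """Breadth-first search for a shortest chain of links from start to goal.

    Returns the chain as a list of entries, or None if no chain exists.
    """
    graph = build_graph(links, use_reverse)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    print(find_link_path(DIRECT_LINKS, "db1:e1", "db3:e3"))
    print(find_link_path(DIRECT_LINKS, "db1:e1", "db4:e4", use_reverse=True))
```

In this toy data, `db1:e1` reaches `db3:e3` by combining two direct links, while reaching `db4:e4` additionally requires following the link `db4:e4 --> db3:e3` in reverse.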

2. Database Categories

The DBGET/LinkDB system integrates different databases in different ways depending on the availability of mirroring, keyword indexing, and linking. The databases are thus classified into five categories.

Main commands available in each category:

Category	bget	bfind	blink	Remark
1. KEGG databases	yes	yes	yes	Mirrored at GenomeNet
2. Other DBGET databases	yes	yes	yes	Mirrored at GenomeNet
3. Searchable databases on the Web	no	yes	yes	Used as Web resources
4. Link-only databases on the Web	no	no	yes
5. PubMed database	yes	no	yes
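
The table above can also be read as a lookup from category to available commands. The following Python sketch transcribes the yes/no values from the table; the dictionary and function names (`CATEGORY_COMMANDS`, `supported_commands`, `categories_supporting`) are hypothetical and exist only for this example.

```python
# Command availability per category, transcribed from the table above.
CATEGORY_COMMANDS = {
    "KEGG databases": {"bget": True, "bfind": True, "blink": True},
    "Other DBGET databases": {"bget": True, "bfind": True, "blink": True},
    "Searchable databases on the Web": {"bget": False, "bfind": True, "blink": True},
    "Link-only databases on the Web": {"bget": False, "bfind": False, "blink": True},
    "PubMed database": {"bget": True, "bfind": False, "blink": True},
}


def supported_commands(category):
    """Return the commands marked 'yes' for the given category."""
    return [cmd for cmd, ok in CATEGORY_COMMANDS[category].items() if ok]


def categories_supporting(command):
    """Return the categories for which the given command is marked 'yes'."""
    return [cat for cat, cmds in CATEGORY_COMMANDS.items() if cmds.get(command, False)]


if __name__ == "__main__":
    print(supported_commands("PubMed database"))
    print(categories_supporting("bfind"))
```

Read this way, blink is the only command available in all five categories, while bget is limited to the mirrored databases and PubMed.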

3. KEGG Databases (Category 1 Databases)

The KEGG databases at GenomeNet are listed below. Most of them are updated daily.

Database	Content	Remark
brite	Functional hierarchies and ontologies	KEGG BRITE
pathway	Pathway maps	KEGG PATHWAY
module	KEGG modules	KEGG PATHWAY
network	Network elements	KEGG NETWORK
variant	Human gene variants	KEGG NETWORK
disease	Human diseases	KEGG DISEASE
drug	Drugs	KEGG DRUG
dgroup	Drug groups	KEGG DRUG
orthology	KEGG Orthology (KO) groups	KEGG ORTHOLOGY
genes	Gene catalogs in high-quality genomes	KEGG GENES
genome	KEGG Organisms and selected viruses	KEGG GENOME
compound	Chemical compounds	KEGG COMPOUND
glycan	Glycans	KEGG GLYCAN
reaction	Chemical reactions	KEGG REACTION
rclass	Reaction class
enzyme	Enzyme nomenclature
expression	Microarray gene expression profiles	Submitted by authors

4. Other DBGET Databases (Category 2 Databases)

Other databases mirrored at GenomeNet are the following.

Database		Content	Original site
refseq	refnuc	Nucleotide sequences	NCBI
refseq	refpep	Amino acid sequences	NCBI
uniprot	swissprot		ExPASy / EBI
uniprot	trembl		ExPASy / EBI
mgenes		Gene catalogs in metagenomes	Metagenomes
mgenome		Metagenomes	Metagenomes
refgene	rg001	OM-RGC: Ocean microbial reference gene catalog	EMBL
	rg002	IGC: Integrated reference catalog of the human gut microbiome	BGI / EMBL
	rg003	MATOU: Marine Atlas of Tara Oceans Unigenes	Tara Oceans
pdb		Protein 3D structures	RCSB
pdbstr		PDB amino acid sequences	Kyoto University
epd		Eukaryotic promoters	ISREC
motifdic	prosite	Protein domains and sequence motifs	ExPASy
	pfam		EBI
	ncbi-cdd		NCBI
pmd		Protein mutants	NIG
aaindex		Amino acid indices	Kyoto University
carbbank		Glycan structures (no longer updated)	Teikyo U / U Georgia
prosdoc		PROSITE literature	ExPASy

5. Searchable Databases on the Web (Category 3 Databases)

The following databases are search-only: they can be searched through DBGET, but they are not mirrored at GenomeNet, so the actual entry contents are retrieved from the original sites.

Database		Content	Original site
insdc	genbank	Nonredundant database of International Nucleotide Sequence Database Collaboration	NCBI
	ddbj		NIG
	embl		EBI
ncbi-gene		Entrez Gene database	NCBI
ensembl		Eukaryotic genome annotation database	Ensembl
hgnc		Human gene nomenclature	HGNC
hmdb		Human metabolome database	HMDB
go		Gene Ontology	GO
brc-dna		Human cDNA Clones	RIKEN BRC-DNA
interpro		Protein domains and families	EBI
pubchem		PubChem small molecule database	NCBI
chebi		ChEBI small molecule database	EBI
pdb-ccd		PDB Chemical Component Dictionary	PDB
lipidmaps		LIPID Metabolites And Pathways Strategy	LIPIDMAPS
knapsack		Plant secondary metabolites	KNApSAcK
ligandbox		Ligand database open and extensible	LigandBox
sider		Side effect resource	SIDER

References

Kanehisa, M.; Linking databases and organisms: GenomeNet resources in Japan. Trends Biochem Sci. 22, 442-444 (1997).
Fujibuchi, W., Goto, S., Migimatsu, H., Uchiyama, I., Ogiwara, A., Akiyama, Y., and Kanehisa, M.; DBGET/LinkDB: an integrated database retrieval system. Pacific Symp. Biocomputing 1998, 683-694 (1997).

(DBGET has its roots in the IDEAS package, originally developed for GenBank in the early 1980s.)
Kanehisa, M., Klein, P., Greif, P., and DeLisi, C.; Computer analysis and structure prediction of nucleic acids and proteins. Nucleic Acids Res. 12, 417-428 (1984).
Kanehisa, M.I.; Los Alamos sequence analysis package for nucleic acids and proteins. Nucleic Acids Res. 10, 183-196 (1982).

Last updated: May 29, 2024