EPD Home page <http://www.epd.isb-sib.ch/index.html>
------------------------------------------------------------------------
EUKARYOTIC PROMOTER DATABASE
USER MANUAL
Written by: Philipp Bucher, Rouaida Cavin Périer, Viviane
Praz and Christoph Schmid
Swiss Institute of Bioinformatics
and Swiss Institute for Experimental Cancer Research
Ch. des Boveresses 155
CH-1066 Epalinges s/Lausanne
Switzerland
Electronic mail:
via webmail form <http://www.epd.isb-sib.ch/webmail.html>
This manual and the database it accompanies may be copied and
redistributed freely, without advance permission, provided that this
statement is reproduced with each copy.
Published research assisted by the Eukaryotic Promoter Database
should cite:
EPD in its twentieth year: towards complete promoter coverage of
selected model organisms
Schmid, C.D., Perier, R., Praz, V. and Bucher, P. (2006) Nucleic
Acids Res, 34, D82-85.
------------------------------------------------------------------------
EPD RELEASE 100, June 2009
*WHAT IS NEW IN RELEASE 100:* adaptations to 3-digit release numbers
EPD release history
<http://www.epd.isb-sib.ch/current/EPD_release_history.html>
CONTENTS
1. INTRODUCTION <#INTRODUCTION>
2. PROMOTER SELECTION <#PROMOTER_SELECTION>
3. ASSIGNMENT OF INITIATION SITE
<#ASSIGNMENT_OF_TRANSCRIPTION_INITIATION_SITE>
4. FORMAT CONVENTIONS <#FORMAT_CONVENTIONS>
1. The title line <#The_title_line>
2. Promoter entries <#Promoter_entries>
1. The ID line <#The_ID_line>
2. The AC line <#The_AC_line>
3. The DT line <#The_DT_line>
4. The DE line <#The_DE_line>
5. The OS line <#The_OS_line>
6. The HG line <#The_HG_line>
7. The AP line <#The_AP_line>
8. The NP line <#The_NP_line>
9. The DR line <#The_DR_line>
10. The RN, RX, RA, RT and RL lines
<#The_RN,_RX,_RA,_RT_and_RL_lines>
11. The ME line <#The_ME_line>
12. The SE line <#The_SE_line>
13. The FL line <#The_FL_line>
14. The IF line <#The_IF_line>
15. The TX line <#The_TX_line>
16. The KW line <#The_KW_line>
17. The FP, DO and RF lines <#The_FP,_DO_and_RF_lines>
18. The // line <#The_//_line>
3. Line types retained from the old format
<#Line_types_retained_from_the_old_format>
1. The FP line <#The_FP_line>
2. Documentation <#Documentation>
3. Literature references <#Literature_references>
4. Miscellaneous <#Miscellaneous>
4. Distinct format of 'preliminary' entries in epd_bulk.dat
<#Distinct_format_of_preliminary_entries_in_epd_bulk.dat>
5. CLASSIFICATION <#CLASSIFICATION>
6. HOMOLOGOUS PROMOTERS <#HOMOLOGOUS_PROMOTERS>
7. PROMOTER SEQUENCE RETRIEVAL <#PROMOTER_SEQUENCE_RETRIEVAL>
8. REFERENCES <#REFERENCES>
1. APPENDIX A : SURVEY OF RELEASE
<http://www.epd.isb-sib.ch/current/SURVEY.html>
2. APPENDIX B : CODES AND ABBREVIATIONS <#APPENDIXB>
1. SPECIES CODES <#SPECIES_CODES>
2. JOURNAL CODES <#JOURNAL_CODES>
3. ABBREVIATIONS <#ABBREVIATIONS>
1 INTRODUCTION
The Eukaryotic Promoter Database EPD was designed and developed at the
Weizmann Institute of Science in Rehovot (Israel) and is currently
maintained at ISREC in Epalinges s/Lausanne (Switzerland). EPD is a
specialized annotation database of the EMBL Data Library. It provides
information about eukaryotic promoters available in the EMBL Data
Library and is intended to assist experimental researchers, as well as
computer analysts, in the investigation of eukaryotic transcription
signals. The present version originated from a previous compilation
published in an article (1
<http://www.epd.isb-sib.ch/current/usrman.html#ref_1>) and is organized
as a hierarchically ordered and documented "functional position set" (2
<http://www.epd.isb-sib.ch/current/usrman.html#ref_2>) pointing to
transcription initiation sites. All information is either directly
extracted from scientific literature or, starting from release 73,
compiled by a new in silico primer extension method (16 <#ref_16>). Thus
promoter information in EPD is independent of the EMBL sequence entry
descriptions. As a consequence, many of the initiation sites referred to
in EPD do not appear in corresponding EMBL feature tables. A coordinated
updating procedure has been set up by the two laboratories that will
ensure future compatibility between the position references in EPD and
the sequence data in the main data library. Investigators who access
EMBL via publicly available programs should be aware of the fact that
software producers occasionally modify the sequence data in ways that
render position references inaccurate. EPD is generally not compatible
with sequence data of another release because EMBL sequence entries are
not designed as stable data units. The completeness and accuracy of EPD
greatly benefit from user feedback. Any report of mistakes or omissions
would be very much appreciated. Direct communication of newly published
transcript mapping or gene expression data is also welcome. Please
forward all correspondence to the address given at the top of this
document.
Use electronic mail if possible.
2 PROMOTER SELECTION
EPD is a rigorously selected database. In order to be included in EPD, a
promoter must be:
1. recognized by eukaryotic RNA POL II,
2. active in a higher eukaryote,
3. experimentally defined, or homologous and sufficiently similar to
an experimentally defined promoter,
4. biologically functional,
5. available in the current EMBL release,
6. distinct from other promoters in the database.
Explanations:
1. Transcription by RNA POL II is bona fide assumed for protein
coding genes but must be supported by alpha-amanitin data if the
end product is an RNA.
2. All eukaryotes except phycophyta, fungi, myxomycetes, and protozoa
are considered higher eukaryotes. Note that the expression "active
in" does not always refer to the source organism of the promoter
(e.g. in viruses). EPD currently contains promoter sequences from
139 different species <#SPECIES_CODES>.
3. A promoter is experimentally determined if a corresponding
transcription initiation site is mapped with a precision of +/- 5
bp or higher. Any technique that characterizes the 5' terminus of
an in vivo or in vitro generated RNA is acceptable. Single
nuclease-protection or primer-extension data must be accompanied
by additional evidence unless the gene's intron-exon organization
is well established. Similarity is considered "sufficient" if
percent identity (as defined in Section 6) is >=60% between -79
and +20 or >=75% between -49 and +10.
4. A promoter is biologically functional if it contributes to the
source organism's survival and/or reproduction. This is bona fide
assumed except for promoters of pseudogenes, minor transcription
initiation sites (<20% of total gene transcripts), promoters
giving rise to an unstable RNA product, and mutant promoters.
5. The minimum sequence requirement is 45 bp between -49 and +10.
6. Promoters are considered distinct if they originate from different
gene loci or different species. Identity is assumed if two
promoters from the same species exhibit >95% similarity between
-79 and +20 while their genetic relationship is unknown. Multiple
isolates of viruses or transposable elements are considered
distinct if at least one promoter region fails to fulfill the
above similarity criterion.
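The similarity thresholds in explanations 3 and 6 can be sketched as two small predicates. This is only an illustration: the function and argument names are hypothetical, and the percent identities are assumed to have been computed beforehand as defined in Section 6.

```python
def is_sufficiently_similar(identity_m79_p20, identity_m49_p10):
    """Explanation 3: similarity to an experimentally defined promoter.

    Both arguments are percent identities (0-100), assumed to be computed
    elsewhere over the regions -79..+20 and -49..+10 respectively.
    """
    return identity_m79_p20 >= 60.0 or identity_m49_p10 >= 75.0


def is_same_promoter(identity_m79_p20):
    """Explanation 6: two promoters of the same species with unknown
    genetic relationship are treated as identical above 95% similarity
    between -79 and +20."""
    return identity_m79_p20 > 95.0


print(is_sufficiently_similar(58.0, 80.0))  # meets the second threshold
print(is_same_promoter(95.0))               # exactly 95% is not ">95%"
```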
3 ASSIGNMENT OF TRANSCRIPTION INITIATION SITE
A eukaryotic promoter is defined as a DNA sequence around a
transcription initiation site. The position reference to the initiation
site is therefore the central part of a promoter entry. Its assignment
is based directly on experimental data shown in an article, proposed
adjustments originating from consensus sequence considerations being
ignored. In the case of minor discrepancies between different
publications, averaged positions are given. Position references are
subject to permanent re-evaluation. A transcription initiation site may
be reassigned upon publication of new data. Position references are
replaced if longer upstream sequences of the same promoter become
available in a new EMBL sequence entry.
Several initiation sites preceding the same gene appear as alternative
promoters if they are clearly separated from each other or
differentially regulated. The minimum distance required between two
alternative initiation sites is 20 bp. Otherwise, they are considered a
single promoter region.
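As a rough illustration of the 20 bp rule, the sketch below merges mapped initiation sites into promoter regions by distance alone. It is a simplification under stated assumptions: the function name is hypothetical, sites are chained to their nearest neighbour, and differential regulation (the other criterion named above) is ignored.

```python
def group_initiation_sites(positions, min_distance=20):
    """Group initiation site positions into promoter regions.

    A site closer than `min_distance` bp to the previous site joins that
    site's region; otherwise it starts a new (alternative) promoter.
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] < min_distance:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


print(group_initiation_sites([100, 105, 130, 300]))
```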
Four types of promoters are distinguished by one-letter codes in order
to account for the variety of transcription initiation patterns in
eukaryotes:
* S: Single initiation site: >90% of all reported transcripts
initiate within 10 bp (the experimental data usually do not allow
distinction between a single cap-site and small mRNA 5'
heterogeneity).
* M: Multiple initiation sites: >75% of all reported transcripts
initiate within 20 bp.
* R: Initiation region: >75% of all reported transcripts initiate
within 100 bp.
* U: Undefined transcription initiation pattern, exclusively in
'preliminary' entries in epd_bulk.dat (see next section).
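The three quantitative definitions above can be sketched as a small classifier. This is an illustration under assumptions, not the procedure used by the EPD curators: "within N bp" is read here as "inside some window of N consecutive positions", the input is a hypothetical mapping from position to transcript count, and None is returned when none of the S, M or R definitions applies.

```python
def max_fraction_in_window(counts, width):
    """Largest fraction of transcripts starting inside any `width`-bp window.

    `counts` maps a position to the number of transcripts starting there.
    It is enough to try windows that begin at an observed position.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no transcripts given")
    best = 0
    for start in counts:
        inside = sum(n for p, n in counts.items() if start <= p < start + width)
        best = max(best, inside)
    return best / total


def classify_initiation_pattern(counts):
    """Return 'S', 'M' or 'R', or None if no definition is met."""
    if max_fraction_in_window(counts, 10) > 0.90:
        return "S"
    if max_fraction_in_window(counts, 20) > 0.75:
        return "M"
    if max_fraction_in_window(counts, 100) > 0.75:
        return "R"
    return None


print(classify_initiation_pattern({100: 95, 150: 5}))
print(classify_initiation_pattern({100: 40, 115: 40, 300: 20}))
```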
Note that in addition to true alternative promoter activity, variability
in the position of the transcription initiation site might also be due
to experimental constraints, a biological variability in the activity of
RNA polymerase II, or the presence of highly similar (pseudo-) genes
with distinct transcription initiation sites.
In sequence entries that contain a complete RNA or DNA genome of a
retrovirus or a retrovirus-like transposable element, the position
reference points to the U3/R boundary of the 3' terminal LTR.
4 FORMAT CONVENTIONS
EPD is distributed as two ASCII flatfiles (epd.dat, epd_bulk.dat) in
essentially identical format. Differences in the format of 'preliminary'
entries in 'epd_bulk.dat' are described in paragraph 4.4
<#Distinct_format_of_preliminary_entries_in_epd_bulk.dat>. EPD files
contain a title line followed by a number of promoter entries.
Interspersed are group headings whose function and format are described
in the next section. The title line and parts of the promoter entries
are rigidly formatted so that the entire database conforms to the
standards of an FPS file (functional position set) of our current signal
search analysis (1
<http://www.epd.isb-sib.ch/current/usrman.html#ref_1>,2
<http://www.epd.isb-sib.ch/current/usrman.html#ref_2>) software.
4.1. The title line
The title line of EPD is shown below:
TI EPD83 Eukaryotic Promoter Database / Release 83 EP
The TI line contains the following fields:
columns data type
1- 2 "TI"
3- 5 (blank)
6-15 FPS name
16-70 title
71-72 FPS code
Explanations:
* FPS name and FPS code are used by our data extraction software to
generate default names for output files.
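Because the TI line is defined by column ranges, it can be read by slicing. The sketch below is a minimal illustration: the names TI_FIELDS and parse_ti_line are hypothetical, and the demonstration line is built in code with explicit padding on the assumption that each field is left-justified within its columns.

```python
# Column ranges (1-based, inclusive) taken from the table above.
TI_FIELDS = {
    "line_code": (1, 2),
    "fps_name": (6, 15),
    "title": (16, 70),
    "fps_code": (71, 72),
}


def parse_ti_line(line):
    """Slice a title line into its fixed-width fields."""
    padded = line.ljust(72)
    return {
        name: padded[start - 1:end].strip()
        for name, (start, end) in TI_FIELDS.items()
    }


# Demonstration line padded to 72 columns (left-justification assumed).
demo_ti = (
    "TI" + " " * 3
    + "EPD83".ljust(10)
    + "Eukaryotic Promoter Database / Release 83".ljust(55)
    + "EP"
)
print(parse_ti_line(demo_ti))
```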
4.2. Promoter entries
An EPD entry contains the following types of information:
* Promoter identification and description.
* Machine-readable pointers to the transcription initiation site in
corresponding sequence entries.
* Description of the experimental evidence defining the
transcription start site.
* Various kinds of promoter classifications useful for extraction of
biologically meaningful promoter subsets.
* Information on regulatory properties.
* Cross-references to other databases.
* Bibliographic references.
Promoter entries are presented in a similar format as EMBL and
SWISS-PROT sequence entries. Each line starts with a line code
identifying the type of information presented. The current line types
and line codes, in the order in which they appear in an entry, are shown
below:
ID - IDentification.
AC - ACcession number(s).
DT - DaTe.
DE - DEscription.
OS - Organism Species.
HG - Homology Group.
AP - Alternative Promoter.
NP - Neighbouring Promoter.
DR - Database cross-References.
RN - Reference Number.
RX - Reference cross-references.
RA - Reference Authors.
RT - Reference Title.
RL - Reference Location.
ME - MEthods.
SE - SEquence.
FL - Full Length.
IF - Initiation Frequency.
TX - TaXonomy.
KW - KeyWords.
FP - Functional Position.
DO - DOcumentation.
RF - literature ReFerence.
// - Termination line.
Spacer lines (XX) are inserted in order to make the promoter database
easier to read by eye. Some line types occur many times in a single
entry. Each entry must begin with an identification line (ID) and end
with a terminator line (//). Text does not exceed column 72. Below is an
example of a promoter entry:
ID HS_MYC_2 standard; single; VRT.
XX
AC EP11148;
XX
DT ??-APR-1987 (Rel. 11, created)
DT 07-MAR-2005 (Rel. 82, Last annotation update).
XX
DE c-myc (cellular homologue of myelocytomatosis virus 29 oncogene),
DE promoter 2.
OS Homo sapiens (human)
XX
HG Homology group 53; Mammalian c-myc proto-oncogene, promoter 2
AP Alternative promoter #2 of 2; exon 1; site 2; major promoter.
NP none.
XX
DR GENOME; NT_008046.15; NT_008046; [-41966656, 15188617].
DR EPD; EP11146; HS_MYC_1; alternative promoter; [-162; +].
DR CLEANEX; HS_MYC.
DR EMBL; AC103819.3; [-87815, 60206].
DR EMBL; X00364.2; [-2489, 8507].
DR EMBL; D10493.1; [-2487, 5569].
DR EMBL; K01910.1; [-2451, 49].
DR EMBL; M16261.1; [-1843, 1048].
DR EMBL; J03253.1; [-1759, 461].
DR EMBL; L00057.1; [-810, 2795].
DR EMBL; K03015.1; [-555, 458].
DR EMBL; X00196.1; [-532, 2792].
DR EMBL; M12026.1; [-511, 678].
DR EMBL; K01708.1; [-410, 500].
DR EMBL; K00559.1; [-345, 1020].
DR EMBL; K02280.1; [-302, 178].
DR EMBL; K01909.1; [-266, 1365].
DR EMBL; S65124.1; [-266, 1023].
DR EMBL; M14206.1; [-266, 446].
DR EMBL; M20013.1; [-240, 982].
DR EMBL; AF111270.1; [-142, 264].
DR EMBL; K02275.1; [-96, 780].
DR EMBL; X00675.1; [-96, 404].
DR EMBL; K02277.1; [-96, 157].
DR SWISS-PROT; P01106; MYC_HUMAN.
DR TRANSFAC; R01157; HS$CMYC_01; [-211, -189]; by position.
DR TRANSFAC; R01158; HS$CMYC_02; [-168, -145]; by position.
DR TRANSFAC; R01804; HS$CMYC_04; [-300, -283]; by position.
DR TRANSFAC; R01851; HS$CMYC_05; [-65, -57]; by position.
DR TRANSFAC; R01852; HS$CMYC_06; [-42, -34]; by position.
DR TRANSFAC; R04076; HS$CMYC_12; [-251, -228]; by position.
DR TRANSFAC; R04076; HS$CMYC_12; [-252, -229]; by position.
DR TRANSFAC; R04076; HS$CMYC_12; [-253, -230]; by position.
DR TRANSFAC; R04621; HS$CMYC_17; [-313, -262]; by position.
DR TRANSFAC; R08503; HS$CMYC_18; [-50, -41]; by position.
DR TRANSFAC; R16688; HS$CMYC_24; [-7, 41]; by position.
DR TRANSFAC; R16689; HS$CMYC_25; [-7, 41]; by position.
DR TRANSFAC; R17051; HS$CMYC_30; [-510, -480]; by position.
DR TRANSFAC; R18503; HS$CMYC_31; [-185, -170]; by position.
DR TRANSFAC; R18504; HS$CMYC_32; [-153, -168]; by position.
DR RefSeq; NM_002467.
DR MIM; 190080.
XX
RN [1]
RX MEDLINE; 84026482.
RA Battey J., Moulding C., Taub R., Murphy W., Stewart T., Potter H.,
RA Lenoir G., Leder P.;
RT "The human c-myc oncogene: structural consequences of
RT translocation into the IgH locus in Burkitt lymphoma";
RL Cell 34:779-787(1983).
RN [2]
RX MEDLINE; 84131953.
RA Bernard O.D., Cory S., Gerondakis S., Webb E., Adams J.M.;
RT "Sequence of the murine and human cellular myc oncogenes and two
RT modes of myc transcription resulting from chromosome translocation
RT in B lymphoid tumours";
RL EMBO J. 2:2375-2383(1983).
RN [3]
RX MEDLINE; 87257828.
RA Lipp M., Schilling R., Wiest S., Laux G., Bornkamm G.W.;
RT "Target sequences for cis-acting regulation within the dual
RT promoter of the human c-myc gene.";
RL Mol. Cell. Biol. 7:1393-1400(1987).
RN [4]
RX MEDLINE; 88038843.
RA Broome H.E., Reed J.C., Godillot E.P., Hoover R.G.;
RT "Differential promoter utilization by the c-myc gene in mitogen-
RT and interleukin-2-stimulated human lymphocytes.";
RL Mol. Cell. Biol. 7:2988-2993(1987).
XX
ME Nuclease protection [1,4].
ME Nuclease protection; transfected or transformed cells [3].
ME Length measurement of an RNA product; low-precision data [1].
XX
SE agggagggatcgcgctgagtataaaagccggttttcggggctttatctaACTCGCTGTAG
XX
TX 6. Vertebrate promoters
TX 6.1. Chromosomal genes
TX 6.1.5. Hormones, growth factors, regulatory proteins
TX 6.1.5.16. Various cellular protooncogenes
XX
KW Proto-oncogene, Nuclear protein, DNA-binding, Glycoprotein,
KW Transcription regulation.
XX
FP Hs c-myc P2+:+S EU:NC_000008.9 1+ 128817660; 11148.053 010*2
XX
DO Experimental evidence: 4,4#,2l
DO Expression/Regulation: +mitogen
RF Cell34:779 EMBOJ2:2375 MCB7:1393 MCB7:2988
//
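An entry laid out like the example above can be read into a simple structure keyed by line code. The following is a minimal sketch, assuming each line begins with its two-character code; the function names are hypothetical, and the title line and group headings are simply skipped rather than interpreted.

```python
from collections import defaultdict


def iter_entries(lines):
    """Yield one list of lines per entry, from an ID line to its '//'."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("ID"):
            current = []
        if current is not None:
            current.append(line)
            if line.startswith("//"):
                yield current
                current = None


def parse_entry(entry_lines):
    """Collect line contents by line code; XX spacer lines are dropped."""
    record = defaultdict(list)
    for line in entry_lines:
        code, content = line[:2], line[2:].strip()
        if code == "//":
            break
        if code == "XX" or not line.strip():
            continue
        record[code].append(content)
    return dict(record)


sample = [
    "TI EPD83 Eukaryotic Promoter Database / Release 83 EP",
    "ID HS_MYC_2 standard; single; VRT.",
    "XX",
    "AC EP11148;",
    "XX",
    "DE c-myc (cellular homologue of myelocytomatosis virus 29 oncogene),",
    "DE promoter 2.",
    "//",
]
for entry in iter_entries(sample):
    print(parse_entry(entry))
```

Multi-line fields such as DE remain lists of lines, so a caller can decide how to join them.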
A detailed description of each line type is given below.
4.2.1. The ID line
The identification line is always the first line of an entry. The
general form of the ID line is:
ID ENTRY_NAME data class; initiation site type; TAXONOMIC DIVISION.
* /ENTRY_NAME/ is a unique entry identifier (e.g. "HS_MYC_2") which
obeys rigorous naming conventions. It contains 2 or 3 fields separated
by the `_' sign. The first field is the species identification code,
at most 4 alphanumeric characters representing the biological source
of the promoter. The second field identifies the gene, using the
protein code of the SWISS-PROT ID (if available). For human EPD
entries, the official gene symbol approved by the HUGO
nomenclature committee <http://www.gene.ucl.ac.uk/nomenclature/>
(if available) is used instead of the SWISS-PROT ID. The third field
is optional: it is either a number, which represents alternative
promoters, or a letter, for promoters of duplicated genes.
* The /data class/ field relates to the quality of the information:
"standard" means that the information is complete and correct
according to the standards laid down in this document; "preliminary"
means that the entry has not yet undergone all quality checks
necessary for being classified as "standard".
* The /initiation site type/ is either "single", "multiple" or
"region", as defined in Section 3
<http://www.epd.isb-sib.ch/current/usrman.html#ASSIGNMENT_OF_TRANSCRIPTION_INITIATION_SITE>.
* /TAXONOMIC DIVISION/ is one of
o PLN for plants
o NEM for nematodes
o ART for arthropods
o MLS for molluscs
o ECH for echinoderms
o VRT for vertebrates.
Note that these codes relate to the organism in which the promoter
is expressed, not to the source organism in which the promoter is
replicated as defined on the OS line.
The ID line is terminated by a period.
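The general form of the ID line lends itself to a regular expression. The sketch below is illustrative only: the pattern and the names ID_RE and parse_id_line are hypothetical, and the entry name is split on `_' on the assumption that the gene field itself contains no underscore.

```python
import re

ID_RE = re.compile(
    r"^ID\s+(?P<entry_name>\S+)\s+(?P<data_class>standard|preliminary);\s*"
    r"(?P<site_type>[^;]+);\s*(?P<division>[A-Z]{3})\.$"
)


def parse_id_line(line):
    """Split an ID line into its fields and the entry name into its parts."""
    match = ID_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised ID line: %r" % line)
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    parts = fields["entry_name"].split("_")
    fields["species_code"] = parts[0]
    fields["gene"] = parts[1] if len(parts) > 1 else None
    # Optional third field: alternative promoter number or duplicate letter.
    fields["suffix"] = parts[2] if len(parts) > 2 else None
    return fields


print(parse_id_line("ID HS_MYC_2 standard; single; VRT."))
```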
4.2.2. The AC line
AC EP11148;
The accession number consists of the character string "EP" followed by 5
digits representing the EMBL release number followed by the EPD entry
order. Most EPD entries currently have only one accession number. If
necessary, more than one accession number will be used; the numbers are
separated by semicolons and the list is terminated by a semicolon.
4.2.3. The DT line
The date lines show the date of entry or last modification of the entry.
DT DD-MMM-YEAR (Rel. XX, Comment)
where `DD' is the day, `MMM' the month, `YEAR' the year, and `XX' the
EPD release number. The comment portion of the line indicates the action
taken on that date.
* The first DT line indicates when the entry first appeared in the
database.
* The second DT line indicates when the promoter data was last
modified. It is terminated by a period.
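A DT line can be taken apart with a regular expression, as in the following sketch. The pattern is an assumption based on the two DT lines of the example entry, including the "??" placeholder for an unknown day; the names DT_RE and parse_dt_line are hypothetical.

```python
import re

DT_RE = re.compile(
    r"^DT\s+(?P<day>\d{2}|\?\?)-(?P<month>[A-Z]{3})-(?P<year>\d{4})\s+"
    r"\(Rel\.\s*(?P<release>\d+),\s*(?P<comment>[^)]*)\)\.?$"
)


def parse_dt_line(line):
    """Return day, month, year, release number and comment of a DT line."""
    match = DT_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised DT line: %r" % line)
    return {
        # "??" marks an unknown day and is mapped to None.
        "day": None if match["day"] == "??" else int(match["day"]),
        "month": match["month"],
        "year": int(match["year"]),
        "release": int(match["release"]),
        "comment": match["comment"].strip(),
    }


print(parse_dt_line("DT ??-APR-1987 (Rel. 11, created)"))
print(parse_dt_line("DT 07-MAR-2005 (Rel. 82, Last annotation update)."))
```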
4.2.4. The DE line
DE c-myc (cellular homologue of myelocytomatosis virus 29 oncogene),
DE promoter 2.
The description lines contain general descriptive information about the
promoter. The description is given in ordinary English and is
free-format. It contains the SWISS-PROT gene names when known. In some
cases, more than one DE line is required; in this case, the text is
divided only between words. The last DE line is terminated by a period.
4.2.5. The OS line
OS Mus musculus (house mouse)
The species line specifies the source organism(s) of the promoter. The
species names are based on NCBI's taxonomy and thus can be automatically
hyperlinked to the NCBI's taxonomy web pages.
4.2.6. The HG line
HG Homology group 53; Mammalian c-myc proto-oncogene, promoter 2
The homology group <http://www.epd.isb-sib.ch/current/HG.html> line is
optional. It contains 2 fields: a homology group number that allows
identification of all sequence-wise similar promoters in EPD, and a
homology group name.
4.2.7. The AP line
AP Alternative promoter #2 of 2; 5' exon 1; site 2; major promoter.
The AP line is optional and provides information on alternative
promoters <http://www.epd.isb-sib.ch/current/AP.html> of the same gene
(for more details, see Section 4.3.1). It contains 3 or 4 fields,
separated by semicolons, providing the following types of information,
each introduced by descriptive text:
* Two numbers indicating, respectively, the promoter's relative
position along the gene, and the total number of alternative
promoters of the gene. Promoters are numbered in the 5' to 3'
direction, starting with one.
* A number referring to the exon preceded by the promoter. Note
that multiple promoters may be associated with the same
(3'-coterminal) exon or with different exons. Known exons are
numbered in the 5' to 3' direction, starting with one. Note that the
nomenclature of 5' exons in EPD may differ from the usage in the
literature.
* A number indicating the promoter's relative position among the
subset of promoters preceding the same exon.
* An optional keyword indicating major promoters.
The AP line is terminated by a period.
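The fields of the AP line can be extracted as in the sketch below. The pattern is an assumption derived from the two AP examples in this manual (one with and one without the "5'" prefix before "exon"); the names AP_RE and parse_ap_line are hypothetical.

```python
import re

AP_RE = re.compile(
    r"^AP\s+Alternative promoter #(?P<index>\d+) of (?P<total>\d+);\s*"
    r"(?:5'\s*)?exon (?P<exon>\d+);\s*site (?P<site>\d+)"
    r"(?P<major>;\s*major promoter)?\.$"
)


def parse_ap_line(line):
    """Return promoter index, total, exon, site and the major-promoter flag."""
    match = AP_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised AP line: %r" % line)
    return {
        "index": int(match["index"]),
        "total": int(match["total"]),
        "exon": int(match["exon"]),
        "site": int(match["site"]),
        "major": match["major"] is not None,
    }


print(parse_ap_line(
    "AP Alternative promoter #2 of 2; exon 1; site 2; major promoter."
))
```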
4.2.8. The NP line
NP Neighbouring Promoter; EP23008; MM_H2B1; [-209; -].
The NP line is optional and provides information on promoters which are
physically closer to each other than 1000 bp. It contains 3 fields,
separated by semicolons, providing the following types of information:
* The EPD accession number of the neighbouring promoter.
* The EPD identifier of the neighbouring promoter.
* The last field indicates, respectively, the position and the
direction of the neighbouring promoter relative to the
transcription initiation site given in the promoter entry.
o Negative numbers indicate the upstream region of this entry
and positive ones indicate the downstream region.
o The sign indicates the transcription direction of the
neighbouring promoter relative to the promoter entry:
"+" means same direction
"-" means opposite direction
4.2.9. The DR line
The DR lines contain cross-references to other EPD entries (if there are
alternative promoters of the same gene), or to entries from other
databases. So far, we have incorporated links to CLEANEX
<http://www.cleanex.isb-sib.ch/current/CleanEx_manual.html>, EMBL (3
<http://www.epd.isb-sib.ch/current/usrman.html#ref_3>), GenBank (4
<http://www.epd.isb-sib.ch/current/usrman.html#ref_4>), DDBJ (5
<http://www.epd.isb-sib.ch/current/usrman.html#ref_5>), SWISS-PROT (6
<http://www.epd.isb-sib.ch/current/usrman.html#ref_6>), TRANSFAC (7
<http://www.epd.isb-sib.ch/current/usrman.html#ref_7>), Flybase (8
<http://www.epd.isb-sib.ch/current/usrman.html#ref_8>), MIM (9
<http://www.epd.isb-sib.ch/current/usrman.html#ref_9>) and MGD (10
<http://www.epd.isb-sib.ch/current/usrman.html#ref_10>). The precise
format of these lines depends on the target database. Note that some
cross-references include numbers enclosed in square brackets indicating
the relative position of a linked sequence object, or keywords
characterising the nature of the relationship between the entries. For
instance, the ranges associated with cross-references to EMBL entries
define the extensions of the EMBL sequences relative to the initiation
site described by the EPD entry. The multiplicity of EMBL
cross-references in some entries mirrors the redundancy of the sequence
database. The first of these references corresponds to the longest
promoter region, except when the sequences are cancelled from EMBL
database, but still exist in GenBank or DDBJ.
The format of the DR line is shown by the following example lines:
DR GENOME; NT_037436.1; NT_037436; [-14139754, 9212459].
DR EPD; EP11146; HS_MYC_1; alternative promoter; [-162; +].
DR EMBL; J00120.1; [-2489, 8507].
DR SWISS-PROT; P01106; MYC_HUMAN.
DR SPTREMBL; Q8IQL1.
DR FLYBASE; FBgn0013718; nuf.
DR TRANSFAC; R01804; HS$CMYC_04; [-300, -283]; by position.
DR MIM; 190080.
DR RefSeq; NM_003529.
DR MGD; MGI:88468; Cola2.
DR ENSEMBL; CG32140.
DR TRANSCRIPTOME; DMe000571.
Explanations (for detailed information go to Guidelines
<http://www.epd.isb-sib.ch/current/guidelines.html>):
* The first item on the DR line is the abbreviated name of the data
collection to which reference is made. The currently defined data
bank identifiers are the following:
GENOME         NCBI Reference Sequence (RefSeq) of genomic sequence
               contigs
EPD            Eukaryotic Promoter Database: alternative promoters of
               the same gene
CLEANEX        Gene expression database for human EPD promoters
EMBL           Nucleotide sequence database of the EMBL
SWISS-PROT     Protein sequence database
SPTREMBL       Subset of protein sequence database TrEMBL. It contains
               the entries which should be eventually incorporated
               into SWISS-PROT. SWISS-PROT accession numbers have been
               assigned for all SP-TrEMBL entries
FLYBASE        Drosophila genome database
TRANSFAC       Transcription factor (TF) database
MIM            Mendelian Inheritance in Man Database
RefSeq         Reference Sequence Database
MGD            Mouse Genome Database
ENSEMBL        Metazoan genome annotation
TRANSCRIPTOME  Catalog of transcripts and their mapping onto the
               genome (LICR Lausanne branch)
TIGR           'gene identifiers' from the 'Rice Genome Annotation'
               project at TIGR
* The second item is the primary accession number (or an equivalent
unique identifier of another data bank) of the entry to which
reference is made.
* The third item (if it exists) is a secondary identifier or name
for the cross-referenced database entry.
* The fourth item for EMBL and TRANSFAC indicates the location and
extension of the sequences given in these entries relative to the
transcription initiation site given in the promoter entry.
Negative numbers indicate the upstream region of this site and
positive ones indicate the downstream part.
* The fifth item
o in the EPD line, indicates the position and the direction of
the alternative promoter as it is defined for the
neighbouring promoter in the NP
<http://www.epd.isb-sib.ch/current/usrman.html#The_NP_line>
line last field
o in the TRANSFAC line, designates the criteria used to
collect the TF entry:
- by position: The TF binding site is situated between -500
and + 100, +1 being the transcription initiation site
- by function: The TF binding site is known to regulate the
corresponding promoter.
/NB/: TRANSFAC cross-reference lines should not exceed the real number
of binding sites found in the "TRANSFAC Site Table". Thus the position
given in this DR line is related to the longest EMBL entry common to
both the EPD and TRANSFAC (version 6.3) databases.
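Since the DR format varies with the target database, a tolerant parser is one way to read these lines. The sketch below is an illustration under assumptions: the function names are hypothetical, items are split on semicolons outside square brackets, and the bracketed item is returned as a tuple whose elements are integers where possible (the direction sign of an EPD cross-reference stays a string).

```python
import re


def _coerce(token):
    """Return an int when the token is numeric, otherwise the raw string."""
    try:
        return int(token)
    except ValueError:
        return token


def split_outside_brackets(text):
    """Split on semicolons, ignoring those inside square brackets."""
    items, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == ";" and depth == 0:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        items.append(tail)
    return items


def parse_dr_line(line):
    """Return database, accession, bracketed range and remaining items."""
    text = line.strip()
    if not text.startswith("DR"):
        raise ValueError("not a DR line: %r" % line)
    body = text[2:].strip()
    if body.endswith("."):
        body = body[:-1]  # drop the terminating period only
    items = split_outside_brackets(body)
    record = {"database": items[0], "accession": None, "extra": [], "range": None}
    if len(items) > 1:
        record["accession"] = items[1]
    for item in items[2:]:
        if item.startswith("[") and item.endswith("]"):
            parts = re.split(r"[,;]", item[1:-1])
            record["range"] = tuple(_coerce(p.strip()) for p in parts)
        else:
            record["extra"].append(item)
    return record


print(parse_dr_line("DR EMBL; X00364.2; [-2489, 8507]."))
print(parse_dr_line("DR EPD; EP11146; HS_MYC_1; alternative promoter; [-162; +]."))
```

For the EPD line, the tuple (-162, "+") reads as: the alternative promoter lies 162 bp upstream and is transcribed in the same direction.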
4.2.10. The RN, RX, RA, RT and RL lines
These lines comprise the literature citations within EPD. The citations
indicate the papers from which the data has been abstracted. The
reference lines for a given citation occur in a block, and are always in
the order RN, RX, RA, RT, RL. Within each such reference block the RN
line occurs once, the RX line occurs zero or more times, and the RA, RT
and RL lines each occur one or more times. If several references are
given, there will be a reference block for each. An example of a complete
reference is:
RN [1]
RX MEDLINE; 84026482.
RA Battey J., Moulding C., Taub R., Murphy W., Stewart T., Potter H.,
RA Lenoir G., Leder P.;
RT "The human c-myc oncogene: structural consequences of
RT translocation into the IgH locus in Burkitt lymphoma";
RL Cell 34:779-787(1983).
The formats of the individual lines are explained below.
4.2.10.1. The RN line
The RN line gives a sequential number to each reference citation in an
entry. This number is used to indicate the reference in the ME lines.
4.2.10.2. The RX line
The RX line is an optional line which is used to indicate the identifier
assigned to a specific reference in PubMed (PMID, from the National
Library of Medicine (NLM)).
4.2.10.3. The RA line
The RA lines list the authors of the paper (or other work) cited. The
authors are listed in the order given in the paper. The names are
listed surname first, followed by a blank, followed by initial(s) with
periods. The authors' names are separated by commas and terminated by a
semicolon. Author names are not split between lines.
4.2.10.4. The RT line
The RT lines contain the title of the reference citation.
4.2.10.5. The RL line
The RL lines contain the conventional citation information for the
reference. In general, the RL lines alone are sufficient to find the
paper in question. It includes the journal abbreviation, the volume
number, the page range, and the year. Journal names are abbreviated
according to the conventions used by the National Library of Medicine
(NLM) and are based on the existing ISO and ANSI standards.
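The reference block conventions described in this section (RN opens a block, RA names are comma-separated and end with a semicolon, RT is a quoted title) can be sketched as follows. The function name parse_references and the shape of the returned dictionaries are hypothetical choices for illustration.

```python
def parse_references(lines):
    """Group RN/RX/RA/RT/RL lines into one dictionary per reference."""
    blocks = []
    for raw in lines:
        code, content = raw[:2], raw[2:].strip()
        if code == "RN":
            blocks.append({"number": int(content.strip("[]")),
                           "RX": [], "RA": [], "RT": [], "RL": []})
        elif code in ("RX", "RA", "RT", "RL") and blocks:
            blocks[-1][code].append(content)

    references = []
    for block in blocks:
        author_text = " ".join(block["RA"]).rstrip(";")
        title_text = " ".join(block["RT"]).rstrip(";").strip('"')
        references.append({
            "number": block["number"],
            "cross_references": block["RX"],
            # Author names are separated by commas (never split across lines).
            "authors": [a.strip() for a in author_text.split(",") if a.strip()],
            "title": title_text,
            "location": " ".join(block["RL"]),
        })
    return references


example = [
    "RN [1]",
    "RX MEDLINE; 84026482.",
    "RA Battey J., Moulding C., Taub R., Murphy W., Stewart T., Potter H.,",
    "RA Lenoir G., Leder P.;",
    'RT "The human c-myc oncogene: structural consequences of',
    'RT translocation into the IgH locus in Burkitt lymphoma";',
    "RL Cell 34:779-787(1983).",
]
print(parse_references(example))
```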
4.2.11. The ME line
The method lines describe experiments defining the transcription
initiation site. The format of the ME line is as follows:
ME Method_description [; Qualifier...] [n,...].
A complete list of method descriptions is given in Section 4.3.2.
Qualifiers may indicate that an experimental gene transcription system
was used, that data are of low precision (lower than +/- 5 bp), or that
the experiments were done with a closely related gene. The number(s)
enclosed in square brackets link the method descriptions to the
bibliographic references included in the promoter entry. The method
lines from the example entry are:
ME Nuclease protection [1,4].
ME Nuclease protection; transfected or transformed cells [3].
ME Length measurement of an RNA product; low-precision data [1].
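The ME format above can be split into method, qualifiers and reference numbers as sketched below. The pattern and the names ME_RE and parse_me_line are hypothetical and are based only on the format line and the three examples shown.

```python
import re

ME_RE = re.compile(r"^ME\s+(?P<body>.*?)\s*\[(?P<refs>[\d,\s]+)\]\.$")


def parse_me_line(line):
    """Return the method description, its qualifiers and the cited RN numbers."""
    match = ME_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised ME line: %r" % line)
    parts = [part.strip() for part in match["body"].split(";")]
    return {
        "method": parts[0],
        "qualifiers": parts[1:],
        "references": [int(n) for n in match["refs"].split(",")],
    }


print(parse_me_line("ME Nuclease protection [1,4]."))
print(parse_me_line("ME Nuclease protection; transfected or transformed cells [3]."))
```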
4.2.12. The SE line
The sequence line shows a short sequence segment corresponding to the
-49 to +10 region of the promoter. Transcribed and untranscribed
nucleotides are represented by upper and lower case characters,
respectively. This line type is not meant to provide sequence data but
serves as a control string for sequence extraction.
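The case convention of the SE line makes the initiation site recoverable as the boundary between lower-case and upper-case characters, and the string can be compared against an independently extracted sequence. The sketch below uses hypothetical function names; the case-insensitive comparison is an assumption about how such a control check might be done.

```python
def split_se_sequence(sequence):
    """Split an SE string into (untranscribed, transcribed) parts.

    The boundary is the first upper-case character.
    """
    boundary = len(sequence)
    for index, base in enumerate(sequence):
        if base.isupper():
            boundary = index
            break
    upstream, transcribed = sequence[:boundary], sequence[boundary:]
    if (upstream and not upstream.islower()) or (
        transcribed and not transcribed.isupper()
    ):
        raise ValueError("expected lower-case bases followed by upper-case bases")
    return upstream, transcribed


def matches_control_string(se_sequence, extracted_sequence):
    """Check an extracted sequence against the SE control string."""
    return se_sequence.lower() == extracted_sequence.lower()


se = "agggagggatcgcgctgagtataaaagccggttttcggggctttatctaACTCGCTGTAG"
upstream, transcribed = split_se_sequence(se)
print(len(upstream), len(transcribed))
print(matches_control_string(se, se.upper()))
```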
4.2.13. The FL line
The Full Length line designates the large-scale cDNA sequencing
projects: NEDO (11 <http://www.epd.isb-sib.ch/current/usrman.html#ref_11>), MGC
(12 <http://www.epd.isb-sib.ch/current/usrman.html#ref_12>), and BDGP
(15 <http://www.epd.isb-sib.ch/current/usrman.html#ref_15>).
4.2.14. The IF line
The Initiation Frequency lines reflect the frequency at which each
nucleotide within the initiation region is found at the 5' end of bona
fide full-length cDNA clone inserts.
4.2.15. The TX line
The TX (TaXonomy) lines define a promoter's location within EPD's
hierarchical classification system (see Section 5). Note that starting
from release 72, the classification system
<http://www.epd.isb-sib.ch/current/usrman.html#CLASSIFICATION> is no
longer maintained.
4.2.16. The KW line
The KW lines define a number of keywords
<http://www.epd.isb-sib.ch/current/keywords.html> describing an entry.
4.2.17. The FP, DO and RF lines
These lines pertain to the old EPD format; see the next section.
4.2.18. The // line
The // (terminator) line contains no data or comments. It designates the
end of an entry.
4.3. Line types retained from the old format
The last six lines of an entry present essential information in the more
concise, old format. The original description of the old format follows:
Each entry starts with an FP line that contains a position reference to
a transcription initiation site, and ends with a terminator (//). Below
is an example of a promoter entry:
FP Hs c-myc P2+:+S EU:NC_000008.9 1+ 128817660; 11148.053 010*2
XX
DO Experimental evidence: 4,4#,<2>
DO Expression/Regulation: +mitogen
RF Cell34:779 EMBOJ2:2375 MCB7:1393 MCB7:2988
//
4.3.1. The FP line
The FP line contains the following fields and subfields:
columns        data type
 1- 2          "FP"
 3- 5          (blank)
 6-30          description:
   6-25          promoter name
  26-26          ": "
  27-27          independent subset status (see section 6
                 <http://www.epd.isb-sib.ch/current/usrman.html#HOMOLOGOUS_PROMOTERS>)
  28-28          type of initiation site (see section 3
                 <http://www.epd.isb-sib.ch/current/usrman.html#ASSIGNMENT_OF_TRANSCRIPTION_INITIATION_SITE>)
  29-30          (blank)
31-55          functional position reference:
  31-51          sequence reference:
    31-32          genome db code
    33-33          ":"
    34-51          genome db entry accession number
  52-52          sequence type (0 = circular, 1 = linear)
  53-53          strand (+ or -)
  54-63          position number
64-64          ";"
65-70          entry code
71-71          "."
72-74          homology group number (see section 6
               <http://www.epd.isb-sib.ch/current/usrman.html#HOMOLOGOUS_PROMOTERS>)
75-75          (blank)
76-80          alternative promoter identification code:
  76-78          gene number
  79-79          "*"
  80-80          initiation site number
Explanations:
* The promoter name begins with a species code usually followed by a
gene locus or gene product name. Species codes consist of the
initials of genus and species name. Occasionally, three characters
are required to generate unique codes. Standard abbreviations
identify viruses. The full names of the organisms are given in
appendix B.1. Subspecies or strains are specified in parentheses.
Chromosomal locations (genetic or cytogenetic loci, genomic map
units, etc.) may appear in square brackets immediately following
species codes. Many gene products are referred to by abbreviations
explained in appendix B.3. Alternative promoters are identified by
right-justified "P" and a digit indicating the corresponding
initiation site numbered sequentially from 5' to 3'. An optional
"E" and digit refers to the corresponding 5' exons, if known.
Identical numbers indicate 3' co-terminal exons. The strongest
initiation site is marked by a trailing "+" if known (see also List of
alternative promoters <http://www.epd.isb-sib.ch/current/AP.html>).
* Genome db codes currently used are 'EM' for the EMBL database, and
'EU' for genome contigs or chromosomal genome assemblies of the
RefSeq database.
* The EMBL accession number always relates to the first EMBL
cross-reference. This is usually the entry covering the longest promoter
region, except when that entry has been cancelled from the EMBL database
but is still present in GenBank or DDBJ.
* The sequence type indicates whether the sequence is circular or
linear. A sequence comprising exactly one repeat unit of a tandem
repeat cluster is also considered circular. Note that the
annotation as circular or linear sequences in EPD is not always in
agreement with the corresponding annotation in EMBL.
* The entry code is a five-digit number which is the only part of a
promoter entry that is stable from release to release.
* Alternative promoter identification code: Genes represented by
multiple promoter entries in EPD are assigned a promoter group
number (the gene number). The corresponding initiation sites are
numbered sequentially from 5' to 3'.
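To illustrate the column layout of the FP line, the following Python sketch slices a line into its fields. It is not an EPD tool: the names FP_FIELDS and parse_fp_line are hypothetical, and the example line is assembled by padding each field of the c-myc entry to the documented width, which is an assumption about the exact blanks.

```python
# Hypothetical helper, not an EPD tool: slice an FP line using the
# 1-based, inclusive column ranges listed in section 4.3.1.
FP_FIELDS = {
    "line_code": (1, 2),
    "promoter_name": (6, 25),
    "independent_subset": (27, 27),
    "initiation_site_type": (28, 28),
    "genome_db_code": (31, 32),
    "accession": (34, 51),
    "sequence_type": (52, 52),
    "strand": (53, 53),
    "position": (54, 63),
    "entry_code": (65, 70),
    "homology_group": (72, 74),
    "gene_number": (76, 78),
    "initiation_site_number": (80, 80),
}


def parse_fp_line(line):
    """Return a dict of stripped field values for one FP line."""
    if not line.startswith("FP"):
        raise ValueError("not an FP line")
    padded = line.rstrip("\n").ljust(80)
    record = {name: padded[start - 1:end].strip()
              for name, (start, end) in FP_FIELDS.items()}
    record["position"] = int(record["position"])
    return record


# Example line assembled field by field (the padding is an assumption).
example = (
    "FP" + " " * 3                              # columns 1-5
    + "Hs c-myc".ljust(17) + "P2+"              # columns 6-25
    + ":" + "+" + "S" + " " * 2                 # columns 26-30
    + "EU" + ":" + "NC_000008.9".ljust(18)      # columns 31-51
    + "1" + "+" + "128817660".rjust(10)         # columns 52-63
    + ";" + "11148".rjust(6) + "." + "053"      # columns 64-74
    + " " + "010" + "*" + "2"                   # columns 75-80
)
fields = parse_fp_line(example)
print(fields["accession"], fields["strand"], fields["position"])
```

Slicing by column rather than splitting on blanks keeps the parser independent of blanks inside the promoter name.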
4.3.2. DO lines: Documentation
Documentation of promoter entries is presented on lines starting with
"DO". These lines are essentially free format and are so far not
processed by specific programs. In the present release, there are two DO
lines per entry: the first refers to the transcript mapping experiments
that define the promoter, the second gives information about expression
and regulation. The various experimental techniques are identified by
number codes. The Medline number and/or the "example" given in brackets
link, respectively, to the abstract and/or to the full-text article
describing the related experiment.
codes experiments
1 Direct RNA sequencing (1634116
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=1634116>)
2 Length measurement of an RNA product (1989694
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=1989694>)
3 Nuclease protection : Length measurement of a nuclease-protected
complementary RNA or DNA fragment (2845126
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=2845126>)
(8294473
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=8294473>)
4 RNA sequencing by primer extension : by dideoxy-terminated primer
extension (3396543
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=3396543>)
5 Sequencing of a full-length cDNA (8294473
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=8294473>)
6 Primer extension : Length measurement of a primer extension product
(10187799
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=10187799>
, example <http://www.jbc.org/cgi/content/full/274/15/10154/F3>)
(9880555
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=9880555>
, example <http://www.jbc.org/cgi/content/full/274/3/1736/F2>)
7 DNA sequencing of a full-length processed pseudogene (3584116
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=3584116>)
8 Reverse direction primer extension with homologous sequence ladder :
Length measurement of an in vitro synthesised DNA primed upstream of the
initiation site and blocked by the 5'end of the RNA hybridized to the
template (2451027
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=2451027>)
9 Rapid amplification of cDNA ends (RACE) (9116864
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=9116864&dopt=Books>)
10 RNA sequencing, type not specified
11 Oligo-capping : artificial capping of mRNA followed by sequencing of
the 5' end of cDNA (11375929
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11375929&dopt=Books>,
11337467
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11337467&dopt=Books>
and examples)
12 Mammalian gene collection (MGC) full-length cDNA cloning (10521335
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10521335&dopt=Books>
and example)
13 5' end confirmed by alignment of first 100 downstream nucleotides to
EST database.
14 Oligo-capping: Berkeley Drosophila Genome Project (12537569
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12537569>)
15 Oligo-capping: Rice full-length cDNA cloning (12869764
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12869764&query_hl=2>)
Special characters appended to the number codes designate the
experimental gene expression system in which the RNA for the
corresponding experiment was synthesized, or qualify the experiment in
other ways:
"*"  RNA POL II in vitro system
"o"  injected amphibian oocytes
"#"  transfected or transformed cells, injected neurons
"!"  transgenic organisms
"r"  experiments performed with closely related gene
"h"  homologous sequence ladder used for length measurement of nuclease
     protection or primer extension product
"l"  low-precision data (error > +/- 5 bp)
Explanations and additional conventions:
* The full-length assumption of a cDNA clone or a processed
pseudogene is based on consistency with accompanying
nuclease-protection or primer extension data or, alternatively,
the existence of multiple 5' co-terminal clones or pseudogenes.
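Although the DO lines are described as free format, the experimental evidence codes are regular enough to be read by a small routine. The Python sketch below is one possible reading, not an EPD program: parse_evidence is a hypothetical name, and the angle brackets that appear in the example entry are not explained in this section, so they are only recorded, not interpreted.

```python
import re

# Special characters listed in section 4.3.2 and their meanings.
QUALIFIERS = {
    "*": "RNA POL II in vitro system",
    "o": "injected amphibian oocytes",
    "#": "transfected or transformed cells, injected neurons",
    "!": "transgenic organisms",
    "r": "experiments performed with closely related gene",
    "h": "homologous sequence ladder",
    "l": "low-precision data",
}

# Assumption: a token is a number code followed by zero or more special
# characters, optionally wrapped in angle brackets.
TOKEN = re.compile(r"^(<)?(\d+)([*o#!rhl]*)(>)?$")


def parse_evidence(do_line):
    """Split a 'DO ... Experimental evidence:' line into code records."""
    _, _, value = do_line.partition(":")
    records = []
    for token in value.split(","):
        match = TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised evidence token: {token!r}")
        records.append({
            "code": int(match.group(2)),
            "qualifiers": [QUALIFIERS[c] for c in match.group(3)],
            "bracketed": bool(match.group(1) and match.group(4)),
        })
    return records


evidence = parse_evidence("DO Experimental evidence: 4,4#,<2>")
for record in evidence:
    print(record)
```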
The information on expression/regulation may include indication of
developmental stages, tissues, cell types, cell cycle stages, and
various regulatory features. Conventions:
* A semicolon delimits the two fields: expression and regulation.
* A comma delimits alternative keywords (e.g. liver, kidney).
* "+" means "induced by" or "strongly expressed in".
* "-" means "repressed by" or "weakly expressed in".
* "~" means "modulated by".
* Cell cycle stages are given in square brackets.
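The conventions above can be sketched in Python as follows. This is an illustration under stated assumptions rather than an official parser: parse_expression_regulation is a hypothetical name, a leading "+", "-" or "~" is always treated as a sign, and the routine returns the semicolon-delimited fields in order without deciding which field a line with a single field belongs to.

```python
import re

# Sign characters and their meanings as given in section 4.3.2.
SIGNS = {
    "+": "induced by / strongly expressed in",
    "-": "repressed by / weakly expressed in",
    "~": "modulated by",
}


def parse_expression_regulation(do_line):
    """Return one list of keyword records per semicolon-delimited field."""
    _, _, value = do_line.partition(":")
    fields = []
    for field in value.split(";"):
        records = []
        for keyword in field.split(","):
            keyword = keyword.strip()
            if not keyword:
                continue
            sign = keyword[0] if keyword[0] in SIGNS else None
            text = keyword[1:] if sign else keyword
            # Cell cycle stages are written in square brackets.
            stage = re.fullmatch(r"\[(.+)\]", text)
            records.append({
                "sign": sign,
                "keyword": stage.group(1) if stage else text,
                "cell_cycle_stage": stage is not None,
            })
        fields.append(records)
    return fields


parsed = parse_expression_regulation("DO Expression/Regulation: +mitogen")
print(parsed)
```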
4.3.3. RF line: Literature references
The first four references from the RN, RX, RA, RT and RL lines are
repeated in a highly condensed form. References are spaced 15 characters
apart, and each gives the journal, volume, and starting page of the cited
article (at most 14 characters). The journal codes are explained in
Appendix B.2.
The references primarily point to the articles where the experimental promoter
evidence is presented. Additional potential subjects are homology to
other promoters, gene expression and regulation, nomenclature. Papers
containing only sequence data are usually not referred to because they
are easy to find via the corresponding EMBL sequence entry descriptions.
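A condensed reference such as "Cell34:779" can be split into journal code, volume and starting page. The Python sketch below shows one way to do this; parse_rf_line is a hypothetical name, and the routine assumes that journal codes contain no blanks and that the volume is the run of digits directly before the colon (a journal code ending in a digit would be split incorrectly).

```python
import re

# Assumption: journal code, then volume digits, a colon, and page digits.
REFERENCE = re.compile(r"^(?P<journal>.+?)(?P<volume>\d+):(?P<page>\d+)$")


def parse_rf_line(rf_line):
    """Split an RF line into (journal code, volume, starting page) tuples."""
    tokens = rf_line.split()
    if not tokens or tokens[0] != "RF":
        raise ValueError("not an RF line")
    references = []
    for token in tokens[1:]:
        match = REFERENCE.match(token)
        if match is None:
            raise ValueError(f"unrecognised reference: {token!r}")
        references.append(
            (match["journal"], int(match["volume"]), int(match["page"]))
        )
    return references


refs = parse_rf_line("RF Cell34:779 EMBOJ2:2375 MCB7:1393 MCB7:2988")
print(refs)
```

Splitting on blanks instead of slicing 15-character fields keeps the sketch usable on text in which the original spacing has been lost.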
4.3.4. Miscellaneous
* Greek letters are sometimes represented by the corresponding Latin
letters followed by an apostrophe:
a' = alpha b' = beta g' = gamma d' = delta e' = epsilon
z' = zeta h' = eta th'= theta k' = kappa l' = lambda
n' = nu r' = rho
* Sub- and superscripts are sometimes indicated by preceding "_" and
"^", respectively.
4.4. Distinct format of 'preliminary' entries in epd_bulk.dat
4.4.1. The title line:
TI epd83 Bulk Section Eukaryotic Promoter Database / Release 83 EP
4.4.2. The ID line
The identification line is always the first line of an entry. The form
of the ID line in 'epd_bulk.dat' is:
ID OS_bAAAA preliminary; undefined; TAXONOMIC DIVISION.
* A unique entry identifier "OS_bAAAA" is constructed from the
species identification code ('OS'), with at most 4 alphanumeric
characters representing the biological source of the promoter, and
a 'b' (for bulk) followed by an arbitrary 4-letter code.
* The "preliminary" /data class/ field indicates that the entry has not
(yet) undergone all quality checks necessary for being classified
as "standard".
* The /initiation site type/ is "undefined" because the data are
insufficient to define transcription initiation patterns (Section 3
<http://www.epd.isb-sib.ch/current/usrman.html#ASSIGNMENT_OF_TRANSCRIPTION_INITIATION_SITE>).
* The /TAXONOMIC DIVISION/ codes are
o PLN for plant
o NEM for nematode
o ART for arthropod
o MLS for mollusc
o ECH for echinoderm
o VRT for vertebrates.
Note that these codes relate to the organism in which the promoter
is expressed, not to the source organism in which the promoter is
replicated as defined on the OS line.
The ID line is terminated by a period.
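The form of the bulk ID line can be checked with a regular expression. The following Python sketch is an illustration only: parse_bulk_id is a hypothetical name, the identifier in the usage example is invented, and the pattern assumes that the 4-letter code consists of ASCII letters.

```python
import re

# Taxonomic division codes listed in section 4.4.2.
DIVISIONS = {
    "PLN": "plant", "NEM": "nematode", "ART": "arthropod",
    "MLS": "mollusc", "ECH": "echinoderm", "VRT": "vertebrates",
}

# Assumption: species code of 1-4 alphanumeric characters, "_b",
# then a 4-letter code, followed by the three fixed fields.
ID_LINE = re.compile(
    r"^ID\s+(?P<species>[A-Za-z0-9]{1,4})_b(?P<code>[A-Za-z]{4})\s+"
    r"(?P<data_class>preliminary);\s*(?P<site_type>undefined);\s*"
    r"(?P<division>[A-Z]{3})\.$"
)


def parse_bulk_id(line):
    """Return the fields of an ID line from 'epd_bulk.dat' as a dict."""
    match = ID_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a bulk ID line")
    record = match.groupdict()
    if record["division"] not in DIVISIONS:
        raise ValueError(f"unknown division: {record['division']}")
    return record


# "Hs_bABCD" is an invented identifier used only for illustration.
record = parse_bulk_id("ID   Hs_bABCD   preliminary; undefined; VRT.")
print(record)
```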
4.4.3. The AC line
AC EP00001;
The accession number consists of the character string "EP" followed by 5
digits. Previously, the first two digits of the AC designated the release
in which the entry first appeared, and the remaining digits the EPD
entry order. AC numbers in 'epd_bulk.dat' are consecutive numbers,
excluding ACs already used for entries in the main file 'epd.dat'.
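A minimal Python sketch for validating an AC line against the pattern described above ("EP" followed by 5 digits and a semicolon); parse_ac is a hypothetical name, not part of any EPD software.

```python
import re

# "AC", blanks, "EP" plus exactly five digits, and a closing semicolon.
AC_LINE = re.compile(r"^AC\s+(EP\d{5});$")


def parse_ac(line):
    """Return the accession number of an AC line, e.g. 'EP00001'."""
    match = AC_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a valid AC line")
    return match.group(1)


accession = parse_ac("AC EP00001;")
print(accession)
```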
5 CLASSIFICATION
*Starting from release 72, the classification system is no longer
maintained. New entries are presently added by default to an
'?Unclassified' category. The classification system might still provide
valuable information for entries added before release 72. However, for
any category, consider the possible existence of additional, potentially
corresponding EPD entries in the default categories.*
/The entries of the Eukaryotic Promoter Database are embedded in a
hierarchical classification
<http://www.epd.isb-sib.ch/current/epd_classif.html> system. A
promoter's taxonomic location is made clear by interspersed group
headings. The example shown below is taken from top of the database. A
contrasting format has been chosen to emphasize the very different
nature of this information./
/*----------------------------------------------------------------------*
* 1. Plant promoters *
*----------------------------------------------------------------------*
* 1.1. Chromosomal genes *
*----------------------------------------------------------------------*
* 1.1.1. Small nuclear RNAs *
*----------------------------------------------------------------------*/
/A group heading consists of a series of node numbers and a title. The
highest classification level distinguishes between promoters active in
major eukaryotic taxa (phyla). Further below, grouping considers
replicon type and functional properties of gene products. On the lowest
level, homology (as defined in section 6) is the criterion. A survey of
the upper part of the classification pyramid is presented in appendix
A. The proposed classification system has a highly tentative character as
it is often unclear how a new promoter should be classified, especially
if the gene product is a multifunctional protein. Users should therefore
not be surprised or discouraged if they don't find a promoter at the
initially expected place./
6 HOMOLOGOUS PROMOTERS
Homology is defined as sequence similarity due to common phylogenetic
origin. In EPD, two promoters are considered homologous if they exhibit
>=50% sequence similarity between -79 and +20. Similarity is calculated
from optimal alignments generated with the aid of the UWGCG subroutine
ShiftAlign (13 <http://www.epd.isb-sib.ch/current/usrman.html#ref_13>)
using the following symbol comparison table:
   A    C    G    N    T
  1.0  0.0  0.0  0.5  0.0   A
       1.0  0.0  0.5  0.0   C
            1.0  0.5  0.0   G
                 0.5  0.5   N
                      1.0   T
Gap weight and gap length weight are specified as 3 and 0, respectively.
Terminal gaps are ignored. Percent similarity is understood as alignment
score divided by segment length, times 100. Groups of homologous
promoters are identified by homology group numbers (see 4.2.1.).
Definition of these groups is based on similarity scores as defined
above and a tree generation method called UPGMA (14
<http://www.epd.isb-sib.ch/current/usrman.html#ref_14>). In a few cases,
similarities between 50% and 56% were ignored if the protein sequences
of the corresponding genes were not related. Similarities were also
ignored between alternative promoter sequences that are spaced by less
than 50 bp. A subset of "independent" promoters is marked by "+" in
column 27 of the FP line. This set contains only one member per homology
group (usually, the promoter with the longest upstream sequence
available) and is intended to be used for statistical analysis of
functional patterns where it is important to avoid bias by multiples of
closely related sequences.
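The percent similarity defined above can be illustrated by scoring an alignment that has already been computed. The Python sketch below does not reproduce ShiftAlign and does not search for the optimal alignment; percent_similarity is a hypothetical name, gaps are written as "-", and the gap penalty is assumed to be gap weight plus gap length weight times gap length, charged once per run of consecutive gap columns.

```python
from itertools import groupby

# Upper triangle of the symbol comparison table of section 6.
_ROWS = {
    "A": {"A": 1.0, "C": 0.0, "G": 0.0, "N": 0.5, "T": 0.0},
    "C": {"C": 1.0, "G": 0.0, "N": 0.5, "T": 0.0},
    "G": {"G": 1.0, "N": 0.5, "T": 0.0},
    "N": {"N": 0.5, "T": 0.5},
    "T": {"T": 1.0},
}
# Mirror the triangle so that lookups work in either order.
SCORE = {}
for a, row in _ROWS.items():
    for b, value in row.items():
        SCORE[a, b] = SCORE[b, a] = value

GAP_WEIGHT = 3.0
GAP_LENGTH_WEIGHT = 0.0


def _is_gap(column):
    return "-" in column


def percent_similarity(aligned_a, aligned_b, segment_length):
    """Score two gapped, equal-length strings; terminal gaps are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    while columns and _is_gap(columns[0]):
        columns.pop(0)
    while columns and _is_gap(columns[-1]):
        columns.pop()
    score = 0.0
    for gapped, run in groupby(columns, key=_is_gap):
        run = list(run)
        if gapped:
            # Assumed penalty form: one charge per internal gap run.
            score -= GAP_WEIGHT + GAP_LENGTH_WEIGHT * len(run)
        else:
            score += sum(SCORE[pair] for pair in run)
    return 100.0 * score / segment_length


similarity = percent_similarity("ACGTACGTAC", "ACG--CGTAC", 10)
print(similarity, similarity >= 50.0)
```

In this toy example eight matching columns score 8.0, the single internal gap costs 3.0, and the result divided by a segment length of 10 just reaches the 50% threshold.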
7 PROMOTER SEQUENCE RETRIEVAL
Promoter sequence listings have not been incorporated into EPD for two
reasons: (i) to avoid duplication of data already existing elsewhere in
the EMBL data library, and (ii) to encourage usage of FPS-dependent
sequence retrieval programs, which enable users to specify suitable
5' and 3' boundaries of the requested sequence segments themselves. Effort
is under way to motivate producers of standard nucleotide sequence
analysis packages to provide such tools in the future. In the meantime,
users with some programming experience will find it easy to write their
own routines. Our local sequence extraction programs run in a UWGCG
environment (13 <http://www.epd.isb-sib.ch/current/usrman.html#ref_13>)
and have been implemented at several sites in Europe and the United
States. They are documented and freely available on request.
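As an indication of what such a user-written routine might look like, the Python sketch below extracts a segment around an initiation site. It is not one of the programs mentioned above: extract_promoter is a hypothetical name, the position is assumed to be the 1-based coordinate of the initiation site in a linear sequence entry (as given with the strand on the FP line), and circular sequences are not handled.

```python
# Translation table for complementing DNA, including the N symbol.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def extract_promoter(sequence, position, strand, upstream, downstream):
    """Return `upstream` bases before the initiation site followed by
    `downstream` bases starting at the site, read 5' to 3' on the
    strand of the promoter."""
    if strand == "+":
        start = position - 1 - upstream
        end = position - 1 + downstream
    elif strand == "-":
        start = position - downstream
        end = position + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 0 or end > len(sequence):
        raise ValueError("requested segment exceeds the sequence")
    segment = sequence[start:end]
    if strand == "-":
        # Reverse-complement so the segment reads 5' to 3'.
        segment = segment.translate(_COMPLEMENT)[::-1]
    return segment


toy = "AAACCCGGGTTT"  # invented 12-base sequence for illustration
print(extract_promoter(toy, 7, "+", upstream=3, downstream=2))
print(extract_promoter(toy, 6, "-", upstream=3, downstream=2))
```

Letting the caller choose the number of upstream and downstream bases mirrors the point made above: the boundaries of the requested segment are specified by the user, not fixed by the database.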
8 REFERENCES
1. Bucher, P. & Trifonov, E.N., /Compilation and analysis of
eukaryotic POL II promoter sequences/, Nucl. Acids Res. *14*,
10009-10026 (1986). (3808945
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=3808945>)
2. Bucher, P. & Bryan, B., /Signal search analysis: a new method to
localize and characterize functionally important DNA sequences/,
Nucl. Acids Res. *12*, 287-305 (1984). (84118736
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=84118736>)
3. Stoesser, G., Tuli, M.A., Lopez, R. and Sterk, P., /The EMBL
nucleotide sequence database/, Nucleic Acids Res., *27*, 18-24
(1999). (99063644
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063644>)
4. Benson, D.A., Boguski, M.S., Lipman, D.J., Ostell, J., Ouellette,
B.F.F., Rapp, B.A. and Wheeler, D.L., /GenBank/, Nucleic Acids
Res., *27*, 12-17 (1999). (99063643
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063643>)
5. Sugawara, H., Miyazaki, S., Gojobori, T. and Tateno, Y., /DNA Data
Bank of Japan dealing with large-scale data submission/, Nucleic
Acids Res., *27*, 25-28 (1999). (99063645
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063645>)
6. Bairoch, A. and Apweiler, R., /The SWISS-PROT protein sequence
data bank and its supplement TrEMBL in 1999/, Nucleic Acids Res.,
*27*, 49-54 (1999). (99063650
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063650>)
7. Heinemeyer, T., Chen, X., Karas, H., Kel, A.E., Kel, O.V.,
Liebich, I., Meinhardt, T., Reuter, I., Schacherer, F. and
Wingender, E., /Expanding the TRANSFAC database towards an expert
system of regulatory molecular mechanisms/, Nucleic Acids Res.,
*27*, 318-322 (1999). (99063727
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063727>)
8. The FlyBase consortium, /The FlyBase database of the Drosophila
genome projects and community literature/, Nucleic Acids Res.,
*27*, 85-88 (1999). (99063659
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063659>)
9. Pearson, P., Francomano, C., Foster, P., Bocchini, C., Li, P. and
McKusick, V., /The status of online Mendelian inheritance in man
(OMIM) medio 1994, /Nucleic Acids Res., *22*, 3470-3473 (1994).
(95023074
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=95023074>)
10. Blake, J.A., Richardson, J.E., Davisson, M.T., Eppig, J.T. and the
Mouse Genome Database Group, /The Mouse Genome Database (MGD):
genetic and genomic information about the laboratory mouse/,
Nucleic Acids Res., *27*, 95-98 (1999). (99063661
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=99063661>)
11. Suzuki Y., Yamashita R., Nakai K., Sugano S., /DBTSS: database of
human transcriptional start sites and full-length cDNAs. /Nucleic
Acids Res. *30*(1):328-331(2002). (11752328
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11752328&dopt=Books>)
12. Strausberg, R.L., Feingold, E.A., Klausner, R.D., Collins, F.S.,
/The Mammalian Gene Collection. /Science, *286*, 455-457 (1999).
(10521335
<http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10521335&dopt=Books>)
13. Devereux, J., Haeberli, P. & Smithies, O., /A comprehensive set of
sequence analysis programs for the VAX/, Nucl. Acids Res. *12*,
387-395 (1984). (84118744
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=84118744>)
14. Sneath, P.H.A. & Sokal, R.R., /Numerical taxonomy/, W.H. Freeman, San
Francisco, London (1973).
15. Stapleton M., Liao GC., Brokstein P., Hong L., Carninci P.,
Shiraki T., Hayashizaki Y., Champe M., Pacleb J., Wan K., Yu C.,
Carlson J., George R., Celniker S., and Rubin GM., /The Drosophila
Gene Collection: Identification of Putative Full-Length cDNAs for
70% of D. melanogaster Genes. /Genome Res., *12*:1294-1300 (2002).
(12176937
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=12176937>)
16. Schmid C.D., Praz V., Delorenzi M., Périer R., and Bucher P., /The
Eukaryotic Promoter Database EPD: the impact of in silico primer
extension. /Nucleic Acids Res. *32*, D82-5 (2004). (14681364
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&dopt=Citation&list_uids=14681364>)
A. APPENDIX A : SURVEY OF RELEASE
<http://www.epd.isb-sib.ch/current/SURVEY.html>
B. APPENDIX B : CODES AND ABBREVIATIONS
B.1. SPECIES CODES
*Code* /Scientific name/ (English name)
AAV2 /Adeno-associated virus 2
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Adeno-associated+virus&lvl=0&srchmode=1>/
Ac /Aplysia californica
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Aplysia+californica&lvl=0&srchmode=1>/
(California sea hare)
AcNPV /Autographa californica nuclear polyhedrosis virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Autographa+californica+nuclear+polyhedrosis+virus&lvl=0&srchmode=1>/
Ad2 /Human adenovirus type 2
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+adenovirus+type+2&lvl=0&srchmode=1>/
Ad5 /Human adenovirus type 5
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+adenovirus+type+5&lvl=0&srchmode=1>/
Ad7 /Human adenovirus type 7
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+adenovirus+type+7&lvl=0&srchmode=1>/
Ad12 /Human adenovirus type 12
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+adenovirus+type+12&lvl=0&srchmode=1>/
Ag /Ateles geoffroyi
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Ateles+geoffroyi&lvl=0&srchmode=1>/
(black-handed spider monkey)
ALV /Avian leukosis virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Avian+leukosis+virus&lvl=0&srchmode=1>/
Am /Antirrhinum majus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Antirrhinum+majus&lvl=0&srchmode=1>/
(snapdragon)
Ab-MLV /Abelson murine leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Abelson+murine+leukemia+virus&lvl=0&srchmode=1>/
Apo /Antheraea polyphemus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Antheraea+polyphemus&lvl=0&srchmode=1>/
(polyphemus moth)
Ap /Anas platyrhynchos
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Anas+platyrhynchos&lvl=0&srchmode=1>/
(mallard, domestic duck)
As /Avena sativa
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Avena+sativa&lvl=0&srchmode=1>/
(oat)
At /Agrobacterium tumefaciens
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Agrobacterium+tumefaciens&lvl=0&srchmode=1>/
Ath /Arabidopsis thaliana
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Arabidopsis+thaliana&lvl=0&srchmode=1>/
(thale cress)
Atr /Aotus trivirgatus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Aotus+trivirgatus&lvl=0&srchmode=1>/
(douroucouli)
Ay /Antheraea yamamai
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Antheraea+yamamai&lvl=0&srchmode=1>/
B19 /Human parvovirus B19
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+parvovirus+B19&lvl=0&srchmode=1>/
Be /Bertholletia excelsa
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Bertholletia+excelsa&lvl=0&srchmode=1>/
(Brazil nut)
BKV /Papovavirus BKV
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Papovavirus+BKV&lvl=0&srchmode=1>/
BLV /Bovine leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Bovine+leukemia+virus&lvl=0&srchmode=1>/
Bm /Bombyx mori
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Bombyx+mori&lvl=0&srchmode=1>/
(silkworm)
Bn /Brassica napus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Brassica+napus&lvl=0&srchmode=1>/
(rape)
BPV1 /Bovine papillomavirus type 1
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Bovine+papillomavirus+type+1&lvl=0&srchmode=1>/
Bt /Bos taurus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Bos+taurus&lvl=0&srchmode=1>/
(cattle)
CaMV /Cauliflower mosaic virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Cauliflower+mosaic+virus&lvl=0&srchmode=1>/
Cco /Coturnix coturnix
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Coturnix+coturnix&lvl=0&srchmode=1>/
(quail)
Ce /Caenorhabditis elegans
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Caenorhabditis+elegans&lvl=0&srchmode=1>/
Cg /Canavalia gladiata
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Canavalia+gladiata&lvl=0&srchmode=1>/
(sword bean)
Cgr /Cricetulus griseus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Cricetulus+griseus&lvl=0&srchmode=1>/
(Chinese hamster)
Ch /Capra hircus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Capra+hircus&lvl=0&srchmode=1>/
(goat)
Cl /Canis lupus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Canis+lupus&lvl=0&srchmode=1>/
(gray wolf)
Cm /Cairina moschata
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Cairina+moschata&lvl=0&srchmode=1>/
(muscovy duck)
Cp /Cavia porcellus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Cavia+porcellus&lvl=0&srchmode=1>/
(domestic guinea pig)
Cpe /Cucurbita pepo
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Cucurbita+pepo&lvl=0&srchmode=1>/
(zucchini)
Ct /Chironomus thummi
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Chironomus+thummi&lvl=0&srchmode=1>/
(midge)
Cte /Chironomus tentans
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Chironomus+tentans&lvl=0&srchmode=1>/
Dc /Daucus carota
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Daucus+carota&lvl=0&srchmode=1>/
(carrot)
Df /Drosophila funebris
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+funebris&lvl=0&srchmode=1>/
(fruit fly)
Dh /Drosophila hydei
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+hydei&lvl=0&srchmode=1>/
(fruit fly)
DHBV /Duck hepatitis B virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Duck+hepatitis+B+virus&lvl=0&srchmode=1>/
Dm /Drosophila melanogaster
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+melanogaster&lvl=0&srchmode=1>/
(fruit fly)
Dma /Drosophila mauritiana
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+mauritiana&lvl=0&srchmode=1>/
(fruit fly)
Dmo /Drosophila mojavensis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+mojavensis&lvl=0&srchmode=1>/
(fruit fly)
Dmu /Drosophila mulleri
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+mulleri&lvl=0&srchmode=1>/
(fruit fly)
Do /Drosophila orena
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+orena&lvl=0&srchmode=1>/
(fruit fly)
Dp /Drosophila pseudoobscura
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+pseudoobscura&lvl=0&srchmode=1>/
(fruit fly)
Ds /Drosophila simulans
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+simulans&lvl=0&srchmode=1>/
(fruit fly)
Dse /Drosophila sechellia
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+sechellia&lvl=0&srchmode=1>/
(fruit fly)
Dv /Drosophila virilis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Drosophila+virilis&lvl=0&srchmode=1>/
(fruit fly)
EBV /Human herpesvirus 4
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+herpesvirus+4&lvl=0&srchmode=1>/
(Epstein-Barr virus)
Ec /Equus caballus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Equus+caballus&lvl=0&srchmode=1>/
(horse)
FBJ-MSV /Murine osteosarcoma virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Murine+osteosarcoma+virus&lvl=0&srchmode=1>/
(Finkel-Biskis-Jinkins)
FBR-MSV /Murine osteosarcoma virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Murine+osteosarcoma+virus&lvl=0&srchmode=1>/
(Finkel-Biskis-Reilly)
F-MCF /Friend mink cell focus-forming virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Friend+mink+cell+focus-forming+virus&lvl=0&srchmode=1>/
(Murine)
Fs /Felis silvestris
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Felis+silvestris&lvl=0&srchmode=1>/
(wild cat)
F-SFFV /Friend spleen focus-forming virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Friend+spleen+focus-forming+virus&lvl=0&srchmode=1>/
Ft /Flaveria trinervia
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Flaveria+trinervia&lvl=0&srchmode=1>/
GA-FeLV /Gardner-Arnstein feline leukemia oncovirus B
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Gardner-Arnstein+feline+leukemia+oncovirus+B&lvl=0&srchmode=1>/
GALV /Gibbon ape leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Gibbon+ape+leukemia+virus&lvl=0&srchmode=1>/
Gg /Gallus gallus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Gallus+gallus&lvl=0&srchmode=1>/
(chicken)
Ggo /Gorilla gorilla
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Gorilla+gorilla&lvl=0&srchmode=1>/
(gorilla)
Gm /Glycine max
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Glycine+max&lvl=0&srchmode=1>/
(soybean)
GSHV /Ground squirrel hepatitis virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Ground+squirrel+hepatitis+virus&lvl=0&srchmode=1>/
H-1 /Parvovirus H1
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Parvovirus+H1&lvl=0&srchmode=1>/
(Murine)
Ha /Helianthus annuus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Helianthus+annuus&lvl=0&srchmode=1>/
(common sunflower)
Hb /Hevea brasiliensis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Hevea+brasiliensis&lvl=0&srchmode=1>/
(para rubber tree)
HBV /Human hepatitis B virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+hepatitis+B+virus&lvl=0&srchmode=1>/
HCMV /Human cytomegalovirus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+cytomegalovirus&lvl=0&srchmode=1>/
Hg /Halichoerus grypus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Halichoerus+grypus&lvl=0&srchmode=1>/
(grey seal)
HIV-1 /Human immunodeficiency virus type 1
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+immunodeficiency+virus+type+1&lvl=0&srchmode=1>/
HIV-2 /Human immunodeficiency virus type 2
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+immunodeficiency+virus+type+2&lvl=0&srchmode=1>/
HPV16 /Human papillomavirus type 16
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+papillomavirus+type+16&lvl=0&srchmode=1>/
HPV18 /Human papillomavirus type 18
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+papillomavirus+type+18&lvl=0&srchmode=1>/
Hs /Homo sapiens
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Homo+sapiens&lvl=0&srchmode=1>/
(human)
HSV-1 /Human herpesvirus 1
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+herpesvirus+1&lvl=0&srchmode=1>/
HSV-2 /Human herpesvirus 2
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+herpesvirus+2&lvl=0&srchmode=1>/
HTLV-I /Human T-cell leukemia virus type I
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+T-cell+leukemia+virus+type+I&lvl=0&srchmode=1>/
HTLV-II /Human T-cell leukemia virus type II
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Human+T-cell+leukemia+virus+type+II&lvl=0&srchmode=1>/
Hv /Hordeum vulgare
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Hordeum+vulgare&lvl=0&srchmode=1>/
(barley)
HVS /Herpesvirus saimiri
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Herpesvirus+saimiri&lvl=0&srchmode=1>/
JCV /Human polyomavirus JCV
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=human+polyomavirus+JCV&lvl=0&srchmode=1>/
Le /Lycopersicon esculentum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Lycopersicon+esculentum&lvl=0&srchmode=1>/
(tomato)
Leu /Lepus europaeus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Lepus+europaeus&lvl=0&srchmode=1>/
(European hare)
Lm /Locusta migratoria
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Locusta+migratoria&lvl=0&srchmode=1>/
(migratory locust)
Lp /Lytechinus pictus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Lytechinus+pictus&lvl=0&srchmode=1>/
(painted urchin)
Lpe /Lycopersicon peruvianum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Lycopersicon+peruvianum&lvl=0&srchmode=1>/
(Peruvian tomato)
Lv /Lytechinus variegatus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Lytechinus+variegatus&lvl=0&srchmode=1>/
(green urchin)
Ma /Mesocricetus auratus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Mesocricetus+auratus&lvl=0&srchmode=1>/
(golden hamster)
Mc /Macaca fascicularis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Macaca+fascicularis&lvl=0&srchmode=1>/
(crab-eating macaque)
MCMV /Murine cytomegalovirus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Murine+cytomegalovirus&lvl=0&srchmode=1>/
MLV_AKV /AKV murine leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=AKV+murine+leukemia+virus&lvl=0&srchmode=1>/
MLVxeno /Xenotropic murine leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Xenotropic+murine+leukemia+virus&lvl=0&srchmode=1>/
Mm /Mus musculus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Mus+musculus&lvl=0&srchmode=1>/
(house mouse)
M-MLV /Moloney murine leukemia virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Moloney+murine+leukemia+virus&lvl=0&srchmode=1>/
M-MSV /Moloney murine sarcoma virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Moloney+murine+sarcoma+virus&lvl=0&srchmode=1>/
MMTV /Mouse mammary tumor virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Mouse+mammary+tumor+virus&lvl=0&srchmode=1>/
Ms /Medicago sativa
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Medicago+sativa&lvl=0&srchmode=1>/
(alfalfa)
MSV /Maize streak virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Maize+streak+virus&lvl=0&srchmode=1>/
Np /Nicotiana plumbaginifolia
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Nicotiana+plumbaginifolia&lvl=0&srchmode=1>/
(curled-leaved tobacco)
Ns /Nicotiana sylvestris
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Nicotiana+sylvestris&lvl=0&srchmode=1>/
(wood tobacco)
Nt /Nicotiana tabacum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Nicotiana+tabacum&lvl=0&srchmode=1>/
(common tobacco)
Nto /Nicotiana tomentosiformis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Nicotiana+tomentosiformis&lvl=0&srchmode=1>/
Oa /Ovis aries
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Ovis+aries&lvl=0&srchmode=1>/
(sheep)
Oc /Oryctolagus cuniculus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Oryctolagus+cuniculus&lvl=0&srchmode=1>/
(rabbit)
Os /Oryza sativa
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Oryza+sativa&lvl=0&srchmode=1>/
(rice)
Ph /Petunia hybrida
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Petunia+hybrida&lvl=0&srchmode=1>/
(e.g. Petunia strain Mitchell)
Pa /Papio anubis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Papio+anubis&lvl=0&srchmode=1>/
(olive baboon)
Pc /Petroselinum crispum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Petroselinum+crispum&lvl=0&srchmode=1>/
(parsley)
Pl /Paracentrotus lividus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Paracentrotus+lividus&lvl=0&srchmode=1>/
(common urchin)
Pm /Psammechinus miliaris
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Psammechinus+miliaris&lvl=0&srchmode=1>/
(sand urchin)
Polyoma /Mouse polyomavirus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Mouse+polyomavirus&lvl=0&srchmode=1>/
Ppy /Photinus pyralis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Photinus+pyralis&lvl=0&srchmode=1>/
(North American firefly)
Pp /Pongo pygmaeus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Pongo+pygmaeus&lvl=0&srchmode=1>/
(orangutan)
Ps /Pisum sativum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Pisum+sativum&lvl=0&srchmode=1>/
(pea)
Pt /Pan troglodytes
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Pan+troglodytes&lvl=0&srchmode=1>/
(chimpanzee)
Pth /Pinus thunbergii
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Pinus+thunbergii&lvl=0&srchmode=1>/
(Japanese black pine)
Pv /Phaseolus vulgaris
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Phaseolus+vulgaris&lvl=0&srchmode=1>/
(kidney bean)
RAV2 /Rous associated virus type 2
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Rous+associated+virus+type+2&lvl=0&srchmode=1>/
(Avian)
Rc /Ricinus communis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Ricinus+communis&lvl=0&srchmode=1>/
(castor bean)
R-MCF /Rauscher mink cell focus-forming virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Rauscher+mink+cell+focus-forming+virus&lvl=0&srchmode=1>/
Rn /Rattus norvegicus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Rattus+norvegicus&lvl=0&srchmode=1>/
(Norway rat)
RSV /Rous sarcoma virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Rous+sarcoma+virus&lvl=0&srchmode=1>/
(Avian)
Sa /Sinapis alba
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Sinapis+alba&lvl=0&srchmode=1>/
(white mustard)
SA7P /Simian adenovirus (7P)
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Simian+adenovirus+&lvl=0&srchmode=1>/
Sd /Strongylocentrotus droebachiensis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Strongylocentrotus+droebachiensis&lvl=0&srchmode=1>/
Se /Nannospalax ehrenbergi
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Nannospalax+ehrenbergi&lvl=0&srchmode=1>/
(Ehrenberg's mole-rat)
Sg /Oncorhynchus mykiss
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Oncorhynchus+mykiss&lvl=0&srchmode=1>/
(rainbow trout)
SIV /Simian immunodeficiency virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Simian+immunodeficiency+virus+&lvl=0&srchmode=1>/
SNV /Spleen necrosis virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Spleen+necrosis+virus&lvl=0&srchmode=1>/
So /Spinacia oleracea
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Spinacia+oleracea&lvl=0&srchmode=1>/
Sp /Strongylocentrotus purpuratus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Strongylocentrotus+purpuratus&lvl=0&srchmode=1>/
Spe /Sarcophaga peregrina
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Sarcophaga+peregrina&lvl=0&srchmode=1>/
Sr /Sesbania rostrata
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Sesbania+rostrata&lvl=0&srchmode=1>/
SRV-1 /Simian AIDS retrovirus SRV-1
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Simian+AIDS+retrovirus+SRV-1&lvl=0&srchmode=1>/
Ss /Sus scrofa
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Sus+scrofa&lvl=0&srchmode=1>/
(pig)
SSV /Simian sarcoma virus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Simian+sarcoma+virus&lvl=0&srchmode=1>/
St /Solanum tuberosum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Solanum+tuberosum&lvl=0&srchmode=1>/
(potato)
Sv /Sorghum bicolor
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Sorghum+bicolor&lvl=0&srchmode=1>/
(sorghum)
SV40 /Simian virus 40
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Simian+virus+40&lvl=0&srchmode=1>/
Ta /Triticum aestivum
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Triticum+aestivum&lvl=0&srchmode=1>/
(wheat)
Visna /Visna lentivirus
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Visna+lentivirus&lvl=0&srchmode=1>/
Xb /Xenopus borealis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Xenopus+borealis&lvl=0&srchmode=1>/
(Kenyan clawed frog)
Xl /Xenopus laevis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Xenopus+laevis&lvl=0&srchmode=1>/
(African clawed frog)
Xt /Xenopus tropicalis
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Xenopus+tropicalis&lvl=0&srchmode=1>/
(western clawed frog)
Zm /Zea mays
<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&name=Zea+mays&lvl=0&srchmode=1>/
(maize)
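
The taxonomy links in the table above all share one address pattern, with the species name placed in the `name` parameter and spaces written as `+`. The following is a minimal sketch of how such a link could be rebuilt from a species name. The dictionary `SPECIES_CODES` and the function `taxonomy_url` are hypothetical names introduced for illustration only, and whether the NCBI server still accepts this address form is not verified here.

```python
from urllib.parse import quote_plus

# Base address copied from the links in the species table above.
TAXONOMY_BASE = "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi"

# Hypothetical excerpt of the species-code table (codes and names as listed above).
SPECIES_CODES = {
    "Mm": "Mus musculus",
    "Rn": "Rattus norvegicus",
    "Xl": "Xenopus laevis",
    "Zm": "Zea mays",
}


def taxonomy_url(species_name):
    """Build a taxonomy-browser link in the same form as the table entries."""
    # quote_plus encodes spaces as '+', matching the links in the table.
    return (
        f"{TAXONOMY_BASE}?mode=Undef&name={quote_plus(species_name)}"
        "&lvl=0&srchmode=1"
    )


if __name__ == "__main__":
    for code, name in SPECIES_CODES.items():
        print(code, taxonomy_url(name))
```

Keeping the codes in a dictionary keyed by the EPD species code means a code found in an entry can be mapped to its scientific name first and to a link second.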
B.2. JOURNAL CODES
Code Journal name
ARB Annual Review of Biochemistry
ARP Annual Review of Physiology
BBA Biochimica et Biophysica Acta
BBRC Biochemical and Biophysical Research Communications
Bch Biochemistry
Bchi Biochimie
BchJ Biochemical Journal
BCHS Biological Chemistry Hoppe-Seyler
BrJR British Journal of Rheumatology
BrainR Brain Research
Btech Biotechnology
CanR Cancer Research
Cell Cell
CGD Cell Growth and Differentiation
Chrom Chromosoma
CSHS Cold Spring Harbor Symposia on Quantitative Biology
CTMI Current Topics in Microbiology and Immunology
CurG Current Genetics
DCB DNA and Cell Biology
DevB Developmental Biology
Diab Diabetes
DNA DNA
ECR Experimental Cell Research
EJBc European Journal of Biochemistry
EJCB European Journal of Cell Biology
EMBOJ EMBO Journal
EMBOR EMBO Reports
Evo Evolution
FEBS FEBS Letters
GDev Genes and Development
Gene Gene
GChC Genes Chromosomes Cancer
GnmR Genome Research
Gnms Genomics
Gnts Genetics
HGEN Human Genetics
IJCa International Journal of Cancer
ImTo Immunology Today
JBC Journal of Biological Chemistry
JBch Journal of Biochemistry
JCB Journal of Cell Biology
JEM Journal of Experimental Medicine
JGV Journal of General Virology
JI Journal of Immunology
JMAG Journal of Molecular and Applied Genetics
JMB Journal of Molecular Biology
JME Journal of Molecular Evolution
JMEnd Journal of Molecular Endocrinology
JNeSc Journal of Neuroscience
JVir Journal of Virology
MB Molecular Biology
MBE Molecular Biology and Evolution
MBM Molecular Biology and Medicine
MBR Molecular Biology Reports
MCB Molecular and Cellular Biology
MCEnd Molecular and Cellular Endocrinology
MEnd Molecular Endocrinology
MImm Molecular Immunology
MEnz Methods in Enzymology
MGG Molecular and General Genetics
MNeub Molecular Neurobiology
MPMI Molecular Plant-Microbe Interactions
NAR Nucleic Acids Research
Nat Nature
Oncg Oncogene
OncR Oncogene Research
Pla Planta
PlJ Plant Journal
PMB Plant Molecular Biology
PSL Plant Science Letters
RPHR Recent Progress in Hormone Research
PNAS Proceedings of the National Academy of Sciences of the United
States of America
Sci Science
SCMG Somatic Cell and Molecular Genetics
TiG Trends in Genetics
Vir Virology
VirR Virus Research
B.3. ABBREVIATIONS
1-25OH2D3 1,25-(OH)_2 vitamin D_3
20-OHE 20-Hydroxyecdysone
4CL 4-coumarate coenzyme A ligase
a1 Gene locus 1 involved in anthocyanin biosynthesis
abd-g. Abdominal ganglion
abl Abelson murine leukemia virus oncogene
ACC 1-aminocyclopropane-1-carboxylic acid
AChR Acetylcholine receptor
ACP b'-ketoacyl-acyl carrier protein of fatty acid synthase
ACTH Adrenocorticotropic hormone
ADA Adenosine deaminase
ADH Alcohol dehydrogenase
ADPg-s GT ADPglucose-starch glucosyltransferase
adult-HA Adult hermaphrodite
AFW1 Adult fast-white (myosin heavy chain) 1
Ag Antigen
(AGM) "from African green monkey"
AGP Acid glycoprotein
AGPP ADP glucose pyrophosphorylase
AIRS Aminoimidazole ribonucleotide synthase
ALA-synt. 5-Aminolevulinate synthase
ALDH_2 Aldehyde dehydrogenase 2
AlkExo Alkaline exonuclease
Amy Amylase
antp "antennapedia" locus
aP2 Adipocyte homologue of myelin P2
apolipop. Apolipoprotein
apoVLDLII Very low density apolipoprotein II
APRT Adenine phosphoribosyltransferase
AR Adrenergic receptor
ARF ADP-ribosylation factor
arg Arginine
AS Argininosuccinate synthetase
AS-C "achaete-scute" complex locus
AspAT Aspartate aminotransferase
ass. Associated
AT Antitrypsin
ATIII Antithrombin III
ATCase Aspartate transcarbamylase
ATP Adenosine triphosphate
awd "abnormal wing disk" locus
BB Bowman-Birk (protease inhibitor)
BCKDHA Branched-chain alpha-keto acid dehydrogenase complex
Bcl-2 B-cell leukemia/lymphoma 2 proto-oncogene
BMMC Bone marrow-derived mast cell
BPTI Bovine pancreatic trypsin inhibitor
BSF B-cell stimulating factor
bsg25D Blastoderm specific locus 25D
c- Cellular protooncogene ..
c1 Regulatory locus of anthocyanin synthesis (maize)
C4BP Complement component C4-binding protein
CA Carbonic anhydrase
CAD Carbamoyl-phosphate synthetase (glutamine-hydrolysing)/aspartate
carbamoyl transferase/dihydroorotase
cab Chlorophyll a/b-binding protein
cAMP Cyclic AMP (Adenosine monophosphate)
card-m. Cardiac muscle
cc-ind. Cell cycle-independent
CD3 T-cell differentiation antigen CD3
CD4 T-cell differentiation antigen CD4
CD8 T-cell differentiation antigen CD8
CEA Carcinoembryonic antigen
CG Chorionic gonadotropin
CNS Central nervous system
CNTF Ciliary neurotrophic factor
car. Cartilage
col. Collagen
conglyc. Conglycinin
cor. Cornea
cotyl. Cotyledon
cp Cytoplasm(ic)
CPS Carbamyl-phosphate synthetase
CRF Corticotropin-releasing factor
CRP C-reactive protein
cs Cytosol(ic)
CSF Colony stimulating factor
cyt Cytokinin gene (coding for isopentenyltransferase)
DAF Decay-accelerating factor
dbp DNA binding protein
DDC DOPA decarboxylase
DDH Dihydrodiol dehydrogenase
dep. dependent
dev. Development(ally)
DHFR Dihydrofolate reductase
diff. differentiation, differentiated
DL/R Left and right duplicated region
dnc "dunce" locus
dUTPase Deoxyuridinetriphosphatase
E 1. Early, 2. Erythroid cell-specific
E8 Ethylene inducible gene during fruit ripening 8
EAS 5-epi-aristolochene synthase (sesquiterpene cyclase)
EBNA Epstein-Barr virus nuclear antigens
ecd-ind. Ecdysone-inducible
EDF Eosinophil differentiation factor
EFW1 Embryonic fast-white (myosin heavy chain) 1
EGF Epidermal growth factor
EIa Adenovirus early Ia region (transactivating element)
Eip Ecdysone-induced protein
ELH Egg-laying hormone
em Embryo, embryonic
epithel epithelial or epithelium
EPSP 5-Enolpyruvylshikimate-3-phosphate
erbA,B (Avian) erythroblastosis virus oncogene A,B
E-resp. Estrogen-responsive
ERV3 Endogenous retrovirus 3
E.Tn Early transposon
et-hypocot. Etiolated hypocotyl
ev1 (Avian) endogenous virus 1
eve "even-skipped" locus
exch. Exchanger
f. Factor
fib. Fibers
fibrob. Fibroblasts
FMRFamide Phe-Met-Arg-Phe-NH(2) neuropeptide
FNR Ferredoxin-(NADP+)-oxidoreductase
FBP Folate Binding Protein
fos FBJ (Finkel-Biskis-Jinkins) osteosarcoma virus oncogene
FSH Follicle stimulating hormone
ftz "fushi tarazu" locus
g. Gene
G0S.. G0/G1 switch regulatory gene ..
G6PD Glucose-6-phosphate dehydrogenase
GA Gibberellic acid
GADPH Glyceraldehyde-3-phosphate dehydrogenase
GARS Glycinamide ribonucleotide synthase
Gart "Gart" locus (-> GARS, AIRS, GART)
GART Glycinamide ribonucleotide transformylase
gC Glycoprotein C
G-CSF Granulocyte colony stimulating factor
gD Glycoprotein D
GdX X-linked gene downstream of G6PD gene
gE Glycoprotein E
GFAP Glial fibrillary acidic protein
g'GT g'-Glutamyl transpeptidase
gln Glutamine
globul-12s 12s globulin (oat seed storage protein)
glucc Glucocorticoid
GLUT1 Glucose transporter type 1
GM-CSF Granulocyte/Macrophage colony stimulating factor
GnRH Gonadotropin-releasing hormone
gp Glycoprotein
GPD Glycerol-3-phosphate dehydrogenase
GPT UDP-GlcNAc:dolichol phosphate N-acetylglucosamine-1-phosphate
transferase
granulo-c Granulocyte
GRF Growth hormone-releasing factor
GRP Glycine-rich (cell wall) protein
GS17 Gastrula-specific transcript 17
GSHPx Glutathione peroxidase
G-spec. Gastrula-specific
GST Glutathione S-transferase
H 1. Heavy chain, 2. Housekeeping-type promoter
Ha-ras Rat-derived Harvey murine sarcoma virus oncogene
haptoblob haptoglobin
hb "hunchback" locus
Hc High-cysteine (chorion protein)
HDC L-histidine decarboxylase
hematop. hematopoietic
HGT High-(glycine+tyrosine) keratin
hist. Histone
HMG- High mobility group chromosomal protein
HMG-CoA 3-Hydroxy-3-methylglutaryl coenzyme A
HPRT Hypoxanthine phosphoribosyltransferase
hs Heatshock
hsc Constitutive analogue of heatshock gene/protein
HSF Hepatocyte-stimulating factor
hsp Heatshock protein
Ht Testicular histone
HTF Restriction endonuclease HpaII tiny fragments
I-FABP Intestinal fatty-acid binding protein
IAA Indoleacetic acid
IAP Intracisternal A-particles
ICP Infected cell protein
IE Immediate early (gene, RNA)
IF Intermediate filament
IFI Interferon-induced gene/protein
IFN Interferon
Ig Immunoglobulin
IGF Insulin-like growth factor
IL Interleukin
inf. Infected
inh. Inhibitor
iNOS Inducible nitric oxide synthase
IRF Interferon regulatory factor
ISG Interferon-stimulated gene
k. Kinase
keratino-c Keratinocyte
Ki-ras Rat-derived Kirsten murine sarcoma virus oncogene
L 1. Light chain; 2. Late
larva-1,2,.. First, second, .. instar larva
LAT.. Lycopersicon anther-specific gene ..
LCAT Lecithin-cholesterol acyltransferase
lck T-cell- or lymphocyte-specific tyrosine kinase
LDH Lactate dehydrogenase
leghem. Leghemoglobin
LeIF Leukocyte interferon
leuko-c Leukocyte
LH Luteinizing hormone
LHC Light-harvesting complex
LHRH Luteinizing hormone-releasing hormone
liv. liver
LMW Low molecular weight
LPH Lipotropic hormone
LPS Lipopolysaccharide
LTR Long Terminal Repeat
lympho-c Lymphocyte
lys Lysosomal
MBP Myelin basic protein
(MAC) Macaque
MC Methylcholanthrene
MCK Muscle-specific creatine kinase
mGK Submaxillary gland kallikrein
MHCI/MHCII Class I/II transplantation antigens of major
histocompatibility complex
MIF Macrophage migration inhibitory factor
minipara Miniparamyosin
mit Mitochondrial
mono-c Monocyte
mononuc-c. Mononuclear cells
MOPC.. Mineral oil-induced plasmacytoma
mos Moloney murine sarcoma virus oncogene
MP Macrophage
MPC.. Mouse plasma cell tumor
MRP MIF-related protein (see MIF)
MSF Megakaryocyte stimulating factor
msp Major sperm protein gene
MT Metallothionein
mst Male-specific transcript
MUP Major urinary protein
myb (Avian) myeloblastosis virus oncogene
myc Myelocytomatosis virus 29 oncogene
NCA nonspecific cross-reacting (with -> CEA) antigen
nerv. sys Nervous system
neu Ethyl-nitrosourea-induced rat neuroblastoma oncogene
neuropep. Neuropeptide
NGF Nerve growth factor
ninaE "neither inactivation nor afterpotential" locus E
NMDH NADP-malate dehydrogenase
NOS Nitric oxide synthase
nos Nopaline synthetase
NR Nitrate reductase
N-ras Neuroblastoma ras-like (-> Ha-ras) oncogene
NS Nervous system
OAT Ornithine aminotransferase
ocs Octopine synthetase
ODC Ornithine decarboxylase
Ori Origin of replication
OTC Ornithine transcarbamylase
ovalb. Ovalbumin
p. Protein
P-450 Cytochrome P-450
p53 53K phosphoprotein
panc. pancreas, pancreatic
parath. Parathyroid
PB Phenobarbital
PBGD Porphobilinogen deaminase
PCNA Proliferating cell nuclear antigen
PDEase cAMP phosphodiesterase
PDGF Platelet-derived growth factor
PEPCase Phosphoenolpyruvate carboxylase
PEPCK Phosphoenolpyruvate carboxykinase
PG Prostaglandin
PGK 3-Phosphoglycerate kinase
PHA Phytohemagglutinin
PK Protein kinase
P_L Late promoter
PLP Proteolipid protein
POL Polymerase
POMC Proopiomelanocortin
pp.. Phosphoprotein ..
PR1a Pathogenesis-related protein 1a
PRBP Plasma retinol-binding protein
PRL Prolactin
prog. Progesterone
prolyl 4-hydr. Prolyl 4-hydroxylase
PrP Prion protein
PSG1,PSG2,. Pregnancy-specific glycoproteins 1,2,.
PSBP Prostatic steroid binding protein
PSP Parotid secretory protein
PTH Parathyroid hormone
pTiN Nopaline type tumor inducing plasmid
pTiO Octopine type tumor inducing plasmid
r "rudimentary" locus
R 1. Regulatory subunit, 2. Erythroid cell-specific
RAB Gene responsive to ABA
ras Homologue of -> Ha-ras, Ki-ras, etc.
rec. Receptor
red. Reductase
reg. Regulated
rep-dep. Replication-dependent
rig Rat insulinoma gene
RnBP Renin-binding protein
RNR1, RNR2 Ribonucleotide reductase large, small subunit
rp Ribosomal protein
rTn Retrotransposon
RuBPCss Ribulose-1,5-bisphosphate carboxylase small subunit
RuBPCA Ribulose-1,5-bisphosphate carboxylase/oxygenase activase
s. Small
saliv-g. Salivary gland
SBP Spermine-binding protein
SC Stem cells
sem-v. Seminal vesicle
ser. Serum
sgs Salivary gland secretion protein
sis Simian sarcoma virus oncogene
sk-m. Skeletal muscle
skel-m. Skeletal muscle
smooth-m. Smooth muscle
snRNA Small nuclear RNA
snRNP Small nuclear ribonucleoprotein
SOD Superoxide dismutase
som Somatic
spat-reg. Spatially regulated
Spec Strongylocentrotus purpuratus ectoderm enriched RNA
SPI Serine protease inhibitor
sry "serendipity" locus
SV40T Tumor antigen of simian virus 40 (SV40)
SVS Seminal vesicle secretory protein
synt. Synthase
T3d' T-cell antigen receptor-associated T3-complex delta chain
TAT Tyrosine aminotransferase
TCDD 2,3,7,8-Tetrachlorodibenzo-p-dioxin
TCGF T-cell growth factor
TCR T-cell receptor
TdT Terminal deoxynucleotidyltransferase
test. testis
TF Transcription factor
TGA1a TGACG-specific DNA-binding protein 1a
TGF-b' Transforming growth factor beta
TH Tyrosine hydroxylase
thyr. Thyroxine
Thy-1.2 Thy-1 (thymocyte) antigen/glycoprotein allotype 2
TIF Trans-inducing factor
TIM Triosephosphate isomerase
tis. Tissue
TM Tropomyosin
tmr "tumor morphology root" locus
TNF Tumor necrosis factor
TnI Troponin I (inhibitory subunit)
TnT Troponin T (tropomyosin-binding subunit)
TO Tryptophan oxygenase
TP1,TP2,. Transition protein 1,2,.
TPA 12-O-tetradecanoyl-phorbol-13-acetate
TPI Triosephosphate isomerase
tr.,tr- Transcript
TRF T-cell replacing factor
TRH Thyrotropin-releasing hormone
TS Thymidylate synthetase
TSH Thyroid stimulating hormone
T/t Large/small T(tumor) antigen
Ubx "ultrabithorax" locus
uPA Urine plasminogen activator
URO-D Uroporphyrinogen decarboxylase
Vg1 Vegetal hemisphere-specific mRNA 1
vir-inf. Viral infection
VL30 Retrovirus-like 30s RNA
VLDL Very low density lipoprotein
V_NP (Immunoglobulin heavy chain) variable region specific for
4-hydroxy-3-nitrophenylacetyl
VP5 Virion protein 5 (HSV-1/2: =major capsid protein)
VSP Virion stimulatory protein
vWf von Willebrand factor
Zen "zerknuellt" protein
------------------------------------------------------------------------