ID C1E408_MICCC Unreviewed; 689 AA.
AC C1E408;
DT 26-MAY-2009, integrated into UniProtKB/TrEMBL.
DT 26-MAY-2009, sequence version 1.
DT 27-MAR-2024, entry version 54.
DE RecName: Full=SANT domain-containing protein {ECO:0000259|PROSITE:PS51293};
GN ORFNames=MICPUN_57866 {ECO:0000313|EMBL:ACO63028.1};
OS Micromonas commoda (strain RCC299 / NOUM17 / CCMP2709) (Picoplanktonic
OS green alga).
OC Eukaryota; Viridiplantae; Chlorophyta; Mamiellophyceae; Mamiellales;
OC Mamiellaceae; Micromonas.
OX NCBI_TaxID=296587 {ECO:0000313|EMBL:ACO63028.1, ECO:0000313|Proteomes:UP000002009};
RN [1] {ECO:0000313|EMBL:ACO63028.1, ECO:0000313|Proteomes:UP000002009}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=RCC299 / NOUM17 {ECO:0000313|Proteomes:UP000002009};
RX PubMed=19359590; DOI=10.1126/science.1167222;
RA Worden A.Z., Lee J.H., Mock T., Rouze P., Simmons M.P., Aerts A.L.,
RA Allen A.E., Cuvelier M.L., Derelle E., Everett M.V., Foulon E.,
RA Grimwood J., Gundlach H., Henrissat B., Napoli C., McDonald S.M.,
RA Parker M.S., Rombauts S., Salamov A., Von Dassow P., Badger J.H.,
RA Coutinho P.M., Demir E., Dubchak I., Gentemann C., Eikrem W., Gready J.E.,
RA John U., Lanier W., Lindquist E.A., Lucas S., Mayer K.F., Moreau H.,
RA Not F., Otillar R., Panaud O., Pangilinan J., Paulsen I., Piegu B.,
RA Poliakov A., Robbens S., Schmutz J., Toulza E., Wyss T., Zelensky A.,
RA Zhou K., Armbrust E.V., Bhattacharya D., Goodenough U.W., Van de Peer Y.,
RA Grigoriev I.V.;
RT "Green evolution and dynamic adaptations revealed by genomes of the marine
RT picoeukaryotes Micromonas.";
RL Science 324:268-272(2009).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CP001325; ACO63028.1; -; Genomic_DNA.
DR RefSeq; XP_002501770.1; XM_002501724.1.
DR AlphaFoldDB; C1E408; -.
DR GeneID; 8242674; -.
DR KEGG; mis:MICPUN_57866; -.
DR eggNOG; KOG4468; Eukaryota.
DR InParanoid; C1E408; -.
DR OMA; AMHSWHS; -.
DR OrthoDB; 390918at2759; -.
DR Proteomes; UP000002009; Chromosome 4.
DR GO; GO:0005634; C:nucleus; IEA:UniProtKB-KW.
DR GO; GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
DR CDD; cd00167; SANT; 1.
DR Gene3D; 1.20.58.1880; -; 1.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR001005; SANT/Myb.
DR InterPro; IPR017884; SANT_dom.
DR InterPro; IPR039467; TFIIIB_B''_Myb.
DR PANTHER; PTHR21677; CRAMPED PROTEIN; 1.
DR PANTHER; PTHR21677:SF1; PROTEIN CRAMPED-LIKE; 1.
DR Pfam; PF15963; Myb_DNA-bind_7; 1.
DR SMART; SM00717; SANT; 1.
DR SUPFAM; SSF46689; Homeodomain-like; 1.
DR PROSITE; PS51293; SANT; 1.
PE 4: Predicted;
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242};
KW Reference proteome {ECO:0000313|Proteomes:UP000002009};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..16
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 17..689
FT /note="SANT domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5002906529"
FT DOMAIN 23..74
FT /note="SANT"
FT /evidence="ECO:0000259|PROSITE:PS51293"
FT REGION 172..214
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 356..394
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 592..618
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 368..384
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 689 AA; 72365 MW; 4AA8DD9C317642ED CRC64;
MNVSFFIIFP IIFVRAGERK PTKSQDRWTP WEEKQFFTAL KDVGSNFDKI AEIIQTRNRS
QVRTFYNTEV KRINTVLAPL GVQVDPSDAE EVHTAMHSWH SMKDKLGPGG SFQELCKKPQ
NRNLFAWDLK KELDKSVWAQ GKKEAKLKSL GLLPAVAGDA SGKVLGVMPV KTTANRGRPK
GSGIRTKNTC KADSPAGRTK STGRKAARRG SVSPNAALRA AAATAAKDDS VAKEQKERLM
LQLFPIDTST RAALVAGGFN PHLELTFRAK KSVTGLMQHL ITKWAAALPH LPAGLDKETA
VLQLYPFEAA SASDAKGAWN HRHEGVTATD IFDAIGRPAA FRVRYGWVPS EEAAVRPHLA
PPPPVYQQSL ARSRSRTPSP PNLSPRKRSA EEMTVAHPSF CQAPVQAGSF AGMFGGGNAS
AVGSVFGSDL ALFGGVGGAG AGTGADGGVV GVGAEDFTLP GFGADFSNMC REMGLGDVDA
SRGLGGPSAG LTAARHNGIA AGAQLASAAP DQIKGYSERQ VISEFGGGDV TVGTMLENDA
MFSLTQMLGD VPEGNISHAG APVAVPSGPS SFAGMFAGGS IADVIGMPRQ KLKQPGIAGK
PRGGEKKPRQ PYNKKPANLG QPYVSSLVMT GPAVHGAIVP GYINNGKGFV PVQQQPVESA
AAYYPHMAPQ GDMAANMAKE AAASASLGA
//