ID A0A3Q2H300_HORSE Unreviewed; 1501 AA.
AC A0A3Q2H300;
DT 10-APR-2019, integrated into UniProtKB/TrEMBL.
DT 13-SEP-2023, sequence version 2.
DT 27-MAR-2024, entry version 25.
DE RecName: Full=Homeobox protein cut-like {ECO:0000256|RuleBase:RU361129};
GN Name=CUX1 {ECO:0000313|Ensembl:ENSECAP00000027979.2,
GN ECO:0000313|VGNC:VGNC:112271};
OS Equus caballus (Horse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Perissodactyla; Equidae; Equus.
OX NCBI_TaxID=9796 {ECO:0000313|Ensembl:ENSECAP00000027979.2, ECO:0000313|Proteomes:UP000002281};
RN [1] {ECO:0000313|Ensembl:ENSECAP00000027979.2, ECO:0000313|Proteomes:UP000002281}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000027979.2,
RC ECO:0000313|Proteomes:UP000002281};
RX PubMed=19892987; DOI=10.1126/science.1178158;
RG Broad Institute Genome Sequencing Platform;
RG Broad Institute Whole Genome Assembly Team;
RA Wade C.M., Giulotto E., Sigurdsson S., Zoli M., Gnerre S., Imsland F.,
RA Lear T.L., Adelson D.L., Bailey E., Bellone R.R., Bloecker H., Distl O.,
RA Edgar R.C., Garber M., Leeb T., Mauceli E., MacLeod J.N., Penedo M.C.T.,
RA Raison J.M., Sharpe T., Vogel J., Andersson L., Antczak D.F., Biagi T.,
RA Binns M.M., Chowdhary B.P., Coleman S.J., Della Valle G., Fryc S.,
RA Guerin G., Hasegawa T., Hill E.W., Jurka J., Kiialainen A., Lindgren G.,
RA Liu J., Magnani E., Mickelson J.R., Murray J., Nergadze S.G., Onofrio R.,
RA Pedroni S., Piras M.F., Raudsepp T., Rocchi M., Roeed K.H., Ryder O.A.,
RA Searle S., Skow L., Swinburne J.E., Syvaenen A.C., Tozaki T., Valberg S.J.,
RA Vaudin M., White J.R., Zody M.C., Lander E.S., Lindblad-Toh K.;
RT "Genome sequence, comparative analysis, and population genetics of the
RT domestic horse.";
RL Science 326:865-867(2009).
RN [2] {ECO:0000313|Ensembl:ENSECAP00000027979.2}
RP IDENTIFICATION.
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000027979.2};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|PROSITE-ProRule:PRU00108,
CC ECO:0000256|RuleBase:RU000682}.
CC -!- SIMILARITY: Belongs to the CUT homeobox family.
CC {ECO:0000256|ARBA:ARBA00008190, ECO:0000256|RuleBase:RU361129}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR Ensembl; ENSECAT00000061784.3; ENSECAP00000027979.2; ENSECAG00000022645.4.
DR VGNC; VGNC:112271; CUX1.
DR GeneTree; ENSGT00940000159751; -.
DR Proteomes; UP000002281; Chromosome 13.
DR Bgee; ENSECAG00000022645; Expressed in articular cartilage of joint and 23 other cell types or tissues.
DR ExpressionAtlas; A0A3Q2H300; baseline.
DR GO; GO:0005794; C:Golgi apparatus; IEA:Ensembl.
DR GO; GO:0005654; C:nucleoplasm; IEA:Ensembl.
DR GO; GO:0000981; F:DNA-binding transcription factor activity, RNA polymerase II-specific; IEA:InterPro.
DR GO; GO:1990837; F:sequence-specific double-stranded DNA binding; IEA:Ensembl.
DR CDD; cd00086; homeodomain; 1.
DR Gene3D; 1.10.10.60; Homeodomain-like; 1.
DR Gene3D; 1.10.260.40; lambda repressor-like DNA-binding domains; 3.
DR InterPro; IPR003350; CUT_dom.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR017970; Homeobox_CS.
DR InterPro; IPR001356; Homeobox_dom.
DR InterPro; IPR010982; Lambda_DNA-bd_dom_sf.
DR PANTHER; PTHR14043; CCAAT DISPLACEMENT PROTEIN-RELATED; 1.
DR PANTHER; PTHR14043:SF4; HOMEOBOX PROTEIN CUT-LIKE 1; 1.
DR Pfam; PF02376; CUT; 3.
DR Pfam; PF00046; Homeodomain; 1.
DR SMART; SM01109; CUT; 3.
DR SMART; SM00389; HOX; 1.
DR SUPFAM; SSF46689; Homeodomain-like; 1.
DR SUPFAM; SSF47413; lambda repressor-like DNA-binding domains; 3.
DR PROSITE; PS51042; CUT; 3.
DR PROSITE; PS00027; HOMEOBOX_1; 1.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
PE 3: Inferred from homology;
KW Coiled coil {ECO:0000256|ARBA:ARBA00023054, ECO:0000256|SAM:Coils};
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Homeobox {ECO:0000256|ARBA:ARBA00023155, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242, ECO:0000256|PROSITE-
KW ProRule:PRU00108}; Reference proteome {ECO:0000313|Proteomes:UP000002281};
KW Transcription {ECO:0000256|RuleBase:RU361129};
KW Transcription regulation {ECO:0000256|RuleBase:RU361129}.
FT DOMAIN 553..640
FT /note="CUT"
FT /evidence="ECO:0000259|PROSITE:PS51042"
FT DOMAIN 938..1025
FT /note="CUT"
FT /evidence="ECO:0000259|PROSITE:PS51042"
FT DOMAIN 1121..1208
FT /note="CUT"
FT /evidence="ECO:0000259|PROSITE:PS51042"
FT DOMAIN 1246..1306
FT /note="Homeobox"
FT /evidence="ECO:0000259|PROSITE:PS50071"
FT DNA_BIND 1248..1307
FT /note="Homeobox"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00108"
FT REGION 407..466
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 522..561
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 656..680
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 693..715
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 774..932
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1040..1114
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1216..1251
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1315..1501
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COILED 112..355
FT /evidence="ECO:0000256|SAM:Coils"
FT COMPBIAS 446..466
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 522..557
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 661..675
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 699..715
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 844..858
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 868..914
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 915..932
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1040..1082
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1087..1106
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1389..1406
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1410..1424
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1427..1442
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1501 AA; 164739 MW; EA346C4BBBFBED7D CRC64;
MAANVGSMFQ YWKRFDLQQL QRELDATATV LANRQDESEQ SRKRLIEQSR EFKKNTPEDL
RKQVAPLLKS FQGEIDALSK RSKEAEAAFL NVYKRLIDVP DPVPALELGQ QLQVKVQRLH
DIETENQKLR ETLEEYNKEF AEVKNQEVTI KALKEKIREY EQTLKNQAET IALEKEQKLQ
NDFAEKERKL QETQMSTTSK LEEAEHKVQT LQTALEKTRT ELFDLKTKYD EEITAKADEM
EMIMTDLERA NQRAEVAQRE AETLREQLSS ANHSLQLASQ IQKAPDVEQA IEVLTRSSLE
VELAAKEREI AQLVEDVQRL QASLSKLREN SASQISQLEQ QLSAKNSTLK QLEEKLKGQA
DYEEVKKELN ILKSMEFAPS EGAGTQDASK PLEVLLLEKN RSLQSENAAL RISNSDLSGS
ARRKGKDQPE SRRPGPLPAS PPPQLPRNTG EQASNTNGTH QFSPAGLTQD FFSSSLASPS
LPLASTGKFA LNSLLQRQLM QSFYSKAVQE AGSTSMIFPT GPYSTNSISS QSPLQQSPDV
NGMAPSPSQS ESAGSVSEGE EIDTAEIARQ VKEQLIKHNI GQRIFGHYVL GLSQGSVSEI
LARPKPWNKL TVRGKEPFHK MKQFLSDEQN ILALRSIQGR QRENPGQSLH RLFQEVPKRR
NGSEGNITTR VRASETGSDE AIKSILEQAK RELQVQKAAE PAQPSSSSSS GSSDDAIRSI
LQQARREMEA QQAALDPALK QTPLSQTDIA ILTPKLISTS PISSGYSPLA ISLKKPPAAP
DSSASALPNP PALKKEAQDT PGLDLQGAAD PAQGVLRHVK NELGRSGVWK DHWWSTVQPE
RKSAVPPEEP KGEEASGGKE KGGGSQTRAE RGQLQGPSSS EYWKEWPSAE SPYSQSSELS
LTGASRSETP QNSPLPSSPI VPLSKPAKPS VPPLTPEQYE IYMYQEVDTI ELTRQVKEKL
AKNGICQRIF GEKVLGLSQG SVSDMLSRPK PWSKLTQKGR EPFIRMQLWL NGELGQGVLP
VQGQQQGPVL HSVTSLQDPL QQGCVSSEST PKTSASCSPA PESPMSSSES VKSLTELVQQ
PCPPIETSKD GKPPEPSDPP ASDSQPTTPL PLSGHSALSI QELVAMSPEL DTYGITKRVK
EVLTDNNLGQ RLFGETILGL TQGSVSDLLA RPKPWHKLSL KGREPFVRMQ LWLNDPNNVE
KLMDMKRMEK KAYMKRRHSS VSDSQPCEPP SVGIDYSQGA SPQPQHQLKK PRVVLAPEEK
EALKRAYQQK PYPSPKTIEE LATQLNLKTS TVINWFHNYR SRIRRELFIE EIQAGSQGQA
GASDSPSARS GRAAPGSEGD SCDGVEAAEG PGAADAEESG GPAAAAKSQG GPAEAAAAPE
EQEEAPRPAE KAQPPPSGPR PPRTTPTARA GPGRHRRHRP RAPRTARGPC QTPPPRPRPG
RTPLPQPRRR RRARAGPGTA PTGVPLCQAP ARPRPPAGPA RCRASSASRR RRAPGTRETT
P
//