ID G1SIA5_RABIT Unreviewed; 2700 AA.
AC G1SIA5;
DT 19-OCT-2011, integrated into UniProtKB/TrEMBL.
DT 19-OCT-2011, sequence version 1.
DT 27-MAR-2024, entry version 85.
DE SubName: Full=Nuclear receptor binding SET domain protein 1 {ECO:0000313|Ensembl:ENSOCUP00000002408.2};
GN Name=NSD1 {ECO:0000313|Ensembl:ENSOCUP00000002408.2};
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Lagomorpha; Leporidae; Oryctolagus.
OX NCBI_TaxID=9986 {ECO:0000313|Ensembl:ENSOCUP00000002408.2, ECO:0000313|Proteomes:UP000001811};
RN [1] {ECO:0000313|Ensembl:ENSOCUP00000002408.2, ECO:0000313|Proteomes:UP000001811}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Thorbecke inbred {ECO:0000313|Ensembl:ENSOCUP00000002408.2,
RC ECO:0000313|Proteomes:UP000001811};
RX PubMed=21993624; DOI=10.1038/nature10530;
RA Lindblad-Toh K., Garber M., Zuk O., Lin M.F., Parker B.J., Washietl S.,
RA Kheradpour P., Ernst J., Jordan G., Mauceli E., Ward L.D., Lowe C.B.,
RA Holloway A.K., Clamp M., Gnerre S., Alfoldi J., Beal K., Chang J.,
RA Clawson H., Cuff J., Di Palma F., Fitzgerald S., Flicek P., Guttman M.,
RA Hubisz M.J., Jaffe D.B., Jungreis I., Kent W.J., Kostka D., Lara M.,
RA Martins A.L., Massingham T., Moltke I., Raney B.J., Rasmussen M.D.,
RA Robinson J., Stark A., Vilella A.J., Wen J., Xie X., Zody M.C., Baldwin J.,
RA Bloom T., Chin C.W., Heiman D., Nicol R., Nusbaum C., Young S.,
RA Wilkinson J., Worley K.C., Kovar C.L., Muzny D.M., Gibbs R.A., Cree A.,
RA Dihn H.H., Fowler G., Jhangiani S., Joshi V., Lee S., Lewis L.R.,
RA Nazareth L.V., Okwuonu G., Santibanez J., Warren W.C., Mardis E.R.,
RA Weinstock G.M., Wilson R.K., Delehaunty K., Dooling D., Fronik C.,
RA Fulton L., Fulton B., Graves T., Minx P., Sodergren E., Birney E.,
RA Margulies E.H., Herrero J., Green E.D., Haussler D., Siepel A., Goldman N.,
RA Pollard K.S., Pedersen J.S., Lander E.S., Kellis M.;
RT "A high-resolution map of human evolutionary constraint using 29 mammals.";
RL Nature 478:476-482(2011).
RN [2] {ECO:0000313|Ensembl:ENSOCUP00000002408.2}
RP IDENTIFICATION.
RC STRAIN=Thorbecke {ECO:0000313|Ensembl:ENSOCUP00000002408.2};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|ARBA:ARBA00004123}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AAGW02066605; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066606; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066607; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066608; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066609; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066610; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066611; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AAGW02066612; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; XP_002710468.1; XM_002710422.3.
DR RefSeq; XP_008253681.1; XM_008255459.2.
DR RefSeq; XP_008253682.1; XM_008255460.2.
DR RefSeq; XP_008253683.1; XM_008255461.2.
DR SMR; G1SIA5; -.
DR STRING; 9986.ENSOCUP00000002408; -.
DR PaxDb; 9986-ENSOCUP00000002408; -.
DR Ensembl; ENSOCUT00000002776.3; ENSOCUP00000002408.2; ENSOCUG00000002775.3.
DR GeneID; 100355926; -.
DR KEGG; ocu:100355926; -.
DR CTD; 64324; -.
DR eggNOG; KOG1081; Eukaryota.
DR GeneTree; ENSGT00940000155027; -.
DR HOGENOM; CLU_000756_0_0_1; -.
DR InParanoid; G1SIA5; -.
DR OMA; CTKTAES; -.
DR OrthoDB; 950362at2759; -.
DR TreeFam; TF329088; -.
DR Proteomes; UP000001811; Chromosome 3.
DR Bgee; ENSOCUG00000002775; Expressed in upper lobe of left lung and 16 other cell types or tissues.
DR GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR GO; GO:0046975; F:histone H3K36 methyltransferase activity; IEA:Ensembl.
DR GO; GO:0050681; F:nuclear androgen receptor binding; IEA:Ensembl.
DR GO; GO:0000978; F:RNA polymerase II cis-regulatory region sequence-specific DNA binding; IEA:Ensembl.
DR GO; GO:0003712; F:transcription coregulator activity; IEA:Ensembl.
DR GO; GO:0008270; F:zinc ion binding; IEA:Ensembl.
DR GO; GO:0045893; P:positive regulation of DNA-templated transcription; IEA:Ensembl.
DR GO; GO:0033135; P:regulation of peptidyl-serine phosphorylation; IEA:Ensembl.
DR CDD; cd15648; PHD1_NSD1_2; 1.
DR CDD; cd15650; PHD2_NSD1; 1.
DR CDD; cd15653; PHD3_NSD1; 1.
DR CDD; cd15656; PHD4_NSD1; 1.
DR CDD; cd15659; PHD5_NSD1; 1.
DR CDD; cd20161; PWWP_NSD1_rpt1; 1.
DR CDD; cd20164; PWWP_NSD1_rpt2; 1.
DR CDD; cd19210; SET_NSD1; 1.
DR Gene3D; 2.30.30.140; -; 2.
DR Gene3D; 2.170.270.10; SET domain; 1.
DR Gene3D; 3.30.40.10; Zinc/RING finger domain, C3HC4 (zinc finger); 4.
DR InterPro; IPR006560; AWS_dom.
DR InterPro; IPR041306; C5HCH.
DR InterPro; IPR047426; PHD1_NSD1_2.
DR InterPro; IPR047428; PHD2_NSD1.
DR InterPro; IPR047429; PHD3_NSD1.
DR InterPro; IPR047430; PHD4_NSD1.
DR InterPro; IPR047432; PHD5_NSD1.
DR InterPro; IPR003616; Post-SET_dom.
DR InterPro; IPR000313; PWWP_dom.
DR InterPro; IPR047423; PWWP_NSD1_rpt2.
DR InterPro; IPR001214; SET_dom.
DR InterPro; IPR046341; SET_dom_sf.
DR InterPro; IPR047433; SET_NSD1.
DR InterPro; IPR019786; Zinc_finger_PHD-type_CS.
DR InterPro; IPR011011; Znf_FYVE_PHD.
DR InterPro; IPR001965; Znf_PHD.
DR InterPro; IPR019787; Znf_PHD-finger.
DR InterPro; IPR013083; Znf_RING/FYVE/PHD.
DR PANTHER; PTHR22884:SF312; HISTONE-LYSINE N-METHYLTRANSFERASE, H3 LYSINE-36 SPECIFIC; 1.
DR PANTHER; PTHR22884; SET DOMAIN PROTEINS; 1.
DR Pfam; PF17907; AWS; 1.
DR Pfam; PF17982; C5HCH; 1.
DR Pfam; PF00628; PHD; 1.
DR Pfam; PF00855; PWWP; 2.
DR Pfam; PF00856; SET; 1.
DR SMART; SM00570; AWS; 1.
DR SMART; SM00249; PHD; 5.
DR SMART; SM00508; PostSET; 1.
DR SMART; SM00293; PWWP; 2.
DR SMART; SM00317; SET; 1.
DR SUPFAM; SSF57903; FYVE/PHD zinc finger; 3.
DR SUPFAM; SSF82199; SET domain; 1.
DR SUPFAM; SSF63748; Tudor/PWWP/MBT; 2.
DR PROSITE; PS51215; AWS; 1.
DR PROSITE; PS50868; POST_SET; 1.
DR PROSITE; PS50812; PWWP; 2.
DR PROSITE; PS50280; SET; 1.
DR PROSITE; PS01359; ZF_PHD_1; 1.
DR PROSITE; PS50016; ZF_PHD_2; 2.
PE 4: Predicted;
KW Chromatin regulator {ECO:0000256|ARBA:ARBA00022853};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242};
KW Reference proteome {ECO:0000313|Proteomes:UP000001811};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW S-adenosyl-L-methionine {ECO:0000256|ARBA:ARBA00022691};
KW Transferase {ECO:0000256|ARBA:ARBA00022679};
KW Zinc {ECO:0000256|ARBA:ARBA00022833};
KW Zinc-finger {ECO:0000256|ARBA:ARBA00022771, ECO:0000256|PROSITE-
KW ProRule:PRU00146}.
FT DOMAIN 323..388
FT /note="PWWP"
FT /evidence="ECO:0000259|PROSITE:PS50812"
FT DOMAIN 1546..1592
FT /note="PHD-type"
FT /evidence="ECO:0000259|PROSITE:PS50016"
FT DOMAIN 1710..1754
FT /note="PHD-type"
FT /evidence="ECO:0000259|PROSITE:PS50016"
FT DOMAIN 1759..1821
FT /note="PWWP"
FT /evidence="ECO:0000259|PROSITE:PS50812"
FT DOMAIN 1893..1943
FT /note="AWS"
FT /evidence="ECO:0000259|PROSITE:PS51215"
FT DOMAIN 1945..2062
FT /note="SET"
FT /evidence="ECO:0000259|PROSITE:PS50280"
FT DOMAIN 2069..2085
FT /note="Post-SET"
FT /evidence="ECO:0000259|PROSITE:PS50868"
FT REGION 280..311
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 490..522
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 634..658
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 872..892
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 961..1275
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1324..1347
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1399..1430
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1479..1515
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2094..2115
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2216..2425
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2468..2487
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2510..2529
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2558..2579
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2668..2700
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 873..892
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 961..988
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1029..1043
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1069..1083
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1108..1142
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1162..1176
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1403..1417
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2220..2235
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2242..2270
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2283..2297
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2679..2694
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2700 AA; 297021 MW; F5FCBACC979AFFED CRC64;
MDQTCELPRR NCLLPLCNPV NLDAPEDKDS PFGNGQSNFS EPLNGCTMQL STASGTSPSA
YGQDSPSCYI PLRRLQDLAS MINVEYLNGS ADGSESFQDP EKSDSRAQSP VVCASLRPGG
PTALAMKQEP CCNNSPELQV KVTKTIKNGL VHFENCTCVD DADVESEMDP EQPVTEDECI
EEIFEETQTN ATCNYEPKSE NGAKVAMGSE QDSTPESRHG AVKSPFLLLA PQTETQNNKQ
RSEVDGSNEK AALLPTPFSL GDANGTLEEQ LNSINLSFQD DPDSSTSTLG NMLELPGTSS
SSTSQELPFC QPKKKSTPLK YEVGDLIWAK FKRRPWWPCK ICSDPLINTH SKMKVSNRRP
YRQYYVEAFG DPSERAWVAG KAIVMFEGRH QFEELPVLRR RGKQKEKGYR HKVPQKILSK
WEASVGLAEQ YDVPKGSKNR NCVSTSIKLD SEEDMPFEDC TNDPDSEHDL LLNGCLKSLA
FDSEHSADEK EKSCAKSRAR KNCDNPKRTS VKKGHMQFEA HKEERRGKIS ENLGLNFISG
DVSDKQASNE LSRIANSLTG SNTAPGSFLF SSCGQNTAKK EFETSNCDSL LGLSEGTLIS
KCSGEKKKPQ RGLVCSSKVQ LCYIGTGDEE KRSDSISICT TSDDGSSDLD PVEHSSESDN
SILEITDTFD RTENILSMQK NEKIKYSRYP ATNSRVKPKQ KSLITNSHTD HLMNCTKTTE
LGTEMSQVNL SDLTVSTLVH KPQSDFKNDS LAPKFNTPSA ISSENSLVTG GATNQTLLHS
KSKPPKFRSI KCKHKENPLI VEPSVPNEDC SLKCCSSDTK GSPLASISKS GKMDGLKLLS
NMHEKTRDSS DIETAVVKHV LSELKELSYR SLSEDVSDSG TSKPSKPLLF SSASNQNHIP
IEPDYKFSTL LMMLKDMHDS KTKEQRLMTA QNLISYRSPS RGDCSTSSPV GASKILVSGS
FTHNSEKSGD VTQDSARPSP SGGDSAPSVE LSASLPGLVS DKRDLSVSVK SRSNCVTRRN
CGRSKPSKLR DAFSTQMGKN TVNRKALKTE RKRKPSQLPA VTLEVPLQGD KESGSSVSGS
SRDGAEDSGK ESSQQTGHLT SEDAIQFSDV HFDNKVKQSD PDKIPEKEPT FENRKDPELN
SEMNSENDEP NGVNQVVPKK RWQRLNQRRT KPRKRTNRSR EKENSEDAFG VLLPGDPVQK
GRDEFPEHRT PPPTNIVEDS VTDPNHAGCL DSVGPQLNVC DKSAASNEDM EKEPGIPSLT
PQAELPEPVV RSEKKRLRKP SKWLLEYTEE YDQIFAPKKK QKKVQEQVHK VSSRCEEEGL
LARCQSSAQN KQVDENSLIS TKEEPPVLER EAPFLEGPLA QSELGGGNAE LPQLTLSVPV
APEVSPRPAL ESEELLVKTP GNYESKRQRK PTKKLLESND LDPGFMPKKG DLGLTKKCYE
AGCLENGITE SCAVSRSKEF GGGTTKIFDK PRKRKRQRHV AAKVQCKKVK NDDSSKGMPG
SEGELMAHRT TASPKEAVEE GVEHDPGMPA SKKMQGERGG GAALKENVCQ NCEKLGELLL
CEAQCCGAFH LECLGLTEMP RGKFICNECR TGIHTCFVCK QSGEDVKRCL LPLCGKFYHE
ECVQKYPPTV MQNKGFRCSL HICITCHAAN PASVSASKGR LMRCVRCPVA YHANDFCLAA
GSKILASNSI ICPNHFTPRR GCRNHEHVNV SWCFVCSEGG SLLCCDSCPA AFHRECLNID
IPEGNWYCND CKAGKKPHYR EIVWVKVGRY RWWPAEICHP RAVPSNIDKM RHDVGEFPVL
FFGSNDYLWT HQARVFPYME GDVSSKDKMG KGVDGTYKKA LQEAAARFEE LKAQKELRQL
QEDRKNDKKP PPYKHIKVNR PIGRVQIFTA DLSEIPRCNC KATDENPCGI DSECINRMLL
YECHPTVCPA GGRCQNQCFT KRQYPEVEIF RTLQRGWGLR TKTDIKKGEF VNEYVGELID
EEECRARIRY AQEHDITNFY MLTLDKDRII DAGPKGNYAR FMNHCCQPNC ETQKWSVNGD
TRVGLFALSD IKAGTELTFN YNLECLGNGK TVCKCGAPNC SGFLGVRPKN QPIATEEKSK
KFKKKQPGKR RSQGEITKER EDECFSCGDA GQLVSCKKPG CPKVYHADCL NLTKRPAGKW
ECPWHQCDVC GKEAASFCEM CPSSFCKQHR EGMLFISKLD GRLSCTEHDP CGPNPLEPGE
IREYVPPPVP LPPGPGAHLA EQSSGTAAQG PKMSDKPPAD TNQTLPLSKK ALAGTCQRPL
LPERPLERTD SRPQLLDRVR DLAGSGTKSQ PLASSQRPLD RSPPVAGPRP QLSDKPSPVT
GPGSSPSVRP QPLERPLGTT DPRLDKSIGA VSPRPQSLEK TPVPTGLRLL PPDRLLVTSS
PKPQTSERPP DKSHAPLSQR LPPPEKVLSA VVQTLVAKEK ALRPVDQNTQ SKNRAALVMD
LIDLTPRQKE RAASPHEVTP QADEKVPVLE SSSWAASKGL GHMPRVVEKG SMSEPLLQPP
GKTAAPAEHP WQAVKSLTQA RLLSQPPAKA FLYEPATQAS GRAPAGAEQT PGPPSQAPGL
VKQVKQMAGS QQLPGLAAKT GQSFRSLVKT PASLSTEEKK LATPEQSSWA LGKTSAGAGL
WPMVAGQTLM QSCWPAGSTQ TLAQTCWSLG RGQDPKPEQN TLPALNQAPS NHKCAESEQK
//