ID A0A0D9R996_CHLSB Unreviewed; 2426 AA.
AC A0A0D9R996;
DT 27-MAY-2015, integrated into UniProtKB/TrEMBL.
DT 27-MAY-2015, sequence version 1.
DT 27-MAR-2024, entry version 54.
DE SubName: Full=Nuclear receptor binding SET domain protein 1 {ECO:0000313|Ensembl:ENSCSAP00000005185.1};
GN Name=NSD1 {ECO:0000313|Ensembl:ENSCSAP00000005185.1};
OS Chlorocebus sabaeus (Green monkey) (Cercopithecus sabaeus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
OC Cercopithecidae; Cercopithecinae; Chlorocebus.
OX NCBI_TaxID=60711 {ECO:0000313|Ensembl:ENSCSAP00000005185.1, ECO:0000313|Proteomes:UP000029965};
RN [1] {ECO:0000313|Ensembl:ENSCSAP00000005185.1, ECO:0000313|Proteomes:UP000029965}
RP NUCLEOTIDE SEQUENCE.
RA Warren W., Wilson R.K.;
RL Submitted (MAR-2014) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSCSAP00000005185.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|ARBA:ARBA00004123}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AQIB01152928; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152929; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152930; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152931; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152932; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152933; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152934; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AQIB01152935; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; XP_008013603.1; XM_008015412.1.
DR RefSeq; XP_008013604.1; XM_008015413.1.
DR STRING; 60711.ENSCSAP00000005185; -.
DR Ensembl; ENSCSAT00000006994.1; ENSCSAP00000005185.1; ENSCSAG00000008928.1.
DR GeneID; 103245045; -.
DR CTD; 64324; -.
DR eggNOG; KOG1081; Eukaryota.
DR GeneTree; ENSGT00940000155027; -.
DR OMA; CTKTAES; -.
DR OrthoDB; 950362at2759; -.
DR BioGRID-ORCS; 103245045; 2 hits in 9 CRISPR screens.
DR Proteomes; UP000029965; Chromosome 23.
DR Bgee; ENSCSAG00000008928; Expressed in caudate nucleus and 7 other cell types or tissues.
DR GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR GO; GO:0046975; F:histone H3K36 methyltransferase activity; IEA:Ensembl.
DR GO; GO:0050681; F:nuclear androgen receptor binding; IEA:Ensembl.
DR GO; GO:0000978; F:RNA polymerase II cis-regulatory region sequence-specific DNA binding; IEA:Ensembl.
DR GO; GO:0003712; F:transcription coregulator activity; IEA:Ensembl.
DR GO; GO:0008270; F:zinc ion binding; IEA:Ensembl.
DR GO; GO:0045893; P:positive regulation of DNA-templated transcription; IEA:Ensembl.
DR GO; GO:0033135; P:regulation of peptidyl-serine phosphorylation; IEA:Ensembl.
DR CDD; cd15648; PHD1_NSD1_2; 1.
DR CDD; cd15650; PHD2_NSD1; 1.
DR CDD; cd15653; PHD3_NSD1; 1.
DR CDD; cd15656; PHD4_NSD1; 1.
DR CDD; cd15659; PHD5_NSD1; 1.
DR CDD; cd20161; PWWP_NSD1_rpt1; 1.
DR CDD; cd20164; PWWP_NSD1_rpt2; 1.
DR CDD; cd19210; SET_NSD1; 1.
DR Gene3D; 2.30.30.140; -; 2.
DR Gene3D; 2.170.270.10; SET domain; 1.
DR Gene3D; 3.30.40.10; Zinc/RING finger domain, C3HC4 (zinc finger); 4.
DR InterPro; IPR006560; AWS_dom.
DR InterPro; IPR041306; C5HCH.
DR InterPro; IPR047426; PHD1_NSD1_2.
DR InterPro; IPR047428; PHD2_NSD1.
DR InterPro; IPR047429; PHD3_NSD1.
DR InterPro; IPR047430; PHD4_NSD1.
DR InterPro; IPR047432; PHD5_NSD1.
DR InterPro; IPR003616; Post-SET_dom.
DR InterPro; IPR000313; PWWP_dom.
DR InterPro; IPR047423; PWWP_NSD1_rpt2.
DR InterPro; IPR001214; SET_dom.
DR InterPro; IPR046341; SET_dom_sf.
DR InterPro; IPR047433; SET_NSD1.
DR InterPro; IPR019786; Zinc_finger_PHD-type_CS.
DR InterPro; IPR011011; Znf_FYVE_PHD.
DR InterPro; IPR001965; Znf_PHD.
DR InterPro; IPR019787; Znf_PHD-finger.
DR InterPro; IPR013083; Znf_RING/FYVE/PHD.
DR PANTHER; PTHR22884:SF312; HISTONE-LYSINE N-METHYLTRANSFERASE, H3 LYSINE-36 SPECIFIC; 1.
DR PANTHER; PTHR22884; SET DOMAIN PROTEINS; 1.
DR Pfam; PF17907; AWS; 1.
DR Pfam; PF17982; C5HCH; 1.
DR Pfam; PF00628; PHD; 1.
DR Pfam; PF00855; PWWP; 2.
DR Pfam; PF00856; SET; 1.
DR SMART; SM00570; AWS; 1.
DR SMART; SM00249; PHD; 5.
DR SMART; SM00508; PostSET; 1.
DR SMART; SM00293; PWWP; 2.
DR SMART; SM00317; SET; 1.
DR SUPFAM; SSF57903; FYVE/PHD zinc finger; 3.
DR SUPFAM; SSF82199; SET domain; 1.
DR SUPFAM; SSF63748; Tudor/PWWP/MBT; 2.
DR PROSITE; PS51215; AWS; 1.
DR PROSITE; PS50868; POST_SET; 1.
DR PROSITE; PS50812; PWWP; 2.
DR PROSITE; PS50280; SET; 1.
DR PROSITE; PS01359; ZF_PHD_1; 1.
DR PROSITE; PS50016; ZF_PHD_2; 2.
PE 4: Predicted;
KW Chromatin regulator {ECO:0000256|ARBA:ARBA00022853};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242};
KW Reference proteome {ECO:0000313|Proteomes:UP000029965};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW S-adenosyl-L-methionine {ECO:0000256|ARBA:ARBA00022691};
KW Transferase {ECO:0000256|ARBA:ARBA00022679};
KW Zinc {ECO:0000256|ARBA:ARBA00022833};
KW Zinc-finger {ECO:0000256|ARBA:ARBA00022771, ECO:0000256|PROSITE-
KW ProRule:PRU00146}.
FT DOMAIN 54..119
FT /note="PWWP"
FT /evidence="ECO:0000259|PROSITE:PS50812"
FT DOMAIN 1273..1319
FT /note="PHD-type"
FT /evidence="ECO:0000259|PROSITE:PS50016"
FT DOMAIN 1437..1481
FT /note="PHD-type"
FT /evidence="ECO:0000259|PROSITE:PS50016"
FT DOMAIN 1486..1548
FT /note="PWWP"
FT /evidence="ECO:0000259|PROSITE:PS50812"
FT DOMAIN 1620..1670
FT /note="AWS"
FT /evidence="ECO:0000259|PROSITE:PS51215"
FT DOMAIN 1672..1789
FT /note="SET"
FT /evidence="ECO:0000259|PROSITE:PS50280"
FT DOMAIN 1796..1812
FT /note="Post-SET"
FT /evidence="ECO:0000259|PROSITE:PS50868"
FT REGION 1..39
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 218..244
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 602..625
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 666..762
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 797..818
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 842..867
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 923..947
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 973..1002
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1032..1074
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1112..1158
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1209..1264
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1821..1841
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1943..2153
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2194..2254
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2283..2303
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2325..2355
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2394..2426
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 9..39
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 605..625
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 666..718
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1032..1048
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1130..1144
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1212..1230
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1240..1258
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1947..1962
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1983..1998
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2010..2024
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2031..2045
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2061..2079
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2286..2302
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2405..2420
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2426 AA; 267467 MW; 89F186DD3951EF3A CRC64;
MPLKTRTALS DDPDSSTSTL GNMLELPGTS SSSTSQELPF CQAKKKSTPL KYEVGDLIWA
KFKRRPWWPC RICSDPLINT HSKMKVSNRR PYRQYYVEAF GDPSERAWVA GKAIVMFEGR
HQFEELPVLR RRGKQKEKGY RHKVPQKILS KWEASVGLAE QYDVPKGSKN RKCIPGSIKL
DSEEDMPFED CTNDPESEHD LLLNGCLKSL AFDSEHSADE KEKPCAKSRA RKSSDNPKRT
SVKKGHIQFE AHKDERRGKI PENLGLNFIS GDISDTQASN ELSRIANSLT GSNTAPGSFL
FSSCGKNTAK KEFETSNGDA LLGLPEGALI SKCSREKNKP QRSLVCGSKV KLCYIGAGDE
EKRSDSISIC TTSDDGSSDL DPIEHSSESD NSVLEITDAF DRTENMLSMQ KNEKIKYSRF
AATNTRVKAK QKPLISNSHT DHLMGCTKSA EPGTETSQVN LSDLKASTLV HKPQSDFTSD
DLSPKFNMSS SISSENSLIK GGPVNQALLH SKSKQPKFRS IKCKHKENPV MVEPPVTNDE
YSLKCCSSDT KGSPLASISK SGKVDGLKLL NNMHEKTRDS SDIETAVVKH VLSELKELSY
RSLGEDVSDS GTSKPSKPLL FSSPSQNHIP IEPDYKFSTL LMMLKDMHDS KTKEQRLMTA
QNLVSYRSPG RGDCSTNSPV GVSKVLVSGG STHNSEKKGN GTQNSANPSP SGGDSALSGE
LSASLPGLVS DKRDLPACGK SRSNCVTRRN CGRSKPSSKL RDAFSAQVVK NTVNRKALKT
ERKRKLNQLS SVTLDAALQG DREHGGSLRG GAEDPSKEEP LQIMGHLTSE DGDHFSDVHF
DNKVKQSDPG KISEKGPSFE NGKGPELDSV MNSENDELNG VNQVVPKKRW QRLNQRRTKP
RKRMNRFKEK ENSECAFGVL LPSDPVQEGR DEFPEHRTSS ASILEEPLTD QKHADCLDSV
GPRLNVCDKS SASIGDMEKE PGIPSLTPQA ELPEPAVRSE KKRLRKPSKW LLEYTEEYDQ
IFAPKKKQKK VQEQVHKVSS RCEEESLLAR GRSSAQNKQV DENSLISTKE EPPVLEREAP
FLEGPLAQSE LGGGHAELPQ LTLSVPVAPE VSPRPALESE ELLVKTPGNY ESKRQRKPTK
KLLESNDLDP GFMPKKGDLG LSKKCYEAGH LENGITESCA TSYSKDFGGG TSKIFDRPRK
RKRQRHAAAK MQCKKVKNDD SSKEIPSLEG ELMPHRTAAS PKETVEEGVE HDSGMPASKK
MQGERGGGAA LKENVCQNCE KLGELLLCEA QCCGAFHLEC LGLTEMPRGK FICNECRTGI
HTCFVCKQSG EDVKRCLLPL CGKFYHEECV QKYPPTVMQN KGFRCSLHIC ITCHAANPAN
VSASKGRLMR CVRCPVAYHA NDFCLAAGSK ILASNSIICP NHFTPRRGCR NHEHVNVSWC
FVCSEGGSLL CCDSCPAAFH RECLNIDIPE GNWYCNDCKA GKKPHYREIV WVKVGRYRWW
PAEICHPRAV PSNIDKMRHD VGEFPVLFFG SNDYLWTHQA RVFPYMEGDV SSKDKMGKGV
DGTYKKALQE AAARFEELKA QKELRQLQED RKNDKKPPPY KHIKVNRPIG RVQIFTADLS
EIPRCNCKAT DENPCGIDSE CINRMLLYEC HPTVCPAGGR CQNQCFSKRQ YPEVEIFRTL
QRGWGLRTKT DIKKGEFVNE YVGELIDEEE CRARIRYAQE HDITNFYMLT LDKDRIIDAG
PKGNYARFMN HCCQPNCETQ KWSVNGDTRV GLFALSDIKA GTELTFNYNL ECLGNGKTVC
KCGAPNCSGF LGVRPKNQPI ATEEKSKKFK KKQQGKRRTQ GEITKEREDE CFSCGDAGQL
VSCKKPGCPK VYHADCLNLT KRPAGKWECP WHQCDICGKE AASFCEMCPS SFCKQHREGM
LFISKLDGRL SCTEHDPCGP NPLEPGEIRE YVPPPVPLPP GPSTHLAEQS TGMAAQAPKM
SDKPPADTNQ TLSLSKKALA GTCQRPLLPE RPLERTDSRS QPLDKVRDLA GSGTKSQSLV
SSQRPLDRQP AVAGPRPQLS DKPSPVTSPS SSPSVRSQPL ERPLGTADPR LDKSIGAASP
RPQSLEKTPV PTGLRLPPPD RLLITSSPKP QTSDRPPDKP HASLSQRLPP PEKVLSAVVQ
TLVAKEKALR PVDQNTQSKN RAALVMDLID LTPRQKERAA SPHEVTPQAD EKMPVLESSS
WPASKGLGHM PRAVEKGSVS DPLQTSGKVA AHSEDPWQAV KSFTQARLLS QPPAKAFLYE
PTTQASGRAP AGTEQTPGPL SQVPGLVKQA KQMVGGQQLP ALAARSGQSF RSLGKAPASL
PTEEKKLVTT EQSPWALGKA SSRAGLWPIV AGQTLAQSCW SPGSTQTLAQ TCWSLGRGQD
PKPEQNTLPA LNQAPSSHKC AESEQK
//