ID A0A061ECG6_THECC Unreviewed; 1670 AA.
AC A0A061ECG6;
DT 03-SEP-2014, integrated into UniProtKB/TrEMBL.
DT 03-SEP-2014, sequence version 1.
DT 27-MAR-2024, entry version 55.
DE SubName: Full=DNA-repair protein UVH3, putative isoform 1 {ECO:0000313|EMBL:EOY02333.1};
GN ORFNames=TCM_016845 {ECO:0000313|EMBL:EOY02333.1};
OS Theobroma cacao (Cacao) (Cocoa).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma.
OX NCBI_TaxID=3641 {ECO:0000313|EMBL:EOY02333.1, ECO:0000313|Proteomes:UP000026915};
RN [1] {ECO:0000313|EMBL:EOY02333.1, ECO:0000313|Proteomes:UP000026915}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Matina 1-6 {ECO:0000313|Proteomes:UP000026915};
RX PubMed=23731509; DOI=10.1186/gb-2013-14-6-r53;
RA Motamayor J.C., Mockaitis K., Schmutz J., Haiminen N., Iii D.L.,
RA Cornejo O., Findley S.D., Zheng P., Utro F., Royaert S., Saski C.,
RA Jenkins J., Podicheti R., Zhao M., Scheffler B.E., Stack J.C., Feltus F.A.,
RA Mustiga G.M., Amores F., Phillips W., Marelli J.P., May G.D., Shapiro H.,
RA Ma J., Bustamante C.D., Schnell R.J., Main D., Gilbert D., Parida L.,
RA Kuhn D.N.;
RT "The genome sequence of the most widely cultivated cacao type and its use
RT to identify candidate genes regulating pod color.";
RL Genome Biol. 14:R53.1-R53.24(2013).
CC -!- COFACTOR:
CC Name=Mg(2+); Xref=ChEBI:CHEBI:18420;
CC Evidence={ECO:0000256|ARBA:ARBA00001946};
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|ARBA:ARBA00004123}.
CC -!- SIMILARITY: Belongs to the XPG/RAD2 endonuclease family. XPG subfamily.
CC {ECO:0000256|ARBA:ARBA00005283}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CM001882; EOY02333.1; -; Genomic_DNA.
DR STRING; 3641.A0A061ECG6; -.
DR EnsemblPlants; EOY02333; EOY02333; TCM_016845.
DR Gramene; EOY02333; EOY02333; TCM_016845.
DR eggNOG; KOG2520; Eukaryota.
DR InParanoid; A0A061ECG6; -.
DR OMA; PNSMDFS; -.
DR Proteomes; UP000026915; Chromosome 4.
DR GO; GO:0005634; C:nucleus; IBA:GO_Central.
DR GO; GO:0004520; F:DNA endonuclease activity; IBA:GO_Central.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR GO; GO:0003697; F:single-stranded DNA binding; IBA:GO_Central.
DR GO; GO:0016740; F:transferase activity; IEA:UniProtKB-KW.
DR GO; GO:0006289; P:nucleotide-excision repair; IEA:InterPro.
DR CDD; cd09904; H3TH_XPG; 1.
DR CDD; cd09868; PIN_XPG_RAD2; 2.
DR Gene3D; 6.10.250.1630; -; 1.
DR Gene3D; 1.10.150.20; 5' to 3' exonuclease, C-terminal subdomain; 1.
DR Gene3D; 3.40.50.1010; 5'-nuclease; 2.
DR InterPro; IPR036279; 5-3_exonuclease_C_sf.
DR InterPro; IPR008918; HhH2.
DR InterPro; IPR025527; HUWE1/Rev1_UBM.
DR InterPro; IPR029060; PIN-like_dom_sf.
DR InterPro; IPR006086; XPG-I_dom.
DR InterPro; IPR006084; XPG/Rad2.
DR InterPro; IPR001044; XPG/Rad2_eukaryotes.
DR InterPro; IPR019974; XPG_CS.
DR InterPro; IPR006085; XPG_DNA_repair_N.
DR PANTHER; PTHR16171:SF7; DNA EXCISION REPAIR PROTEIN ERCC-5; 1.
DR PANTHER; PTHR16171; DNA REPAIR PROTEIN COMPLEMENTING XP-G CELLS-RELATED; 1.
DR Pfam; PF14377; UBM; 2.
DR Pfam; PF00867; XPG_I; 1.
DR Pfam; PF00752; XPG_N; 1.
DR PRINTS; PR00853; XPGRADSUPER.
DR PRINTS; PR00066; XRODRMPGMNTG.
DR SMART; SM00279; HhH2; 1.
DR SMART; SM00484; XPGI; 1.
DR SMART; SM00485; XPGN; 1.
DR SUPFAM; SSF47807; 5' to 3' exonuclease, C-terminal subdomain; 1.
DR SUPFAM; SSF88723; PIN domain-like; 1.
DR PROSITE; PS00841; XPG_1; 1.
DR PROSITE; PS00842; XPG_2; 1.
PE 3: Inferred from homology;
KW DNA damage {ECO:0000256|ARBA:ARBA00022763};
KW DNA repair {ECO:0000256|ARBA:ARBA00023204};
KW Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW Magnesium {ECO:0000256|ARBA:ARBA00022842};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Nuclease {ECO:0000256|ARBA:ARBA00022722};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242};
KW Reference proteome {ECO:0000313|Proteomes:UP000026915};
KW Transferase {ECO:0000256|ARBA:ARBA00022679}.
FT DOMAIN 1..98
FT /note="XPG N-terminal"
FT /evidence="ECO:0000259|SMART:SM00485"
FT DOMAIN 1010..1079
FT /note="XPG-I"
FT /evidence="ECO:0000259|SMART:SM00484"
FT REGION 125..146
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 221..251
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 409..430
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 499..532
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 584..615
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 629..668
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 711..747
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 764..801
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 815..840
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1318..1670
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 125..145
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 228..242
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 409..424
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 503..532
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 820..840
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1336..1350
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1351..1365
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1395..1417
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1425..1439
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1442..1456
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1498..1536
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1598..1652
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1670 AA; 186337 MW; 9ED467017A5D19B3 CRC64;
MGVHGLWELL APVGRRVSVE TLAGKKLAID ASIWMVQFMK AMRDEKGEMV RNAHLLGFFR
RICKLLYLKT KPVFVFDGAT PVLKRRTVIA RRRQRENSQA KIRKTAEKLL LNHLKQMRLK
ELAKDLEDQR KKQKNNAKGR KVSSDKPYDA NIVGCNAVEL TNSDHVNLKE KSEMPIPAED
GGGDENEDEY EEIILPEIDG NIDPDVLAAL PQSMQRQLLS QNNAKDKKIF SNDLDQSNME
RSNAEHDPMA SSSYNQEKLD EMLAASLAAQ EDSNLANNAS TSAAAIHSEE DGDEDEEMIL
PAMHGNVDPA VLAALPPSLQ LDLLVQMREK LMAVNRQKYQ KVKKAPEQFS ELQIQSYLKT
VAFRREIDEV QRAAAGRGVA GVQTSRIASE ANREFIFSSS FTGDKQVLTS ARKERDEDKQ
QEIHSNHPSG FLNSVKSICK SNVVTESVPN EPTSAPDEDV GTYLDERGQV RVSRVRGMGI
RMTRDLQRNL DLMKEIEQER TNSNKDMNVQ SVPDRNRIGT SKNSSSENQF LKTSHDGNCE
SVNLNESNQQ SAFKTEACME ITFEDDGRNK FFDDDDDIFA RLAAGDPVTL PSPENKPSGK
HTSDSDSDCE WEEGMTEGNW DGVAHCMDAK NNPSYKESNI SDESEVEWEE EPSDAPKSSS
GPVESGVMLS KGYLEEEADL QEAIRRSLTD IGAKKSNYFP SEFEKLKKFG KNMDEGFGSP
HGKSSMDGPS FREGKVNQEN KSCQNLDRVQ KLYSVDELSI SEASNFPERL SPIAHSSDRN
GTLSYKPCER SDGPHSEQSR DIASTVLVTT LEREVHLAPG KQSNASNEVD GLSTVSNSWS
KDSSRSLDVV LDDLPGAILV DKKNDSEGEP STLVSEKKSE VETELCSMVE DKKNDLEAKS
LHQSIEIVDS SIPVVQSSVN KATSDIHIEQ ELVGDRTYEN YVNEAEQETD MANVKGNDYA
DVEFTQVSLD EELLILGQEC MNLGDEQRKL ERNAESVSSE MFAECQELLQ MFGLPYIIAP
MEAEAQCAYM ELTNIVDGVV TDDSDVFLFG ARSVYKNIFD DRKYVETYFM QDIEKEIGLT
REKLMRMALL LGSDYTEGVS GIGIVNAIEV VNAFPEEDGL HKFREWIESP DPAILGKLNV
QEGSSARKRG SKFSDKDVIS AKTSMRDSGS PIEGLSSFDQ NISQADKNTQ STDCIDDIKQ
IFMDKHRNVS KNWHIPSSFP SEAVISEYCS PQVDKSTEPF TWGRPDLFVL RKLCWDKFGW
GSQKSDDLLL PVLREYEKRE TQLRLEAFYT FNERFAKIRS KRIKKAVKGI TGNQSSELID
DAMQQVSKSR KRRRVSPVKS GDDKSGEPSN WKEDIVSQRQ SKSMEKSVPK PSRKRPPQTS
PGKSTPEQPP RAARRRKTNK QSPGIGRRKG HGARRRRRKA SPDFEQSETS SSGGNSGNDY
QEVDGEKLDR PQQVRRSMRT RNPVNYNVND LEDEVGLSNK ESSCEEAMEQ EAADDLNEEN
PSEARDPTFE EDFSRDYLER GGGFCMDEKE VGHPDESQGV DPTPEAEASK DYLKMGGGFC
IDENETSKDP DAACDQDPVA ATDSSNGVAF TDKADDNAAS AEPSSSPKRS LDGLQNASFT
ELNLGHQNAA NEDDSKGSAP PQETTVNDTV TAFVGGLSAM PTLKRKRRKR
//