ID A0A061GGP8_THECC Unreviewed; 709 AA.
AC A0A061GGP8;
DT 03-SEP-2014, integrated into UniProtKB/TrEMBL.
DT 03-SEP-2014, sequence version 1.
DT 24-JAN-2024, entry version 33.
DE SubName: Full=HAT transposon superfamily protein, putative {ECO:0000313|EMBL:EOY26199.1};
GN ORFNames=TCM_027624 {ECO:0000313|EMBL:EOY26199.1};
OS Theobroma cacao (Cacao) (Cocoa).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma.
OX NCBI_TaxID=3641 {ECO:0000313|EMBL:EOY26199.1, ECO:0000313|Proteomes:UP000026915};
RN [1] {ECO:0000313|EMBL:EOY26199.1, ECO:0000313|Proteomes:UP000026915}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Matina 1-6 {ECO:0000313|Proteomes:UP000026915};
RX PubMed=23731509; DOI=10.1186/gb-2013-14-6-r53;
RA Motamayor J.C., Mockaitis K., Schmutz J., Haiminen N., Iii D.L.,
RA Cornejo O., Findley S.D., Zheng P., Utro F., Royaert S., Saski C.,
RA Jenkins J., Podicheti R., Zhao M., Scheffler B.E., Stack J.C., Feltus F.A.,
RA Mustiga G.M., Amores F., Phillips W., Marelli J.P., May G.D., Shapiro H.,
RA Ma J., Bustamante C.D., Schnell R.J., Main D., Gilbert D., Parida L.,
RA Kuhn D.N.;
RT "The genome sequence of the most widely cultivated cacao type and its use
RT to identify candidate genes regulating pod color.";
RL Genome Biol. 14:R53.1-R53.24(2013).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CM001884; EOY26199.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A061GGP8; -.
DR STRING; 3641.A0A061GGP8; -.
DR EnsemblPlants; EOY26199; EOY26199; TCM_027624.
DR Gramene; EOY26199; EOY26199; TCM_027624.
DR eggNOG; ENOG502RE1W; Eukaryota.
DR HOGENOM; CLU_016471_3_1_1; -.
DR InParanoid; A0A061GGP8; -.
DR OMA; SIYNIDR; -.
DR Proteomes; UP000026915; Chromosome 6.
DR GO; GO:0046983; F:protein dimerization activity; IEA:InterPro.
DR InterPro; IPR007021; DUF659.
DR InterPro; IPR008906; HATC_C_dom.
DR InterPro; IPR012337; RNaseH-like_sf.
DR PANTHER; PTHR32166:SF63; HAT TRANSPOSON SUPERFAMILY PROTEIN; 1.
DR PANTHER; PTHR32166; OSJNBA0013A04.12 PROTEIN; 1.
DR Pfam; PF05699; Dimer_Tnp_hAT; 1.
DR Pfam; PF04937; DUF659; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000026915}.
FT DOMAIN 200..352
FT /note="DUF659"
FT /evidence="ECO:0000259|Pfam:PF04937"
FT DOMAIN 568..642
FT /note="HAT C-terminal dimerisation"
FT /evidence="ECO:0000259|Pfam:PF05699"
FT REGION 93..128
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 690..709
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 111..128
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 709 AA; 79914 MW; EDB3DD0B08B0F5E8 CRC64;
MASSEASINV HDHGKAVDGK KQRVQCNYCG KEMSGFFRLK YHLGGVRGDV IPCEMVSEDV
KELFKNMLPE RGGRLSQEVR DLSRQDLPWK RNGCPNSNVA KKMRRQSCKS SGSRSGEDEI
IDSMSEDDVK EPAILPSARI VSQSAVTGDP EEEPSCKQNK RCIGRFFYET GIDLTLVNSP
SFQRMINDTH CPGQTNYKIP SCQELKGWIL KDEVKEMQEY VEKIRQSWAS SGCSILLDGW
IDEKGRNLVS FIVDCPQGPI YLHSSDVSAS VDDVDALQLL FDRVIDDVGV ENVVQIIAFS
TEGWVGAVGK QFMGRSKTVF WTVNASHCIE LMLDKIAMMG EIRGTLENAR TISKFIHGHL
TVLNLLRDYT DGHDLIKPTK VRSAMPFVTL ENIIAEKKNL KAMFASSEWN TSAWASRAEG
KRVADLVGDP SFWKGAGRVV KTALPLIRVL CLINGDDKPQ MGYIYETMDQ MKETIKKECN
SKESQYMPFW ELIDKIWDGH LHSPLHAAGH FLNPSLFYST DFQSDSEVAF GLLCCMVRMI
QSQPIQDKIV QQLEAYRNSE GAFGEGSTVQ QRTRFSSTMW WSTYGGRCPE LQRFATRILS
QTCVGASKYR LNRSLVEKLL TKGRNPVEQQ LLSDLIFVHY NLQLQQQQRS QFGVNYDIAG
DEIDAMDEWI VDDTPEIGSR DGDSAWKELD GAVNGGRPSS QVKEEYRQV
//