ID A0A061EE03_THECC Unreviewed; 1480 AA.
AC A0A061EE03;
DT 03-SEP-2014, integrated into UniProtKB/TrEMBL.
DT 03-SEP-2014, sequence version 1.
DT 27-MAR-2024, entry version 38.
DE RecName: Full=RNA-directed DNA polymerase {ECO:0000256|ARBA:ARBA00012493};
DE EC=2.7.7.49 {ECO:0000256|ARBA:ARBA00012493};
GN ORFNames=TCM_017700 {ECO:0000313|EMBL:EOY03146.1};
OS Theobroma cacao (Cacao) (Cocoa).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma.
OX NCBI_TaxID=3641 {ECO:0000313|EMBL:EOY03146.1, ECO:0000313|Proteomes:UP000026915};
RN [1] {ECO:0000313|EMBL:EOY03146.1, ECO:0000313|Proteomes:UP000026915}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Matina 1-6 {ECO:0000313|Proteomes:UP000026915};
RX PubMed=23731509; DOI=10.1186/gb-2013-14-6-r53;
RA Motamayor J.C., Mockaitis K., Schmutz J., Haiminen N., Iii D.L.,
RA Cornejo O., Findley S.D., Zheng P., Utro F., Royaert S., Saski C.,
RA Jenkins J., Podicheti R., Zhao M., Scheffler B.E., Stack J.C., Feltus F.A.,
RA Mustiga G.M., Amores F., Phillips W., Marelli J.P., May G.D., Shapiro H.,
RA Ma J., Bustamante C.D., Schnell R.J., Main D., Gilbert D., Parida L.,
RA Kuhn D.N.;
RT "The genome sequence of the most widely cultivated cacao type and its use
RT to identify candidate genes regulating pod color.";
RL Genome Biol. 14:R53.1-R53.24(2013).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CM001882; EOY03146.1; -; Genomic_DNA.
DR EnsemblPlants; EOY03146; EOY03146; TCM_017700.
DR Gramene; EOY03146; EOY03146; TCM_017700.
DR eggNOG; KOG0017; Eukaryota.
DR HOGENOM; CLU_000384_2_2_1; -.
DR InParanoid; A0A061EE03; -.
DR Proteomes; UP000026915; Chromosome 4.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR GO; GO:0015074; P:DNA integration; IEA:InterPro.
DR CDD; cd00303; retropepsin_like; 1.
DR CDD; cd09274; RNase_HI_RT_Ty3; 1.
DR CDD; cd01647; RT_LTR; 1.
DR Gene3D; 1.10.340.70; -; 1.
DR Gene3D; 3.30.70.270; -; 2.
DR Gene3D; 2.40.70.10; Acid Proteases; 1.
DR Gene3D; 3.10.10.10; HIV Type 1 Reverse Transcriptase, subunit A, domain 1; 1.
DR Gene3D; 3.30.420.10; Ribonuclease H-like superfamily/Ribonuclease H; 1.
DR Gene3D; 4.10.60.10; Zinc finger, CCHC-type; 1.
DR InterPro; IPR016197; Chromo-like_dom_sf.
DR InterPro; IPR043502; DNA/RNA_pol_sf.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR041588; Integrase_H2C2.
DR InterPro; IPR021109; Peptidase_aspartic_dom_sf.
DR InterPro; IPR005162; Retrotrans_gag_dom.
DR InterPro; IPR043128; Rev_trsase/Diguanyl_cyclase.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR000477; RT_dom.
DR InterPro; IPR041373; RT_RNaseH.
DR InterPro; IPR001878; Znf_CCHC.
DR PANTHER; PTHR45835:SF105; IPP TRANSFERASE; 1.
DR PANTHER; PTHR45835; YALI0A06105P; 1.
DR Pfam; PF17921; Integrase_H2C2; 1.
DR Pfam; PF03732; Retrotrans_gag; 1.
DR Pfam; PF17917; RT_RNaseH; 1.
DR Pfam; PF08284; RVP_2; 1.
DR Pfam; PF00078; RVT_1; 1.
DR Pfam; PF00098; zf-CCHC; 1.
DR SMART; SM00343; ZnF_C2HC; 1.
DR SUPFAM; SSF50630; Acid proteases; 1.
DR SUPFAM; SSF54160; Chromo domain-like; 1.
DR SUPFAM; SSF56672; DNA/RNA polymerases; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 1.
DR PROSITE; PS50994; INTEGRASE; 1.
DR PROSITE; PS50878; RT_POL; 1.
DR PROSITE; PS50158; ZF_CCHC; 1.
PE 4: Predicted;
KW Metal-binding {ECO:0000256|PROSITE-ProRule:PRU00047};
KW Reference proteome {ECO:0000313|Proteomes:UP000026915};
KW Zinc {ECO:0000256|PROSITE-ProRule:PRU00047};
KW Zinc-finger {ECO:0000256|PROSITE-ProRule:PRU00047}.
FT DOMAIN 386..400
FT /note="CCHC-type"
FT /evidence="ECO:0000259|PROSITE:PS50158"
FT DOMAIN 651..846
FT /note="Reverse transcriptase"
FT /evidence="ECO:0000259|PROSITE:PS50878"
FT DOMAIN 1234..1322
FT /note="Integrase catalytic"
FT /evidence="ECO:0000259|PROSITE:PS50994"
FT REGION 1..51
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 59..78
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 116..138
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 286..366
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 400..458
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 63..77
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 286..308
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 309..336
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 345..366
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 406..428
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 435..458
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1480 AA; 167731 MW; 78D7153A0C1F07AD CRC64;
MSGALSRSWT KVRMPPKTRA ASRRAGEQDA PIEMTDRPWA STQRGRGRRG RVTRLVGLDT
PVSRQEEGQS SSEVNRHPAG GITIEDLAAG LQGVNRVVEM MATRMEDIQR VIEGRPTVQE
SPSSQGQADH QHHEEERGHL DISLPDFLKL KPPTFSGVRS VELVAFQLED VAQEWYNSLC
RGRPTNATPL AWSEFSAAFL DRFLPLSVRN ARVREFETLV QTSSMTVSEY DIKFTQLARY
APYLVSTEEM KIQRFVDGLV EPLFRAVASR DFTTYSATVD RAQRIEMRTS ESRATRDRAK
RGKTEGYQSR RDFSSGGSFS SRQGPQRDSR LPQQGSDAPG ANIRVGQRTF SSRRQQDSRQ
SSQVIRSCDT CGIRHSGRCF LTTKTCYGCG QPGHIMKDCP MAHQSPDSAR GSTQPASSAP
SVAVSSGLEV SGSRGRGAGT SSQGRPSRSG HQSSIGRGQA RVFALTQQEA QTSNAVVSGI
LSVCNMNARV LFDPGATHSF ISPCFASRLG RGRVRREEQL VVSTLLKEIF MAEWEYESCV
VRVKDKDTSV NLVVLDTLDF DVILGMDWLS PCHASVDCYH KLVRFDFPGE PSFSIQGDMS
NAPTNLISVI SARRLLRQGC IGYLAVVKDS QAKIGDVTQV SVVKEFVDVF PEELSGFPPE
REIEFCIDLI PDTRPMSIPP YRMAPAELKE LKDSWRICWI KLNKVTVKNK YPLPRIDDLF
DQLQGAQCFS KIDLRSGYHQ LRIRNEDIPK TAFRTRYGHY EFLVMSFGLT NAPAAFMDLM
NRVFKPYLDK FVVVFIDDIL IYSKSREEHE QHLKIVLQIL REHRLYAKFS KCEFWLERVA
FLGHVVSREG IQVDTKKIEA VEKWPRSTSV TEIRSFVGLA GYYRRFVKDF SKIVALLTKL
TRKDTKFEWS DACENSFEKL KACLTTAPVL SLLQGTGGYT VFCDASGVGL GCVLMQHGKV
IAYASRQLKR HEQNYPIHDL EMAAIVFALK IWRHYLYGET CEIYTDHKSL KYIFQQRDLN
LRQHRWMELL KDYDCTILYH PGKANVVADA LSRKSMGSLA HISIGRRSLV REIHSLGDIG
VRLEVAETNA LLAHFRVRPI LMDRIKEAQS KDEFVIKALE DPQGRKGKMF TKGTDGVLRY
GTRLYVPDGD GLRREILEEA HMAAYVVHPG ALKMYQDLKG VYWWEGLKRD VAEFVSKCLV
CQQVKAEHQK PAGLLQPLPV PRVEVGTYCY GLCNGGAQFT SRFWGKLQEA LGTKFDFSTA
FHPQTDGQSE RTIQTLEDML RACVIDLGVR WEQYLPLVEF AYNNSFQTSI QMAPFEALYG
RRCRSPIGWL EVGERKLLGP ELVQDATEKI HMIRVMRFGK KGKLSPRYIG PFEILEKVGA
VAYRLALPPD LSNIYPVFHV SMLRKYNPDP SHVIRYETIQ LQNDLTYEEQ PVAILDRQVK
KLCSKDVASV KVLWRNYTSE EVTWEAEDEM RTKHPHLFDM
//