ID A0A061G0P6_THECC Unreviewed; 996 AA.
AC A0A061G0P6;
DT 03-SEP-2014, integrated into UniProtKB/TrEMBL.
DT 03-SEP-2014, sequence version 1.
DT 27-MAR-2024, entry version 36.
DE RecName: Full=Integrase catalytic domain-containing protein {ECO:0000259|PROSITE:PS50994};
GN ORFNames=TCM_014834 {ECO:0000313|EMBL:EOY22757.1};
OS Theobroma cacao (Cacao) (Cocoa).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma.
OX NCBI_TaxID=3641 {ECO:0000313|EMBL:EOY22757.1, ECO:0000313|Proteomes:UP000026915};
RN [1] {ECO:0000313|EMBL:EOY22757.1, ECO:0000313|Proteomes:UP000026915}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Matina 1-6 {ECO:0000313|Proteomes:UP000026915};
RX PubMed=23731509; DOI=10.1186/gb-2013-14-6-r53;
RA Motamayor J.C., Mockaitis K., Schmutz J., Haiminen N., Iii D.L.,
RA Cornejo O., Findley S.D., Zheng P., Utro F., Royaert S., Saski C.,
RA Jenkins J., Podicheti R., Zhao M., Scheffler B.E., Stack J.C., Feltus F.A.,
RA Mustiga G.M., Amores F., Phillips W., Marelli J.P., May G.D., Shapiro H.,
RA Ma J., Bustamante C.D., Schnell R.J., Main D., Gilbert D., Parida L.,
RA Kuhn D.N.;
RT "The genome sequence of the most widely cultivated cacao type and its use
RT to identify candidate genes regulating pod color.";
RL Genome Biol. 14:R53.1-R53.24(2013).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CM001881; EOY22757.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A061G0P6; -.
DR STRING; 3641.A0A061G0P6; -.
DR EnsemblPlants; EOY22757; EOY22757; TCM_014834.
DR Gramene; EOY22757; EOY22757; TCM_014834.
DR eggNOG; KOG0017; Eukaryota.
DR HOGENOM; CLU_001650_5_1_1; -.
DR InParanoid; A0A061G0P6; -.
DR Proteomes; UP000026915; Chromosome 3.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR GO; GO:0015074; P:DNA integration; IEA:InterPro.
DR CDD; cd09272; RNase_HI_RT_Ty1; 1.
DR Gene3D; 3.30.420.10; Ribonuclease H-like superfamily/Ribonuclease H; 1.
DR InterPro; IPR043502; DNA/RNA_pol_sf.
DR InterPro; IPR025314; DUF4219.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR013103; RVT_2.
DR InterPro; IPR036875; Znf_CCHC_sf.
DR PANTHER; PTHR42648:SF24; RETROTRANSPOSON, UNCLASSIFIED-LIKE PROTEIN; 1.
DR PANTHER; PTHR42648; TRANSPOSASE, PUTATIVE-RELATED; 1.
DR Pfam; PF13961; DUF4219; 1.
DR Pfam; PF14223; Retrotran_gag_2; 1.
DR Pfam; PF00665; rve; 1.
DR Pfam; PF07727; RVT_2; 1.
DR SUPFAM; SSF56672; DNA/RNA polymerases; 1.
DR SUPFAM; SSF57756; Retrovirus zinc finger-like domains; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 1.
DR PROSITE; PS50994; INTEGRASE; 1.
PE 4: Predicted;
KW Coiled coil {ECO:0000256|SAM:Coils};
KW Reference proteome {ECO:0000313|Proteomes:UP000026915}.
FT DOMAIN 297..473
FT /note="Integrase catalytic"
FT /evidence="ECO:0000259|PROSITE:PS50994"
FT REGION 161..198
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COILED 47..74
FT /evidence="ECO:0000256|SAM:Coils"
FT COMPBIAS 161..192
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 996 AA; 113610 MW; 2995CC5B2EB72B00 CRC64;
MSTSTSTITQ QSPLFFYGSN YAVWAIKMKA FLRGVNLWNA IEFETELPVL KENASQAQVR
LQNLRRKYEL LRMKENQPVG EFVEDLMKLV NQSKLMGDSL IDLKVVEKIM LSLPERFEPT
ITYLEQVKDI IELSISDLVS ALEADEQRKA ARRDERVDHA LAARAKDKAP VDPSFKKNSN
ENREKDKAGT AAGRSQNKKG KFPVCPYCKK RNHSEAYCWF RPGVKCNACK QLGHVEKVCK
NKVEAADKKP QVTKQVEKAE AAVKIENGLI LDAVGKGIVA IQTTSSTSSC QYGKLTRRSF
PKASLNRAKH RLELVHSDVA RPMSEPSLNG SKYFVIFIDD MSIMTWIYFI QHKSEVFSIF
QKFKAKVENE SGCRIKKLRT DNGGEYTSSE FTSYLENEGI HHQLTAPYCP EQNGVSERKN
RTIIEMSRCL LFENKLPKSF WAKAANTAVY LRNILITQAV NNETPYEAWY NTRPSVDHLR
IFGSICYLHV PEELRDKLQP KAKLGVFIGY SQQSKAYRIY QIESGKVFGS RHVTFNEGAY
WNWENNQVQH TKFLDEDVNL QPASSDEILD VEQIVDEPPV REWKEAMKKE MKMINLNKTW
SLVDRPKHHH VLGVKWVFRM KLNSDGSLNK HKAWLVVKGF AQLLRVDYHK TFAPVARMDT
IRLLLALSAK FKWKIFHLDI KSAFLNGDLQ EEIFIEQPYG FESEPNRDKV YKLHKALYGL
KFRMTNLGQM SYFLGLQILQ GNSGIFICQS NYIGEVLDKF KMTDCKTVAT PLIPHEKLSV
DKGSALENPS AFRSLIGSLL YICASRPDLM FAASYLSRFM QVPTTEHFSA AKRVLRYLKG
TANIGLQFTY ADESSVEFVG FSDSDWAGCV DDCKSTSGYV FTLGNGVFCW NSKKQETTAQ
SSAEAKYIAA ATVANQAIWV KKILTDLGFL HVPPTKLDQL ADIRTKPLHL SRFEELRSKL
NIQQARRSVE IKTSLNVILS NYVQDKCPLC FVVLVW
//