ID A0A061FVK0_THECC Unreviewed; 433 AA.
AC A0A061FVK0;
DT 03-SEP-2014, integrated into UniProtKB/TrEMBL.
DT 03-SEP-2014, sequence version 1.
DT 27-MAR-2024, entry version 32.
DE SubName: Full=Gag protease polyprotein {ECO:0000313|EMBL:EOY21246.1};
GN ORFNames=TCM_012661 {ECO:0000313|EMBL:EOY21246.1};
OS Theobroma cacao (Cacao) (Cocoa).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma.
OX NCBI_TaxID=3641 {ECO:0000313|EMBL:EOY21246.1, ECO:0000313|Proteomes:UP000026915};
RN [1] {ECO:0000313|EMBL:EOY21246.1, ECO:0000313|Proteomes:UP000026915}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Matina 1-6 {ECO:0000313|Proteomes:UP000026915};
RX PubMed=23731509; DOI=10.1186/gb-2013-14-6-r53;
RA Motamayor J.C., Mockaitis K., Schmutz J., Haiminen N., Iii D.L.,
RA Cornejo O., Findley S.D., Zheng P., Utro F., Royaert S., Saski C.,
RA Jenkins J., Podicheti R., Zhao M., Scheffler B.E., Stack J.C., Feltus F.A.,
RA Mustiga G.M., Amores F., Phillips W., Marelli J.P., May G.D., Shapiro H.,
RA Ma J., Bustamante C.D., Schnell R.J., Main D., Gilbert D., Parida L.,
RA Kuhn D.N.;
RT "The genome sequence of the most widely cultivated cacao type and its use
RT to identify candidate genes regulating pod color.";
RL Genome Biol. 14:R53.1-R53.24(2013).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CM001881; EOY21246.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A061FVK0; -.
DR EnsemblPlants; EOY21246; EOY21246; TCM_012661.
DR Gramene; EOY21246; EOY21246; TCM_012661.
DR eggNOG; KOG0017; Eukaryota.
DR HOGENOM; CLU_026677_1_0_1; -.
DR InParanoid; A0A061FVK0; -.
DR Proteomes; UP000026915; Chromosome 3.
DR GO; GO:0008233; F:peptidase activity; IEA:UniProtKB-KW.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR InterPro; IPR005162; Retrotrans_gag_dom.
DR PANTHER; PTHR34482; DNA DAMAGE-INDUCIBLE PROTEIN 1-LIKE; 1.
DR PANTHER; PTHR34482:SF36; DUF4283 DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF03732; Retrotrans_gag; 1.
PE 4: Predicted;
KW Hydrolase {ECO:0000313|EMBL:EOY21246.1};
KW Protease {ECO:0000313|EMBL:EOY21246.1};
KW Reference proteome {ECO:0000313|Proteomes:UP000026915}.
FT DOMAIN 165..257
FT /note="Retrotransposon gag"
FT /evidence="ECO:0000259|Pfam:PF03732"
FT REGION 1..62
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 298..433
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 45..60
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 300..358
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 391..415
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 416..433
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 433 AA; 48308 MW; FB7F38E6A9B942F1 CRC64;
MPPRRGRPPL YRSVGRGRGR ARLSQPDPVE RESAAPTFRA APAVEPTEIP PPPPPPTATP
GVHAMSLEAV QALAAFLNVI MGQAQASRVP HTVPPAVSPV PPPPPLVPPP VPDVSISKKL
KEARQLGCTS FVGDLDATAA KDWITQVTET FVDMKLDDDM KLMVATRLLE KRARTWWSSV
KSRSITSLTW IDFLQEFDGQ YYTYFHQKEK KREFLSLQQG NLTIEEYEAR FNELMSYVPD
LVKSEQDQAS YFEEGLRNEI RERMTVTGRE PHKEVVQMAL RAEKLTNENR RMRAEFAKKR
NPNVSSSQLP KRGKDTFASE STVSVPVISP RPPLSQLQQR PPRFSRSGMS STSEKSFGGR
ATVVAPSPLT HTDMQRRDSS GVHPRQGVAV RSEMGSNTPA QPPLRPLTRS STRVFAVTED
EARVRSGEKL KNT
//