ID W6LD86_9TRYP Unreviewed; 461 AA.
AC W6LD86;
DT 16-APR-2014, integrated into UniProtKB/TrEMBL.
DT 16-APR-2014, sequence version 1.
DT 27-MAR-2024, entry version 37.
DE RecName: Full=Cysteine protease {ECO:0008006|Google:ProtNLM};
GN ORFNames=GSHART1_T00002538001 {ECO:0000313|EMBL:CCW71869.1};
OS Phytomonas sp. Hart1.
OC Eukaryota; Discoba; Euglenozoa; Kinetoplastea; Metakinetoplastina;
OC Trypanosomatida; Trypanosomatidae; Phytomonas.
OX NCBI_TaxID=223615 {ECO:0000313|EMBL:CCW71869.1, ECO:0000313|Proteomes:UP000053358};
RN [1] {ECO:0000313|EMBL:CCW71869.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=Hart1 {ECO:0000313|EMBL:CCW71869.1};
RA Genoscope - CEA;
RL Submitted (MAY-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|EMBL:CCW71869.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Hart1 {ECO:0000313|EMBL:CCW71869.1};
RX PubMed=24516393; DOI=10.1371/journal.pgen.1004007;
RA Porcel B.M., Denoeud F., Opperdoes F., Noel B., Madoui M.A.,
RA Hammarton T.C., Field M.C., Da Silva C., Couloux A., Poulain J.,
RA Katinka M., Jabbari K., Aury J.M., Campbell D.A., Cintron R., Dickens N.J.,
RA Docampo R., Sturm N.R., Koumandou V.L., Fabre S., Flegontov P., Lukes J.,
RA Michaeli S., Mottram J.C., Szoor B., Zilberstein D., Bringaud F.,
RA Wincker P., Dollet M.;
RT "The Streamlined Genome of Phytomonas spp. Relative to Human Pathogenic
RT Kinetoplastids Reveals a Parasite Tailored for Plants.";
RL PLoS Genet. 10:e1004007-e1004007(2014).
CC -!- SIMILARITY: Belongs to the peptidase C1 family.
CC {ECO:0000256|ARBA:ARBA00008455}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; HF955216; CCW71869.1; -; Genomic_DNA.
DR AlphaFoldDB; W6LD86; -.
DR MEROPS; C01.076; -.
DR EnsemblProtists; CCW71869; CCW71869; GSHART1_T00002538001.
DR OrthoDB; 5485574at2759; -.
DR Proteomes; UP000053358; Unassembled WGS sequence.
DR GO; GO:0004197; F:cysteine-type endopeptidase activity; IEA:InterPro.
DR GO; GO:0006508; P:proteolysis; IEA:InterPro.
DR CDD; cd02248; Peptidase_C1A; 1.
DR Gene3D; 1.10.287.2250; -; 1.
DR Gene3D; 3.90.70.10; Cysteine proteinases; 1.
DR InterPro; IPR021981; DUF3586.
DR InterPro; IPR038765; Papain-like_cys_pep_sf.
DR InterPro; IPR025661; Pept_asp_AS.
DR InterPro; IPR000169; Pept_cys_AS.
DR InterPro; IPR025660; Pept_his_AS.
DR InterPro; IPR013128; Peptidase_C1A.
DR InterPro; IPR000668; Peptidase_C1A_C.
DR InterPro; IPR039417; Peptidase_C1A_papain-like.
DR InterPro; IPR013201; Prot_inhib_I29.
DR PANTHER; PTHR12411:SF947; CATHEPSIN O; 1.
DR PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR Pfam; PF12131; DUF3586; 1.
DR Pfam; PF08246; Inhibitor_I29; 1.
DR Pfam; PF00112; Peptidase_C1; 1.
DR PRINTS; PR00705; PAPAIN.
DR SMART; SM00848; Inhibitor_I29; 1.
DR SMART; SM00645; Pept_C1; 1.
DR SUPFAM; SSF54001; Cysteine proteinases; 1.
DR PROSITE; PS00640; THIOL_PROTEASE_ASN; 1.
DR PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR PROSITE; PS00639; THIOL_PROTEASE_HIS; 1.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Reference proteome {ECO:0000313|Proteomes:UP000053358};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..25
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 26..461
FT /note="Cysteine protease"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5018610808"
FT DOMAIN 36..94
FT /note="Cathepsin propeptide inhibitor"
FT /evidence="ECO:0000259|SMART:SM00848"
FT DOMAIN 139..351
FT /note="Peptidase C1A papain C-terminal"
FT /evidence="ECO:0000259|SMART:SM00645"
FT REGION 105..127
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 110..124
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 461 AA; 50414 MW; 0EA2068E752A422F CRC64;
MSTRLGLWGS LLLLILVLVV SETSALSANE KFHAMFQDFK AYHSKVYESL EEEAYRLSIF
ISNIKRARRL GLIDSHAHFS VEGNRFADLS HDEFVARYLG TRPPPRLNTG KNPIETQQSH
GFGRENWNGM KKENLKDSLP SSFDWRDHGA VTEVKDQKQC GSCWAFSTTG TIEGVWAATG
HPLTSLSEQE LVSCDDTDQG CNGGMMGNAI EWLLNARGGR VLTESSYPYT SGDGLTEECQ
LEGGKVGAVV AKLVEIESNE DAIAAQLIKS GPIAVAVDAS NWQLYAGGVL SSCELSALNH
GVLIVGFNDT AKPPYWIIKN SWSNSWGEKG YIRIEKGSNQ CGVQEYAVTV EVKDGSDSDS
KNPPPPPPKP VKSELVVKSC FDNRCSLFCS KESFPLENCV DLKSGGSAIF NCSESTVEQT
IFFQKKCVGA SKSIKEPLNM CMGGILGYYE NICTFPHAIA K
//