ID G4VG42_SCHMA Unreviewed; 455 AA.
AC G4VG42;
DT 14-DEC-2011, integrated into UniProtKB/TrEMBL.
DT 14-DEC-2011, sequence version 1.
DT 27-MAR-2024, entry version 74.
DE RecName: Full=Dipeptidyl peptidase 1 {ECO:0000256|ARBA:ARBA00014709};
DE AltName: Full=Cathepsin C {ECO:0000256|ARBA:ARBA00029779};
DE AltName: Full=Cathepsin J {ECO:0000256|ARBA:ARBA00029762};
DE AltName: Full=Dipeptidyl peptidase I {ECO:0000256|ARBA:ARBA00032961};
DE AltName: Full=Dipeptidyl transferase {ECO:0000256|ARBA:ARBA00030778};
OS Schistosoma mansoni (Blood fluke).
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Trematoda;
OC Digenea; Strigeidida; Schistosomatoidea; Schistosomatidae; Schistosoma.
OX NCBI_TaxID=6183 {ECO:0000313|Proteomes:UP000008854, ECO:0000313|WBParaSite:Smp_019030.1};
RN [1] {ECO:0000313|Proteomes:UP000008854}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Puerto Rican {ECO:0000313|Proteomes:UP000008854};
RX PubMed=22253936; DOI=10.1371/journal.pntd.0001455;
RA Protasio A.V., Tsai I.J., Babbage A., Nichol S., Hunt M., Aslett M.A.,
RA De Silva N., Velarde G.S., Anderson T.J., Clark R.C., Davidson C.,
RA Dillon G.P., Holroyd N.E., LoVerde P.T., Lloyd C., McQuillan J.,
RA Oliveira G., Otto T.D., Parker-Manuel S.J., Quail M.A., Wilson R.A.,
RA Zerlotini A., Dunne D.W., Berriman M.;
RT "A systematically improved high quality genome and transcriptome of the
RT human blood fluke Schistosoma mansoni.";
RL PLoS Negl. Trop. Dis. 6:E1455-E1455(2012).
RN [2] {ECO:0000313|WBParaSite:Smp_019030.1}
RP IDENTIFICATION.
RC STRAIN=Puerto Rican {ECO:0000313|WBParaSite:Smp_019030.1};
RG WormBaseParasite;
RL Submitted (DEC-2018) to UniProtKB.
CC -!- COFACTOR:
CC Name=chloride; Xref=ChEBI:CHEBI:17996;
CC Evidence={ECO:0000256|ARBA:ARBA00001923};
CC -!- SUBUNIT: Tetramer of heterotrimers consisting of exclusion domain,
CC heavy- and light chains. {ECO:0000256|ARBA:ARBA00011610}.
CC -!- SIMILARITY: Belongs to the peptidase C1 family.
CC {ECO:0000256|ARBA:ARBA00008455}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR RefSeq; XP_018651509.1; XM_018799751.1.
DR AlphaFoldDB; G4VG42; -.
DR STRING; 6183.G4VG42; -.
DR MEROPS; C01.070; -.
DR EnsemblMetazoa; Smp_019030.1; Smp_019030.1; Smp_019030.
DR GeneID; 8344221; -.
DR KEGG; smm:Smp_019030; -.
DR WBParaSite; Smp_019030.1; Smp_019030.1; Smp_019030.
DR CTD; 8344221; -.
DR HOGENOM; CLU_048219_0_0_1; -.
DR InParanoid; G4VG42; -.
DR OMA; HWDWRNV; -.
DR OrthoDB; 5475703at2759; -.
DR PhylomeDB; G4VG42; -.
DR Proteomes; UP000008854; Unassembled WGS sequence.
DR GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW.
DR GO; GO:0008239; F:dipeptidyl-peptidase activity; IEA:UniProtKB-EC.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR CDD; cd02621; Peptidase_C1A_CathepsinC; 1.
DR Gene3D; 2.40.128.80; Cathepsin C, exclusion domain; 1.
DR Gene3D; 3.90.70.10; Cysteine proteinases; 1.
DR InterPro; IPR039412; CatC.
DR InterPro; IPR014882; CathepsinC_exc.
DR InterPro; IPR036496; CathepsinC_exc_dom_sf.
DR InterPro; IPR038765; Papain-like_cys_pep_sf.
DR InterPro; IPR000169; Pept_cys_AS.
DR InterPro; IPR025660; Pept_his_AS.
DR InterPro; IPR013128; Peptidase_C1A.
DR InterPro; IPR000668; Peptidase_C1A_C.
DR PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR PANTHER; PTHR12411:SF942; DIPEPTIDYL PEPTIDASE 1; 1.
DR Pfam; PF08773; CathepsinC_exc; 1.
DR Pfam; PF00112; Peptidase_C1; 1.
DR PRINTS; PR00705; PAPAIN.
DR SMART; SM00645; Pept_C1; 1.
DR SUPFAM; SSF54001; Cysteine proteinases; 1.
DR SUPFAM; SSF75001; Dipeptidyl peptidase I (cathepsin C), exclusion domain; 1.
DR PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR PROSITE; PS00639; THIOL_PROTEASE_HIS; 1.
PE 3: Inferred from homology;
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW Protease {ECO:0000256|ARBA:ARBA00022670};
KW Reference proteome {ECO:0000313|Proteomes:UP000008854};
KW Signal {ECO:0000256|SAM:SignalP};
KW Thiol protease {ECO:0000256|ARBA:ARBA00022807}.
FT SIGNAL 1..20
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 21..455
FT /note="Dipeptidyl peptidase 1"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5030171997"
FT DOMAIN 219..452
FT /note="Peptidase C1A papain C-terminal"
FT /evidence="ECO:0000259|SMART:SM00645"
SQ SEQUENCE 455 AA; 51829 MW; BA9FDABAE3BD5145 CRC64;
MHWVFHCILI ILACLRFTCA DTPANCTYED AHGRWKFHIG DYQSKCPEKL NSKQSVVISL
LYPDIAIDEF GNRGHWTLIY NQGFEVTINH RKWLVIFAYK SNGEFNCHKS MPMWTHDTLI
RQWKCFVAEK IGVHDKFHIN KLFGSKSFGR TLYHINPSFV DKINAHQKSW RAEIYPELSK
YTIDELRNRA GGVKSMVTRP SVLNRKTPSK ELISLTGNLP LEFDWTSPPD GSRSPVTPIR
NQGICGSCYA FASAAALEAR IRLVSNFSEQ PILSPQAVVD CSPYSEGCNG GFPFLIAGKY
GEDFGFVSEN CDPYTGEDTG KCTVSKNCTR YYTTDYSYIG GYYGATNEKL MQLELISNGP
FPVGFEVYED FQFYKEGIYH HTTVQNDHYN FNPFELTNHA VLLVGYGVDK LSGEPYWKVK
NSWGVEWGEQ GYFRILRGTD ECGVESLGVR FDPVL
//