Database: UniProt
Entry: G4VG42_SCHMA
ID   G4VG42_SCHMA            Unreviewed;       455 AA.
AC   G4VG42;
DT   14-DEC-2011, integrated into UniProtKB/TrEMBL.
DT   14-DEC-2011, sequence version 1.
DT   27-MAR-2024, entry version 74.
DE   RecName: Full=Dipeptidyl peptidase 1 {ECO:0000256|ARBA:ARBA00014709};
DE   AltName: Full=Cathepsin C {ECO:0000256|ARBA:ARBA00029779};
DE   AltName: Full=Cathepsin J {ECO:0000256|ARBA:ARBA00029762};
DE   AltName: Full=Dipeptidyl peptidase I {ECO:0000256|ARBA:ARBA00032961};
DE   AltName: Full=Dipeptidyl transferase {ECO:0000256|ARBA:ARBA00030778};
OS   Schistosoma mansoni (Blood fluke).
OC   Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Trematoda;
OC   Digenea; Strigeidida; Schistosomatoidea; Schistosomatidae; Schistosoma.
OX   NCBI_TaxID=6183 {ECO:0000313|Proteomes:UP000008854, ECO:0000313|WBParaSite:Smp_019030.1};
RN   [1] {ECO:0000313|Proteomes:UP000008854}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC   STRAIN=Puerto Rican {ECO:0000313|Proteomes:UP000008854};
RX   PubMed=22253936; DOI=10.1371/journal.pntd.0001455;
RA   Protasio A.V., Tsai I.J., Babbage A., Nichol S., Hunt M., Aslett M.A.,
RA   De Silva N., Velarde G.S., Anderson T.J., Clark R.C., Davidson C.,
RA   Dillon G.P., Holroyd N.E., LoVerde P.T., Lloyd C., McQuillan J.,
RA   Oliveira G., Otto T.D., Parker-Manuel S.J., Quail M.A., Wilson R.A.,
RA   Zerlotini A., Dunne D.W., Berriman M.;
RT   "A systematically improved high quality genome and transcriptome of the
RT   human blood fluke Schistosoma mansoni.";
RL   PLoS Negl. Trop. Dis. 6:E1455-E1455(2012).
RN   [2] {ECO:0000313|WBParaSite:Smp_019030.1}
RP   IDENTIFICATION.
RC   STRAIN=Puerto Rican {ECO:0000313|WBParaSite:Smp_019030.1};
RG   WormBaseParasite;
RL   Submitted (DEC-2018) to UniProtKB.
CC   -!- COFACTOR:
CC       Name=chloride; Xref=ChEBI:CHEBI:17996;
CC         Evidence={ECO:0000256|ARBA:ARBA00001923};
CC   -!- SUBUNIT: Tetramer of heterotrimers consisting of exclusion domain,
CC       heavy- and light chains. {ECO:0000256|ARBA:ARBA00011610}.
CC   -!- SIMILARITY: Belongs to the peptidase C1 family.
CC       {ECO:0000256|ARBA:ARBA00008455}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   RefSeq; XP_018651509.1; XM_018799751.1.
DR   AlphaFoldDB; G4VG42; -.
DR   STRING; 6183.G4VG42; -.
DR   MEROPS; C01.070; -.
DR   EnsemblMetazoa; Smp_019030.1; Smp_019030.1; Smp_019030.
DR   GeneID; 8344221; -.
DR   KEGG; smm:Smp_019030; -.
DR   WBParaSite; Smp_019030.1; Smp_019030.1; Smp_019030.
DR   CTD; 8344221; -.
DR   HOGENOM; CLU_048219_0_0_1; -.
DR   InParanoid; G4VG42; -.
DR   OMA; HWDWRNV; -.
DR   OrthoDB; 5475703at2759; -.
DR   PhylomeDB; G4VG42; -.
DR   Proteomes; UP000008854; Unassembled WGS sequence.
DR   GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW.
DR   GO; GO:0008239; F:dipeptidyl-peptidase activity; IEA:UniProtKB-EC.
DR   GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR   CDD; cd02621; Peptidase_C1A_CathepsinC; 1.
DR   Gene3D; 2.40.128.80; Cathepsin C, exclusion domain; 1.
DR   Gene3D; 3.90.70.10; Cysteine proteinases; 1.
DR   InterPro; IPR039412; CatC.
DR   InterPro; IPR014882; CathepsinC_exc.
DR   InterPro; IPR036496; CathepsinC_exc_dom_sf.
DR   InterPro; IPR038765; Papain-like_cys_pep_sf.
DR   InterPro; IPR000169; Pept_cys_AS.
DR   InterPro; IPR025660; Pept_his_AS.
DR   InterPro; IPR013128; Peptidase_C1A.
DR   InterPro; IPR000668; Peptidase_C1A_C.
DR   PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR   PANTHER; PTHR12411:SF942; DIPEPTIDYL PEPTIDASE 1; 1.
DR   Pfam; PF08773; CathepsinC_exc; 1.
DR   Pfam; PF00112; Peptidase_C1; 1.
DR   PRINTS; PR00705; PAPAIN.
DR   SMART; SM00645; Pept_C1; 1.
DR   SUPFAM; SSF54001; Cysteine proteinases; 1.
DR   SUPFAM; SSF75001; Dipeptidyl peptidase I (cathepsin C), exclusion domain; 1.
DR   PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR   PROSITE; PS00639; THIOL_PROTEASE_HIS; 1.
PE   3: Inferred from homology;
KW   Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW   Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW   Protease {ECO:0000256|ARBA:ARBA00022670};
KW   Reference proteome {ECO:0000313|Proteomes:UP000008854};
KW   Signal {ECO:0000256|SAM:SignalP};
KW   Thiol protease {ECO:0000256|ARBA:ARBA00022807}.
FT   SIGNAL          1..20
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT   CHAIN           21..455
FT                   /note="Dipeptidyl peptidase 1"
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT                   /id="PRO_5030171997"
FT   DOMAIN          219..452
FT                   /note="Peptidase C1A papain C-terminal"
FT                   /evidence="ECO:0000259|SMART:SM00645"
SQ   SEQUENCE   455 AA;  51829 MW;  BA9FDABAE3BD5145 CRC64;
     MHWVFHCILI ILACLRFTCA DTPANCTYED AHGRWKFHIG DYQSKCPEKL NSKQSVVISL
     LYPDIAIDEF GNRGHWTLIY NQGFEVTINH RKWLVIFAYK SNGEFNCHKS MPMWTHDTLI
     RQWKCFVAEK IGVHDKFHIN KLFGSKSFGR TLYHINPSFV DKINAHQKSW RAEIYPELSK
     YTIDELRNRA GGVKSMVTRP SVLNRKTPSK ELISLTGNLP LEFDWTSPPD GSRSPVTPIR
     NQGICGSCYA FASAAALEAR IRLVSNFSEQ PILSPQAVVD CSPYSEGCNG GFPFLIAGKY
     GEDFGFVSEN CDPYTGEDTG KCTVSKNCTR YYTTDYSYIG GYYGATNEKL MQLELISNGP
     FPVGFEVYED FQFYKEGIYH HTTVQNDHYN FNPFELTNHA VLLVGYGVDK LSGEPYWKVK
     NSWGVEWGEQ GYFRILRGTD ECGVESLGVR FDPVL
//