Database: UniProt
Entry: W6UUL0_ECHGR
ID   W6UUL0_ECHGR            Unreviewed;       672 AA.
AC   W6UUL0;
DT   16-APR-2014, integrated into UniProtKB/TrEMBL.
DT   16-APR-2014, sequence version 1.
DT   27-MAR-2024, entry version 28.
DE   SubName: Full=Cathepsin B {ECO:0000313|EMBL:EUB64953.1};
GN   ORFNames=EGR_00222 {ECO:0000313|EMBL:EUB64953.1};
OS   Echinococcus granulosus (Hydatid tapeworm).
OC   Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Cestoda;
OC   Eucestoda; Cyclophyllidea; Taeniidae; Echinococcus;
OC   Echinococcus granulosus group.
OX   NCBI_TaxID=6210 {ECO:0000313|EMBL:EUB64953.1, ECO:0000313|Proteomes:UP000019149};
RN   [1] {ECO:0000313|EMBL:EUB64953.1, ECO:0000313|Proteomes:UP000019149}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX   PubMed=24013640; DOI=10.1038/ng.2757;
RA   Zheng H., Zhang W., Zhang L., Zhang Z., Li J., Lu G., Zhu Y., Wang Y.,
RA   Huang Y., Liu J., Kang H., Chen J., Wang L., Chen A., Yu S., Gao Z.,
RA   Jin L., Gu W., Wang Z., Zhao L., Shi B., Wen H., Lin R., Jones M.K.,
RA   Brejova B., Vinar T., Zhao G., McManus D.P., Chen Z., Zhou Y., Wang S.;
RT   "The genome of the hydatid tapeworm Echinococcus granulosus.";
RL   Nat. Genet. 45:1168-1175(2013).
CC   -!- SIMILARITY: Belongs to the peptidase C1 family.
CC       {ECO:0000256|ARBA:ARBA00008455}.
CC   -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC       whole genome shotgun (WGS) entry which is preliminary data.
CC       {ECO:0000313|EMBL:EUB64953.1}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; APAU02000001; EUB64953.1; -; Genomic_DNA.
DR   AlphaFoldDB; W6UUL0; -.
DR   STRING; 6210.W6UUL0; -.
DR   EnsemblMetazoa; XM_024489471.1; XP_024356149.1; GeneID_36335937.
DR   OMA; KDNGCHG; -.
DR   OrthoDB; 808912at2759; -.
DR   Proteomes; UP000019149; Unassembled WGS sequence.
DR   GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW.
DR   GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR   CDD; cd02620; Peptidase_C1A_CathepsinB; 2.
DR   Gene3D; 3.90.70.10; Cysteine proteinases; 2.
DR   InterPro; IPR038765; Papain-like_cys_pep_sf.
DR   InterPro; IPR025661; Pept_asp_AS.
DR   InterPro; IPR000169; Pept_cys_AS.
DR   InterPro; IPR025660; Pept_his_AS.
DR   InterPro; IPR013128; Peptidase_C1A.
DR   InterPro; IPR000668; Peptidase_C1A_C.
DR   PANTHER; PTHR12411:SF895; CATHEPSIN B; 1.
DR   PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR   Pfam; PF00112; Peptidase_C1; 2.
DR   PRINTS; PR00705; PAPAIN.
DR   SMART; SM00645; Pept_C1; 2.
DR   SUPFAM; SSF54001; Cysteine proteinases; 2.
DR   PROSITE; PS00640; THIOL_PROTEASE_ASN; 2.
DR   PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR   PROSITE; PS00639; THIOL_PROTEASE_HIS; 2.
PE   3: Inferred from homology;
KW   Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW   Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW   Protease {ECO:0000256|ARBA:ARBA00022670};
KW   Reference proteome {ECO:0000313|Proteomes:UP000019149};
KW   Thiol protease {ECO:0000256|ARBA:ARBA00022807}.
FT   DOMAIN          18..270
FT                   /note="Peptidase C1A papain C-terminal"
FT                   /evidence="ECO:0000259|SMART:SM00645"
FT   DOMAIN          405..667
FT                   /note="Peptidase C1A papain C-terminal"
FT                   /evidence="ECO:0000259|SMART:SM00645"
SQ   SEQUENCE   672 AA;  74150 MW;  A7A6739FC848321B CRC64;
     MGRRLPVLYS LSENYKSLPA SFDPRKKWPN CKTLFEIRDQ GSCGSCWAFG AAEAMSDRLC
     IQQQTVSGRA VMVRLSADDL LSCCRDCGMG CNGGFPSQAW NFWKHEGLVS GGLYGTKGVC
     RAYEIPPCEH HVNGTRPPCE GDAPTPKCKN VCQEEYKVPY KKDKHYAVKV YSVHSNEDAI
     KHELITHGPV EADFEVYADF PTYKSGVYQH VSGALLGGHA IKLMGWGEED GVPYWLCANS
     WNTDWGEGGF FKILRGKNHC GIESDIVADG DVDCIAGLGR GQKKTGIAHA QFTRGDEAAT
     KAGASVLLGR PILVARLHHR KMLLQQLLML LIAHWSSRKP HQSDRLTDII EYINNKANTT
     WRAGENERFT DALSAKSQMG SLFNPGGSML PTKSSHLSSM QKAALPSEFD ARKAWPDCPT
     IGEIRDQGTC GSCWASPQRF DLPSNLIAFG ATEAMSDRIC IHSEGKEVVR ISADDLLSCC
     GLFCGFGCNG GLPKNAWKYW AREGIVSGGL YGSHVGCRPY DIPPCEHHTS GNRPDCKGNS
     KTPKCQRQCV ESFDGEYQAD KHFASNVYNV RASEEDIMNE IMVYGPVEAD FIVYADFLTY
     KSGVYQYVKG GFLGGHAVKI LGWGEENGVP YWLCANSWNT DWGDGGFFKI LRGYNHCKIE
     ADINAGIPKI RK
//