GenomeNet

Database: UniProt/SWISS-PROT
Entry: CATB_MACFA
ID   CATB_MACFA              Reviewed;         339 AA.
AC   Q4R5M2;
DT   29-APR-2008, integrated into UniProtKB/Swiss-Prot.
DT   19-JUL-2005, sequence version 1.
DT   27-MAR-2024, entry version 91.
DE   RecName: Full=Cathepsin B;
DE            EC=3.4.22.1 {ECO:0000250|UniProtKB:P07858};
DE   Contains:
DE     RecName: Full=Cathepsin B light chain {ECO:0000250|UniProtKB:P07858};
DE   Contains:
DE     RecName: Full=Cathepsin B heavy chain {ECO:0000250|UniProtKB:P07858};
DE   Flags: Precursor;
GN   Name=CTSB; ORFNames=QccE-13673;
OS   Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
OC   Cercopithecidae; Cercopithecinae; Macaca.
OX   NCBI_TaxID=9541;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
RC   TISSUE=Brain cortex;
RG   International consortium for macaque cDNA sequencing and analysis;
RT   "DNA sequences of macaque genes expressed in brain or testis and its
RT   evolutionary implications.";
RL   Submitted (JUN-2005) to the EMBL/GenBank/DDBJ databases.
CC   -!- FUNCTION: Thiol protease which is believed to participate in
CC       intracellular degradation and turnover of proteins (By similarity).
CC       Cleaves matrix extracellular phosphoglycoprotein MEPE (By similarity).
CC       Involved in the solubilization of cross-linked TG/thyroglobulin in the
CC       thyroid follicle lumen (By similarity). Has also been implicated in
CC       tumor invasion and metastasis (By similarity).
CC       {ECO:0000250|UniProtKB:P00787, ECO:0000250|UniProtKB:P07858,
CC       ECO:0000250|UniProtKB:P10605}.
CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=Hydrolysis of proteins with broad specificity for peptide
CC         bonds. Preferentially cleaves -Arg-Arg-|-Xaa bonds in small molecule
CC         substrates (thus differing from cathepsin L). In addition to being an
CC         endopeptidase, shows peptidyl-dipeptidase activity, liberating C-
CC         terminal dipeptides.; EC=3.4.22.1;
CC         Evidence={ECO:0000250|UniProtKB:P07858};
CC   -!- SUBUNIT: Dimer of a heavy chain and a light chain cross-linked by a
CC       disulfide bond. Interacts with SRPX2. Directly interacts with SHKBP1.
CC       {ECO:0000250|UniProtKB:P07858}.
CC   -!- SUBCELLULAR LOCATION: Lysosome {ECO:0000250|UniProtKB:P10605}.
CC       Melanosome {ECO:0000250|UniProtKB:P07858}. Secreted, extracellular
CC       space {ECO:0000250|UniProtKB:P10605}. Apical cell membrane
CC       {ECO:0000250|UniProtKB:P10605}; Peripheral membrane protein
CC       {ECO:0000250|UniProtKB:P10605}; Extracellular side
CC       {ECO:0000250|UniProtKB:P10605}. Note=Localizes to the lumen of thyroid
CC       follicles and to the apical membrane of thyroid epithelial cells.
CC       {ECO:0000250|UniProtKB:P10605}.
CC   -!- SIMILARITY: Belongs to the peptidase C1 family. {ECO:0000255|PROSITE-
CC       ProRule:PRU10088, ECO:0000255|PROSITE-ProRule:PRU10089,
CC       ECO:0000255|PROSITE-ProRule:PRU10090}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AB169521; BAE01603.1; -; mRNA.
DR   RefSeq; NP_001270414.1; NM_001283485.1.
DR   AlphaFoldDB; Q4R5M2; -.
DR   SMR; Q4R5M2; -.
DR   IntAct; Q4R5M2; 1.
DR   MINT; Q4R5M2; -.
DR   STRING; 9541.ENSMFAP00000011452; -.
DR   MEROPS; C01.060; -.
DR   GlyCosmos; Q4R5M2; 1 site, No reported glycans.
DR   Ensembl; ENSMFAT00000090811.1; ENSMFAP00000060243.1; ENSMFAG00000014829.2.
DR   VEuPathDB; HostDB:ENSMFAG00000014829; -.
DR   eggNOG; KOG1543; Eukaryota.
DR   GeneTree; ENSGT00940000158680; -.
DR   OMA; DEKIPYW; -.
DR   OrthoDB; 808912at2759; -.
DR   Proteomes; UP000233100; Chromosome 8.
DR   Bgee; ENSMFAG00000014829; Expressed in colon and 13 other cell types or tissues.
DR   GO; GO:0016324; C:apical plasma membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR   GO; GO:0005764; C:lysosome; IEA:UniProtKB-SubCell.
DR   GO; GO:0042470; C:melanosome; IEA:UniProtKB-SubCell.
DR   GO; GO:0004197; F:cysteine-type endopeptidase activity; IEA:UniProtKB-EC.
DR   GO; GO:0004175; F:endopeptidase activity; ISS:UniProtKB.
DR   GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR   CDD; cd02620; Peptidase_C1A_CathepsinB; 1.
DR   Gene3D; 3.90.70.10; Cysteine proteinases; 1.
DR   InterPro; IPR038765; Papain-like_cys_pep_sf.
DR   InterPro; IPR025661; Pept_asp_AS.
DR   InterPro; IPR000169; Pept_cys_AS.
DR   InterPro; IPR025660; Pept_his_AS.
DR   InterPro; IPR013128; Peptidase_C1A.
DR   InterPro; IPR000668; Peptidase_C1A_C.
DR   InterPro; IPR012599; Propeptide_C1A.
DR   PANTHER; PTHR12411:SF895; CATHEPSIN B; 1.
DR   PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR   Pfam; PF00112; Peptidase_C1; 1.
DR   Pfam; PF08127; Propeptide_C1; 1.
DR   PRINTS; PR00705; PAPAIN.
DR   SMART; SM00645; Pept_C1; 1.
DR   SUPFAM; SSF54001; Cysteine proteinases; 1.
DR   PROSITE; PS00640; THIOL_PROTEASE_ASN; 1.
DR   PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR   PROSITE; PS00639; THIOL_PROTEASE_HIS; 1.
PE   2: Evidence at transcript level;
KW   Acetylation; Cell membrane; Disulfide bond; Glycoprotein; Hydrolase;
KW   Lysosome; Membrane; Protease; Reference proteome; Secreted; Signal;
KW   Thiol protease; Zymogen.
FT   SIGNAL          1..17
FT                   /evidence="ECO:0000255"
FT   PROPEP          18..79
FT                   /note="Activation peptide"
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT                   /id="PRO_0000330875"
FT   CHAIN           80..333
FT                   /note="Cathepsin B"
FT                   /id="PRO_0000330876"
FT   CHAIN           80..126
FT                   /note="Cathepsin B light chain"
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT                   /id="PRO_0000330877"
FT   CHAIN           129..333
FT                   /note="Cathepsin B heavy chain"
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT                   /id="PRO_0000330878"
FT   PROPEP          334..339
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT                   /id="PRO_0000330879"
FT   ACT_SITE        108
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU10088"
FT   ACT_SITE        278
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU10089"
FT   ACT_SITE        298
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU10090"
FT   MOD_RES         220
FT                   /note="N6-acetyllysine"
FT                   /evidence="ECO:0000250|UniProtKB:P10605"
FT   CARBOHYD        192
FT                   /note="N-linked (GlcNAc...) asparagine"
FT                   /evidence="ECO:0000255"
FT   DISULFID        93..122
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT   DISULFID        105..150
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT   DISULFID        141..207
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT   DISULFID        142..146
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT   DISULFID        179..211
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
FT   DISULFID        187..198
FT                   /evidence="ECO:0000250|UniProtKB:P07858"
SQ   SEQUENCE   339 AA;  37776 MW;  57D18B0CE1CD56CF CRC64;
     MWWLWASLCC LLALGDARSR PSFHPLSDEL VNYVNKQNTT WQAGHNFYNV DVSYLKRLCG
     TFLGGPKPPQ RVMFTEDLKL PESFDAREQW PQCPTIKEIR DQGSCGSCWA FGAVEAISDR
     ICIHTNAHVS VEVSAEDLLT CCGIMCGDGC NGGYPAGAWN FWTRKGLVSG GLYDSHVGCR
     PYSIPPCEHH VNGSRPPCTG EGDTPKCSKI CEPGYSPTYK QDKHYGYNSY SVSNSEKDIM
     AEIYKNGPVE GAFSVYSDFL LYKSGVYQHV TGEMMGGHAI RILGWGVENG TPYWLVANSW
     NTDWGDNGFF KILRGQDHCG IESEVVAGIP RTDQYWEKI
//
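
The FT coordinates above are 1-based and inclusive, so they can be checked directly against the SQ block. The following is a minimal sketch of such a consistency check, not part of the UniProt record or of any UniProt tooling. The helper names (`residue`, `region`, `check_features`, `interchain_disulfides`) are hypothetical. The expected active-site residues (Cys, His, Asn) are an assumption based on the PROSITE signature names listed in the DR lines (THIOL_PROTEASE_CYS, THIOL_PROTEASE_HIS, THIOL_PROTEASE_ASN).

```python
# Sequence copied from the SQ block of this entry; spaces are layout only.
SEQ = "".join("""
MWWLWASLCC LLALGDARSR PSFHPLSDEL VNYVNKQNTT WQAGHNFYNV DVSYLKRLCG
TFLGGPKPPQ RVMFTEDLKL PESFDAREQW PQCPTIKEIR DQGSCGSCWA FGAVEAISDR
ICIHTNAHVS VEVSAEDLLT CCGIMCGDGC NGGYPAGAWN FWTRKGLVSG GLYDSHVGCR
PYSIPPCEHH VNGSRPPCTG EGDTPKCSKI CEPGYSPTYK QDKHYGYNSY SVSNSEKDIM
AEIYKNGPVE GAFSVYSDFL LYKSGVYQHV TGEMMGGHAI RILGWGVENG TPYWLVANSW
NTDWGDNGFF KILRGQDHCG IESEVVAGIP RTDQYWEKI
""".split())

# FT coordinates transcribed from the entry (1-based, inclusive).
# Expected active-site residues are an assumption (Cys/His/Asn triad).
ACTIVE_SITES = {108: "C", 278: "H", 298: "N"}
DISULFIDES = [(93, 122), (105, 150), (141, 207),
              (142, 146), (179, 211), (187, 198)]
REGIONS = {
    "signal": (1, 17),
    "activation_peptide": (18, 79),
    "light_chain": (80, 126),
    "heavy_chain": (129, 333),
    "c_terminal_propeptide": (334, 339),
}


def residue(pos):
    """Return the residue at a 1-based position."""
    return SEQ[pos - 1]


def region(name):
    """Return the subsequence for a named region (inclusive coordinates)."""
    start, end = REGIONS[name]
    return SEQ[start - 1:end]


def in_region(pos, name):
    start, end = REGIONS[name]
    return start <= pos <= end


def check_features():
    """Return a list of mismatches between FT annotations and the sequence."""
    problems = []
    for pos, expected in ACTIVE_SITES.items():
        if residue(pos) != expected:
            problems.append(f"ACT_SITE {pos}: found {residue(pos)}")
    for a, b in DISULFIDES:
        for pos in (a, b):
            if residue(pos) != "C":
                problems.append(f"DISULFID {a}..{b}: {pos} is {residue(pos)}")
    if residue(220) != "K":  # MOD_RES 220, N6-acetyllysine
        problems.append(f"MOD_RES 220: found {residue(220)}")
    # CARBOHYD 192: N-linked sites sit in an N-X-[ST] sequon, X not Pro.
    if not (residue(192) == "N" and residue(193) != "P"
            and residue(194) in "ST"):
        problems.append("CARBOHYD 192: no N-X-[ST] sequon")
    return problems


def interchain_disulfides():
    """Disulfides with one Cys in the light chain and one in the heavy chain."""
    return [(a, b) for a, b in DISULFIDES
            if in_region(a, "light_chain") and in_region(b, "heavy_chain")]


if __name__ == "__main__":
    print(len(SEQ), check_features(), interchain_disulfides())
```

For this entry the check reports no mismatches, and exactly one annotated disulfide (105..150) spans the light and heavy chains, which agrees with the SUBUNIT comment. Most of these annotations are inferred by similarity, so agreement with the sequence shows internal consistency only, not experimental confirmation.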