ID G3W184_SARHA Unreviewed; 407 AA.
AC G3W184;
DT 16-NOV-2011, integrated into UniProtKB/TrEMBL.
DT 07-APR-2021, sequence version 2.
DT 27-MAR-2024, entry version 58.
DE SubName: Full=Cathepsin O {ECO:0000313|Ensembl:ENSSHAP00000009189.2};
GN Name=CTSO {ECO:0000313|Ensembl:ENSSHAP00000009189.2};
OS Sarcophilus harrisii (Tasmanian devil) (Sarcophilus laniarius).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Metatheria; Dasyuromorphia; Dasyuridae; Sarcophilus.
OX NCBI_TaxID=9305 {ECO:0000313|Ensembl:ENSSHAP00000009189.2, ECO:0000313|Proteomes:UP000007648};
RN [1] {ECO:0000313|Ensembl:ENSSHAP00000009189.2, ECO:0000313|Proteomes:UP000007648}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=21709235; DOI=10.1073/pnas.1102838108;
RA Miller W., Hayes V.M., Ratan A., Petersen D.C., Wittekindt N.E., Miller J.,
RA Walenz B., Knight J., Qi J., Zhao F., Wang Q., Bedoya-Reina O.C.,
RA Katiyar N., Tomsho L.P., Kasson L.M., Hardie R.A., Woodbridge P.,
RA Tindall E.A., Bertelsen M.F., Dixon D., Pyecroft S., Helgen K.M.,
RA Lesk A.M., Pringle T.H., Patterson N., Zhang Y., Kreiss A., Woods G.M.,
RA Jones M.E., Schuster S.C.;
RT "Genetic diversity and population structure of the endangered marsupial
RT Sarcophilus harrisii (Tasmanian devil).";
RL Proc. Natl. Acad. Sci. U.S.A. 108:12348-12353(2011).
RN [2] {ECO:0000313|Ensembl:ENSSHAP00000009189.2}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Lysosome {ECO:0000256|ARBA:ARBA00004371}.
CC -!- SIMILARITY: Belongs to the peptidase C1 family.
CC {ECO:0000256|ARBA:ARBA00008455}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; G3W184; -.
DR STRING; 9305.ENSSHAP00000009189; -.
DR Ensembl; ENSSHAT00000009267.2; ENSSHAP00000009189.2; ENSSHAG00000007955.2.
DR eggNOG; KOG1542; Eukaryota.
DR GeneTree; ENSGT00940000159253; -.
DR HOGENOM; CLU_012184_1_3_1; -.
DR InParanoid; G3W184; -.
DR TreeFam; TF331594; -.
DR Proteomes; UP000007648; Unassembled WGS sequence.
DR GO; GO:0005764; C:lysosome; IEA:UniProtKB-SubCell.
DR GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR CDD; cd02248; Peptidase_C1A; 1.
DR Gene3D; 3.90.70.10; Cysteine proteinases; 1.
DR InterPro; IPR038765; Papain-like_cys_pep_sf.
DR InterPro; IPR000169; Pept_cys_AS.
DR InterPro; IPR025660; Pept_his_AS.
DR InterPro; IPR013128; Peptidase_C1A.
DR InterPro; IPR000668; Peptidase_C1A_C.
DR InterPro; IPR039417; Peptidase_C1A_papain-like.
DR PANTHER; PTHR12411:SF947; CATHEPSIN O; 1.
DR PANTHER; PTHR12411; CYSTEINE PROTEASE FAMILY C1-RELATED; 1.
DR Pfam; PF00112; Peptidase_C1; 1.
DR PRINTS; PR00705; PAPAIN.
DR SMART; SM00645; Pept_C1; 1.
DR SUPFAM; SSF54001; Cysteine proteinases; 1.
DR PROSITE; PS00139; THIOL_PROTEASE_CYS; 1.
DR PROSITE; PS00639; THIOL_PROTEASE_HIS; 1.
PE 3: Inferred from homology;
KW Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW Lysosome {ECO:0000256|ARBA:ARBA00023228};
KW Protease {ECO:0000256|ARBA:ARBA00022670};
KW Reference proteome {ECO:0000313|Proteomes:UP000007648};
KW Thiol protease {ECO:0000256|ARBA:ARBA00022807}.
FT DOMAIN 194..406
FT /note="Peptidase C1A papain C-terminal"
FT /evidence="ECO:0000259|SMART:SM00645"
FT REGION 1..73
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 109..128
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 407 AA; 44574 MW; 0181B5B941A93915 CRC64;
MGSLNSKDGS AAAGDIQRGA LAPRLDQRRA GSTQLGAPPP QPIPRPQLER EELPKRKRKQ
ACGARRPPSS EPMKPAGLQW LWLFWGCSCS LGSAPPGLPR GALPGAAAAA NSSRSRLDSP
ERSEKRSAAF RESLKRHRYL NSFSSRANTS AIYGINQFSH LFPEEFRAIY LRSKPSQLPL
YHKELKMPAT HMPLPIRFDW RDKNVVTKVR NQQMCGGCWA FSVVGGIESA YAIKGESLED
LSVQQVIDCS YNNFGCSGGS TVNALNWLNK TQVRLVRDSE YSFKAQTGLC HYFSGSHAGV
SIKGYSSYDF SDKEDEMAKV LLAYGPLAVI VDAISWQDYL GGIIQHHCSS GEANHAVLIT
GFDKTGNTPY WIVRNSWGTS WGVDGYAFVK MGANICGIAD SVSAVFV
//