ID A0A2I3HGI2_NOMLE Unreviewed; 2093 AA.
AC A0A2I3HGI2;
DT 28-FEB-2018, integrated into UniProtKB/TrEMBL.
DT 28-FEB-2018, sequence version 1.
DT 27-MAR-2024, entry version 31.
DE RecName: Full=Aggrecan core protein {ECO:0000256|ARBA:ARBA00039399};
DE AltName: Full=Cartilage-specific proteoglycan core protein {ECO:0000256|ARBA:ARBA00042947};
OS Nomascus leucogenys (Northern white-cheeked gibbon) (Hylobates leucogenys).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hylobatidae;
OC Nomascus.
OX NCBI_TaxID=61853 {ECO:0000313|Ensembl:ENSNLEP00000042612.1, ECO:0000313|Proteomes:UP000001073};
RN [1] {ECO:0000313|Ensembl:ENSNLEP00000042612.1, ECO:0000313|Proteomes:UP000001073}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RG Gibbon Genome Sequencing Consortium;
RL Submitted (OCT-2012) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSNLEP00000042612.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000256|ARBA:ARBA00004498}.
CC -!- SIMILARITY: Belongs to the aggrecan/versican proteoglycan family.
CC {ECO:0000256|ARBA:ARBA00006838}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00323}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; ADFV01044460; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADFV01044461; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADFV01044462; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADFV01044463; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR Ensembl; ENSNLET00000046708.1; ENSNLEP00000042612.1; ENSNLEG00000030173.1.
DR GeneTree; ENSGT00940000155971; -.
DR Proteomes; UP000001073; Chromosome 6.
DR GO; GO:0005540; F:hyaluronic acid binding; IEA:InterPro.
DR GO; GO:0005537; F:mannose binding; IEA:UniProtKB-KW.
DR GO; GO:0007155; P:cell adhesion; IEA:InterPro.
DR GO; GO:0045087; P:innate immune response; IEA:UniProtKB-KW.
DR CDD; cd00033; CCP; 1.
DR CDD; cd03588; CLECT_CSPGs; 1.
DR CDD; cd05900; Ig_Aggrecan; 1.
DR CDD; cd03517; Link_domain_CSPGs_modules_1_3; 1.
DR CDD; cd03520; Link_domain_CSPGs_modules_2_4; 2.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 5.
DR InterPro; IPR001304; C-type_lectin-like.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR018378; C-type_lectin_CS.
DR InterPro; IPR033987; CSPG_CTLD.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR007110; Ig-like_dom.
DR InterPro; IPR036179; Ig-like_dom_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR003006; Ig/MHC_CS.
DR InterPro; IPR003599; Ig_sub.
DR InterPro; IPR013106; Ig_V-set.
DR InterPro; IPR000538; Link_dom.
DR InterPro; IPR035976; Sushi/SCR/CCP_sf.
DR InterPro; IPR000436; Sushi_SCR_CCP_dom.
DR PANTHER; PTHR22804:SF42; AGGRECAN CORE PROTEIN; 1.
DR PANTHER; PTHR22804; AGGRECAN/VERSICAN PROTEOGLYCAN; 1.
DR Pfam; PF00059; Lectin_C; 1.
DR Pfam; PF00084; Sushi; 1.
DR Pfam; PF07686; V-set; 1.
DR Pfam; PF00193; Xlink; 3.
DR PRINTS; PR01265; LINKMODULE.
DR SMART; SM00032; CCP; 1.
DR SMART; SM00034; CLECT; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00406; IGv; 1.
DR SMART; SM00445; LINK; 3.
DR SUPFAM; SSF56436; C-type lectin-like; 5.
DR SUPFAM; SSF57535; Complement control module/SCR domain; 1.
DR SUPFAM; SSF48726; Immunoglobulin; 1.
DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS00290; IG_MHC; 1.
DR PROSITE; PS01241; LINK_1; 2.
DR PROSITE; PS50963; LINK_2; 3.
DR PROSITE; PS50923; SUSHI; 1.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00302}; EGF-like domain {ECO:0000256|ARBA:ARBA00022536};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Immunity {ECO:0000256|ARBA:ARBA00022859};
KW Immunoglobulin domain {ECO:0000256|ARBA:ARBA00023319};
KW Innate immunity {ECO:0000256|ARBA:ARBA00022588};
KW Lectin {ECO:0000256|ARBA:ARBA00022734};
KW Mannose-binding {ECO:0000256|ARBA:ARBA00023035};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP000001073};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP};
KW Sushi {ECO:0000256|ARBA:ARBA00022659, ECO:0000256|PROSITE-
KW ProRule:PRU00302}.
FT SIGNAL 1..16
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 17..2093
FT /note="Aggrecan core protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5014190953"
FT DOMAIN 34..147
FT /note="Ig-like"
FT /evidence="ECO:0000259|PROSITE:PS50835"
FT DOMAIN 153..248
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 254..352
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 542..638
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 1890..2004
FT /note="C-type lectin"
FT /evidence="ECO:0000259|PROSITE:PS50041"
FT DOMAIN 2008..2068
FT /note="Sushi"
FT /evidence="ECO:0000259|PROSITE:PS50923"
FT REGION 708..943
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1144..1205
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1298..1375
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1387..1412
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1551..1624
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1651..1735
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1760..1810
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1850..1870
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2074..2093
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 728..748
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 755..771
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1154..1168
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1312..1328
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1342..1375
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1607..1624
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1696..1710
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2076..2093
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 199..220
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 297..318
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 585..606
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 2010..2053
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
FT DISULFID 2039..2066
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
SQ SEQUENCE 2093 AA; 218460 MW; F2B5B6BE149B8AC0 CRC64;
MTTLLWVFVT LRVITAAVTV ETSDHDNSLS VSIPQPSPLR VLLGTSLTIP CYFIDPMHPV
TTAPSTAPLA PRIKWSRVSK EKEVVLLVAT EGRVRVNSAY QDKVSLPNYP AIPSDATLEI
QSLRSNDSGV YRCEVMHGIE DSEATLEVVV KGIVFHYRAI STRYTLDFDR AQRACLQNSA
IIATPEQLQA AYEDGFHQCD AGWLADQTVR YPIHTPREGC YGDKDEFPGV RTYGIRDTNE
TYDVYCFAEE MEGEVFYATS PEKFTFQEAA NECRRLGARL ATTGQLYLAW QAGMDMCSAG
WLADRSVRYP ISKARPNCGG NLLGVRTVYM HANQTGGARL VVGREFTPLK WDFVDIPENF
FGVGGEEDIT IQTVTWPDME LPLPRNITEG EARGSVILTV KPIFEVSPSP LEPEEPFTFA
PEIGATAFPE VENETGEATR PWGFPTPGLG PATAFTSEDL VVQVTAVPGQ PHLPGAQGPA
STRPWNRPLS SPPTLLCRYP IVSPRTPCVG DKDSSPGVRT YGVRPSTETY DVYCYVDRLE
GEVFFATRLE QFTFQEALEF CESNNATLAT TGQLYAAWSR GLDKCYAGWL ADGSLRYPIV
TPRPACGGDK PGVRTVYLYP NQTGLPDPLS RHHAFCFRGI SAVPSPGEEE GGTPTSPSVV
EEWIVTQVVP GVAAVPVEEE TTAVPLGETT AILEFTIEPE NQTEWEPAYT PVGTSPLPGI
LPTWPPTGAA TEESTEGPSA TEVPSASEEP SPSEAPFPSE EPSPSEEPFP SVRPFPSVEL
FPSEEPFPSK EPSPSEEPSA SEEPYTPSPP VPSWTELPSS GEESGAPDVS GDFTGSGDVS
GHLDFSGQLS GDRASGLPSG DLDSSGDEER IEWSSTPTVG ELPSGAEILE GSASGVGDLS
GLPSGEVLET SASGVGDLSG LPSGEVLETS TSGVGDLSGL PSGEVLETTA SGVEDISGPP
SGEVLETTAS GVEDISGLPS GEVLETTASG VEDISGLPSG EILETTASGV EDISGLPSGE
VLETTASGVG DLSGLPSGEV LETSTSGVGD LSGLPSGEVL ETSTSAVGDL SGLPSGGEVL
ETSASGVEDI SGLPSGEVVE TTASGIEDVS ELPSGEGLET SASGVEDLSK LPSGEEVLEI
SASGVGDLSG LPSGGESLET SASEVGTDLS GLPSGREGLE TSASGAEDLS GLPSGKEDLV
GSASGDLDLG KLPSVTLGSG RAPETSGLPS GFSGEYSGVD FGSGPPSGLP DFSGLPSGFP
TVSLVDSTLV EVVTASTASE LEGRGTIGIS GAGEISGLPS SELDISGGAS GLPSGTELSG
QASGSPDVSG ETPGLFDVSG QPSGFPDTSG EISGVTELSG LSSGQPGVSG EASGVLYGSS
QPFGITDLSG ETSGVPDLSG QPSGLPGFSG ATSGVPDLVS GATSGSSESS GITFVDTSLV
EVTPTTFKEE EGLGSVELSG LPSGEADLSG KSGMVDVSGQ FSGTVDSSGF TSQTPEFSGL
PSGIAEVGGE SSGAEIGSSL PSGAYYGSGL PSGFATVSLV DRTLVESVTQ APTAQEAGEG
PSGILELSGA HSGAPDMSGD HSGFLDLSGL QSRLVEPSGE PPSTPYFSGD FASTTSVSGE
SSVAMGTSGE ALGLPEVTLI TSEFVEVVTE PTVSQELGQR PPVTHTPPLF ESSGEVSAAG
DISGATPMLP GSGVEVSSVP ESSSETSAYP EAGVGASAAP EASREDSGSP DLSETTSAFH
EADLERSSGL GVSGSTLTFQ EGEASAAPEV SGESTTTHDV GTEAPGLPSA TPMASGDRTE
ISRDLSGHTS GLGVVISTSI PESEWTQQTQ RPAEAHLEIE SSSLLYSGEE THTVETATSP
TDASIPTSPE WKRESDQLLQ EVCEEGWNKY QGHCYRHFPD RETWVDAERR CREQQSHLSS
IVTPEEQEFV NNNAQDYQWI GLNDRTIEGD FRWSDGHPMQ FENWRPNQPD NFFAAGEDCV
VMIWHEKGEW NDVPCNYHLP FTCKKGTVAC GEPPVVEHAR TFGQKKDRYE INSLVRYQCT
EGFVQRHMPT IRCQPSGHWE EPRITCTDPT TYKRRLQKRS SRHPRRSRPS TAH
//