ID A0A093QWU3_PYGAD Unreviewed; 2114 AA.
AC A0A093QWU3;
DT 26-NOV-2014, integrated into UniProtKB/TrEMBL.
DT 26-NOV-2014, sequence version 1.
DT 27-MAR-2024, entry version 50.
DE RecName: Full=Aggrecan core protein {ECO:0000256|ARBA:ARBA00039399};
DE AltName: Full=Cartilage-specific proteoglycan core protein {ECO:0000256|ARBA:ARBA00042947};
GN ORFNames=AS28_04999 {ECO:0000313|EMBL:KFW63169.1};
OS Pygoscelis adeliae (Adelie penguin).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
OC Coelurosauria; Aves; Neognathae; Sphenisciformes; Spheniscidae; Pygoscelis.
OX NCBI_TaxID=9238 {ECO:0000313|EMBL:KFW63169.1, ECO:0000313|Proteomes:UP000054081};
RN [1] {ECO:0000313|EMBL:KFW63169.1, ECO:0000313|Proteomes:UP000054081}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=BGI_AS28 {ECO:0000313|EMBL:KFW63169.1};
RA Zhang G., Li C.;
RT "Genome evolution of avian class.";
RL Submitted (APR-2014) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000256|ARBA:ARBA00004498}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00076}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; KL224733; KFW63169.1; -; Genomic_DNA.
DR STRING; 9238.A0A093QWU3; -.
DR Proteomes; UP000054081; Unassembled WGS sequence.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005509; F:calcium ion binding; IEA:InterPro.
DR GO; GO:0030246; F:carbohydrate binding; IEA:UniProtKB-KW.
DR GO; GO:0005540; F:hyaluronic acid binding; IEA:InterPro.
DR GO; GO:0007155; P:cell adhesion; IEA:InterPro.
DR CDD; cd00033; CCP; 1.
DR CDD; cd03588; CLECT_CSPGs; 1.
DR CDD; cd00054; EGF_CA; 1.
DR CDD; cd03517; Link_domain_CSPGs_modules_1_3; 2.
DR CDD; cd03520; Link_domain_CSPGs_modules_2_4; 2.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR Gene3D; 2.10.25.10; Laminin; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 5.
DR InterPro; IPR001304; C-type_lectin-like.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR018378; C-type_lectin_CS.
DR InterPro; IPR033987; CSPG_CTLD.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR001881; EGF-like_Ca-bd_dom.
DR InterPro; IPR000742; EGF-like_dom.
DR InterPro; IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
DR InterPro; IPR018097; EGF_Ca-bd_CS.
DR InterPro; IPR007110; Ig-like_dom.
DR InterPro; IPR036179; Ig-like_dom_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR003599; Ig_sub.
DR InterPro; IPR013106; Ig_V-set.
DR InterPro; IPR000538; Link_dom.
DR InterPro; IPR035976; Sushi/SCR/CCP_sf.
DR InterPro; IPR000436; Sushi_SCR_CCP_dom.
DR PANTHER; PTHR22804:SF42; AGGRECAN CORE PROTEIN; 1.
DR PANTHER; PTHR22804; AGGRECAN/VERSICAN PROTEOGLYCAN; 1.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00059; Lectin_C; 1.
DR Pfam; PF00084; Sushi; 1.
DR Pfam; PF07686; V-set; 1.
DR Pfam; PF00193; Xlink; 4.
DR PRINTS; PR01265; LINKMODULE.
DR SMART; SM00032; CCP; 1.
DR SMART; SM00034; CLECT; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00179; EGF_CA; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00445; LINK; 4.
DR SUPFAM; SSF56436; C-type lectin-like; 5.
DR SUPFAM; SSF57535; Complement control module/SCR domain; 1.
DR SUPFAM; SSF48726; Immunoglobulin; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS01187; EGF_CA; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS01241; LINK_1; 3.
DR PROSITE; PS50963; LINK_2; 4.
DR PROSITE; PS50923; SUSHI; 1.
PE 4: Predicted;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00076};
KW EGF-like domain {ECO:0000256|ARBA:ARBA00022536, ECO:0000256|PROSITE-
KW ProRule:PRU00076}; Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Immunoglobulin domain {ECO:0000256|ARBA:ARBA00023319};
KW Lectin {ECO:0000256|ARBA:ARBA00022734};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP000054081};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022525}; Signal {ECO:0000256|SAM:SignalP};
KW Sushi {ECO:0000256|ARBA:ARBA00022659, ECO:0000256|PROSITE-
KW ProRule:PRU00302}.
FT SIGNAL 1..17
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 18..2114
FT /note="Aggrecan core protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5001886841"
FT DOMAIN 34..143
FT /note="Ig-like"
FT /evidence="ECO:0000259|PROSITE:PS50835"
FT DOMAIN 149..244
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 250..346
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 522..617
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 623..719
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 1858..1894
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1907..2021
FT /note="C-type lectin"
FT /evidence="ECO:0000259|PROSITE:PS50041"
FT DOMAIN 2025..2085
FT /note="Sushi"
FT /evidence="ECO:0000259|PROSITE:PS50923"
FT REGION 800..834
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 901..926
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 967..1048
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1217..1246
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1314..1339
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1405..1427
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1745..1767
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1790..1821
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2094..2114
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1318..1339
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 195..216
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 293..314
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 568..589
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 666..687
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 1884..1893
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 2027..2070
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
FT DISULFID 2056..2083
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
SQ SEQUENCE 2114 AA; 225393 MW; 6BCA1AA16F7CF349 CRC64;
MTTLLLVFVC LRVITTAISV ELSDSSDGLE VKIPEQSPLR VVLGSSLNIP CYFNIPEEQD
TSALLTPRIK WSKLSNGTEV VLLVATGGKI RLNTEYREAI SLPNYPAIPT DATLEIKALR
SNHTGIYRCE VMYGIEDRQD TIEILVKGIV FHYRAISTRY TLNFEKAKQA CIQNSAVIAT
PEQLQAAYED GYEQCDAGWL ADQTVRYPIH WPRERCYGDK DEFPGVRTYG VREPDETYDV
YCYAEQMQGK VFYATDPEKF TFQEAFDKCR SLGARLATTG ELYLAWKDGM DMCSAGWLAD
RSVRYPISRA RPNCGGNLVG VRTVYLYVNQ TGYPHPHSRY DAICYSGDDV ETLVPGQFID
ETGSELGSAF TVQTVTQTEV ELPLPRNATE EEARGSIATL EPIEITPTAT ELYEGFTVLP
DLFATSVTVE TAAPEEENVT RGDVTGVWAV PEEVTTIALG TAITTETAEV SSVEEAMGVT
ATPGLEHWSL HCCWEDPFWP HISVEGGSCI CCLTTASLPP GVVFHYRAAT SRYAFSFVQA
QQACLENNAV IATPEQLQAA YEAGFDQCDA GWLRDQTVRY PIVNPRSNCL GDKESSPGVR
SYGMRPASET YDVYCYIDRL KGEVFFATQP EQFTFPEAQQ YCESQNATLA SVGQLHAAWK
QGLDRCYAGW LADGSLRYPI VSPRPACGGD APGVRTVYEL YNQTGFPDPL SRHHAFCFRA
LPPAEEEGVT SFFEEDVLAT QVIPGVEEVP SGEEATMETE FATQAENQTA WGTEVFPTDV
SLLSVSPSAF PPATIIPEET STNASVSEVS GEVTESGEHQ VSGESSASGW VSGVPDTSGE
LTSGVFELSG EHSGTGESGL PSVDLHTSGF LPGESGLPSG DLSGVPSGVV DISGLPSAEE
DVSVSTSRIP EVSGMPSGVE SSGLPSGFSG EVSGTELVSG VSSAEESGLA SGFPTVSLVD
TTLVEVVTTA PERREEGKGS IGVSGEGDLS GFPSAEWDTS GGTQGGEPSG GPELGGEPSG
VPELSGEPSG GPELSGLPSG LDVSGELSGT HEISGLVDLS GLTSGIDGSG EASGITFVDA
SLEEVTTTPS ITEAEAKEIL EISGLPSGGE QSSGMVSGSL DISGEPSGHV DFGGSVSGVL
EMSGYPSGTI DSSGEVSGVD VTSGLLSGEE SGLTSGFPTV SLVDTTLVEV VTQTSVAQEV
GEGPSGMIEI SGFPSGDRGL SGEGSGAVET SGFPSGTGDF SGEPSRIPYI SGDISGATDL
SGQSSAVTDI SGEASGLPEV TLVTSDLVKV VTRPTVSQEL GGETAVTFPY GFGPSGEASS
SGELSGETSA LPESGRETST AYEISGETSA FPETSVETST IHEISGETSA FPEFSIETST
IQEISGETSA FPEIIIKTST IQEVSGETSA FPESSTETST SQEISGETSA FPEIRIETST
IQDISGETSA FPEIRIETFT SQEARGETSG YPEISIETST VHETSGETSA FPEISIETST
VHEIIREISG ESSAFPEIRI ETSTNQEARG ETSAYPEISI ETSTVHETSG EASAFPDISI
ETSTVHEISG ETSAFPEISI ETPTVHEISG ETSAFPKISI ETSTVHEISG ETSAFPEIRI
ETSASQEARG ETSALPEISI ETSTVHEFSG ETSAFPEISI ETPRSQEARG ETSAFPEINI
ETSTVQELSG ETSAFPEIRI ETSTSQEAQG ETSAFPEISI ETSTVHETSG EASALPAANI
ETAATSLASG EPSGAPEEKE IPDTTSGAVT HSIAGVSGET SVPDVVISTS APDVEPTQGP
RNPEEAQLEI EPSPPAVSGQ KTETDVVLNN PHLLATAAAA LPQVPQEAID TLGPTTEDTD
ECHSSPCLNG ATCVDGIDSF KCLCLPSYGG DLCEIDLENC EEGWTKFQGH CYRHFEERET
WMDAETRCRQ HQAHLSSIIT PEEQQFVNSH AQDYQWIGLS DRAVENDFRW SDGHSLQFEN
WRAHQPDNFF AAGEDCVVMI WHEQGEWNDV PCNYHLPFTC KKGTVVCGDP PVVENARTFG
RKKDRYEINS MVRYQCNQGY IQRHVPTIRC QPNGQWEEPR ISCINPSNYQ RRLYKRSPRS
RSRPSGRAVH RPTH
//