ID A0A556TS46_BAGYA Unreviewed; 1361 AA.
AC A0A556TS46;
DT 16-OCT-2019, integrated into UniProtKB/TrEMBL.
DT 16-OCT-2019, sequence version 1.
DT 27-MAR-2024, entry version 15.
DE SubName: Full=Neurocan core protein {ECO:0000313|EMBL:TSK53689.1};
GN ORFNames=Baya_3249 {ECO:0000313|EMBL:TSK53689.1};
OS Bagarius yarrelli (Goonch) (Bagrus yarrelli).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Siluriformes;
OC Sisoridae; Sisorinae; Bagarius.
OX NCBI_TaxID=175774 {ECO:0000313|EMBL:TSK53689.1, ECO:0000313|Proteomes:UP000319801};
RN [1] {ECO:0000313|EMBL:TSK53689.1, ECO:0000313|Proteomes:UP000319801}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=JWS20170419001 {ECO:0000313|EMBL:TSK53689.1};
RC TISSUE=Muscle {ECO:0000313|EMBL:TSK53689.1};
RX PubMed=31274158;
RA Jiang W., Lv Y., Cheng L., Yang K., Chao B., Wang X., Li Y., Pan X.,
RA You X., Zhang Y., Yang J., Li J., Zhang X., Liu S., Sun C., Yang J.,
RA Shi Q.;
RT "Whole-Genome Sequencing of the Giant Devil Catfish, Bagarius yarrelli.";
RL Genome Biol. Evol. 11:2071-2077(2019).
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00076}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:TSK53689.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; VCAZ01000015; TSK53689.1; -; Genomic_DNA.
DR Proteomes; UP000319801; Unassembled WGS sequence.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005509; F:calcium ion binding; IEA:InterPro.
DR GO; GO:0030246; F:carbohydrate binding; IEA:UniProtKB-KW.
DR GO; GO:0005540; F:hyaluronic acid binding; IEA:InterPro.
DR GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR GO; GO:0007155; P:cell adhesion; IEA:InterPro.
DR CDD; cd00033; CCP; 1.
DR CDD; cd00054; EGF_CA; 1.
DR CDD; cd03517; Link_domain_CSPGs_modules_1_3; 1.
DR CDD; cd03520; Link_domain_CSPGs_modules_2_4; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR Gene3D; 2.10.25.10; Laminin; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 3.
DR InterPro; IPR001304; C-type_lectin-like.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR018378; C-type_lectin_CS.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR001881; EGF-like_Ca-bd_dom.
DR InterPro; IPR000742; EGF-like_dom.
DR InterPro; IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
DR InterPro; IPR018097; EGF_Ca-bd_CS.
DR InterPro; IPR007110; Ig-like_dom.
DR InterPro; IPR036179; Ig-like_dom_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR003599; Ig_sub.
DR InterPro; IPR013106; Ig_V-set.
DR InterPro; IPR000538; Link_dom.
DR InterPro; IPR035976; Sushi/SCR/CCP_sf.
DR InterPro; IPR000436; Sushi_SCR_CCP_dom.
DR PANTHER; PTHR22804; AGGRECAN/VERSICAN PROTEOGLYCAN; 1.
DR PANTHER; PTHR22804:SF24; NEUROCAN CORE PROTEIN; 1.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00059; Lectin_C; 1.
DR Pfam; PF00084; Sushi; 1.
DR Pfam; PF07686; V-set; 1.
DR Pfam; PF00193; Xlink; 2.
DR PRINTS; PR01265; LINKMODULE.
DR SMART; SM00032; CCP; 1.
DR SMART; SM00034; CLECT; 1.
DR SMART; SM00181; EGF; 1.
DR SMART; SM00179; EGF_CA; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00445; LINK; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 3.
DR SUPFAM; SSF57535; Complement control module/SCR domain; 1.
DR SUPFAM; SSF57196; EGF/Laminin; 1.
DR SUPFAM; SSF48726; Immunoglobulin; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
DR PROSITE; PS00022; EGF_1; 1.
DR PROSITE; PS50026; EGF_3; 1.
DR PROSITE; PS01187; EGF_CA; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS01241; LINK_1; 1.
DR PROSITE; PS50963; LINK_2; 2.
DR PROSITE; PS50923; SUSHI; 1.
PE 4: Predicted;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00076};
KW EGF-like domain {ECO:0000256|ARBA:ARBA00022536, ECO:0000256|PROSITE-
KW ProRule:PRU00076}; Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Immunoglobulin domain {ECO:0000256|ARBA:ARBA00023319};
KW Lectin {ECO:0000256|ARBA:ARBA00022734};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP000319801};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022525}; Signal {ECO:0000256|SAM:SignalP};
KW Sushi {ECO:0000256|ARBA:ARBA00022659, ECO:0000256|PROSITE-
KW ProRule:PRU00302}.
FT SIGNAL 1..29
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 30..1361
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5022085755"
FT DOMAIN 53..153
FT /note="Ig-like"
FT /evidence="ECO:0000259|PROSITE:PS50835"
FT DOMAIN 160..255
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 261..357
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 1062..1098
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1144..1258
FT /note="C-type lectin"
FT /evidence="ECO:0000259|PROSITE:PS50041"
FT DOMAIN 1262..1322
FT /note="Sushi"
FT /evidence="ECO:0000259|PROSITE:PS50923"
FT REGION 444..484
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 923..967
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1324..1361
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 454..470
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 923..945
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 953..967
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1324..1351
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 206..227
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 304..325
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 1088..1097
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1264..1307
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
FT DISULFID 1293..1320
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
SQ SEQUENCE 1361 AA; 151467 MW; A2F3017F449CFE81 CRC64;
MLSVFHVSGL QSLLTVLLVV SLEHETVWGE TIVNMRKVTH QVEQIPLSGT ALLPCIFTLR
PSPSHESHDI PRIKWTKIWG HRGLDGLQRE QSVLVAKGNV VKVKKAFQGR VTLPGYMENR
YNASLALTGL RSSDSGMYRC EVVVGLNDEQ DIVPLQVTGV VFHYRAPYDR YALSFADAKQ
ACVENSGVIA TPGQLQATFD DGYDNCDAGW LSDQTVRYPI QSPRPGCFGD REDSPGVRNY
GSRDTDELFD VYCFAESLKG EVFYVNVPEK LSLATASTHC HKLGAQLATV GQLYLAWQAG
LDRCDPGWLA DGSVRYPINQ PRRNCGGDET GVRTLYHNPN RTGFPDTASL FDAYCYRESQ
PIALALMQTS QTSSNSTDDW EQLQQNQSFA QPSNWTGLVD LDKEEFTSIA DKESSEISGE
HVVIHLSPKE RPVSQVMHVS NYQSPPVLEL NGGSAREESD EREVTEDHRL APKPSMTTSS
SSIETQTSNS MLFNFVNSIM KPWKYWKGNT DTDVPSSVPS IVKKTANEVQ TARMKPGGEL
RGTKGDVENA IVEPGSIPTD PLTPYSEEGL LEQEKDMVLS IPIKELKAPS QLHMEPSVSL
PTASKGLIAS GRIEVINPTA ATSPLGSGSS FVFMPEVASQ QEQASQSLGV YTTTSSNGIY
TTPTSEPVET KREAMEMRTI PVLGDHENYS GEGKDHIEES YLLPKVSTLQ SSPQTGILEK
ENDTDGSGGN EFFIISSPKK QGTPEALKEQ NVLIQTTVQW ELLQLPTRPP SHSPGVSKDI
ADPEEARGEI LYMHHPTQKL ENNSFDKIAV EKSLSKHTPN VSEIPNPYMA TTPEHSPNNS
STNYQMPSTS TITTLNSQIE NTDTTAAEDT VSTSDLAISV SWLPVVQKET QEPPTHVPNI
VTDGRFETTS NTILLTTPYM EEATQHAETE PQTQAAPTTK TTSKDNTSKS TDDEQENSSA
SSGNNGAFTY EATSKDRCEC LLYPGHHLIA PHHQPAHVGC HVAYVTSFWR SRKKGRLLCE
CSWFKSINRS FGFLALQPES CLLAGQDSFS LAIAQETILV DDIDECQSNP CQNGGTCIDE
INSFVCLCLP SYGGATCEKA LWLIALSSQP GPSAAEHANH SPYGPICTGQ FADTEGCDHN
WRKFHGHCYR YFIHRLNWED AEKDCREHNG HLASIHTREE QNFINSMSHE NTWIGLNDRT
VEEDFHWTDN MDLQYENWRE NQPDNFFAGG EDCVVMIAHE NGKWNDVPCN YKLPYICKKG
TVLCGPPPLV DNAFLIGRKR SHYDIHSVVR YQCADGFLQR HVPTTKCRAS GKWDHPKILC
TKSRRSHRYR RHHHKSHHER RKHKKHGSDS HRGRHDSRDH F
//