ID W5Q233_SHEEP Unreviewed; 3375 AA.
AC W5Q233;
DT 16-APR-2014, integrated into UniProtKB/TrEMBL.
DT 16-APR-2014, sequence version 1.
DT 24-JAN-2024, entry version 70.
DE SubName: Full=Versican {ECO:0000313|Ensembl:ENSOARP00000016771.1};
GN Name=VCAN {ECO:0000313|Ensembl:ENSOARP00000016771.1};
OS Ovis aries (Sheep).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Ruminantia; Pecora; Bovidae;
OC Caprinae; Ovis.
OX NCBI_TaxID=9940 {ECO:0000313|Ensembl:ENSOARP00000016771.1, ECO:0000313|Proteomes:UP000002356};
RN [1] {ECO:0000313|Ensembl:ENSOARP00000016771.1, ECO:0000313|Proteomes:UP000002356}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Texel {ECO:0000313|Ensembl:ENSOARP00000016771.1,
RC ECO:0000313|Proteomes:UP000002356};
RX PubMed=20809919; DOI=10.1111/j.1365-2052.2010.02100.x;
RA Archibald A.L., Cockett N.E., Dalrymple B.P., Faraut T., Kijas J.W.,
RA Maddox J.F., McEwan J.C., Hutton Oddy V., Raadsma H.W., Wade C., Wang J.,
RA Wang W., Xun X.;
RT "The sheep genome reference sequence: a work in progress.";
RL Anim. Genet. 41:449-453(2010).
RN [2] {ECO:0000313|Ensembl:ENSOARP00000016771.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (JUL-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000256|ARBA:ARBA00004498}.
CC -!- SIMILARITY: Belongs to the aggrecan/versican proteoglycan family.
CC {ECO:0000256|ARBA:ARBA00006838}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00076}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AMGL01094251; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01094252; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01094253; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01094254; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; XP_004009116.1; XM_004009067.3.
DR SMR; W5Q233; -.
DR STRING; 9940.ENSOARP00000016771; -.
DR PaxDb; 9940-ENSOARP00000016771; -.
DR Ensembl; ENSOART00000017009.1; ENSOARP00000016771.1; ENSOARG00000015632.1.
DR GeneID; 100294605; -.
DR KEGG; oas:100294605; -.
DR CTD; 1462; -.
DR eggNOG; ENOG502QRBE; Eukaryota.
DR HOGENOM; CLU_000303_1_1_1; -.
DR OMA; ELTWKPE; -.
DR OrthoDB; 5323609at2759; -.
DR Proteomes; UP000002356; Chromosome 5.
DR Bgee; ENSOARG00000015632; Expressed in mitral valve and 51 other cell types or tissues.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005509; F:calcium ion binding; IEA:InterPro.
DR GO; GO:0030246; F:carbohydrate binding; IEA:UniProtKB-KW.
DR GO; GO:0005540; F:hyaluronic acid binding; IEA:InterPro.
DR GO; GO:0007155; P:cell adhesion; IEA:InterPro.
DR CDD; cd00033; CCP; 1.
DR CDD; cd03588; CLECT_CSPGs; 1.
DR CDD; cd00054; EGF_CA; 2.
DR CDD; cd05901; Ig_Versican; 1.
DR CDD; cd03517; Link_domain_CSPGs_modules_1_3; 1.
DR CDD; cd03520; Link_domain_CSPGs_modules_2_4; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR Gene3D; 2.10.25.10; Laminin; 2.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 3.
DR InterPro; IPR001304; C-type_lectin-like.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR018378; C-type_lectin_CS.
DR InterPro; IPR033987; CSPG_CTLD.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR001881; EGF-like_Ca-bd_dom.
DR InterPro; IPR000742; EGF-like_dom.
DR InterPro; IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
DR InterPro; IPR018097; EGF_Ca-bd_CS.
DR InterPro; IPR007110; Ig-like_dom.
DR InterPro; IPR036179; Ig-like_dom_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR003599; Ig_sub.
DR InterPro; IPR013106; Ig_V-set.
DR InterPro; IPR000538; Link_dom.
DR InterPro; IPR035976; Sushi/SCR/CCP_sf.
DR InterPro; IPR000436; Sushi_SCR_CCP_dom.
DR PANTHER; PTHR22804; AGGRECAN/VERSICAN PROTEOGLYCAN; 1.
DR Pfam; PF00008; EGF; 2.
DR Pfam; PF00059; Lectin_C; 1.
DR Pfam; PF00084; Sushi; 1.
DR Pfam; PF07686; V-set; 1.
DR Pfam; PF00193; Xlink; 2.
DR PRINTS; PR01265; LINKMODULE.
DR SMART; SM00032; CCP; 1.
DR SMART; SM00034; CLECT; 1.
DR SMART; SM00181; EGF; 2.
DR SMART; SM00179; EGF_CA; 2.
DR SMART; SM00409; IG; 1.
DR SMART; SM00406; IGv; 1.
DR SMART; SM00445; LINK; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 3.
DR SUPFAM; SSF57535; Complement control module/SCR domain; 1.
DR SUPFAM; SSF57196; EGF/Laminin; 1.
DR SUPFAM; SSF48726; Immunoglobulin; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
DR PROSITE; PS00022; EGF_1; 2.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 2.
DR PROSITE; PS01187; EGF_CA; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS01241; LINK_1; 1.
DR PROSITE; PS50963; LINK_2; 2.
DR PROSITE; PS50923; SUSHI; 1.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00076};
KW EGF-like domain {ECO:0000256|ARBA:ARBA00022536, ECO:0000256|PROSITE-
KW ProRule:PRU00076}; Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Immunoglobulin domain {ECO:0000256|ARBA:ARBA00023319};
KW Lectin {ECO:0000256|ARBA:ARBA00022734};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP000002356};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022525}; Signal {ECO:0000256|SAM:SignalP};
KW Sushi {ECO:0000256|ARBA:ARBA00022659, ECO:0000256|PROSITE-
KW ProRule:PRU00302}.
FT SIGNAL 1..20
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 21..3375
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5004869736"
FT DOMAIN 33..147
FT /note="Ig-like"
FT /evidence="ECO:0000259|PROSITE:PS50835"
FT DOMAIN 151..246
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 252..348
FT /note="Link"
FT /evidence="ECO:0000259|PROSITE:PS50963"
FT DOMAIN 3068..3104
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 3106..3142
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 3155..3269
FT /note="C-type lectin"
FT /evidence="ECO:0000259|PROSITE:PS50041"
FT DOMAIN 3273..3333
FT /note="Sushi"
FT /evidence="ECO:0000259|PROSITE:PS50923"
FT REGION 417..439
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 556..575
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 588..623
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 817..865
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1043..1081
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1219..1245
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1414..1539
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1707..1760
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1795..1818
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1954..1984
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2038..2087
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2099..2138
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2312..2389
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2593..2614
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2815..2898
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 3350..3375
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 600..617
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 850..864
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1219..1233
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1430..1469
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1508..1532
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1722..1760
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1956..1984
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2041..2058
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2059..2073
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2115..2138
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2316..2332
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2341..2389
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2831..2850
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2866..2882
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 3357..3375
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 197..218
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 295..316
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00323"
FT DISULFID 3094..3103
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 3132..3141
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 3275..3318
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
FT DISULFID 3304..3331
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00302"
SQ SEQUENCE 3375 AA; 368506 MW; 98977C685F13D130 CRC64;
MLINIKSILW MCSTLIAAHA LQKVNMEKSP SVKGSLSGKV NLPCHFSTMP TLPPSYNTTS
EFLRIKWSKI ELDKTGKDLK ETTVLVAQNG NIKIGQDYKG RVSVPTHPED VGDASLTMVK
LLASDAGRYR CDVMYGIEDT QDTVSLTVEG VVFHYRAATS RYTLNFEMAQ KACVDIGAVI
ATPEQLHAAY EDGFEQCDAG WLSDQTVRYP IRVPREGCYG DMMGKEGVRT YGFRAPHETY
DVYCYVDHLD GDVFHITAPN KFTFEEAGEE CKNQDARLAT VGELQAAWRN GFDRCDYGWL
LDASVRHPVT VARAQCGGGL LGVRTLYRFE NQTGFPPPDS RFDAYCFKPK QNISEATTIE
LNMLAETVSP TLLEELQMGP DRTTPIVPLI TELPVTTTKV PAIGNIVNIE QKSTVQPLTS
THRSAAESLP PDGSTKKPWD MDYYSPSASG PLGEPDISEI KEEMPQSTTV ISHHAPESWD
SVKEDLQRKE SVTQVEQIEV GPLVTSMEIS KHMPSKEFTV TATPFVSTTM TLESKTAKKA
ISTVSEPVTT SHYGFTLRED DGQDRTSTVR SGQRTSIFSQ IPEVITVSKT SEDTTRSQLE
DVESVPASTV VSPDSDGSSM DHRQEKQTHG RIIEDFLGQY VSTTPFPSQH HTEVELFPFS
GDKRLVEGTS TVISPSPRTG RESTETLRPA MRTVTYTNDE IQEKITKDSF IEKIEEEGFS
GMKFPTASPE QIHHTEYSVG MTKSFESPAL TTTTKLGVIP TEATGVEEDF TTPGGLETDG
YQDTTKYEEG IATVHLIQST LNVEVVTVSK WSLDEDNTTS KPLWSTEHVG SPKLPPALIT
TTGVSGKDKE MPSLTEDGRD EFTHIPGSIQ KPLEEFTEED TIDHEKFIVR FQPTTSIATT
EKSTLRDSIT EERVAPFTST EVRVTHATTE GSALDEGQDV DVSKPLSTVP QFAHPSDVEG
STFVNYSSTQ EPTTYVDTSH TIPLSVIPKT EWGVLVPSLP SEGEVLGEPS QDIRVINQTH
FEASIYPETV RTTTEIIQEA TGEDFPWKEQ TPEKPVSPPR STTDTAKETL PLDEQESDGS
AYTVFEDRSV TGSDRVSVLV TTPIGKFEQS TSFPPGAVTK AKTDDVVTLT PTTGSKVTFS
PWPEQKYETE GTSPRGFVSP FSIGVTQVIE ETTPEKREKT SLDYTDLGSG LFEKPKATEL
PEFSTVKATV PSDITAAFSS ADRFHTPSSS TEKPPLIDRE PDEETTSDMV IIGESTSRVP
PTTLEDIVAK ETETDIDREY FTTSSTSTTQ PTRPPTVEGK EAFGPQAFST PEPPAGTKFH
PDINVYIIEV RENKTGRMSD LSVIGHPIDS ESKEDEPCSE ETDPEHDLIA EILPELLGML
HSEEDEEDEE CANATDVTTT PSVQYINGKH VVTTVPKDPE AAEARRGQFE SVAPSQNFSD
SSESDSHQFI ITHAGLSTAM QPNESKETTE SLEITWRPEI YPETAEPFSS GEPDIFPTAS
IHEGEATEGP DSVTEKSPEL DHRAHEHTES VPLFPEESSG DAAIDQESQK VIFSGATEGT
FGEEAEESST THTPSIVASS VSAPVSEDAL FISTGTPQSD EPLSTVESWV EITPRHTVEF
SGSPSIPIPE GSGEAEEDKD KIFAMITDLS QRNTTDSLVT LDTSKIMITE SLLGVPATTV
YSTSERVSAA VPTKFVRETD TYEWVFSPPL EETRKDEKGT TGTASTAEVH SPTQRSDQFV
SPSELESSSD TPPDDSTAAP RKSFLSLMTA TQSERETTSS TVVFTETEVL DNLAAQTTDP
SLSSQPGVLE GSPTVPGSPT SLFMEQGSGE AALDPETTTV SSPALNLEPE ILAEEEAAGT
WSPHVETVFP FEPTEQVLST AVDREVAEII SQTSKENLVS ETSGEPTHRA EIKGFSTDFP
LEEDFSGDFR EYSTVSYPIT KEEIVMMEGS GDAAFKDTQM SPSVTPTSDL SNHSADSEEP
GSTLVSTSAF PWEEFTASAE GSGEQLLSVS SSVDQVFPSA VGKASGTDSP FIDQRLGEGA
INETDQRSTI LPTAEAESTK ASTKEEEVKE NHTVSMDFPP TAEPDELWPR QEVNPVRQGN
GSEIVSEEKT QEQESFEPLQ SSVAPEQTTF DSQTFPEPGL RTTGYFALTT KKTYSTDEKM
EEEVISLADV STPTLDSKGS ALYTTLPEVT EKSHFFLATA SVTESVPAES VIAGSTIKEE
ESIKSFPKVT SPIIKESDTD LLFSGLGSGE EVLPTIGSVN FTEIEQVLST LYPPTSQVQS
LEASILNDTS GDYEGMENVA NENEVRPLIS KTDSVFEDGE TASSTTSPEI LSDARTEGPF
TAPLTFSTGP GHPQNQTHGR AEEIQPSRPQ PLTDQVSSEN SLTAGTKETA TSSADFLART
YDLEMAKGFV TSTPKPSDLF YEHSGEGSGE LDAVDAEVHA SGMTQATRQG STTFVSDRPL
EKHPKVPSVE AVTVDGFPTV SMVLPLHPEQ NEGSPGAAST LASTASYERA TEGAADSFQD
HFGGFKDSTL KPDRRKATES IIIDLDKDDK DLILTMTEST ILEILPELTS DKNTIIDIDH
TKPIYEDILG MQTDLDPEAP SGPPESSEES TQVQEKYEAA VNLSSTEENF EASGDILLVN
YTQATPESKA PEDRNPLDHT GFIFTTGIPI LSSETELDVL LPTATSLPIP SKSATVNPES
KTEDKTLEDI FESSTLSDGQ AIADQSEVIS TLGYLERTQN EDEEKKYVSP SFQPEFSSGA
EEALIDPTPY VSIGTTYLTA QSLTEAPDVM EGARLPDSIG TSTISAFAEL LSQTPSSPPL
SVHLGSGDSE HSEDLQPSAL PSTDASTPPV PSGELANIEA TFKPSSEEGF HTTEPPSLSL
DTEPSEDENK PKPLEPTEAS ATELIAQEEI EIFQNSHSTT SVQVSGETVK VFPSIETVVT
AASERKLEGA TLRPHSTSAS VMHGAEAVVV PQPSPQTSEQ PTIPSPLEIN PETQAALIRG
EDSTVAAPKL QVPTRKLDSN KQATLSTTEL NTELATPSFP PLETSNETSF LIGINEESVE
GTAIYLPGPD RCKTNPCLNG GTCYATETSY VCTCVPGYSG DRCELDFDEC HSNPCRNGAT
CIDGFNTFRC LCLPSYVGAL CEQDTETCDY GWHKFQGQCY KYFAHRRTWD AAERECRLQG
AHLTSILSHE EQMFVNRVGH DYQWIGLNDK MFEHDFRWTD GSTLQYENWR PNQPDSFFST
GEDCVVIIWH ENGQWNDVPC NYHLTYTCKK GTVACGQPPV VENAKTFGKM KPRYEINSLI
RYHCKDGFIQ RHLPTIRCLG NGRWAMPKIT CLNPSAYQRT YSQKYFKNSS SAKDNSINTS
KHDHRWSRRW QESRR
//