ID F6SSG3_HORSE Unreviewed; 1454 AA.
AC F6SSG3;
DT 27-JUL-2011, integrated into UniProtKB/TrEMBL.
DT 13-SEP-2023, sequence version 3.
DT 27-MAR-2024, entry version 59.
DE SubName: Full=Collagen type I alpha 1 chain {ECO:0000313|Ensembl:ENSECAP00000015810.3};
GN Name=COL1A1 {ECO:0000313|VGNC:VGNC:16730};
OS Equus caballus (Horse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Perissodactyla; Equidae; Equus.
OX NCBI_TaxID=9796 {ECO:0000313|Ensembl:ENSECAP00000015810.3, ECO:0000313|Proteomes:UP000002281};
RN [1] {ECO:0000313|Ensembl:ENSECAP00000015810.3, ECO:0000313|Proteomes:UP000002281}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000015810.3,
RC ECO:0000313|Proteomes:UP000002281};
RX PubMed=19892987; DOI=10.1126/science.1178158;
RG Broad Institute Genome Sequencing Platform;
RG Broad Institute Whole Genome Assembly Team;
RA Wade C.M., Giulotto E., Sigurdsson S., Zoli M., Gnerre S., Imsland F.,
RA Lear T.L., Adelson D.L., Bailey E., Bellone R.R., Bloecker H., Distl O.,
RA Edgar R.C., Garber M., Leeb T., Mauceli E., MacLeod J.N., Penedo M.C.T.,
RA Raison J.M., Sharpe T., Vogel J., Andersson L., Antczak D.F., Biagi T.,
RA Binns M.M., Chowdhary B.P., Coleman S.J., Della Valle G., Fryc S.,
RA Guerin G., Hasegawa T., Hill E.W., Jurka J., Kiialainen A., Lindgren G.,
RA Liu J., Magnani E., Mickelson J.R., Murray J., Nergadze S.G., Onofrio R.,
RA Pedroni S., Piras M.F., Raudsepp T., Rocchi M., Roeed K.H., Ryder O.A.,
RA Searle S., Skow L., Swinburne J.E., Syvaenen A.C., Tozaki T., Valberg S.J.,
RA Vaudin M., White J.R., Zody M.C., Lander E.S., Lindblad-Toh K.;
RT "Genome sequence, comparative analysis, and population genetics of the
RT domestic horse.";
RL Science 326:865-867(2009).
RN [2] {ECO:0000313|Ensembl:ENSECAP00000015810.3}
RP IDENTIFICATION.
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000015810.3};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000256|ARBA:ARBA00003647}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR Ensembl; ENSECAT00000019323.4; ENSECAP00000015810.3; ENSECAG00000013693.4.
DR VGNC; VGNC:16730; COL1A1.
DR GeneTree; ENSGT00940000156584; -.
DR OrthoDB; 2970887at2759; -.
DR Proteomes; UP000002281; Chromosome 11.
DR Bgee; ENSECAG00000013693; Expressed in synovial membrane of synovial joint and 21 other cell types or tissues.
DR ExpressionAtlas; F6SSG3; baseline.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 9.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 1: Evidence at protein level;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Proteomics identification {ECO:0007829|PeptideAtlas:F6SSG3};
KW Reference proteome {ECO:0000313|Proteomes:UP000002281};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..22
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 23..1454
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5040425153"
FT DOMAIN 38..96
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 1219..1454
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 98..1204
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 122..156
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 182..225
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 418..432
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 548..571
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 832..846
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 878..892
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1166..1185
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1454 AA; 138200 MW; F3B56544F324DC3E CRC64;
MFSFVDLRLL LLLAATALLT HGQEEGQEEG QEEDIPAVTC IQDGLRYHDR AVWKPEPCRV
CICDNGNVLC DDVICEDTKN CPGASVPKDE CCPVCPEGQV SPTDDQTTGV EGPKGDTGPR
GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK SAGISVPGPM
GPSGPRGLPG PPGAPGPQGF QGPPGEPGEP GASGPMGPRG PPGPPGKNGD DGEAGKPGRP
GERGPPGPQG ARGLPGTAGL PGMKGHRGFS GLDGAKGDAG PAGPKGEPGS PGENGAPGQM
GPRGLPGERG RPGAPGPAGA RGNDGATGAA GPPGPTGPAG PPGFPGAVGA KGEAGPQGAR
GSEGPQGVRG EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF PGARGPSGPQ
GPSGPPGPKG NSGEPGAPGN KGDTGAKGEP GPTGIQGPPG PAGEEGKRGA RGEPGPTGLP
GPPGERGGPG ARGFPGADGV AGPKGPAGER GAPGPAGPKG SPGEAGRPGE AGLPGAKGLT
GSPGSPGPDG KTGPPGPAGQ DGRPGPPGPP GARGQAGVMG FPGPKGAAGE PGKAGERGVP
GPPGAVGPAG KDGEAGAQGP PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGESGKPGEQ
GVPGDLGAPG PSGARGERGF PGERGVQGPP GPAGPRGSNG APGNDGAKGD AGAPGAPGSQ
GAPGLQGMPG ERGAAGLPGP KGDRGDAGPK GADGSPGKDG VRGLTGPIGP PGPAGAPGDK
GETGPSGPAG PTGARGAPGD RGEPGPPGPA GFAGPPGADG QPGAKGDAGP PGPAGPAGPP
GPIGSVGAPG PKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP VGKEGGKGPR
GETGPAGRPG EAGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV VGLPGQRGER
GFPGLPGPSG EPGKQGPSGA SGERGPPGPV GPPGLAGPPG ESGREGSPGA EGSPGRDGSP
GPKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGEAG PAGPAGPIGP VGARGPAGPQ
GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP RGPPGSAGAP
GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSGGFDFSF LPQPPQEKSH
DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR DLKMCHSDWK
SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPQVAQK NWYISKNPKD KRHVWYGESM
TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ TGNLKKALLL
QGSNEIEIRA EGNSRFTYSV TYDGCTSHTG AWGKTVIEYK TTKTSRLPII DVAPLDIGAP
DQEFGIDIGP VCFL
//