Database: UniProt
Entry: F6SSG3_HORSE
ID   F6SSG3_HORSE            Unreviewed;      1454 AA.
AC   F6SSG3;
DT   27-JUL-2011, integrated into UniProtKB/TrEMBL.
DT   13-SEP-2023, sequence version 3.
DT   27-MAR-2024, entry version 59.
DE   SubName: Full=Collagen type I alpha 1 chain {ECO:0000313|Ensembl:ENSECAP00000015810.3};
GN   Name=COL1A1 {ECO:0000313|VGNC:VGNC:16730};
OS   Equus caballus (Horse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Laurasiatheria; Perissodactyla; Equidae; Equus.
OX   NCBI_TaxID=9796 {ECO:0000313|Ensembl:ENSECAP00000015810.3, ECO:0000313|Proteomes:UP000002281};
RN   [1] {ECO:0000313|Ensembl:ENSECAP00000015810.3, ECO:0000313|Proteomes:UP000002281}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC   STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000015810.3,
RC   ECO:0000313|Proteomes:UP000002281};
RX   PubMed=19892987; DOI=10.1126/science.1178158;
RG   Broad Institute Genome Sequencing Platform;
RG   Broad Institute Whole Genome Assembly Team;
RA   Wade C.M., Giulotto E., Sigurdsson S., Zoli M., Gnerre S., Imsland F.,
RA   Lear T.L., Adelson D.L., Bailey E., Bellone R.R., Bloecker H., Distl O.,
RA   Edgar R.C., Garber M., Leeb T., Mauceli E., MacLeod J.N., Penedo M.C.T.,
RA   Raison J.M., Sharpe T., Vogel J., Andersson L., Antczak D.F., Biagi T.,
RA   Binns M.M., Chowdhary B.P., Coleman S.J., Della Valle G., Fryc S.,
RA   Guerin G., Hasegawa T., Hill E.W., Jurka J., Kiialainen A., Lindgren G.,
RA   Liu J., Magnani E., Mickelson J.R., Murray J., Nergadze S.G., Onofrio R.,
RA   Pedroni S., Piras M.F., Raudsepp T., Rocchi M., Roeed K.H., Ryder O.A.,
RA   Searle S., Skow L., Swinburne J.E., Syvaenen A.C., Tozaki T., Valberg S.J.,
RA   Vaudin M., White J.R., Zody M.C., Lander E.S., Lindblad-Toh K.;
RT   "Genome sequence, comparative analysis, and population genetics of the
RT   domestic horse.";
RL   Science 326:865-867(2009).
RN   [2] {ECO:0000313|Ensembl:ENSECAP00000015810.3}
RP   IDENTIFICATION.
RC   STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000015810.3};
RG   Ensembl;
RL   Submitted (NOV-2023) to UniProtKB.
CC   -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC       forming collagen). {ECO:0000256|ARBA:ARBA00003647}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   Ensembl; ENSECAT00000019323.4; ENSECAP00000015810.3; ENSECAG00000013693.4.
DR   VGNC; VGNC:16730; COL1A1.
DR   GeneTree; ENSGT00940000156584; -.
DR   OrthoDB; 2970887at2759; -.
DR   Proteomes; UP000002281; Chromosome 11.
DR   Bgee; ENSECAG00000013693; Expressed in synovial membrane of synovial joint and 21 other cell types or tissues.
DR   ExpressionAtlas; F6SSG3; baseline.
DR   GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR   GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR   Gene3D; 2.60.120.1000; -; 1.
DR   Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR   InterPro; IPR008160; Collagen.
DR   InterPro; IPR000885; Fib_collagen_C.
DR   InterPro; IPR001007; VWF_dom.
DR   PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR   PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR   Pfam; PF01410; COLFI; 1.
DR   Pfam; PF01391; Collagen; 9.
DR   Pfam; PF00093; VWC; 1.
DR   SMART; SM00038; COLFI; 1.
DR   SMART; SM00214; VWC; 1.
DR   SUPFAM; SSF57603; FnI-like domain; 1.
DR   PROSITE; PS51461; NC1_FIB; 1.
DR   PROSITE; PS01208; VWFC_1; 1.
DR   PROSITE; PS50184; VWFC_2; 1.
PE   1: Evidence at protein level;
KW   Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW   Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW   Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW   Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW   Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW   Proteomics identification {ECO:0007829|PeptideAtlas:F6SSG3};
KW   Reference proteome {ECO:0000313|Proteomes:UP000002281};
KW   Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT   SIGNAL          1..22
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT   CHAIN           23..1454
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT                   /id="PRO_5040425153"
FT   DOMAIN          38..96
FT                   /note="VWFC"
FT                   /evidence="ECO:0000259|PROSITE:PS50184"
FT   DOMAIN          1219..1454
FT                   /note="Fibrillar collagen NC1"
FT                   /evidence="ECO:0000259|PROSITE:PS51461"
FT   REGION          98..1204
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        122..156
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        182..225
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        418..432
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        548..571
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        832..846
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        878..892
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1166..1185
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   1454 AA;  138200 MW;  F3B56544F324DC3E CRC64;
     MFSFVDLRLL LLLAATALLT HGQEEGQEEG QEEDIPAVTC IQDGLRYHDR AVWKPEPCRV
     CICDNGNVLC DDVICEDTKN CPGASVPKDE CCPVCPEGQV SPTDDQTTGV EGPKGDTGPR
     GPRGPAGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA PQLSYGYDEK SAGISVPGPM
     GPSGPRGLPG PPGAPGPQGF QGPPGEPGEP GASGPMGPRG PPGPPGKNGD DGEAGKPGRP
     GERGPPGPQG ARGLPGTAGL PGMKGHRGFS GLDGAKGDAG PAGPKGEPGS PGENGAPGQM
     GPRGLPGERG RPGAPGPAGA RGNDGATGAA GPPGPTGPAG PPGFPGAVGA KGEAGPQGAR
     GSEGPQGVRG EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF PGARGPSGPQ
     GPSGPPGPKG NSGEPGAPGN KGDTGAKGEP GPTGIQGPPG PAGEEGKRGA RGEPGPTGLP
     GPPGERGGPG ARGFPGADGV AGPKGPAGER GAPGPAGPKG SPGEAGRPGE AGLPGAKGLT
     GSPGSPGPDG KTGPPGPAGQ DGRPGPPGPP GARGQAGVMG FPGPKGAAGE PGKAGERGVP
     GPPGAVGPAG KDGEAGAQGP PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGESGKPGEQ
     GVPGDLGAPG PSGARGERGF PGERGVQGPP GPAGPRGSNG APGNDGAKGD AGAPGAPGSQ
     GAPGLQGMPG ERGAAGLPGP KGDRGDAGPK GADGSPGKDG VRGLTGPIGP PGPAGAPGDK
     GETGPSGPAG PTGARGAPGD RGEPGPPGPA GFAGPPGADG QPGAKGDAGP PGPAGPAGPP
     GPIGSVGAPG PKGARGSAGP PGATGFPGAA GRVGPPGPSG NAGPPGPPGP VGKEGGKGPR
     GETGPAGRPG EAGPPGPPGP AGEKGSPGAD GPAGAPGTPG PQGIAGQRGV VGLPGQRGER
     GFPGLPGPSG EPGKQGPSGA SGERGPPGPV GPPGLAGPPG ESGREGSPGA EGSPGRDGSP
     GPKGDRGETG PAGPPGAPGA PGAPGPVGPA GKSGDRGEAG PAGPAGPIGP VGARGPAGPQ
     GPRGDKGETG EQGDRGIKGH RGFSGLQGPP GPPGSPGEQG PSGASGPAGP RGPPGSAGAP
     GKDGLNGLPG PIGPPGPRGR TGDAGPVGPP GPPGPPGPPG PPSGGFDFSF LPQPPQEKSH
     DGGRYYRADD ANVVRDRDLE VDTTLKSLSQ QIENIRSPEG SRKNPARTCR DLKMCHSDWK
     SGEYWIDPNQ GCNLDAIKVF CNMETGETCV YPTQPQVAQK NWYISKNPKD KRHVWYGESM
     TDGFQFEYGG QGSDPADVAI QLTFLRLMST EASQNITYHC KNSVAYMDQQ TGNLKKALLL
     QGSNEIEIRA EGNSRFTYSV TYDGCTSHTG AWGKTVIEYK TTKTSRLPII DVAPLDIGAP
     DQEFGIDIGP VCFL
//