ID F6VVM5_HORSE Unreviewed; 1649 AA.
AC F6VVM5;
DT 27-JUL-2011, integrated into UniProtKB/TrEMBL.
DT 13-SEP-2023, sequence version 3.
DT 27-MAR-2024, entry version 78.
DE SubName: Full=Collagen type IV alpha 1 chain {ECO:0000313|Ensembl:ENSECAP00000018554.3};
GN Name=COL4A1 {ECO:0000313|Ensembl:ENSECAP00000018554.3,
GN ECO:0000313|VGNC:VGNC:16738};
OS Equus caballus (Horse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Perissodactyla; Equidae; Equus.
OX NCBI_TaxID=9796 {ECO:0000313|Ensembl:ENSECAP00000018554.3, ECO:0000313|Proteomes:UP000002281};
RN [1] {ECO:0000313|Ensembl:ENSECAP00000018554.3, ECO:0000313|Proteomes:UP000002281}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000018554.3,
RC ECO:0000313|Proteomes:UP000002281};
RX PubMed=19892987; DOI=10.1126/science.1178158;
RG Broad Institute Genome Sequencing Platform;
RG Broad Institute Whole Genome Assembly Team;
RA Wade C.M., Giulotto E., Sigurdsson S., Zoli M., Gnerre S., Imsland F.,
RA Lear T.L., Adelson D.L., Bailey E., Bellone R.R., Bloecker H., Distl O.,
RA Edgar R.C., Garber M., Leeb T., Mauceli E., MacLeod J.N., Penedo M.C.T.,
RA Raison J.M., Sharpe T., Vogel J., Andersson L., Antczak D.F., Biagi T.,
RA Binns M.M., Chowdhary B.P., Coleman S.J., Della Valle G., Fryc S.,
RA Guerin G., Hasegawa T., Hill E.W., Jurka J., Kiialainen A., Lindgren G.,
RA Liu J., Magnani E., Mickelson J.R., Murray J., Nergadze S.G., Onofrio R.,
RA Pedroni S., Piras M.F., Raudsepp T., Rocchi M., Roeed K.H., Ryder O.A.,
RA Searle S., Skow L., Swinburne J.E., Syvaenen A.C., Tozaki T., Valberg S.J.,
RA Vaudin M., White J.R., Zody M.C., Lander E.S., Lindblad-Toh K.;
RT "Genome sequence, comparative analysis, and population genetics of the
RT domestic horse.";
RL Science 326:865-867(2009).
RN [2] {ECO:0000313|Ensembl:ENSECAP00000018554.3}
RP IDENTIFICATION.
RC STRAIN=Thoroughbred {ECO:0000313|Ensembl:ENSECAP00000018554.3};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Type IV collagen is the major structural component of
CC glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork
CC together with laminins, proteoglycans and entactin/nidogen.
CC {ECO:0000256|ARBA:ARBA00003696}.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR STRING; 9796.ENSECAP00000018554; -.
DR PaxDb; 9796-ENSECAP00000018554; -.
DR Ensembl; ENSECAT00000022446.3; ENSECAP00000018554.3; ENSECAG00000019838.4.
DR VGNC; VGNC:16738; COL4A1.
DR GeneTree; ENSGT00940000157678; -.
DR HOGENOM; CLU_002023_0_0_1; -.
DR InParanoid; F6VVM5; -.
DR OMA; MIPPCPQ; -.
DR OrthoDB; 2882192at2759; -.
DR TreeFam; TF316865; -.
DR Proteomes; UP000002281; Chromosome 17.
DR Bgee; ENSECAG00000019838; Expressed in chorionic villus and 21 other cell types or tissues.
DR GO; GO:0005604; C:basement membrane; IBA:GO_Central.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005615; C:extracellular space; IBA:GO_Central.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0030020; F:extracellular matrix structural constituent conferring tensile strength; IBA:GO_Central.
DR GO; GO:0030198; P:extracellular matrix organization; IBA:GO_Central.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023:SF1019; COLLAGEN; 1.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 16.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 4: Predicted;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000002281};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..25
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 26..1649
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5040117352"
FT DOMAIN 1425..1649
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 28..1420
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 89..103
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 120..134
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 160..195
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 261..279
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 345..363
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 391..405
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 769..797
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1321..1335
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1395..1414
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1649 AA; 158892 MW; FFCA907EE4AEEE6D CRC64;
MGPRLGVWLL LLPAALLLHE ERSRAAAKGE RGLPGLQGVI GFPGMQGPEG PQGPPGQKGD
TGEPGLPGTK GTRGPPGASG YPGNPGLPGI PGQDGPPGPP GIPGCNGTKG ERGPLGPPGL
PGFTGTPGPP GLPGMKGDPG EILGHVPGTL LKGERGFPGP PGTPGSPGLP GLQGPVGPPG
FTGPPGPPGP PGPPGEKGQM GLSFQGPKGE KGDQGVSGPP GVPGQAQVKE KGDFATKGEK
GQKGEPGFQG MPGVGEKGEP GKPGPRGKPG KDGEKGEKGS TGFPGDSGYP GLPGREGFKG
DKGEAGPPGP PGIVIGTGPL GEKGERGFPG TPGLRGEPGP KGFPGLQGQP GPPGFPVPGQ
PGAPGFPGER GEKGDQGFPG RSLPGPSGRD GLQGPPGPPG PPGRPGHTNG IVECQPGPPG
DQGPPGIPGQ PGLTGEVGEK GQKGESCLIC DSTGLRGPPG PQGPPGEIGF PGQPGAKGDR
GLPGRDGLEG LPGPQGAPGL MGQPGAKGEP GEIYFDIRLK GDKGDPGFPG QPGMPGRAGS
PGRDGHPGLP GPKGSPGSVG LKGERGPPGG VGFPGSRGDI GPPGPPGFGP IGPIGDKGQA
GFPGSPGSPG LPGPKGEAGK VVPLPGPPGA EGLPGSPGFQ GPQGDRGFPG TPGRPGLPGE
KGAVGQPGIG FPGPPGPKGV DGLPGDVGPP GNPGRQGFSG LPGSPGVPGQ KGEPGIGLPG
LKGSPGIPGI PGTPGEKGSI GGPGIPGEHG AIGPPGLQGI RGDPGPPGLQ GPKGAPGVPG
IGPPGVMGPP GGQGPPGSSG PPGVKGEKGF PGFPGLDMPG PKGDKGTQGL PGLTGQSGLP
GLPGQQGTPG LPGFPGPKGE MGVMGTPGQP GSPGPAGVPG LPGEKGDHGF PGSSGPRGDP
GFKGDKGDVG LPGKPGSMDK VDMGSMKGEK GDQGEKGQIG PTGDKGSRGD PGTPGVPGKD
GQAGHPGQPG PKGDPGVGGT PGAPGLPGPK GSVGGMGLPG TPGEKGVPGI PGPQGVPGLP
GEKGAKGEKG QAGLPGIGIP GRPGDKGDQG LAGFPGTPGE KGEKGSAGIP GMPGSPGPKG
LPGSVGYPGS PGLPGEKGDK GLPGLDGIPG IKGEAGLPGK PGPSGPAGQK GEPGSDGIPG
SAGEKGEPGL PGRGFPGFPG TKGEKGSKGD VGFPGLAGSP GIPGSKGEPG FLGPPGPQGQ
PGLPGAPGHA VEGPKGDRGP QGQPGLPGLP GPMGPPGLPG LDGLKGDKGN PGWPGAPGAP
GPKGDPGFQG MPGIGGSPGI TGSKGDMGPP GVPGFQGQKG LPGLQGVKGD QGDQGFPGTK
GLPGPPGPPG PYDIIKGEPG LPGPEGPAGL KGLQGPPGPK GQQGVTGLVG LPGPPGIPGF
DGAPGQKGET GPFGPPGPRG FPGPPGPDGL PGSMGPPGTP SVDHGFLVTR HSQTIDDPQC
PPGTKILYHG YSLLYVQGNE RAHGQDLGTA GSCLRKFSTM PFLFCNINNV CNFASRNDYS
YWLSTPEPMP MSMAPITGDN IRPFISRCAV CEAPAMVMAV HSQTIQIPQC PSGWSSLWIG
YSFVMHTSAG AEGSGQALAS PGSCLEEFRS APFIECHGRG TCNYYANAYS FWLATIERSE
MFKKPTPSTL KAGELRTHVS RCQVCMRRT
//