ID H3AGX1_LATCH Unreviewed; 1646 AA.
AC H3AGX1;
DT 18-APR-2012, integrated into UniProtKB/TrEMBL.
DT 18-APR-2012, sequence version 1.
DT 27-MAR-2024, entry version 75.
DE SubName: Full=Collagen, type IV, alpha 5 (Alport syndrome) {ECO:0000313|Ensembl:ENSLACP00000008892.1};
GN Name=COL4A5 {ECO:0000313|Ensembl:ENSLACP00000008892.1};
OS Latimeria chalumnae (Coelacanth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Coelacanthiformes; Coelacanthidae; Latimeria.
OX NCBI_TaxID=7897 {ECO:0000313|Ensembl:ENSLACP00000008892.1, ECO:0000313|Proteomes:UP000008672};
RN [1] {ECO:0000313|Proteomes:UP000008672}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Wild caught {ECO:0000313|Proteomes:UP000008672};
RA Di Palma F., Alfoldi J., Johnson J., Berlin A., Gnerre S., Jaffe D.,
RA MacCallum I., Young S., Walker B.J., Lander E., Lindblad-Toh K.;
RT "The draft genome of Latimeria chalumnae.";
RL Submitted (AUG-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSLACP00000008892.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Type IV collagen is the major structural component of
CC glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork
CC together with laminins, proteoglycans and entactin/nidogen.
CC {ECO:0000256|ARBA:ARBA00003696}.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AFYH01091153; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091154; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091155; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091156; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091157; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091158; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091159; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091160; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091161; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01091162; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR STRING; 7897.ENSLACP00000008892; -.
DR Ensembl; ENSLACT00000008961.1; ENSLACP00000008892.1; ENSLACG00000007854.1.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000157678; -.
DR HOGENOM; CLU_002023_1_0_1; -.
DR InParanoid; H3AGX1; -.
DR OMA; SNNESCG; -.
DR TreeFam; TF316865; -.
DR Proteomes; UP000008672; Unassembled WGS sequence.
DR Bgee; ENSLACG00000007854; Expressed in pharyngeal gill and 5 other cell types or tissues.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1100; FIBRILLAR COLLAGEN NC1 DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 16.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 4: Predicted;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000008672};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}.
FT DOMAIN 1423..1646
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 1..445
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 464..1056
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1091..1421
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 37..51
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 61..90
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 112..126
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 166..183
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 220..240
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 259..279
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 388..425
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 493..507
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 632..661
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 726..742
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 790..808
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 843..857
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1197..1237
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1339..1365
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1396..1410
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1646 AA; 158420 MW; 383D66410E98112E CRC64;
ACHGCSSGSK CDCSGVKGEK GERGFPGLEG QMGLPGFPGP EGPPGPRGPK GSDGFPGPVG
PRGIRGPPGL PGFPGTPGLP GLPGQDGPPG PLGIPGCNGT KGEHGFPGSP GFPGLQGPPG
PPGIPGLKGE PGDFTTSMLP GQKGDPGIPG RPGVPGPQGS DGSPGPLGPR GPPGRPGSPG
SPGSPGPKGN MGLNFQGPKG EKGVPGLPGP PGPPGQIGEQ KRPDDIEYLK GDKGDKGDFG
KPGPRGSPGP PGLPDGGKGE KGEPGEAGKR GKPGKDGEPG RPGFDGLPGP PGNSGFPGQP
GLKGDKGDFG RPGPPGPVIP IPGVGDIVGP KGNIGFPGAQ GLKGERGPSG PSGFPGQPGA
PGQPSTGTPG APGFPGERGQ KGDAGPPGFS LPGPPGLDGQ PGLPGPPGPP GPSGPTEPED
ICQPGPPGLP GFQGERGFPG ERGIKGGKGE TCFNCIGDAI VGPLGPPGPP GSPGLAGSPG
FPGAKGQPGF PGGTGPMGPP GSPGIPGLPG PQGVKGDRGD AISLPGMKGD KGGPGFPGPP
GLPGLDGSPG RDGRPGIPGP KGDPGGFAFK GERGLPGDPG IPGAPGERGL VGPPGFGPQG
PPGEKGIQGV SGRSGAPGSP GAKGEPGVTE TQPGLPGPPG APGDPGSVGP PGDPGLPGQP
GLPGLPGAKG DPGLPGIGLP GQPGQKGFIG VPGPPGPPGT PGRSGFDGLP GTPGLPGVKG
DPGIGQPGPQ GPPGPPGPKG IFGPKGDPGF PGNPGQPGRS GFDGTPGAKG DPGPVGPSGP
QGTPGLPGIG GQGPPGPPGL PGPSGPPGFP GIAGEKGDPG SPGLDIPGPP GDRGNPGFPG
SSGPRGPSGP PGAPGRDGAP GVPGFKGEMG VMGTPGPPGT PGAPGSNGYP GPKGDDGLPG
QPGRPGSQGF KGDKGDRGVP GAPGSFSPPA GAKGQKGESG FPGLPGDPGP KGIQGVPGDL
GIPGKDGLPG LPGQTGLKGE PGFPGQPGLI GNPGLKGSIG EMGLPGPPGG KGSPGGPGRP
GQPGAPGGVG FKGVKGDPGL PGSGLPGFPG PKGERYLPKN KTTSFFNFSL GSNGTSVREI
LVFLHPVTGT PGIPGPKGID GPPGSSGLPG PSGTPGQPGQ PGGPGSSGEK GQPGRDGIPG
PAGIKGDAGQ PGIGRPGSPG LPGLPGPKGD AGFPGIPGAP GVPGLKGETG FPGMPGVQGP
LGPPGQPGRP LEGPKGNPGP SGPPGRSGLP GPEGPRGPPG FGGLKGEKGN PGLPGQSGFP
GLKGDPGTPG IPGIMSGPGM KGTIGPPGRA GTPGDIGNQG PPAISGCLSV ADNPRYTPGP
PGSPAPAAQP IIVKGDPGFP GPRGPPGQRG LSGPPGLPGP PGPLGSPGDD GTDGPPGFNG
PEGRKGDTGP SGQPGQRGFP GPPGPDGLQG PPGLPGSGSI AHGFLITRHS QTTEVPVCPF
GTSRIYDGFS LLYVQGNERA HGQDLGTAGS CLRRFSTMPF MFCNINNVCN FASRNDYSYW
LSTLQQMPMN MAPVNGESIK PFISRCTVCE APAMVIAVHS QTIQVPLCPL GWDSLWIGYS
FMMVRKLAEN YSEAIAALRS CVHKFKMAYF LFCQQKSNNI NEYTKFVDLF ISFVFLLLNR
KPQSETLKAG ELQMRVSRCQ VCMKRT
//