ID W5P983_SHEEP Unreviewed; 1692 AA.
AC W5P983;
DT 16-APR-2014, integrated into UniProtKB/TrEMBL.
DT 16-APR-2014, sequence version 1.
DT 27-MAR-2024, entry version 61.
DE RecName: Full=Collagen IV NC1 domain-containing protein {ECO:0000259|PROSITE:PS51403};
GN Name=COL4A2 {ECO:0000313|Ensembl:ENSOARP00000006991.1};
OS Ovis aries (Sheep).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Ruminantia; Pecora; Bovidae;
OC Caprinae; Ovis.
OX NCBI_TaxID=9940 {ECO:0000313|Ensembl:ENSOARP00000006991.1, ECO:0000313|Proteomes:UP000002356};
RN [1] {ECO:0000313|Ensembl:ENSOARP00000006991.1, ECO:0000313|Proteomes:UP000002356}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Texel {ECO:0000313|Ensembl:ENSOARP00000006991.1,
RC ECO:0000313|Proteomes:UP000002356};
RX PubMed=20809919; DOI=10.1111/j.1365-2052.2010.02100.x;
RA Archibald A.L., Cockett N.E., Dalrymple B.P., Faraut T., Kijas J.W.,
RA Maddox J.F., McEwan J.C., Hutton Oddy V., Raadsma H.W., Wade C., Wang J.,
RA Wang W., Xun X.;
RT "The sheep genome reference sequence: a work in progress.";
RL Anim. Genet. 41:449-453(2010).
RN [2] {ECO:0000313|Ensembl:ENSOARP00000006991.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Type IV collagen is the major structural component of
CC glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork
CC together with laminins, proteoglycans and entactin/nidogen.
CC {ECO:0000256|ARBA:ARBA00003696}.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AMGL01016447; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016448; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016449; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016450; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016451; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016452; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016453; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016454; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016455; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AMGL01016456; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR STRING; 9940.ENSOARP00000006991; -.
DR PaxDb; 9940-ENSOARP00000006991; -.
DR Ensembl; ENSOART00000007098.1; ENSOARP00000006991.1; ENSOARG00000006515.1.
DR eggNOG; KOG3544; Eukaryota.
DR HOGENOM; CLU_002023_1_0_1; -.
DR OMA; ATEPIWS; -.
DR Proteomes; UP000002356; Chromosome 10.
DR Bgee; ENSOARG00000006515; Expressed in placentome of cotyledonary placenta and 52 other cell types or tissues.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1100; FIBRILLAR COLLAGEN NC1 DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 14.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR PROSITE; PS51403; NC1_IV; 2.
PE 4: Predicted;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000002356};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..33
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 34..1692
FT /note="Collagen IV NC1 domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5004870725"
FT DOMAIN 1489..1545
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT DOMAIN 1631..1692
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 58..935
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 962..997
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1018..1483
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 172..186
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 212..226
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 426..442
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 639..653
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1692 AA; 165432 MW; 8A07F44FFBD04BFF CRC64;
MDRELRAAAR PALRRWLLLG AVTVGLLAQS VLAGVKKLDV PCGGRDCSGG CQCYPEKGGR
GQPGPVGPQG YTGPPGLQGF PGLQGRKGDK GERGAPGITG PKGDVGPRGV SGFPGADGIP
GHPGQGGPRG PPGYDGCNGT MGDSGYAGPP GPGGFLGPRG PQGPKGQKGE PYALSSEDRD
KYRGEPGEPG LVGLQGPPGR PGPVGQMGPV GAPGRPGPPG PPGPKGQPGN RGLGFYGEKG
EKGDMGLQGP GGIPPDNGYV EKPTPGYELL PEQYKGEKGS QGEPGRIGVS LKGEEGVMGF
SGPRGAPGFD GEKGLPGQKX GGGVAGWGGP LPAPSSAGGE MGNPGPPGLP AYSPHPSVAK
GIRGEPAPGE PGARGEPGDP GLPGRPGTTI GDEDEKRGLP GEMGPKGFIG ERGSPALYPG
PPGAEWRVIL GPPCPPPALG GPPWGRLEAL PSGPPRAGGY FPGICPGSSG HKNQKGWKGD
AGDCKCADQF IRGPPGLPGP KGFAGANGQP GSKGSQGDPG PHGLPGFSGF KGAPGNVGPP
GPKGMKGDSR TITTKGERGQ PGVPGVPGLK GTDGIPGPPG LDGFHGLPGP PGDGIKGPRG
DAGQPGAPGT KGLPGERGPP GLGLPGLKGE RGFPGDAGLP GPPGFPGPPG LPGAPGQTDC
DSGVKRPIGA DGQETIQPGC VGGPKGSPGQ PGLPGPPGAK GLRGVPGFSG ADGVPGLKGL
PGDPGREGFP GPPGFMGPRG SKGAVGPPGL DGLPGASGLP GPVGPPGDRG LPGEVLGAQP
GPRGDSGLPG RPGLKGPPGE RGPPGFRGSQ GMPGMPGQKG QPGSPGFSGQ PGLPGAGGRA
WVCPREPALQ SRRGSESAAH GPGLPGDRGE PGDTGVPGPV GRKGGSGDRG DPGQQGERGH
PGLPGFKGVS GMPGTPGLSR ARGSPGMDGF EGMLGLKGRP GLPGIKGEAG FFGIPGLKGL
AGEPGVKGSR GDPGPPGPPP IILPGMKDIK GEKGDEGPMG LKGYLGLKGL PGMPGIPGLS
GIPGLPGRPG HIKGVKGDTG VPGVPGSPGF PGVPGSPGVM GFQGFTGSRG DKGAPGRAGL
FGEVGETGDF GDIGDTIDLP GSPGLKGERG TTGIPGQKGF LGERGTEGDV GFPGITGLAG
VQGPPGFQGQ KGFPGLTGLQ GPQGDPGRAG VPGIKGDSGW PGNPGLPGLP GLRGISGLHG
LPGNKGFPGS PGADVHGDPG FPGPAGDKGD PGEANTLPGP TGAPGQKGER GAPGERGPIG
SPGLQGFPGI TPPSNISGSP GDIGAPGIFG LEGYRGPPGP PGPAALPGSK GDEGSPGTPG
NPGIKGWVGD PGPQGRPGVF GLPGEKGPKG EPGFMGNIGP TGSPGDRGPR GPKGDRGLPG
APGAVGAPGI AGIPQRIAVE RGPVGPQGRR GPPGAQGEMG PQGPPGEPGF RGVQGKAGPQ
GRGGVSAVPG FRGDQGPVGL QGPVGFEGEP GRPGSPGLPG MPGRSISIGY LLVKHSQTEK
EPMCPVGMNK LWSGYSLLYF EGQEKAHNQD LGLAGSCPAR VSTTPLKLCQ SGQATFYLEG
LLLSGDLGTD GAAPVCPWGT EEEALDPLCF NSHHEMRKKQ ANVLASLPPA PPAQHTAAGD
EGGGQSLVSL RATPFNECNG ARGTCHYYAN KYSFWLTTIP EQNFQGTPSA DTLKAGLIRT
HISRCQVCMK NL
//