ID A0A094ZV34_SCHHA Unreviewed; 3313 AA.
AC A0A094ZV34;
DT 26-NOV-2014, integrated into UniProtKB/TrEMBL.
DT 26-NOV-2014, sequence version 1.
DT 27-MAR-2024, entry version 44.
DE SubName: Full=Collagen alpha-1(IV) chain {ECO:0000313|EMBL:KGB36999.1};
GN ORFNames=MS3_05320 {ECO:0000313|EMBL:KGB36999.1};
OS Schistosoma haematobium (Blood fluke).
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Trematoda;
OC Digenea; Strigeidida; Schistosomatoidea; Schistosomatidae; Schistosoma.
OX NCBI_TaxID=6185 {ECO:0000313|EMBL:KGB36999.1};
RN [1] {ECO:0000313|EMBL:KGB36999.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=22246508; DOI=10.1038/ng.1065;
RA Young N.D., Jex A.R., Li B., Liu S., Yang L., Xiong Z., Li Y.,
RA Cantacessi C., Hall R.S., Xu X., Chen F., Wu X., Zerlotini A., Oliveira G.,
RA Hofmann A., Zhang G., Fang X., Kang Y., Campbell B.E., Loukas A.,
RA Ranganathan S., Rollinson D., Rinaldi G., Brindley P.J., Yang H., Wang J.,
RA Wang J., Gasser R.B.;
RT "Whole-genome sequence of Schistosoma haematobium.";
RL Nat. Genet. 44:221-225(2012).
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; KL250834; KGB36999.1; -; Genomic_DNA.
DR RefSeq; XP_012796761.1; XM_012941307.1.
DR STRING; 6185.A0A094ZV34; -.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1082; COLLAGEN ALPHA-1(X) CHAIN; 1.
DR Pfam; PF01413; C4; 4.
DR Pfam; PF01391; Collagen; 30.
DR SMART; SM00111; C4; 4.
DR SUPFAM; SSF56436; C-type lectin-like; 4.
DR PROSITE; PS51403; NC1_IV; 2.
PE 4: Predicted;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:KGB36999.1};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}.
FT DOMAIN 1368..1599
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT DOMAIN 3077..3307
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 1..36
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 52..435
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 448..533
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 558..1181
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1213..1364
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1607..3068
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 374..403
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 448..476
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 507..531
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 558..581
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 644..675
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1305..1334
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1611..1628
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1758..1772
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1975..1989
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2241..2256
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2286..2306
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2540..2554
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2717..2739
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 3313 AA; 327829 MW; 30DC8B9C7FD064DE CRC64;
MDSKGLAGLD GEDGLPGPRG PQGESHINVV PNSHRIPPRL KRVGEKQGAV GLMGDIGPKG
ESGVYNYTNI IPGDRGEPGP RGQDGQPCED MDREYTEEEL ISFTRGQAGD IGSPGPKGDS
GESGDKGDMG LQGEEGPIGE KGEPGEPGLD GNLGSDGPPG PEGTIGDKGD RGPHGRDGLQ
GDPGQAGIPG KATCPIEFFP KGDKGLPGHP GDKGTQGEPG LPGDQGEKGV PGDEGPKGVR
GSQGKICEPG PQGPVGVKGW YGKNGAQGQP GIDGDPGQPG QRGLPGDDGF GHLGEIGEVG
VEGLRGIKGE SGEEGDPGEK GFGGDSNIGP VGPRGKPGDD GSDGEPGKHG EPGPPGPKGE
PLTACPLCKD GDAGARGEIG DPGENGEKGS DGMPGDRGDM GRLGEDGLIG LPGTKGEKGR
IGEDGFNGDP GEPGESAIVS RVVEDVIPGD EGDEGHEGSI GDIGDRGDEG EPGHVGDEGL
KGQQGTDGEQ GPIGEVGDDG DDGETGEDGI DVKGMVGELG ERGDEGSVGE KGEKGIAGVM
TNCTIDWRQL VAKNESELIG DRGEKGDDGE VGEKGSRGID GRKGEIGQTA APSERGETGF
PGPDGEKGDP GPPGIEGPQG LQGDPGQVID GIEGDIGEQG PEGPKGFKGE EGDVGDNGDS
PERIKGERGP DGEPGTEGDE GEPGIPGQKG EPGAEAGQQG ETGSPGEKGD SGPKGLPGPK
GEPSSDMESG DQGPKGEPGL MGSEGEKGGA GDVAECNEAG LIGDPGPDGA MGTIGIPGDD
GEQGEDGDVG PEGELGPDGE HGEIGDIGIE GIRGITGEKG EPGSMGDIGE KGSPGPEGET
GDIGPQGPQG ADGSQGPTGD PGPSGEFVEG ERGDTGPDGT QGEQGEKGEP GPPGEVGETG
PTAETLPGIK GEPGQPGEDG QLGEPGLDGN DGEPGLKGDE GLVGEPGSKG SRGPRGKVGN
VGAEGSRGAK GNVGSEGELG PDGEKGDRGE AGDIGPPGNT TAGLKGREGL EGPQGVEGQA
GDLGEQGESG IPGEEGDQGD EGPTGETGEI GTPGLEGDIG PEGNEGPIGE QGSEGDVGMI
GVYGPEGEKG GPGLRGDSGE AAKNCSIGPK GQQGRLGPLG LPGPQGPPGD QGLPGLRGKD
GPPGVGVIGA PGELGDTGHI GLPGEAGEPG DQGPEGPSGL AALTGEPGMK GERGIPGDIG
DPGVGGVQGL PGYDGEKGIQ GSQGSSGIDG IDGIMGPKGF SGDQGPSGNR GQTGDQGEMG
EAGIQGPKGA QGVATRGELG EIGDPGENGA LGEQGDPGPD GEVGDDGDIG EKGEKGQPGL
VEGERGEVGD VGERGDIGEV GPPGPLGDVG PKGKNGEKGQ PGQTNYSSIL FARHYQTPFV
DDLTCPTGTN KLFTGYSYVM GGGVDDLVSM DLGTPSSCLS KFSSLPMTQC ERDTTCQSSM
RHERSYWLAT LVPRSEQPIP VDQTADQIAR CVVCEAPAHV FAFHSQGETL QPCPNTWTEL
WTGVSLILHT SGAHGGGQQL SSPGSCMEHF RYSPVIECNN NVGMCHYWSD AKVYYLRALN
PNITQFEKPV GFVMKAAEGP VLNNQFRRDI DCSVRCDYNF CRCFGPQGPQ GDPGPRGPPG
PIGPKGYPGE QGPPGLRGMK GDPGQDGVDG PVGERGPSGD FGSPGMHGEH GESGETGEKG
AKGIAGCPGG KGDPGEKGEP GRQGMDGDRG LPGIDGERGD PGESGFGRRG LPGPKGEPGN
VGPKGDDGIQ GERGDPGLPG LEGDRGEDGD FGVPGEAGEP GESIIDTYTP VKGAPGAPGD
PGDPGRNCSV TTLEQGRKAL IGPAGDRGDT GPKGYPGSKG AKGIRGIDGT IGLPGVKGLP
GPDGPLGPKG IKGSAGQPGV KGGQGRDGLI GDRGPPGPKG LPGDPGADGL RGFVGDPGEP
GGTTECTGIL PGRPGQRGIP GEPGEKGFPG IPGSIGDRGL KGLVGPQGPV GDPGFDCPIG
PPGPPGPKGS PGDDGRTGLP GLPGSPGDDG LPGDKGEPGI GHRGASGDPG DRGYDGPQGP
KGPKGLKGER GLPGTKGSGR PGPPGEKGER GLDGLAGKHG KQGPMGPQGE PGERCSACIP
GIPGAKGEKG DIGPSGLPGP RGRDGAKGER GSRGAPGNQG RTGSQGYEGD QGEHGLPGPK
GFKGEPGETR TEYTGILKGP KGYPGVDGVK GDKGIKGIIG SPGEKGERGL EGPQGDPGLR
GLPGQKGPKG ILGEKGYKGA TADVRFGDKG EKGQPGRPGL KGELGDSGEP GNCTVNVIQS
RNLTVEKGDR GPKGEPGPRG DKGKPGDEGP PGQPGDSGIR GIKGLPGIPG PAGIKGIVGE
AGDPGEQGSP GEPGFAGVTG VKGDVGDIGP VGDRGYPGPK GEKGLPGQPG TGIPGERGEP
GRPGIPGPRG SQGDPGVIGD SGDPGLDGKP GLAGERGDFG LPGLQGDPGP SGGAGIKGIM
GDPGDSGEPG LPGRPGIPGD KGRCLGEADK GEPGLPGPEG PEGQKGEPGL RGAKGERGPS
GPIGRTGEKG LRGEPGLPGP DGPSGDPGPP GRPGTTGPKG FEGTPGPKGF EGPPGSPGDE
GQPGPKGESG LPGAESFGEK GDTGERGPQG DVGEKGSTGK PGLRGPPGDA GTNGQAVVGD
VGDVGDIGEK GLPGMPGDPG LPGPKGSPGV IGLIGPKGIV GDPGVPGLQG RPGKPGQEGQ
PGLPGPKGFP GGKGQKGIKG ERGEPGETIV GRDGEPGDMG ERGIPGLPGP KGDRGFPGDA
GPKGLRGLPG RVGDKGQMGV FGDKGEPGIT GLDGYPGRKG DKGESGDIGP MGDPGPRGQP
GPKGFPGAPG QCLPAEESPI GDPGPQGPRG PPGEKGLTGL PGKVGKTGDP GPPGRGPKGE
QGEAGDPGDM ALPGAKGQRG YPGIKGEKGE VGMPGDIGPR GPIGDPGPPG DVGLPGDDGR
PGLLGPRGDK GDTGWPGTVG MKGEVGDVGD PGPIGAPGMP GEDAIGLPGV KGDKGEPGAP
GQKGMIGEPA LRGLKGAKGE MGLPGPTGRE PGEPGPKGDP GLKGTPGDIG PRGVPGDVGP
PGPEGFSGPP GQALQSTYLF TRHFQNPDTE PVCPPGSQKI SDGYSFVMAN GNGELVTMDL
GSHSSCLSMF SIMPFFQCLR DGSCQLGVRA DRSYWLATTE PLPMMPLDVS EVTRRVARCV
VCEAPTQPYA FHAQANQLQE VNCPEGWARL WDGYSFVMHS VGSTGGGQQL SSPGSCLEYF
SYSPLLECNN GMSLCNYWSD AKAYYLRHVS NNTEFQKPVG QYITDHLPDE TKVLREISRC
RVCTKSRFQS YFV
//