ID A0A1Y9IW51_9DIPT Unreviewed; 1772 AA.
AC A0A1Y9IW51;
DT 30-AUG-2017, integrated into UniProtKB/TrEMBL.
DT 30-AUG-2017, sequence version 1.
DT 27-MAR-2024, entry version 27.
DE RecName: Full=Collagen IV NC1 domain-containing protein {ECO:0000259|PROSITE:PS51403};
OS Anopheles minimus.
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae;
OC Anophelinae; Anopheles.
OX NCBI_TaxID=112268 {ECO:0000313|EnsemblMetazoa:AMIN016018-PA, ECO:0000313|Proteomes:UP000075920};
RN [1] {ECO:0000313|Proteomes:UP000075920}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=MINIMUS1 {ECO:0000313|Proteomes:UP000075920};
RG The Broad Institute Genomics Platform;
RA Neafsey D.E., Walton C., Walker B., Young S.K., Zeng Q., Gargeya S.,
RA Fitzgerald M., Haas B., Abouelleil A., Allen A.W., Alvarado L.,
RA Arachchi H.M., Berlin A.M., Chapman S.B., Gainer-Dewar J., Goldberg J.,
RA Griggs A., Gujja S., Hansen M., Howarth C., Imamovic A., Ireland A.,
RA Larimer J., McCowan C., Murphy C., Pearson M., Poon T.W., Priest M.,
RA Roberts A., Saif S., Shea T., Sisk P., Sykes S., Wortman J., Nusbaum C.,
RA Birren B.;
RT "The Genome Sequence of Anopheles minimus MINIMUS1.";
RL Submitted (MAR-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|EnsemblMetazoa:AMIN016018-PA}
RP IDENTIFICATION.
RC STRAIN=MINIMUS1 {ECO:0000313|EnsemblMetazoa:AMIN016018-PA};
RG EnsemblMetazoa;
RL Submitted (MAY-2020) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR STRING; 112268.A0A1Y9IW51; -.
DR EnsemblMetazoa; AMIN016018-RA; AMIN016018-PA; AMIN016018.
DR EnsemblMetazoa; AMIN016018-RB; AMIN016018-PB; AMIN016018.
DR VEuPathDB; VectorBase:AMIN016018; -.
DR OrthoDB; 2882192at2759; -.
DR Proteomes; UP000075920; Unassembled WGS sequence.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1104; COLLAGEN ALPHA-1(IV) CHAIN; 1.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 19.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 4: Predicted;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Secreted {ECO:0000256|ARBA:ARBA00022530};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..23
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 24..1772
FT /note="Collagen IV NC1 domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5011908702"
FT DOMAIN 1550..1772
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 87..1117
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1177..1210
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1283..1498
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1515..1540
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 253..269
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 635..657
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1772 AA; 175560 MW; 514670DD14D8678A CRC64;
MGTRIKWLIT TSLLIWYGQH AYAQVWANGG GFMGKRQEQE HPRFIQPQID PSYTLMDTAS
GPQGPPSKNC TASGCCLPKC FSEKGNRGFP GPAGLKGDKG NRGYPGSEGL PGDKGSKGEP
GQFGLQGPKG DRGRDGLPGY PGIPGTNGVP GMPGAPGIPG RDGCNGTDGL PGFEGLPGSP
GPRGYPGSPG SKGEKGEPAR HPENYNKGQK GEPGNDGPEG MPGPAGKDGP RGHPGMPGEK
GVPGIPGLRG ERGDKGVCIK GEKGQKGAKG EEVYGATGTT TTTGPKGEKG DRGDLGEPGL
PGEKGQAGDR GQLGERGHKG EKGLPGQPGP RGRDGNFGPV GLPGQKGDRG SEGLHGLKGQ
SGPKGDSGRD GIPGQPGISG PPGAPGGGEG RPGAPGPKGP RGYEGPQGPK GMDGFDGEKG
ERGQMGPKGG QGVPGRPGPE GMPGDKGDKG ESGSVGMPGP QGPRGYPGQP GPEGLRGEPG
QPGYGMPGQK GNAGMAGFPG LKGQKGERGF KGVMGTPGDA KEGRPGAPGM PGRDGEKGEP
GRPGLSGAKG ERGMKGELGG RCTDCRPGMK GDKGERGYAG EPGRPGSSGI PGERGYPGMP
GEDGTPGLRG EPGPKGDPGL VGPPGPSGEP GRDAEIPMDQ LKPIKGDKGE VGERGLMGIK
GEKGFPGPVG PEGKMGLRGM KGDKGRSGES GMDGVPGQPG ADGQPGRHGQ TLKGEPGLKG
NVGYSGDKGD KGYSGLKGEP GKCAEVPPNI MEAIRGPPGL QGEKGARGIQ GVPGEKGDMG
EQGRTGAQGS AGPPGAPGPV GPRGLTGHRG EKGNTGPLGP PGAPGRDGLA GAPGLTGPKG
AKGDPGLSMV GPPGPKGNPG LRGPKGDRGG TGDRGDSGPP GAPGYPGEKG DSGLSGAPGY
PGEVGPKGEP GPKGPAGHPG APGRPGVDGV KGLPGLKGDI GAPGVIGLPG QKGDMGQAGN
DGLKGFQGRK GMMGAPGIQG ARGPQGPKGD QGEKGDRGEI GMKGLTGQTG QPGIAGPKGD
KGLSGLPGPA CLPGLSGEKG DKGYTGPEGP PGEAGAASEK GQKGEPGVPG LRGNDGLPGL
EGPAGPKGDA GVPGYGRPGP QGEKGNDGTT GLNGLPGLNG VKGDMGVPGF PGVKGDKGTT
GLPGVPGAPC MDGLPGAEGP IGPRGYDGEK GFKGEPGRIG ERGDQGEKGD TGLVGPTGLM
GRKGDRGIPG SPGLPASVAA IKGDKGEPGF PGAIGRPGKV GAPGLPGDMG AKGEMGIQGL
PGLPGPAGLN GLAGMKGDMG PMGEKGDTCP VVKGEKGLPG RPGKTGRDGP PGLTGEKGDK
GLAGLPGPIG PPGPPGPLGR QGEKGDRGDS GLMGRPGKDG LPGPQGQRGL SGLQGEKGDQ
GPPGFIGPKG EKGERGRDGL NGMNGPQGLK GDRGTPGLEG VAGLPGMVGE KGDRGQPGMA
GLNGAPGEKG QKGETPQLPP QRKGPPGPPG FNGPKGDKGL PGLAGPAGIP GAPGAPGEMG
LRGFEGARGL QGLRGDVGPE GRHGRDGAPG LPGPKGEPGR DCESAPYYTG ILLVRHSQSD
EVPVCEPGHL KLWDGYSLLY VDGNDYPHNQ DLGSAGSCVR KFSTLPILAC GQNNVCNYAS
RNDRTFWLST SAPIPMMAVK ENEMRPYISR CTVCEAPTNV IAVHSQTLHI PECPNGWDGL
WIGYSFLMHT AVGHGGGGQS LSGPGSCLED FRATPFIECN GGKGHCHYYE TQTSFWLVSL
EDHQQFQQPE QQTLKAGNLL SRVSRCQVCI RR
//