ID W8AM68_CERCA Unreviewed; 1780 AA.
AC W8AM68;
DT 16-APR-2014, integrated into UniProtKB/TrEMBL.
DT 16-APR-2014, sequence version 1.
DT 27-MAR-2024, entry version 38.
DE SubName: Full=(Mediterranean fruit fly) hypothetical protein {ECO:0000313|EMBL:CAD6999962.1};
DE SubName: Full=Collagen alpha-1(IV) chain {ECO:0000313|EMBL:JAB87027.1};
GN Name=CO4A1 {ECO:0000313|EMBL:JAB87027.1};
GN ORFNames=CCAP1982_LOCUS8469 {ECO:0000313|EMBL:CAD6999962.1};
OS Ceratitis capitata (Mediterranean fruit fly) (Tephritis capitata).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; Tephritoidea;
OC Tephritidae; Ceratitis; Ceratitis.
OX NCBI_TaxID=7213 {ECO:0000313|EMBL:JAB87027.1};
RN [1] {ECO:0000313|EMBL:JAB87027.1}
RP NUCLEOTIDE SEQUENCE.
RA Geib S.;
RL Submitted (JUL-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|EMBL:JAB87027.1}
RP NUCLEOTIDE SEQUENCE.
RX PubMed=24495485; DOI=10.1186/1471-2164-15-98;
RA Calla B., Hall B., Hou S., Geib S.M.;
RT "A genomic perspective to assessing quality of mass-reared SIT flies used
RT in Mediterranean fruit fly (Ceratitis capitata) eradication in
RT California.";
RL BMC Genomics 15:98-98(2014).
RN [3] {ECO:0000313|EMBL:CAD6999962.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=EGII {ECO:0000313|EMBL:CAD6999962.1};
RA Whitehead M.;
RL Submitted (NOV-2020) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000256|ARBA:ARBA00004370}.
CC Secreted, extracellular space, extracellular matrix, basement membrane
CC {ECO:0000256|ARBA:ARBA00004302}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CAJHJT010000012; CAD6999962.1; -; Genomic_DNA.
DR EMBL; GAMC01019528; JAB87027.1; -; mRNA.
DR Proteomes; UP000606786; Unassembled WGS sequence.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR Gene3D; 2.170.240.10; Collagen IV, non-collagenous; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF533; COLLAGEN ALPHA-6(IV) CHAIN; 1.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 14.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 2: Evidence at transcript level;
KW Basement membrane {ECO:0000256|ARBA:ARBA00022869};
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:JAB87027.1};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000606786};
KW Secreted {ECO:0000256|ARBA:ARBA00022530};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..24
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 25..1780
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5033979109"
FT DOMAIN 1556..1779
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000259|PROSITE:PS51403"
FT REGION 96..1219
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1249..1545
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 199..213
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 297..328
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 386..400
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 652..667
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1104..1124
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1457..1471
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1780 AA; 174792 MW; C09024A43490868D CRC64;
MMEPHWKRLI FAATIAAALL SANAQFWKVA DTGAIYNPPK HYRDEATPPR TPIDDSYAVL
DTVTTNRQGP PRNCTSGTVG CIPKCFAEKG NRGFPGEVGL PGPKGMPGYQ GPEGPPGDKG
QKGDPGPIGP RGYKGERGVA GIPGMPGVAG VQGISGNPGA TGAPGKDGCD GQDGLNGLQG
LSGMPGPRGY PGQPGVKGEK GEPAKENGDY AKGEKGEPGF AGRSGNPGPE GQLGSKGDRG
DTGPYGPPGP RGDRGIKGDK GAPCFAPPQP GKKGDKGEKG EPASVKPTIG IQSVMGEKGE
RGEKGETGPA GEKGERGYPG ERGSDGTKGE KGLPGGPGDR GRQGNFGPPG PTGQKGDRGE
TGLNGLPGKH GQKGEPGNPG RPGERGLLGP PGPPGGGRGS PGAPGPKGPR GYTGAPGPKG
LDGFDGPPGA QGFPGQKGGP GLPGRTGVEG PPGEKGEKGN AGRTGPAGPV GPMGYTGPPG
PEGEKGEPGF PGIGEMGPKG DDGVPGIPGL RGQKGERGFK GNAGAPGDSK YGLPGSPGRA
GLPGQKGDQG RSGNPGLKGD MGPKGDAGGK CSLCPPGFKG DKGDRGLNGI PGEPGVRGPP
GATGAPGERG LDGIAGMDGP PGAKGEDGRD GLPGDPGPPG RDAIVDLSQV KVEKGPKGER
GFTGPEGLKG ERGEPGQPGI NGAKGEIGAK GDKGYAGPPG SDGIPGNPGR GGRDGLPGLS
VKGEPGRPGR DGLKGDKGTE GNPGFKGDPG TCDAASLTVP AKGNKGDRGI PGMPGPMGPM
GDKGSQGLPG LKGEMGPQGP VGPVGPRGLT GPRGEKGNTG AMGAPGNPGK DGLRGPPGRN
GERGQKGEQG IAVAGPPGPQ GRSGFPGEKG DRGVPGPLGP QGQDGAVGYP GDKGDAGLPG
QSGQTGPVGP KGDTGPLGPT GPPGPPGKPG VDGVVGRDGA KGEPGNPGLV GMPGAKGERG
APGNDGAKGF TGAPGSPGRR GSPGPAGIPG MKGDKGEIGL TGNDGATGPR GPPGAPGLMG
AKGDIGPEGP PGTDGRPGLD GEKGSQGLPG FDGQQGLPGD AAEKGQKGEP GIPGLRGPEG
LTGAPGMTGE KGFPGTPIHG NPGAKGDKGD RGRDGIDGRD GIPGEKGDAG LPGRNGQKGD
KGDMGLPGAP GTPGLDGRPG EPGSPGPVGY TGAKGDKGDV GFQGVQGMKG DKGAIGFPGL
TGAPGLKGER GFPGINGRDA QPITIKGDKG EMGEFGVIGL PGLIGPKGNQ GIPGLSGQKG
ERGLPGPLGM PGLNGAPGLK GDQGLPGEPG APGPAIKGEK GLPGRSGRNG REGTPGLSGQ
KGEKGLPGLA GMPGLTGMPG PVGPAGPKGD RGPMGVPGRD GADGLPGQVG QKGDMGFPGI
KGERGLAGFE GQKGEKGEQG LPGPQGLAGL NGMKGDRGYP GLDGVPGPVG AIGEKGSIGP
KGRDGRDGIP GAPGQKGEPG LVPPPGPKGE PGHPGYDGQK GERGPPGPRG LNGLQGERGE
KGETGLIGLT GQPGRAGMKG DQGLPGLQGR DGAPGLPGPQ GETGAACSAA QDYLTGILLV
KHSQSEEIPR CLPGQIELWT GYSMLYVDGN DYAHNQDLGS AGSCVRRFST LPVMSCGQNN
VCNYASRNDK TFWLSTSAPI PMMPINNNEI SKYISRCVVC EAPANVIAVH SQSLTIPDCP
NGWESLWIGY SFAMHTAVGN GGGGQALASP GSCLEDFRAT PFIECNGAKG QCHFYETMTS
FWLVTVESHE QFQRPAMQTL KAGTLLQRVS RCTVCMKNST
//