ID A0A8S1C2X8_9INSE Unreviewed; 804 AA.
AC A0A8S1C2X8;
DT 12-OCT-2022, integrated into UniProtKB/TrEMBL.
DT 12-OCT-2022, sequence version 1.
DT 28-JAN-2026, entry version 14.
DE RecName: Full=Collagenase NC10/endostatin domain-containing protein {ECO:0008006|Google:ProtNLM};
GN ORFNames=CLODIP_2_CD05047 {ECO:0000313|EMBL:CAB3362315.1};
OS Cloeon dipterum.
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Palaeoptera; Ephemeroptera; Pisciforma; Baetidae; Cloeon.
OX NCBI_TaxID=197152 {ECO:0000313|EMBL:CAB3362315.1, ECO:0000313|Proteomes:UP000494165};
RN [1] {ECO:0000313|EMBL:CAB3362315.1, ECO:0000313|Proteomes:UP000494165}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RA Alioto T., Gomez Garrido J.;
RL Submitted (APR-2020) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:CAB3362315.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CADEPI010000008; CAB3362315.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A8S1C2X8; -.
DR OrthoDB; 5983381at2759; -.
DR Proteomes; UP000494165; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0031012; C:extracellular matrix; IEA:TreeGrafter.
DR GO; GO:0005615; C:extracellular space; IEA:TreeGrafter.
DR GO; GO:0030020; F:extracellular matrix structural constituent conferring tensile strength; IEA:TreeGrafter.
DR GO; GO:0030198; P:extracellular matrix organization; IEA:TreeGrafter.
DR Gene3D; 3.40.1620.70; -; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR050149; Collagen_superfamily.
DR InterPro; IPR010515; Collagenase_NC10/endostatin.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR045463; XV/XVIII_trimerization_dom.
DR PANTHER; PTHR24023:SF1112; COL_CUTICLE_N DOMAIN-CONTAINING PROTEIN-RELATED; 1.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR Pfam; PF01391; Collagen; 3.
DR Pfam; PF20010; Collagen_trimer; 1.
DR Pfam; PF06482; Endostatin; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 1.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Reference proteome {ECO:0000313|Proteomes:UP000494165}.
FT DOMAIN 523..570
FT /note="Collagen type XV/XVIII trimerization"
FT /evidence="ECO:0000259|Pfam:PF20010"
FT DOMAIN 610..776
FT /note="Collagenase NC10/endostatin"
FT /evidence="ECO:0000259|Pfam:PF06482"
FT REGION 14..476
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 16..28
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 43..53
FT /note="Gly residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 54..63
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 191..203
FT /note="Gly residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 208..246
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 312..321
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 349..359
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 367..379
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 393..405
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 410..420
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 421..434
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 804 AA; 83291 MW; 44EE2E9A84A51122 CRC64;
MHLWCNNTFL NCSDGLKGEK GEKGERGPPG EVLRGPPGPP GEPGYGGNGG GDFLSGEGIS
RGPIGPPGPP GTCTCNTSMI FGPDGQIPDL IPGPPGMPGR DGKTGQPGLS GLPGPTGERG
APGQRGEKGD RGDGGPQGLE GQPGQKGEPG RDGTPGLQGL PGPPGSPGNS DYTNYDPGWR
PRGNLKDSLL GGSGFSQGMG RPGSPGPKGE KGDGGPHGPR GERGVPGNKG DRGEAGTRGQ
KGDKGHQGSI GYQGFKGERG EPGIDGMPGM PGENGLPGGR GDKGDMGPMG PPGPPAINAA
SLIAGDNLLS TEKGDKGDKG DTGYPGVDGK DGYPGTKGEP GELISTDGKS LKGDRGERGK
RGKRGEPGPP GPPGPPGPVG PRGDFGLPGW GVKGDKGDSG SKGEPGEGEG LLPKGLHYVP
VPGPPGPPGP PGPPGISIQG EKGDAGPPGE PGFHSYRPPS PVYGDPTLTG KHGSARGSLE
ELRFIKQMKD MKDGLAPMIH HSGHQHHHEA PPPPPQVKIV PGAVTFQNAE TMIKMSSASP
VGTIAFLLDE EALLVRVKSG WQYIALGNLV SVDSTVPPEV MTTTEPTIVP PFKPQLKVSN
QLKTVDGPHL RIAALNEPTS GNMHGVRGAD YNCYRQARRA NLRGTFRAFL ASRVQDLDSI
VRSSDRDLPV LNIKGDVLFN SWNELFNGDG ALFLGQSRIY SFSGRNILTD QTWPLKAIWH
GSQLLGERAV DTYCEAWHSE SMDKLGLGSS LLKGRLLGQE RFTCDQKLIV LCVEATSIAR
SRRSIPEDKI LSESDYNALI SGVN
//