ID A0A3Q1ERK6_9TELE Unreviewed; 1335 AA.
AC A0A3Q1ERK6;
DT 10-APR-2019, integrated into UniProtKB/TrEMBL.
DT 10-APR-2019, sequence version 1.
DT 27-MAR-2024, entry version 25.
DE SubName: Full=Collagen alpha-1(I) chain-like {ECO:0000313|Ensembl:ENSAPOP00000006217.1};
OS Acanthochromis polyacanthus (spiny chromis).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Ovalentaria; Pomacentridae; Acanthochromis.
OX NCBI_TaxID=80966 {ECO:0000313|Ensembl:ENSAPOP00000006217.1, ECO:0000313|Proteomes:UP000257200};
RN [1] {ECO:0000313|Ensembl:ENSAPOP00000006217.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR STRING; 80966.ENSAPOP00000006217; -.
DR Ensembl; ENSAPOT00000007036.1; ENSAPOP00000006217.1; ENSAPOG00000008178.1.
DR GeneTree; ENSGT00940000156584; -.
DR InParanoid; A0A3Q1ERK6; -.
DR Proteomes; UP000257200; Unplaced.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 5.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 4: Predicted;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Reference proteome {ECO:0000313|Proteomes:UP000257200};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..22
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 23..1335
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5018690689"
FT DOMAIN 30..88
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 1105..1335
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 94..1105
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 99..135
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 775..795
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 974..988
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1053..1069
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1335 AA; 126092 MW; C91131CB6BB3184C CRC64;
MFSFVDIRLA LLLSATVLLA RAQGEDDTFG SCTLDGQLYN DKDVWKPEPC QICVCDSGSV
MCDEVICEDT TDCDDPIIPD GECCPICPDT DGPVGPPGND GIPGQPGLPG PPGPPGPPGL
GGVGPMGPRG PPGPSGSSGP QGFTGPAGEP GEPGASGPMG SRGPAGPPGK NGDDGEPGKP
GRPGERGAAG PQGARGFPGT PGLPGIKGHR GFSGLDGAKG DSGPSGPKGE AGAPGENGVP
GAMGARGLPG ERGRPGPPGP AGARGNDGNT GAAGPPGATG PAGAPGFPGG AGAKGETGPQ
GGRGNEGPQG ARGEPGNPGP SGPAGPAGAP GSDGSPGAKG SPGAAGIAGA PGFPGARGPA
GAQGAVGTPG PKGNTQGDHG PSGPKGEPGA KGEPGPAGVQ GLPGPSGEEG KRGPRGEPGG
AGPRGPPGER QGAPGERGAP GAMGAQGATG ESGSPGAPGA PGSKGVTGSP GSPGPDGKVG
PAGAPGQDGR TGPAGPAGSR GQPGVMGFPG PKGAAGAPGK DGDVGAPGPS GAAGPAGEKG
EQGPAGPPGF QQGLPGPQGA TGETGKPGEQ GERGFPGERG GPGPAGPAGA RGAPGPAGND
GAKGEPGVGG APGGVGSPGM QGMPGERGAS GLPGVKGERG DGGGKGADGA PGKDGVRGLT
GAIGVPGPPG AQGEKGEPGA MGVAGPSGPR GSPGERGETG PSGPAGFAGP PGADGQPGAK
GETGDTGPKG DAGAPGPGGP VGAAGPQGPA GPPGAKGARG GAGSPGATGF PGPAGRVGPP
GPAGAGGPPG PPGPVGKDGA RGARGETGAA GRPGEAGAAG VTGPGGEKGS PGADGAPGSP
GLPGPQGIAG QRGVVGLPGQ RGERGFTGLP GPGGEPGKQG PSGPVGERGA PGPAGPPGLS
GATGEAGREG SAGHDGAPGR DGAPGPKGDR GESGMPGPPG PPGTPGAPGT VGPSGKSGDR
GEGQGPAGAK GDRGEAGEAG DRGHKGHRGF SGMQGLPGPA GAPGERGPAG ASGPAGPRGP
SGSSGTVGKD GMNGMPGPIG PPGPRGRNGE MGPAGPPGPP GPPGPPGAPG GGFDFISQPL
QEKAPDPLRG GYRADDPNVQ RDRDNEVDTT LKTLTQKVEK IRSPDGTQKS PARMCRDLRM
CHPEWKSWVD PNQGSALDAI KVHCNMETGE TCVPPTRSNI PMKNWYQSKN SKKHVWFSES
MTGGFQFQYG TDGADSEDVN IQMTFMRLMS NQASQNVTYH CKNSIAYMDS TTGNLKKALL
LQGSNDVEIR AEGNSRFTYS VSEDGCTSHT GAWGKTVIDY KTTKTSRLPI IDIAPMDVGA
PDQEFGVEVG PVCFL
//