ID A0A087X9Z6_POEFO Unreviewed; 1489 AA.
AC A0A087X9Z6;
DT 29-OCT-2014, integrated into UniProtKB/TrEMBL.
DT 26-NOV-2014, sequence version 2.
DT 27-MAR-2024, entry version 53.
DE SubName: Full=Collagen alpha-1(II) chain-like {ECO:0000313|Ensembl:ENSPFOP00000002599.2};
OS Poecilia formosa (Amazon molly) (Limia formosa).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Ovalentaria; Atherinomorphae; Cyprinodontiformes; Poeciliidae; Poeciliinae;
OC Poecilia.
OX NCBI_TaxID=48698 {ECO:0000313|Ensembl:ENSPFOP00000002599.2, ECO:0000313|Proteomes:UP000028760};
RN [1] {ECO:0000313|Proteomes:UP000028760}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=female {ECO:0000313|Proteomes:UP000028760};
RA Schartl M., Warren W.;
RL Submitted (OCT-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSPFOP00000002599.2}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (SEP-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AYCK01005973; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR STRING; 48698.ENSPFOP00000002599; -.
DR Ensembl; ENSPFOT00000002603.2; ENSPFOP00000002599.2; ENSPFOG00000000897.2.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000155224; -.
DR OMA; GDVNHGM; -.
DR Proteomes; UP000028760; Unassembled WGS sequence.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF58; COLLAGEN ALPHA-1(II) CHAIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 8.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 4: Predicted;
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Reference proteome {ECO:0000313|Proteomes:UP000028760};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..26
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 27..1489
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5001832592"
FT DOMAIN 36..94
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 1255..1489
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 104..183
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 196..1236
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 162..176
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 239..253
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 353..367
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 434..448
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1202..1219
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1489 AA; 141557 MW; 7DDA8A4E942B676C CRC64;
MFSFVDSRTV LLLVASQVVL LSVVRCQQED DRKEAGGCIQ DARLYNDKDV WKPEPCRICV
CDSGAVLCDE IICEEIKECA NPIIPSGECC PICPADATAP IGSMPNAQGQ KGEPGEIADV
VGPRGPPGPM GPSGEQGMRG EAGAKGDKGN PGPRGRDGEP GTPGNPGPPG PPGPPGLGGN
FAAQMSQGFD EKAGTASMGV MQGPMGPMGP RGPPGPSGAP GPQGFQGAPG ETGEPGPAGP
IGPRGPPGPS GKPGSDGDPG KPGKPGERGP AGPQGARGFP GTPGLPGIKG HRGHPGLDGA
KGETGAAGAK GETGAAGESG SNGPMGPRGL PGERGRPGAS GAAGARGNDG LPGPAGPPGP
VGPAGAPGFP GSPGSKGEAG PTGARGAEGA QGPRGEAGTP GSPGPAGAGG NPGTDGIPGA
KGSAGAPGIA GAPGFPGPRG PPGPQGATGP LGPKGQSGDP GLPGLKGESG PKGELGPAGP
QGAPGPAGEE GKRGARGEPG TAGPNGPPGE RGAPGNRGFP GQDGLAGAKG APGDRGVPGA
AGPKGATGDP GRTGEAGLPG ARGLTGRPGD TGPQGKVGAS GAPGEDGRPG PPGPLGARGQ
PGVMGFPGPK GANGEPGKPG EKGLVGRPGL RGLPGKDGET GPSGPPGPAG PAGERGEQGQ
PGPPGFQGLP GPTGSPGEGG KPGDQGVPGE GGAPGAVGPR GERGFPGERG SAGPQGLQGP
RGLPGTPGSD GPKGAIGPAG ALGAQGPPGL QGMPGERGAG GIPGAKGDRG DNGQKGPEGA
PGKDGGRGLT GPIGPPGPSG PNGAKGETGP TGPSGAPGVR GAPGDRGELG PPGPAGFAGP
PGADGQPGAK GELGEPGLKG EAGASGPQGP SGAPGPVGPT GVSGPKGARG AQGAPGATGF
PGAAGRVGPP GPNGNPGAAG PAGPAGKDGP KGVRGDAGPP GRPGDAGLRG PAGPPGEKGE
QGPTGEPGAD GPSGPQGLAG GRGIVGLPGQ RGERGFPGLP GPSGEPGKQG ASGGAGDRGP
PGPVGPPGLT GPAGEPGREG TPGSDGPPGR DGAAGVKGER GNTGPAGAPG APGAPGAPGP
VGPLGKQGDR GEAGAQGPAG PPGLAGARGI AGPQGPRGDK GEAGESGERG QKGHRGFTGL
QGLPGPPGPA GDAGPAGPAG SSGAKGPPGP LGPAGKDGSN GQPGPIGPPG PRGRSGETGP
AGPPGNPGPP GPPGPPGPGI DISAFAGLGQ TEKSPDPLRY MRADEASSSL RQHDIEVDST
IKSLNTQIEN LRSPDGTQKN PARSCSDLKL CHPEWKSGDY WVDPNLGSTA DAIKVFCNME
TGETCVYPSI AKVPKKNWWT SKSKARKHIW FGEAMNGGFH FSYAQEGPAA SAANVQLTFL
RLLSTEASQN FTYHCKNSIA YMDQASGNLK KALLLQGSND VEIRAEGNSR FTYSVLEDSC
KSHTGRWGKT VFEYKTQKTS RLPIVDIAPM DIGGADQEFG VDIGAVCFL
//