ID H3C397_TETNG Unreviewed; 1459 AA.
AC H3C397;
DT 18-APR-2012, integrated into UniProtKB/TrEMBL.
DT 18-APR-2012, sequence version 1.
DT 27-MAR-2024, entry version 57.
DE SubName: Full=Collagen, type I, alpha 1a {ECO:0000313|Ensembl:ENSTNIP00000002716.1};
OS Tetraodon nigroviridis (Spotted green pufferfish) (Chelonodon
OS nigroviridis).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Eupercaria; Tetraodontiformes; Tetradontoidea; Tetraodontidae; Tetraodon.
OX NCBI_TaxID=99883 {ECO:0000313|Ensembl:ENSTNIP00000002716.1, ECO:0000313|Proteomes:UP000007303};
RN [1] {ECO:0000313|Proteomes:UP000007303}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=15496914; DOI=10.1038/nature03025;
RA Jaillon O., Aury J.-M., Brunet F., Petit J.-L., Stange-Thomann N.,
RA Mauceli E., Bouneau L., Fischer C., Ozouf-Costaz C., Bernot A., Nicaud S.,
RA Jaffe D., Fisher S., Lutfalla G., Dossat C., Segurens B., Dasilva C.,
RA Salanoubat M., Levy M., Boudet N., Castellano S., Anthouard V., Jubin C.,
RA Castelli V., Katinka M., Vacherie B., Biemont C., Skalli Z., Cattolico L.,
RA Poulain J., De Berardinis V., Cruaud C., Duprat S., Brottier P.,
RA Coutanceau J.-P., Gouzy J., Parra G., Lardier G., Chapple C.,
RA McKernan K.J., McEwan P., Bosak S., Kellis M., Volff J.-N., Guigo R.,
RA Zody M.C., Mesirov J., Lindblad-Toh K., Birren B., Nusbaum C., Kahn D.,
RA Robinson-Rechavi M., Laudet V., Schachter V., Quetier F., Saurin W.,
RA Scarpelli C., Wincker P., Lander E.S., Weissenbach J., Roest Crollius H.;
RT "Genome duplication in the teleost fish Tetraodon nigroviridis reveals the
RT early vertebrate proto-karyotype.";
RL Nature 431:946-957(2004).
RN [2] {ECO:0000313|Ensembl:ENSTNIP00000002716.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR Ensembl; ENSTNIT00000000353.1; ENSTNIP00000002716.1; ENSTNIG00000011168.1.
DR GeneTree; ENSGT00940000156584; -.
DR Proteomes; UP000007303; Unassembled WGS sequence.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF58; COLLAGEN ALPHA-1(II) CHAIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 12.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 4: Predicted;
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Reference proteome {ECO:0000313|Proteomes:UP000007303};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..22
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 23..1459
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5003580952"
FT DOMAIN 31..89
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 1224..1459
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 95..1207
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 121..143
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 166..189
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1459 AA; 138455 MW; E91B37FD3ED8091A CRC64;
MFSFVDLRLA LLLSAQGLLV RAQGEDDRPS GSCTIGGQVF ADRDVWKPEP CQICVCDSGT
VMCDEVICED TSDCPNPVIP HDECCPICPD DGIRFQEPQV EGPTGTRGPK GDRGPAGPPG
RDGIPGQPGP AGPPGPPGPP GLGGNFSPQM SGGYDDKSSP AMAVPGPMGP MGPRGPPGPA
GPSGPQGFTG PPGEPGEAGA AGPMRPRGPP GPSGKTERIG RAGELDLLPT GRLSDTGREI
RNDQGGARGF PGTPGLPGIK GHRGFSGLDG AKGDTGPAGP KGEAGAPGEN GTPGAMGPRG
LPGERGRTGA SGPAGARGND GAAGAAGPPG PTGPAGPAGF PGGPGPKGDA GPQGARGGEG
PAGARGEPGN PGPAGPAGPS GAPGNDGAAG AKGSPGAAGV AGAPGFPGPR GPPGPQGAAG
APGPKGNTGD VGAPGAKGEP GVKGEAGAAG VQGPPGPSGE EGKRGARGEP GPAGARGAPG
ERGGPGGRGF PGADGPAGPK GATGERGAPG VAGPKGATGE TGRTGEPGLP GAKGMTGSPG
SPGPDGKMGP AGAPGQDGRP GPPGAVGARG QPGVMGFPGP KGAAGEPGKT GERGAMGPSG
AVGAPGKDGE VGAQGPAGPA GLQGERGEQG PAGATGFQGL PGPQGAVGET GKPGEQGVPG
EAGLPGPAGS RGDRGFPGER GAPGAAGPTG ARGSPGPAGN DGAKGDAGAP GNPGAQGPPG
LQGMPGERGA AGLPGLRGDR GDQGAKGGDG APGKDGPRGM TGAIGLPGPA GASGDKGEPG
PAGAVGPAGP RGAPGERGES GPPGPAGFAG PPGAEGQPGA KGDAGENGAK GDAGPAGPAG
PTGAPGPQGP VGSTGPKGSR GPSGPPGATG FPGAAGRVGP PGPSKKGNPG PAGPAGPAGK
EGAKGNRGDT GPAGRTGEMG PAGAPGVPGE KGSPGADGLA GSPGLPGPQG IAGGRGIVGL
PGQRGERGFP GPPGPSGETG KQGGAGPSGE RGPPGPMGPP GLAGPSGEPG REGAPGNEGA
AGRDGAAGPK GDRGESGPAG AAGAPGPPGA PGPVGPAGKN GDRGESGPAG PAGPAGPAGP
RGPAGVAGLR GDKGESGEAG ERGMKGHRGF TGPQGPPGPT GAVGEQGPAG SAGPAGPRGP
SGAAGSPGKD GMSGLPGPSG PPGPRGRSGE MGPAGPPGPA GPPAPGAPGG GFDLGFISQP
QEKAPDPYRM FRADDANVLR DRDLEVDSTL KSLSQQIEQI RSPDGTRKNP ARTCRDLKMC
HPDWKSGEYW IDPDQGCTQD AIKVFCNMET GETCVHPTQT EVAKKNWYLS KNIKEKKHVW
FGETMNDGFQ FEYGSEGSQP EDVNIQLTFL RLMSTEASQN ITYHCKNSVA YMDAAAGNLK
KALLLQGSNE IEIRAEGNSR FTYSVLEDGC TSHTGTWGKT VIDYKTSKTS RLPIIDIAPM
DVGAPDQEFG LEVGPVCFL
//