ID   I3JC78_ORENI            Unreviewed;      1232 AA.
AC   I3JC78;
DT   11-JUL-2012, integrated into UniProtKB/TrEMBL.
DT   17-JUN-2020, sequence version 2.
DT   27-MAR-2024, entry version 66.
DE   SubName: Full=Collagen type II alpha 1 chain {ECO:0000313|Ensembl:ENSONIP00000006468.2};
GN   Name=COL2A1 {ECO:0000313|Ensembl:ENSONIP00000006468.2};
OS   Oreochromis niloticus (Nile tilapia) (Tilapia nilotica).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC   Ovalentaria; Cichlomorphae; Cichliformes; Cichlidae; African cichlids;
OC   Pseudocrenilabrinae; Oreochromini; Oreochromis.
OX   NCBI_TaxID=8128 {ECO:0000313|Ensembl:ENSONIP00000006468.2, ECO:0000313|Proteomes:UP000005207};
RN   [1] {ECO:0000313|Proteomes:UP000005207}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RG   Broad Institute Genome Assembly Team;
RG   Broad Institute Sequencing Platform;
RA   Di Palma F., Johnson J., Lander E.S., Lindblad-Toh K.;
RT   "The Genome Sequence of Oreochromis niloticus (Nile Tilapia).";
RL   Submitted (JAN-2012) to the EMBL/GenBank/DDBJ databases.
RN   [2] {ECO:0000313|Ensembl:ENSONIP00000006468.2}
RP   IDENTIFICATION.
RG   Ensembl;
RL   Submitted (NOV-2023) to UniProtKB.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   AlphaFoldDB; I3JC78; -.
DR   STRING; 8128.ENSONIP00000070225; -.
DR   Ensembl; ENSONIT00000006473.2; ENSONIP00000006468.2; ENSONIG00000005146.2.
DR   eggNOG; KOG3544; Eukaryota.
DR   GeneTree; ENSGT00940000155224; -.
DR   HOGENOM; CLU_001074_2_3_1; -.
DR   TreeFam; TF344135; -.
DR   Proteomes; UP000005207; Linkage group LG5.
DR   GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR   GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR   Gene3D; 2.60.120.1000; -; 1.
DR   Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR   InterPro; IPR008160; Collagen.
DR   InterPro; IPR000885; Fib_collagen_C.
DR   InterPro; IPR001007; VWF_dom.
DR   PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR   PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR   Pfam; PF01410; COLFI; 1.
DR   Pfam; PF01391; Collagen; 9.
DR   Pfam; PF00093; VWC; 1.
DR   SMART; SM00038; COLFI; 1.
DR   SMART; SM00214; VWC; 1.
DR   SUPFAM; SSF57603; FnI-like domain; 1.
DR   PROSITE; PS51461; NC1_FIB; 1.
DR   PROSITE; PS01208; VWFC_1; 1.
DR   PROSITE; PS50184; VWFC_2; 1.
PE   4: Predicted;
KW   Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW   Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW   Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW   Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW   Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW   Reference proteome {ECO:0000313|Proteomes:UP000005207};
KW   Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT   SIGNAL          1..22
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT   CHAIN           23..1232
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT                   /id="PRO_5025578488"
FT   DOMAIN          32..90
FT                   /note="VWFC"
FT                   /evidence="ECO:0000259|PROSITE:PS50184"
FT   DOMAIN          998..1232
FT                   /note="Fibrillar collagen NC1"
FT                   /evidence="ECO:0000259|PROSITE:PS51461"
FT   REGION          97..221
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          302..515
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          553..852
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          867..983
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        135..149
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        157..173
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        360..374
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        727..742
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        945..962
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   1232 AA;  122097 MW;  5F7BA2FAD988C369 CRC64;
     MDSRTVLLLV ASQVCLLAVV RCQVEDDQED AFSCVQDGQR YNDKDVWKPE PCRICVCDTG
     TVLCDEIVCE ELKDCPKPEI PFGECCPICA ADQPSTSGIP GVKGQKGEPG DITDVIGPRG
     PPGPTGPPGE QGPRGSRGEK GEKGSPGPRG RDGEPGTPGN PGPPGPPGPN GPPGLGGNFA
     AQMAGGFDEK AGGAQMGVMQ GPMGPMGPRG PPGPPGKPGD DVSVTVATKM KCPFFGARGF
     PGTPGLPGIK GHRGYPGRDG AKGETGAVGA KVRWSDFSDS DPQLVLVCLP LLKMTLGPVG
     PSGAPGFPGS PGAKGEAGPT GARGPEGAQG PRGESGTPGS PGPSGASVSD GAPGIAGAPG
     FPGPRGPPGP QGATGPLGPK GTSVCTGPPG LQGPNGPQGE EGKRGPRGEP GSAGPRGPPG
     ERGAPGNRGF PGQDGLAGPK GAPGERGPSG ASGPKGASGD PGRPGEPGLP GARLFSVWLD
     FTQGAPGEDG RPGPPGPQGA RGQPGVMGFP GPKGATVSKK KKKYLLYGSN ILQCVAMSFS
     FSFFSPFKGV PGEAGPAGAT GPRGERGFPG ERGAAGSQGL QGPRGLPGTP GTDGPKGAIG
     PAGTAGAQGP PGLQGMPGER GGAGIPGPKG DRVSHDGKQG DRGETGPPGP AGFAGPPGAD
     GQPGTKGEQG EPGQKGDAGA PGPQGPSGAP GPAGPTGVSG PKGARGAQGP PGATGFPGAA
     GRVGPPGPNG NPGPPGPAGP PGKDGPKGVR GDGGPPGRQG DAGLRGPAGA SGEKGDAGED
     GPPGPPGPSG PQGLAGQRGI VGLPGQRGER GFPGLPGPSG EPGKQGASGG PGDRGPPGPV
     GPPGLTGPAG EPGREVREYL EIHTHITTLC SQGPQGPRGD KGEAGEAGER GQKGHRGFTG
     LQGLPGPPGP PGPVGPAGKD GTNGLPGPIG PPGPRGRSGE TGPAGPPGNP GPPGPPGPPG
     PGIDMSAFAG LGQTEKGPDP LRYMRADQAS GNLRQHDAEV DATLKSLNNQ IENIRSPEGS
     KKNPARTCRD LKLCHPDWKS GEYWIDPNQG CTVDAIKVFC NMETGESCVH PKPSSIPRKN
     WWTSKSTVPK HVWFGESMNG GFHFNYGDDS LAPNTAAIQM TFLRLLSTEA SQNITYHCKN
     SVAYMDGATG NLKKAMLLQG SNDVEIRAEG NSRFTYAVLE DGCTRHTGRW GKTVIEYRSQ
     KTSRLPILDI APMDIGGADQ EFGVDVGAVC FL
//