ID A0A340WUG7_LIPVE Unreviewed; 1366 AA.
AC A0A340WUG7;
DT 10-OCT-2018, integrated into UniProtKB/TrEMBL.
DT 10-OCT-2018, sequence version 1.
DT 27-MAR-2024, entry version 25.
DE SubName: Full=Collagen alpha-2(I) chain-like isoform X1 {ECO:0000313|RefSeq:XP_007450704.1};
GN Name=LOC103069249 {ECO:0000313|RefSeq:XP_007450704.1};
OS Lipotes vexillifer (Yangtze river dolphin).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Whippomorpha; Cetacea; Odontoceti;
OC Lipotidae; Lipotes.
OX NCBI_TaxID=118797 {ECO:0000313|Proteomes:UP000265300, ECO:0000313|RefSeq:XP_007450704.1};
RN [1] {ECO:0000313|RefSeq:XP_007450704.1}
RP IDENTIFICATION.
RG RefSeq;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000256|ARBA:ARBA00003647}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR RefSeq; XP_007450704.1; XM_007450642.1.
DR GeneID; 103069249; -.
DR KEGG; lve:103069249; -.
DR OrthoDB; 2970887at2759; -.
DR Proteomes; UP000265300; Unplaced.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.60.120.1000; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 10.
DR SMART; SM00038; COLFI; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|RefSeq:XP_007450704.1};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000265300};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..24
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 25..1366
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5016413560"
FT DOMAIN 1133..1366
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 27..1131
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 43..72
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 248..262
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 800..814
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1088..1104
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1366 AA; 129463 MW; 9DD6987E006DFC4A CRC64;
MLSFVDTRTL LLLAVTSCLT TCQSLQEATA RKGPTGDRGP RGERGPPGPP GRDGDDGIPG
PPGPPGPPGP PGLGGNFAAQ YDGKGVGMGP GPMGLMGPRG PPGASGAPGP QGFQGPPGEP
GEPGQTGPAG SRGPPGPPGK AGEDGHPGKP GRPGERGVVG PQGARGFPGT PGLPGFKGIR
GHNGLDGLKG QPGTPGVKGE PGAPGENGIP GQIGARGLPG ERGRVGAPGP AGARGSDGSV
GPVGPAGPLG SAGPPGFPGA PGPKGELGPV GNPGPAGPAG SRGEVGLPGV SGPVGPPGNP
GANGLHGAKG AAGLPGVAGA PGLPGPRGIP GPVGAAGATG ARGLVGEPGP AGSKGESGNK
GEPGAAGPTG PPGPSGEEGK RGTTGEIGSA GPPGPPGLRG NPGSRGLPGA DGRAGVMGPH
GSRGGTGPAG VRGPSGDSGR PGEPGLMGPR GFPGSPGNAG PAGKEGPMGL PGIDGRPGPI
GPAGTRGEPG NIGFPGPKGP TGDPGKNGEK GHAGLAGPRG APGPDGNNGA QGPPGPQGVS
GGKGEQGPAG PPGFQGLPGP AGTAGEAGKA GERGIPGEFG LPGPAGPRGE RGPPGESGAA
GPAGPIGSRG PSGPAGPDGN KGEPGVVGAP GTAGPSGPNG LPGERGAAGI PGGKGEKGET
GLRGDAGSHG RDGARGAPGA VGAPGPAGAN GDRGEAGPAG PAGPSGPRGS PGERGEVGPA
GPNGFAGPAG AAGQPGAKGE RGTKGPKGEN GPTGPTGPVG AAGPAGPNGP PGPAGSRGDG
GPPGATGFPG AAGRTGPPGP SGITGPPGPP GPAGKEGLRG PRGDQGPVGR TGETGASGPP
GFAGEKGPSG EPGTAGSPGT PGPQGLLGAP GFLGLPGSRG ERGLPGVAGS VGEPGPLGIA
GPTGARGPPG AVGNPGVNGA PGEAGRDGNP GSDGAPGRDG QAGHKGERGY PGNAGPTGTV
GAPGPQGPVG PTGKHGNRGE SGPSGPVGLA GAVGPRGPSG PQGIRGDKGE PGDKGPRGLP
GLKGHNGLQG LPGLAGHHGD QGAPGTVGPA GPRGPSGPSG PSGKDGRTGH PGAVGPAGIR
GSHGSQGPSG PPGPPGPPGP PGPSGGGYDF GFEGDFYRAD QPRSPPSLRP KDYEVDATLK
SLNNQIETLL TPEGSRKNPA RTCRDLRLSH PEWSSGYYWV DPNQGCTMDA IKVYCDFSTG
ETCIRAQPEN IPVKNWYRSS KAKKHVWVGE TINGGTQFEY NVEGVTTKEM ATQLAFMRLL
ANHASQNITY HCKNSIAYMD EETGNLKKAV ILQGSNDVEL VAEGNSRFTY TVLVDGCSKK
INEWRKTIIE YKTNKPSRLP ILDIAPLDIG GADQEIRLNI GPVCFK
//