ID L8HQF7_9CETA Unreviewed; 1366 AA.
AC L8HQF7;
DT 03-APR-2013, integrated into UniProtKB/TrEMBL.
DT 03-APR-2013, sequence version 1.
DT 27-MAR-2024, entry version 37.
DE SubName: Full=Collagen alpha-2(I) chain {ECO:0000313|EMBL:ELR46121.1};
GN ORFNames=M91_14202 {ECO:0000313|EMBL:ELR46121.1};
OS Bos mutus (wild yak).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Ruminantia; Pecora; Bovidae;
OC Bovinae; Bos.
OX NCBI_TaxID=72004 {ECO:0000313|EMBL:ELR46121.1, ECO:0000313|Proteomes:UP000011080};
RN [1] {ECO:0000313|EMBL:ELR46121.1, ECO:0000313|Proteomes:UP000011080}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=yakQH1 {ECO:0000313|Proteomes:UP000011080};
RX PubMed=22751099; DOI=10.1038/ng.2343;
RA Qiu Q., Zhang G., Ma T., Qian W., Wang J., Ye Z., Cao C., Hu Q., Kim J.,
RA Larkin D.M., Auvil L., Capitanu B., Ma J., Lewin H.A., Qian X., Lang Y.,
RA Zhou R., Wang L., Wang K., Xia J., Liao S., Pan S., Lu X., Hou H., Wang Y.,
RA Zang X., Yin Y., Ma H., Zhang J., Wang Z., Zhang Y., Zhang D., Yonezawa T.,
RA Hasegawa M., Zhong Y., Liu W., Zhang Y., Huang Z., Zhang S., Long R.,
RA Yang H., Wang J., Lenstra J.A., Cooper D.N., Wu Y., Wang J., Shi P.,
RA Wang J., Liu J.;
RT "The yak genome and adaptation to life at high altitude.";
RL Nat. Genet. 44:946-949(2012).
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000256|ARBA:ARBA00003647}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JH883643; ELR46121.1; -; Genomic_DNA.
DR STRING; 72004.ENSBMUP00000006346; -.
DR Proteomes; UP000011080; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.60.120.1000; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 7.
DR SMART; SM00038; COLFI; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|EMBL:ELR46121.1};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000011080};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..20
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 21..1366
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5003991370"
FT DOMAIN 1133..1366
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 33..1130
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 45..74
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 248..262
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 800..814
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1088..1104
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1366 AA; 129258 MW; A5F381C28ADA1DE9 CRC64;
MLSFVDTRTL LLLAVTSCLA TCQCKCLQLV SGSLGKSGDR GPRGERGPPG PPGRDGDDGI
PGPPGPPGPP GPPGLGGNFA AQFDAKGGGP GPMGLMGPRG PPGASGAPGP QGFQGPPGEP
GEPGQTGPAG ARGPPGPPGK AGEDGHPGKP GRPGERGVVG PQGARGFPGT PGLPGFKGIR
GHNGLDGLKG QPGAPGVKGE PGAPGENGTP GQTGARGLPG ERGRVGAPGP AGARGSDGSV
GPVGPAGPIG SAGPPGFPGA PGPKGELGPV GNPGPAGPAG PRGEVGLPGL SGPVGPPGNP
GANGLPGAKG AAGLPGVAGA PGLPGPRGIP GPVGASGATG ARGLVGEPGP AGSKGESGNK
GEPGAVGQPG PPGPSGEEGK RGSTGEIGPA GPPGPPGLRG NPGSRGLPGA DGRAGVMGPA
GSRGATGPAG VRGPNGDSGR PGEPGLMGPR GFPGSPGNIG PAGKEGPVGL PGIDGRPGPI
GPAGARGEPG NIGFPGPKGP SGDPGKAGEK GHAGLAGARG APGPDGNNGA QGPPGLQGVQ
GGKGEQGPAG PPGFQGLPGP AGTAGEAGKP GERGIPGEFG LPGPAGARGE RGPPGESGAA
GPTGPIGSRG PSGPPGPDGN KGEPGVVGAP GTAGPSGPSG LPGERGAAGI PGGKGEKGET
GLRGDIGSPG RDGARGAPGA IGAPGPAGAN GDRGEAGPAG PAGPAGPRGS PGERGEVGPA
GPNGFAGPAG AAGQPGAKGE RGTKGPKGEN GPVGPTGPVG AAGPSGPNGP PGPAGSRGDG
GPPGATGFPG AAGRTGPPGP SGISGPPGPP GPAGKEGLRG PRGDQGPVGR SGETGASGPP
GFVGEKGPSG EPGTAGPPGT PGPQGLLGAP GFLGLPGSRG ERGLPGVAGS VGEPGPLGIA
GPPGARGPPG NVGNPGVNGA PGEAGRDGNP GNDGPPGRDG QPGHKGERGY PGNAGPVGAA
GAPGPQGPVG PVGKHGNRGE PGPAGAVGPA GAVGPRGPSG PQGIRGDKGE PGDKGPRGLP
GLKGHNGLQG LPGLAGHHGD QGAPGAVGPA GPRGPAGPSG PAGKDGRIGQ PGAVGPAGIR
GSQGSQGPAG PPGPPGPPGP PGPSGGGYEF GFDGDFYRAD QPRSPTSLRP KDYEVDATLK
SLNNQIETLL TPEGSRKNPA RTCRDLRLSH PEWSSGYYWI DPNQGCTMDA IKVYCDFSTG
ETCIRAQPED IPVKNWYRNS KAKKHVWVGE TINGGTQFEY NVEGVTTKEM ATQLAFMRLL
ANHASQNITY HCKNSIAYMD EETGNLKKAV ILQGSNDVEL VAEGNSRFTY TVLVDGCSKK
TNEWQKTIIE YKTNKPSRLP ILDIAPLDIG GADQEIRLNI GPVCFK
//