ID L8HZE2_9CETA Unreviewed; 1457 AA.
AC L8HZE2;
DT 03-APR-2013, integrated into UniProtKB/TrEMBL.
DT 03-APR-2013, sequence version 1.
DT 27-MAR-2024, entry version 36.
DE SubName: Full=Collagen alpha-1(II) chain {ECO:0000313|EMBL:ELR49258.1};
DE Flags: Fragment;
GN ORFNames=M91_21472 {ECO:0000313|EMBL:ELR49258.1};
OS Bos mutus (wild yak).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Ruminantia; Pecora; Bovidae;
OC Bovinae; Bos.
OX NCBI_TaxID=72004 {ECO:0000313|EMBL:ELR49258.1, ECO:0000313|Proteomes:UP000011080};
RN [1] {ECO:0000313|EMBL:ELR49258.1, ECO:0000313|Proteomes:UP000011080}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=yakQH1 {ECO:0000313|Proteomes:UP000011080};
RX PubMed=22751099; DOI=10.1038/ng.2343;
RA Qiu Q., Zhang G., Ma T., Qian W., Wang J., Ye Z., Cao C., Hu Q., Kim J.,
RA Larkin D.M., Auvil L., Capitanu B., Ma J., Lewin H.A., Qian X., Lang Y.,
RA Zhou R., Wang L., Wang K., Xia J., Liao S., Pan S., Lu X., Hou H., Wang Y.,
RA Zang X., Yin Y., Ma H., Zhang J., Wang Z., Zhang Y., Zhang D., Yonezawa T.,
RA Hasegawa M., Zhong Y., Liu W., Zhang Y., Huang Z., Zhang S., Long R.,
RA Yang H., Wang J., Lenstra J.A., Cooper D.N., Wu Y., Wang J., Shi P.,
RA Wang J., Liu J.;
RT "The yak genome and adaptation to life at high altitude.";
RL Nat. Genet. 44:946-949(2012).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JH882468; ELR49258.1; -; Genomic_DNA.
DR STRING; 72004.ENSBMUP00000002385; -.
DR Proteomes; UP000011080; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.10.70.10; Complement Module, domain 1; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF58; COLLAGEN ALPHA-1(II) CHAIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 5.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|EMBL:ELR49258.1};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Reference proteome {ECO:0000313|Proteomes:UP000011080};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}.
FT DOMAIN 4..62
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 1226..1457
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 68..1207
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 106..122
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 130..147
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 210..224
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 324..338
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 405..419
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 883..897
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1173..1190
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT NON_TER 1457
FT /evidence="ECO:0000313|EMBL:ELR49258.1"
SQ SEQUENCE 1457 AA; 138730 MW; 983A118F2F764288 CRC64;
EKAGSCVQDG QRYNDKDVWK PEPCRICVCD TGTVLCDDII CEDMKDCLSP ETPFGECCPI
CSADLPTASG RQPGPKGQKG EPGDIKDIVG PKGPPGPQGP AGEQGPRGDR GDKGEKGAPG
PRGRDGEPGT PGNPGPPGPP GPPGPPGLGG NFAAQMAGGF DEKAGGAQMG VMQGPMGPMG
PRGPPGPAGA PGPQGFQGNP GEPGEPGVSG PMGPRGPPGP PGKPGDDGEA GKPGKSGERG
PPGPQGARGF PGTPGLPGVK GHRGYPGLDG AKGEAGAPGV KGESGSPGEN GSPGPMGPRG
LPGERGRTGP AGAAGARGND GQPGPAGPPG PVGPAGGPGF PGAPGAKGEA GPTGARGPEG
AQGPRGEPGT PGSPGPAGAA GNPGTDGIPG AKGSAGAPGI AGAPGFPGPR GPPGPQGATG
PLGPKGQTGE PGIAGFKGEQ GPKGEPGPAG PQGAPGPAGE EGKRGARGEP GGAGPAGPPG
ERGAPGNRGF PGQDGLAGPK GAPGERGPSG LAGPKGANGD PGRPGEPGLP GARGLTGRPG
DAGPQGKVGP SGAPGEDGRP GPPGPQGARG QPGVMGFPGP KGANGEPGKA GEKGLPGAPG
LRGLPGKDGE TGAAGPPGPA GPAGERGEQG APGPSGFQGL PGPPGPPGEG GKPGDQGVPG
EAGAPGLVGP RGERGFPGER GSPGSQGLQG ARGLPGTPGT DGPKGAAGPA GPPGAQGPPG
LQGMPGERGA AGIAGPKGDR GDVGEKGPEG APGKDGGRGL TGPIGPPGPA GANGEKGEVG
PPGPAGTAGA RGAPGERGET GPPGPAGFAG PPGADGQPGA KGEQGEAGQK GDAGAPGPQG
PSGAPGPQGP TGVTGPKGAR GAQGPPGATG FPGAAGRVGP PGSNGNPGPP GPPGPSGKDG
PKGARGDSGP PGRAGDPGLQ GPAGPPGEKG EPGDDGPSGP DGPPGPQGLA GQRGIVGLPG
QRGERGFPGL PGPSGEPGKQ GAPGASGDRG PPGPVGPPGL TGPAGEPGRE GSPGADGPPG
RDGAAGVKGD RGETGAVGAP GAPGPPGSPG PAGPIGKQGD RGEAGAQGPM GPAGPAGARG
MPGPQGPRGD KGETGEAGER GLKGHRGFTG LQGLPGPPGP SGDQGASGPA GPSGPRGPPG
PVGPSGKDGA NGIPGPIGPP GPRGRSGETG PAGPPGNPGP PGPPGPPGPG IDMSAFAGLG
QREKGPDPLQ YMRADEAAGN LRQHDAEVDA TLKSLNNQIE SLRSPEGSRK NPARTCRDLK
LCHPEWDYWI DPNQGCTLDA MKVFCNMETG ETCVYPNPAS VPKKNWWSSK SKDKKHIWFG
ETINGGFHFS YGDDNLAPNT ANVQMTFLRL LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA
LLIQGSNDVE IRAEGNSRFT YTVLKDGCTK HTGKWGKTMI EYRSQKTSRL PIIDIAPMDI
GGPEQEFGVD IGPVCFL
//