ID A0A452S4F3_URSAM Unreviewed; 1239 AA.
AC A0A452S4F3;
DT 08-MAY-2019, integrated into UniProtKB/TrEMBL.
DT 08-MAY-2019, sequence version 1.
DT 27-MAR-2024, entry version 23.
DE SubName: Full=Collagen type I alpha 2 chain {ECO:0000313|Ensembl:ENSUAMP00000026924.1};
GN Name=COL1A2 {ECO:0000313|Ensembl:ENSUAMP00000026924.1};
OS Ursus americanus (American black bear) (Euarctos americanus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Carnivora; Caniformia; Ursidae; Ursus.
OX NCBI_TaxID=9643 {ECO:0000313|Ensembl:ENSUAMP00000026924.1, ECO:0000313|Proteomes:UP000291022};
RN [1] {ECO:0000313|Proteomes:UP000291022}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RA Korstanje R., Srivastava A., Sarsani V.K., Sheehan S.M., Seger R.L.,
RA Barter M.E., Lindqvist C., Brody L.C., Mullikin J.C.;
RT "De novo assembly and RNA-Seq shows season-dependent expression and editing
RT in black bear kidneys.";
RL Submitted (JUN-2016) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSUAMP00000026924.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (SEP-2023) to UniProtKB.
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000256|ARBA:ARBA00003647}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; A0A452S4F3; -.
DR Ensembl; ENSUAMT00000030021.1; ENSUAMP00000026924.1; ENSUAMG00000020167.1.
DR GeneTree; ENSGT00940000155639; -.
DR Proteomes; UP000291022; Unassembled WGS sequence.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 1.20.5.320; 6-Phosphogluconate Dehydrogenase, domain 3; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 7.
DR SMART; SM00038; COLFI; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
PE 4: Predicted;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000291022};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..20
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 21..1239
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5019490562"
FT DOMAIN 1006..1239
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 36..70
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 85..138
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 156..1004
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 48..68
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 192..206
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 961..977
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1239 AA; 119134 MW; D18062EDEB359F57 CRC64;
MLSFVDTRTL LLLAVTSCLA TCQCKCLRLV WGRRGRKIRR GPPGPPGRDG DDGIPGPPGP
PGPPGPPGLG GVRCLKYYLS SSQGLMGPRG PPGASGAPGP QGFQGPAGEP GEPGQTVSTF
SPVLVDFGHP GKPGRPGERG VVGPQVSFIL VSPVTSGGAR GLPGERGRVG APGPAGARGS
DGSVGPVGPA GPIGSAGPPG FPGAPGPKGE LGPVGNPGPA GPAGPRGEVG LPGVSGPVGP
PGNPGANGLT GAKGAAGLPG VAGAPGLPGP RGIPGPVGAA GATGARGLVG EPGPAGSKGE
SGNKGEPGSV GPQGPPGPSG EEGKRGPNGE AGSAGPSGPP GLRGSPGSRG LPGADGRAGV
MGPPGPRGAT GPAGVRGPNG DSGRPGEPGL MGPRGFPGAP GNVGPAGKEG PMGLPGIDGR
PGPIGPAGAR GEPGNIGFPG PKGPSGEPGK AGEKGHAGLA GARGAPGPDG NNGAQGPPGP
QGVQGGKGEQ GPAGPPGFQG ERGPPGESGA AGPSGPIGSR GPSGPPGPDG NKGEPGVLGA
PGTAGPSGPG GLPGERGAAG IPGGKGEKGE TGLRGDVGNP GRDGARVSRI LFGERGEVGP
AGPNGFAGPA GAAGQPGAKG ERGTKGPKGE NGPVGPTGPV GSAGPSGPNG PPGPAGSRGD
GGPPRINDSV ICSQGITGPP GPPGAAGKEG LRGPRGDQGP VGRTGETGAH GPPGFAGEKG
PSGEPGTAGP PGTAGPQGLL GAPGILGLPG SRGERGLPGV SGSVGEPGPL GISGPPGARG
PPGAVGAPGV NGAPGEAGRD GNPGNDGPPG RDGQPGHKGE RGYPGNIGPV GAVGAPGPHG
PVGPTGKHGN RGEPGPAGAV GPVGAVGPRG PSGPQGVRGD KGEPGDKGPR GLPGLKGHNG
LQGLPGLAGQ HGDQGAPGSV GPAGPRGPAG PSGPAGKDGR IGHPGTVGPA GVRGSQGSQG
PAGPPGPPGP PGPPGPSGGG YDFGYEGDFY RADQPRSPPS LRPKDYEVDA TLKSLNNQIE
TLLTPEGSRK NPARTCRDLR LSHPEWSSGY YWIDPNQGCT MDAIKVHCDF STGETCIRAQ
PENIPAKNWY RNSKVKKHIW LGETINGGTQ FEYNVEGVTT KEMATQLAFM RLLANHASQN
ITYHCKNSIA YMDEETGNLN KAVILQGSND VELVAEGNSR FTYSVLVDGC SKKTNEWGKT
IIEYKTNKPS RLPILDIAPL DIGGADQEFR VDVGPVCFK
//