ID A0A2K5UWX2_MACFA Unreviewed; 1187 AA.
AC A0A2K5UWX2;
DT 28-MAR-2018, integrated into UniProtKB/TrEMBL.
DT 02-JUN-2021, sequence version 2.
DT 27-MAR-2024, entry version 26.
DE SubName: Full=Collagen type XIX alpha 1 chain {ECO:0000313|Ensembl:ENSMFAP00000016948.2};
GN Name=COL19A1 {ECO:0000313|Ensembl:ENSMFAP00000016948.2};
OS Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
OC Cercopithecidae; Cercopithecinae; Macaca.
OX NCBI_TaxID=9541 {ECO:0000313|Ensembl:ENSMFAP00000016948.2, ECO:0000313|Proteomes:UP000233100};
RN [1] {ECO:0000313|Ensembl:ENSMFAP00000016948.2, ECO:0000313|Proteomes:UP000233100}
RP NUCLEOTIDE SEQUENCE.
RA Warren W., Wilson R.K.;
RL Submitted (MAR-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSMFAP00000016948.2}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (SEP-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; A0A2K5UWX2; -.
DR STRING; 9541.ENSMFAP00000016948; -.
DR Ensembl; ENSMFAT00000067486.2; ENSMFAP00000016948.2; ENSMFAG00000031633.2.
DR VEuPathDB; HostDB:ENSMFAG00000031633; -.
DR GeneTree; ENSGT00940000158276; -.
DR OrthoDB; 3809795at2759; -.
DR Proteomes; UP000233100; Chromosome 4.
DR Bgee; ENSMFAG00000031633; Expressed in adult mammalian kidney and 4 other cell types or tissues.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005788; C:endoplasmic reticulum lumen; IEA:UniProt.
DR GO; GO:0030198; P:extracellular matrix organization; IEA:Ensembl.
DR GO; GO:0007519; P:skeletal muscle tissue development; IEA:Ensembl.
DR Gene3D; 2.60.120.200; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR048287; TSPN-like_N.
DR PANTHER; PTHR37456:SF5; -; 1.
DR PANTHER; PTHR37456; SI:CH211-266K2.1; 1.
DR Pfam; PF01391; Collagen; 10.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000233100};
KW Signal {ECO:0000256|ARBA:ARBA00022729}.
FT DOMAIN 96..280
FT /note="Thrombospondin-like N-terminal"
FT /evidence="ECO:0000259|SMART:SM00210"
FT REGION 334..724
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 749..1054
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1095..1187
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 380..394
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 462..476
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 491..505
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 884..899
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1187 AA; 119754 MW; 7D7866EE78330C0D CRC64;
MRPPSPLAGS SLLGGAAALS GLHCASEGLK HEFKICGHLF TWFQGTMRLT GPWKLWLWMS
VFLLPASTSV TIRDKTEESC PILRIEGHQL TYDNINKVEV SGFDLGESFS LRRAFCESDK
TCFKLGSALL IRDTIKVFPK GLPEEYSVTA MFRVRRNAKK ERWFLWQVLN QQNIPQISIV
VDGGKKVVEF MFQATEGDVL NYIFRNRELR ALFDRQWHKL GISIQSRVLS LYMDCNLIAR
RQTDEKDTVD FHGRTVIATR VADGKPVDIE LHQLKIYCSA NLIAQETCCE ISDTKCPEQD
GFGNIASSSV TAHASKMSSY LPAKQELKDQ CQCIPNKGEA GLPGAPGSPG QKGDKGEPGE
NGLHGAPGLP GQKGEQGFEG SKGETGEKGE QGEKGDPGLA GLNGENGLKG DLGPRGLPGP
KGEKGDTGPP GPPALPGSLG IQGPQGPPGK EGQRGRRGKT GPPGKPGPPG PPGPPGIQGI
HQTLGGYYNK DNKGNDEHEA GGLKGDKGEN GLPGFPGSVG PKGQKGEPGE PFTKGEKGDR
GEPGIIGSQG VKGEPGDPGP PGLIGSPGLK GQQGSAGSMG PRGPPGDVGL PGEHGIPGKQ
GIKGEKGDPG GIIGPPGLPG PKGEAGPPGK SLPGEPGLDG NPGAPGPRGP KGERGLPGVH
GSPGDIGPPG VGIPGRTGSQ GPVGEPGIQG PRGLPGLPGT PGTPGNDGVP GRDGKPGLPG
PPGDPIALPL LGDIGVLLKN FCGNCQASVP GLKSNKGEGG AGEPGKYDSM ARKGDIGPRG
PPGIPGREGP KGSKGERGYP GIPGEKGDEG LQGIPGIPGT PGPTGPPGLL GRTGHPGPTG
AKGEKGSHGP PGKPGPPGPP GIPFNEGNGM SSLYKIKGGV NVPSYPGPPG PPGPKGDPGP
VGEPGAMGLP GLEGFPGVKG DRGPAGPPGI AGMSGKPGAP GPPGVPGEPG ERGPVGDIGF
PGPEGPSGKP GINGKDGIPG AQGIMGKPGD RGPKGERGDQ GIPGDRGPQG ERGKPGLTGM
KGAIGPIGPP GNKGSMGSPG HQGPPGSPGI PGIPADAVSF EEIKKYINQE VLRIFEERMA
VFLSQLKLPA AMLAAQAHGR PGPPGKDGLP GPPGDPGPQG YRGQKGERGE PGIGLPGSPG
LPGTSALGLP GSPGVPGPQG PPGPSGRCNP EDCLYPLSQA HQRTGGK
//