ID H3ATS3_LATCH Unreviewed; 847 AA.
AC H3ATS3;
DT 18-APR-2012, integrated into UniProtKB/TrEMBL.
DT 18-APR-2012, sequence version 1.
DT 27-MAR-2024, entry version 48.
DE RecName: Full=Collagen type XXII alpha 1 chain {ECO:0008006|Google:ProtNLM};
OS Latimeria chalumnae (Coelacanth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Coelacanthiformes; Coelacanthidae; Latimeria.
OX NCBI_TaxID=7897 {ECO:0000313|Ensembl:ENSLACP00000013044.1, ECO:0000313|Proteomes:UP000008672};
RN [1] {ECO:0000313|Proteomes:UP000008672}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Wild caught {ECO:0000313|Proteomes:UP000008672};
RA Di Palma F., Alfoldi J., Johnson J., Berlin A., Gnerre S., Jaffe D.,
RA MacCallum I., Young S., Walker B.J., Lander E., Lindblad-Toh K.;
RT "The draft genome of Latimeria chalumnae.";
RL Submitted (AUG-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSLACP00000013044.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AFYH01156626; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156627; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156628; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156629; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156630; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156631; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156632; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156633; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156634; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01156635; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR Ensembl; ENSLACT00000013140.1; ENSLACP00000013044.1; ENSLACG00000011491.1.
DR GeneTree; ENSGT00940000158302; -.
DR HOGENOM; CLU_006168_0_0_1; -.
DR Proteomes; UP000008672; Unassembled WGS sequence.
DR Bgee; ENSLACG00000011491; Expressed in mesonephros and 3 other cell types or tissues.
DR InterPro; IPR008160; Collagen.
DR PANTHER; PTHR24023:SF1019; COLLAGEN; 1.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR Pfam; PF01391; Collagen; 7.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000008672}.
FT REGION 1..130
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 149..351
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 391..423
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 455..502
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 515..584
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 610..714
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 737..834
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 554..575
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 636..650
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 808..828
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 847 AA; 84650 MW; 87DAA215C9945390 CRC64;
QGLPGEIGFS GKPGEPGKPG LPGNDGLDGL PGDPGPKGEE GERGLDGFPG KPGPQSRVGP
HRGDTGSRGP DGASGEKGEA GSPGLPGLAG LRGEKGDQGE RGRLGPLGPK GEKGDQGASG
PPGKPGLPGT VSRILFAYCC TKKSSDFPLH KNVSPFFKSA QGPRGVPGEK GHIGLLGPQG
GEGKKKVKGR GGREKKRGGR GRKGREKDYG SVQGNKGDSG VRGEAGAKGD TGVPGDPGLP
GKEGHKGSKG HQGIEGKHGL SGPPGEPGPA GIPGIPGKSK PGEPGEPGSP GTPGPQGPKG
DAGTPGLPGQ PGDRGPPGIG KPGHPGEAGP RGIPGVTGPQ GPPGQHAFKG PSLQYCRIGK
LGPSLQYCRI GKQDIYGKYC TTGPRLQYFG GGDEDLHATP GTDGVPGSTG PPGLLGDRPP
LPTRTKATNI VKKIDGAQIH STSQNIITNK TVRNWIDPPK KTGHEGKDGP PGLQGSPGLP
GLMGIAGKDG KEGEPGPPGE PYVCLCNEGV IFGFKGHRGE AGPPGPTEGE TGIPGGEPGM
MGAPGREGQP GKDPLPSPRF PQGKIGPPGP PGKTGPLGNP XGGRLHHFLK IKYSRNNSYK
YSMPFLRGVQ SFREGEPGEK GPPGKDGPPG SPGERGSKGE RGDAGIKGDK GAQGEKGTAG
EPGNPGHKGI TGMMGSQGPP GERGSPGSPG LPGQPGLPGP RGDSPSLEQL RRLIQEELGK
QLDAKLAYLM SQLQPANVKA ARGRPGPTGP PGKDGLPGRV GTPGEPGRPG ERGAKGERGE
PGIGQRGEIG LPGPPGYGKD GMPGTPGPQG ETGSQGPMGS QGPPGQAGQC DPSQCAYYAS
LAARPAN
//