ID A0A158NMZ4_ATTCE Unreviewed; 564 AA.
AC A0A158NMZ4;
DT 08-JUN-2016, integrated into UniProtKB/TrEMBL.
DT 08-JUN-2016, sequence version 1.
DT 27-MAR-2024, entry version 25.
DE RecName: Full=Collagen alpha chain CG42342 {ECO:0008006|Google:ProtNLM};
GN Name=105622078 {ECO:0000313|EnsemblMetazoa:XP_012058902.1};
OS Atta cephalotes (Leafcutter ant).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Hymenoptera; Apocrita; Aculeata; Formicoidea;
OC Formicidae; Myrmicinae; Atta.
OX NCBI_TaxID=12957 {ECO:0000313|EnsemblMetazoa:XP_012058902.1, ECO:0000313|Proteomes:UP000005205};
RN [1] {ECO:0000313|Proteomes:UP000005205}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=21347285; DOI=10.1371/journal.pgen.1002007;
RA Suen G., Teiling C., Li L., Holt C., Abouheif E., Bornberg-Bauer E.,
RA Bouffard P., Caldera E.J., Cash E., Cavanaugh A., Denas O., Elhaik E.,
RA Fave M.J., Gadau J., Gibson J.D., Graur D., Grubbs K.J., Hagen D.E.,
RA Harkins T.T., Helmkampf M., Hu H., Johnson B.R., Kim J., Marsh S.E.,
RA Moeller J.A., Munoz-Torres M.C., Murphy M.C., Naughton M.C., Nigam S.,
RA Overson R., Rajakumar R., Reese J.T., Scott J.J., Smith C.R., Tao S.,
RA Tsutsui N.D., Viljakainen L., Wissler L., Yandell M.D., Zimmer F.,
RA Taylor J., Slater S.C., Clifton S.W., Warren W.C., Elsik C.G., Smith C.D.,
RA Weinstock G.M., Gerardo N.M., Currie C.R.;
RT "The genome sequence of the leaf-cutter ant Atta cephalotes reveals
RT insights into its obligate symbiotic lifestyle.";
RL PLoS Genet. 7:e1002007-e1002007(2011).
RN [2] {ECO:0000313|EnsemblMetazoa:XP_012058902.1}
RP IDENTIFICATION.
RG EnsemblMetazoa;
RL Submitted (APR-2016) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; ADTU01020676; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020677; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020678; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020679; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020680; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020681; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020682; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; ADTU01020683; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; XP_012058902.1; XM_012203512.1.
DR AlphaFoldDB; A0A158NMZ4; -.
DR EnsemblMetazoa; XM_012203512.1; XP_012058902.1; LOC105622078.
DR GeneID; 105622078; -.
DR KEGG; acep:105622078; -.
DR eggNOG; KOG3544; Eukaryota.
DR InParanoid; A0A158NMZ4; -.
DR OrthoDB; 4271163at2759; -.
DR Proteomes; UP000005205; Unassembled WGS sequence.
DR GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR InterPro; IPR008160; Collagen.
DR PANTHER; PTHR37456:SF5; -; 1.
DR PANTHER; PTHR37456; SI:CH211-266K2.1; 1.
DR Pfam; PF01391; Collagen; 8.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000005205}.
FT REGION 14..76
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 98..564
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 103..119
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 186..200
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 335..356
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 389..403
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 539..556
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 564 AA; 55564 MW; 32B0875C91D5B6D1 CRC64;
MDVFTSPNIV VNFCKTKRIQ RPPGEPGPPG KRGKKGKKGD PGEPGAPGPI GLDGPKGDPG
RPGDKGQKGE LGSPGFDVLS AVKDLQEKGV NISAQNIIPL KGEPGEPGPP GPPGPPGAEG
LPGHEGRQGI PGEVGAPGEK GPPGPIGPIG PTGTPGLAGP KGDKGDKGDR GFTTTLKGEQ
FPSGVFEGPP GPPGPPGSPG EKGELGPTGP PGLPGEKGTR GKQGKRGFKG ESGLALARGA
PGLPGPKGEP GERGYRGEKG AMGDAGLPGP KGEMGLQGLQ GLNGTDGDPG IQGPPGLPGI
SGPKGEKGDY GDIGPPGLMG PPGLPGPPGY PGLKGEKGEK GESKYKKLRR RQGDGTFEMN
AGEVIMGPPG PPGPAGTPGL QGPPGIKGDR GHDGAKGDPG EKGAKGDPGP MGLPGPMGLR
GESGKPGDSG KPGAMGSPGL DGMKGAQGEP GTKGERGDPG LPGTDGIPGA EERSANGSTG
PPGKRGRKGD KGNKGDQGVP GLDAPCPLGP DGLPLPGCGW RPPQDITSTS VPAIEGPEPA
ETDYEEPEDE YDDYTDPNGN LAEQ
//