ID Q7QD47_ANOGA Unreviewed; 932 AA.
AC Q7QD47;
DT 15-DEC-2003, integrated into UniProtKB/TrEMBL.
DT 27-JUL-2011, sequence version 4.
DT 27-MAR-2024, entry version 140.
DE SubName: Full=AGAP002954-PA {ECO:0000313|EMBL:EAA08116.4};
GN Name=1273009 {ECO:0000313|EnsemblMetazoa:AGAP002954-PA};
GN ORFNames=AgaP_AGAP002954 {ECO:0000313|EMBL:EAA08116.4};
OS Anopheles gambiae (African malaria mosquito).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae;
OC Anophelinae; Anopheles.
OX NCBI_TaxID=7165 {ECO:0000313|EMBL:EAA08116.4};
RN [1] {ECO:0000313|EMBL:EAA08116.4, ECO:0000313|Proteomes:UP000007062}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=PEST {ECO:0000313|EMBL:EAA08116.4,
RC ECO:0000313|Proteomes:UP000007062};
RX PubMed=12364791; DOI=10.1126/science.1076181;
RA Holt R.A., Subramanian G.M., Halpern A., Sutton G.G., Charlab R.,
RA Nusskern D.R., Wincker P., Clark A.G., Ribeiro J.M.C., Wides R.,
RA Salzberg S.L., Loftus B.J., Yandell M.D., Majoros W.H., Rusch D.B., Lai Z.,
RA Kraft C.L., Abril J.F., Anthouard V., Arensburger P., Atkinson P.W.,
RA Baden H., de Berardinis V., Baldwin D., Benes V., Biedler J., Blass C.,
RA Bolanos R., Boscus D., Barnstead M., Cai S., Center A., Chaturverdi K.,
RA Christophides G.K., Chrystal M.A.M., Clamp M., Cravchik A., Curwen V.,
RA Dana A., Delcher A., Dew I., Evans C.A., Flanigan M.,
RA Grundschober-Freimoser A., Friedli L., Gu Z., Guan P., Guigo R.,
RA Hillenmeyer M.E., Hladun S.L., Hogan J.R., Hong Y.S., Hoover J.,
RA Jaillon O., Ke Z., Kodira C.D., Kokoza E., Koutsos A., Letunic I.,
RA Levitsky A.A., Liang Y., Lin J.-J., Lobo N.F., Lopez J.R., Malek J.A.,
RA McIntosh T.C., Meister S., Miller J.R., Mobarry C., Mongin E., Murphy S.D.,
RA O'Brochta D.A., Pfannkoch C., Qi R., Regier M.A., Remington K., Shao H.,
RA Sharakhova M.V., Sitter C.D., Shetty J., Smith T.J., Strong R., Sun J.,
RA Thomasova D., Ton L.Q., Topalis P., Tu Z.J., Unger M.F., Walenz B.,
RA Wang A.H., Wang J., Wang M., Wang X., Woodford K.J., Wortman J.R., Wu M.,
RA Yao A., Zdobnov E.M., Zhang H., Zhao Q., Zhao S., Zhu S.C., Zhimulev I.,
RA Coluzzi M., della Torre A., Roth C.W., Louis C., Kalush F., Mural R.J.,
RA Myers E.W., Adams M.D., Smith H.O., Broder S., Gardner M.J., Fraser C.M.,
RA Birney E., Bork P., Brey P.T., Venter J.C., Weissenbach J., Kafatos F.C.,
RA Collins F.H., Hoffman S.L.;
RT "The genome sequence of the malaria mosquito Anopheles gambiae.";
RL Science 298:129-149(2002).
RN [2] {ECO:0000313|EMBL:EAA08116.4}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=PEST {ECO:0000313|EMBL:EAA08116.4};
RG The Anopheles Genome Sequencing Consortium;
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
RN [3] {ECO:0000313|EMBL:EAA08116.4}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=PEST {ECO:0000313|EMBL:EAA08116.4};
RX PubMed=14747013; DOI=10.1016/j.pt.2003.11.003;
RA Mongin E., Louis C., Holt R.A., Birney E., Collins F.H.;
RT "The Anopheles gambiae genome: an update.";
RL Trends Parasitol. 20:49-52(2004).
RN [4] {ECO:0000313|EMBL:EAA08116.4}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=PEST {ECO:0000313|EMBL:EAA08116.4};
RX PubMed=17210077; DOI=10.1186/gb-2007-8-1-r5;
RA Sharakhova M.V., Hammond M.P., Lobo N.F., Krzywinski J., Unger M.F.,
RA Hillenmeyer M.E., Bruggner R.V., Birney E., Collins F.H.;
RT "Update of the Anopheles gambiae PEST genome assembly.";
RL Genome Biol. 8:R5.1-R5.13(2007).
RN [5] {ECO:0000313|EMBL:EAA08116.4}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=PEST {ECO:0000313|EMBL:EAA08116.4};
RG VectorBase;
RL Submitted (MAY-2011) to the EMBL/GenBank/DDBJ databases.
RN [6] {ECO:0000313|EnsemblMetazoa:AGAP002954-PA}
RP IDENTIFICATION.
RC STRAIN=PEST {ECO:0000313|EnsemblMetazoa:AGAP002954-PA};
RG EnsemblMetazoa;
RL Submitted (JAN-2021) to UniProtKB.
CC -!- SIMILARITY: Belongs to the CEF1 family.
CC {ECO:0000256|ARBA:ARBA00010506}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AAAB01008859; EAA08116.4; -; Genomic_DNA.
DR RefSeq; XP_311945.4; XM_311945.4.
DR AlphaFoldDB; Q7QD47; -.
DR STRING; 7165.Q7QD47; -.
DR PaxDb; 7165-AGAP002954-PA; -.
DR EnsemblMetazoa; AGAP002954-RA; AGAP002954-PA; AGAP002954.
DR GeneID; 1273009; -.
DR KEGG; aga:AgaP_AGAP002954; -.
DR VEuPathDB; VectorBase:AGAP002954; -.
DR eggNOG; KOG0050; Eukaryota.
DR HOGENOM; CLU_009082_0_0_1; -.
DR InParanoid; Q7QD47; -.
DR OMA; KMGMAGE; -.
DR OrthoDB; 131128at2759; -.
DR Proteomes; UP000007062; Chromosome 2R.
DR GO; GO:0000974; C:Prp19 complex; IBA:GO_Central.
DR GO; GO:0005681; C:spliceosomal complex; IBA:GO_Central.
DR GO; GO:0000981; F:DNA-binding transcription factor activity, RNA polymerase II-specific; IBA:GO_Central.
DR GO; GO:0000977; F:RNA polymerase II transcription regulatory region sequence-specific DNA binding; IBA:GO_Central.
DR GO; GO:0000398; P:mRNA splicing, via spliceosome; IBA:GO_Central.
DR GO; GO:0006357; P:regulation of transcription by RNA polymerase II; IBA:GO_Central.
DR CDD; cd00167; SANT; 1.
DR CDD; cd11659; SANT_CDC5_II; 1.
DR Gene3D; 1.10.10.60; Homeodomain-like; 2.
DR InterPro; IPR047242; CDC5L/Cef1.
DR InterPro; IPR021786; Cdc5p/Cef1_C.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR017930; Myb_dom.
DR InterPro; IPR001005; SANT/Myb.
DR InterPro; IPR047240; SANT_CDC5L_II.
DR PANTHER; PTHR45885; CELL DIVISION CYCLE 5-LIKE PROTEIN; 1.
DR PANTHER; PTHR45885:SF1; CELL DIVISION CYCLE 5-LIKE PROTEIN; 1.
DR Pfam; PF11831; Myb_Cef; 1.
DR Pfam; PF13921; Myb_DNA-bind_6; 1.
DR SMART; SM00717; SANT; 2.
DR SUPFAM; SSF46689; Homeodomain-like; 1.
DR PROSITE; PS51294; HTH_MYB; 2.
DR PROSITE; PS50090; MYB_LIKE; 2.
PE 3: Inferred from homology;
KW Coiled coil {ECO:0000256|SAM:Coils};
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125};
KW mRNA processing {ECO:0000256|ARBA:ARBA00022664};
KW mRNA splicing {ECO:0000256|ARBA:ARBA00023187};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242};
KW Reference proteome {ECO:0000313|Proteomes:UP000007062};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Spliceosome {ECO:0000256|ARBA:ARBA00022728}.
FT DOMAIN 1..58
FT /note="HTH myb-type"
FT /evidence="ECO:0000259|PROSITE:PS51294"
FT DOMAIN 3..54
FT /note="Myb-like"
FT /evidence="ECO:0000259|PROSITE:PS50090"
FT DOMAIN 55..104
FT /note="Myb-like"
FT /evidence="ECO:0000259|PROSITE:PS50090"
FT DOMAIN 59..108
FT /note="HTH myb-type"
FT /evidence="ECO:0000259|PROSITE:PS51294"
FT REGION 108..145
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 244..290
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 812..932
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COILED 728..811
FT /evidence="ECO:0000256|SAM:Coils"
FT COMPBIAS 248..277
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 812..846
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 879..902
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 932 AA; 105108 MW; 3A0B9F9DB7F063BC CRC64;
MPRIMIKGGV WRNTEDEILK AAVMKYGKNQ WSRIASLLHR KSAKQCKARW YEWLDPSIKK
TEWSREEDEK LLHLAKLMPT QWRTIAPIIG RTAAQCLERY EYLLDQAQRK EEGEDGMDDP
RKLKPGEIDP NPETKPARPD PKDMDEDELE MLSEARARLA NTQGKKAKRK AREKQLEEAR
RLAALQKRRE LRAAGIGLGN RKRKLKGIDY NSEVPFEKTP APGFYDTTEE FVVPIAADFS
SLRQQTLDGE LRTEKEARER KKDKEKLKQR KENDIPTALL KNQEPAKKRS KLVLPEPQIS
DQELQQVVKL GRASEIAKEV ASESGVETTD ALLADYSITP QVAATPRTPA PVTDRILQEA
QNMMALTHVE TPLKGGVNTP LHQSDFSGVL PQSQTVATPN TVLATPFRSV RGPDGSATPG
GFLTPASGAM VPVGSGTQPH APGATPNFLR DKLNINTEDG MSVAETPAAY KSYQKQLKSS
LKEGLASLPA PRNDYEIVVP DNETDEAADD GSMDVEQMVP DQADVDEKRK RNKLAQEAKE
LSLRSQVIQR DLPRPLEINT TVLRPSNEMH GLTDLQKAEE LVKQEMVKML NYDALRNPIQ
QSQPASVKRP MLSQYQAYLE QHPYETIDEV ELDEARKMLA AEMGVVKHGM AHGDLSLESY
TQVWQECLSQ VLYLPSQNRY TRANLASKKD RIESAEKRLE INRKHMAKEA KRCGKIEKKL
KILTAGYQAR AQALVKQFQD TNEQIEQNSL ALSTFKFLAA QEDLAIPKRL ESLTEDVMRQ
TEREKTLQNR YAQLTEELEE LNHLLEEARV NGVQEHEEGG KEREPLVNGR LEHREVEEKE
EPVVDNPTNG PEDSNVQEES SDSNAAEDEP PENEGEGNAV AQERQSESPE ESVPGQDSCE
SEENTSEGQE NQEQQQDEPM EQHDDEEDAS NE
//