ID A0A182NUS8_9DIPT Unreviewed; 1309 AA.
AC A0A182NUS8;
DT 07-SEP-2016, integrated into UniProtKB/TrEMBL.
DT 07-SEP-2016, sequence version 1.
DT 24-JAN-2024, entry version 39.
DE SubName: Full=Uncharacterized protein {ECO:0000313|EnsemblMetazoa:ADIR011428-PA};
OS Anopheles dirus.
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae;
OC Anophelinae; Anopheles.
OX NCBI_TaxID=7168 {ECO:0000313|EnsemblMetazoa:ADIR011428-PA, ECO:0000313|Proteomes:UP000075884};
RN [1] {ECO:0000313|Proteomes:UP000075884}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=WRAIR2 {ECO:0000313|Proteomes:UP000075884};
RG The Broad Institute Genomics Platform;
RA Neafsey D.E., Walton C., Walker B., Young S.K., Zeng Q., Gargeya S.,
RA Fitzgerald M., Haas B., Abouelleil A., Allen A.W., Alvarado L.,
RA Arachchi H.M., Berlin A.M., Chapman S.B., Gainer-Dewar J., Goldberg J.,
RA Griggs A., Gujja S., Hansen M., Howarth C., Imamovic A., Ireland A.,
RA Larimer J., McCowan C., Murphy C., Pearson M., Poon T.W., Priest M.,
RA Roberts A., Saif S., Shea T., Sisk P., Sykes S., Wortman J., Nusbaum C.,
RA Birren B.;
RT "The Genome Sequence of Anopheles dirus WRAIR2.";
RL Submitted (MAR-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|EnsemblMetazoa:ADIR011428-PA}
RP IDENTIFICATION.
RC STRAIN=WRAIR2 {ECO:0000313|EnsemblMetazoa:ADIR011428-PA};
RG EnsemblMetazoa;
RL Submitted (MAY-2020) to UniProtKB.
CC -!- COFACTOR:
CC Name=Mg(2+); Xref=ChEBI:CHEBI:18420;
CC Evidence={ECO:0000256|ARBA:ARBA00001946};
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|ARBA:ARBA00004123}.
CC -!- SIMILARITY: Belongs to the XPG/RAD2 endonuclease family. XPG subfamily.
CC {ECO:0000256|ARBA:ARBA00005283}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR STRING; 7168.A0A182NUS8; -.
DR EnsemblMetazoa; ADIR011428-RA; ADIR011428-PA; ADIR011428.
DR VEuPathDB; VectorBase:ADIR011428; -.
DR OrthoDB; 26655at2759; -.
DR Proteomes; UP000075884; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR GO; GO:0004519; F:endonuclease activity; IEA:UniProtKB-KW.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR GO; GO:0003697; F:single-stranded DNA binding; IEA:InterPro.
DR GO; GO:0006289; P:nucleotide-excision repair; IEA:InterPro.
DR CDD; cd09904; H3TH_XPG; 1.
DR CDD; cd09868; PIN_XPG_RAD2; 2.
DR Gene3D; 1.10.150.20; 5' to 3' exonuclease, C-terminal subdomain; 1.
DR Gene3D; 3.40.50.1010; 5'-nuclease; 2.
DR InterPro; IPR036279; 5-3_exonuclease_C_sf.
DR InterPro; IPR008918; HhH2.
DR InterPro; IPR029060; PIN-like_dom_sf.
DR InterPro; IPR003903; UIM_dom.
DR InterPro; IPR006086; XPG-I_dom.
DR InterPro; IPR006084; XPG/Rad2.
DR InterPro; IPR001044; XPG/Rad2_eukaryotes.
DR InterPro; IPR019974; XPG_CS.
DR InterPro; IPR006085; XPG_DNA_repair_N.
DR PANTHER; PTHR16171:SF7; DNA EXCISION REPAIR PROTEIN ERCC-5; 1.
DR PANTHER; PTHR16171; DNA REPAIR PROTEIN COMPLEMENTING XP-G CELLS-RELATED; 1.
DR Pfam; PF00867; XPG_I; 1.
DR Pfam; PF00752; XPG_N; 1.
DR PRINTS; PR00853; XPGRADSUPER.
DR PRINTS; PR00066; XRODRMPGMNTG.
DR SMART; SM00279; HhH2; 1.
DR SMART; SM00484; XPGI; 1.
DR SMART; SM00485; XPGN; 1.
DR SUPFAM; SSF47807; 5' to 3' exonuclease, C-terminal subdomain; 1.
DR SUPFAM; SSF88723; PIN domain-like; 1.
DR PROSITE; PS50330; UIM; 1.
DR PROSITE; PS00841; XPG_1; 1.
DR PROSITE; PS00842; XPG_2; 1.
PE 3: Inferred from homology;
KW DNA damage {ECO:0000256|ARBA:ARBA00022763};
KW DNA repair {ECO:0000256|ARBA:ARBA00023204};
KW Endonuclease {ECO:0000256|ARBA:ARBA00022759};
KW Hydrolase {ECO:0000256|ARBA:ARBA00022801};
KW Magnesium {ECO:0000256|ARBA:ARBA00022842};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Nuclease {ECO:0000256|ARBA:ARBA00022722};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242}.
FT DOMAIN 1..98
FT /note="XPG N-terminal"
FT /evidence="ECO:0000259|SMART:SM00485"
FT DOMAIN 877..946
FT /note="XPG-I"
FT /evidence="ECO:0000259|SMART:SM00484"
FT REGION 737..811
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1129..1238
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1252..1293
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1180..1208
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1259..1293
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1309 AA; 145623 MW; F2986B218702C3B5 CRC64;
MGVLGLWRLV EQSGKPVPLD TLENKVLAVD ISIWLHQVIK GFQDSKGSAL PNAHVLGLFH
RLCKLMYYRI KPIFVFDGGV PLLKKQTIAK RHQSKNNYQN EADRIQQLLL ETLAKEKVVQ
QALGSATNIL ISPSKKAITN GGAGPSKQPE EEPDAIFKLP PMKAPEEPID LDRSDSSMDE
KASRNYYHLN LNAIDVTSIY FKNLPADVRH EILNDIKETR KQSSWGRLHE LPVESDSFSS
FQMNRLLKRR QVQVELEEAE KEMGGKCLSL AELESLLNEE GVETSTNRAA QQIASDENTR
FLLVRDVQKA IEKAKAREEA EKMAPKAPKI SKLAKEDFPD DLDDKEIDEE LQMAIKMSLM
QDETPHAMIE LDEEDLRMSH RQKQALGNAA QSLARGFMLE YGGLTSEEFN DLLHQTQDVD
GGDVNDSMSQ MFVHNGSSIV ERDSSPVQEI IDDEEERVEQ LEETKPDSGM DTESDSDFID
VPEDNLNDVQ QGTSVPLNNT NHFKPHYKPI VDFTIDDLKQ LSAGPSKKKE VVEVFIKPEE
IGTCDKDDLF ADIFTDEAKK VLQPMTSIPV EQANDLLEAS DAKEKSVETD SIAPTAFLGI
KIKKVDAINA QLKEELENLM KAPPAIDLGD ALSPSAMENP STDLKSISET LKQQLEQLKS
SANAFNLDEI KLDTVGQAVR DDDADSDTTI IYEVEKTPTK QSGVESKEES PKLPVPVIEI
LDSPAKKGTL DHLIIARTPG KDSQKQSDPE EPVPHVTKPF FVNKTPPSAK KAHQEDSSPS
KEPSSGKSVS KELFPAEPMP STSKQTVPPP PKLVDAEHLI TEMADTLKEA HTPHELKRMA
LDLAQTEREL EREKNKQSRL GVSITEQMRN DCMELLQMFG VPYIVAPMEA EAQCAFLNQL
EMTDGTITDD SDIWLFGGRT VYKNFFNQQK LVLEFTIESI EQQFHMDRKK LIQLALLVGS
DYTTGIHGIG AVTALEILAS FPPTPEQAGE TSEMMSMLSG LRKFRDWWQH GRNGATGARM
ALKSKLKNID IGEGFPSTGV VDAYLRPTVD CSEEAFAWGY PDAERLRDYA RQKFGWTQTK
TNDILLPVLK RLDERKSQAS IKNYFKVQSA VGHNKLKVSK RVQLAVDTMA GKIDPNEEKP
KRKSPAKPKQ PGTRKRKQAA GKDAGGETID LDTIDEENEA GEAPNTKPKN DDDDFVEPSA
KPTKRAPRKM PTDGAGTSGR KKAPAGPKSN EAKPTQSLAN IGGIIANINQ QSADSVTNET
ADARRKRVGN KMPDFDPAIP QRVKDQEEMA ERKRRAAALF KKLKATGKQ
//