ID A0A445F1Z4_GLYSO Unreviewed; 1134 AA.
AC A0A445F1Z4;
DT 08-MAY-2019, integrated into UniProtKB/TrEMBL.
DT 08-MAY-2019, sequence version 1.
DT 27-MAR-2024, entry version 14.
DE SubName: Full=DNA mismatch repair protein MSH1, mitochondrial {ECO:0000313|EMBL:RZB42802.1};
GN ORFNames=D0Y65_053404 {ECO:0000313|EMBL:RZB42802.1};
OS Glycine soja (Wild soybean).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; fabids; Fabales; Fabaceae; Papilionoideae; 50 kb inversion clade;
OC NPAAA clade; indigoferoid/millettioid clade; Phaseoleae; Glycine;
OC Glycine subgen. Soja.
OX NCBI_TaxID=3848 {ECO:0000313|EMBL:RZB42802.1, ECO:0000313|Proteomes:UP000289340};
RN [1] {ECO:0000313|EMBL:RZB42802.1, ECO:0000313|Proteomes:UP000289340}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. W05 {ECO:0000313|Proteomes:UP000289340};
RC TISSUE=Hypocotyl of etiolated seedlings {ECO:0000313|EMBL:RZB42802.1};
RA Xie M., Chung C.Y.L., Li M.-W., Wong F.-L., Chan T.-F., Lam H.-M.;
RT "A high-quality reference genome of wild soybean provides a powerful tool
RT to mine soybean genomes.";
RL Submitted (SEP-2018) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:RZB42802.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; QZWG01000020; RZB42802.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A445F1Z4; -.
DR OrthoDB; 36950at2759; -.
DR Proteomes; UP000289340; Chromosome 20.
DR GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW.
DR GO; GO:0030983; F:mismatched DNA binding; IEA:InterPro.
DR GO; GO:0006298; P:mismatch repair; IEA:InterPro.
DR CDD; cd03243; ABC_MutS_homologs; 1.
DR Gene3D; 3.40.1170.10; DNA repair protein MutS, domain I; 1.
DR Gene3D; 3.40.50.300; P-loop containing nucleotide triphosphate hydrolases; 1.
DR InterPro; IPR007695; DNA_mismatch_repair_MutS-lik_N.
DR InterPro; IPR000432; DNA_mismatch_repair_MutS_C.
DR InterPro; IPR016151; DNA_mismatch_repair_MutS_N.
DR InterPro; IPR035901; GIY-YIG_endonuc_sf.
DR InterPro; IPR027417; P-loop_NTPase.
DR PANTHER; PTHR48360; MUTL PROTEIN ISOFORM 1; 1.
DR PANTHER; PTHR48360:SF1; MUTL PROTEIN ISOFORM 1; 1.
DR Pfam; PF01624; MutS_I; 1.
DR Pfam; PF00488; MutS_V; 1.
DR SMART; SM00534; MUTSac; 1.
DR SUPFAM; SSF55271; DNA repair protein MutS, domain I; 1.
DR SUPFAM; SSF82771; GIY-YIG endonuclease; 1.
DR SUPFAM; SSF52540; P-loop containing nucleoside triphosphate hydrolases; 1.
DR PROSITE; PS00486; DNA_MISMATCH_REPAIR_2; 1.
PE 4: Predicted;
KW ATP-binding {ECO:0000256|ARBA:ARBA00022840};
KW DNA damage {ECO:0000256|ARBA:ARBA00022763};
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125};
KW Nucleotide-binding {ECO:0000256|ARBA:ARBA00022741};
KW Reference proteome {ECO:0000313|Proteomes:UP000289340}.
FT DOMAIN 844..860
FT /note="DNA mismatch repair proteins mutS family"
FT /evidence="ECO:0000259|PROSITE:PS00486"
SQ SEQUENCE 1134 AA; 126358 MW; 60D5DE4A49EA34DD CRC64;
MFRLATRNVA LFLPRWCSLA RFSPSPPFPF LISSLPSRFL RINGHVKNVT SYAEKKVSRG
STKATKKPKV PNNNGLDDKD LPHILWWKER LQMCRKLSTV QLIERLEFSN LLGLNSNLKN
GSLKEGTLNW EMLQFKSKFP RQVLLCRVGE FYEAWGIDAC ILVEYVGLNP IGGLRSDSIP
RAGCPVVNLR QTLDDLTTNG YSVCIVEEAQ GPSQARSRKR RFISGHAHPG NPYVYGLATV
DHDLNFPEPM PVVGISHSAR GYCINMVLET MKTYSSEDCL TEEAVVTKLR TCQYHHLFLH
TSLRQNSSGT CDWGEFGEGG LLWGECSSRH FEWFDGNPIS DLLAKVKELY SLDEEVTFRN
ATVYSGNRAR PLTLGTSTQI GAIPTEGIPS LLKVLLSRNC NGLPALYIRD LLLNPPSYEI
ASKIQATCKL MSSVTCSIPE FTCVSSAKLV KLLEWREVNH MEFCRIKNVL DEILLMNKTS
ELNDILKHLI DPTWVATGLE IDFETLVAGC EVASTKIGDI ISLDGGNDQK INSFSLIPHE
FFEDIESKWK GRIKRIHIDD VFTAVEKAAE ALHIAVTEDF VPILSRIKAT VSPLGGPKGE
ISYAREHEAV WFKGKRFTPN LWAGSPGEEQ IKQLSHALDS KGKKAGEEWF TTLKVEAALT
RYHEANGKAK ERVLEILRGL AAELQYNINI LVFSSTLLVI AKALFAHASE GRRRRWVFPT
LVESLGFEDV KSLNKIHGMK IVGLLPYWLH VAEGVVRNDV DMQSLFLLTG PNGGGKSSLL
RSICAAALLG ICGLMVPAES AHIPYFDSIM LHMNSYDSPA DKKSSFQVEM SELRSIIGGT
TKKSLVLIDE ICRGTETAKG TCIAGSIIET LDRIGCLGIV STHLHGIFTL HLNINNTVHK
AMGTTSIDGQ TIPTWKLTDG VCRESLAFET ARREGVPELI IRRAEYIYQS VYAKEKELLS
AEKSSNEKKY STYINVSNLN GTHLPSKRFL SGANQTEVLR EEVESAVTVI CQDHIMEQKS
KKIALELTGI KCLQIRTREQ PPPSVVGSSS VYVMFRPDKK LYVGETDDLE GRVRAHRLKE
GMHDASFLYF LVPGKSLACQ LESLLINQLS SRGFQLTNTA DGKHRNFGTS NLYA
//