ID A0A1A6GF65_NEOLE Unreviewed; 699 AA.
AC A0A1A6GF65;
DT 05-OCT-2016, integrated into UniProtKB/TrEMBL.
DT 05-OCT-2016, sequence version 1.
DT 22-FEB-2023, entry version 26.
DE RecName: Full=Homeobox domain-containing protein {ECO:0000259|PROSITE:PS50071};
GN ORFNames=A6R68_06620 {ECO:0000313|EMBL:OBS64871.1};
OS Neotoma lepida (Desert woodrat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea;
OC Cricetidae; Neotominae; Neotoma.
OX NCBI_TaxID=56216 {ECO:0000313|EMBL:OBS64871.1, ECO:0000313|Proteomes:UP000092124};
RN [1] {ECO:0000313|EMBL:OBS64871.1, ECO:0000313|Proteomes:UP000092124}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=417 {ECO:0000313|EMBL:OBS64871.1};
RC TISSUE=Liver {ECO:0000313|EMBL:OBS64871.1};
RA Campbell M., Oakeson K.F., Yandell M., Halpert J.R., Dearing D.;
RT "The Draft Genome Sequence and Annotation of the Desert Woodrat Neotoma
RT lepida.";
RL Submitted (JUN-2016) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|PROSITE-ProRule:PRU00108,
CC ECO:0000256|RuleBase:RU000682}.
CC -!- SIMILARITY: Belongs to the SIX/Sine oculis homeobox family.
CC {ECO:0000256|ARBA:ARBA00008161}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:OBS64871.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; LZPO01097126; OBS64871.1; -; Genomic_DNA.
DR STRING; 56216.A0A1A6GF65; -.
DR Proteomes; UP000092124; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR GO; GO:0003677; F:DNA binding; IEA:UniProtKB-UniRule.
DR GO; GO:0000981; F:DNA-binding transcription factor activity, RNA polymerase II-specific; IEA:InterPro.
DR CDD; cd00086; homeodomain; 1.
DR Gene3D; 1.10.10.60; Homeodomain-like; 1.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR017970; Homeobox_CS.
DR InterPro; IPR001356; Homeobox_dom.
DR InterPro; IPR031701; SIX1_SD.
DR PANTHER; PTHR10390; HOMEOBOX PROTEIN SIX; 1.
DR PANTHER; PTHR10390:SF40; HOMEOBOX PROTEIN SIX5; 1.
DR Pfam; PF00046; Homeodomain; 1.
DR Pfam; PF16878; SIX1_SD; 1.
DR SMART; SM00389; HOX; 1.
DR SUPFAM; SSF46689; Homeodomain-like; 1.
DR PROSITE; PS00027; HOMEOBOX_1; 1.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
PE 3: Inferred from homology;
KW Developmental protein {ECO:0000256|ARBA:ARBA00022473};
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Homeobox {ECO:0000256|ARBA:ARBA00023155, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242, ECO:0000256|PROSITE-
KW ProRule:PRU00108}; Reference proteome {ECO:0000313|Proteomes:UP000092124}.
FT DOMAIN 199..250
FT /note="Homeobox"
FT /evidence="ECO:0000259|PROSITE:PS50071"
FT DNA_BIND 201..251
FT /note="Homeobox"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00108"
FT REGION 1..74
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 242..286
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 348..429
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 564..614
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 16..30
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 579..606
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 699 AA; 71686 MW; 5457050E4E95E882 CRC64;
MTDSGTHVLD ALRLDIPRGA TEEEEEEARQ LLQTLQAAEG EAAAAGAAEA SAADPGSPSG
LGSPPETAAE APTGLRFSPE QVACVCEALL QAGHAGRLSR FLGALPPAER LRGSDPVLRA
RALVAFQRGE YAELYRLLES RPFPAAHHAF LQDLYLRARY HEAERARGRA LGAVDKYRLR
KKFPLPKTIW DGEETVYCFK ERSRAALKAC YRGNRYPTPD EKRRLATLTG LSLTQVSNWF
KNRRQRDRTG TGGGVPCKSE SDGNPTTEDE SSGSPEDLER GVAPVAAEAP TQSSIFLAGA
TPPATCPASS SILVNGSFLA ASSPPAVLLN GSPVIINSLA LGEASSLGPL LLTGGTAPPP
QPSPQGASEA KTSLVLDPQT GEVRLEEAPS EAPETKGVQG AVPGAAGEEV PGTLPQVVPG
PPPASTFSLT PGAVPSVAAP QVVPLSPSSG YPTGLSPTSP LLNLPQVVPT SQVVTLPQAV
GPIQLLAAGP GSPVKVAATA GPTNVHLINS GVGVTALQLP SATTPGNFLL ANPVSGSPIV
TGVLPPAPSX ALPLKQEPAI TVPEGALPVT PSPALPEGHT LGQISTQPLP PAPAVTSTTS
LPFSPDSSGL LPGFPTPLPE GLMLSHAAVP VWPAGLELST GVEGLGTQAT HTVLRLPDPD
PQGLLLGATG GTEVDEGLEA EAKVLTQLQS VPVEEPLEL
//