ID K7GF76_PELSI Unreviewed; 454 AA.
AC K7GF76;
DT 09-JAN-2013, integrated into UniProtKB/TrEMBL.
DT 09-JAN-2013, sequence version 1.
DT 27-MAR-2024, entry version 71.
DE SubName: Full=Homeobox containing 1 {ECO:0000313|Ensembl:ENSPSIP00000018937.1};
GN Name=HMBOX1 {ECO:0000313|Ensembl:ENSPSIP00000018937.1};
OS Pelodiscus sinensis (Chinese softshell turtle) (Trionyx sinensis).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Testudinata; Testudines; Cryptodira; Trionychia;
OC Trionychidae; Pelodiscus.
OX NCBI_TaxID=13735 {ECO:0000313|Ensembl:ENSPSIP00000018937.1, ECO:0000313|Proteomes:UP000007267};
RN [1] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RG Soft-shell Turtle Genome Consortium;
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RX PubMed=23624526; DOI=10.1038/ng.2615;
RA Wang Z., Pascual-Anaya J., Zadissa A., Li W., Niimura Y., Huang Z., Li C.,
RA White S., Xiong Z., Fang D., Wang B., Ming Y., Chen Y., Zheng Y.,
RA Kuraku S., Pignatelli M., Herrero J., Beal K., Nozawa M., Li Q., Wang J.,
RA Zhang H., Yu L., Shigenobu S., Wang J., Liu J., Flicek P., Searle S.,
RA Wang J., Kuratani S., Yin Y., Aken B., Zhang G., Irie N.;
RT "The draft genomes of soft-shell turtle and green sea turtle yield insights
RT into the development and evolution of the turtle-specific body plan.";
RL Nat. Genet. 45:701-706(2013).
RN [3] {ECO:0000313|Ensembl:ENSPSIP00000018937.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000256|ARBA:ARBA00004123,
CC ECO:0000256|PROSITE-ProRule:PRU00108, ECO:0000256|RuleBase:RU000682}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AGCU01169202; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169203; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169204; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169205; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169206; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169207; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169208; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01169209; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; XP_006135637.1; XM_006135575.1.
DR AlphaFoldDB; K7GF76; -.
DR STRING; 13735.ENSPSIP00000018937; -.
DR Ensembl; ENSPSIT00000019023.1; ENSPSIP00000018937.1; ENSPSIG00000016826.1.
DR GeneID; 102449167; -.
DR CTD; 79618; -.
DR eggNOG; ENOG502QQSR; Eukaryota.
DR GeneTree; ENSGT00940000154928; -.
DR HOGENOM; CLU_052355_1_0_1; -.
DR OMA; CLAVMEX; -.
DR OrthoDB; 5399075at2759; -.
DR TreeFam; TF320327; -.
DR Proteomes; UP000007267; Unassembled WGS sequence.
DR GO; GO:0005813; C:centrosome; IEA:Ensembl.
DR GO; GO:0000781; C:chromosome, telomeric region; IEA:Ensembl.
DR GO; GO:0005829; C:cytosol; IEA:Ensembl.
DR GO; GO:0016604; C:nuclear body; IEA:Ensembl.
DR GO; GO:0003691; F:double-stranded telomeric DNA binding; IEA:Ensembl.
DR GO; GO:0042802; F:identical protein binding; IEA:Ensembl.
DR GO; GO:0044877; F:protein-containing complex binding; IEA:Ensembl.
DR GO; GO:1990837; F:sequence-specific double-stranded DNA binding; IEA:Ensembl.
DR GO; GO:0000122; P:negative regulation of transcription by RNA polymerase II; IEA:Ensembl.
DR GO; GO:0045893; P:positive regulation of DNA-templated transcription; IEA:InterPro.
DR GO; GO:0032212; P:positive regulation of telomere maintenance via telomerase; IEA:Ensembl.
DR CDD; cd00086; homeodomain; 1.
DR CDD; cd00093; HTH_XRE; 1.
DR Gene3D; 1.10.10.60; Homeodomain-like; 1.
DR Gene3D; 1.10.260.40; lambda repressor-like DNA-binding domains; 1.
DR InterPro; IPR001387; Cro/C1-type_HTH.
DR InterPro; IPR040363; HMBOX1.
DR InterPro; IPR006899; HNF-1_N.
DR InterPro; IPR044869; HNF-1_POU.
DR InterPro; IPR044866; HNF_P1.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR001356; Homeobox_dom.
DR InterPro; IPR010982; Lambda_DNA-bd_dom_sf.
DR PANTHER; PTHR14618:SF0; HOMEOBOX-CONTAINING PROTEIN 1; 1.
DR PANTHER; PTHR14618; HOMEODOX-CONTAINING PROTEIN 1 HMBOX1; 1.
DR Pfam; PF04814; HNF-1_N; 1.
DR Pfam; PF00046; Homeodomain; 1.
DR SMART; SM00389; HOX; 1.
DR SUPFAM; SSF46689; Homeodomain-like; 1.
DR SUPFAM; SSF47413; lambda repressor-like DNA-binding domains; 1.
DR PROSITE; PS51937; HNF_P1; 1.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
DR PROSITE; PS51936; POU_4; 1.
PE 4: Predicted;
KW DNA-binding {ECO:0000256|ARBA:ARBA00023125, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Homeobox {ECO:0000256|ARBA:ARBA00023155, ECO:0000256|PROSITE-
KW ProRule:PRU00108};
KW Nucleus {ECO:0000256|ARBA:ARBA00023242, ECO:0000256|PROSITE-
KW ProRule:PRU00108}; Reference proteome {ECO:0000313|Proteomes:UP000007267};
KW Transcription {ECO:0000256|ARBA:ARBA00023163};
KW Transcription regulation {ECO:0000256|ARBA:ARBA00023015}.
FT DOMAIN 18..49
FT /note="HNF-p1"
FT /evidence="ECO:0000259|PROSITE:PS51937"
FT DOMAIN 146..242
FT /note="POU-specific atypical"
FT /evidence="ECO:0000259|PROSITE:PS51936"
FT DOMAIN 266..341
FT /note="Homeobox"
FT /evidence="ECO:0000259|PROSITE:PS50071"
FT DNA_BIND 268..342
FT /note="Homeobox"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00108"
FT REGION 56..121
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 353..424
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 68..121
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 378..417
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 454 AA; 51283 MW; B1DCD28C437D8695 CRC64;
     MLRTFPVVLL ETMSHYTDEP RFTIEQIDLL QRLRRTGMTR HEILHALETL DRLDQEHSDK
     FGRRSSYGGG SYGNNTNNVA ASSSTATAST QTQHSGMSPS PSNSYDTSPQ PCTTNQNGRE
     SNERLSAFNG KMSPTRYPLA NSLAQRSYSF EASEEDLDVD DKVEELMRRD SSVIKEEIKA
     FLANRRISQA VVAQVTGISQ SRISHWLLQQ GSDLSEQKKR AFYRWYQLEK TNPGATLSMR
     PAPVPVEEPE WRQTPPPVTA TSGTFRLRRG SRFTWRKECL AVMESYFNEN QYPDEAKREE
     IANACNAVIQ KPGKKLSDLE RVTSLKVYNW FANRRKEIKR RANIAAILES HGIDVQSPGG
     HSNSDDVDGN DYSEQDTWQV RNGEEEGRCS EGGREAEKVE EDRRICSKQD DSTSHSDHQD
     PISLAVEMAA VNHTILALAR QGTNEIKTEA IDDD
//