GenomeNet

Database: UniProt
Entry: A0A1I5Y796_9BACT
ID   A0A1I5Y796_9BACT        Unreviewed;      1085 AA.
AC   A0A1I5Y796;
DT   22-NOV-2017, integrated into UniProtKB/TrEMBL.
DT   22-NOV-2017, sequence version 1.
DT   24-JAN-2024, entry version 13.
DE   SubName: Full=Intein N-terminal splicing region {ECO:0000313|EMBL:SFQ40068.1};
GN   ORFNames=SAMN04515674_11767 {ECO:0000313|EMBL:SFQ40068.1};
OS   Pseudarcicella hirudinis.
OC   Bacteria; Bacteroidota; Cytophagia; Cytophagales; Spirosomataceae;
OC   Pseudarcicella.
OX   NCBI_TaxID=1079859 {ECO:0000313|EMBL:SFQ40068.1, ECO:0000313|Proteomes:UP000199306};
RN   [1] {ECO:0000313|EMBL:SFQ40068.1, ECO:0000313|Proteomes:UP000199306}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC   STRAIN=E92,LMG 26720,CCM 7988 {ECO:0000313|Proteomes:UP000199306};
RA   de Groot N.N.;
RL   Submitted (OCT-2016) to the EMBL/GenBank/DDBJ databases.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; FOXH01000017; SFQ40068.1; -; Genomic_DNA.
DR   AlphaFoldDB; A0A1I5Y796; -.
DR   STRING; 1079859.SAMN04515674_11767; -.
DR   OrthoDB; 6225685at2; -.
DR   Proteomes; UP000199306; Unassembled WGS sequence.
DR   GO; GO:0016539; P:intein-mediated protein splicing; IEA:InterPro.
DR   CDD; cd00081; Hint; 1.
DR   Gene3D; 2.170.16.10; Hedgehog/Intein (Hint) domain; 1.
DR   InterPro; IPR003587; Hint_dom_N.
DR   InterPro; IPR036844; Hint_dom_sf.
DR   InterPro; IPR030934; Intein_C.
DR   InterPro; IPR006141; Intein_N.
DR   NCBIfam; TIGR01445; intein_Nterm; 1.
DR   Pfam; PF07591; PT-HINT; 1.
DR   SMART; SM00306; HintN; 1.
DR   SUPFAM; SSF51294; Hedgehog/intein (Hint) domain; 1.
DR   PROSITE; PS50818; INTEIN_C_TER; 1.
DR   PROSITE; PS50817; INTEIN_N_TER; 1.
PE   4: Predicted;
KW   Reference proteome {ECO:0000313|Proteomes:UP000199306};
KW   Signal {ECO:0000256|SAM:SignalP}.
FT   SIGNAL          1..24
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT   CHAIN           25..1085
FT                   /evidence="ECO:0000256|SAM:SignalP"
FT                   /id="PRO_5011527552"
FT   DOMAIN          873..897
FT                   /note="Intein C-terminal splicing"
FT                   /evidence="ECO:0000259|PROSITE:PS50818"
SQ   SEQUENCE   1085 AA;  119974 MW;  4083A42E3EA325EC CRC64;
     MNKIIRYFTL ILIVCLSIVQ FVQAEVSVNV DTRDSASQTF IEKYIEVGSR RLNEMIAQNE
     ANDLNGESYK NIYVIDIAKE MVKSGFLANE EYLKSFTVRN DIVPEKVGKI SLKEYDEAMI
     SANNYLSNIN TTLGAKAKIY MGLNLYMGDV FFTKEVTGVL NPSNFDMVTP GNEKGYTSVK
     TKFNDIYSAI YNNIINKTNV VVISYGRLRE YIAFSEKDKI QAHPVIYGIW DIQKGSVAYL
     PGKNPNLGEV RKNSSNGITK NALEGKLESA VWAVADAILD LDPRFINVST VVTALNSDEK
     KKYQDFFDAL DRNLFDKENT IVLYYDDATI SAYTSKKDEL LKRGKDIIGV KLVGENVVKV
     DFILGKAKGL INNIQISGPS CYQESVVLSA QDEQEYSGFL TKLSLGTNIL PELLRPQRVQ
     YSLYKFIFRF LLCSTNEANV RASCKCAGPV DPSNGDTCDD MNKTCVFVGG AVNGLIDDLD
     IVSAVEGLGT LSLGFAGAVA SVANWGWNRV KDLESPKSWD ISDPQRIQAF CQTLSWKQTE
     DALKVLPSKV QQSAYWKKTM AVVDFINKFS NNTYNRGWAI ATFWTLGEAA AKIVGKLTKK
     QLNKFVKSGD EAIAGVTALK TIWDAEDLVT DGTQIVTKNA LGDATALLDN TPSGIRVLFN
     KFNKASGWFL NKAGEAGITK TYDEATGYLK IFSNGVGSET VAVIRKNGDE AVLEIKKYSD
     DGVDVPFDNS PSMEIIEPNG TKAQTEITVG IDANNNFVCV DGINCFSKGT EIAVEGGHRK
     IEELKPGDLV KSYNASTGRV ILNKVVGIFH KTASKLIKVF IKGDTITTTS EHPFYVNNTW
     LPAEKLYKGL RLLTLSGALI SVDSVATVQE TLSVYNFEVE HSHNYFVGNE EYLVHNACRI
     IDGASDKVQK AYNSASGSKL ANALASSKPM VSSLNKYAYN AHHLIPKEII VKLKGSLFAA
     IDAGFDFNGK VNGKWLKRYV GNGNNKYGDL DGLHTGHSKY NEAIVILFQE FEKLAKNKLP
     NGQFNPQRCF DFCNDLTQEL EKMTNNLQVK RAATPLSTKM TLDDLESEIL NLLTDTSKSL
     NQKYL
//