ID R9NKJ8_9FIRM Unreviewed; 780 AA.
AC R9NKJ8;
DT 24-JUL-2013, integrated into UniProtKB/TrEMBL.
DT 24-JUL-2013, sequence version 1.
DT 24-JAN-2024, entry version 36.
DE RecName: Full=Glucan-binding protein {ECO:0008006|Google:ProtNLM};
GN ORFNames=C817_01643 {ECO:0000313|EMBL:EOS80517.1};
OS Dorea sp. 5-2.
OC Bacteria; Bacillota; Clostridia; Eubacteriales; Lachnospiraceae; Dorea.
OX NCBI_TaxID=1235798 {ECO:0000313|EMBL:EOS80517.1, ECO:0000313|Proteomes:UP000014211};
RN [1] {ECO:0000313|EMBL:EOS80517.1, ECO:0000313|Proteomes:UP000014211}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=5-2 {ECO:0000313|EMBL:EOS80517.1,
RC ECO:0000313|Proteomes:UP000014211};
RG The Broad Institute Genomics Platform;
RG The Broad Institute Genome Sequencing Center for Infectious Disease;
RA Earl A., Xavier R., Elson C., Duck W., Walker B., Young S., Zeng Q.,
RA Gargeya S., Fitzgerald M., Haas B., Abouelleil A., Allen A.W., Alvarado L.,
RA Arachchi H.M., Berlin A.M., Chapman S.B., Gainer-Dewar J., Goldberg J.,
RA Griggs A., Gujja S., Hansen M., Howarth C., Imamovic A., Ireland A.,
RA Larimer J., McCowan C., Murphy C., Pearson M., Poon T.W., Priest M.,
RA Roberts A., Saif S., Shea T., Sisk P., Sykes S., Wortman J., Nusbaum C.,
RA Birren B.;
RT "The Genome Sequence of Dorea bacterium 5-2.";
RL Submitted (APR-2013) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:EOS80517.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; ASTD01000031; EOS80517.1; -; Genomic_DNA.
DR RefSeq; WP_016218384.1; NZ_KE159748.1.
DR AlphaFoldDB; R9NKJ8; -.
DR STRING; 1235798.C817_01643; -.
DR PATRIC; fig|1235798.3.peg.1750; -.
DR eggNOG; COG5263; Bacteria.
DR HOGENOM; CLU_358932_0_0_9; -.
DR OrthoDB; 9765879at2; -.
DR Proteomes; UP000014211; Unassembled WGS sequence.
DR Gene3D; 2.10.270.10; Cholin Binding; 7.
DR InterPro; IPR018337; Cell_wall/Cho-bd_repeat.
DR Pfam; PF01473; Choline_bind_1; 5.
DR Pfam; PF19127; Choline_bind_3; 3.
DR SUPFAM; SSF69360; Cell wall binding repeat; 4.
DR PROSITE; PS51170; CW; 5.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000014211};
KW Repeat {ECO:0000256|ARBA:ARBA00022737}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..27
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 28..780
FT /note="Glucan-binding protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5004477785"
FT REPEAT 262..281
FT /note="Cell wall-binding"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00591"
FT REPEAT 338..357
FT /note="Cell wall-binding"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00591"
FT REPEAT 577..596
FT /note="Cell wall-binding"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00591"
FT REPEAT 661..680
FT /note="Cell wall-binding"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00591"
FT REPEAT 703..727
FT /note="Cell wall-binding"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00591"
FT REGION 35..177
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 41..115
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 119..174
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 780 AA; 88964 MW; E5671EF7FBEEE685 CRC64;
MKRNQLRFMA AFMAAGLVFA QTPTVLAADN TAVPVENDAG TPQDPDTTVT PDTTQAPDAT
TTPDTTQEPD ATVTPDTTET PDTTETPDIT ETPDTTVTPD TTVTPDTTVT PDTTVTEEPV
VPEETEEPVV PEETEEPVVP EETEEPVVPE ETEEPVVEEE PIEEAPVEEP EVTEDDDQKL
THWSIEDGMW KYENGVWTYV YSDGSLIKDT RVEINGSMFE FDKDGKMLTG WQKTETTTTD
ADGNEVTNTT WHYYDPATGA GHTGWLLIGG TWYYLDYNGD MYDNTDGSRY IDGVEYRFHA
SGAMVTGWYA DQSERTDSEG NKVTETTWYY YDASGAPHSG WLLENGKWYY TDPDGSMAQD
TFTSIGGTNY AFDKSGAMVV GWYSKERENY YGEKEVTWYY CDASGAAHDG WLLENNTWYY
LNNGHMVSDG VREINGTDYL FNKSGAMTSG WYAHMSKEYD KTDETTGKPI YKDVATWYYA
DANGVAQKGW ILDGGKWYFL DRDTHDMYQS RDNTYARSWN IDGKEYRFDK SGAMITGWYL
NTYTYTTPEW DDEKQEYVDV EKTRSDWYYH DTSGAAHKGW LLYNGTWYYT DPDGQMYCKD
KDNEYDDGQR RIDRVLYQFD ENGAMLTGGS TGWRAQEWID EDDDNTKKIT WFYHETSGAI
RTGWLQYNNN WYYLDKYDGQ MYVDEMCGID GRIYLFDKNG IMKTGWYEDT DEGFEGSKYY
FDNSGAMVKG WQEIGGKKYY FATEDGTMYS EGTHTIGGKD YTFDKDGVCT GEVKDDSATE
//