ID A0A1Y2BUR0_9FUNG Unreviewed; 1255 AA.
AC A0A1Y2BUR0;
DT 30-AUG-2017, integrated into UniProtKB/TrEMBL.
DT 30-AUG-2017, sequence version 1.
DT 22-FEB-2023, entry version 16.
DE RecName: Full=CBM1 domain-containing protein {ECO:0000259|PROSITE:PS51164};
GN ORFNames=LY90DRAFT_672583 {ECO:0000313|EMBL:ORY38500.1};
OS Neocallimastix californiae.
OC Eukaryota; Fungi; Fungi incertae sedis; Chytridiomycota;
OC Chytridiomycota incertae sedis; Neocallimastigomycetes; Neocallimastigales;
OC Neocallimastigaceae; Neocallimastix.
OX NCBI_TaxID=1754190 {ECO:0000313|EMBL:ORY38500.1, ECO:0000313|Proteomes:UP000193920};
RN [1] {ECO:0000313|EMBL:ORY38500.1, ECO:0000313|Proteomes:UP000193920}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=G1 {ECO:0000313|EMBL:ORY38500.1,
RC ECO:0000313|Proteomes:UP000193920};
RG DOE Joint Genome Institute;
RA Haitjema C.H., Gilmore S.P., Henske J.K., Solomon K.V., De Groot R.,
RA Kuo A., Mondo S.J., Salamov A.A., Labutti K., Zhao Z., Chiniquy J.,
RA Barry K., Brewer H.M., Purvine S.O., Wright A.T., Boxma B., Van Alen T.,
RA Hackstein J.H., Baker S.E., Grigoriev I.V., O'Malley M.A.;
RT "A Parts List for Fungal Cellulosomes Revealed by Comparative Genomics.";
RL Submitted (AUG-2016) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:ORY38500.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; MCOG01000136; ORY38500.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A1Y2BUR0; -.
DR Proteomes; UP000193920; Unassembled WGS sequence.
DR GO; GO:0005576; C:extracellular region; IEA:InterPro.
DR GO; GO:0030248; F:cellulose binding; IEA:InterPro.
DR GO; GO:0005975; P:carbohydrate metabolic process; IEA:InterPro.
DR InterPro; IPR035971; CBD_sf.
DR InterPro; IPR000254; Cellulose-bd_dom_fun.
DR Pfam; PF00734; CBM_1; 1.
DR SMART; SM00236; fCBD; 1.
DR SUPFAM; SSF57180; Cellulose-binding domain; 1.
DR PROSITE; PS00562; CBM1_1; 1.
DR PROSITE; PS51164; CBM1_2; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000193920};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..20
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 21..1255
FT /note="CBM1 domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5012621207"
FT DOMAIN 1217..1253
FT /note="CBM1"
FT /evidence="ECO:0000259|PROSITE:PS51164"
FT REGION 1165..1210
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1165..1196
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1255 AA; 140828 MW; 8C0D9F2720D33B80 CRC64;
MKINTGVLFL LLGFNSFTYS YPSINTLTTS LPEKEDDSSC FVLAMVTERQ RECNKLGGYF
HDEAFLMKET QKCKTFATCL LPATDEDIEK KSQCVEVDDG FSKGKIYCSV LFDDVTSQGE
NESFLEYAQR IDSFYEVLDL KPVNTSEIES PETGIPSLPE KEDSSSCGIS AMVTERQKEC
NKLGGYFHDE AFLMKETQKC KTFATCFLPA TAEDIEKKSQ CVEVDDGYSK EKIYCSVLFD
DVTSQGENES FLEYAQRIDS FYEVLHLKPV NTSEIESPET EIPSLPEKED RSSCGISAMV
TERQKECNKL GGYFHDEAFL MKETQKCKTF ATCLLPATAE DIEKKSQCVE VDDGFSKGKI
YCSVLFDDVT SQGENESFLE YAQRIDSFYE VLDLKPVNAS EIESPETGIP SLPEKEDRSS
CGISAMVTER QKECNKLGGY FHDEAFLMKE TQKCETFATC LLPATAEDIE KKSQCVEVDD
GYSKGKIYCS VLFDDVTSQG ENESFLEYAQ RIDSFYEVLD LKPVNASEIE SPETGIPSLP
EKEDRSSCGI SAMVTERQKE CNKLGGYFHD EAFLMKETQK CKTFATCLLP ATAEDIEKKS
QCVEVDDGYS KGKIYCSVLF DDVTSQGENE SFLEYAQRID SFYEVLDLKP VNASEIESPE
TGIPSLPEKE DRSSCGISAM VTERQKECNK LGGYFHDEAF LMKETQKCKT FATCLLPATA
EDIEKKSQCV EVDDGYSKGK IYCSVLFDDV TSQGENESFL EYAQRIDSFY EVLDLKPVNA
SEIESPETGI PSLPEKEDRS SCGISAMVTE RQKECNKLGG YFHDEAFLMK ETQKCKTFAT
CFLPATDEDI EKKSQCVEVD DGYSKGKIYC SVLFDDVTSQ GENESFLEYA QRIDSFYEVL
DLKPVNASEI ESPETGIPSL PEKEDRSSCG ISAMVTERQK ECNKLGGYFH DEAFLMKETQ
KCKTFATCFL PATDEDIEKK SQCVEVDDGF SKGKIYCSVL FDDVTSQGEN ESFLEYAQRI
DSFYEVLDLK PVNTSEIESP ETGIPSLPEK EDRSSCGISA MVTERQKECN KLGGYFHDEA
FLMKETQKCK TFATCLLPAT AEDIEKKSQC VEVDDGFSKG KIYCSVLFDD VTSQGENESF
LEYAQRIDSF YEVLDLKPVN ASKLESPETK TTKTIPSSTK SLPTINNVTD EPLVKPEETN
TPNLPEKEDN SVVDDRQCAG KWAQCGGKMF NGPTCCKSGF TCHKFNEYFS QCINF
//