ID R5GJE6_9BACT Unreviewed; 970 AA.
AC R5GJE6;
DT 24-JUL-2013, integrated into UniProtKB/TrEMBL.
DT 24-JUL-2013, sequence version 1.
DT 27-MAR-2024, entry version 26.
DE RecName: Full=Sulfatase N-terminal domain-containing protein {ECO:0000259|Pfam:PF00884};
GN ORFNames=BN773_01781 {ECO:0000313|EMBL:CCY15839.1};
OS Prevotella sp. CAG:755.
OC Bacteria; Bacteroidota; Bacteroidia; Bacteroidales; Prevotellaceae;
OC Prevotella.
OX NCBI_TaxID=1262935 {ECO:0000313|EMBL:CCY15839.1, ECO:0000313|Proteomes:UP000018353};
RN [1] {ECO:0000313|EMBL:CCY15839.1, ECO:0000313|Proteomes:UP000018353}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=MGS:755 {ECO:0000313|Proteomes:UP000018353};
RA Nielsen H.B., Almeida M., Juncker A.S., Rasmussen S., Li J., Sunagawa S.,
RA Plichta D., Gautier L., Le Chatelier E., Peletier E., Bonde I., Nielsen T.,
RA Manichanh C., Arumugam M., Batto J., Santos M.B.Q.D., Blom N., Borruel N.,
RA Burgdorf K.S., Boumezbeur F., Casellas F., Dore J., Guarner F., Hansen T.,
RA Hildebrand F., Kaas R.S., Kennedy S., Kristiansen K., Kultima J.R.,
RA Leonard P., Levenez F., Lund O., Moumen B., Le Paslier D., Pons N.,
RA Pedersen O., Prifti E., Qin J., Raes J., Tap J., Tims S., Ussery D.W.,
RA Yamada T., MetaHit consortium, Renault P., Sicheritz-Ponten T., Bork P.,
RA Wang J., Brunak S., Ehrlich S.D.;
RT "Dependencies among metagenomic species, viruses, plasmids and units of
RT genetic variation.";
RL Submitted (NOV-2012) to the EMBL/GenBank/DDBJ databases.
CC -!- PTM: The conversion to 3-oxoalanine (also known as C-formylglycine,
CC FGly), of a serine or cysteine residue in prokaryotes and of a cysteine
CC residue in eukaryotes, is critical for catalytic activity.
CC {ECO:0000256|PIRSR:PIRSR600917-52}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:CCY15839.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CAXR010000100; CCY15839.1; -; Genomic_DNA.
DR AlphaFoldDB; R5GJE6; -.
DR STRING; 1262935.BN773_01781; -.
DR Proteomes; UP000018353; Unassembled WGS sequence.
DR CDD; cd16144; ARS_like; 1.
DR Gene3D; 3.30.1120.10; -; 1.
DR Gene3D; 3.40.720.10; Alkaline Phosphatase, subunit A; 1.
DR InterPro; IPR017850; Alkaline_phosphatase_core_sf.
DR InterPro; IPR000917; Sulfatase_N.
DR PANTHER; PTHR42693; ARYLSULFATASE FAMILY MEMBER; 1.
DR PANTHER; PTHR42693:SF47; N-ACETYL-GALACTOSAMINE-6-SULFATASE (GALNS); 1.
DR Pfam; PF00884; Sulfatase; 1.
DR SUPFAM; SSF53649; Alkaline phosphatase-like; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000018353};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..18
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 19..970
FT /note="Sulfatase N-terminal domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5004380451"
FT DOMAIN 29..396
FT /note="Sulfatase N-terminal"
FT /evidence="ECO:0000259|Pfam:PF00884"
FT MOD_RES 83
FT /note="3-oxoalanine (Ser)"
FT /evidence="ECO:0000256|PIRSR:PIRSR600917-52"
SQ SEQUENCE 970 AA; 108869 MW; BA9A3D2AE1F0B5F8 CRC64;
MRNTTPLRLG LLSGGAMAFA MSMGAQTRPN IILFLVDDMG WQETSLPFWT EKTPLNERYR
TPNMEKLAEA GVKFTDAYAC AISSPSRASL MSGMNAARHR VTNWTLNYNT KTDAGSSVIE
LPDWNYNGIQ PATTTNPRDT VNSALVTSLP QVLHDNGYYT IHCGKAHYGS RSTTGADPLT
MGYDVNIAGS EAGGPGSYLP PYGNSNYPVP GLDDYAKDNV FLTEALTQEA IKKLTNFLDN
NESNQPFYLY MSHYAIHVPY TEDTRFSGNY KNKVDPMLGV QLNTSEINHA ALVEGMDKSL
GDIREFLESR PGLAENTIII FMSDNGGQAV SVRQGTQNRD QNYPLRGGKG SSYMGGVREP
MIVYWPGVTD QYAGTSNDSR VMIEDFYPTI LEMAGVTDYE TVQHVDGRSI VDIIKSNTQG
RDRVNIWHFP NLWGESQSRD EGYGAYSAIM KGDYHLLYFW ETQERRLYNI KEDISEENNL
IDELPDVARE LSIELADSLR SYGAQRPSFK ATGEVAPWPD DPLLVAEPGT VLTPDDRIFQ
YSDDMQKHYY RIVDNQFSAG GIHRNAYWTQ GEHYGYKAIQ ASTTLNRTER DGLSLQLFYF
EKGSDENHFR IKTLDGQNVD YVDGTTSATW NDGKPDEATE TVVDKYLQYG TPTAGEFVIR
KSAEDNYYLI GQGDELMNNR GSDYGVDASM KWVVNDYGGS VEEVAANKGS QYKFELYDPN
ASVGEGEVAE PFEGIFRYSN DSVRYYYNIR DSRPEEFFWT QGEHYDNPTI QISSEEFTGD
DAEKQLFYFM EGRNSRFFTI YTYDGQPLIF TSGTTVSSWD EAQKPEADRT TVTQRYVQFG
DGTPSQFQLV KTSNNSTYGL QVLNTLLNNR GSANGEAANM LWVVNGYSGN SVSDAGSRYR
FIPRSSKIVT GIKGVELPVD RPMTRAELAK MGDKVQVYDL RGIRVRDIRN ADRGLYIIRT
ATDAYKVALN
//