ID A0A2P6V9B7_9CHLO Unreviewed; 902 AA.
AC A0A2P6V9B7;
DT 23-MAY-2018, integrated into UniProtKB/TrEMBL.
DT 23-MAY-2018, sequence version 1.
DT 24-JAN-2024, entry version 19.
DE RecName: Full=cysteine dioxygenase {ECO:0000256|ARBA:ARBA00013133};
DE EC=1.13.11.20 {ECO:0000256|ARBA:ARBA00013133};
GN ORFNames=C2E20_5780 {ECO:0000313|EMBL:PSC70679.1};
OS Micractinium conductrix.
OC Eukaryota; Viridiplantae; Chlorophyta; core chlorophytes; Trebouxiophyceae;
OC Chlorellales; Chlorellaceae; Chlorella clade; Micractinium.
OX NCBI_TaxID=554055 {ECO:0000313|EMBL:PSC70679.1, ECO:0000313|Proteomes:UP000239649};
RN [1] {ECO:0000313|EMBL:PSC70679.1, ECO:0000313|Proteomes:UP000239649}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=SAG 241.80 {ECO:0000313|EMBL:PSC70679.1,
RC ECO:0000313|Proteomes:UP000239649};
RX PubMed=29178410; DOI=10.1111/tpj.13789;
RA Arriola M.B., Velmurugan N., Zhang Y., Plunkett M.H., Hondzo H.,
RA Barney B.M.;
RT "Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium
RT conductrix SAG 241.80: implications to maltose excretion by a green alga.";
RL Plant J. 93:566-586(2018).
CC -!- SIMILARITY: Belongs to the cysteine dioxygenase family.
CC {ECO:0000256|ARBA:ARBA00006622}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PSC70679.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; LHPF02000018; PSC70679.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A2P6V9B7; -.
DR Proteomes; UP000239649; Unassembled WGS sequence.
DR GO; GO:0017172; F:cysteine dioxygenase activity; IEA:UniProtKB-EC.
DR GO; GO:0003700; F:DNA-binding transcription factor activity; IEA:InterPro.
DR GO; GO:0005506; F:iron ion binding; IEA:InterPro.
DR CDD; cd14686; bZIP; 1.
DR CDD; cd10548; cupin_CDO; 1.
DR Gene3D; 1.20.5.170; -; 1.
DR Gene3D; 2.60.120.10; Jelly Rolls; 1.
DR InterPro; IPR004827; bZIP.
DR InterPro; IPR010300; CDO_1.
DR InterPro; IPR014710; RmlC-like_jellyroll.
DR InterPro; IPR011051; RmlC_Cupin_sf.
DR PANTHER; PTHR12918; CYSTEINE DIOXYGENASE; 1.
DR PANTHER; PTHR12918:SF1; CYSTEINE DIOXYGENASE TYPE 1; 1.
DR Pfam; PF05995; CDO_I; 1.
DR SMART; SM00338; BRLZ; 1.
DR SUPFAM; SSF51182; RmlC-like cupins; 1.
PE 3: Inferred from homology;
KW Dioxygenase {ECO:0000256|ARBA:ARBA00022964, ECO:0000313|EMBL:PSC70679.1};
KW Iron {ECO:0000256|ARBA:ARBA00023004};
KW Metal-binding {ECO:0000256|ARBA:ARBA00022723};
KW Oxidoreductase {ECO:0000256|ARBA:ARBA00023002};
KW Reference proteome {ECO:0000313|Proteomes:UP000239649}.
FT DOMAIN 400..464
FT /note="BZIP"
FT /evidence="ECO:0000259|SMART:SM00338"
FT REGION 385..408
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 749..777
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 878..902
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 385..400
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 902 AA; 95029 MW; 56AA288D47E2D068 CRC64;
MAVNVQQRRA LLGALETGGG EAFKPYVHFG EEHYVRNLVF ANQDFELLVL CWRPGQGSRV
HNHADSHGWV TVLAGRVEEA RYINPVLDDS LPPSPHEAPA EPPALPGVLS ATAPCPPLAE
TGRVVGGPGA QLYINDGMAL HAVRCADDSS EAGAVTLVRL YEPEADRVVV RDMPIGAWTF
VPQRQTDEFR RSVNAAALRT DRLGLQTVGS LWMLLVVAKP LLQERLPWLA LRAASHSVLA
AALVALLAWR PATWARHREV FVSLFLLHAC ITTLDGALHG GTNILERHQG SSLLLLVLLL
VCICLQHRPV GGTLLFVPRL CVAANAVVLP AVVLVALAGS HRVCLRLLEA PGVEEPLSVL
YAWVALLHGA VTSAGVLPQA QAEEAEAAEE EWGGSEEEEE EEERRAGRRT LNRLAAARYR
QRQKAEEQQL VAQLAAGAEE RVALQAERHQ LAVDRGALLG WEAVQHDLHT CMQQLGVSGS
SPFDVAVASA QAMQPAALPA ALALAAGSGS AAAGGSGNAT GVAAAAGAGP AGGAAPEAAG
SAADAARLVA SMEREFAAEL DQVVQAEDPK AAADAVLQLL PPDVHPVLDA SAAAFVEQVK
QMQRSIQALV SAAQREYAGG GGHQPPAIAR AVEQCALLRK LLLALSLARP DPDKLLRLIA
TSAEAPPGTW ERAAEALRPS LTAEQLAGLH AARQTWVRTM GPLLAARGAQ LSRLESLQAA
LVALRAGGEA GPAEEARWTA VADAAGALAL PPSGQGQQGG AGEGVHGGGL PAAGQGSPSA
RELAAAMQVS ADVAARDYTG MWAAAQEREQ LQRELRHLSD SYKRGWTCLV RFLLQILQTI
SLLQCSLMLS ACAPHFPDLV QIACVLLEGQ GPAQPVQLQQ VAEQVEGQEQ QGQQEQDEQA
EG
//