ID A0A2P6VG52_9CHLO Unreviewed; 1321 AA.
AC A0A2P6VG52;
DT 23-MAY-2018, integrated into UniProtKB/TrEMBL.
DT 23-MAY-2018, sequence version 1.
DT 27-MAR-2024, entry version 14.
DE SubName: Full=Repetin isoform B {ECO:0000313|EMBL:PSC73063.1};
GN ORFNames=C2E20_3663 {ECO:0000313|EMBL:PSC73063.1};
OS Micractinium conductrix.
OC Eukaryota; Viridiplantae; Chlorophyta; core chlorophytes; Trebouxiophyceae;
OC Chlorellales; Chlorellaceae; Chlorella clade; Micractinium.
OX NCBI_TaxID=554055 {ECO:0000313|EMBL:PSC73063.1, ECO:0000313|Proteomes:UP000239649};
RN [1] {ECO:0000313|EMBL:PSC73063.1, ECO:0000313|Proteomes:UP000239649}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=SAG 241.80 {ECO:0000313|EMBL:PSC73063.1,
RC ECO:0000313|Proteomes:UP000239649};
RX PubMed=29178410; DOI=10.1111/tpj.13789;
RA Arriola M.B., Velmurugan N., Zhang Y., Plunkett M.H., Hondzo H.,
RA Barney B.M.;
RT "Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium
RT conductrix SAG 241.80: implications to maltose excretion by a green alga.";
RL Plant J. 93:566-586(2018).
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PSC73063.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; LHPF02000008; PSC73063.1; -; Genomic_DNA.
DR STRING; 554055.A0A2P6VG52; -.
DR Proteomes; UP000239649; Unassembled WGS sequence.
DR Gene3D; 1.20.1280.50; -; 1.
DR Gene3D; 2.120.10.80; Kelch-type beta propeller; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR036047; F-box-like_dom_sf.
DR InterPro; IPR001810; F-box_dom.
DR InterPro; IPR015915; Kelch-typ_b-propeller.
DR InterPro; IPR006652; Kelch_1.
DR PANTHER; PTHR46093; ACYL-COA-BINDING DOMAIN-CONTAINING PROTEIN 5; 1.
DR PANTHER; PTHR46093:SF17; MULTIPLE EGF-LIKE-DOMAINS 8; 1.
DR Pfam; PF01391; Collagen; 2.
DR Pfam; PF12937; F-box-like; 1.
DR Pfam; PF01344; Kelch_1; 1.
DR SUPFAM; SSF81383; F-box domain; 1.
DR SUPFAM; SSF117281; Kelch motif; 1.
PE 4: Predicted;
KW Kelch repeat {ECO:0000256|ARBA:ARBA00022441};
KW Reference proteome {ECO:0000313|Proteomes:UP000239649};
KW Repeat {ECO:0000256|ARBA:ARBA00022737}.
FT DOMAIN 8..47
FT /note="F-box"
FT /evidence="ECO:0000259|Pfam:PF12937"
FT REGION 136..180
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 220..247
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 275..1108
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 223..244
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 327..678
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 771..1104
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1321 AA; 147205 MW; 972C29F8AEF00CDF CRC64;
MEQLGRDACL HIAQSLDARS VLALGACSRE WQEITQDEGL WQQLCRRDQP ALAAQHADAR
AAGAGWRQAY VRAHRLQALR RVAWEEGSPA GWRPRDREGH AACPWGARSM LLHGGFGGGI
AFDLHLLIPD GAAAGGGDGS AGAGSSNAAA SGAAGGSGGA AGGGTSHRWV QPRVAGTSPV
PRYSHTLTRC GRHGEMAVVF GGLMAGGYQV PLDTVAVLRR RPSPAAEEGR SERGEEDTSL
GMRHLGGEHH GNAAVLQQML LEVGLVMIED EPWGHDVVGS DEEDGSTDEE EGEDAGMHDG
VQWLSEVEEE EEEEEEQRRQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE
QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE
QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQGQQE
QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE
QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE
QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQG QQGQQGQQGQ QGSRAAGQQG SRAAGQQGQQ
GQQGQQGQQE QQEQQGSRAA GAAGAAGAAG AAGAAGAAGA GAAGAGAGAA GAAGAAGAAG
AAGAAGAAGA AGAAGAAGAA GAAGAAGAAG AAGAAGAAGA AGAAGAAGQQ GSRGSRGSRA
AGAAGQQEQQ EQQGQQGQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ
QEQQEQQEQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQG SRAGAAGQQG
SRSSRSSRSR SSRSSRSSRS SRSSSRSSRS SRSSRSSRSS RSSRSSRSSR SSRSSRSSRS
SRSSRSSRSS RGQQGQQGQQ GQQGQQGQQG QQGQQGQQGQ QGQQGQQGQQ GQQGQQGQQE
QQGQQGQQGQ QGQQGQQGQQ EQQEQQEQQE QQEQQEQQEQ QEQQEQQEQQ EQQEQQEQQE
QQEQQEQQEQ QEQQEQQEQQ QRRPPRPAAA NGCKIYFFGG IGPASASATL AVLDVETWTF
SRSRLDTSGA PPCARLGHSS CVYGGRLWVV GGGTGRDLLR TGRDLGDVHC LDLETREWRR
LALPPGPLCV GKCHSSVQVG PRLLFFGGGM PTCADLAWLD LDAGTWGEPA EVEGEAPDER
LSATAVLAGD EVLLFGGYSL HQREMGDLHR LRLLPEAGDR RRQGRLGCVA RRRMAQRRGW
W
//