ID A0A2U1P090_ARTAN Unreviewed; 2332 AA.
AC A0A2U1P090;
DT 18-JUL-2018, integrated into UniProtKB/TrEMBL.
DT 18-JUL-2018, sequence version 1.
DT 28-JAN-2026, entry version 14.
DE SubName: Full=Heteroglycan glucosidase 1 {ECO:0000313|EMBL:PWA79137.1};
GN ORFNames=CTI12_AA207860 {ECO:0000313|EMBL:PWA79137.1};
OS Artemisia annua (Sweet wormwood).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC asterids; campanulids; Asterales; Asteraceae; Asteroideae; Anthemideae;
OC Artemisiinae; Artemisia.
OX NCBI_TaxID=35608 {ECO:0000313|EMBL:PWA79137.1, ECO:0000313|Proteomes:UP000245207};
RN [1] {ECO:0000313|EMBL:PWA79137.1, ECO:0000313|Proteomes:UP000245207}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Huhao1 {ECO:0000313|Proteomes:UP000245207};
RC TISSUE=Leaf {ECO:0000313|EMBL:PWA79137.1};
RX PubMed=29703587; DOI=10.1016/j.molp.2018.03.015;
RA Shen Q., Zhang L., Liao Z., Wang S., Yan T., Shi P., Liu M., Fu X., Pan Q.,
RA Wang Y., Lv Z., Lu X., Zhang F., Jiang W., Ma Y., Chen M., Hao X., Li L.,
RA Tang Y., Lv G., Zhou Y., Sun X., Brodelius P.E., Rose J.K.C., Tang K.;
RT "The genome of Artemisia annua provides insight into the evolution of
RT Asteraceae family and artemisinin biosynthesis.";
RL Mol. Plant 11:776-788(2018).
CC -!- SIMILARITY: Belongs to the glycosyl hydrolase 31 family.
CC {ECO:0000256|ARBA:ARBA00007806}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PWA79137.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; PKPP01001897; PWA79137.1; -; Genomic_DNA.
DR STRING; 35608.A0A2U1P090; -.
DR OrthoDB; 1334205at2759; -.
DR Proteomes; UP000245207; Unassembled WGS sequence.
DR GO; GO:0030246; F:carbohydrate binding; IEA:InterPro.
DR GO; GO:0004553; F:hydrolase activity, hydrolyzing O-glycosyl compounds; IEA:InterPro.
DR GO; GO:0005975; P:carbohydrate metabolic process; IEA:InterPro.
DR CDD; cd06604; GH31_glucosidase_II_MalA; 2.
DR CDD; cd14752; GH31_N; 2.
DR Gene3D; 3.20.20.80; Glycosidases; 5.
DR Gene3D; 2.60.40.1760; glycosyl hydrolase (family 31); 2.
DR Gene3D; 2.60.40.1180; Golgi alpha-mannosidase II; 3.
DR InterPro; IPR033403; DUF5110.
DR InterPro; IPR011013; Gal_mutarotase_sf_dom.
DR InterPro; IPR017853; GH.
DR InterPro; IPR030458; Glyco_hydro_31_AS.
DR InterPro; IPR025887; Glyco_hydro_31_N_dom.
DR InterPro; IPR000322; Glyco_hydro_31_TIM.
DR InterPro; IPR013780; Glyco_hydro_b.
DR PANTHER; PTHR22762; ALPHA-GLUCOSIDASE; 1.
DR PANTHER; PTHR22762:SF120; HETEROGLYCAN GLUCOSIDASE 1; 1.
DR Pfam; PF17137; DUF5110; 2.
DR Pfam; PF13802; Gal_mutarotas_2; 2.
DR Pfam; PF01055; Glyco_hydro_31_2nd; 3.
DR SUPFAM; SSF51445; (Trans)glycosidases; 3.
DR SUPFAM; SSF74650; Galactose mutarotase-like; 2.
DR PROSITE; PS00129; GLYCOSYL_HYDROL_F31_1; 1.
PE 3: Inferred from homology;
KW Reference proteome {ECO:0000313|Proteomes:UP000245207}.
FT DOMAIN 78..152
FT /note="Glycoside hydrolase family 31 N-terminal"
FT /evidence="ECO:0000259|Pfam:PF13802"
FT DOMAIN 193..434
FT /note="Glycoside hydrolase family 31 TIM barrel"
FT /evidence="ECO:0000259|Pfam:PF01055"
FT DOMAIN 435..719
FT /note="Glycoside hydrolase family 31 TIM barrel"
FT /evidence="ECO:0000259|Pfam:PF01055"
FT DOMAIN 821..888
FT /note="DUF5110"
FT /evidence="ECO:0000259|Pfam:PF17137"
FT DOMAIN 1255..1330
FT /note="Glycoside hydrolase family 31 N-terminal"
FT /evidence="ECO:0000259|Pfam:PF13802"
FT DOMAIN 1371..1686
FT /note="Glycoside hydrolase family 31 TIM barrel"
FT /evidence="ECO:0000259|Pfam:PF01055"
FT DOMAIN 1795..1862
FT /note="DUF5110"
FT /evidence="ECO:0000259|Pfam:PF17137"
SQ SEQUENCE 2332 AA; 261294 MW; C7D3068648F8EA47 CRC64;
MGEVAGDYMT SDVKPCKMIF EPILEQGVFR FDCSTDTKNA TLPSLSFVNP KDRDTPILSN
HIPSYIPTYE YIAGQQVIYC ELPAGTSLYG TGEVSGQLER TGKRVFTWNT DAWWYGSGTA
SLYQSHPWVL AVLPNGEALG ILADTTTRCE IDLRKESIVK FTATPTFPVI TFGPFASANV
VLTSLSHAIG TVFMPPKWSL GYHQCRWSYD SDLRVREIAK TFREKGIPCD VIWMDVDHMD
GFRSFTFDQE NFPSPKSLAD DLHGIGFKAI WMVEPGIKCE KGYFVYDTGS ENDVWVQTAD
GEPFVVLRSC VSISTGEVWP GPCVFPDYTQ EKARSWWANL VKDYSTTTMP ESNIHRGDAE
LGGHQNHSHY HNVYGMLMAR ATYEGMKLAN PNKRPFVLPR AGFIGSQRYA ATWTGDNFST
WVHLHMSISM VLQLIAKTFR EKGIPCDVIW MDVDHMDGFR SFTFDQENFP SPKSLADDLH
GIGFKAIWMV EPGIKCEKGY FVYDTGSEND VWVQTADGEP FVGEVWPGPC VFPDYTQEKA
RSWWANLVKD YSTTTMPESN IHRGDAELGG HQNHSHYHNV YGMLMARATY EGMKLANPNK
RPFVLPRAGF IGSQRYAATW TGDNFSTWVH LHMSISMVLQ LGLSGQPLSG PDIGGFDGNA
TPKLFGRWFG IGAMFPFCRG HSQKETVDHE PWSFGEECEE VCRLALKRRY RLMPHIYTLF
YLAHTQGSLV AAPTFFADSK DPRLRTNENS FLLGPLLIYA STTSDLGVHE LKHELPSGIW
LSFDFEDSHP DLPALYLRGG SIIPFGPAHQ HVGEANPNDD LSLLVALDEN GKAEGVMFED
DGESYEYING VYLLTAYVAE LRSSVITVSV SKTEGLWKRP NRRLHVHILL GEGAMVDAWG
IDGEVMQIAM PSETEVSKLI SSSRNNYKIR METAKLIPDV DKLSALSLTG TDLSVSPVEV
KYGEWALKVV PWIGGRIISL EHIPTGIHWL RSRVEISGYE EYSGTEYRSA GCTEDRDLAQ
VGQIDSLEME GDVGGGLAIR RNISILKDNP KVFKIDSSLV AQNVGVGSEA YSRVACLRIH
PTFSLTHPTE SYVSFTSING SKHDLWPESG ERFYEGDFRP NGVWMLIDKC LGLGLLNKFS
IDQKLVTKSI FEIKDSRKLF CRKDNKRFIH SLVAVSIMGE VAENYTVDYV KSRKMVFEPI
LEEGVYRFDC SADARNTSFP SLSFVNTEKR DTPLMSNHKF PSYIPTYEQV AAQQITYFEF
PEGTTFYGTG EVSGPLERTG TKIYTWNTDA CEYGSGTTSL YQSHPWVLAI LPDGEALGIL
ADTTKRCEID LRNESIAKLS APLPYPVITF GPFDSATDVL TSLSHAIGTV FMPPKWSLGY
HQCRWSYDSD SRVREIAKTF RDKSIPCDVI WMDIDYMDGF RCFTFDQENY PNPKSLTDDL
HENGFKAIWM LDPGIMHDKG YFVYESGSKN DIWVQTANGK PYVGEVWPGP CVFPDFTQEK
ARLWWANLVK DFISNGVDGI WNDMNEPAVF KSITKTMPEN NIHRGDDELG GHQSHSHYHN
VYGMLMARST YEGMKLANPN KRPFVLTRAG FIGSQRYAAT WTGDNQSSWE HLHMSISMVL
QLGLSGQPLS GPDIGGFACN ATPKLFGRWF GIGAMFPFCR GHSAKETADH EPWSFGKEAN
PTSTEAWRDA GYIRITSALS DTTRFVVYCE WDSKDSRLRT NENSFLLGPL IVYASTANDL
GVHEVKHEMP SGIWKSFDFA DSHPDLPAMY LRGGSIIPFG PAHQHVGEAN PNDDLSLLVS
LDKNGKAEGI LFEDDGDGYG YINRDYLLTT YVAELNSSVI TVGVSKTEGL WKRTNRRLHV
HILLGDGAMV DAWGTDGEDL QIIMPSENKV AKLISDSKNN HKIRMETAKC IPDVENESGR
KGIQLSEIPV DIKGGEWALK VVPWIGGRII SMEHLPTGTQ WLHSKVESNG YEEYSGTEYC
SAGSAEEYIV VDRDLEETGE IESLKLEGDV GGGLVIERNI HISEDNPKVF NIVSCLVAHN
VSAGSGGFSR LVCLRIHPTF SLLHPTQSYV SFTSINGSMH DVWPESGEQL YEGDLRPNAE
WMLVDKCLGL GLVNKFSIDQ VYKCLIHWNS RTVNLELWSQ ERPVSKESPL SISHNYEVRR
IPRDLEETGE IESLKLEGDV GGGLVIERNI HISEDNRKVF NIVSSLVAHN VSAGSGGFSR
LVCLRIHPTF SLLHPTQSYV SFTSINGSMH DVWPESGQQL YEGDLRPNGE WMLVNKCLGL
GLVNKFSIDQ VYKCLIHWNS GTVNLELWSQ ERPVSKESPL SISHNYEVMR IQ
//