ID A0A401GYQ1_9APHY Unreviewed; 2407 AA.
AC A0A401GYQ1;
DT 08-MAY-2019, integrated into UniProtKB/TrEMBL.
DT 08-MAY-2019, sequence version 1.
DT 27-MAR-2024, entry version 22.
DE SubName: Full=Uncharacterized protein {ECO:0000313|EMBL:GBE87281.1};
GN ORFNames=SCP_1005280 {ECO:0000313|EMBL:GBE87281.1};
OS Sparassis crispa.
OC Eukaryota; Fungi; Dikarya; Basidiomycota; Agaricomycotina; Agaricomycetes;
OC Polyporales; Sparassidaceae; Sparassis.
OX NCBI_TaxID=139825 {ECO:0000313|EMBL:GBE87281.1, ECO:0000313|Proteomes:UP000287166};
RN [1] {ECO:0000313|EMBL:GBE87281.1, ECO:0000313|Proteomes:UP000287166}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=30375506; DOI=10.1038/s41598-018-34415-6;
RA Kiyama R., Furutani Y., Kawaguchi K., Nakanishi T.;
RT "Genome sequence of the cauliflower mushroom Sparassis crispa
RT (Hanabiratake) and its association with beneficial usage.";
RL Sci. Rep. 8:16053-16053(2018).
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:GBE87281.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; BFAD01000010; GBE87281.1; -; Genomic_DNA.
DR STRING; 139825.A0A401GYQ1; -.
DR InParanoid; A0A401GYQ1; -.
DR OrthoDB; 1653952at2759; -.
DR Proteomes; UP000287166; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:UniProt.
DR GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
DR GO; GO:0015074; P:DNA integration; IEA:InterPro.
DR CDD; cd00303; retropepsin_like; 1.
DR CDD; cd09274; RNase_HI_RT_Ty3; 1.
DR CDD; cd01647; RT_LTR; 1.
DR Gene3D; 1.10.340.70; -; 1.
DR Gene3D; 3.10.20.370; -; 1.
DR Gene3D; 3.30.70.270; -; 2.
DR Gene3D; 2.40.70.10; Acid Proteases; 1.
DR Gene3D; 3.10.10.10; HIV Type 1 Reverse Transcriptase, subunit A, domain 1; 1.
DR Gene3D; 3.30.420.10; Ribonuclease H-like superfamily/Ribonuclease H; 1.
DR InterPro; IPR016197; Chromo-like_dom_sf.
DR InterPro; IPR043502; DNA/RNA_pol_sf.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR041588; Integrase_H2C2.
DR InterPro; IPR021109; Peptidase_aspartic_dom_sf.
DR InterPro; IPR005162; Retrotrans_gag_dom.
DR InterPro; IPR043128; Rev_trsase/Diguanyl_cyclase.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR000477; RT_dom.
DR InterPro; IPR041577; RT_RNaseH_2.
DR PANTHER; PTHR24559:SF437; RIBONUCLEASE H; 1.
DR PANTHER; PTHR24559; TRANSPOSON TY3-I GAG-POL POLYPROTEIN; 1.
DR Pfam; PF17921; Integrase_H2C2; 1.
DR Pfam; PF03732; Retrotrans_gag; 1.
DR Pfam; PF17919; RT_RNaseH_2; 1.
DR Pfam; PF00665; rve; 1.
DR Pfam; PF00078; RVT_1; 1.
DR SUPFAM; SSF50630; Acid proteases; 1.
DR SUPFAM; SSF54160; Chromo domain-like; 1.
DR SUPFAM; SSF56672; DNA/RNA polymerases; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 2.
DR PROSITE; PS50994; INTEGRASE; 1.
DR PROSITE; PS50878; RT_POL; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000287166};
KW RNA-binding {ECO:0000256|ARBA:ARBA00022884};
KW Transposable element {ECO:0000256|ARBA:ARBA00022464}.
FT DOMAIN 1109..1289
FT /note="Reverse transcriptase"
FT /evidence="ECO:0000259|PROSITE:PS50878"
FT DOMAIN 1659..1823
FT /note="Integrase catalytic"
FT /evidence="ECO:0000259|PROSITE:PS50994"
FT REGION 906..951
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 916..941
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2407 AA; 273539 MW; F9F7A267AB1F482F CRC64;
MRASTAIPTI WDTLHIATTS FELPVSSDVL YLVSHGSLSF ASGSVQITDE GSAGSDVADV
EITAFFDHEH EFALTRVCAL TACGGGERRR YIRKHPLLLA GGTIKFDSRS RLSCPPHPSP
LFINKFETHL PWFAHHVGDL KDSVYFGRVS LRSAFIGYDV SSLFGEHIDV TTTLSAIKGS
FNPSSYLTLH TSNAPISVQV GMLNDPEGNH TVVEIHTSNG PIKANMDLLS AASSGTGGNF
SVDARTPQST QCCACRAHVE LARTRSPAPN VRGRIRACVV VPVHACRRPQ HGRGGPRGVR
TRAPAHIVIL QSLGIATQGN LSTLDLNFIN PTALPKQYID LLPSQAEWDA QRNVHVLNGR
PLLIRPDQHI DHPSGDYEAG LPHNVGGLVL SHIADSGIPI SYPGERPDAP WNPPVQDPVP
QPILAPIISD YIPPMATVTM DEGRHHVSQK ELGFRKPEIF NGSDHSKLRE FINQCKNYMA
GNSHVYQEDN QKIAFVLSHM QGGTAGSWAQ SFIETELTND DFLSYGSWRD FIASVNKAFG
DENIEETART LLRNIKQGTR TADDYIAEFR SLESKAKLED AGNIEYFKWG LNDPLRQRIY
GMESMPKTLD KWYEYTSRFD NQWRSAQIFK RGTTTTTRGK GRSVHRPYYS ASAKDPNAMD
VDRVSISRLS PDERQKRMKE GLCFLCGKKG HIANDRQFHP QSGSFTRART IQPGLDTDTI
TWIKKMREDL AQKKETSKEE TREEQIAYVR NVFNDMTDEE RTQLEKEDVT TEALIDSGAA
ETFMNIHFAE ENDFIRWELP KPITVMNADG TPNQMGTITH CTWKMMKIGG RKTLTRFLLT
GIGKENILLG MPWLKRLNPM INWETGDFWF GKDAKWPPKP TVEDAPDEEV SIFKEDGLPE
LITTETSPTS FRHSESIREI TADNTERGEL PAVRNDTEPD DPPSETPMDQ LDDDDLVISY
IAGEPVIGIF EPIRKESPLT NEETETTVFT IRRGPAIGRM THSTRSPLFC NAQNTVFYKP
SGKSQELAQQ AHKEASAVDK PKTVEELVPN YLHDLKSVFE KKAAERFPET RPWDHAIDLK
PDFIPHDCKV YPLSLKEQGA MDDFLEENLR KGYIRPSKSP MASPFFFVGK KDGALRPCQD
YRYLNEGTVK NAYPLPLISD LMDQFKGASI FTKMDLRSGY NNVRIKDGDQ WKGAFKTNRG
LFEPMVMFFG LCNSPATFQM MMNALFKDMI DEGWIVIYMD DILIFSNDLE EHHVRTRHVL
QRLKDSDLFL KPEKCFFDVK EVEFLGMIIR ENYIGMDPIK LKGIAEWPEP TTVKGVRSFL
GFGNFYRKFI ANFSDIAKPL TNLTRTVAGS PPFEWTTECQ TAFDTLKQRF STAPVLLLPD
KAKPFIVESD ASKFATGAVL RQADINGDLH PCAYISQTLN PAERNYEIYD RELLGIIRAL
TEWRHYLEGS PHPVEVRSDH KNLTYFRTAQ KLNRRQARWS LKLSQFDLHL IHVPGTQMIQ
SDALSRRNGL DDSESDNEDR ILLPNALFVR SISPTLFDEI RNHAAKDLIV HEALEAIAHK
GPFPMKSSLS DWEVRDGVIL YKGKIYVPPS ETLRRDLVRL HHDSPAMGHP GKFNTLELLR
REFWWPGMYT FVYNYVEGCA ACQQMKPNTH PTRIPLEPIP TDPHALPFSC CTTDFITDLP
VSNGFDSIMV VVDHDLTKGV ILTPCLKTIT AEGTAKIFHD KVYSRFGLPD RIISDRGPQY
ASKVFQELNR LLQIRSSMST AYHPQTDGET ERVNQELEIY LRLYCGNNPE TWADRLPDLE
FCHNTREHSA RKMSPFRIMM GYEPRGLPSV FPTTNIPSVE SRLDMLQKIR LEALAMHELA
RQQMADRVRQ GSPKFTLGQK VWLDSRNLKV NYASRKIAPK REGPFEIAEV IGPANYRLKL
PKTWRVHSVF HAALLSPYRE NDIHGPNYMN PPPDVVDNEE EYEVEAILNH KTYRGHLRQL
WNGSLPLIKY VALEHSAFMN MINIAARALD GVTIPNCKCT REEIMDMFKC RLMELQQKLN
SDAVQGLVNL MCNTWQASNT DDYFAITDLW VEETGPGLWS VQTALLSFMQ LSNAHNKKCL
SQALFKIVAQ LQITHKMKFV KHIRHEMGKI YDAKRCRIRC LAHIINLAMQ AVISTYSKSK
HFDPEKLNED LTAPQVTQHD EIGLIRLICV KKHSLAKCKR LFKDIQVKED TKNPLQLLLD
MLVRWSSILL QCINTFIFKL GCKEKDQSKR KKLDELELSD QEWGHVKSFL DLLAHADNAQ
QAFSTDQGLT LHLALPTLKA LHKAWSSRAE CTKYVDFMEA LEADIAKIVE YYERFADSET
YTIAMRASPY HHHSILILTL RHARSTVLDP SEKINYIRKY WNEDLLKEAL EHAEVMVHVH
FTSLHTH
//