ID A0A267G9P6_9PLAT Unreviewed; 2011 AA.
AC A0A267G9P6;
DT 22-NOV-2017, integrated into UniProtKB/TrEMBL.
DT 22-NOV-2017, sequence version 1.
DT 27-MAR-2024, entry version 20.
DE RecName: Full=RNA-directed DNA polymerase {ECO:0000256|ARBA:ARBA00012493};
DE EC=2.7.7.49 {ECO:0000256|ARBA:ARBA00012493};
GN ORFNames=BOX15_Mlig005135g2 {ECO:0000313|EMBL:PAA82760.1};
OS Macrostomum lignano.
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes;
OC Rhabditophora; Macrostomorpha; Macrostomida; Macrostomidae; Macrostomum.
OX NCBI_TaxID=282301 {ECO:0000313|EMBL:PAA82760.1, ECO:0000313|Proteomes:UP000215902};
RN [1] {ECO:0000313|EMBL:PAA82760.1, ECO:0000313|Proteomes:UP000215902}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=DV1 {ECO:0000313|EMBL:PAA82760.1};
RC TISSUE=Whole organism {ECO:0000313|EMBL:PAA82760.1};
RA Berezikov E.;
RT "A platform for efficient transgenesis in Macrostomum lignano, a flatworm
RT model organism for stem cell research.";
RL Submitted (JUN-2017) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PAA82760.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; NIVC01000452; PAA82760.1; -; Genomic_DNA.
DR STRING; 282301.A0A267G9P6; -.
DR Proteomes; UP000215902; Unassembled WGS sequence.
DR GO; GO:0004190; F:aspartic-type endopeptidase activity; IEA:InterPro.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0015074; P:DNA integration; IEA:InterPro.
DR GO; GO:0006508; P:proteolysis; IEA:InterPro.
DR CDD; cd00303; retropepsin_like; 1.
DR CDD; cd09274; RNase_HI_RT_Ty3; 1.
DR CDD; cd01647; RT_LTR; 1.
DR Gene3D; 1.10.340.70; -; 1.
DR Gene3D; 3.10.20.370; -; 1.
DR Gene3D; 3.30.70.270; -; 2.
DR Gene3D; 2.40.70.10; Acid Proteases; 1.
DR Gene3D; 3.10.10.10; HIV Type 1 Reverse Transcriptase, subunit A, domain 1; 1.
DR Gene3D; 2.40.50.140; Nucleic acid-binding proteins; 1.
DR Gene3D; 3.30.420.10; Ribonuclease H-like superfamily/Ribonuclease H; 1.
DR InterPro; IPR043502; DNA/RNA_pol_sf.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR041588; Integrase_H2C2.
DR InterPro; IPR012340; NA-bd_OB-fold.
DR InterPro; IPR019103; Peptidase_aspartic_DDI1-type.
DR InterPro; IPR021109; Peptidase_aspartic_dom_sf.
DR InterPro; IPR043128; Rev_trsase/Diguanyl_cyclase.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR000477; RT_dom.
DR InterPro; IPR041373; RT_RNaseH.
DR PANTHER; PTHR37984:SF7; INTEGRASE CATALYTIC DOMAIN-CONTAINING PROTEIN; 1.
DR PANTHER; PTHR37984; PROTEIN CBG26694; 1.
DR Pfam; PF09668; Asp_protease; 1.
DR Pfam; PF17921; Integrase_H2C2; 1.
DR Pfam; PF17917; RT_RNaseH; 1.
DR Pfam; PF00665; rve; 1.
DR Pfam; PF00078; RVT_1; 1.
DR SUPFAM; SSF50630; Acid proteases; 1.
DR SUPFAM; SSF56672; DNA/RNA polymerases; 1.
DR SUPFAM; SSF50249; Nucleic acid-binding proteins; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 1.
DR PROSITE; PS50994; INTEGRASE; 1.
DR PROSITE; PS50878; RT_POL; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000215902}.
FT DOMAIN 806..985
FT /note="Reverse transcriptase"
FT /evidence="ECO:0000259|PROSITE:PS50878"
FT DOMAIN 1474..1632
FT /note="Integrase catalytic"
FT /evidence="ECO:0000259|PROSITE:PS50994"
FT REGION 1..66
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 297..369
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1274..1329
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1784..1881
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1911..1963
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1..51
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 297..340
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 341..355
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1788..1813
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1829..1859
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1913..1931
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1945..1959
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2011 AA; 224010 MW; CCAC34111FAEA9BB CRC64;
MSSSNEESDE ALSPNRAAGQ GQQVLQLQTE SSTSTQPEAG ASSQQAAQVG DTSMDAKPPH
GMAEQSDLSK VLQTLTMILE KTCDREASAA AEKLSNSSPK LRPPPPFQPQ MNFELWETQF
LTYVEAHHVP TPQQPAVLLS MLAPECQEFI FNNSLHRESI QTALKSLKDR FTDSGTEHKS
YLELAVAEQL NSETTQEFFD RLTRLSKSAN LKKPTAEPLV RARFCYGLRD KGLAERLILL
GQENPTWSAE DLAKKAREIA GVKKLLSSEV TPVIRAAAPD RTAELEARIR ELEKLVQQNQ
TTQSSDHNSK DFQNFPSQVP PVNLGQSGQY KNSSRWSKSP KRCNRCSDHD HLTADCPKPP
PSNGVGAGGR PTGMAPMMEI GTITVWKGGF GFIQPSQMPD SLFCHYSSLD NELKQVANSH
QLVGTTVQFL RRADPGHKHD RAVGVCRYQM PVRCDQSWAA DTPVESSMFV QLGIGDNKVI
ACLDSGCNAS VMGTKVRDQL IANDPINNTF IPARKYQAAT AFNGTSTQIL GTLHCEVKLG
KISQTHVFLV TEGQVTDVLL GIDFMRGFNV NLLVADGQVT VAGEVVPVIE GDRGVFQARR
VTAACNVQLP PRSETMVPSR VIGADPGSPG FLGTKLFCSD SEFFVAHSVD SVSSTGHVLV
RLLNSGNQPS EIAAGQEIAS FQSLRVGETI SHTALTEKDF SVSSSKDLGT KLARLPDPDL
GTASWLKKQF KLEESTLSET QLTITLKLLN YFLSAISKSD TDLGKTHLHE MAIELNEPTC
RPISQPSRRT TPAQKEFLDR HINQLIEQGV IEESNSEWAS PIVLVRKKDG SQRLCIDYRS
VNKVIKPCTF PLPLIEESFD SLAGSKVFSS LDLTSAYWQV GIREQDRDLT AFTCPQGTYR
WKVVPFGIKT APANFAKLMH KVFRPLLMRV SLIYLDDIII KARDNDEMIV HLALVLWQLK
KANLKIKPSK CELFKDRVRF LGFVVSSQGI EADPTKIDAV RNWCKPTDKR QLLAFLGFAN
YYRKFVARFS VIASPLYSLA KGSSKFCWLE HHDRAFEELK ANLCAPPVLA YPDLKPSAGP
FILDCDSSLE GVGGVLSQEG PDGQEHVVAY ASKKFTHSQR NYCATMRELL GLVIMLEHFR
TYLLDRSFLV RSDAAALQWL HTKKHSTGML AQWLATIDVY NFKVCQTPLE RLAEYDFSVE
HRPGRDHVNA DMLSRNPRFR SDHHDCPTCS KIPAFQDKFR KLEEQRCQDE DEDTDSMPAV
QLRTAQIDAC CQTEPHQSNQ EPKYPSEILT SHEGSDEVRS HRSLSLPSTA QEPDPEGETS
EDEMSENSEN LWAREAQESS TDLVRLIEAV EGKSPRLTKV EVQDCSREIR TLWAMFDELE
VVNGLLTRVK PAKQGQVSQR LIIIPETADV DELVSCYHRD INHCGINKTV RALKQRFQIT
NLEVFVKGII AECEVCCRTK KNKRKNKEPL EPILSGYPNQ IVHVDHAGPF PEQEGNRYLL
VLVDNFSGYV EIVPVPDVSA ATTATVILTE WIVRYGTMEK LVSDNGTAFI NKTVAELTRL
LEIDMARITA HRPQSNGKAE RTIQSIKAQI RAICLEKRVN WVYAARLAGL SIRTTVAEST
GWTPARLFFG RELVLPLDLL FNPPRHDRYD PDCYAHKLHK LLVEVSAIAR MERGKAQQRQ
KRNYDRNTIH SNFEIGELVW VLNPDHRGLD SAVWLGPYKV VQKVSERNYR LIPEFARINP
ICQTPNPSHN IYNVDRMKRC IRKQTYQDIA DRFELRVPAL IPPSRLGQLD DQSESRSDSW
GSSATTCSEP EPEAEVPNTQ DDEAPDPPHT SAPDPPHTPA PDPPRTSAPD PPHTSAPDPP
RTVAPDFLCT SSPDRLARNQ SPSLPSAFVE VADSGNLQQD YQVPEIGAEV EVPQNNTEEQ
LPESRVHESS TTMPAMDGLR PESPTSGSHC IQETPPAQDT VDSDKVEAFP AGPVTTSITT
TRSGRLVRKP QRFLNTIVRL CSRNGHLSNS A
//