ID A0A267GC51_9PLAT Unreviewed; 1883 AA.
AC A0A267GC51;
DT 22-NOV-2017, integrated into UniProtKB/TrEMBL.
DT 22-NOV-2017, sequence version 1.
DT 24-JAN-2024, entry version 16.
DE RecName: Full=CPSF_A domain-containing protein {ECO:0008006|Google:ProtNLM};
DE Flags: Fragment;
GN ORFNames=BOX15_Mlig025587g1 {ECO:0000313|EMBL:PAA83601.1};
OS Macrostomum lignano.
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes;
OC Rhabditophora; Macrostomorpha; Macrostomida; Macrostomidae; Macrostomum.
OX NCBI_TaxID=282301 {ECO:0000313|EMBL:PAA83601.1, ECO:0000313|Proteomes:UP000215902};
RN [1] {ECO:0000313|EMBL:PAA83601.1, ECO:0000313|Proteomes:UP000215902}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=DV1 {ECO:0000313|EMBL:PAA83601.1};
RC TISSUE=Whole organism {ECO:0000313|EMBL:PAA83601.1};
RA Berezikov E.;
RT "A platform for efficient transgenesis in Macrostomum lignano, a flatworm
RT model organism for stem cell research.";
RL Submitted (JUN-2017) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: Belongs to the CPSF1 family.
CC {ECO:0000256|ARBA:ARBA00038446}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PAA83601.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; NIVC01000413; PAA83601.1; -; Genomic_DNA.
DR STRING; 282301.A0A267GC51; -.
DR Proteomes; UP000215902; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:InterPro.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR Gene3D; 2.130.10.10; YVTN repeat-like/Quinoprotein amine dehydrogenase; 3.
DR InterPro; IPR004871; Cleavage/polyA-sp_fac_asu_C.
DR InterPro; IPR018846; Cleavage/polyA-sp_fac_asu_N.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR PANTHER; PTHR10644:SF2; CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR SUBUNIT 1; 1.
DR PANTHER; PTHR10644; DNA REPAIR/RNA PROCESSING CPSF FAMILY; 1.
DR Pfam; PF03178; CPSF_A; 1.
DR Pfam; PF10433; MMS1_N; 2.
PE 3: Inferred from homology;
KW Reference proteome {ECO:0000313|Proteomes:UP000215902}.
FT DOMAIN 372..702
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit N-terminal"
FT /evidence="ECO:0000259|Pfam:PF10433"
FT DOMAIN 914..1089
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit N-terminal"
FT /evidence="ECO:0000259|Pfam:PF10433"
FT DOMAIN 1531..1848
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit C-terminal"
FT /evidence="ECO:0000259|Pfam:PF03178"
FT REGION 118..201
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 230..258
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 706..882
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 124..138
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 165..201
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 750..764
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 765..780
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 781..804
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 805..876
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT NON_TER 1
FT /evidence="ECO:0000313|EMBL:PAA83601.1"
SQ SEQUENCE 1883 AA; 207835 MW; 1510DA78E45B8296 CRC64;
SQVAHLLTSS SMYNMNNPYG MPPPQHMGMP PPPPPHQMPQ MQMPPMMMQH QQQQHQQQYM
PPMPPMHMAY QQMMQQQPQH HMMAPPPVQS PMSMAGFPMQ MQQPPMSMPM STMPTQMQMP
PPGQMSMPAM PMSVPPPGQP IMHQMQQQRP MPPTSMPPPP MQFPTQSQSS QQQQQQTPGS
QQQNPSYQQQ LSSSSMPSTL QSLQQVSIPS VVADSAAAAT VASPIAVAME TQQQQRQNQQ
IPSPPPSPTS ILATDHVSAS QTTQQYIQQS IFKSYSPDFR TVLKELHPPT AVTHCLYCRL
MSPRDQQLLV FSGHLLRVYR LVEAGTDELG ANADASHNSS GGAALDFVYG VEFYSPVLDA
LALPATCLDK RDSLLLAFEE AKLSVVEFDD STRELRATSL HTYEDVKYKD ARRQFTRPPM
LRLDPQRSCV VMLVYDKHLA VLPLRRQLVG VPELDAKPAV LPSYVLPLWD QAEAIVNVAD
VQFLNGYYDP TVCVLHEPIG TWSGRVTARH DTYRITAMSL NPKDRTNPII WSQAGLPFDC
FALLPVPKPI GGLVVLARNS VIYLNQSQRP TGLALNSNIN NSTSFPLRRV YPPTPRLSLD
GGKAIFLPNT GGRQFLVGVR DGSFYVFTLY IDSDMRQLNG FHWEPVGKSS PLSCFAMLTA
MEAETAGLTI GANPADVTDS GGMLFVGSFV SDSELVRLVG TGPMPTPVQF TNGDASAAAA
APTAPATPPP PPSPPTIDDD DDNENVEEDD DMRDNSKATE HNDEDEGDGD ANEEPQNPQT
DEADVEIKQE EEDNHSEATV AKRRRLESSF SNTSLVSPPP HTSAAQDQDE SVDKPTVNPA
DSLNEDSQQA PPSLMSQSSV TPSEQQQQQE EPPQQPLGVP QVSDPELDEI YETEEAASRS
DPTKRFQLDR CDYLPNVGPI GELTAAYIQG MFDNFLVKES ANLELLTSYG HAEHGGINLL
QRSVRPLIHS SFEIPDCSSL WSLYGDRRRP WQFESNAEED EECEGHGYLV LSREETSLVF
QVTDEIAELE DSGFSTSEPT LWAADVGDDL SLQVCPSSVR LLRGSQQVLV QPLEERARLV
TVCDPFVLIV DETDGFLLLE VIGETLRTER PTVQQYSFAT AIAITDHPFL RLHRKPTAAD
DGAADFADSA ATAAAATFDP LEEEDVLLYG DALSERARRP AAPPPPPTYD SDCQTTHWAI
VCYESGALEI YQLPDFVCCF AAKAFSDGPR LVTDASLLAA TQVGAGGVDT ATGASLPTAA
AADDDVPPEV EELSLMPIGR HLDRLLLLCR SRDEVSAYEA YPAPASDVAV LPPGRLAVRL
RRLDDLGVLL RTGKKPSKQR RTGKQQQQQQ LLLQQQSELQ QQLHGKHARL LVPFEDIGGY
RGLFVGGVRP HFLVVSPSGQ AYSHPMFVDG SVCAFSPFDR HFCRHGFLYL TPERDLRVAS
LPDDYDYASP WPRKRVPLGR TVHCVQFHRA TSTYLVASSA PELSNAICRL SSDGDKEIDT
REVPPTHCLP FRDRYYFEAY TPDWQAIPAV RLDMQVWERV ACCRIVRLQS EETAEGFKEL
VAVATNLSYN EEITCKGTIT LMDVINVVPE PGQPLTAYKM KLAFREEQKG PVTALASCHG
LLVSAIGQKV YLWQLKDDRL VGIAFVDSEI FIHSINCVKN LIVTSDLAKS VQLLRYQPSV
RVLSIVARDS ARRQVFTSNF LVDGQNLGFL LCDSRRNLLA FAYDPSEKLS RGGRNLVRKR
EARLPSSVHC SLRVHNCLRG GAGLAKTRDI QQGHSVVMGT AEGGLYLLTP VRRPVYTRLI
MLEKHLSHAV LHPAGLHPRA SRIYSPANHD LEPAKSGIID GDLMYRYVSL GHSERVEVAK
KAGLSADAIL DDLAEIQVST LHF
//