ID A0A2P6TL13_CHLSO Unreviewed; 1223 AA.
AC A0A2P6TL13;
DT 23-MAY-2018, integrated into UniProtKB/TrEMBL.
DT 23-MAY-2018, sequence version 1.
DT 24-JAN-2024, entry version 17.
DE RecName: Full=Cleavage/polyadenylation specificity factor A subunit N-terminal domain-containing protein {ECO:0000259|Pfam:PF10433};
GN ORFNames=C2E21_6344 {ECO:0000313|EMBL:PRW44974.1};
OS Chlorella sorokiniana (Freshwater green alga).
OC Eukaryota; Viridiplantae; Chlorophyta; core chlorophytes; Trebouxiophyceae;
OC Chlorellales; Chlorellaceae; Chlorella clade; Chlorella.
OX NCBI_TaxID=3076 {ECO:0000313|EMBL:PRW44974.1, ECO:0000313|Proteomes:UP000239899};
RN [1] {ECO:0000313|EMBL:PRW44974.1, ECO:0000313|Proteomes:UP000239899}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=UTEX 1602 {ECO:0000313|Proteomes:UP000239899};
RX PubMed=29178410; DOI=10.1111/tpj.13789;
RA Arriola M.B., Velmurugan N., Zhang Y., Plunkett M.H., Hondzo H.,
RA Barney B.M.;
RT "Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium
RT conductrix SAG 241.80: implications to maltose excretion by a green alga.";
RL Plant J. 93:566-586(2018).
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PRW44974.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; LHPG02000012; PRW44974.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A2P6TL13; -.
DR STRING; 3076.A0A2P6TL13; -.
DR OrthoDB; 5695745at2759; -.
DR Proteomes; UP000239899; Unassembled WGS sequence.
DR Gene3D; 2.130.10.10; YVTN repeat-like/Quinoprotein amine dehydrogenase; 2.
DR InterPro; IPR018846; Cleavage/polyA-sp_fac_asu_N.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR PANTHER; PTHR10644:SF6; CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR (CPSF) A SUBUNIT PROTEIN; 1.
DR PANTHER; PTHR10644; DNA REPAIR/RNA PROCESSING CPSF FAMILY; 1.
DR Pfam; PF10433; MMS1_N; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000239899}.
FT DOMAIN 145..339
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit N-terminal"
FT /evidence="ECO:0000259|Pfam:PF10433"
FT REGION 280..302
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 857..893
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 926..946
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 977..1002
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 977..991
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1223 AA; 123670 MW; EE8092CFD7524964 CRC64;
MTAGQSYVIR RCCPTRAFCR AVAGLFLPGA RPALVAAGDT TVALLVASEG AQDSEGQVEV
VCEQPVFAAV RAVAALPSGT AGEQDLLALL TDSGSLSILR FDAALCRFVA VQQLALPPEP
PGDPFRLLAV HPAGAAVAVA SLLGSAGQAQ LYAASCDTTG SGGRLCVLYP DPKPETVFEL
PGAAEGITGL WGLRQRTADR HHSLALLSFL GGSRLLAAQG GIFRDVTDCY PLHASTQTLA
AGNLRCPGTA GIAVQVTSHA VVLFATQALG GLSRQASDAL PSSAAAPTSS TDASSSSGSL
QPAWQPPAGC SIGAAVVAED AVLLFCTGSS RQLVVLLLSP FAAVDSPGQA GEAALPQLVA
AAHMPLQADV SCISNLQEQE QPSAVAAEQQ PEQQLGQQQQ QQQQQQQQQQ QQQQSAVFAV
GTYSSTVLLV QLSWQSGQPA AASLSLLQAV DLSASSIAGS VPTAAAAAAA AESSVPFSPR
AFQREQLTPE SLLVLPAPAA AVDGGTEAAA GAAAEATGAA AARLAGGAAL LVVGLRTGSL
LQLRCSWGPE QEQQEELLAV ALGHMPAVLL PLPAQPSLAT LAAGVPPPAA AALSDRVSLL
HCPTSAAAGG GRLRCVPLAL PQVQVAAPLL LESDNLANST CVLPEASWGI SEAVLDTAVG
SGGAAGGVQP PAQQLQLFLL CAAADGCLRM VSLDAQQVAG SRCWPLPHDL QPSRLVVHAA
SGCVAVAGAS SSLPGLRLRR PRLAWETEEE EEEEEGPAAA VQLLDPATGD LLASYTGFMP
EERVTALAVW DPAHPLPAAA GATAQPAEPG AAAAAASVEA AAVAAAAAAE APTEAAAAPA
EAAAAAAASD GSGEAPWGGF IVAGTTVGSG EPPGDAKPRW GGAPQGPPRW DDETSLRLQG
RLLLLQVVTP TAASAGASRG SGAAAAAAPA EAAGHASWDQ GQPEEPAPRR RLQLLAVAEM
HLPNRVLAVC PGSTRVLGDR TSNSGGGGSG GGGGGSSTPD AAEPRLFATV GRRLVSLEWR
ARQQMLRRVA WMPTARPITS LQLSGGLLVA ADSREGATVY RYYEPEPSAE LPPPHPEQRG
RLPRRYRLAP PAAGELRHRL GLAPQQHMRD GGGIAGAKMY SAAGQGKEPE EQLELGFSVV
AADTSCRPAV GALVVPAAAR LRAAGSGSSS KLLQQFLLLP PAEQRAVVAG MGGSGAAPAS
ELAAAELRRA AQRLNNLVAS LLL
//