ID Q4N971_THEPA Unreviewed; 1504 AA.
AC Q4N971;
DT 02-AUG-2005, integrated into UniProtKB/TrEMBL.
DT 02-AUG-2005, sequence version 1.
DT 24-JAN-2024, entry version 75.
DE RecName: Full=Cleavage/polyadenylation specificity factor A subunit C-terminal domain-containing protein {ECO:0008006|Google:ProtNLM};
GN OrderedLocusNames=TP01_0243 {ECO:0000313|EMBL:EAN33487.1};
OS Theileria parva (East coast fever infection agent).
OC Eukaryota; Sar; Alveolata; Apicomplexa; Aconoidasida; Piroplasmida;
OC Theileriidae; Theileria.
OX NCBI_TaxID=5875 {ECO:0000313|EMBL:EAN33487.1, ECO:0000313|Proteomes:UP000001949};
RN [1] {ECO:0000313|EMBL:EAN33487.1, ECO:0000313|Proteomes:UP000001949}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Muguga {ECO:0000313|EMBL:EAN33487.1,
RC ECO:0000313|Proteomes:UP000001949};
RX PubMed=15994558; DOI=10.1126/science.1110439;
RA Gardner M.J., Bishop R., Shah T., de Villiers E.P., Carlton J.M., Hall N.,
RA Ren Q., Paulsen I.T., Pain A., Berriman M., Wilson R.J.M., Sato S.,
RA Ralph S.A., Mann D.J., Xiong Z., Shallom S.J., Weidman J., Jiang L.,
RA Lynn J., Weaver B., Shoaibi A., Domingo A.R., Wasawo D., Crabtree J.,
RA Wortman J.R., Haas B., Angiuoli S.V., Creasy T.H., Lu C., Suh B.,
RA Silva J.C., Utterback T.R., Feldblyum T.V., Pertea M., Allen J.,
RA Nierman W.C., Taracha E.L.N., Salzberg S.L., White O.R., Fitzhugh H.A.,
RA Morzaria S., Venter J.C., Fraser C.M., Nene V.;
RT "Genome sequence of Theileria parva, a bovine pathogen that transforms
RT lymphocytes.";
RL Science 309:134-137(2005).
CC -!- SIMILARITY: Belongs to the DDB1 family.
CC {ECO:0000256|ARBA:ARBA00007453}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:EAN33487.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AAGK01000001; EAN33487.1; -; Genomic_DNA.
DR RefSeq; XP_765770.1; XM_760677.1.
DR STRING; 5875.Q4N971; -.
DR EnsemblProtists; EAN33487; EAN33487; TP01_0243.
DR GeneID; 3502504; -.
DR KEGG; tpv:TP01_0243; -.
DR VEuPathDB; PiroplasmaDB:TpMuguga_01g00243; -.
DR eggNOG; KOG1897; Eukaryota.
DR InParanoid; Q4N971; -.
DR OMA; FFTCATT; -.
DR Proteomes; UP000001949; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:InterPro.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR Gene3D; 1.10.150.910; -; 1.
DR Gene3D; 2.130.10.10; YVTN repeat-like/Quinoprotein amine dehydrogenase; 2.
DR InterPro; IPR004871; Cleavage/polyA-sp_fac_asu_C.
DR InterPro; IPR018846; Cleavage/polyA-sp_fac_asu_N.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR PANTHER; PTHR10644:SF3; DNA DAMAGE-BINDING PROTEIN 1; 1.
DR PANTHER; PTHR10644; DNA REPAIR/RNA PROCESSING CPSF FAMILY; 1.
DR Pfam; PF03178; CPSF_A; 1.
DR Pfam; PF10433; MMS1_N; 1.
PE 3: Inferred from homology;
KW Reference proteome {ECO:0000313|Proteomes:UP000001949}.
FT DOMAIN 343..577
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit N-terminal"
FT /evidence="ECO:0000259|Pfam:PF10433"
FT DOMAIN 1229..1460
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit C-terminal"
FT /evidence="ECO:0000259|Pfam:PF03178"
FT REGION 191..226
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 394..419
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 583..611
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 626..651
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 999..1038
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1178..1201
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 590..604
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1504 AA; 168929 MW; 3B9739123F69A441 CRC64;
MYGYFVTTAA SGATLKAIRC RLVKDSLKEY LVCLKRHSIE VYSLENNENR NVDNDLDHFN
PPVLISVLNA SSTFVDFVEY RPPSESQTHL LVLTRNFTLF LLTYHDNLAK FTSKTVSSLQ
ELHGRPNYEN IIFKVDSGYN LLVFYGYNRV LKCIVLDPNN YFNFSDLVTI RTNDTIFVDF
EFISSEYIEK PDASKSSSSP RRRNTSTGTP TRSFSGSSSS SVSKDIQFKS PTRNSSFHFL
ESKLIILGED NAYGSNKTPV RWLYGMKLMF EIEILNGCKR FNSYNHTPLF GDPIKLPIPY
SKFIPLNLVN KANRYDSVML LGPGSTGFIN FKSPRHVKQF KIDNSITEIT CFCMYRDNKF
IFGDDNGGLY LLRLSVSTMR KSANTVKRRA VYTPGMSSSS NTLSSSTVSS QQSNRNVSGD
GDFMINEVMA IRLGSFPVPS SLIKLGDHHI FYTSKMGNSS IISIYSILNS RNNEEQTLEQ
GDFKSSEWAQ TNLGPITDFA YREEASSGEN TILACCGMGN SASFCEIYFG LSSEVIHTSD
VPGVHDLFSF PMKSPDDSSS SLLCISFFRF TRFYTVSFSK PDVSDPPELI SIESTPQPAQ
NRRVNARRVN KRQKARTALI NALENNRRPS ENPSQVVNNH NNSQRNTNTN RTGKITKFQN
NALLCNEKTI LFTKLRDGNI LQVTPKNIIL VNDSFKSVRR AKISQIVTLP GDKYALSSLV
CSPYILLLMS TNCIVVLEYD FKNFNSRCLD FTVSAMGCIS KNDLLNSDLG IFASGGGLVG
VSSWANNNMI LFLTVKDLKV VYSHKVNLDY DVYVVSIKFA KINSNVYLLL SLSNGFLYIY
QLTRADRRIK MTLSNKSKLS FWSFKLLELK VGSGEGEDTC DLSKVNLITT GPKSYVIHPK
NEKITYTKIN IDNLHTVTNI VNLGTVAEKE KEELLVIYGN HKSVVVGRLN LLNNFNVQKI
LKGSNFNKVI YDSRTKLAVI STIPQYIINP NNLYTYTPSN SQSTDVPGYC ETPSTNTQSQ
PSPYISQHSN NSQNSNSQLT DSDNLLLCMS DDQILSGTDT INVPSEALLV DIETKEVVYR
LNMPQGHLIS SMHKYTHDDL GKDYILLGTS KVSEANDVPT EGYLYFLEVY KEADACTVVV
NRNAIPLGGG VVEITNLNKF IVIAVNSNVM VISLTASNEN SSPNDPSSSS AVNNGSHRQG
TRSKLELVDL KDSRLSIPKQ DETTLFIDVV ANYDSNTFVV SLDTKDDVIF VGDLMTSVKM
LKFRDNRLLE TCRDFNTLWT TSLAAVDNSS CLVSDDLGNF LLFKKVQHPT TDQQSIRFDK
QGLFHHGEVV NKILKRTQMS VQHVANSRMS RSNPREFMVS NRVVSESESN NPSETLNVNE
YTNLFKSFFT CATTSGSLLQ VCFFDDLNMF LKLSLLEHTM HLVQKDLGNI PSRNQRNFED
LHSNIPTKGF VDGDLVEAFL KLPDSLKKWV FETMLINSRH LGVKLSSLES LLYEVDHIKH
LRLE
//