ID A0A0F4YHZ1_TALEM Unreviewed; 1414 AA.
AC A0A0F4YHZ1;
DT 24-JUN-2015, integrated into UniProtKB/TrEMBL.
DT 24-JUN-2015, sequence version 1.
DT 27-MAR-2024, entry version 29.
DE RecName: Full=Protein CFT1 {ECO:0000256|ARBA:ARBA00039443};
DE AltName: Full=Cleavage factor two protein 1 {ECO:0000256|ARBA:ARBA00041264};
DE AltName: Full=Protein cft1 {ECO:0000256|ARBA:ARBA00039187};
GN ORFNames=T310_8221 {ECO:0000313|EMBL:KKA17834.1};
OS Rasamsonia emersonii CBS 393.64.
OC Eukaryota; Fungi; Dikarya; Ascomycota; Pezizomycotina; Eurotiomycetes;
OC Eurotiomycetidae; Eurotiales; Trichocomaceae; Rasamsonia.
OX NCBI_TaxID=1408163 {ECO:0000313|EMBL:KKA17834.1, ECO:0000313|Proteomes:UP000053958};
RN [1] {ECO:0000313|EMBL:KKA17834.1, ECO:0000313|Proteomes:UP000053958}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=CBS 393.64 {ECO:0000313|EMBL:KKA17834.1,
RC ECO:0000313|Proteomes:UP000053958};
RA Heijne W.H., Fedorova N.D., Nierman W.C., Vollebregt A.W., Zhao Z., Wu L.,
RA Kumar M., Stam H., van den Berg M.A., Pel H.J.;
RL Submitted (APR-2015) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: RNA-binding component of the cleavage and polyadenylation
CC factor (CPF) complex, which plays a key role in polyadenylation-
CC dependent pre-mRNA 3'-end formation and cooperates with cleavage
CC factors including the CFIA complex and NAB4/CFIB. Involved in poly(A)
CC site recognition. May be involved in coupling transcription termination
CC and mRNA 3'-end formation. {ECO:0000256|ARBA:ARBA00037232}.
CC -!- SIMILARITY: Belongs to the CFT1 family.
CC {ECO:0000256|ARBA:ARBA00038304}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:KKA17834.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; LASV01000544; KKA17834.1; -; Genomic_DNA.
DR RefSeq; XP_013324446.1; XM_013468992.1.
DR STRING; 1408163.A0A0F4YHZ1; -.
DR GeneID; 25320481; -.
DR OrthoDB; 149432at2759; -.
DR Proteomes; UP000053958; Unassembled WGS sequence.
DR GO; GO:0005634; C:nucleus; IEA:InterPro.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR Gene3D; 1.10.150.910; -; 1.
DR Gene3D; 2.130.10.10; YVTN repeat-like/Quinoprotein amine dehydrogenase; 2.
DR InterPro; IPR004871; Cleavage/polyA-sp_fac_asu_C.
DR InterPro; IPR018846; Cleavage/polyA-sp_fac_asu_N.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR PANTHER; PTHR10644:SF2; CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR SUBUNIT 1; 1.
DR PANTHER; PTHR10644; DNA REPAIR/RNA PROCESSING CPSF FAMILY; 1.
DR Pfam; PF03178; CPSF_A; 1.
DR Pfam; PF10433; MMS1_N; 1.
PE 3: Inferred from homology;
KW Reference proteome {ECO:0000313|Proteomes:UP000053958}.
FT DOMAIN 123..743
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit N-terminal"
FT /evidence="ECO:0000259|Pfam:PF10433"
FT DOMAIN 1055..1379
FT /note="Cleavage/polyadenylation specificity factor A
FT subunit C-terminal"
FT /evidence="ECO:0000259|Pfam:PF03178"
FT REGION 211..237
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 470..496
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 217..237
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 481..496
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1414 AA; 153844 MW; 25A97E79AF56CB9F CRC64;
MLSSVESRSP LGLCVCDSLQ LDPDSRMQCY TELLPPSGVT HALSAPFLSP TANNLVVVKT
SLLQIFTLVN VASELDAREA EDKASRIETS QDTKLHLIAE YDLSGTVTDI CRVKILNSKS
GGDALLLAFR NAKLSLIEWD PDRHGISTIS IHYYEKEDVT RSPWVPNLST CGSHLTVDPS
SRCAVLNFGL RNLAILPFHQ AGDDLVMDDY DPDLDEEPAD HEMKDVVQSE KKDERKDSLT
YQTPYAASFV LPMTALDPAL LHPISLAFLH EYREPTFGIL YSQVATSSAL LSERKDVVFY
SVFTLDLEQR ASTTLLSVAR LPSDLFKIVA LPPPVGGALL VGSNELIHVD QAGKTNAVGV
NEFARQVSSF SMADQSDLAL RLEGCVVEHL GNENGDVILL LSSGEMMLVS FKLDGRSVSG
LSIHPIANGG TIMNAAASCS AVLGSGKLFF GSEDADSILV EWSNVSSSTK RARVQDADQQ
PEFSDEDGDD NDAYEDDLYS AAPKVSSDHR PSVDNSIAGG YNFRVLDTLT NIGPLRDIAL
GKASAKAADG DRNVSSVTAD LELVASQGSD KSGGLVLLKR EIDPSVISSF EIDNAESVWS
VSVSEAKSKT SNDGGSSVQK GADSYVILAK STSTDKEESV VHAVKGKSLE AFRAPEFNPN
EDCTVDVGTL AGGTRVVQVL TGEVRIYDSS LGLAQIYPVW DEDTSDERIA VSASFADPYL
LILRDDASVL LLQADESGDL DEVPLNDEIS SKSWLSGCLY ADNSGMFSPI ETDSQGNIHL
FLLNAECKLF IFRLPSTDLV SVIEGVDYVL PILSAEPPPR RSSTRETITE ILVADIGESY
CKSPYLILRT GTDDLVIYRP FRINDDLGKD PSSLKFLKEV NHTLPKVPSV ASSKTSSGGQ
RRTKSLRRLP DISGLSAVFM PGASPSFVIR TSKSMPHLVS LRGGFVRGLS DFNAAGCEKG
FIYVDSHNVV RTCQLPDDTQ FDFPWTVRRV PLGEQVDHLT YSTSSDTYVL GTSYKADFRL
PEDDELHPEW RNEDLPKLGE TPEPEELVCH RQVCYPLSAA ERVMAVENIN LEISEQTRER
KDVIVVGTTF AQGEDVAARG CVYVFDVIEV VPDPERPETN LKLKLIGKES VKGAVTAISG
IGGQGFLIVA QGQKCMVRGL KNDGSLLPVA FIDVQCYVSV LKELRGTGMC IIGDALKGLW
FTGYSEEPYK MTLFGKDMDE LEVVAADFLP DGKKLYIIVC DGDCNLHVLQ YDPEDPKSSN
GDRLLNRSTF HMGHFASTVT LLPRTAVSSE LAMSDSDEMS IDAYVPLHQV LITTQTGSVG
LVTSVSEESY RRLSALQSQL SNTLEHPCGL NPRAYRAVES DGIGGRGMID GTLLYRWLDL
SRQRKTEIAS RVGADEWEIR ADLEAIGGSG LGYL
//