Affinage

CPSF2

Cleavage and polyadenylation specificity factor subunit 2 · UniProt Q9P2I0

Round 2 corrected
Length: 782 aa
Mass: 88.5 kDa
Annotated: 2026-04-28
42 papers in source corpus · 15 papers cited in narrative · 14 extracted findings

Mechanistic narrative

Synthesis pass · prose summary of the discoveries below

CPSF2 (CPSF100) is a catalytically inactive metallo-β-lactamase-fold subunit of the cleavage and polyadenylation specificity factor (CPSF) that serves as an essential scaffold for pre-mRNA 3′-end processing. It forms a stable core cleavage complex with CPSF73 and Symplekin that is shared between canonical polyadenylation and replication-dependent histone mRNA 3′-end processing; conserved MBL-motif residues in CPSF100 are required together with those of CPSF73 for heterodimeric endonuclease assembly and activity (PMID:18688255, PMID:19450530). The C-terminal domains of CPSF100 and CPSF73 form an extensive heterodimeric interface (NMR-resolved), with a separate CPSF73 CTD3 domain contacting Symplekin to complete the trimeric core (PMID:37989222). Beyond canonical 3′-end cleavage, CPSF2 is recruited by THOC5 to immediate-early gene 3′UTRs for alternative cleavage/polyadenylation (PMID:25274738) and functions as an RBFOX2 cofactor in alternative splicing regulation, directly binding pre-mRNA as shown by iCLIP (PMID:26697379).

Mechanistic history

Synthesis pass · year-by-year structured walk · 10 steps
  1. 1993 High

    Establishing that CPSF (containing CPSF100) participates in a processive polyadenylation complex revealed the functional context in which CPSF100 operates: a quaternary complex of CPSF, PAP, PAB II, and substrate RNA that transiently stabilizes PAP on the RNA 3′-end.

    Evidence In vitro reconstitution of polyadenylation with non-denaturing gel electrophoresis

    PMID:8440247

    Open questions at the time
    • Role of individual CPSF subunits within the complex not resolved
    • No direct assay for CPSF100 contribution to complex stability
  2. 1995 High

    Mapping the intra-CPSF interaction network showed that CPSF160 directly binds CPSF100, placing CPSF100 as a central subunit linking the RNA-recognition module to the broader polyadenylation machinery.

    Evidence Recombinant protein binding assays with immunoprecipitation and antibody inhibition/rescue of in vitro polyadenylation

    PMID:7590244

    Open questions at the time
    • Stoichiometry of the CPSF160–CPSF100 interaction unknown
    • No structural data on the interface
  3. 1997 High

    Demonstrating that CPSF is recruited to the pre-initiation complex via TFIID and then transfers to elongating RNA Pol II established that CPSF100, as part of CPSF, is cotranscriptionally loaded — coupling transcription to 3′-end processing.

    Evidence TFIID immunopurification identifying CPSF; Pol II CTD affinity binding; CTD truncation impairing 3′ processing in vivo

    PMID:9002523 PMID:9311784

    Open questions at the time
    • Whether CPSF100 directly contacts TFIID or Pol II CTD not determined
    • Mechanism of CPSF hand-off from TFIID to Pol II unclear
  4. 2000 High

    Identifying Symplekin as a bridging factor that co-purifies with both CPSF (including CPSF100) and CstF provided the first evidence that CPSF100 resides in a larger polyadenylation supercomplex.

    Evidence Reciprocal co-immunoprecipitation from cell extracts with domain mapping of CstF-64–Symplekin interaction

    PMID:10669729

    Open questions at the time
    • Direct CPSF100–Symplekin contact not demonstrated at this stage
    • Functional consequence of Symplekin bridging on cleavage not tested
  5. 2005 High

    Discovery of a paralogous RC-68/RC-74 (CPSF73L/CPSF100-like) complex that is distinct from canonical CPSF, together with the finding that RC-68 depletion arrests cells in G1, revealed that metallo-β-lactamase-fold heterodimers can function in specialized RNA processing pathways linked to cell cycle control.

    Evidence Co-immunoprecipitation in HeLa/mouse cells; RNAi knockdown with cell cycle analysis

    PMID:15684398

    Open questions at the time
    • Substrate specificity of the RC-68/RC-74 complex not defined
    • Whether CPSF100 itself contributes to G1 progression independent of its paralog not tested
  6. 2008 High

    Mutagenesis of conserved MBL-motif residues in both CPSF73 and CPSF100 abolished histone pre-mRNA cleavage, establishing that CPSF100 is an obligate but catalytically inactive partner in a heterodimeric endonuclease — analogous to homodimeric RNases Z and J.

    Evidence In vitro histone pre-mRNA cleavage assay with point mutations in MBL active-site residues of both subunits

    PMID:18688255

    Open questions at the time
    • No crystal structure of the heterodimer to explain why CPSF100 is catalytically dead
    • Whether CPSF100 MBL residues also contribute to polyadenylated mRNA cleavage not directly shown
  7. 2009 High

    Defining the CPSF73–CPSF100–Symplekin trimeric core as shared between histone and polyadenylated mRNA 3′-end processing unified two pathways under a common catalytic platform, with pathway specificity conferred by additional factors (CstF, CPSF160).

    Evidence Co-IP, ChIP at histone gene loci, and RNAi of individual subunits with histone pre-mRNA processing readout

    PMID:19450530

    Open questions at the time
    • Structural basis of trimeric assembly unknown at this point
    • How the core complex is differentially recruited to the two pathways not mechanistically resolved
  8. 2014 High

    THOC5-dependent recruitment of CPSF100 to immediate-early gene 3′UTRs linked CPSF100 to stimulus-responsive alternative cleavage and polyadenylation, expanding its role beyond constitutive 3′-end processing.

    Evidence AP-MS with THOC5 bait; THOC5 depletion plus transcriptome analysis; CPSF100 ChIP at 3′UTRs

    PMID:25274738

    Open questions at the time
    • Direct THOC5–CPSF100 binding interface not mapped
    • Whether the catalytically inactive MBL domain of CPSF100 is required for this function unknown
  9. 2015 Medium

    iCLIP showed that CPSF2 directly binds pre-mRNA, and CPSF2 knockdown combined with RNA-seq identified it as an RBFOX2 cofactor that regulates alternative splicing globally, establishing a non-canonical function for CPSF100 in splicing regulation.

    Evidence iCLIP at single-nucleotide resolution; siRNA knockdown with paired-end RNA-seq

    PMID:26697379

    Open questions at the time
    • Mechanism by which CPSF100 cooperates with RBFOX2 not defined
    • Whether the splicing function requires the CPSF73–CPSF100 heterodimer or CPSF100 alone not resolved
    • Single study; independent replication awaited
  10. 2023 High

    NMR solution structure of the CPSF73–CPSF100 C-terminal heterodimer revealed the atomic-level interface (CTD1–CTD2 contacts) and a TBP-like fold in CTD2, while a separate CPSF73 CTD3 domain was shown to bind Symplekin — providing the first structural model for trimeric core cleavage complex assembly.

    Evidence NMR structure determination of minimal heterodimer from Encephalitozoon cuniculi; biochemical binding assays for CTD3–Symplekin interaction

    PMID:36723825 PMID:37989222

    Open questions at the time
    • No full-length mammalian CPSF73–CPSF100 structure available
    • Role of the TBP-fold domain in RNA binding or catalysis not functionally tested

Open questions

Synthesis pass · forward-looking unresolved questions
  • A complete structural model of the mammalian CPSF73–CPSF100–Symplekin trimer bound to substrate RNA, the mechanism by which CPSF100 allosterically activates CPSF73 endonuclease activity, and the molecular basis for CPSF100's non-canonical role in splicing regulation remain unresolved.
  • No mammalian full-length trimer structure with RNA substrate
  • Allosteric contribution of CPSF100 MBL domain to CPSF73 catalysis not mechanistically defined
  • CPSF100–RBFOX2 interaction interface and splicing mechanism unknown

Mechanism profile

Synthesis pass · controlled-vocabulary classification
Molecular activity
  • GO:0005198 structural molecule activity · 3
  • GO:0140098 catalytic activity, acting on RNA · 2
  • GO:0003723 RNA binding · 1
Localization
  • GO:0005654 nucleoplasm · 2
  • GO:0005634 nucleus · 1
Pathway
  • R-HSA-8953854 Metabolism of RNA · 5
  • R-HSA-74160 Gene expression (Transcription) · 2
Complex memberships
  • CPSF (cleavage and polyadenylation specificity factor)
  • CPSF73–CPSF100–Symplekin core cleavage complex

Evidence

Reading pass · 14 per-paper findings extracted from the source corpus
Year | Finding | Method | Journal | Conf | PMIDs
1993 | CPSF (containing CPSF100/CPSF2), poly(A) polymerase (PAP), and poly(A) binding protein II (PAB II) form a processive quaternary complex with substrate RNA, transiently stabilizing PAP binding to the RNA 3'-end and enabling processive poly(A) tail synthesis. | Non-denaturing gel electrophoresis analysis of RNA-protein interactions; in vitro reconstitution of polyadenylation | The EMBO journal | High | 8440247
1995 | Within the CPSF complex, CPSF160 (the largest subunit) binds specifically to CPSF100/CPSF2 as part of a multisubunit assembly; CPSF160 also contacts CstF-77 and poly(A) polymerase, establishing the protein-protein interaction network that coordinates cleavage and polyadenylation. | Recombinant protein binding assays; immunoprecipitation; in vitro polyadenylation with antibody inhibition and CPSF rescue | Genes & development | High | 7590244
1997 | The CPSF complex (including CPSF100/CPSF2) is recruited to the transcription pre-initiation complex via TFIID, dissociates from TFIID after transcription initiation, and then associates with the elongating RNA polymerase II, linking transcription initiation to mRNA 3'-end processing. | Immunopurification of TFIID complex followed by identification of CPSF; functional assays showing overexpression of TBP reduces polyadenylation | Nature | High | 9311784
1997 | CPSF (containing CPSF100/CPSF2) specifically binds to the CTD of RNA polymerase II large subunit and co-purifies with pol II in a high-molecular-mass complex, demonstrating that CPSF is physically coupled to the transcription elongation machinery. | CTD affinity column binding; co-purification of CPSF with pol II; truncation of CTD showing loss of 3'-processing in vivo | Nature | High | 9002523
2000 | CPSF (including CPSF100/CPSF2), CstF, and symplekin can be co-isolated from cells as part of a large multiprotein complex, suggesting that symplekin functions in the assembly of the polyadenylation machinery by bridging CstF and CPSF. | Co-immunoprecipitation and co-purification from cell extracts; identification of CstF-64–symplekin interaction domain | Molecular and cellular biology | High | 10669729
2004 | hFip1 is an integral subunit of CPSF that interacts with both CPSF160 and poly(A) polymerase; hFip1, CPSF160, and PAP form a ternary complex in vitro, positioning CPSF100/CPSF2 within a larger CPSF assembly that stimulates polyadenylation activity. | Affinity purification and mass spectrometry identification of hFip1; recombinant protein interaction assays; in vitro polyadenylation stimulation assay | The EMBO journal | High | 14749727
2005 | A CPSF-100 homolog (RC-74) forms a stable complex with a CPSF-73 homolog (RC-68) in HeLa and mouse cells, independent of the canonical CPSF complex (does not interact with CPSF-160 or CPSF-73). RC-74 is exclusively nuclear whereas RC-68 is cytoplasmic and nuclear. RNAi depletion of RC-68 arrests cells in G1, indicating this RC-68/RC-74 (CPSF73b/CPSF100b) complex processes a subset of pre-mRNAs required for G1 progression and S-phase entry. | Co-immunoprecipitation in HeLa and mouse cells; subcellular fractionation/immunofluorescence; RNAi knockdown with cell cycle analysis | Molecular and cellular biology | High | 15684398
2006 | Crystallization of yeast CPSF-100 (Ydh1p) required proteolytic removal of an internal segment of ~200 highly charged and hydrophilic residues, revealing that this region renders the protein highly soluble and prevents crystallization, providing the first structural foothold into CPSF100 domain architecture. | Protein crystallization with fungal protease-mediated in situ proteolysis; X-ray crystallography | Acta crystallographica Section F | Medium | 17012808
2008 | Conserved residues within the metallo-beta-lactamase (MBL) motifs of both CPSF73 and CPSF100/CPSF2 are required for assembly of the endonuclease activity that cleaves histone pre-mRNAs, indicating that CPSF73 and CPSF100 act together as a heterodimeric endonuclease analogous to homodimeric RNases Z and J. | In vitro histone pre-mRNA cleavage assay with point mutations in conserved MBL residues of both CPSF73 and CPSF100 | EMBO reports | High | 18688255
2009 | CPSF73, CPSF100/CPSF2, and Symplekin form a stable core cleavage complex that associates with histone pre-mRNA processing factors and cotranscriptionally associates with histone genes (shown by ChIP for Symplekin and CPSF73). This core complex is shared between canonical polyadenylation and histone mRNA 3'-end processing pathways; CstF50 and CPSF160 are not required for histone pre-mRNA processing, indicating they define pathway-specific complexes built on this shared core. | Co-immunoprecipitation; chromatin immunoprecipitation (ChIP); RNAi knockdown of individual subunits with histone pre-mRNA processing readout | Molecular cell | High | 19450530
2014 | THOC5 (a TREX complex member) forms a complex with CPSF100/CPSF2 upon serum stimulation, and THOC5 is required for recruitment of CPSF100 to the 3'UTR of a subset of immediate early genes (including Myc and Smad7), controlling their alternative cleavage and polyadenylation. | Interactome analysis by affinity purification (THOC5 as bait) followed by mass spectrometry; THOC5 depletion with transcriptome analysis; CPSF100 ChIP at 3'UTR of target genes | Nucleic acids research | High | 25274738
2014 | CPSF2/CPSF100 knockdown in thyroid cancer cells increases cellular invasion 1.8–3.2-fold and expands markers of cancer stem cells (CD44 and CD133 expression), placing CPSF2 as a regulator of invasive behavior. | siRNA knockdown; invasion assay; flow cytometry for CD44/CD133 markers; immunohistochemistry | The Journal of clinical endocrinology and metabolism | Medium | 24654752
2015 | Genome-wide iCLIP demonstrates that CPSF2 directly binds pre-mRNA, and CPSF2 knockdown combined with RNA-seq identifies CPSF2 as a cofactor of RBFOX2 in regulating alternative splicing on a global scale, distinct from its canonical role in cleavage and polyadenylation. | iCLIP (individual-nucleotide resolution UV cross-linking and immunoprecipitation); siRNA knockdown; paired-end RNA-seq | Genomics data | Medium | 26697379
2023 | The solution structure of the CPSF73–CPSF100/CPSF2 C-terminal heterodimer (CTD1 and CTD2 of both proteins) was determined by NMR using a minimal heterodimer from Encephalitozoon cuniculi, revealing extensive inter-protein contacts particularly between CTD1 and CTD2, and a similarity of CTD2 to TATA-box binding protein (TBP) domains. A separate CTD3 domain of CPSF73 (also TBP-fold) binds Symplekin, establishing the structural basis for trimeric core cleavage complex formation. | NMR resonance assignment and solution structure determination; biochemical binding assays for CTD3–Symplekin interaction; structural modeling of trimeric complex | Open biology | High | 36723825, 37989222

Source papers

Stage 0 corpus · 42 papers · ranked by NIH iCite citations
Year | Title | Journal | Citations | PMID
2006 | Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. | Cell | 2861 | 17081983
2002 | Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. | Proceedings of the National Academy of Sciences of the United States of America | 1479 | 12477932
2004 | Large-scale characterization of HeLa cell nuclear phosphoproteins. | Proceedings of the National Academy of Sciences of the United States of America | 1159 | 15302935
2017 | Architecture of the human interactome defines protein communities and disease networks. | Nature | 1085 | 28514442
2015 | A human interactome in three quantitative dimensions organized by stoichiometries and abundances. | Cell | 1015 | 26496610
1997 | The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. | Nature | 764 | 9002523
2003 | Complete sequencing and characterization of 21,243 full-length human cDNAs. | Nature genetics | 754 | 14702039
2007 | Large-scale mapping of human protein-protein interactions by mass spectrometry. | Molecular systems biology | 733 | 17353931
2002 | Comprehensive proteomic analysis of the human spliceosome. | Nature | 725 | 12226669
2021 | Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. | Cell | 705 | 33961781
2012 | A census of human soluble protein complexes. | Cell | 689 | 22939629
2011 | Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. | Briefings in bioinformatics | 656 | 21873635
2018 | High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies. | Molecular cell | 580 | 29395067
2017 | Anticancer sulfonamides target splicing by inducing RBM39 degradation via recruitment to DCAF15. | Science (New York, N.Y.) | 533 | 28302793
2008 | Many sequence variants affecting diversity of adult human height. | Nature genetics | 520 | 18391951
2004 | The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). | Genome research | 438 | 15489334
2022 | OpenCell: Endogenous tagging for the cartography of human cellular organization. | Science (New York, N.Y.) | 432 | 35271311
2005 | Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes. | Genome research | 409 | 16344560
2015 | Panorama of ancient metazoan macromolecular complexes. | Nature | 407 | 26344197
2000 | Direct coupling of transcription and mRNA processing through the thermogenic coactivator PGC-1. | Molecular cell | 328 | 10983978
2010 | Dynamics of cullin-RING ubiquitin ligase network revealed by systematic quantitative proteomics. | Cell | 318 | 21145461
2012 | Novel genetic loci identified for the pathophysiology of childhood obesity in the Hispanic population. | PloS one | 312 | 23251661
1997 | Transcription factor TFIID recruits factor CPSF for formation of 3' end of mRNA. | Nature | 267 | 9311784
2017 | A Compendium of RNA-Binding Proteins that Regulate MicroRNA Biogenesis. | Molecular cell | 248 | 28431233
2008 | A PtdIns4,5P2-regulated nuclear poly(A) polymerase controls expression of select mRNAs. | Nature | 225 | 18288197
1995 | The 160-kD subunit of human cleavage-polyadenylation specificity factor coordinates pre-mRNA 3'-end formation. | Genes & development | 221 | 7590244
2004 | Human Fip1 is a subunit of CPSF that binds to U-rich RNA elements and stimulates poly(A) polymerase. | The EMBO journal | 209 | 14749727
2000 | Complex protein interactions within the human polyadenylation machinery identify a novel component. | Molecular and cellular biology | 200 | 10669729
1993 | Assembly of a processive messenger RNA polyadenylation complex. | The EMBO journal | 200 | 8440247
2020 | Systems analysis of RhoGEF and RhoGAP regulatory proteins reveals spatially organized RAC1 signalling from integrin adhesions. | Nature cell biology | 194 | 32203420
2009 | A core complex of CPSF73, CPSF100, and Symplekin may form two different cleavage factors for processing of poly(A) and histone mRNAs. | Molecular cell | 107 | 19450530
2005 | A CPSF-73 homologue is required for cell cycle progression but not cell growth and interacts with a protein having features of CPSF-100. | Molecular and cellular biology | 85 | 15684398
2008 | Conserved motifs in both CPSF73 and CPSF100 are required to assemble the active endonuclease for histone mRNA 3'-end maturation. | EMBO reports | 66 | 18688255
2014 | THOC5 controls 3'end-processing of immediate early genes via interaction with polyadenylation specific factor 100 (CPSF100). | Nucleic acids research | 33 | 25274738
2006 | A serendipitous discovery that in situ proteolysis is essential for the crystallization of yeast CPSF-100 (Ydh1p). | Acta crystallographica. Section F, Structural biology and crystallization communications | 22 | 17012808
2014 | Loss of CPSF2 expression is associated with increased thyroid cancer cellular invasion and cancer stem cell population, and more aggressive disease. | The Journal of clinical endocrinology and metabolism | 17 | 24654752
2021 | Serum anti-DIDO1, anti-CPSF2, and anti-FOXJ2 antibodies as predictive risk markers for acute ischemic stroke. | BMC medicine | 16 | 34103026
2015 | Negative Expression of CPSF2 Predicts a Poorer Clinical Outcome in Patients with Papillary Thyroid Carcinoma. | Thyroid : official journal of the American Thyroid Association | 14 | 26148673
2024 | SIZ1-mediated SUMOylation of CPSF100 promotes plant thermomorphogenesis by controlling alternative polyadenylation. | Molecular plant | 4 | 39066483
2015 | Global analysis of CPSF2-mediated alternative splicing: Integration of global iCLIP and transcriptome profiling data. | Genomics data | 4 | 26697379
2023 | Molecular details of the CPSF73-CPSF100 C-terminal heterodimer and interaction with Symplekin. | Open biology | 2 | 37989222
2023 | 1H, 15N and 13C resonance assignments of a minimal CPSF73-CPSF100 C-terminal heterodimer. | Biomolecular NMR assignments | 1 | 36723825