{"gene":"CPSF2","run_date":"2026-06-09T22:57:19","timeline":{"discoveries":[{"year":2009,"finding":"CPSF100 (CPSF2), CPSF73, and Symplekin form a stable core subcomplex that interacts with histone-specific processing factors and is required for both histone pre-mRNA 3'-end processing and polyadenylated pre-mRNA processing. Chromatin immunoprecipitation showed Symplekin and CPSF73 (but not CstF50) co-transcriptionally associate with histone genes, and depletion of CPSF160 or CstF64 downregulates Symplekin without affecting histone pre-mRNA processing.","method":"Co-immunoprecipitation, RNAi knockdown, chromatin immunoprecipitation (ChIP), in vivo processing assays in Drosophila","journal":"Molecular cell","confidence":"High","confidence_rationale":"Tier 2 / Strong — reciprocal co-IP establishing stable complex, ChIP for co-transcriptional association, RNAi knockdown with specific processing phenotype; replicated across multiple approaches in one study","pmids":["19450530"],"is_preprint":false},{"year":2008,"finding":"Conserved residues within the metallo-beta-lactamase (MBL) motifs of both CPSF73 and CPSF100 are required to assemble the active endonuclease that cleaves histone pre-mRNAs. CPSF100, though catalytically inactive itself (due to substitutions in the histidine motif), contributes structurally to the active endonuclease, analogous to RNase Z and RNase J homodimers.","method":"In vitro point mutagenesis of conserved MBL residues in both proteins, in vitro histone pre-mRNA cleavage assay","journal":"EMBO reports","confidence":"High","confidence_rationale":"Tier 1 / Moderate — in vitro reconstituted cleavage assay combined with active-site mutagenesis of both subunits in a single rigorous study","pmids":["18688255"],"is_preprint":false},{"year":2005,"finding":"CPSF100 (CPSF2) is exclusively nuclear, does not interact with CPSF73 or CPSF160, and forms a distinct complex with RC-68 (a CPSF73 homolog) that is independent of the canonical CPSF complex, suggesting a role in 3'-end processing of a subset of pre-mRNAs distinct from bulk polyadenylation.","method":"Co-immunoprecipitation, subcellular fractionation/localization, RNAi knockdown with cell-cycle phenotype readout in HeLa cells","journal":"Molecular and cellular biology","confidence":"Medium","confidence_rationale":"Tier 2 / Moderate — co-IP demonstrating complex formation and lack of interaction with canonical CPSF partners, plus direct localization by fractionation; single lab","pmids":["15684398"],"is_preprint":false},{"year":2014,"finding":"CPSF100 (CPSF2) forms a serum-stimulation-dependent complex with THOC5 (a TREX complex member), and THOC5 is required for recruitment of CPSF100 to the 3'-UTR of immediate early gene targets (including Myc and Smad7), controlling their 3'-end processing and alternative cleavage.","method":"Co-immunoprecipitation (interactome analysis using THOC5 as bait), chromatin/RNA immunoprecipitation, RNAi knockdown of THOC5 with transcriptome analysis","journal":"Nucleic acids research","confidence":"Medium","confidence_rationale":"Tier 2 / Moderate — co-IP identifying the THOC5–CPSF100 interaction, supported by RNAi knockdown and RNA-seq/ChIP demonstrating functional consequence; single lab","pmids":["25274738"],"is_preprint":false},{"year":2023,"finding":"The C-terminal domains (CTD1 and CTD2) of CPSF73 and CPSF100 form a stable heterodimer with extensive inter-protein contacts; CTD2 of both proteins resembles TATA-box binding protein (TBP) domains. The CTD3 domain of CPSF73 (also a TBP-fold domain, connected by a flexible linker) is required for binding Symplekin, defining the molecular architecture of the trimeric core cleavage complex.","method":"NMR solution structure determination of minimal CPSF73–CPSF100 C-terminal heterodimer from E. cuniculi, biochemical binding assays for Symplekin interaction, comparative structural modeling","journal":"Open biology","confidence":"High","confidence_rationale":"Tier 1 / Moderate — NMR structure determination combined with biochemical assays for Symplekin binding; multiple orthogonal methods in one study","pmids":["37989222"],"is_preprint":false},{"year":2014,"finding":"Knockdown of CPSF2 in thyroid cancer cells increased cellular invasion 1.8- to 3.2-fold and expanded cancer stem cell markers (CD44 and CD133 expression), establishing a functional role for CPSF2 in suppressing invasiveness.","method":"RNAi knockdown in thyroid cancer cell lines, invasion assay, flow cytometry/immunostaining for stem cell markers","journal":"The Journal of clinical endocrinology and metabolism","confidence":"Low","confidence_rationale":"Tier 3 / Weak — single lab, phenotypic knockdown assay without pathway placement or molecular mechanism","pmids":["24654752"],"is_preprint":false},{"year":2015,"finding":"Genome-wide iCLIP identified direct RNA-binding targets of CPSF2, and CPSF2 knockdown altered alternative splicing events genome-wide, indicating CPSF2 acts as a direct RNA-binding cofactor of RBFOX2 in regulating alternative splicing.","method":"Individual-nucleotide resolution UV cross-linking and immunoprecipitation (iCLIP), paired-end RNA-seq after CPSF2 RNAi knockdown","journal":"Genomics data","confidence":"Medium","confidence_rationale":"Tier 2 / Weak — iCLIP directly identifies RNA contacts of CPSF2 and RNA-seq shows splicing changes upon KD; single lab, limited mechanistic detail in abstract","pmids":["26697379"],"is_preprint":false}],"current_model":"CPSF100/CPSF2 is a catalytically inactive metallo-beta-lactamase-fold subunit of the CPSF complex that forms a stable heterodimer with the endonuclease CPSF73 through extensive C-terminal domain contacts, contributes structurally (via its MBL motifs) to assembly of the active cleavage endonuclease for both histone and polyadenylated pre-mRNA 3'-end processing, scaffolds the trimeric core cleavage complex together with Symplekin, and also participates in alternative splicing regulation as an RBFOX2 cofactor and in immediate early gene 3'-end processing through a stimulus-dependent interaction with the TREX component THOC5."},"narrative":{"mechanistic_narrative":"CPSF2 (CPSF100) is a catalytically inactive metallo-beta-lactamase (MBL)-fold subunit of the cleavage machinery for both histone and polyadenylated pre-mRNA 3'-end processing [PMID:19450530, PMID:18688255]. Although its own histidine motif is degenerate and it cannot cleave RNA, conserved residues within its MBL motifs are required to build the active endonuclease together with CPSF73, contributing structurally in a manner analogous to RNase Z and RNase J [PMID:18688255]. Structurally, the C-terminal domains of CPSF100 and CPSF73 form a stable heterodimer through extensive inter-protein contacts, and CPSF73's CTD3 binds Symplekin, defining the trimeric core cleavage complex that, together with histone-specific factors, is required for 3'-end processing and associates co-transcriptionally with histone genes [PMID:19450530, PMID:37989222]. Beyond canonical 3'-end formation, CPSF2 participates in alternative cleavage of immediate early genes through a serum-stimulation-dependent interaction with the TREX component THOC5, which recruits CPSF2 to 3'-UTR targets including Myc and Smad7 [PMID:25274738], and acts as a direct RNA-binding cofactor influencing genome-wide alternative splicing [PMID:26697379].","teleology":[{"year":2005,"claim":"Established that CPSF2 has a nuclear-restricted localization and can engage processing partners outside the canonical CPSF complex, raising the possibility of substrate-specific 3'-end processing roles.","evidence":"Co-IP, subcellular fractionation, and RNAi with cell-cycle readout in HeLa cells","pmids":["15684398"],"confidence":"Medium","gaps":["The reported lack of interaction with CPSF73/CPSF160 conflicts with later heterodimer structures and was not reconciled","Functional substrate set of the RC-68 complex not defined","Single-lab observation without reciprocal validation"]},{"year":2008,"claim":"Resolved how a catalytically dead subunit contributes to catalysis by showing CPSF100's MBL motifs are structurally required to assemble the active CPSF73 endonuclease.","evidence":"In vitro MBL active-site mutagenesis of both subunits with reconstituted histone pre-mRNA cleavage assay","pmids":["18688255"],"confidence":"High","gaps":["Does not resolve the atomic geometry of the assembled active site","Contribution to polyadenylated substrate cleavage tested less directly"]},{"year":2009,"claim":"Defined the CPSF100-CPSF73-Symplekin core as a stable subcomplex required for both histone and polyadenylated pre-mRNA processing and showed it associates co-transcriptionally with target genes.","evidence":"Reciprocal co-IP, RNAi knockdown with processing phenotype, and ChIP in Drosophila","pmids":["19450530"],"confidence":"High","gaps":["Mechanism distinguishing histone versus polyadenylated substrate routing not defined","Stoichiometry and assembly order of the trimeric core not established"]},{"year":2014,"claim":"Identified a stimulus-dependent route for CPSF2 recruitment, linking it to TREX-mediated 3'-end processing of immediate early genes.","evidence":"Co-IP interactome with THOC5 as bait, RNA/chromatin IP, and THOC5 RNAi with transcriptome analysis","pmids":["25274738"],"confidence":"Medium","gaps":["Whether the THOC5 interaction is direct is not established","How serum stimulation triggers complex formation is unknown","Single lab"]},{"year":2014,"claim":"Suggested a cellular consequence of CPSF2 loss in cancer, linking depletion to increased invasion and stem cell marker expansion.","evidence":"RNAi knockdown in thyroid cancer cell lines with invasion assay and stem cell marker flow cytometry","pmids":["24654752"],"confidence":"Low","gaps":["Phenotypic knockdown without molecular mechanism or pathway placement","No connection drawn to 3'-end processing function","Single lab, no in vivo validation"]},{"year":2015,"claim":"Extended CPSF2 function beyond 3'-end cleavage by mapping its direct RNA contacts and showing it influences genome-wide alternative splicing as an RBFOX2 cofactor.","evidence":"iCLIP for direct RNA targets paired with RNA-seq after CPSF2 knockdown","pmids":["26697379"],"confidence":"Medium","gaps":["Direct physical RBFOX2 interaction not biochemically detailed in abstract","Mechanism coupling splicing regulation to 3'-end processing role unknown","Single lab"]},{"year":2023,"claim":"Provided the molecular architecture of the core cleavage complex by solving the CPSF73-CPSF100 C-terminal heterodimer and defining the CPSF73 surface that binds Symplekin.","evidence":"NMR solution structure of the minimal C-terminal heterodimer from E. cuniculi with biochemical Symplekin binding assays and comparative modeling","pmids":["37989222"],"confidence":"High","gaps":["Structure of the full-length assembled endonuclease on RNA substrate not determined","Solved in a microsporidian ortholog rather than the human complex"]},{"year":null,"claim":"How CPSF2's distinct activities — core endonuclease assembly, THOC5-dependent immediate early gene processing, and RBFOX2-associated splicing regulation — are coordinated and substrate-selected remains unresolved.","evidence":"","pmids":[],"confidence":"Medium","gaps":["No structure of the catalytically active complex engaged with substrate RNA","Mechanism selecting histone versus polyadenylated versus alternatively spliced targets unknown","Direct versus indirect nature of THOC5 and RBFOX2 associations unsettled"]}],"mechanism_profile":{"molecular_activity":[{"term_id":"GO:0140098","term_label":"catalytic activity, acting on RNA","supporting_discovery_ids":[0,1]},{"term_id":"GO:0003723","term_label":"RNA binding","supporting_discovery_ids":[6]},{"term_id":"GO:0005198","term_label":"structural molecule activity","supporting_discovery_ids":[1,4]}],"localization":[{"term_id":"GO:0005634","term_label":"nucleus","supporting_discovery_ids":[2]}],"pathway":[{"term_id":"R-HSA-8953854","term_label":"Metabolism of RNA","supporting_discovery_ids":[0,1,6]},{"term_id":"R-HSA-74160","term_label":"Gene expression (Transcription)","supporting_discovery_ids":[0,3]}],"complexes":["CPSF cleavage core (CPSF100-CPSF73-Symplekin)"],"partners":["CPSF73","SYMPLEKIN","THOC5","RBFOX2","RC-68"],"other_free_text":[]}},"prefetch_data":{"uniprot":{"accession":"Q9P2I0","full_name":"Cleavage and polyadenylation specificity factor subunit 2","aliases":["Cleavage and polyadenylation specificity factor 100 kDa subunit","CPSF 100 kDa subunit"],"length_aa":782,"mass_kda":88.5,"function":"Component of the cleavage and polyadenylation specificity factor (CPSF) complex that play a key role in pre-mRNA 3'-end formation, recognizing the AAUAAA signal sequence and interacting with poly(A) polymerase and other factors to bring about cleavage and poly(A) addition. Involved in the histone 3' end pre-mRNA processing","subcellular_location":"Nucleus","url":"https://www.uniprot.org/uniprotkb/Q9P2I0/entry"},"depmap":{"release":"DepMap","has_data":true,"is_common_essential":true,"resolved_as":"","url":"https://depmap.org/portal/gene/CPSF2","classification":"Common Essential","n_dependent_lines":1200,"n_total_lines":1208,"dependency_fraction":0.9933774834437086},"opencell":{"profiled":false,"resolved_as":"","ensg_id":"","cell_line_id":"","localizations":[],"interactors":[{"gene":"CPSF6","stoichiometry":0.2},{"gene":"RBM14","stoichiometry":0.2},{"gene":"SNRPA","stoichiometry":0.2},{"gene":"SNRPC","stoichiometry":0.2},{"gene":"TOP1","stoichiometry":0.2}],"url":"https://opencell.sf.czbiohub.org/search/CPSF2","total_profiled":1310},"omim":[{"mim_id":"611274","title":"GLAUCOMA 1, OPEN ANGLE, N; GLC1N","url":"https://www.omim.org/entry/611274"},{"mim_id":"606029","title":"CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR 3; CPSF3","url":"https://www.omim.org/entry/606029"},{"mim_id":"606028","title":"CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR 2; CPSF2","url":"https://www.omim.org/entry/606028"},{"mim_id":"602388","title":"SYMPLEKIN; SYMPK","url":"https://www.omim.org/entry/602388"},{"mim_id":"104155","title":"ZINC FINGER HOMEOBOX 3; ZFHX3","url":"https://www.omim.org/entry/104155"}],"hpa":{"profiled":true,"resolved_as":"","reliability":"Approved","locations":[{"location":"Nucleoplasm","reliability":"Approved"},{"location":"Vesicles","reliability":"Additional"}],"tissue_specificity":"Low tissue specificity","tissue_distribution":"Detected in all","driving_tissues":[],"url":"https://www.proteinatlas.org/search/CPSF2"},"hgnc":{"alias_symbol":["KIAA1367","CPSF100"],"prev_symbol":[]},"alphafold":{"accession":"Q9P2I0","domains":[{"cath_id":"3.60.15.10","chopping":"4-207_539-597","consensus_level":"medium","plddt":95.0362,"start":4,"end":597},{"cath_id":"3.40.50.10890","chopping":"212-386_524-536","consensus_level":"high","plddt":94.0989,"start":212,"end":536},{"cath_id":"-","chopping":"614-631_698-782","consensus_level":"medium","plddt":84.2339,"start":614,"end":782}],"viewer_url":"https://alphafold.ebi.ac.uk/entry/Q9P2I0","model_url":"https://alphafold.ebi.ac.uk/files/AF-Q9P2I0-F1-model_v6.cif","pae_url":"https://alphafold.ebi.ac.uk/files/AF-Q9P2I0-F1-predicted_aligned_error_v6.png","plddt_mean":80.81},"mouse_models":{"mgi_url":"https://www.informatics.jax.org/marker/summary?nomen=CPSF2","jax_strain_url":"https://www.jax.org/strain/search?query=CPSF2"},"sequence":{"accession":"Q9P2I0","fasta_url":"https://rest.uniprot.org/uniprotkb/Q9P2I0.fasta","uniprot_url":"https://www.uniprot.org/uniprotkb/Q9P2I0/entry","alphafold_viewer_url":"https://alphafold.ebi.ac.uk/entry/Q9P2I0"}},"corpus_meta":[{"pmid":"19450530","id":"PMC_19450530","title":"A core complex of CPSF73, CPSF100, and Symplekin may form two different cleavage factors for processing of poly(A) and histone mRNAs.","date":"2009","source":"Molecular cell","url":"https://pubmed.ncbi.nlm.nih.gov/19450530","citation_count":107,"is_preprint":false},{"pmid":"15684398","id":"PMC_15684398","title":"A CPSF-73 homologue is required for cell cycle progression but not cell growth and interacts with a protein having features of CPSF-100.","date":"2005","source":"Molecular and cellular biology","url":"https://pubmed.ncbi.nlm.nih.gov/15684398","citation_count":85,"is_preprint":false},{"pmid":"18688255","id":"PMC_18688255","title":"Conserved motifs in both CPSF73 and CPSF100 are required to assemble the active endonuclease for histone mRNA 3'-end maturation.","date":"2008","source":"EMBO reports","url":"https://pubmed.ncbi.nlm.nih.gov/18688255","citation_count":66,"is_preprint":false},{"pmid":"25274738","id":"PMC_25274738","title":"THOC5 controls 3'end-processing of immediate early genes via interaction with polyadenylation specific factor 100 (CPSF100).","date":"2014","source":"Nucleic acids research","url":"https://pubmed.ncbi.nlm.nih.gov/25274738","citation_count":33,"is_preprint":false},{"pmid":"17012808","id":"PMC_17012808","title":"A serendipitous discovery that in situ proteolysis is essential for the crystallization of yeast CPSF-100 (Ydh1p).","date":"2006","source":"Acta crystallographica. Section F, Structural biology and crystallization communications","url":"https://pubmed.ncbi.nlm.nih.gov/17012808","citation_count":22,"is_preprint":false},{"pmid":"24654752","id":"PMC_24654752","title":"Loss of CPSF2 expression is associated with increased thyroid cancer cellular invasion and cancer stem cell population, and more aggressive disease.","date":"2014","source":"The Journal of clinical endocrinology and metabolism","url":"https://pubmed.ncbi.nlm.nih.gov/24654752","citation_count":18,"is_preprint":false},{"pmid":"34103026","id":"PMC_34103026","title":"Serum anti-DIDO1, anti-CPSF2, and anti-FOXJ2 antibodies as predictive risk markers for acute ischemic stroke.","date":"2021","source":"BMC medicine","url":"https://pubmed.ncbi.nlm.nih.gov/34103026","citation_count":17,"is_preprint":false},{"pmid":"26148673","id":"PMC_26148673","title":"Negative Expression of CPSF2 Predicts a Poorer Clinical Outcome in Patients with Papillary Thyroid Carcinoma.","date":"2015","source":"Thyroid : official journal of the American Thyroid Association","url":"https://pubmed.ncbi.nlm.nih.gov/26148673","citation_count":15,"is_preprint":false},{"pmid":"26697379","id":"PMC_26697379","title":"Global analysis of CPSF2-mediated alternative splicing: Integration of global iCLIP and transcriptome profiling data.","date":"2015","source":"Genomics data","url":"https://pubmed.ncbi.nlm.nih.gov/26697379","citation_count":5,"is_preprint":false},{"pmid":"39066483","id":"PMC_39066483","title":"SIZ1-mediated SUMOylation of CPSF100 promotes plant thermomorphogenesis by controlling alternative polyadenylation.","date":"2024","source":"Molecular plant","url":"https://pubmed.ncbi.nlm.nih.gov/39066483","citation_count":4,"is_preprint":false},{"pmid":"37989222","id":"PMC_37989222","title":"Molecular details of the CPSF73-CPSF100 C-terminal heterodimer and interaction with Symplekin.","date":"2023","source":"Open biology","url":"https://pubmed.ncbi.nlm.nih.gov/37989222","citation_count":3,"is_preprint":false},{"pmid":"36723825","id":"PMC_36723825","title":"1H, 15N and 13C resonance assignments of a minimal CPSF73-CPSF100 C-terminal heterodimer.","date":"2023","source":"Biomolecular NMR assignments","url":"https://pubmed.ncbi.nlm.nih.gov/36723825","citation_count":1,"is_preprint":false}],"cost":{"stage1":{"model":"claude-sonnet-4-6","input_tokens":7792,"output_tokens":1997,"usd":0.026666,"stage1_stop_reason":"end_turn"},"stage2":{"model":"claude-opus-4-8","input_tokens":8859,"output_tokens":2659,"usd":0.055385,"stage2_stop_reason":"end_turn"},"total_usd":0.082051,"stage1_batch_id":"msgbatch_01ACAxJQabDsht39JXeWG22d","stage2_batch_id":"msgbatch_01QEU22A9U5sbPTgsMzTmvSf","note":"batch pricing = 50% of standard"},"stage1_raw":"```json\n{\n  \"discoveries\": [\n    {\n      \"year\": 2009,\n      \"finding\": \"CPSF100 (CPSF2), CPSF73, and Symplekin form a stable core subcomplex that interacts with histone-specific processing factors and is required for both histone pre-mRNA 3'-end processing and polyadenylated pre-mRNA processing. Chromatin immunoprecipitation showed Symplekin and CPSF73 (but not CstF50) co-transcriptionally associate with histone genes, and depletion of CPSF160 or CstF64 downregulates Symplekin without affecting histone pre-mRNA processing.\",\n      \"method\": \"Co-immunoprecipitation, RNAi knockdown, chromatin immunoprecipitation (ChIP), in vivo processing assays in Drosophila\",\n      \"journal\": \"Molecular cell\",\n      \"confidence\": \"High\",\n      \"confidence_rationale\": \"Tier 2 / Strong — reciprocal co-IP establishing stable complex, ChIP for co-transcriptional association, RNAi knockdown with specific processing phenotype; replicated across multiple approaches in one study\",\n      \"pmids\": [\"19450530\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2008,\n      \"finding\": \"Conserved residues within the metallo-beta-lactamase (MBL) motifs of both CPSF73 and CPSF100 are required to assemble the active endonuclease that cleaves histone pre-mRNAs. CPSF100, though catalytically inactive itself (due to substitutions in the histidine motif), contributes structurally to the active endonuclease, analogous to RNase Z and RNase J homodimers.\",\n      \"method\": \"In vitro point mutagenesis of conserved MBL residues in both proteins, in vitro histone pre-mRNA cleavage assay\",\n      \"journal\": \"EMBO reports\",\n      \"confidence\": \"High\",\n      \"confidence_rationale\": \"Tier 1 / Moderate — in vitro reconstituted cleavage assay combined with active-site mutagenesis of both subunits in a single rigorous study\",\n      \"pmids\": [\"18688255\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2005,\n      \"finding\": \"CPSF100 (CPSF2) is exclusively nuclear, does not interact with CPSF73 or CPSF160, and forms a distinct complex with RC-68 (a CPSF73 homolog) that is independent of the canonical CPSF complex, suggesting a role in 3'-end processing of a subset of pre-mRNAs distinct from bulk polyadenylation.\",\n      \"method\": \"Co-immunoprecipitation, subcellular fractionation/localization, RNAi knockdown with cell-cycle phenotype readout in HeLa cells\",\n      \"journal\": \"Molecular and cellular biology\",\n      \"confidence\": \"Medium\",\n      \"confidence_rationale\": \"Tier 2 / Moderate — co-IP demonstrating complex formation and lack of interaction with canonical CPSF partners, plus direct localization by fractionation; single lab\",\n      \"pmids\": [\"15684398\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2014,\n      \"finding\": \"CPSF100 (CPSF2) forms a serum-stimulation-dependent complex with THOC5 (a TREX complex member), and THOC5 is required for recruitment of CPSF100 to the 3'-UTR of immediate early gene targets (including Myc and Smad7), controlling their 3'-end processing and alternative cleavage.\",\n      \"method\": \"Co-immunoprecipitation (interactome analysis using THOC5 as bait), chromatin/RNA immunoprecipitation, RNAi knockdown of THOC5 with transcriptome analysis\",\n      \"journal\": \"Nucleic acids research\",\n      \"confidence\": \"Medium\",\n      \"confidence_rationale\": \"Tier 2 / Moderate — co-IP identifying the THOC5–CPSF100 interaction, supported by RNAi knockdown and RNA-seq/ChIP demonstrating functional consequence; single lab\",\n      \"pmids\": [\"25274738\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2023,\n      \"finding\": \"The C-terminal domains (CTD1 and CTD2) of CPSF73 and CPSF100 form a stable heterodimer with extensive inter-protein contacts; CTD2 of both proteins resembles TATA-box binding protein (TBP) domains. The CTD3 domain of CPSF73 (also a TBP-fold domain, connected by a flexible linker) is required for binding Symplekin, defining the molecular architecture of the trimeric core cleavage complex.\",\n      \"method\": \"NMR solution structure determination of minimal CPSF73–CPSF100 C-terminal heterodimer from E. cuniculi, biochemical binding assays for Symplekin interaction, comparative structural modeling\",\n      \"journal\": \"Open biology\",\n      \"confidence\": \"High\",\n      \"confidence_rationale\": \"Tier 1 / Moderate — NMR structure determination combined with biochemical assays for Symplekin binding; multiple orthogonal methods in one study\",\n      \"pmids\": [\"37989222\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2014,\n      \"finding\": \"Knockdown of CPSF2 in thyroid cancer cells increased cellular invasion 1.8- to 3.2-fold and expanded cancer stem cell markers (CD44 and CD133 expression), establishing a functional role for CPSF2 in suppressing invasiveness.\",\n      \"method\": \"RNAi knockdown in thyroid cancer cell lines, invasion assay, flow cytometry/immunostaining for stem cell markers\",\n      \"journal\": \"The Journal of clinical endocrinology and metabolism\",\n      \"confidence\": \"Low\",\n      \"confidence_rationale\": \"Tier 3 / Weak — single lab, phenotypic knockdown assay without pathway placement or molecular mechanism\",\n      \"pmids\": [\"24654752\"],\n      \"is_preprint\": false\n    },\n    {\n      \"year\": 2015,\n      \"finding\": \"Genome-wide iCLIP identified direct RNA-binding targets of CPSF2, and CPSF2 knockdown altered alternative splicing events genome-wide, indicating CPSF2 acts as a direct RNA-binding cofactor of RBFOX2 in regulating alternative splicing.\",\n      \"method\": \"Individual-nucleotide resolution UV cross-linking and immunoprecipitation (iCLIP), paired-end RNA-seq after CPSF2 RNAi knockdown\",\n      \"journal\": \"Genomics data\",\n      \"confidence\": \"Medium\",\n      \"confidence_rationale\": \"Tier 2 / Weak — iCLIP directly identifies RNA contacts of CPSF2 and RNA-seq shows splicing changes upon KD; single lab, limited mechanistic detail in abstract\",\n      \"pmids\": [\"26697379\"],\n      \"is_preprint\": false\n    }\n  ],\n  \"current_model\": \"CPSF100/CPSF2 is a catalytically inactive metallo-beta-lactamase-fold subunit of the CPSF complex that forms a stable heterodimer with the endonuclease CPSF73 through extensive C-terminal domain contacts, contributes structurally (via its MBL motifs) to assembly of the active cleavage endonuclease for both histone and polyadenylated pre-mRNA 3'-end processing, scaffolds the trimeric core cleavage complex together with Symplekin, and also participates in alternative splicing regulation as an RBFOX2 cofactor and in immediate early gene 3'-end processing through a stimulus-dependent interaction with the TREX component THOC5.\"\n}\n```","stage2_raw":"{\n  \"mechanistic_narrative\": \"CPSF2 (CPSF100) is a catalytically inactive metallo-beta-lactamase (MBL)-fold subunit of the cleavage machinery for both histone and polyadenylated pre-mRNA 3'-end processing [#0, #1]. Although its own histidine motif is degenerate and it cannot cleave RNA, conserved residues within its MBL motifs are required to build the active endonuclease together with CPSF73, contributing structurally in a manner analogous to RNase Z and RNase J [#1]. Structurally, the C-terminal domains of CPSF100 and CPSF73 form a stable heterodimer through extensive inter-protein contacts, and CPSF73's CTD3 binds Symplekin, defining the trimeric core cleavage complex that, together with histone-specific factors, is required for 3'-end processing and associates co-transcriptionally with histone genes [#0, #4]. Beyond canonical 3'-end formation, CPSF2 participates in alternative cleavage of immediate early genes through a serum-stimulation-dependent interaction with the TREX component THOC5, which recruits CPSF2 to 3'-UTR targets including Myc and Smad7 [#3], and acts as a direct RNA-binding cofactor influencing genome-wide alternative splicing [#6].\",\n  \"teleology\": [\n    {\n      \"year\": 2005,\n      \"claim\": \"Established that CPSF2 has a nuclear-restricted localization and can engage processing partners outside the canonical CPSF complex, raising the possibility of substrate-specific 3'-end processing roles.\",\n      \"evidence\": \"Co-IP, subcellular fractionation, and RNAi with cell-cycle readout in HeLa cells\",\n      \"pmids\": [\"15684398\"],\n      \"confidence\": \"Medium\",\n      \"confidence_rationale\": \"single lab\",\n      \"gaps\": [\"The reported lack of interaction with CPSF73/CPSF160 conflicts with later heterodimer structures and was not reconciled\", \"Functional substrate set of the RC-68 complex not defined\", \"Single-lab observation without reciprocal validation\"]\n    },\n    {\n      \"year\": 2008,\n      \"claim\": \"Resolved how a catalytically dead subunit contributes to catalysis by showing CPSF100's MBL motifs are structurally required to assemble the active CPSF73 endonuclease.\",\n      \"evidence\": \"In vitro MBL active-site mutagenesis of both subunits with reconstituted histone pre-mRNA cleavage assay\",\n      \"pmids\": [\"18688255\"],\n      \"confidence\": \"High\",\n      \"gaps\": [\"Does not resolve the atomic geometry of the assembled active site\", \"Contribution to polyadenylated substrate cleavage tested less directly\"]\n    },\n    {\n      \"year\": 2009,\n      \"claim\": \"Defined the CPSF100-CPSF73-Symplekin core as a stable subcomplex required for both histone and polyadenylated pre-mRNA processing and showed it associates co-transcriptionally with target genes.\",\n      \"evidence\": \"Reciprocal co-IP, RNAi knockdown with processing phenotype, and ChIP in Drosophila\",\n      \"pmids\": [\"19450530\"],\n      \"confidence\": \"High\",\n      \"gaps\": [\"Mechanism distinguishing histone versus polyadenylated substrate routing not defined\", \"Stoichiometry and assembly order of the trimeric core not established\"]\n    },\n    {\n      \"year\": 2014,\n      \"claim\": \"Identified a stimulus-dependent route for CPSF2 recruitment, linking it to TREX-mediated 3'-end processing of immediate early genes.\",\n      \"evidence\": \"Co-IP interactome with THOC5 as bait, RNA/chromatin IP, and THOC5 RNAi with transcriptome analysis\",\n      \"pmids\": [\"25274738\"],\n      \"confidence\": \"Medium\",\n      \"gaps\": [\"Whether the THOC5 interaction is direct is not established\", \"How serum stimulation triggers complex formation is unknown\", \"Single lab\"]\n    },\n    {\n      \"year\": 2014,\n      \"claim\": \"Suggested a cellular consequence of CPSF2 loss in cancer, linking depletion to increased invasion and stem cell marker expansion.\",\n      \"evidence\": \"RNAi knockdown in thyroid cancer cell lines with invasion assay and stem cell marker flow cytometry\",\n      \"pmids\": [\"24654752\"],\n      \"confidence\": \"Low\",\n      \"gaps\": [\"Phenotypic knockdown without molecular mechanism or pathway placement\", \"No connection drawn to 3'-end processing function\", \"Single lab, no in vivo validation\"]\n    },\n    {\n      \"year\": 2015,\n      \"claim\": \"Extended CPSF2 function beyond 3'-end cleavage by mapping its direct RNA contacts and showing it influences genome-wide alternative splicing as an RBFOX2 cofactor.\",\n      \"evidence\": \"iCLIP for direct RNA targets paired with RNA-seq after CPSF2 knockdown\",\n      \"pmids\": [\"26697379\"],\n      \"confidence\": \"Medium\",\n      \"gaps\": [\"Direct physical RBFOX2 interaction not biochemically detailed in abstract\", \"Mechanism coupling splicing regulation to 3'-end processing role unknown\", \"Single lab\"]\n    },\n    {\n      \"year\": 2023,\n      \"claim\": \"Provided the molecular architecture of the core cleavage complex by solving the CPSF73-CPSF100 C-terminal heterodimer and defining the CPSF73 surface that binds Symplekin.\",\n      \"evidence\": \"NMR solution structure of the minimal C-terminal heterodimer from E. cuniculi with biochemical Symplekin binding assays and comparative modeling\",\n      \"pmids\": [\"37989222\"],\n      \"confidence\": \"High\",\n      \"gaps\": [\"Structure of the full-length assembled endonuclease on RNA substrate not determined\", \"Solved in a microsporidian ortholog rather than the human complex\"]\n    },\n    {\n      \"year\": null,\n      \"claim\": \"How CPSF2's distinct activities — core endonuclease assembly, THOC5-dependent immediate early gene processing, and RBFOX2-associated splicing regulation — are coordinated and substrate-selected remains unresolved.\",\n      \"evidence\": \"\",\n      \"pmids\": [],\n      \"confidence\": \"Medium\",\n      \"gaps\": [\"No structure of the catalytically active complex engaged with substrate RNA\", \"Mechanism selecting histone versus polyadenylated versus alternatively spliced targets unknown\", \"Direct versus indirect nature of THOC5 and RBFOX2 associations unsettled\"]\n    }\n  ],\n  \"mechanism_profile\": {\n    \"molecular_activity\": [\n      {\"term_id\": \"GO:0140098\", \"supporting_discovery_ids\": [0, 1]},\n      {\"term_id\": \"GO:0003723\", \"supporting_discovery_ids\": [6]},\n      {\"term_id\": \"GO:0005198\", \"supporting_discovery_ids\": [1, 4]}\n    ],\n    \"localization\": [\n      {\"term_id\": \"GO:0005634\", \"supporting_discovery_ids\": [2]}\n    ],\n    \"pathway\": [\n      {\"term_id\": \"R-HSA-8953854\", \"supporting_discovery_ids\": [0, 1, 6]},\n      {\"term_id\": \"R-HSA-74160\", \"supporting_discovery_ids\": [0, 3]}\n    ],\n    \"complexes\": [\n      \"CPSF cleavage core (CPSF100-CPSF73-Symplekin)\"\n    ],\n    \"partners\": [\n      \"CPSF73\",\n      \"Symplekin\",\n      \"THOC5\",\n      \"RBFOX2\",\n      \"RC-68\"\n    ],\n    \"other_free_text\": []\n  }\n}","audit_flag":null,"evaluation":{"pairwise":"win","faith_supported":4,"faith_total":4,"faith_pct":100.0}}