TxPert: using multiple knowledge graphs for prediction of transcriptomic perturbation effects

DOI: 10.1038/s41587-026-03113-4 (2026)
AI Metadata Extraction

Extraction v3 · google/gemini-3.1-flash-lite · 5/15/2026

Executive Summary

TxPert is a deep learning framework for predicting cellular responses to genetic perturbations in out-of-distribution (OOD) contexts. It uses a latent-transfer mechanism that combines a basal state representation with perturbation-specific embeddings learned through Graph Neural Networks (GNNs). Integrating multiple knowledge graphs, including curated biological databases and internal proprietary screen data (PxMap, TxMap), lets the model draw on complementary biological knowledge when generalizing.

The framework is evaluated on three OOD tasks: predicting unseen single perturbations in known cell lines, predicting the effects of novel combinatorial (double) perturbations, and generalizing to new biological contexts (cell lines) not seen during training. Systematic benchmarking against previous models such as GEARS and scLAMBDA, as well as a non-learned general baseline, shows that TxPert consistently achieves superior predictive performance. Notably, its accuracy on single unseen perturbations approaches the ceiling set by split-half experimental reproducibility.

Beyond model performance, this work contributes a modular framework and best practices for transcriptomic perturbation analysis, including batch-appropriate control matching and evaluation using retrieval-based metrics. Despite its robust performance, analysis reveals a specific failure mode: the model does not accurately predict the downregulation of perturbation targets. Overall, TxPert provides a foundation for future research toward more reliable, scalable virtual assays that could accelerate therapeutic discovery and personalized medicine.
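The latent-transfer idea described above can be illustrated with a toy sketch. This is not the TxPert implementation: the additive combination of basal and perturbation latents, the single round of mean-neighbor message passing, the averaging across graphs, and every name below (`propagate`, `perturbation_embedding`, `predict_latent`) are assumptions chosen only to make the data flow concrete.

```python
def propagate(embeddings, graph):
    """One round of mean-neighbor message passing over a single graph.

    embeddings: dict gene -> list[float] (hypothetical initial gene embeddings)
    graph:      dict gene -> list of neighbor gene names
    """
    out = {}
    for gene, vec in embeddings.items():
        neighbors = [embeddings[n] for n in graph.get(gene, []) if n in embeddings]
        messages = neighbors + [vec]  # include the gene itself (self-loop)
        out[gene] = [sum(vals) / len(messages) for vals in zip(*messages)]
    return out


def perturbation_embedding(gene, embeddings, graphs):
    """Average the propagated embedding of `gene` across several knowledge graphs.

    Averaging is an assumed stand-in for however the graphs are really combined.
    """
    per_graph = [propagate(embeddings, g)[gene] for g in graphs]
    return [sum(vals) / len(per_graph) for vals in zip(*per_graph)]


def predict_latent(basal, perturbed_genes, embeddings, graphs):
    """Shift a basal-state latent by the embedding of each perturbed gene.

    The additive shift is an assumption; a decoder (not shown) would map the
    result back to expression space.
    """
    z = list(basal)
    for gene in perturbed_genes:
        p = perturbation_embedding(gene, embeddings, graphs)
        z = [a + b for a, b in zip(z, p)]
    return z


# Toy usage: two graphs give gene "A" different neighbors.
embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
curated_graph = {"A": ["B"]}
screen_graph = {"A": ["C"]}
single = predict_latent([0.0, 0.0], ["A"], embeddings, [curated_graph, screen_graph])
double = predict_latent([0.0, 0.0], ["A", "B"], embeddings, [curated_graph, screen_graph])
```

In this toy setup each graph contributes a different neighborhood for the same gene, which is one simple way that complementary graphs could change the resulting perturbation embedding.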

Authors (15)

Frederik Wenkel (First Author)

Valence Labs, Montréal, Quebec, Canada

frederik@valencelabs.com

Wilson Tu

Valence Labs, Montréal, Quebec, Canada

ali@valencelabs.com

Cassandra Masschelein

Valence Labs, Montréal, Quebec, Canada

Abstract

Accurately predicting cellular responses to genetic perturbations is essential for understanding disease mechanisms and designing effective therapies. Yet, exhaustively exploring the space of possible perturbations (for example, multigene perturbations or across tissues and cell types) is prohibitively expensive, motivating methods that can generalize to unseen conditions. We present TxPert, a latent-transfer-based deep learning method that uses multiple knowledge graphs of gene (product)–gene (product) relationships to predict transcriptomic perturbation effects. Different knowledge graphs encode complementary information and we show that a combination of graphs derived from biological databases and high-throughput perturbation screens yields the best performance. For predictions of single unseen perturbations, TxPert approaches the performance of split-half experimental reproducibility. For double unseen perturbations and single perturbations in a different cell line, its predictions increase Pearson Δ for unseen single perturbations by 8–25% over existing methods.
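The Pearson Δ metric and the split-half reproducibility ceiling mentioned in the abstract can be sketched as follows. This assumes the common definition of Pearson Δ as the Pearson correlation between predicted and observed expression changes relative to a control mean; the paper's exact control matching and gene selection are not reproduced here, and the function names and the even/odd split are illustrative choices.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = sqrt(vx * vy)
    return cov / denom if denom > 0 else 0.0


def mean_profile(cells):
    """Per-gene mean expression over a list of cells (each a list of floats)."""
    return [sum(col) / len(cells) for col in zip(*cells)]


def pearson_delta(predicted, observed, control):
    """Correlate predicted and observed changes relative to the control mean."""
    pred_delta = [p - c for p, c in zip(predicted, control)]
    obs_delta = [o - c for o, c in zip(observed, control)]
    return pearson(pred_delta, obs_delta)


def split_half_delta(cells, control):
    """Reproducibility estimate: Pearson delta between the means of two halves.

    Cells are split by even/odd index here for determinism; a real analysis
    would more likely use random or batch-aware splits.
    """
    half_a = mean_profile(cells[0::2])
    half_b = mean_profile(cells[1::2])
    return pearson_delta(half_a, half_b, control)


# Toy usage with four genes.
control = [1.0, 1.0, 1.0, 1.0]
observed = [2.0, 1.0, 0.0, 1.0]
good_prediction = [3.0, 1.0, -1.0, 1.0]  # same direction of change, larger size
bad_prediction = [0.0, 1.0, 2.0, 1.0]    # opposite direction of change
score_good = pearson_delta(good_prediction, observed, control)
score_bad = pearson_delta(bad_prediction, observed, control)
```

Because the correlation is computed on deltas rather than raw expression, a prediction that simply reproduces the control profile yields no signal, which is the reason this style of metric is often preferred over correlating raw expression values.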

Fields of Study

Transcriptomic Perturbation Prediction · Computational Biology · Deep Learning · Functional Genomics · Drug Discovery · Systems Biology · Bioinformatics · Artificial Intelligence · Genomics · Biomedical Research

Key Findings (20)

1. TxPert effectively integrates multiple biological knowledge graphs (KGs) to improve perturbation effect prediction.

2. The model utilizes a latent-transfer-based deep learning architecture.

3. TxPert outperforms established baselines such as GEARS and scLAMBDA across various out-of-distribution (OOD) tasks.

Discussion & Future Directions

The discussion emphasizes the necessity of rigorous benchmarking in the field of transcriptomics-focused foundation models, as many prior models failed to outperform basic baselines. TxPert demonstrates the success of integrating diverse curated databases with large-scale high-throughput screening data via advanced graph modeling. Future directions include leveraging newly released single-cell datasets, extending to few-shot or active learning settings, and improving model capabilities for human primary tissues beyond immortalized cell lines. A crucial next step is the adoption of metrics that specifically evaluate the conditionality and specificity of perturbation effects in novel, unseen contexts.
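One way to picture a metric that rewards specificity is a retrieval-style score, in the spirit of the retrieval-based evaluation mentioned in the Executive Summary. The sketch below is an assumed formulation, not the paper's definition: for each perturbation, it asks how many other perturbations' observed deltas are less similar to the prediction than the true observed delta. The name `retrieval_score` and the use of Pearson similarity are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom > 0 else 0.0


def retrieval_score(pred_deltas, obs_deltas):
    """Mean normalized rank of the true perturbation among all observed deltas.

    pred_deltas, obs_deltas: dict perturbation name -> delta vector.
    Returns 1.0 when every prediction is closest to its own observation.
    """
    names = list(obs_deltas)
    scores = []
    for name in names:
        sims = {other: pearson(pred_deltas[name], obs_deltas[other]) for other in names}
        worse = sum(1 for other in names if other != name and sims[other] < sims[name])
        scores.append(worse / (len(names) - 1))
    return sum(scores) / len(scores)


# Toy usage: three perturbations with distinct observed deltas.
observed = {"A": [1.0, 0.0, -1.0], "B": [-1.0, 0.0, 1.0], "C": [0.0, 1.0, -1.0]}
perfect = dict(observed)
# A non-specific predictor that outputs the same delta for every perturbation.
generic = {name: [1.0, 0.0, -1.0] for name in observed}
perfect_score = retrieval_score(perfect, observed)
generic_score = retrieval_score(generic, observed)
```

In this toy example the non-specific predictor scores well below the perfect one, even though it matches one perturbation exactly, because a single generic output cannot rank each true perturbation first.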
