Pragya Srivastava

I am a researcher at Google DeepMind, working on robust and sample-efficient reinforcement learning for post-training Gemini with Prof. Doina Precup, Dr. Karthikeyan Shanmugam, and Dr. Aravindan Raghuveer.

Previously, I was a Research Intern at VITA, UT Austin, where I worked on the theoretical analysis of structured state space models. I also had the pleasure of working with Dr. Amit Sharma and Dr. Amit Deshpande at Microsoft Research on in-context learning and OOD detection in reward models. Before this, I completed my undergraduate degree at the Indian Institute of Technology, Delhi, with a major in Engineering Physics. During this time, I was a research intern at UCSD, advised by Prof. Pengtao Xie, and at the Ermon Group, Stanford University.

Email  /  CV  /  Google Scholar  /  X  /  GitHub

profile photo

Research

My research focuses on identifying and characterizing the fundamental failure modes of existing AI alignment methods. I am keen on developing principled approaches that enable reasoning models to explore effectively through interventional feedback and generalize robustly out of domain. Broadly, I aim to bridge the gap between theoretical learning algorithms and practical deployment to build models that are highly capable yet fundamentally safe.

GEOMA: Geometric and Econometric Objectives for Multi-Reward Alignment
Pragya Srivastava*, Taneesh Gupta*, Rahul Madhavan, Karthikeyan Shanmugam, Aravindan Raghuveer
Under Review, 2026
[Paper Coming Soon]

We study covariance sphering and welfare-based aggregation for converting reward vectors into scalars for policy alignment, mitigating signal redundancy and reward hacking.

Robust Reward Modeling via Causal Rubrics
Pragya Srivastava*, Harman Singh*, Rahul Madhavan*, Gandharv Patil, Sravanti Addepalli, Arun Suggala, Rengarajan Aravamudhan, Soumya Sharma, Anirban Laha, Aravindan Raghuveer, Karthikeyan Shanmugam, Doina Precup
ICLR 2026

We develop robust reward models with reduced reliance on spurious attributes and higher sensitivity to causal rubrics.

Benchmarking and Improving Monitors for Out-Of-Distribution Alignment Failure in LLMs
Pragya Srivastava*, Dylan Feng*, Anca Dragan, Cassidy Laidlaw
Under Review, 2026
[Paper Coming Soon]

We systematically study monitoring for out-of-distribution alignment failures of LLMs using guard models and OOD detectors.

Outlier-Aware Preference Optimization for Large Language Models
Pragya Srivastava, Sai Soumya Nalli, Amit Deshpande, Amit Sharma
ICLR 2025
BiAlign Workshop and Quantify Uncertainty Workshop

A dynamic alignment method that uses energy-based OOD scoring to identify reward-model misjudgments and refine both the policy and the reward model.

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pan Li
ICLR 2025

We reveal two core limitations of state space models, recency bias and over-smoothing, and propose a polarization technique that improves long-range memory and stability.

NICE: To Optimize In-Context Examples or Not?
Pragya Srivastava*, Satvik Golechha*, Amit Deshpande, Amit Sharma
ACL 2024 (Main Conference)

We introduce a task-specific metric, NICE (Normalized Invariability to Choice of Examples), that quantifies task learnability and guides whether to optimize instructions or ICEs.

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Pragya Srivastava*, Manuj Malik*, Vivek Gupta, Tanuja Ganu, Dan Roth
ACL 2024 (Findings)

LLMs perform well on simple tasks but degrade sharply as table complexity increases, which demands deeper reasoning and multi-row inference.

Reviewer for ICLR 2026, ICML 2025 MoFA Workshop
Website template by Jon Barron