Pragya Srivastava

I am a researcher at Google DeepMind, working on robust and sample-efficient reinforcement learning for post-training Gemini with Prof. Doina Precup, Dr. Karthikeyan Shanmugam, and Dr. Aravindan Raghuveer.

Previously, I was a Research Intern at VITA, UT Austin, where I worked on the theoretical analysis of structured state space models. I also had the pleasure of working with Dr. Amit Sharma and Dr. Amit Deshpande at Microsoft Research on in-context learning and OOD detection in reward models. Before that, I completed my undergraduate degree at the Indian Institute of Technology, Delhi, with a major in Engineering Physics. During this time, I was a research intern at UCSD, advised by Prof. Pengtao Xie, and at the Ermon Group at Stanford University.

Email  /  CV  /  Google Scholar  /  X  /  Github


Research

My research focuses on identifying and characterizing fundamental failure modes of existing AI alignment methods. I am keen on developing principled approaches that enable reasoning models to explore effectively through interventional feedback and to generalize robustly out of domain. Broadly, I aim to bridge the gap between theoretical learning algorithms and practical deployment to build models that are highly capable yet fundamentally safe.

Robust Reward Modeling via Causal Rubrics
Pragya Srivastava*, Harman Singh*, Rahul Madhavan*, Gandharv Patil, Sravanti Addepalli, Arun Suggala, Rengarajan Aravamudhan, Soumya Sharma, Anirban Laha, Aravindan Raghuveer, Karthikeyan Shanmugam, Doina Precup [*Equal Contribution]

Under Review, 2025
DataWorld Workshop and MoFA Workshop at ICML 2025

Developing robust reward models with reduced reliance on spurious attributes and higher sensitivity to causal rubrics.

Benchmarking Anomaly Detection for Large Language Model Alignment
Pragya Srivastava*, Dylan Feng*, Anca Dragan, Cassidy Laidlaw [*Equal Contribution]

Under Review, 2025
[Paper Coming Soon]

Benchmarking the detection of unforeseen alignment failures using scalable oversight.

Outlier-Aware Preference Optimization for Large Language Models
Pragya Srivastava, Sai Soumya Nalli, Amit Deshpande, Amit Sharma

[Paper Coming Soon]
BiAlign Workshop and Quantify Uncertainty Workshop at ICLR 2025

A dynamic alignment method that uses energy-based OOD scoring to identify reward-model misjudgments and refine both the policy and the reward model.

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pan Li

ICLR 2025

We reveal two core limitations of state space models—recency bias and over-smoothing—and propose a polarization technique improving long-range memory and stability.

NICE: To Optimize In-Context Examples or Not?
Pragya Srivastava*, Satvik Golechha*, Amit Deshpande, Amit Sharma [*Equal Contribution]

ACL 2024 (Main Conference)

We introduce a task-specific metric, NICE (Normalized Invariability to Choice of Examples), that quantifies task learnability and guides whether to optimize instructions or ICEs.

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Pragya Srivastava*, Manuj Malik*, Vivek Gupta, Tanuja Ganu, Dan Roth [*Equal Contribution]

ACL 2024 (Findings)

We find that LLMs perform well on simple questions but degrade sharply as table complexity grows and questions demand deeper reasoning and multi-row inference.

Reviewer for ICLR 2026, ICML 2025 MoFA Workshop
Website template by Jon Barron