Pragya Srivastava

I am a researcher at Google DeepMind, working on robust and sample-efficient reinforcement learning for post-training Gemini with Prof. Doina Precup, Dr. Karthikeyan Shanmugam, and Dr. Aravindan Raghuveer.

Previously, I was a Research Intern at VITA, UT Austin, where I worked on the theoretical analysis of structured state space models. I also had the pleasure of working with Dr. Amit Sharma and Dr. Amit Deshpande at Microsoft Research on in-context learning and OOD detection in reward models. Before that, I completed my undergraduate degree at the Indian Institute of Technology, Delhi, with a major in Engineering Physics. During this time, I was a research intern at UCSD, advised by Prof. Pengtao Xie, and at the Ermon Group at Stanford University.

Email  /  CV  /  Google Scholar  /  X  /  Github


Research

My research focuses on identifying and characterizing fundamental failure modes of existing AI alignment methods. I am keen on developing principled approaches that enable reasoning models to explore effectively through interventional feedback and to generalize robustly out of domain. Broadly, I aim to bridge the gap between theoretical learning algorithms and practical deployment to build models that are highly capable yet fundamentally safe.

Robust Reward Modeling via Causal Rubrics
Pragya Srivastava*, Harman Singh*, Rahul Madhavan*, Gandharv Patil, Sravanti Addepalli, Arun Suggala, Rengarajan Aravamudhan, Soumya Sharma, Anirban Laha, Aravindan Raghuveer, Karthikeyan Shanmugam, Doina Precup [*Equal Contribution]

Under Review, 2025
DataWorld Workshop and MoFA Workshop at ICML 2025

Developing robust reward models with reduced reliance on spurious attributes and higher sensitivity to causal rubrics.

Benchmarking Anomaly Detection for Large Language Model Alignment
Pragya Srivastava*, Dylan Feng*, Anca Dragan, Cassidy Laidlaw [*Equal Contribution]

Under Review, 2025
[Paper Coming Soon]

Benchmarking the detection of unforeseen alignment failures using scalable oversight.

Outlier-Aware Preference Optimization for Large Language Models
Pragya Srivastava, Sai Soumya Nalli, Amit Deshpande, Amit Sharma

[Paper Coming Soon]
BiAlign Workshop and Quantify Uncertainty Workshop at ICLR 2025

A dynamic alignment method that uses energy-based OOD scoring to identify reward-model misjudgments and refine both the policy and the reward model.

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pan Li

ICLR 2025

We reveal two core limitations of state space models—recency bias and over-smoothing—and propose a polarization technique improving long-range memory and stability.

NICE: To Optimize In-Context Examples or Not?
Pragya Srivastava*, Satvik Golechha*, Amit Deshpande, Amit Sharma [*Equal Contribution]

ACL 2024 (Main Conference)

We introduce a task-specific metric, NICE (Normalized Invariability to Choice of Examples), that quantifies task learnability and guides whether to optimize instructions or ICEs.

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Pragya Srivastava*, Manuj Malik*, Vivek Gupta, Tanuja Ganu, Dan Roth [*Equal Contribution]

ACL 2024 (Findings)

We find that LLMs perform well on simple questions but degrade sharply as table complexity grows and questions demand deeper reasoning and multi-row inference.

Reviewer for ICLR 2026, ICML 2025 MoFA Workshop
Website template by Jon Barron