
Som Sagar

Hi! I am a third-year computer science PhD student at Arizona State University, advised by Dr. Ransalu Senanayake and affiliated with the Laboratory for Learning Evaluation and Naturalization of Systems (LENS Lab). My research focuses on developing robust and adaptable machine learning models, with an emphasis on reinforcement learning and uncertainty estimation. I aim to build systems that handle distribution shifts and new information in dynamic environments. By combining foundation models, reinforcement learning, and real-world applicability, I seek to improve model interpretability, generalization, and decision-making in AI systems.

Previously, I received a B.Tech (Honors) in Computer Science from the Indian Institute of Information Technology (IIIT), Kottayam.

I am originally from Kerala, India, and outside of research, I enjoy spending time outdoors, especially playing soccer, hiking, and swimming.

I'm always happy to connect about research, collaborate on ideas, or share advice. Please feel free to get in touch!


News

May '26 Joined Google as a student researcher.
Jan '26 Our paper "Strategic Vantage Selection for Learning Viewpoint-Agnostic Manipulation Policies" was accepted to ICRA 2026.
Jan '26 Our paper "Uncovering Robot Vulnerabilities through Semantic Potential Fields" was accepted to ICLR 2026.
Jan '26 Our paper "ExpressivityArena: Can LLMs Express Information Implicitly?" was accepted to EACL-Findings 2026.
Sept '25 Our paper "PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?" was accepted to NeurIPS 2025.
Jun '25 Our paper "Trustworthy Explanations for Robot Behaviors" was accepted to IROS 2025.
May '25 Joined LinkedIn as a research intern.
Apr '25 Our paper "Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations" was accepted to ICML 2025.
Dec '24 Presented four workshop papers at NeurIPS 2024!
Apr '24 Our paper "Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models" was accepted to ICML 2024 as a spotlight (top 3.5%).
Aug '23 Started my PhD in Computer Science at Arizona State University, joining the LENS Lab.
May '23 Graduated from IIIT Kottayam with a B.Tech (Honors) in Computer Science.

Research

Please see my CV or Google Scholar for a full list of work.

RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields
Som Sagar, Jiafei Duan, Sreevisakh Vasudevan, Yifan Zhou, Heni Ben Amor, Dieter Fox, Ransalu Senanayake
International Conference on Learning Representations (ICLR), 2026
We train a deep RL policy to navigate a learned vision-language embedding space, structured as a potential field of successes and failures, to efficiently diagnose vulnerabilities in manipulation policies without costly real-world trials.
PDF · BibTeX
@article{sagar2024uncovering,
            title={Uncovering Robot Vulnerabilities through Semantic Potential Fields},
            author={Sagar, Som and Duan, Jiafei and Vasudevan, Sreevisakh and Zhou, Yifan and Ben Amor, Heni and Fox, Dieter and Senanayake, Ransalu},
            journal={arXiv preprint arXiv:2412.02818},
            year={2026}
          }
Strategic Vantage Selection for Learning Viewpoint-Agnostic Manipulation Policies
Sreevisakh Vasudevan, Som Sagar, Ransalu Senanayake
International Conference on Robotics & Automation (ICRA), 2026
We propose a method that strategically selects camera viewpoints during training to learn manipulation policies that generalize across novel viewpoints at deployment, without requiring multi-view data collection.
PDF · BibTeX
@article{vasudevan2026strategic,
            title={Strategic Vantage Selection for Learning Viewpoint-Agnostic Manipulation Policies},
            author={Vasudevan, Sreevisakh and Sagar, Som and Senanayake, Ransalu},
journal={arXiv preprint arXiv:2506.12261},
            year={2026}
          }
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Atharva Gundawar*, Som Sagar*, Ransalu Senanayake
Conference on Neural Information Processing Systems (NeurIPS), 2025
We introduce PAC Bench, a benchmark with 30,000+ annotations to evaluate whether vision-language models understand the object Properties, action Affordances, and physical Constraints needed for reliable robot manipulation.
PDF · Dataset · BibTeX
@article{gundawar2025pac,
            title={PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?},
            author={Gundawar, Atharva and Sagar, Som and Senanayake, Ransalu},
journal={arXiv preprint arXiv:2506.23725},
            year={2025}
          }
Trustworthy Explanations for Robot Behaviors
Som Sagar*, Aditya Taparia*, Harsh Mankodiya, Pranav Bidare, Yifan Zhou, Ransalu Senanayake
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025
We propose BaTCAVe, a Bayesian concept-based explainability method that provides human-interpretable explanations for robot decisions along with calibrated uncertainty scores, enabling trustworthy post-hoc diagnosis of neural network policies.
PDF · Video · Code · BibTeX
@article{sagar2024trustworthy,
            title={Trustworthy Conceptual Explanations for Neural Networks in Robot Decision-Making},
            author={Sagar, Som and Taparia, Aditya and Mankodiya, Harsh and Bidare, Pranav and Zhou, Yifan and Senanayake, Ransalu},
            journal={arXiv preprint arXiv:2409.10733},
            year={2024}
          }
Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations
Aditya Taparia, Som Sagar, Ransalu Senanayake
International Conference on Machine Learning (ICML), 2025
We frame concept-based explanation as an image generation problem and propose RLPO, a deep RL algorithm that fine-tunes a vision-language generative model to automatically produce concept images that reveal a neural network's internal representations—including ones humans cannot anticipate.
PDF · Video · Code · BibTeX
@article{taparia2024explainable,
            title={Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations},
            author={Taparia, Aditya and Sagar, Som and Senanayake, Ransalu},
            journal={arXiv preprint arXiv:2408.13438},
            year={2024}
          }
From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies
Som Sagar, Jiafei Duan, Sreevisakh Vasudevan, Yifan Zhou, Heni Ben Amor, Dieter Fox, Ransalu Senanayake
Robotics: Science and Systems (RSS) Workshop on Out-of-Distribution Generalization in Robotics, 2025
We propose a deep RL framework that systematically diagnoses failure modes in robot manipulation policies under unseen environmental variations, and uses the discovered vulnerabilities to fine-tune and improve policy robustness.
PDF · Website · BibTeX
@inproceedings{sagar2024mystery,
            title={From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies},
author={Sagar, Som and Duan, Jiafei and Vasudevan, Sreevisakh and Zhou, Yifan and Ben Amor, Heni and Fox, Dieter and Senanayake, Ransalu},
            booktitle={RSS Workshop on Out-of-Distribution Generalization in Robotics},
            year={2025}
          }
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
Som Sagar, Aditya Taparia, Ransalu Senanayake
International Conference on Machine Learning (ICML), 2024 — spotlight (top 3.5%)
We propose a deep RL method to efficiently explore and map the failure landscape of large-scale vision and language models, then restructure it using limited human feedback to mitigate unwanted behaviors such as accuracy drops, social biases, and misalignment.
PDF · Video · Code · BibTeX
@inproceedings{sagar2024failures,
            title={Failures are fated, but can be faded: characterizing and mitigating unwanted behaviors in large-scale vision and language models},
            author={Sagar, Som and Taparia, Aditya and Senanayake, Ransalu},
            booktitle={Proceedings of the 41st International Conference on Machine Learning},
            pages={42999--43023},
            year={2024}
          }
ExpressivityArena: Can LLMs Express Information Implicitly?
Joshua Tint, Som Sagar, Aditya Taparia, Caleb Liu, Kelly Raines, Bimsara Pathiraja, Ransalu Senanayake
NeurIPS Workshop on Behavioral Machine Learning, 2024
We introduce an information-theoretic framework for evaluating how well LLMs can implicitly communicate tone, emotion, identity, and intent without explicit mention, revealing that models excel at affective content but lag behind humans on sociolinguistic signals.
PDF · BibTeX
@inproceedings{tint2024expressivityarena,
            title={ExpressivityArena: Can LLMs Express Information Implicitly?},
            author={Tint, Joshua and Sagar, Som and Taparia, Aditya and Liu, Caleb and Raines, Kelly and Pathiraja, Bimsara and Senanayake, Ransalu},
            booktitle={NeurIPS 2024 Workshop on Behavioral Machine Learning},
            year={2024}
          }
LLM-Assisted Red Teaming of Diffusion Models through "Failures Are Fated, But Can Be Faded"
Som Sagar, Aditya Taparia, Ransalu Senanayake
NeurIPS Workshop on Red Teaming GenAI, 2024
We extend the Failures Are Fated framework to diffusion models by introducing LLM-generated rewards and states, action screening inspired by design of experiments, and a comparison of DQN, PPO, and A2C for red teaming text-to-image models.
PDF · BibTeX
@inproceedings{sagar2024llm,
title={LLM-Assisted Red Teaming of Diffusion Models through ``Failures Are Fated, But Can Be Faded''},
            author={Sagar, Som and Taparia, Aditya and Senanayake, Ransalu},
            booktitle={Red Teaming GenAI: What Can We Learn from Adversaries?},
            year={2024}
          }