I am a computer science researcher at The University of North Carolina at Chapel Hill. My research focuses on embodied agentic systems and computer vision, with a particular emphasis on building intelligent agents that can perceive, reason about, and interact with the physical world. I am interested in how large vision-language models can be leveraged to enable embodied AI agents to perform complex tasks in dynamic environments.

🔥 News

  • 2026.03:  🎉🎉 Prune-Then-Plan accepted to CVPR 2026 Findings!
  • 2025.05:  🎉🎉 Released VIN-NBV on arXiv!
  • 2024.05:  🎉🎉 Accepted to the PhD program in Computer Science at UNC Chapel Hill!
  • 2024.05:  🎉🎉 Completed Master of Science in Computer Science at UNC!
  • 2024.04:  🎉🎉 Monitor Illumination paper accepted to the CVPR 2024 Workshop on Multimedia Forensics!
  • 2023.12:  🎉🎉 Completed Bachelor of Science in Computer Science at UNC!
  • 2023.01:  🎉🎉 Started Master of Science in Computer Science at UNC!

📝 Publications

CVPR 2026 Findings

Prune-Then-Plan: Step-Level Calibration for Stable Frontier Exploration in Embodied Question Answering

Noah Frahm, Prakrut Patel, Yue Zhang, Shoubin Yu, Mohit Bansal, Roni Sengupta

  • We propose Prune-Then-Plan, a framework that stabilizes VLM-driven embodied exploration through step-level calibration. Our method prunes implausible frontier choices using a Holm-Bonferroni-inspired pruning procedure and delegates final decisions to a coverage-based planner, achieving up to 49% and 33% relative improvements in visually grounded SPL and LLM-Match metrics, respectively.
arXiv 2025

VIN-NBV: A View Introspection Network for Next-Best-View Selection

Noah Frahm, Dongxu Zhao, Andrea Dunn Beltran, Ron Alterovitz, Jan-Michael Frahm, Junier Oliva, Roni Sengupta

  • We introduce the View Introspection Network (VIN), a lightweight neural network that predicts the Relative Reconstruction Improvement of a potential next viewpoint without making new acquisitions. VIN-NBV achieves ~30% gain in reconstruction quality over coverage-based criteria and outperforms deep RL methods by ~40%.
CVPR 2024 Workshop on Multimedia Forensics

Building Secure and Engaging Video Communication by Using Monitor Illumination

Jun Myeong Choi, Johnathan Chi-Ho Leung, Noah Frahm, Max Christman, Gedas Bertasius, Roni Sengupta

  • We use light reflected from the monitor onto a participant's face to detect whether a person in a video call is live (monitor illumination present) or a deepfake (illumination absent).

📖 Education

  • 2020.08 - 2023.12, Bachelor of Science in Computer Science, The University of North Carolina at Chapel Hill
  • 2023.01 - 2024.05, Master of Science in Computer Science, The University of North Carolina at Chapel Hill
  • 2024.05 - present, Doctor of Philosophy in Computer Science, The University of North Carolina at Chapel Hill

💻 Internships

  • Summer 2024, Applied Scientist, EveryPoint
  • Summer 2023, Software Engineering Intern, Capital One
  • Summer 2022, Software Engineering Intern, Capital One