Ishita Gupta

I'm a second-year Master's student in Robotic Systems Development (MRSD) at Carnegie Mellon University, specializing in Robotics and Deep Learning. Previously, I was at Addverb for 2.5 years, where I engineered autonomous navigation systems, physics-based simulators, and multi-robot applications. I also interned at Google on the Nest Devices team, automating cloud infrastructure pipelines.

I'm passionate about building intelligent systems that perceive, reason, and act in complex, dynamic environments. My focus is on Deep Learning, and I am currently working with 3D Vision and Vision-Language Agents (VLAs). I'm especially interested in long-horizon planning problems, with applications in robotics and the broader AI landscape.

Email / LinkedIn / GitHub

News

Oct 2025 Secured 3rd place in the CMU VLA Challenge and will be presenting our work at IROS 2025! 🏆
Sep 2025 Conducted an in-person lab at CMU on Recurrent Neural Networks and GRUs. 📚
Aug 2025 Completed my summer internship at Nissan Advanced Technology Center in the Bay Area, focusing on Humanoid Robotics and fine-tuning robotic foundation models. 🌁
Apr 2025 Demonstrated our Autonomous Humanoid Loco-Manipulation for Tote Logistics Capstone Project. 📦 🦾
Mar 2025 Published tutorial videos on How to Read Research Papers, Python Basics, and Distributed Training for students at CMU. 📚
Aug 2024 Started my Master's at CMU's Robotics Institute! 🤖
Jul 2024 Completed 2.5 years at Addverb as a Robotics Software Engineer. 🚀
May 2022 Completed my undergraduate studies at LNMIIT with a B.Tech in Computer Science and Engineering. 🎓
Aug 2021 Completed my internship at Google as a Software Engineering Intern.

Publications

FALCON: Learning Force-Adaptive Humanoid Loco-Manipulation
Yuanhang Zhang, Yifu Yuan, Prajwal Gurunath, Tairan He, Ishita Gupta, et al., Guanya Shi
In submission

project page / paper / code

TL;DR: FALCON enables various heavy-duty humanoid loco-manipulation tasks via a new dual-agent force-adaptive RL framework.

Industry Experience

Nissan NATC-SV
Robotics Research Intern
Humanoid Robotics Team
Addverb January 2022 - July 2024
Robotics Engineer
Advanced Robotics & Industrial Automation
Google May 2021 - August 2021
Software Engineering Intern
Nest Devices, Cloud Infrastructure

Projects & Research

Autonomous Humanoid Loco-Manipulation for Tote Logistics
CMU MRSD Capstone Project, 2024 - Present
Sponsored by Nissan & Field AI
Advised by Prof. Guanya Shi

project page / code

Built an autonomy system that fuses 6D pose estimation (NVIDIA FoundationPose) with motion-capture localization for precise perception and navigation. This allowed the Unitree G1 to manipulate totes autonomously in real-world factory workflows.

Training Language Models to Self-Correct via Reinforcement Learning
CMU Course Project, 2024

project report

Led benchmarking of self-correction in LLMs, evaluating Llama 3.2 1B, Llama 3.1 8B, and Mathstral 7B on the MATH dataset. Accuracy reached up to 41.8%, with high Correct-to-Incorrect rates (46.9%) and low Incorrect-to-Correct rates (3.78%). Engineered a multi-turn reinforcement learning framework (SCoRe) for fine-tuning.

Spatio-Temporal Transformer for Video Anomaly Detection
CMU, Spring 2024

project report

Re-implemented the LIPINC-V2 Vision Temporal Transformer, a SOTA architecture for detecting deepfakes by analyzing spatio-temporal patterns in video. I first validated my from-scratch model by reproducing the paper's 0.98 AP result on the LipSyncTIMIT benchmark. To expand the project, I then engineered a custom data generation pipeline and used it to create a new 2,300-sample dataset. On this new dataset, I established a CNN-LSTM baseline that achieves 94.9% accuracy.

Indian Classical Music Segmentation Using Machine Learning
LNMIIT B.Tech Project, 2021 - 2022
Advised by Prof. Dr. Sakthi Balan

project page

Developed an onset detection technique to isolate the Percussion Solo section in concert audio, applying clustering algorithms (K-means, DBSCAN) using audio processing libraries.

archive projects →

Education

Carnegie Mellon University
Master of Science in Robotic Systems Development (MRSD)
CGPA: 3.83 | August 2024 - May 2026
Coursework: Deep Reinforcement Learning (10-703), Generative AI (10-623)
The LNM Institute of Information Technology (LNMIIT)
Bachelor of Technology (B.Tech) in Computer Science and Engineering
August 2018 - July 2022
Coursework: Artificial Intelligence, NLP, Advanced Algorithms

Teaching Experience

Carnegie Mellon University
Teaching Assistant - Introduction to Deep Learning (11-785)
Spring 2025, Fall 2025
Course Website / YouTube Channel

Responsibilities:
• Lead recitation sections and office hours for 300+ students
• Mentor teams on Deep Learning projects
• Assist in course development and curriculum refinement
• Create educational content and recorded lectures for online learning
