Projects

Research


Bachelor's Thesis · IIT KGP

Multi-Task RL Representations

B.Tech Thesis, Aug 2024 – May 2025 · advised by Aritra Hazra & Naveen Kumar Garg

Explored unsupervised pretraining for task-agnostic representations in multi-task RL; reproduced AVDC results on the impact of shared image-space representations for multi-task learning.

Google DeepMind

VideoAgent — Self-Improving Video Generation

Research Intern, Jan–Nov 2024 · with Sherry Yang & Bo Dai

Proposed VideoAgent to self-refine robotic video plans using pretrained VLM feedback and online replanning. Raised MetaWorld success from 43.1% to 50% and iTHOR success from 31.3% to 34.2%, and gained +22% task acceptance in human evals on BridgeV2.

RISS · CMU

Offline RL from Vision-Language-Model Feedback

RISS Scholar, Jun–Aug 2024 · R-Pad Lab, CMU — with David Held, Zackory Erickson & Yufei Wang

Built an RL system that uses VLMs to generate reward functions from offline, unlabelled datasets. Achieved a 40%+ gain in success rate and a state-of-the-art 0.82 on a real-world assistive-dressing task in out-of-distribution settings.

AGV · IIT KGP

SLAM & RL for F1-Tenth Autonomous Cars

Undergrad Researcher & Executive Head (25-person lab), Jul 2022 – Jul 2024 · with Debashish Chakravarty

Built SLAM pipelines (ICP + KD-tree local mapping on CARLA point clouds) that improved localization accuracy by 20 cm, and led a 25-student RL & vision team across 3+ international competitions.

RBCDSAI · IIT Madras

Sim2Real Visual Domain Adaptation for CARLA

Research Intern, May–Dec 2023 · with Balaraman Ravindran, RBCDSAI

Developed visual domain adaptation for CARLA using Stable Diffusion models conditioned on simulator images with uniform domain randomization, to learn a generalized control policy robust to the sim2real gap.

Competitions


Inter IIT 13.0 · Gold

Multi-Agent Swarm Navigation

Gold Medal · Captain, IIT Kharagpur contingent · Nov–Dec 2024

Led the team to gold with an Active-SLAM + YOLO-World goal-detection pipeline, Hungarian task-agent matching over a sparse bipartite graph, and PRIMAL-2 with hindsight replay for map-agnostic multi-agent path planning.

NeurIPS 2023 TOTO · Gold

DiffClone — Diffusion-Driven Behaviour Cloning

Gold-winning submission · Train Offline Test Online (TOTO) Workshop, NeurIPS 2023

Built DiffClone for offline, sparse-reward robotic control: a MoCo-fine-tuned ResNet-50 encoder + DDPM behaviour-cloning agent reaching a 92% success rate on pouring, surpassing existing benchmarks.

Inter IIT 11.0 · Silver

Domain-Specific Question Answering

Event Silver · Inter IIT Tech Meet 11.0, Jan 2022

Closed-domain QA on SQuAD-like data with T5/GPT-3 augmentations and a faster DrQA retriever; an ELECTRA-BERT ensemble hit 0.85 F1 with a 2.65× runtime improvement via ONNX + quantization.