Projects
Research
Multi-Task RL Representations
B.Tech Thesis, Aug 2024 – May 2025 · advised by Aritra Hazra & Naveen Kumar Garg
Explored unsupervised pretraining for task-agnostic representations in multi-task RL; reproduced AVDC results on the impact of shared image-space representations for multi-task learning.
VideoAgent — Self-Improving Video Generation
Research Intern, Jan–Nov 2024 · with Sherry Yang & Bo Dai
Proposed VideoAgent to self-refine robotic video plans using pretrained VLM feedback and online replanning. Raised MetaWorld success 43.1% → 50% and iTHOR success 31.3% → 34.2%, with a +22% gain in task acceptance in human evals on BridgeV2.
Offline RL from Vision-Language-Model Feedback
RISS Scholar, Jun–Aug 2024 · R-Pad Lab, CMU — with David Held, Zackory Erickson & Yufei Wang
Built an RL system that uses VLMs to generate reward functions from offline, unlabelled datasets. Achieved a 40%+ gain in success rate and a state-of-the-art 0.82 on a real-world assistive-dressing task in out-of-distribution settings.
SLAM & RL for F1-Tenth Autonomous Cars
Undergrad Researcher & Executive Head (25-person lab), Jul 2022 – Jul 2024 · with Debashish Chakravarty
Built SLAM pipelines (ICP + KD-tree local mapping on CARLA point clouds) that improved localization accuracy by 20 cm, and led a 25-student RL & vision team across 3+ international competitions.
Sim2Real Visual Domain Adaptation for CARLA
Research Intern, May–Dec 2023 · with Balaraman Ravindran, RBCDSAI
Developed visual domain adaptation for CARLA using Stable Diffusion models conditioned on simulator images with uniform domain randomization, to learn a generalized control policy robust to the sim2real gap.
Competitions
Multi-Agent Swarm Navigation
Gold Medal · Captain, IIT Kharagpur contingent · Nov–Dec 2024
Led the team to gold with an Active-SLAM + YOLO-World goal-detection pipeline, Hungarian task-agent matching over a sparse bipartite graph, and PRIMAL-2 with hindsight replay for map-agnostic multi-agent path planning.
DiffClone — Diffusion-Driven Behaviour Cloning
Gold-winning submission · Train Offline Test Online (TOTO) Workshop, NeurIPS 2023
Built DiffClone for offline, sparse-reward robotic control: a MoCo-fine-tuned ResNet-50 encoder + DDPM behaviour-cloning agent reaching a 92% success rate on pouring, surpassing existing baselines.
Domain-Specific Question Answering
Event Silver · Inter IIT Tech Meet 11.0, Jan 2022
Built closed-domain QA on SQuAD-like data with T5/GPT-3 augmentations and a 3× faster DrQA retriever; an ELECTRA-BERT ensemble hit 0.85 F1 with a 2.65× runtime improvement via ONNX + quantization.
