STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory

Mingfeng Yuan1, Hao Zhang2, Mahan Mohammadi1, Runhao Li1, Jinjun Shan2, Steven L. Waslander1
University of Toronto1, York University2

Key Features

  • Long-Horizon Multimodal Robot Memory (OmniMem). We introduce a unified, task-agnostic memory that integrates 3D primitives (geometry and semantics), temporally aligned video captions (dynamic scene descriptions), and keyframe visual memory, enabling joint spatial, temporal, and semantic reasoning over long-duration robot experiences (see the illustrative sketch after this list).
  • Scalable Task-Conditioned Retrieval via Information Bottleneck (STaR). STaR applies the Information Bottleneck principle to distill a compact, non-redundant, and information-rich subset of memories tailored to a given task, avoiding the inefficiency and hallucination risks of naïve Retrieval-Augmented Generation (RAG).
  • Agentic RAG for Planning, Retrieval, and Reasoning. We propose an agentic workflow in which an MLLM autonomously plans search strategies, issues memory retrieval calls, and reasons over STaR-distilled evidence, enabling precise answers and reliable execution for navigation and downstream robotic actions.
  • Extensive Evaluation and Real-Robot Deployment. STaR is evaluated on long-horizon navigation VQA benchmarks, including NaVQA (campus-scale indoor/outdoor scenes) and WH-VQA, a warehouse benchmark built in Isaac Sim containing many visually similar objects, and is further validated through end-to-end deployment on a real Husky mobile robot.
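
For concreteness, the sketch below shows one way the three OmniMem databases could be laid out in code. All class names and fields are illustrative assumptions for this page, not the released data format.

```python
from dataclasses import dataclass, field

@dataclass
class CaptionEntry:
    """Temporally aligned video caption (dynamic scene description)."""
    t_start: float             # segment start timestamp (s)
    t_end: float               # segment end timestamp (s)
    text: str                  # natural-language caption
    object_ids: list[int]      # 3D primitives mentioned in the caption

@dataclass
class Primitive3D:
    """3D primitive carrying geometry and semantics."""
    object_id: int
    label: str                                  # open-vocabulary semantic label
    centroid: tuple[float, float, float]        # position in the map frame (m)

@dataclass
class Keyframe:
    """Keyframe visual memory for fine-grained visual checks."""
    timestamp: float
    image_path: str            # path to the stored RGB frame
    pose: list[float]          # camera pose (x, y, z, qx, qy, qz, qw)

@dataclass
class OmniMem:
    """Unified, task-agnostic multimodal memory (illustrative layout)."""
    captions: list[CaptionEntry] = field(default_factory=list)
    primitives: dict[int, Primitive3D] = field(default_factory=dict)
    keyframes: list[Keyframe] = field(default_factory=list)
```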

🎥 STaR Demo Videos

Isaac Sim (Warehouse)

Real Robot Deployment

🧠 Method Overview

STaR System Overview

STaR System Overview. Our framework consists of three stages. (Left) Memory construction: the robot records RGB and posed depth data to build a multimodal memory composed of three complementary databases -- video captions, 3D primitives, and visual keyframes -- jointly forming OmniMem. (Middle) User query and reasoning: given text or multimodal queries, an agentic planner (MLLM) retrieves task-relevant memories through an Information Bottleneck, performs contextual reasoning, and outputs structured answers (location, time, or description). (Right) Evaluation: we evaluate STaR on both the NaVQA dataset (campus) and the WH-VQA dataset (warehouse), which cover spatial, temporal, and descriptive question types across short-, medium-, and long-term memory settings. The evaluation examines three key capabilities: long-horizon cross-modal memory construction, task-conditioned memory retrieval, and contextual reasoning. We also validate multimodal query and navigation tasks in a warehouse simulated with Isaac Sim.
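
As a rough illustration of the middle stage, the loop below shows how an MLLM planner might alternate between planning retrieval calls and reasoning over distilled evidence. `plan_search`, `retrieve`, `distill`, and `answer` are hypothetical interfaces assumed for this sketch, not the released API.

```python
def agentic_query(query, memory, mllm, distill, max_rounds=3):
    """Hypothetical agentic RAG loop over OmniMem (all interfaces assumed).

    memory.retrieve(db, keys)        -> list of raw memory entries
    mllm.plan_search(query, evidence) -> plan with .done, .db, .keys
    mllm.answer(query, evidence)      -> structured answer (location / time / description)
    distill(query, entries)           -> compact, non-redundant evidence subset (STaR)
    """
    evidence = []
    for _ in range(max_rounds):
        plan = mllm.plan_search(query, evidence)   # decide what to look up next
        if plan.done:                              # planner judges evidence sufficient
            break
        raw = memory.retrieve(plan.db, plan.keys)  # query captions / primitives / keyframes
        evidence.extend(distill(query, raw))       # keep only task-relevant, non-redundant items
    return mllm.answer(query, evidence)            # contextual reasoning -> structured output
```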

STaR Task-Conditioned Retrieval

Task-conditioned retrieval and contextual reasoning. Given an open-ended user query, STaR embeds task cues and queries the memory database to retrieve relevant video captions with timestamps and associated detected objects (caption-induced primitives). These retrieved cues define a task-specific working set of 3D primitives, over which STaR applies an Information Bottleneck–based clustering to merge neighboring primitives into compact, task-relevant groups. Captions are then grouped by cluster, and a single representative caption is selected from each group to form a non-redundant evidence set. When necessary, the robot further loads keyframe images to resolve fine-grained visual details, enabling contextual reasoning and the generation of actionable outputs, such as object locations, shelf indices, and navigation targets for downstream tasks.
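
The distillation step can be pictured as below: cluster the caption-induced primitives by spatial proximity, group captions by cluster, and keep one representative caption per group. This is a simplified, greedy stand-in for the paper's Information Bottleneck objective; the radius, data layout, and function name are illustrative assumptions.

```python
import numpy as np

def distill_evidence(primitives, captions, query_emb, radius=1.5):
    """Simplified sketch of STaR-style evidence distillation (not the exact IB objective).

    primitives: dict {object_id: (x, y, z) centroid in the map frame}
    captions:   list of (embedding, text, object_ids) tuples
    query_emb:  unit-normalized query embedding
    Returns one representative caption per primitive cluster.
    """
    # 1. Greedy spatial clustering of the task-specific working set of primitives.
    clusters, centers = {}, []
    for oid, p in primitives.items():
        p = np.asarray(p, dtype=float)
        dists = [np.linalg.norm(p - c) for c in centers]
        if dists and min(dists) < radius:
            clusters[oid] = int(np.argmin(dists))   # merge into the nearest cluster
        else:
            clusters[oid] = len(centers)            # start a new cluster
            centers.append(p)

    # 2. Group captions by cluster and keep the most query-relevant one per group.
    best = {}  # cluster_id -> (relevance, caption_text)
    for emb, text, oids in captions:
        relevance = float(np.dot(emb, query_emb))
        for cid in {clusters[o] for o in oids if o in clusters}:
            if cid not in best or relevance > best[cid][0]:
                best[cid] = (relevance, text)

    # 3. Non-redundant evidence set: one representative caption per cluster.
    return [text for _, text in best.values()]
```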

🚘 On-Device Deployment

STaR Demo

STaR deployed on a Husky robot for indoor and outdoor experiments, supporting both text-based and multimodal queries.

Citation

If you find this work helpful, please consider citing:

@article{Yuan2026STaR,
      title={STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory}, 
      author={Mingfeng Yuan and Hao Zhang and Mahan Mohammadi and Runhao Li and Jinjun Shan and Steven L. Waslander},
      year={2026},
      eprint={2602.09255},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2602.09255}, 
}