Active 6D Pose Estimation for Textureless Objects using Multi-View RGB Frames

Jun Yang1,2, Wenjie Xue2, Sahar Ghavidel2, Steven L. Waslander1
1University of Toronto Institute for Aerospace Studies and Robotics Institute, 2Epson Canada Ltd.
[Figure: Pipeline overview]

In this work, we present a complete framework for multi-view pose estimation and next-best-view prediction for textureless objects.

Abstract

Estimating the 6D pose of textureless objects from RGB images is an important problem in robotics. Due to appearance ambiguities, rotational symmetries, and severe occlusions, single-view 6D pose estimators are still unable to handle a wide range of objects. This motivates research on multi-view pose estimation and next-best-view prediction to address these limitations.

In this work, we propose a comprehensive active perception framework for estimating the 6D poses of textureless objects using only RGB images. Our approach is built upon a key idea: decoupling the 6D pose estimation into a two-step sequential process can greatly improve both accuracy and efficiency. First, we estimate the 3D translation of each object, resolving scale and depth ambiguities inherent to RGB images. These estimates are then used to simplify the subsequent task of determining the 3D orientation, which we achieve through canonical scale template matching. Building on this formulation, we then introduce an active perception strategy that predicts the next best camera viewpoint to capture an RGB image, effectively reducing object pose uncertainty and enhancing pose accuracy.
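To illustrate why multiple views resolve the depth ambiguity of a single RGB image, the sketch below triangulates an object's 3D center from its 2D center detections in several calibrated views, using the standard direct linear transform (DLT). This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_projection` and `triangulate_center` are hypothetical, and it assumes known camera intrinsics `K`, known world-to-camera poses `(R, t)`, and one 2D center detection per view.

```python
import numpy as np


def make_projection(K, R, t):
    """Build a 3x4 projection matrix P = K [R | t] (world-to-camera pose assumed)."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])


def triangulate_center(projections, centers_2d):
    """Triangulate one 3D point from N >= 2 views with the linear DLT method.

    projections: list of 3x4 projection matrices, one per view.
    centers_2d:  list of (u, v) pixel coordinates of the object center.
    Returns the 3D point in world coordinates as a length-3 array.
    """
    rows = []
    for P, (u, v) in zip(projections, centers_2d):
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The least-squares solution is the right singular vector associated with
    # the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


if __name__ == "__main__":
    # Hypothetical setup: two cameras with identical intrinsics, 20 cm baseline.
    K = np.array([[600.0, 0.0, 320.0],
                  [0.0, 600.0, 240.0],
                  [0.0, 0.0, 1.0]])
    P1 = make_projection(K, np.eye(3), np.zeros(3))
    P2 = make_projection(K, np.eye(3), np.array([-0.2, 0.0, 0.0]))

    # Project a known point to simulate noise-free 2D center detections.
    X_true = np.array([0.05, -0.02, 0.8])
    detections = []
    for P in (P1, P2):
        x = P @ np.append(X_true, 1.0)
        detections.append(x[:2] / x[2])

    print(triangulate_center([P1, P2], detections))
```

With the translation fixed in this way, the remaining orientation search no longer has to account for unknown object scale in the image, which is the simplification the two-step formulation relies on.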

We evaluate our method on the public ROBI and TOD datasets. Additionally, we reconstructed a challenging transparent object dataset and created a large-scale synthetic dataset, corresponding to ROBI and our transparent dataset, which we use to train the network. Using the same camera viewpoints, our multi-view pose estimation significantly outperforms state-of-the-art approaches. Moreover, with our next-best-view strategy, our approach achieves high pose accuracy with fewer viewpoints than heuristic-based policies.

Qualitative Results

[Figure: Result visualization]

BibTeX

@article{yang2025active6dposeestimation,
      title={Active 6D Pose Estimation for Textureless Objects using Multi-View RGB Frames},
      author={Jun Yang and Wenjie Xue and Sahar Ghavidel and Steven L. Waslander},
      year={2025},
}