Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images

ECCV 2022


Yuan Liu¹, Yilin Wen¹, Sida Peng², Cheng Lin³, Xiaoxiao Long¹, Taku Komura¹, Wenping Wang⁴

¹The University of Hong Kong    ²Zhejiang University    ³Tencent    ⁴Texas A&M University

Abstract


Gen6D predicts the poses of unseen objects in RGB images from reference images of the object.

In this paper, we present Gen6D, a generalizable model-free 6-DoF object pose estimator. Existing generalizable pose estimators either need high-quality object models or require additional depth maps or object masks at test time, which significantly limits their application scope. In contrast, our pose estimator requires only a set of posed images of the unseen object and can accurately predict the pose of the object in arbitrary environments. Gen6D consists of an object detector, a viewpoint selector, and a pose refiner, none of which requires a 3D object model and all of which generalize to unseen objects. Experiments show that Gen6D achieves state-of-the-art results on two model-free datasets: the MOPED dataset and GenMOP, a new dataset collected by us. In addition, on the LINEMOD dataset, Gen6D achieves competitive results compared with instance-specific pose estimators.
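To give a feel for what selecting a viewpoint from posed reference images can mean, here is a minimal sketch. It is not the Gen6D selector: it swaps in a plain cosine-similarity nearest-neighbour search over per-image feature vectors, and the names `select_viewpoint`, `query_feat`, `ref_feats`, and `ref_rotations` are hypothetical. How the feature vectors are obtained is left open.

```python
import numpy as np

def select_viewpoint(query_feat, ref_feats, ref_rotations):
    """Pick the reference view whose feature is most similar to the query.

    Simplified stand-in (assumption): cosine similarity on fixed feature
    vectors, not the learned selector described in the paper.
    query_feat:    (D,)      feature of the query image
    ref_feats:     (N, D)    features of the N reference images
    ref_rotations: (N, 3, 3) known rotations of the reference images
    """
    q = query_feat / np.linalg.norm(query_feat)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    scores = r @ q                      # cosine similarity per reference view
    best = int(np.argmax(scores))
    return best, ref_rotations[best]    # coarse rotation guess for the query

# Toy usage with made-up 2-D features and placeholder rotations.
ref_feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ref_rotations = np.stack([np.eye(3) * s for s in (1.0, 2.0, 3.0)])
idx, R_init = select_viewpoint(np.array([0.1, 0.9]), ref_feats, ref_rotations)
```

In this toy setup the rotation of the selected reference view would serve only as a coarse starting point; a refinement stage would still be needed to reach an accurate pose.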


Comparison


Both DeepIM and Gen6D are trained on the same training dataset and generalize to these unseen objects. Gen6D generalizes better than DeepIM because it uses a feature-volume-based refiner.
PVNet is trained on the object itself using only the reference images (about 200), which are too few to train a PVNet for accurate pose estimation.


Application


A simple AR application: with the known poses, we can render an adorable Dodoco in place of the cute Lulu Piggy. Gen6D requires neither the object model nor the object mask. By simply capturing reference images of an unseen object with a cellphone and recovering the poses of the reference images with COLMAP, Gen6D can predict the object pose in arbitrary query images. Thus, Gen6D can easily be applied to everyday objects in AR/VR applications.
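As a rough illustration of why a known pose is enough to overlay virtual content, the sketch below projects 3D points of a virtual object into the image with a standard pinhole camera model. It assumes the pose is given as an object-to-camera rotation `R` and translation `t`, that the intrinsics `K` are known, and that lens distortion is ignored. The function name `project_points` and all numeric values are made up for illustration and are not taken from the Gen6D code.

```python
import numpy as np

def project_points(points_obj, R, t, K):
    """Project (N, 3) object-frame points to (N, 2) pixel coordinates.

    Assumes x_cam = R @ x_obj + t and an undistorted pinhole camera.
    """
    points_cam = points_obj @ R.T + t      # object frame -> camera frame
    uvw = points_cam @ K.T                 # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]        # perspective division

# Toy example: identity rotation, object placed 2 units in front of the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
points = np.array([[0.0, 0.0, 0.0],
                   [0.1, 0.0, 0.0]])
pixels = project_points(points, R, t, K)
# pixels is [[320, 240], [345, 240]]
```

A renderer would apply the same transform to every vertex of the virtual model, so the replacement object follows the real one as the estimated pose changes from frame to frame.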


Citation


@inproceedings{liu2022gen6d,
  title={Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images},
  author={Liu, Yuan and Wen, Yilin and Peng, Sida and Lin, Cheng and Long, Xiaoxiao and Komura, Taku and Wang, Wenping},
  booktitle={ECCV},
  year={2022}
}