Where2Act: From Pixels to Actions for Articulated 3D Objects

Mo, Kaichun; Guibas, Leonidas; Mukadam, Mustafa; Gupta, Abhinav; Tulsiani, Shubham

Computer Science > Computer Vision and Pattern Recognition

arXiv:2101.02692 (cs)

[Submitted on 7 Jan 2021 (v1), last revised 10 Aug 2021 (this version, v2)]

Abstract: One of the fundamental goals of visual perception is to allow agents to meaningfully interact with their environment. In this paper, we take a step towards that long-term goal: we extract highly localized actionable information related to elementary actions, such as pushing or pulling, for articulated objects with movable parts. For example, given a drawer, our network predicts that applying a pulling force on the handle opens the drawer. We propose, discuss, and evaluate novel network architectures that, given image and depth data, predict the set of actions possible at each pixel and the regions over articulated parts that are likely to move under the force. We propose a learning-from-interaction framework with an online data sampling strategy that allows us to train the network in simulation (SAPIEN) and generalizes across categories. Check the website for code and data release: this https URL

Comments:	accepted to ICCV 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2101.02692 [cs.CV]
	(or arXiv:2101.02692v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2101.02692

Submission history

From: Kaichun Mo
[v1] Thu, 7 Jan 2021 18:56:38 UTC (6,216 KB)
[v2] Tue, 10 Aug 2021 18:06:25 UTC (5,535 KB)
