DIFFUSION POLICY


VISUOMOTOR POLICY LEARNING VIA ACTION DIFFUSION

This paper introduces Diffusion Policy, a new way of generating robot behavior
by representing a robot's visuomotor policy as a conditional denoising diffusion
process. We benchmark Diffusion Policy across 12 different tasks from 4
different robot manipulation benchmarks and find that it consistently
outperforms existing state-of-the-art robot learning methods with an average
improvement of 46.9%. Diffusion Policy learns the gradient of the
action-distribution score function and iteratively optimizes with respect to
this gradient field during inference via a series of stochastic Langevin
dynamics steps. We find that the diffusion formulation yields powerful
advantages when used for robot policies, including gracefully handling
multimodal action distributions, being suitable for high-dimensional action
spaces, and exhibiting impressive training stability. To fully unlock the
potential of diffusion models for visuomotor policy learning on physical robots,
this paper presents a set of key technical contributions including the
incorporation of receding horizon control, visual conditioning, and the
time-series diffusion transformer. We hope this work will help motivate a new
generation of policy learning techniques that are able to leverage the powerful
generative modeling capabilities of diffusion models.
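
The iterative denoising described above can be illustrated with a minimal sketch. It starts from Gaussian noise and repeatedly applies an update of the form A_{k-1} = alpha * (A_k - gamma * eps(O, A_k, k) + noise), a stochastic Langevin-style step. Everything named here is an assumption made for illustration: `toy_noise_predictor` is a hypothetical stand-in for a trained noise-prediction network, and the constant `alpha`, `gamma`, and `sigma` values replace what would normally be a step-dependent noise schedule.

```python
import numpy as np


def toy_noise_predictor(obs, actions, k):
    """Hypothetical stand-in for a trained network eps(O, A, k).

    It treats the observation as a target action, so the predicted
    "noise" is simply the offset of each action from that target.
    """
    target = np.broadcast_to(obs, actions.shape)
    return actions - target


def sample_actions(obs, horizon=16, action_dim=2, num_steps=50,
                   alpha=1.0, gamma=0.1, sigma=0.01, seed=0):
    """Denoise a random action sequence conditioned on an observation."""
    rng = np.random.default_rng(seed)
    # Start from pure Gaussian noise over the whole action sequence.
    actions = rng.standard_normal((horizon, action_dim))
    for k in range(num_steps, 0, -1):
        eps = toy_noise_predictor(obs, actions, k)
        # Inject noise on every step except the last one.
        noise = sigma * rng.standard_normal(actions.shape) if k > 1 else 0.0
        actions = alpha * (actions - gamma * eps + noise)
    return actions


obs = np.array([0.5, -0.3])
actions = sample_actions(obs)
print(actions.shape)  # (16, 2)
```

With this toy predictor the sampled sequence settles close to the observation-derived target; a learned predictor would instead shape the sequence toward the demonstrated action distribution.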

--------------------------------------------------------------------------------


HIGHLIGHTS

Diffusion Policy learns multi-modal behavior and commits to only one mode within
each rollout. LSTM-GMM and IBC are biased toward one mode, while BET failed to
commit.
Diffusion Policy predicts a sequence of actions for receding-horizon control.
The Mug Flipping task requires the policy to predict smooth 6 DoF actions while
operating close to kinematic limits.
Toward making 🍕: The sauce pouring and spreading task manipulates liquid with 6
DoF and periodic actions.
In our Push-T experiments, Diffusion Policy is highly robust against
perturbations and visual distractions.
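
Receding-horizon control, mentioned in the highlights above, can be sketched as a loop that predicts a long action sequence, executes only its first few steps, and then re-plans from fresh observations. The sketch below is a toy under stated assumptions: `toy_policy` is a hypothetical placeholder for the diffusion sampler, the environment is a simple additive point mass, and the horizon values (2 observations, 16 predicted steps, 8 executed steps) are illustrative defaults rather than confirmed settings.

```python
from collections import deque

import numpy as np


def toy_policy(obs_history, pred_horizon):
    """Hypothetical policy: predict small steps toward the origin."""
    latest = obs_history[-1]
    return np.tile(-0.1 * latest, (pred_horizon, 1))


def run_receding_horizon(policy, state, obs_horizon=2, pred_horizon=16,
                         action_horizon=8, max_steps=40):
    """Predict pred_horizon actions, execute action_horizon, then re-plan."""
    obs_history = deque([state.copy()] * obs_horizon, maxlen=obs_horizon)
    n_replans = 0
    step = 0
    while step < max_steps:
        plan = policy(list(obs_history), pred_horizon)
        n_replans += 1
        # Execute only the first part of the plan before re-planning.
        for action in plan[:action_horizon]:
            state = state + action  # toy dynamics: the action is a displacement
            obs_history.append(state.copy())
            step += 1
            if step >= max_steps:
                break
    return state, n_replans


start = np.array([1.0, -2.0])
final_state, n_replans = run_receding_horizon(toy_policy, start)
print(n_replans)  # 5
```

Executing several actions per plan keeps motion temporally consistent, while re-planning at a fixed interval lets the policy react to new observations.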

--------------------------------------------------------------------------------


SIMULATION BENCHMARKS

Diffusion Policy outperforms prior state-of-the-art on 12 tasks across 4
benchmarks with an average success-rate improvement of 46.9%. Check out our
paper for further details!

Lift 1
Can 1
Square 1
Tool Hang 1
Transport 1
Push-T 2
Block Pushing 2,3
Franka Kitchen 3,4

Standardized simulation benchmarks are essential for this project's development.
Special shoutout to the authors of these projects for open-sourcing their
simulation environments:
1 Robomimic
2 Implicit Behavior Cloning
3 Behavior Transformer
4 Relay Policy Learning

--------------------------------------------------------------------------------


CODE AND DATA

Sim & Real Repo
Experiment Data
State-based Notebook
Vision-based Notebook

--------------------------------------------------------------------------------


PAPER

Robotics: Science and Systems (RSS) 2023: arXiv:2303.04137v4 [cs.RO]
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin
Burchfiel, Shuran Song

The International Journal of Robotics Research (IJRR) 2024: arXiv:2303.04137v5
[cs.RO]
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin
Burchfiel, Russ Tedrake, Shuran Song


BIBTEX

@inproceedings{chi2023diffusionpolicy,
	title={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
	author={Chi, Cheng and Feng, Siyuan and Du, Yilun and Xu, Zhenjia and Cousineau, Eric and Burchfiel, Benjamin and Song, Shuran},
	booktitle={Proceedings of Robotics: Science and Systems (RSS)},
	year={2023}
}

@article{chi2024diffusionpolicy,
	author = {Cheng Chi and Zhenjia Xu and Siyuan Feng and Eric Cousineau and Yilun Du and Benjamin Burchfiel and Russ Tedrake and Shuran Song},
	title ={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
	journal = {The International Journal of Robotics Research},
	year = {2024},
}


--------------------------------------------------------------------------------


TEAM

Cheng Chi 1
Zhenjia Xu 1
Siyuan Feng 2
Eric Cousineau 2
Yilun Du 3
Benjamin Burchfiel 2
Russ Tedrake 2,3
Shuran Song 1
1 Columbia University
2 Toyota Research Institute
3 MIT

--------------------------------------------------------------------------------


REAL WORLD PUSH-T TASK

In this task, the robot needs to
① precisely push the T-shaped block into the target region, and
② move the end-effector to the end-zone which terminates the episode.
First row: Average of end-states for each method. Second row: Example rollout
episode.

Diffusion Policy (End-to-end): Success.
Diffusion Policy (R3M): Success after initially getting stuck near the T block.
LSTM-GMM (End-to-end): Common failure mode is getting stuck near the T block.
IBC (End-to-end): Common failure mode is entering the end-zone prematurely.
Diffusion Policy remains robust against:
(0:05) Occlusion caused by waving a hand in front of the camera.
(0:11) Perturbation during pushing stage ①.
(0:39) Perturbation during finishing stage ②.

--------------------------------------------------------------------------------


REAL WORLD MUG FLIPPING TASK

In this task, the robot needs to
① Pick up a randomly placed mug and place it lip down (marked orange).
② Rotate the mug such that its handle is pointing left.

Diffusion Policy
LSTM-GMM

--------------------------------------------------------------------------------


REAL WORLD SAUCE POURING AND SPREADING TASK

In the sauce pouring task, the robot needs to: ① Dip the ladle to scoop sauce
from the bowl, ② approach the center of the pizza dough, ③ pour sauce, and ④
lift the ladle to finish the task.
In the sauce spreading task, the robot needs to: ① Approach the center of the
sauce with a grasped spoon, ② spread the sauce to cover the pizza in a spiral
pattern, and ③ lift the spoon to finish the task.

Diffusion Policy
LSTM-GMM

--------------------------------------------------------------------------------


ACKNOWLEDGEMENTS

This work was supported in part by NSF Awards 2037101 and 2132519, and by
Toyota Research Institute. We would like to thank Google for the UR5 robot
hardware. The views and conclusions contained herein are those of the authors
and should not be interpreted as necessarily representing the official policies,
either expressed or implied, of the sponsors.

--------------------------------------------------------------------------------


CONTACT

If you have any questions, please feel free to contact Cheng Chi.

--------------------------------------------------------------------------------