Research Overview
All experiments use the Trossen AI Stationary Robot
pi0.5 policy for putting a bead on a string, after 2 iterations of policy improvement.
pi0.5 policy for closing a tie wrap, after 1 iteration of policy improvement.
pi0.5 policy improved by augmenting datasets with human interventions.
Not all results are this good, but they are still much better than before DAgger!
pi0.5 learns a multi-prompt sub-task policy, e.g. 'pick up pink cube ...'
Gemini Robotics ER-1.5 high-level (HL) controller chooses subtasks to accomplish 'put all cubes in bucket'.
pi0 policy LoRA-finetuned on a SIMULATED dataset containing only red cubes.
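
The policy-improvement captions above mention augmenting datasets with human interventions (DAgger). A minimal sketch of one way that aggregation step could look is given here. The `Step` type, the `human_intervened` flag, and `aggregate_interventions` are hypothetical names chosen for illustration, and keeping only the human-controlled steps is an assumption, not necessarily the variant used in these experiments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One timestep of a rollout (hypothetical structure for illustration)."""
    observation: List[float]
    action: List[float]
    human_intervened: bool  # True if a human supplied this action


def aggregate_interventions(dataset: List[Step],
                            rollouts: List[List[Step]]) -> List[Step]:
    """Return the original dataset plus every human-corrected step.

    Assumption: only steps where the human took over are added;
    steps driven by the policy itself are discarded.
    """
    corrections = [step
                   for episode in rollouts
                   for step in episode
                   if step.human_intervened]
    return list(dataset) + corrections


# Toy usage: one demonstration step, one rollout with a single intervention.
demos = [Step([0.0], [0.1], True)]
rollout = [Step([0.2], [0.3], False), Step([0.4], [0.5], True)]
augmented = aggregate_interventions(demos, [rollout])
```

Each iteration of policy improvement would then retrain the policy on the augmented dataset and collect new rollouts with it.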