Trossen AI Stationary Robot

1. Hardware

Our Trossen AI Stationary was purchased in May 2025. We are still running Trossen Arm Driver v1.7.8. Our local computer is a System76 desktop running Ubuntu 22.04 with an RTX 5090 GPU.


2. Software

We have started by augmenting and tweaking the gym-aloha environment, as well as the (recently deprecated) Trossen lerobot framework, with the goal of providing seamless sim to sim, sim to real, and real to sim support for the Trossen AI Stationary robot. We have also been tweaking the lerobot software for smoother Trossen AI Stationary real robot dataset acquisition. In addition, we are in the process of adding real and simulated Trossen AI Stationary support to the openpi framework. Our forks are at github.com/anredlich. Highlights:

  • gym-aloha: we added MuJoCo *.xml files to the assets folder and augmented the sim.py and sim_end_effector.py simulator code to give gym-aloha the ability to simulate the Trossen AI Stationary robot (MuJoCo files and code adapted from trossen_arm_mujoco). This includes both joint-controlled and end-effector-controlled simulations for the transfer-cube task. For this task, we added environmental options such as box size, box position, box orientation, and box color, as well as some control over lighting, robot joint reference angles, and robot base positions (see the sketch after this list).
  • lerobot: we added control_sim_robot.py, which uses the augmented gym-aloha environment to create and replay simulated datasets for the Trossen AI Stationary robot. We also added scripted_policy.py, a heuristic waypoint policy adapted from trossen_arm_mujoco, for the simulated robot rollouts. In addition, we modified train.py and eval.py so that they can train and evaluate policies for the simulated Trossen AI Stationary robot. Together these additions allow full sim to sim, sim to real, and real to sim evaluations. Combining simulated and real robot replay can also be used to calibrate the simulated robot against the real one. We added improved text-to-speech and additional voice prompts to smooth the real robot dataset acquisition workflow, and four new evaluate_*.py and train_*.py example files for both the original ALOHA and the new Trossen AI simulated robots.
  • openpi: we are in the process of adding hardware driver support to run pi0 policies on the Trossen AI Stationary robot within the openpi framework, and we plan to add full simulated robot support as well. Note, however, that Trossen Robotics now has a fork of openpi which is further along than ours, so we suggest starting with that one.
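As a quick illustration, here is a minimal sketch of creating the augmented simulation environment. The env id follows upstream gym-aloha; the option kwargs shown in comments (box_size, box_color, arms_ref, arms_pos) are illustrative guesses based on the options described above, so check the anredlich/gym-aloha readme for the exact names.

    import gymnasium as gym
    import gym_aloha  # noqa: F401  (importing registers the environments)

    env = gym.make(
        "gym_aloha/AlohaTransferCube-v0",  # upstream task; our fork adds a Trossen variant
        obs_type="pixels_agent_pos",
        # Hypothetical fork-specific options (names illustrative):
        # box_size=0.04, box_color=(1.0, 0.0, 0.0),
        # arms_ref=(-0.025, 0.025), arms_pos=0.0,
    )
    obs, info = env.reset(seed=0)
    for _ in range(10):
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    env.close()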

3. Optimizations

This is a non-exhaustive list of small optimizations and problem resolutions that may be helpful to other Trossen AI Stationary Robot users.

  • robot: Do NOT let pets or small children near the leader arms: they can swing and swoop down violently, especially if you play with the arm joint_characteristics. We almost learned this the hard way.
  • robot: The right arm gripper was a bit sticky (it feels like static friction) and would overshoot. We improved this by adjusting the arm's embedded joint_characteristics variable, friction_viscous_coef, for the gripper (joint 6) from 202.61772... to 25.0. See the Trossen documentation for how to do this.
  • lerobot: A dataset version error prevented lerobot simulation testing and dataset visualization for older aloha and pusht datasets. We converted the error to a warning.
  • lerobot: Fixed a model-writing error in train.py: the checkpoint config.json file was missing the "type: act" or "type: diffusion" entry, so the model could not be read, e.g. by eval.py. We solved this by adding a type: str = "act" line to configuration_act.py and a type: str = "diffusion" line to configuration_diffusion.py.
  • lerobot: For real robot rollouts, we found that setting robot.max_relative_target to 0.05-0.1 radians makes a huge difference in whether a learned policy succeeds. This argument clips the maximum joint-angle change in one step (sketched below), reducing jerky motions that seem to take the robot out of the training distribution and often lead to failure.
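A minimal sketch of what the clipping does (illustrative, not lerobot's exact code): each commanded joint target is limited to within max_relative_target radians of the current joint position before being sent to the arm.

    import numpy as np

    def clip_relative_target(current_pos, target_pos, max_relative_target=0.05):
        """Limit the per-step joint-angle change to +/- max_relative_target radians."""
        delta = np.clip(target_pos - current_pos, -max_relative_target, max_relative_target)
        return current_pos + delta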

4. Datasets

We have been acquiring and uploading both real robot and simulated robot datasets to Hugging Face. The real robot datasets were acquired using lerobot's control_robot.py with the record option; the simulated datasets were acquired using our control_sim_robot.py with the record option. These datasets can be visualized using lerobot's visualize_dataset.py or online at lerobot/visualize_dataset. See the anredlich/lerobot readme for more details. Datasets have 50-100 episodes. A minimal loading sketch and the dataset repo_ids follow.
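Loading sketch (assumes lerobot's Python API; the import path and attribute names vary slightly across lerobot versions):

    from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

    ds = LeRobotDataset("ANRedlich/trossen_ai_stationary_transfer_20mm_cube_01")
    print(ds.num_episodes)  # attribute names may differ by lerobot version
    frame = ds[0]           # dict of camera image, robot state, and action tensors
    print(frame.keys())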


Real robot:

  • ANRedlich/trossen_ai_stationary_transfer_20mm_cube_01
    see video on home page
  • ANRedlich/trossen_ai_stationary_transfer_40mm_cube_02
  • ANRedlich/trossen_ai_stationary_transfer_multi_cube_03
  • ANRedlich/trossen_ai_stationary_place_lids_04
  • ANRedlich/trossen_ai_stationary_pour_box_05
    see video on home page
  • ANRedlich/trossen_ai_stationary_pop_lid_06
    see video on home page

Simulated robot:

  • ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_07
    cube_color=red, size=40mm, tabletop=black, background=none, lighting=bright
  • ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_08
    cube_color=dark red, size=40mm, tabletop=mine, background=mine, lighting=medium
  • ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_10
    cube_color=r,g,b, size=25,40mm, tabletop=mine, background=none, lighting=bright
    see video below
  • ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13
    cube_color=red, tabletop=mine, background=mine, lighting=medium
  • Note: tabletop=mine uses an image of my tabletop; background=mine uses crude images of my office walls


5. Models

We have been acquiring and uploading to Hugging Face learned models/policies for both the real and simulated robot datasets. So far these are ACT models used as a baseline, with chunk_size=100, trained for 100K steps. Both the real and simulated models can be tested in the simulator using lerobot's eval.py or, for individual episodes, our evaluate_trossen_ai_stationary_policy.py (a minimal rollout sketch appears at the end of this section). See our lerobot readme for more details. Here are the Hugging Face policy paths:


Real robot ACT models:

  • ANRedlich/trossen_ai_stationary_real_act2_3
    best real to sim, try in evaluate_pretrained_trossen_ai_policy.py, still only about 20% correct!
  • ANRedlich/trossen_ai_stationary_real_act5
    see video on home page
  • ANRedlich/trossen_ai_stationary_real_act6
    see video on home page

Simulated robot ACT models:

  • ANRedlich/trossen_ai_stationary_sim_act7
  • ANRedlich/trossen_ai_stationary_sim_act8
  • ANRedlich/trossen_ai_stationary_sim_act10
    see video below
  • ANRedlich/trossen_ai_stationary_sim_act13
    best sim to real policy, but still very sensitive to conditions
Video (Trossen AI Simulated Robot): pre-trained ACT policy ..._act10 rollout learned from dataset ..._cube_10.
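Rollout sketch, adapted from lerobot's pretrained-policy evaluation example. It assumes our forks; the env id, observation keys, and the "top" camera/image key are illustrative and may differ from the actual code.

    import gymnasium as gym
    import gym_aloha  # noqa: F401
    import torch
    from lerobot.common.policies.act.modeling_act import ACTPolicy

    device = "cuda" if torch.cuda.is_available() else "cpu"
    policy = ACTPolicy.from_pretrained("ANRedlich/trossen_ai_stationary_sim_act10").to(device)
    policy.eval()
    policy.reset()  # clears the policy's internal action queue

    env = gym.make("gym_aloha/AlohaTransferCube-v0", obs_type="pixels_agent_pos")
    obs, _ = env.reset(seed=0)
    done = False
    while not done:
        state = torch.from_numpy(obs["agent_pos"]).float().unsqueeze(0).to(device)
        image = torch.from_numpy(obs["pixels"]["top"]).float() / 255.0
        image = image.permute(2, 0, 1).unsqueeze(0).to(device)  # HWC -> 1CHW
        with torch.inference_mode():
            action = policy.select_action(
                {"observation.state": state, "observation.images.top": image}
            )
        obs, reward, terminated, truncated, _ = env.step(action.squeeze(0).cpu().numpy())
        done = terminated or truncated
    env.close()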

6. Experiments

ACT: the following experiments use the baseline ACT algorithm; as other algorithms are tested, comparisons will be made. Also note that all real robot policy rollouts use robot.max_relative_target=0.05, which clips the maximum one-step joint-angle change (see the clipping sketch in Optimizations) and is critical to smoothing the rollouts and getting good results.

  • Sim to real:
    Conclusion: very sensitive to matching the simulated and real environments. Able to pick up the cube but not to complete the transfer.
    Best model: ANRedlich/trossen_ai_stationary_sim_act13 with ~75% correct pickup, but no completed transfers. It was trained on the ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13 dataset which is the closest match to the real environment.
    Robustness: multiple cube colors and sizes and tabletop, background, and lighting variations do not seem to improve performance for the ACT algorithm in this context.
    Generalization: ANRedlich/trossen_ai_stationary_sim_act7 gives ~25% correct even though its environment is very different from the real one, e.g. the tabletop is black. This shows there can be some generalization, perhaps due to ACT's ResNet backbone, but it has been unreliable across experiments.
    Cube color: moderately sensitive: ANRedlich/trossen_ai_stationary_sim_act8 drops to ~33%, even though it was trained in an environment identical to ..._cube_13 above except for a slightly darker red cube.
    Cube size: very sensitive.
    Tabletop: moderately sensitive.
    Background: moderately sensitive.
    Lighting: very sensitive to the simulated environment lighting, and also the real robot lighting.
    Joint angles and arm base positions: unlike for real to sim (see below), adjusting the joint angles and base positions did not help sim to real performance. We are not sure why.
  • Video: Sim to real; pickup works, transfer is close.
    ACT policy ..._act13 rollout learned from dataset ..._cube_13.
  • Real to sim:
    Best model: ANRedlich/trossen_ai_stationary_real_act2_3, which only gets ~20% correct in the simulated environment.
    Environment: except for lighting, the best environment is the same as for sim to real (see ..._cube_13 above); it is the environment that best matches the real robot.
    Lighting: the best simulated lighting differs from that used for sim to real: it is closer to the lighting in the real robot dataset. That lighting, however, is not the best choice for sim to real.
    Joint angles: the arms_ref env option in anredlich/gym-aloha adds a +/- shift (pos0) to the simulated robot joint angles. During the calibration below, we discovered this was necessary for joints 1 and 2 to get real and sim to match. We believe this is because gravity weighs down the real arms.
    Arm base position: the arms_pos env option was used to place the simulated arm bases where they should be, based on both the calibration and measurement of their actual positions on the real robot.
  • Video: Real to sim.
    ACT policy ...real_act2_3 rollout from real dataset ...40mm_cube_02.
  • Calibration:
    Replay: using the replay option in control_sim_robot.py in our lerobot fork, any of the real robot datasets can be replayed in simulation. Likewise, any of the simulated datasets can be replayed by control_robot.py on the real robot. This allows precise alignment of sim and real for an actual task.
    Joint angles: to get the real and sim replays to match precisely, it was necessary to shift joints 1 and 2 by -0.025 and 0.025 radians, respectively, using the anredlich/gym-aloha arms_ref option, which is implemented in sim.py using physics.named.model.qpos0 (a calibration sketch appears below, after the ACT experiments). We believe this compensates for slight sag in the real robot due to gravity.
    Arm base position: using the arms_pos option in anredlich/gym-aloha, implemented with physics.model.body_pos in sim.py, the simulated robot base was moved to the y=0.0 position, which is consistent with physical measurement on the real robot.
  • Pretraining:
    Training: in train.py, the pretrained model policy.path=ANRedlich/trossen_ai_stationary_sim_act7 was used as the starting point to learn an ACT policy for the dataset ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13 (a warm-start sketch also appears below, after the ACT experiments). Note that the simulated environment ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_07, used to learn ..._act7, is very different from the ..._cube_13 environment.
    Sim to sim: after only 10K steps, the learned model was correct 98% of the time, using eval.py on out-of-sample examples. This compares to only 90% correct when learning from scratch for 100K steps. Training for 10K steps from scratch did not work well.
    Sim to real with sim pretraining: after only 10K steps, the pretrained sim model was approximately as good at sim to real as ..._act13 (see above), which was trained from scratch for 100K steps.
    Real to real with sim pretraining: the sim model ..._act13 was used as the pretrained model to continue training on the real dataset ANRedlich/trossen_ai_stationary_transfer_40mm_cube_02 for 10K steps. This gave as good a result on real to real as training on ..._cube_02 from scratch for 100K steps. Training for 10K steps from scratch did not work well.
  • Real robot ACT successes:
    Transfer cube: this task, for either a 20mm or 40mm cube, was easily learned by ACT from, e.g., ANRedlich/trossen_ai_stationary_transfer_20mm_cube_01; see video on home page.
    Pour cup to cup: this task was easily learned by ACT from ANRedlich/trossen_ai_stationary_pour_box_05. It works for the same ~2-3 inch range of cup placements as in the dataset. See video on home page.
    Pop lid: this task, learned from ANRedlich/trossen_ai_stationary_pop_lid_06, works well, but only if the "takeout" container is positioned carefully on the tabletop! The dataset did not have much position variety, so further experiments are planned. Also, the lid was very snug, so some crushing was necessary, even for a human using only two fingers, so further experiments with better containers are also planned. It does succeed, though! See video on home page.
  • Real robot ACT failures:
    Multiple cube colors, sizes, and orientations: ACT did not learn the task in dataset ANRedlich/trossen_ai_stationary_transfer_multi_cube_03. It may be that the number of examples needs increasing, but we suspect there is simply too much task variety for ACT.
    Place lids: The dataset ANRedlich/trossen_ai_stationary_place_lids_04 has many different pot and lid colors and shapes at many locations, probably too much variety for ACT, but also the number of examples might be too few.
    Tentative conclusion: ACT seems to work well for tasks with limited task and environmental variety. We are not sure whether this is because our datasets are too small or a fundamental limitation of ACT.
Images: ...multi_cube_03 dataset examples. ACT failed to learn a policy for this dataset; note the orientation, color, and size variety.
Images: ...place_lids_04 dataset examples. ACT failed to learn a policy for this dataset; note the shape, material, and position variety.
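Calibration sketch: how the arms_ref and arms_pos adjustments described above can be applied inside the fork's sim.py using dm_control physics. The joint and body names and the sign conventions here are illustrative guesses; see anredlich/gym-aloha for the actual implementation.

    # Illustrative names; uses dm_control's named indexing.
    def apply_calibration(physics):
        # arms_ref: shift the joint reference angles (qpos0) of joints 1 and 2
        # to compensate for the gravity sag measured on the real robot.
        for side in ("left", "right"):
            physics.named.model.qpos0[f"{side}/joint_1"] += -0.025  # radians
            physics.named.model.qpos0[f"{side}/joint_2"] += 0.025
        # arms_pos: move the simulated arm bases to the measured real positions
        # (y = 0.0, per the physical measurement described above).
        for base in ("left_base_link", "right_base_link"):
            physics.named.model.body_pos[base, "y"] = 0.0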
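Warm-start sketch: what policy.path does, conceptually. In practice we run lerobot's train.py, which also handles action chunking, normalization statistics, and optimizer schedules; this only illustrates loading pretrained weights before continuing training on another dataset.

    from lerobot.common.policies.act.modeling_act import ACTPolicy

    # Load the sim-trained policy instead of randomly initializing.
    policy = ACTPolicy.from_pretrained("ANRedlich/trossen_ai_stationary_sim_act13")
    policy.train()
    # ...then run the normal train.py loop on the real dataset
    # ANRedlich/trossen_ai_stationary_transfer_40mm_cube_02 for ~10K steps.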

Pi0: We are beginning to experiment with pi0, with plans to test it on all of the above datasets, sim and real, and compare the results with ACT.

  • Pi0 gym-aloha example: just a note that the default action_horizon: int = 10 in main.py for this example gave us only about 40% correct. When we set action_horizon to 50 -- the default during learning -- performance improved to 85%. See the snippet after this list.
  • Pi0 LoRA fine-tuning on gym-aloha: starting with the base policy, pi0_base, we trained on the original gym-aloha. On an Ubuntu computer with an RTX 5090 GPU, training for 100K steps took about 4 hours. This gave better results than the pre-trained example policy above: about 95-100% correct.
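For reference, the action_horizon change is a one-line default in our copy of the example's main.py (the field name comes from the example; only the value changed):

    import dataclasses

    @dataclasses.dataclass
    class Args:
        # Default was 10; matching the training-time horizon raised our
        # success rate from about 40% to 85%.
        action_horizon: int = 50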