Frequently Asked Questions
General Questions
Which training method should I use?
- Regression (
regression): Simplest baseline, give strong performance on most tasks with correct architecture. - MIP (
mip): Minimum Iterative Policy, reserve most of the benefits of flow matching while inference faster. - Flow matching (
flow): Good default choice, best performance on most tasks. - CTM/PSD/LSD (
ctm/psd/lsd): Self distillation, slower to train. Useful for learning full aciton distribution.
See the paper for detailed comparisons and recommendations.
Which network architecture should I use?
- State observations:
mlp(simple, fast) orchiunet(better for long horizons) - Image observations:
chiunet(proven architecture) orchitransformer(for complex scenes) - Long horizons (>16):
chitransformerorsudeepdit - Limited compute:
mlp
Installation & Setup
Do I need to use UV package manager?
While UV is recommended for its speed, you can use pip instead:
bash
pip install -e .Can I pause and resume training?
Yes! MIP supports auto-resume:
bash
# Training will automatically resume from last checkpoint if it exists
uv run examples/train_robomimic.py task=lift_ph_state optimization.auto_resume=truePerformance
How can I speed up training?
Enable torch.compile to speed up training:
bashuv run examples/train_robomimic.py optimization.use_compile=trueUse CUDA graphs to reduce cpu overhead (state observations only):
bashuv run examples/train_robomimic.py optimization.use_cudagraphs=true
You can use both to get the best performance.