
χ0: A Live-Stream Robotic Teamwork for Clothing Manipulation from Zero to Hero

Published

December 24, 2025

Report

Available in Late January 2026

By

HKU MMLab

Veni, vidi, vici.

Julius Caesar

We will release data and checkpoints and host a Challenge in 2026.

Three tasks, ranging from folding to hanging, each lasting 4 hours, presented in 100x time-lapse with critical segments highlighted at 2-5x speed.

Mode Consistency system architecture. Left: Human expert demonstration collection. Middle: Mixing models from different data sources via Model Arithmetic. Right: Real-robot inference. Bottom: DAgger Feedback and Stage Advantage from on-policy experience.

Consistency

Distribution dynamics of P_train, Q_model, and P_test.

DAgger: Injecting on-policy recovery trajectories to expand P_train towards underrepresented failure modes in P_real.
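To make the loop concrete, a minimal DAgger-style aggregation round might look like the sketch below; the policy, expert, env, and dataset interfaces are hypothetical stand-ins, not the project's actual API.

def dagger_round(policy, expert, env, dataset, horizon=200):
    # Roll out the current policy, relabel visited states with expert
    # actions, and aggregate them for retraining (illustrative sketch).
    obs = env.reset()
    for _ in range(horizon):
        action = policy.act(obs)              # on-policy action drives the rollout
        expert_action = expert.act(obs)       # expert relabels the same state
        dataset.append((obs, expert_action))  # aggregate (state, expert action) pairs
        obs, done = env.step(action)
        if done:
            break
    policy.fit(dataset)                       # retrain on the aggregated data
    return policy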

Inference Optimization: Minimizing execution jitter to ensure smooth translation from Q_model to P_test.

3D t-SNE visualization of action distributions for P_train, Q_model, and P_test.

Success Rate (%) ↑, Recovery Cost ↓

Improved data collection methods and on-policy recovery trajectories effectively enhance the model's error recovery capability, significantly increasing success rate and reducing recovery cost (fewer retry attempts per failure). X-axis: baseline, improved baseline, + heuristic DAgger, + DAgger.

Success Rate (%) ↑, Throughput ↑

Spatio-temporal augmentation substantially enhances model performance, increasing success rate and throughput (more task completions per unit time). X-axis: baseline, +spatio-temp. augment.
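For illustration, one plausible spatio-temporal augmentation is sketched below in NumPy; the crop size, frame-skip range, and brightness jitter are assumed values, not the system's actual hyperparameters.

import numpy as np

def spatio_temporal_augment(frames, crop=224, max_skip=2, rng=np.random):
    # frames: (T, H, W, C) uint8 clip from a demonstration video.
    t, h, w, c = frames.shape
    skip = rng.randint(1, max_skip + 1)        # temporal: random frame skip
    frames = frames[::skip]
    y = rng.randint(0, h - crop + 1)           # spatial: random crop window
    x = rng.randint(0, w - crop + 1)
    frames = frames[:, y:y + crop, x:x + crop, :]
    gain = rng.uniform(0.8, 1.2)               # photometric: brightness jitter
    return np.clip(frames.astype(np.float32) * gain, 0, 255).astype(np.uint8)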

Success Rate (%) ↑, Throughput ↑

Inference optimization through chunk-wise temporal smoothing and real-time chunking translates the policy's intended actions into smooth, coherent real-robot execution, improving throughput (more task completions per unit time). X-axis: sync, + in-chunk smooth, + temporal smooth, + RTC (real-time chunking).
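As a rough sketch of the chunk-wise temporal smoothing idea, successive action chunks can be blended at their seam so the executed trajectory stays continuous; the overlap length and linear blend schedule below are assumptions, not the exact rule used here.

import numpy as np

def blend_chunks(prev_chunk, new_chunk, overlap=8):
    # prev_chunk, new_chunk: (T, action_dim) arrays of predicted actions.
    w = np.linspace(0.0, 1.0, overlap)[:, None]   # ramp from old chunk to new
    seam = (1.0 - w) * prev_chunk[-overlap:] + w * new_chunk[:overlap]
    return np.concatenate([seam, new_chunk[overlap:]], axis=0)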

Consistency

We merge models trained on different data subsets into a single entity using weight interpolation, with the mixing weights optimized against on-policy data.
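A minimal sketch of that interpolation step, assuming generic PyTorch-style state dicts (the helper name and the search over mixing weights are illustrative, not the released implementation):

def merge_state_dicts(state_dicts, alphas):
    # Per-parameter convex combination of N checkpoints; alphas sum to 1.
    assert abs(sum(alphas) - 1.0) < 1e-6
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(a * sd[key] for a, sd in zip(alphas, state_dicts))
    return merged

The mixing weights themselves would then be chosen by scoring candidate merges on on-policy rollouts, e.g. with a small grid or coordinate-ascent search.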

Success Rate (%) ↑

The merged model surpasses both the best constituent models and the oracle model trained on the full dataset across multiple tasks, showing that Model Arithmetic successfully combines the distinct policies learned from diverse data subsets.


Comparison of cumulative progress induced by different methods along an inference-time manipulation trajectory. Green and red segments indicate higher- and lower-ranked actions based on predicted advantage, reflecting relative preference for task advancement. Direct+Stage (ours) produces smoother and more consistent progress accumulation than Value-diff.

Mean Squared Temporal Difference (MSTD) ↓, Smooth Frame Ratio (SFR) (%) ↑, Success Rate (%) ↑

Value-diff computes the advantage by subtracting two independently predicted state values. Direct predicts the advantage as the relative improvement from paired observations. Direct+Stage (ours) uses stage-conditioned direct advantage prediction for long-horizon training, achieving smoother results (lower MSTD), greater stability (higher SFR), and higher success rates.
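To make the distinction concrete, the estimators can be contrasted as in the sketch below; value_net and advantage_net are hypothetical callables standing in for the learned networks.

def advantage_value_diff(value_net, obs_t, obs_t1):
    # Value-diff: subtract two independently predicted state values;
    # noise in either prediction leaks directly into the advantage.
    return value_net(obs_t1) - value_net(obs_t)

def advantage_direct_stage(advantage_net, obs_t, obs_t1, stage_id):
    # Direct+Stage: predict the relative improvement from the paired
    # observations in a single forward pass, conditioned on the task stage.
    return advantage_net(obs_t, obs_t1, stage_id)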

@article{hkummlab2025kai0,
  title = {χ0: A Live-Stream Robotic Teamwork for Clothing Manipulation from Zero to Hero},
  author = {HKU MMLab},
  journal = {HKU MMLab Research Blog},
  year = {2025},
  note = {https://mmlab.hk/research/kai0},
}
