KinetIQ: AI Stack Overview

  • System 3: coordinates multiple robots to achieve externally defined fleet-level goals, treating robots as tools within an agentic framework. Operates on timescales of seconds to minutes and beyond.
  • System 2: coordinates the actions of a single robot. It achieves high-level goals set by System 3 by treating underlying capabilities, such as navigation or locomanipulation, as agentic tools. Operates on timescales from seconds to minutes.
  • System 1: a VLA-based locomanipulation neural network. Translates goals expressed in natural language by System 2 into target poses for a subset of robot frames (e.g., end effectors, torso, or pelvis). Runs at 10 Hz, enabling rapid adaptation to environment changes.
  • System 0: a whole-body controller that achieves the target poses set by System 1 while maintaining overall stability. Runs at 50 Hz.
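
The rate relationship between the two innermost layers can be illustrated with a minimal sketch. Only the 10 Hz and 50 Hz figures come from the description above. Everything else is an assumption made for illustration: `TargetPose`, `system1_policy`, `system0_controller`, and `run` are hypothetical names, the pose is reduced to a single scalar, and the "controller" is a simple proportional step rather than an actual whole-body controller.

```python
from dataclasses import dataclass

SYSTEM1_HZ = 10  # from the text: System 1 runs at 10 Hz
SYSTEM0_HZ = 50  # from the text: System 0 runs at 50 Hz
TICKS_PER_TARGET = SYSTEM0_HZ // SYSTEM1_HZ  # controller ticks per new target


@dataclass
class TargetPose:
    """Hypothetical stand-in for a System 1 output: one frame, one scalar height."""
    frame: str
    height_m: float


def system1_policy(goal: str, step: int) -> TargetPose:
    """Placeholder for the VLA network: returns a made-up target for the goal."""
    return TargetPose(frame="right_hand", height_m=1.0 + 0.1 * step)


def system0_controller(current: float, target: TargetPose, gain: float = 0.5) -> float:
    """Placeholder for the controller: moves a fraction of the way to the target."""
    return current + gain * (target.height_m - current)


def run(goal: str, duration_s: float, start_height_m: float = 1.0):
    """Simulate both loops for duration_s seconds, ticking at the System 0 rate."""
    height = start_height_m
    target = None
    system1_calls = 0
    total_ticks = round(duration_s * SYSTEM0_HZ)
    for tick in range(total_ticks):
        # System 1 refreshes the target on every fifth System 0 tick.
        if tick % TICKS_PER_TARGET == 0:
            target = system1_policy(goal, system1_calls)
            system1_calls += 1
        # System 0 tracks the most recent target on every tick.
        height = system0_controller(height, target)
    return system1_calls, total_ticks, height


if __name__ == "__main__":
    calls, ticks, final_height = run("pick up the box", duration_s=1.0)
    print(calls, ticks, round(final_height, 3))
```

In this toy setup, one simulated second produces 10 target updates and 50 controller ticks, so the lower layer gets five ticks to track each target before the next one arrives. A real system would likely run the two loops asynchronously rather than in a single counter-driven loop.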

Hybrid whole-body control


