
Brown University Develops Language-Driven AI for Robot Movement

By Rishabh Srihari
2025-05-09
Brown University researchers unveil MotionGlot, an AI model that turns natural language into lifelike robot movement across diverse body types, bridging language and motion with cross-embodiment translation for robotics, animation, and VR.

Researchers at Brown University have introduced an AI model, MotionGlot, that allows robots and animated figures to move based on natural language commands. Inspired by how models like ChatGPT process text, this system converts user prompts such as “walk forward a few steps and take a right” into accurate motion sequences. The innovation lies in treating physical movement as a language that can be tokenized and predicted step-by-step, enabling machines to execute fluid, humanlike actions.
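For intuition, here is a minimal, self-contained Python sketch of that motion-as-language idea: continuous joint poses are quantized into discrete tokens via a codebook, and a motion clip is generated one token at a time conditioned on the text prompt. The codebook, the toy predictor, and all names below are illustrative assumptions, not MotionGlot's actual architecture.

```python
import numpy as np

# Toy sketch (not MotionGlot's real model): treat motion as a language by
# quantizing poses into discrete tokens, then predicting tokens one by one.

rng = np.random.default_rng(0)

NUM_JOINTS = 12          # e.g., a quadruped with 12 actuated joints (assumed)
CODEBOOK_SIZE = 256      # number of discrete motion tokens (assumed)

# A toy codebook: each token index maps to one joint-angle configuration.
codebook = rng.uniform(-np.pi, np.pi, size=(CODEBOOK_SIZE, NUM_JOINTS))

def tokenize(pose_sequence):
    """Map each continuous pose to the index of its nearest codebook entry."""
    dists = np.linalg.norm(pose_sequence[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def detokenize(token_ids):
    """Map token indices back to joint-angle poses."""
    return codebook[token_ids]

def predict_next_token(prompt_words, motion_tokens):
    """Stand-in for an autoregressive step: a real model would run a
    transformer conditioned on the text prompt and prior motion tokens."""
    seed = hash((tuple(prompt_words), tuple(motion_tokens))) % (2**32)
    return int(np.random.default_rng(seed).integers(CODEBOOK_SIZE))

# Generate a short motion clip from a text prompt, one token at a time.
prompt = "walk forward a few steps and take a right".split()
motion = []
for _ in range(8):
    motion.append(predict_next_token(prompt, motion))
poses = detokenize(np.array(motion))
print(poses.shape)  # (8, 12): eight timesteps of joint angles
```

The point of the sketch is the interface, not the model: once movement is expressed as a token sequence, the same next-token machinery used for text can generate motion.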

Cross-Embodiment Motion Translation

One of the core breakthroughs in this model is its ability to adapt instructions across different robot forms—whether it's a humanoid figure or a four-legged robotic dog. This cross-embodiment translation solves a significant challenge in robotics: the same instruction often requires completely different physical responses depending on body structure. The team demonstrated that MotionGlot could interpret general or specific commands and generate appropriate actions across both human and quadruped robots.
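One plausible way to realize this kind of conditioning, offered here purely for illustration and not necessarily MotionGlot's published mechanism, is to prefix the token stream with an embodiment identifier, so a single model can route the same instruction to different body plans:

```python
# Hypothetical illustration of cross-embodiment conditioning: prepend an
# embodiment token so one model can emit motions for different body plans.
# The vocabulary layout here is an assumption, not MotionGlot's tokenization.

EMBODIMENTS = {"humanoid": 0, "quadruped": 1}

def build_input_tokens(embodiment, prompt_token_ids):
    """Prefix an embodiment token so the same instruction can map to
    different physical responses depending on body structure."""
    return [EMBODIMENTS[embodiment]] + list(prompt_token_ids)

prompt_ids = [17, 42, 99]  # placeholder text-token IDs
print(build_input_tokens("humanoid", prompt_ids))   # [0, 17, 42, 99]
print(build_input_tokens("quadruped", prompt_ids))  # [1, 17, 42, 99]
```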


The model was trained using two distinct datasets: QUAD-LOCO, which includes dog-like robot motions paired with descriptive language, and QUES-CAP, which contains human movement data accompanied by detailed annotations. Through this training, the system can handle both explicit directives and abstract concepts, generating movements like jogging or a "happy walk" even when such combinations weren't part of the training data.
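As a rough picture of what such paired training data might look like (the field names here are hypothetical; the actual QUAD-LOCO and QUES-CAP schemas may differ), each record couples a motion trajectory with its language description:

```python
# Illustrative record format for motion-text paired training data.
# Field names are assumptions; the real datasets define their own schemas.
from dataclasses import dataclass
import numpy as np

@dataclass
class MotionSample:
    text: str            # natural-language description of the clip
    poses: np.ndarray    # (timesteps, joints) joint-angle trajectory
    embodiment: str      # "quadruped" (QUAD-LOCO) or "humanoid" (QUES-CAP)

sample = MotionSample(
    text="the robot trots forward, then turns right",
    poses=np.zeros((30, 12)),
    embodiment="quadruped",
)
print(sample.text)
```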

Future Use and Availability

The research is scheduled to be presented at the 2025 International Conference on Robotics and Automation in Atlanta. Its creators believe MotionGlot holds potential in several sectors—including human-robot interaction, digital animation, gaming, and virtual reality—due to its adaptability and contextual understanding. With plans to open-source both the model and its codebase, the team aims to support further development and collaboration in this emerging space.
