DeepMind is training robots for real-world activities

5 Jan 2024

Image: DeepMind

From more advanced AI models to a ‘robot constitution’, DeepMind claims its research will shape the future of robotics.

Google-owned DeepMind has revealed multiple research projects to create robots that make decisions faster and work in real-world scenarios.

The first of these is a system called AutoRT, which harnesses large AI models to create robots that can “understand practical human goals”.

The system combines large foundation models – such as a large language model (LLM) or a visual language model (VLM) – with a robot control model. DeepMind claims this combination lets robots gather training data in new environments and tackle many different tasks at once.
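DeepMind has not published the AutoRT orchestration code, but the pipeline as described – a VLM that describes the scene, an LLM that proposes tasks, a safety filter and a low-level control policy – can be sketched roughly as follows. Every name below is a hypothetical stand-in, not part of any DeepMind API.

```python
# Rough, hypothetical sketch of an AutoRT-style decision loop (not DeepMind's code).
# A VLM describes the scene, an LLM proposes tasks, a "constitution" filter rejects
# unsafe ones, and a low-level robot control policy executes what remains.
from typing import Callable, List, Optional

def autort_style_step(
    camera_image: bytes,
    describe_scene: Callable[[bytes], str],        # stand-in for a VLM
    propose_tasks: Callable[[str], List[str]],     # stand-in for an LLM
    passes_constitution: Callable[[str], bool],    # stand-in for the safety prompts
    execute: Callable[[str], None],                # stand-in for an RT-style policy
) -> Optional[str]:
    scene = describe_scene(camera_image)
    candidates = propose_tasks(scene)
    safe_tasks = [t for t in candidates if passes_constitution(t)]
    if not safe_tasks:
        return None                                # nothing safe to do this cycle
    execute(safe_tasks[0])
    return safe_tasks[0]

# Toy run with stubbed models, just to show the control flow.
chosen = autort_style_step(
    camera_image=b"",
    describe_scene=lambda img: "a desk with a sponge and a cup",
    propose_tasks=lambda scene: ["pick up the sponge", "pick up the scissors"],
    passes_constitution=lambda task: "scissors" not in task,
    execute=lambda task: print(f"executing: {task}"),
)
```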

“AutoRT can simultaneously direct multiple robots, each equipped with a video camera and an end effector, to carry out diverse tasks in a range of settings,” DeepMind said in a blogpost.

“In extensive real-world evaluations over seven months, the system safely orchestrated as many as 20 robots simultaneously, and up to 52 unique robots in total, in a variety of office buildings, gathering a diverse dataset comprising 77,000 robotic trials across 6,650 unique tasks.”

To ensure robots can be integrated safely into everyday environments, DeepMind has also built a ‘robot constitution’ into AutoRT so that the robots follow specific safety rules.

This constitution draws on Isaac Asimov’s Three Laws of Robotics, including the rule that robots “may not injure a human being”.

“But even if large models are prompted correctly with self-critiquing, this alone cannot guarantee safety,” DeepMind said. “So the AutoRT system comprises layers of practical safety measures from classical robotics.

“For example, the collaborative robots are programmed to stop automatically if the force on their joints exceeds a given threshold, and all active robots were kept in line-of-sight of a human supervisor with a physical deactivation switch.”
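The force-threshold stop described in that quote is a standard safeguard on collaborative robots. As a purely illustrative sketch – the limit value and sensor callbacks here are invented, not DeepMind’s – it might look like this:

```python
# Toy illustration of a force-threshold safety stop, as described in the quote above.
# The limit value and the callbacks are invented for this example.
JOINT_FORCE_LIMIT = 30.0   # hypothetical per-joint limit, in newton-metres

def should_stop(joint_forces) -> bool:
    """Return True if any joint reading exceeds the configured limit."""
    return any(abs(f) > JOINT_FORCE_LIMIT for f in joint_forces)

def control_tick(read_joint_forces, halt_motors, send_next_command) -> str:
    forces = read_joint_forces()
    if should_stop(forces):
        halt_motors()          # software stop; per DeepMind, a human supervisor
        return "stopped"       # also keeps a physical deactivation switch at hand
    send_next_command()
    return "running"

print(control_tick(lambda: [4.2, 31.7, 8.0], lambda: None, lambda: None))  # -> "stopped"
```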

DeepMind also revealed a new system to improve the efficiency of its Robotics Transformer (RT) models, called Self-Adaptive Robust Attention for Robotics Transformers, or SARA-RT.

DeepMind claimed that its best SARA-RT-2 models were 10.6pc more accurate and 14pc faster than other RT-2 models after receiving a short history of images.

“We believe this is the first scalable attention mechanism to provide computational improvements with no quality loss,” the company said. “We designed our system for usability and hope many researchers and practitioners will apply it, in robotics and beyond.”

“Because SARA provides a universal recipe for speeding up transformers, without need for computationally expensive pre-training, this approach has the potential to massively scale up use of transformers technology.”
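The blogpost quoted here does not spell out SARA-RT’s mechanism, but the cost it attacks is well known: standard attention scales quadratically with sequence length. The NumPy sketch below contrasts that with a generic linear-attention variant purely to illustrate the kind of speed-up being claimed; it is not DeepMind’s algorithm.

```python
# Generic illustration of why attention acceleration matters; this is NOT SARA-RT's
# method, just standard softmax attention next to a simple linear-attention variant
# whose cost grows linearly, rather than quadratically, with sequence length.
import numpy as np

def softmax_attention(Q, K, V):
    # O(n^2 * d): the n x n score matrix dominates for long sequences.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, feature_map=lambda x: np.maximum(x, 0) + 1e-6):
    # O(n * d^2): summarise keys/values in a d x d matrix, never form the n x n one.
    Qf, Kf = feature_map(Q), feature_map(K)
    kv = Kf.T @ V                                    # d x d summary
    norm = Qf @ Kf.sum(axis=0, keepdims=True).T      # per-query normaliser
    return (Qf @ kv) / norm

n, d = 512, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```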

The third project DeepMind is working on – RT-Trajectory – is designed to help robots generalise to tasks beyond those they were trained on. The system adds visual outlines that describe robot motions to training videos, and these overlays serve as practical visual hints for the model as it learns robot-control policies.
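Going by that description, the core idea can be pictured as drawing the gripper’s 2D path onto each training frame. The snippet below is a hypothetical illustration using Pillow; the waypoints, colours and file name are invented for the example.

```python
# Hypothetical sketch of the idea behind RT-Trajectory as described above: overlay a
# 2D gripper trajectory onto a training frame so the policy gets an explicit visual hint.
from PIL import Image, ImageDraw

def overlay_trajectory(frame: Image.Image, waypoints: list) -> Image.Image:
    """Draw the gripper's 2D path onto a copy of the frame."""
    annotated = frame.copy()
    draw = ImageDraw.Draw(annotated)
    draw.line(waypoints, fill=(0, 255, 0), width=3)        # motion path
    draw.ellipse(_dot(waypoints[0]), fill=(255, 0, 0))     # start marker
    draw.ellipse(_dot(waypoints[-1]), fill=(0, 0, 255))    # end marker
    return annotated

def _dot(p, r=5):
    # Bounding box of a small circle centred on point p.
    return (p[0] - r, p[1] - r, p[0] + r, p[1] + r)

frame = Image.new("RGB", (320, 240), "white")              # stand-in for a video frame
hint = overlay_trajectory(frame, [(40, 200), (120, 150), (220, 90), (280, 60)])
hint.save("trajectory_hint.png")
```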

DeepMind tested a robotic arm controlled by RT-Trajectory with 41 tasks that had not been seen in its training data. The company claims this arm achieved a task success rate of 63pc, compared with 29pc for the earlier RT-2 models.

In 2022, Google revealed a prototype system that let robots write their own code to respond to instructions and perform tasks.


Leigh Mc Gowran is a journalist with Silicon Republic

editorial@siliconrepublic.com