Today, Google DeepMind announced a new family of Gemini models designed for robotics. The first, Gemini Robotics, is a vision-language-action (VLA) model that takes natural language and images as input and outputs actions, allowing robots to physically move and perform tasks. The second, Gemini Robotics-ER, is a reasoning model that enhances spatial skills such as identifying objects and their parts in 3D space.
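To make the input/output flow of a VLA model concrete, here is a minimal, purely hypothetical sketch in Python. It is not the Gemini Robotics API; the `Action` type and `vla_step` function are illustrative placeholders showing how a natural-language instruction and a camera frame map to robot commands.

```python
# Hypothetical sketch of a vision-language-action (VLA) interface.
# This is NOT the Gemini Robotics API; names and types are illustrative only.
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """A single low-level robot command (illustrative placeholder)."""
    joint_targets: List[float]  # target joint positions, e.g. in radians


def vla_step(instruction: str, image: bytes) -> List[Action]:
    """Map a natural-language instruction plus a camera frame to actions.

    A real VLA model would run a learned policy here; this stub only
    shows the shape of the interface described above.
    """
    # Placeholder: return a single no-op action for a 7-joint arm.
    return [Action(joint_targets=[0.0] * 7)]


if __name__ == "__main__":
    actions = vla_step("fold the paper into an origami crane", image=b"")
    print(f"Planned {len(actions)} action(s)")
```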
Take a look at what robots can do using these Gemini models, from folding origami to packing lunches to spelling words with Scrabble tiles.