Odaily Planet Daily reports: Google DeepMind has released Gemini Robotics-ER 1.6, positioned as a high-level reasoning model for robots, with significant improvements in spatial reasoning and multi-view understanding compared to its predecessor ER 1.5 and Gemini 3.0 Flash. The model is now available to developers via the Gemini API and Google AI Studio, with three core upgrades:
1. Improved pointing accuracy: Enables precise object detection, counting, spatial relationship reasoning (e.g., "Point to all objects that can fit inside the blue cup"), and motion trajectory planning, while correctly declining requests to point to objects not present in the image (a minimal API sketch follows this list).
2. Multi-perspective success detection: The model can now analyze footage from multiple cameras to determine whether a task is complete, maintaining accuracy even when views are partially obstructed or the scene is dynamic.
3. Added instrument-reading capability: The model can interpret a variety of industrial instruments, including circular pressure gauges, vertical level indicators, and digital displays. It uses agentic vision (visual reasoning combined with code execution) to reason step by step: first zooming in on the region of interest, then using pointing and code-based calculation to determine the scale range and tick intervals, and finally combining world knowledge to derive the reading (a second sketch below shows one possible setup).
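The report does not include sample code, but since the model is exposed through the Gemini API, a pointing query like the one in item 1 could plausibly be issued with the Python google-genai SDK as sketched below. The model identifier, image file, and expected JSON point format (coordinates normalized to 0-1000, as in earlier Robotics-ER documentation) are assumptions, not confirmed details of this release.

```python
# Minimal sketch: a pointing query via the Gemini API (google-genai SDK).
# The model name and output format are assumptions, not confirmed details
# of the Robotics-ER 1.6 release described above.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("workbench.jpg", "rb") as f:  # hypothetical robot camera frame
    image_bytes = f.read()

prompt = (
    "Point to all objects that can fit inside the blue cup. "
    'Return a JSON list of {"point": [y, x], "label": <name>} entries, '
    "with coordinates normalized to 0-1000. If no such object is visible, "
    "return an empty list."
)

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # placeholder identifier, assumed
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        prompt,
    ],
)

# The point list is expected in the response body; a caller would parse it
# and map the normalized coordinates back to pixel positions.
print(response.text)
```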
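For the instrument-reading workflow in item 3, one plausible setup is to enable the standard Gemini code-execution tool so the model can zoom, point, and compute scale intervals before reporting a value. Whether Robotics-ER 1.6 performs this agentic loop internally or requires the tool to be enabled is not stated in the report; the model identifier and gauge image are likewise assumptions.

```python
# Sketch: instrument-reading query with the Gemini code-execution tool enabled.
# Assumed: the model identifier, the gauge photo, and that the standard
# code-execution tool is the mechanism behind the "agentic vision" described above.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("pressure_gauge.jpg", "rb") as f:  # hypothetical gauge photo
    gauge_bytes = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # placeholder identifier, assumed
    contents=[
        types.Part.from_bytes(data=gauge_bytes, mime_type="image/jpeg"),
        "Read the pressure shown on this gauge. Explain the scale range and "
        "tick interval you infer before giving the final value with units.",
    ],
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)

# The response interleaves text, generated code, and execution results.
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    if part.executable_code:
        print(part.executable_code.code)
    if part.code_execution_result:
        print(part.code_execution_result.output)
```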
