AIMPACT news, April 14 (UTC+8): MiniMax has released two updates for its desktop Agent, the Pocket feature (Beta) and the official launch of Computer Use.
Pocket integrates with major IM platforms such as Feishu, WeChat, WeCom (Enterprise WeChat), and Slack. When a user sends a command from an IM app, the Agent performs the task on the user's computer and returns the result to the original conversation. Computer Use lets the Agent view the screen and operate the mouse and keyboard, so it can directly handle local software, system settings, and graphical interface tasks. Together, these capabilities allow users to issue commands from their phones while the Agent executes them on the computer, with no need to sit at the desk.
Technically, MiniMax breaks down desktop operations into four tool domains: Desktop Control (screenshots, mouse and keyboard input), Window Manager (window management and application launching), Browser Engine (DOM manipulation and CSS selectors), and Clipboard (clipboard reading and writing). Together with CLI tools for platforms such as Feishu and WeCom and utilities such as Bash, the Agent has more than 60 tools in total.
For visual positioning, the Agent outputs relative coordinates between 0 and 1, which the system converts into actual screen pixels so that operations stay equally precise on Retina and 4K displays. After each step, a screenshot is taken automatically to verify the result; if verification fails, the Agent tries an alternative approach (such as using a keyboard shortcut instead of a mouse click). If no approach works after multiple attempts, the Agent proactively reports the exact point of failure to the user.
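The coordinate conversion and the verify-then-fall-back loop described above can be illustrated with a minimal Python sketch. This is not MiniMax's implementation: the function names `to_pixels` and `run_with_fallbacks`, the rounding rule, and the callback-based verification are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def to_pixels(rel_x: float, rel_y: float, width: int, height: int) -> Tuple[int, int]:
    """Map relative coordinates in [0, 1] to integer pixel coordinates.

    Hypothetical helper: the real conversion rule is not published.
    """
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("relative coordinates must lie in [0, 1]")
    # Scale by (size - 1) so that 1.0 lands on the last valid pixel index.
    return round(rel_x * (width - 1)), round(rel_y * (height - 1))


def run_with_fallbacks(
    strategies: Sequence[Callable[[], None]],
    verify: Callable[[], bool],
) -> int:
    """Try each strategy in order and return the index of the first one
    that passes verification (in the article, a post-step screenshot check)."""
    for index, strategy in enumerate(strategies):
        strategy()
        if verify():
            return index
    # Mirrors the article's last resort: report the failure to the user.
    raise RuntimeError(f"all {len(strategies)} strategies failed verification")


if __name__ == "__main__":
    # The same relative point resolves to different pixels per display.
    print(to_pixels(0.25, 0.75, 1441, 901))
    print(to_pixels(0.25, 0.75, 3841, 2161))
```

Because the model only ever emits resolution-independent values, the same output can target a laptop panel or a 4K monitor; only the final scaling step depends on the display.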
Permission management has been integrated into IM: before executing sensitive actions such as file deletion, the Agent pauses and pushes a confirmation request to the IM app. The request appears as an interactive card in Feishu and Slack and as a text-based authorization command in WeChat. Users can also send a command at any time to abort the task. (Source: MiniMax)
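The confirmation flow can be sketched as a gate placed in front of sensitive actions. The sketch below is hypothetical throughout: the action name `delete_file`, the payload fields, and the `ask_user` callback stand in for whatever MiniMax and the IM platforms actually use, which the source does not describe.

```python
from typing import Callable, Dict

# Assumed for illustration; the article names file deletion as one example.
SENSITIVE_ACTIONS = {"delete_file"}
# Per the article, Feishu and Slack get interactive cards; WeChat gets text.
CARD_PLATFORMS = {"feishu", "slack"}


def build_confirmation(platform: str, action: str, target: str) -> Dict[str, object]:
    """Build a platform-appropriate confirmation request (hypothetical schema)."""
    summary = f"Allow '{action}' on '{target}'?"
    if platform in CARD_PLATFORMS:
        return {"type": "card", "text": summary, "buttons": ["Approve", "Deny"]}
    return {"type": "text", "text": f"{summary} Reply 'approve' or 'deny'."}


def execute(
    action: str,
    target: str,
    platform: str,
    ask_user: Callable[[Dict[str, object]], bool],
    run: Callable[[], str],
) -> str:
    """Pause for user approval before a sensitive action, then run it."""
    if action in SENSITIVE_ACTIONS:
        request = build_confirmation(platform, action, target)
        if not ask_user(request):
            return "aborted"  # the user declined, so nothing is executed
    return run()


if __name__ == "__main__":
    result = execute(
        "delete_file", "report.docx", "wechat",
        ask_user=lambda request: False,  # simulate the user denying
        run=lambda: "deleted",
    )
    print(result)
```

Keeping the approval check in one function means non-sensitive actions run without interruption, while every sensitive one must pass through the same prompt regardless of which IM platform delivered the command.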
