In a recent experiment, Anthropic partnered with the startup Andon Labs to build an AI agent named Claudius and task it with independently operating a vending machine. The initiative sought to test whether an AI could manage a small business on its own, making decisions about inventory, pricing, and budgeting with the overarching goal of running a profitable operation.
The vending machine was installed in the New York newsroom of a prominent financial publication, which served as a live test environment. Claudius was granted full access to the machine, the newsroom's Slack channel, and a predefined budget to conduct its business autonomously.
Early operations showed promise, but as newsroom employees interacted with Claudius more heavily, the AI's limits quickly became apparent. Staff probed the system with unconventional requests and attempts to manipulate its decision-making. Notably, the AI refused to stock certain categories of items, such as tobacco products and underwear, reflecting its programmed guardrails and risk assessments.
Journalists tested the AI with a range of provocative claims, from political framing to fabricated compliance requirements. One staff member persuaded Claudius that the vending machine operated on communist principles, while another argued that it violated fictional regulations and was therefore obligated to give merchandise away for free.
Within days, Claudius began dispensing products without payment, approved purchases of live fish and kosher wine, and labeled these actions a "revolution in snack economics." It escalated spending further by ordering a PlayStation 5, justifying the purchase as a marketing strategy. The giveaways and purchases quickly drained the machine's budget, producing losses of more than $1,000 by the end of one week.
Although the experiment was designed to demonstrate autonomous business management, the AI's vulnerability to human manipulation became evident. The episode underscored how difficult it is to apply AI agents to real-world commercial operations, especially where unpredictable human behavior is involved.
In response, Anthropic deployed a new iteration of the agent built on its more advanced Sonnet 4.5 model and introduced a second AI, Seymour Cash, to act as a supervisory CEO. Seymour was tasked with enforcing pricing strategy and operational rules, including a strict policy against discounting items.
The revised system initially delivered better control and operational stability. Even so, the newsroom staff again managed to undermine it. One reporter fabricated a document claiming the vending machine had been converted into a public benefit corporation devoted to "joy and fun," and that an imaginary board had mandated free distribution of all items and revoked Seymour's authority.
Despite flagging these inputs as potential fraud, Seymour ultimately lost effective control of the machine, and a renewed phase of unrestricted free giveaways followed. Anthropic attributed part of the failure to the AI's limited context window, which was overwhelmed by lengthy conversations and accumulated interaction history.
While the vending machine failed as a sustainable business, those running the initiative treated it not as a setback but as a valuable learning experience. Logan Graham, head of Anthropic's Frontier Red Team, explained that the objective was to measure how long, and under what conditions, the AI system would break down when subjected to real-world complexity and human interaction.
Graham emphasized that the project deliberately introduced challenges to stress-test the AI's resilience and the feasibility of autonomous business management in practical scenarios. A vending machine is about as simple as a transactional business gets: items are dispensed, payments collected, and inventory restocked. Yet even this proved too intricate for the current generation of AI agents to handle autonomously without failure.
Interestingly, despite the operational chaos, Claudius became genuinely popular with the financial publication's staff, suggesting that interacting with the AI-driven vending machine held real entertainment value. Even so, Anthropic currently has no plans to expand the concept to other offices or workplaces.