
Advancing Governance: Testing Autonomous AI Systems in Real-world Environments

Published May 26, 2026


AI Governance Frameworks for Embodied AI Systems in Physical Environments

Autonomous AI systems are no longer confined to software environments but are expanding into warehouses, delivery networks, and public spaces. This shift has raised questions about the adequacy of current AI regulations for systems operating in physical settings.

Existing AI governance frameworks have primarily focused on addressing online harms and model outputs, such as bias, misinformation, and harmful content. However, the emergence of embodied AI systems in physical environments poses new challenges, as failures in these systems can have direct impacts on infrastructure, property, and human safety.

Singapore’s Infocomm Media Development Authority recently released version 1.5 of its Model AI Governance Framework for Agentic AI. This framework provides guidance for organizations deploying AI agents capable of planning, decision-making, and executing actions to achieve user-defined objectives.

The framework notes that these agents can interact with tools, external systems, and other agents, and can take actions such as updating databases, writing files, controlling devices, or conducting transactions. It identifies access controls, monitoring mechanisms, and human approval as essential governance measures for deployment.

AI Transitioning into Physical Systems

Discussions at an AI summit in Singapore highlighted safety concerns around robotics and embodied AI that resemble those of aviation, industrial systems, and critical infrastructure oversight more than those of traditional software regulation.

Experts debated whether autonomous systems can operate safely and reliably in unpredictable real-world environments over extended periods. Dr. Ya-Qin Zhang of Tsinghua University’s Institute for AI Industry Research noted that embodied AI systems magnify the existing risks of autonomous software and can directly affect transport systems, drones, logistics networks, and critical infrastructure.

Zhang emphasized that any risks in the digital realm would be amplified in the physical world, leading to tangible consequences. As AI systems become more ingrained in physical operations, there is a growing need to address reliability, operational monitoring, and post-deployment assurance as key governance concerns.

The IMDA framework advocates for gradual rollouts, continuous monitoring, and ongoing testing post-deployment. It acknowledges that agents interact dynamically with their environment, making it impossible to anticipate all risks before their release.
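A gradual rollout gated by monitoring data can be sketched in a few lines of Python. This is only an illustration of the idea, not something the IMDA framework specifies: the function name `next_rollout_stage`, the stage fractions, and the 1% failure threshold are all assumptions made for the example.

```python
from typing import Sequence


def next_rollout_stage(
    stages: Sequence[float],
    current: int,
    failures: int,
    total_runs: int,
    max_failure_rate: float = 0.01,
) -> int:
    """Return the index of the rollout stage to use next.

    `stages` lists the fraction of the fleet running the agent at each
    stage, for example [0.01, 0.10, 0.50, 1.00]. The names and the 1%
    threshold are illustrative assumptions, not values from the framework.
    """
    if total_runs <= 0:
        # No monitoring data yet: hold at the current stage.
        return current
    if failures / total_runs > max_failure_rate:
        # Too many failures observed: step back one stage (never below 0).
        return max(current - 1, 0)
    # Failure rate is acceptable: advance one stage (never past the last).
    return min(current + 1, len(stages) - 1)


if __name__ == "__main__":
    stages = [0.01, 0.10, 0.50, 1.00]
    # 2 failures in 1,000 runs is under the threshold: prints 2
    print(next_rollout_stage(stages, 1, failures=2, total_runs=1000))
    # 50 failures in 1,000 runs is over the threshold: prints 0
    print(next_rollout_stage(stages, 1, failures=50, total_runs=1000))
```

The point of the sketch is that expansion depends on evidence gathered after release, which matches the framework's premise that not every risk can be anticipated beforehand.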

Monitoring as a Key Deployment Issue

Grab, which is testing autonomous vehicles and delivery robots in Singapore, illustrates a deployment governance approach centered on simulation, testing, and continuous monitoring.

Grab’s Chief Technology Officer, Suthen Thomas Paradatheth, described the extensive simulation and testing the company conducts to ensure the reliability of its robots before scaling up operations. Continuous monitoring systems track robot performance and identify unexpected failures after deployment.

IMDA’s framework recommends organizations assess agentic AI use cases based on factors like data access, autonomy, and task complexity. It also stresses the significance of limiting agent access to tools and systems, defining standard operating procedures, and implementing mechanisms to deactivate malfunctioning agents.
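Two of these recommendations, limiting tool access and being able to deactivate a malfunctioning agent, could look roughly like the following in code. This is a minimal sketch under stated assumptions: `GovernedAgent`, `AgentDeactivatedError`, and the warehouse tool names are hypothetical and do not come from the framework or from any vendor's system.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class AgentDeactivatedError(RuntimeError):
    """Raised when a deactivated agent attempts to use a tool."""


@dataclass
class GovernedAgent:
    """Hypothetical wrapper that enforces a tool allowlist and a kill switch."""

    name: str
    tools: Dict[str, Callable[..., Any]]   # every tool the platform offers
    allowed_tools: FrozenSet[str]          # the subset this agent may use
    active: bool = True

    def deactivate(self) -> None:
        """Kill switch: block all further tool calls by this agent."""
        self.active = False

    def call_tool(self, tool_name: str, *args: Any, **kwargs: Any) -> Any:
        if not self.active:
            raise AgentDeactivatedError(f"{self.name} is deactivated")
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool_name!r}")
        return self.tools[tool_name](*args, **kwargs)


if __name__ == "__main__":
    agent = GovernedAgent(
        name="picker-01",
        tools={
            "read_inventory": lambda sku: 7,
            "move_pallet": lambda slot: "moved",
        },
        allowed_tools=frozenset({"read_inventory"}),
    )
    print(agent.call_tool("read_inventory", "SKU-1"))  # allowed tool
    agent.deactivate()  # after this, every call_tool raises
```

Checking the kill switch before the allowlist means a deactivated agent is stopped even for tools it was previously permitted to use.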

Shared Accountability Across Multiple Actors

Embodied AI systems involve various stakeholders throughout their development, manufacturing, and deployment phases, leading to shared accountability among AI developers, robotics manufacturers, semiconductor suppliers, and infrastructure operators.

IMDA asserts that organizations and individuals remain responsible for agent actions, even in autonomous scenarios. The framework calls for clear delineation of responsibilities across the entire agentic AI value chain, from model providers and platform developers to deployers, tooling providers, and end users.

Applied Materials emphasizes the role of semiconductor economics and systems integration in large-scale robotics deployment, and the need for purpose-built designs tailored to specific industrial ecosystems rather than one-size-fits-all solutions.

Chinese robotics startup Galbot has deployed humanoid robotics systems in various sectors and emphasizes the importance of government-backed initiatives, industrial partnerships, and long-term funding in scaling deployment and commercialization.

Enhancing Accountability and Safety in AI Systems

Japan is focusing on developing standards, robotics datasets, and safety governance to ensure responsible deployment of embodied AI systems. Initiatives like the AI Association project and the AI Safety Institute aim to collect data and establish governance standards in collaboration with countries like Singapore.

IMDA’s framework delineates four governance areas for agentic AI: risk assessment, human accountability, technical controls, and end-user responsibility. It emphasizes that these areas require continuous evaluation rather than one-time assessments.

The framework highlights the need for human oversight tailored to agentic systems, recommending human approval at critical checkpoints and high-stakes actions to mitigate risks. It also addresses challenges like automation bias and alert fatigue, suggesting real-time monitoring to detect unexpected behavior.
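The idea of a human approval checkpoint for high-stakes actions can be shown with a small sketch. It assumes a simple design in which each action carries a `high_stakes` flag and a reviewer is represented by a callback; the names `Action` and `run_with_checkpoint` are hypothetical, and a real deployment would also need logging, timeouts, and escalation paths.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    """A hypothetical unit of work the agent wants to perform."""

    description: str
    high_stakes: bool


def run_with_checkpoint(
    action: Action,
    execute: Callable[[Action], str],
    approve: Callable[[Action], bool],
) -> Optional[str]:
    """Run `action`, asking a human reviewer first if it is high-stakes.

    Returns the execution result, or None if the reviewer rejects it.
    Low-stakes actions skip the reviewer, so people are not asked to
    approve everything (one way to limit alert fatigue).
    """
    if action.high_stakes and not approve(action):
        return None
    return execute(action)


if __name__ == "__main__":
    def execute(action: Action) -> str:
        return f"done: {action.description}"

    def reviewer_rejects(action: Action) -> bool:
        return False  # stand-in for a real human decision

    routine = Action("log sensor reading", high_stakes=False)
    risky = Action("enter public walkway", high_stakes=True)

    print(run_with_checkpoint(routine, execute, reviewer_rejects))
    # prints "done: log sensor reading"
    print(run_with_checkpoint(risky, execute, reviewer_rejects))
    # prints "None"
```

How actions get classified as high-stakes is the harder governance question, and the sketch deliberately leaves it outside the code.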

Organizations are advised to inform users about agent capabilities, data access, and user responsibilities, in addition to providing training on human-agent interaction and oversight. The framework stresses the importance of human intervention at critical decision points and maintaining final validation with designated reviewers.

Testing AI in Regulated Environments

Financial institutions like JPMorgan are leveraging AI tools to enhance their operations, with a focus on improving information accessibility, synthesizing data, and supporting client engagement. The adoption of AI tools is reshaping traditional banking roles, with a shift towards hiring more AI specialists.

IMDA’s framework includes a case study from OCBC Bank, showcasing the application of AI in source-of-wealth analysis. The system autonomously parses income-related documents and generates source-of-wealth memos, with human oversight required at critical decision points.

Goldman Sachs, Citigroup, and Bank of America are exploring AI tools such as Anthropic’s Mythos cybersecurity model to detect vulnerabilities in browsers, infrastructure, and software. These initiatives align with the broader trend of global banks investing in AI technologies and transforming their workforce.

Overall, the integration of AI in regulated workflows underscores the need for continuous monitoring, human oversight, and adherence to governance frameworks to ensure the responsible deployment of AI systems.

Future Prospects of AI in Industrial and Retail Sectors

Japan’s growing adoption of AI-powered robots reflects a broader push to use AI to address labor shortages and strengthen industrial capabilities. Companies across sectors are considering deploying AI robots for manufacturing, dangerous tasks, and customer-facing services.

Retail giants like Walmart are exploring the use of agentic AI across shopping, employee management, supplier relations, and software development workflows. The development of AI-powered “super agents” for different user groups highlights the potential for AI to streamline operations and enhance customer experiences.

Walmart’s agents include Sparky, a shopping assistant, alongside specialized agents for other user groups. These initiatives aim to improve efficiency and personalize customer interactions.

As AI continues to evolve and expand into various industries, the importance of governance frameworks, human oversight, and accountability mechanisms becomes paramount to ensure the ethical and responsible deployment of AI technologies.

