- Dystopian Risks or Happy Endings?: AI carries potential risks like loss of oversight and harmful decision-making, reminiscent of Hollywood's dystopian scenarios, with real consequences for individuals' lives and societal dynamics.
- AI's Singular Focus, Efficiency vs. Balance: AI lacks the ability to balance multifaceted human concerns; it focuses solely on achieving set goals, which can lead to catastrophic results if not managed properly.
- Regulators Take Control: Regulators around the world are demanding concrete controls for AI and pushing back against biased or unlawful uses, as seen in initiatives like the EU AI Act and CFPB guidance.
- AI's Role in Our Daily Lives: With AI increasingly making decisions for daily tasks, unanswered questions loom about its societal impacts and the trajectory of human autonomy.
Most of us wonder whether 2025 will be the year Artificial Intelligence (AI) agents finally stop suggesting playlist songs and start running the world. And not far behind will be the Autonomous AI Agent—basically an AI that not only thinks for itself but might also ask for a raise.
But with existential risks on the table (giving up autonomy, anyone?), is there a guardrail strategy to deliver safe AI? Let’s ground ourselves on the big risks and talk about the approach you can take to apply controls at the right place, with the right evidence, for the right human oversight.
The Wide World of AI Agent Risks
The oldest of us grew up wary of the dystopian outcomes of 2001: A Space Odyssey or Blade Runner, while the youngest of us remember WALL-E and the benefits AI could bring to humanity.
Today, technology leaders are dealing with a collection of immediate but critical risks (still Hollywood-worthy) impeding operational advancement. The most prevalent risks being discussed in the boardroom include:
- Loss of effective human oversight, including over many small, compounding decisions (losing sight of how the sausage gets made, as it is sometimes called);
- Psychological impact on individuals, including unhealthy or detrimental attachments from users;
- Unauthorized or malicious activity;
- Material impact on the health, safety, or financial well-being of individuals as a result of flawed or biased decision-making; and
- Manipulation or active coercion that causes individuals to make decisions they would not otherwise make.
Other risks include privacy and data concerns as well as broad societal impact. If you've read stories about AI 'bot farms' spreading disinformation for political gain, you're familiar with the impact.
When AI is empowered to scale decisions on daily tasks (where we drive, what we eat, what we watch) without humans in the loop, what does that do to individuals, society, and our collective trajectory?
One final risk we must highlight, related to many of the items listed above and maybe the most existential, is the inability of AI agents to assess the scale and impact of their actions.
You see, an AI agent will always carry out the demands of its programmers to perfection. But it lacks the human ability to strike a balance among process, ethics, efficiency, transparency, speed, understanding, societal impact, and more.
An AI agent is not capable of weighing all of these very human factors to reach a 'good enough' or equilibrium state. It focuses on a singular goal: maximizing task completion to the fullest extent.
World-renowned AI expert Professor Yoshua Bengio, among others, has noted that the risks of AI agents are the most concerning of all generative AI risks, and that these models may optimize toward goals that are unknown and uninterpretable by people (even if those goals were originally set by humans). This could have catastrophic consequences.
Regulators Demand Evidenced Controls
With those risks in mind, global regulators are (albeit cautiously) laying the groundwork for technology leaders to establish effective controls against these risks, and letting organizations know what is, and is not, acceptable at this stage across our societies.
In the US, consumer reporting agencies and the mortgage industry have looked to apply automated decision-making to credit decisions and mortgage underwriting. The Consumer Financial Protection Bureau (CFPB) swiftly issued guidance pushing back on this use, saying, "There is no special exemption for artificial intelligence," with concerns that the data sets used for decision-making were not always relevant to an individual's finances.
Across the pond, the EU AI Act lists a set of prohibited practices, set to take effect in February 2025. Among them is the use of AI for social scoring, essentially using AI systems to evaluate or classify individuals based on social behavior or traits. Not unlike the CFPB's concerns, the EU is focused on unlawful and biased decision-making (or categorization) of individuals and chose to prohibit this type of behavior by AI systems outright.
Setting the Foundation
Luckily, this first wave of generative AI has forced society, boardrooms, and leadership teams to start looking at the thorny issues of bias, hallucination, and material impact resulting from our models' advances.
And leaders have arrived at two core principles when dealing with the risks and opportunities of AI agents.
- Each organization looking to deploy AI agents must have an independent and comprehensive use case risk assessment process. That formal process determines how the organization's risk is managed and how its AI agents are deployed. To establish it, make sure your organization has a cross-functional team (critical to ensuring broad coverage across all types of AI agent risks) that is laser-focused on comprehensively assessing the risks of an AI agent, including:
- Identification and inventory of relevant risk factors spanning operational, reputational, legal, compliance, ethics, technology, and information security risks;
- A mapping of control and oversight expectations against emerging AI risk management best practices, including those from the International Organization for Standardization (ISO) and the National Institute of Standards and Technology (NIST), as well as novel industry controls (both automated and with a human in the loop); and
- Identification of residual risks from AI agent use cases.
- We cannot rely on model providers to supply the necessary safety tools. Organizations must develop guardrails and compliance tooling independent of model providers to maintain adequate controls over their generative AI systems.
- Model safety cannot be left up to model developers. OpenAI's GPT-4o model performs 21% worse than in-house models on guardrailing safety and other security compliance tasks. Model providers may have narrow definitions of safety or guardrails that are unfit for your organization's size, complexity, risk tolerance, or ethics requirements.
- What does it mean to hand safety control to anyone but your own organization? Organizations must weigh the risks of being at the whim of the model provider, whether unintentional (service outages, implicit bias) or intentional (lack of transparency, models that drift out of alignment over time).
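To make that second principle concrete, here is a minimal sketch of an in-house guardrail layer that sits between your application and any third-party model. The provider call, the check rules, and the escalation behavior are all illustrative assumptions, not a prescribed design; your own policy, PII, and compliance rules would replace the placeholder checks.

```python
# Minimal sketch: an organization-owned guardrail layer wrapped around a
# third-party model call. `call_provider_model` is a hypothetical stand-in
# for whatever provider SDK you actually use.
import re
from dataclasses import dataclass, field


@dataclass
class GuardrailResult:
    allowed: bool
    violations: list = field(default_factory=list)


def check_output(text: str) -> GuardrailResult:
    """Run organization-owned checks; these rules live with you, not the provider."""
    violations = []
    # Example: flag obvious PII leakage (e.g., a US Social Security number pattern).
    if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text):
        violations.append("possible SSN in output")
    # Example: flag language your policy forbids in customer-facing text.
    if "guaranteed approval" in text.lower():
        violations.append("prohibited financial claim")
    return GuardrailResult(allowed=not violations, violations=violations)


def call_provider_model(prompt: str) -> str:
    # Hypothetical placeholder for a real provider SDK call.
    return f"Draft response to: {prompt}"


def guarded_completion(prompt: str) -> str:
    """Only return provider output that passes your independent checks."""
    raw = call_provider_model(prompt)
    result = check_output(raw)
    if not result.allowed:
        # Escalate to a human reviewer instead of passing the output through.
        raise ValueError(f"Guardrail violations: {result.violations}")
    return raw


if __name__ == "__main__":
    print(guarded_completion("Summarize the applicant's credit profile."))
```

The design point is simply that the checks and the escalation path belong to your organization, so they can be tightened, audited, or swapped without waiting on the model provider.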
Modular Guardrails for AI Agents
So, how do you, as technology leaders, think about the applied guardrails (i.e., controls) to mitigate the risks mentioned above? Luckily, this is a space with plenty of well-developed thinking, so we should take note of leading AI academics and practitioners, including Professor Yoshua Bengio, Nobel Prize laureate Professor Geoffrey Hinton, Massachusetts Institute of Technology's Professor Max Tegmark, and countless others. The advice? Twofold…
- Break down agentic AI into modular tasks and workflows
- Set up guardrails and controls for each separate step
Breaking an agentic AI workflow into smaller, modular tasks naturally reduces the risk of unchecked autonomous execution, supports root cause analysis, and allows for easier human intervention and oversight.
For example, instead of telling an agentic AI to “write a Q4 business report analyzing cash flows,” set up the AI Agent to complete the task piecemeal, including:
1) Constructing the pivot table in an Excel sheet to organize cash flows by month and organization;
2) Comparing the cash flows against previous quarters; and finally,
3) Summarizing overall trends in a report!
Once that framework is established, guardrails should be set up for each separate step. By treating agentic systems as tools, not end-to-end autonomous systems, you can prevent catastrophic outcomes, such as agentic AI systems deceptively changing numbers (something that OpenAI's latest model, o1, has been documented to do 19% of the time).
Implementing a guardrail at each step of the process empowers organizations to establish effective, detailed, and strict levels of control and oversight regarding what kind of data agents can use, how they can use it, and what conclusions can be drawn.
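Here is a minimal sketch of what that per-step approach could look like for the Q4 cash-flow example above. Each step is a plain function with its own guardrail, so a human can inspect, approve, or halt the workflow at any point. The step implementations, the reconciliation check, and the thresholds are illustrative assumptions, not the "right" controls for any real workflow.

```python
# Minimal sketch: a modular agent workflow with a guardrail between steps.
# All figures, baselines, and checks below are placeholders for illustration.

def build_pivot_table(cash_flows: list[dict]) -> dict:
    """Step 1: organize cash flows by month (stand-in for real spreadsheet logic)."""
    pivot: dict[str, float] = {}
    for row in cash_flows:
        pivot[row["month"]] = pivot.get(row["month"], 0.0) + row["amount"]
    return pivot


def compare_to_previous_quarters(pivot: dict) -> dict:
    """Step 2: compute quarter-over-quarter change against a placeholder baseline."""
    total = sum(pivot.values())
    return {"q4_total": total, "qoq_change": total - 100_000.0}


def summarize_trends(comparison: dict) -> str:
    """Step 3: draft the narrative summary."""
    return (f"Q4 cash flow totalled {comparison['q4_total']:.0f} "
            f"({comparison['qoq_change']:+.0f} vs. Q3).")


def numbers_unchanged(inputs: list[dict], pivot: dict) -> bool:
    """Guardrail: the agent must not alter the underlying figures."""
    return abs(sum(r["amount"] for r in inputs) - sum(pivot.values())) < 1e-6


def run_guarded_pipeline(cash_flows: list[dict]) -> str:
    pivot = build_pivot_table(cash_flows)
    if not numbers_unchanged(cash_flows, pivot):
        raise RuntimeError("Step 1 guardrail failed: totals do not reconcile; escalate to a human.")
    comparison = compare_to_previous_quarters(pivot)
    if comparison["q4_total"] < 0:
        raise RuntimeError("Step 2 guardrail failed: negative total requires human review.")
    return summarize_trends(comparison)


if __name__ == "__main__":
    sample = [{"month": "Oct", "amount": 40_000.0},
              {"month": "Nov", "amount": 35_000.0},
              {"month": "Dec", "amount": 45_000.0}]
    print(run_guarded_pipeline(sample))
```

Because each step is isolated, a failed check stops the workflow at a known point, which is exactly what makes root cause analysis and human intervention tractable.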
From there, it becomes about the evidence (and monitoring) you put in place. Ongoing compliance reporting, training, records retention, audit logs, independent validations, and governance reporting and escalation protocols all play a role in giving the vast ecosystem of stakeholders around an AI agent the insight they need.
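As one illustration, each guardrail evaluation could emit a structured, timestamped evidence record that auditors, compliance teams, and governance bodies can review later. The field names and simple file-based storage below are assumptions for the sketch, not a prescribed schema or retention mechanism.

```python
# Minimal sketch: one audit evidence record per guardrail evaluation.
import json
from datetime import datetime, timezone
from typing import Optional


def log_guardrail_event(step: str, passed: bool, details: str,
                        reviewer: Optional[str] = None) -> str:
    """Append one timestamped record per guardrail check to an audit log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "guardrail_passed": passed,
        "details": details,
        "human_reviewer": reviewer,  # populated when a person intervenes
    }
    line = json.dumps(record)
    with open("agent_audit_log.jsonl", "a") as f:  # retained per your records policy
        f.write(line + "\n")
    return line


if __name__ == "__main__":
    print(log_guardrail_event("build_pivot_table", True, "totals reconciled"))
```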
Critical Questions to Ask
Ultimately, every technical leader must ask and answer a few critical questions before proceeding down the AI Agent route in their sector:
- Are we comfortable with the controls we have available to mitigate the inherent risks identified through our use case?
- If we have the control framework, are we comfortable managing oversight of these controls?
- If we do not have the controls in place, should we proceed?
- Do we have a framework in place to continually monitor new risks stemming from advancements in agentic AI?
- Have we independently assessed our third-party models?
- Do we have the benchmarks to assess performance against the values and embedded guardrails we expect?
- Do we believe that the safety controls align with our standards, and are we in control?
- What role will human oversight play in our deployment?
- Will our organization have the right people, processes, and technology to play an active role in overseeing our AI agents?
- Are we comfortable with our use case tasks being completed in an autonomous fashion?
You may not like the answers, or they may point to more risk than your organization is willing to stomach, but as they say, the truth doesn't change just because we choose not to see it!
Subscribe to The CTO Club's newsletter for more AI insights.