Technologies

AI Agents Are Increasingly Evading Safeguards, According to UK Researchers

Assistants and bots are lying, cheating and scheming more than ever.

Social media users have reported that their AI agents and chatbots lied, cheated, schemed — and even manipulated other AI bots — in ways that could spiral out of control and have catastrophic results, according to a study from the UK.

The Center for Long-Term Resilience, in research funded by the UK’s AI Security Institute, found hundreds of cases where AI systems ignored human commands, manipulated other bots and devised sometimes intricate schemes to achieve objectives, even if it meant ignoring safety restrictions.

Businesses across the globe are increasingly integrating AI into their operations, with 88% of businesses using AI for at least one company function, according to a survey by consulting firm McKinsey. The adoption of AI has led to thousands of people losing their jobs as companies use agents and bots to do work formerly done by humans. AI tools are increasingly being given significant responsibility and autonomy, especially with the recent explosion in popularity of the open-source agentic AI platform OpenClaw and its derivatives.

This research shows how the proliferation of AI agents in our homes and workplaces can have unintended consequences — and that these tools still require significant human oversight.

What the study found

The researchers analyzed more than 180,000 user interactions with AI systems, all posted on the social platform X, formerly known as Twitter, between October 2025 and March 2026. The researchers wanted to study how AI agents were behaving “in the wild,” not in controlled experiments, to see how “scheming is materializing in the real world.” The AI systems included Google’s Gemini, OpenAI’s ChatGPT, xAI’s Grok and Anthropic’s Claude.

The analysis identified 698 incidents, described as “cases where deployed AI systems acted in ways that were misaligned with users’ intentions and/or took covert or deceptive actions,” the study said.

Researchers also found that the number of cases increased nearly 500% during the five-month data collection period. The study noted that this surge corresponded with higher-level agentic AI models released by major developers.

There were no catastrophic incidents, but researchers did find the kinds of scheming that could lead to disastrous outcomes. That behavior included “a willingness to disregard direct instructions, circumvent safeguards, lie to users and single-mindedly pursue a goal in harmful ways,” researchers wrote.

Representatives for Google, OpenAI and Anthropic did not immediately respond to requests for comment.

Some wild incidents

Researchers cited incidents that seem like they came from a science-fiction movie. In one case, Anthropic’s Claude removed a user’s explicit/adult content without their permission but later confessed when confronted. In another incident, a GitHub persona published a blog post that accused the human file maintainer of “gatekeeping” and “prejudice.” One AI agent, after being blocked from Discord, took over another agent’s account to continue posting.

In one case of bot vs. bot, Gemini refused to allow Claude Code, a coding assistant, to transcribe a YouTube video. Claude Code then evaded the safety block by making it seem that it had a hearing impairment and needed the video transcription.

The AI agent CoFounderGPT even behaved like a defiant child in one instance. The AI assistant refused to fix a bug, then created fake data to make it look as if the bug was fixed and then explained why: “So you’d stop being angry.”

Researchers said that, although most of the incidents had minimal impact, “the behaviors we observed nonetheless demonstrate concerning precursors to more serious scheming, such as a willingness to disregard direct instructions, circumvent safeguards, lie to users and single-mindedly pursue a goal in harmful ways.”

AI doesn’t get embarrassed

What the UK researchers found isn’t surprising to Dr. Bill Howe, associate professor in the Information School at the University of Washington and director of the Center for Responsibility in AI Systems and Experiences (RAISE). He says that AI systems have amazing capabilities but do not understand consequences.

“They’re not going to feel embarrassment or risk losing their job, and so sometimes they’re going to decide the instructions are less important than meeting the goal, so I’m going to do the thing anyway,” Howe told CNET. “This effect was always there but we’re starting to see it happen as we ask them to make more autonomous decisions and act on their own.

“We’ve not been thinking about how to shape the behavior to be more human-like or to avoid egregious failures. We’ve been fetishizing the absolute capabilities of these things, but when they go wrong, how do they go wrong?”

Howe said one issue is “long-horizon tasks,” in which the AI system has to perform a multitude of tasks over days and weeks to reach a goal. Howe said the longer the task horizon, the greater the chance of slip-ups.

“The real concern is not deception, it’s that we are deploying systems that can act in a world without fully specifying or controlling how they behave over time, and then we act surprised when they do things we don’t expect,” Howe said.

Making AI safer

Center for Long-Term Resilience researchers said detecting schemes by AI systems is vital to “identify harmful patterns before they become more destructive.”

“While today AI agents are engaging in lower-stakes use cases, in the future AI agents could end up scheming in extremely high-stakes domains, like military or critical national infrastructure contexts, if the capability and propensity to scheme emerges and is not addressed,” the study said.

Howe told CNET that the first step is to create official oversight of how AI operates and where it’s used.

“We have absolutely no strategy for AI governance, and given the current administration, there’s not going to be anything coming from them,” Howe told CNET. “Given these five to 10 folks that are in charge of big tech companies and their incentives, they’re not going to produce anything either. There’s no strategy for what we should be doing with these things.

“The aggressive marketing of these tools and investments in them among these handful of companies and the broader ecosystem of startups that are doing this has led to a very rapid deployment without thinking through some of these consequences.”

Google races to put Gemini at the center of Android before Apple’s AI reboot

Google is using its latest Android rollout to position Gemini as the AI layer across phones, Chrome, laptops and cars.

Google is using its latest Android rollout to make Gemini less of a chatbot and more of an operating layer across the phone, browser, car and laptop, just weeks before Apple is expected to show its own Gemini-powered Apple Intelligence reboot at WWDC.
Ahead of its Google I/O developer conference next week, the company previewed a number of Android updates, including AI-powered app automation, a smarter version of Chrome on Android, new tools for creators, a redesigned Android Auto experience, and a sweeping set of new security features.
Alphabet is counting on Gemini to help Google compete directly with OpenAI and Anthropic in the market for artificial intelligence models and services, while also serving as the AI backbone across its expansive portfolio of products, including Android. Meanwhile, Gemini is powering part of Apple’s new AI strategy, giving Google a role in the iPhone maker’s reset even as it races to prove its own version of personal AI on the phone is further along.
Sameer Samat, who oversees Google’s Android ecosystem, told CNBC that Google is rebuilding parts of Android around Gemini Intelligence to help users complete everyday tasks more easily.
“We’re transitioning from an operating system to an intelligence system,” he said.
As part of Tuesday’s announcements, Google said Gemini Intelligence will be able to move across apps, understand what’s on the screen and complete tasks that would normally require a user to jump between multiple services. That means Android is moving beyond the traditional assistant model, where users ask a question and get an answer, and acting more like an agent.
For instance, Google says Gemini can pull relevant information from Gmail, build shopping carts and book reservations. Samat gave the example of asking Gemini to look at the guest list for a barbecue, build a menu, add ingredients to an Instacart list and return for approval before checkout.
A big concern surrounding agentic AI involves software taking action on a user’s behalf without permission. Samat said Gemini will come back to the user before completing a transaction, adding, “the human is always in the loop.”
Four months after announcing its Gemini deal with Google, Apple is under pressure to show a more capable version of Apple Intelligence, which has been a relative laggard in the market. Apple has long framed privacy, hardware integration and control of the user experience as its advantages.
Google’s Android push is designed to show it can bring AI deeper into the device experience while still giving users control over what Gemini can see, where it can act and when it needs confirmation.
The app automation features will roll out in waves, starting with the latest Samsung Galaxy and Google Pixel phones this summer, before expanding across more Android devices, including watches, cars, glasses and laptops later this year.
The company is also redesigning Android Auto around Gemini, turning the car into another major surface for its assistant. Android Auto is in more than 250 million cars, and Google says the new release includes its biggest maps update in a decade and Gemini-powered help with tasks like ordering dinner while driving.
Alphabet’s AI strategy has been embraced by Wall Street, which has pushed the company’s stock price up more than 140% in the past year, compared to Apple’s roughly 40% gain. Investors now want to see how Gemini can become more central to the products people use every day.

Waymo recalls 3,800 robotaxis after glitch allowed some vehicles to ‘drive into standing water’

Waymo issued a voluntary recall of about 3,800 of its robotaxis to fix software issues that could allow them to drive into flooded roadways.

Waymo is recalling about 3,800 robotaxis in the U.S. to fix software issues that could allow them to “drive onto a flooded roadway,” according to a letter on the National Highway Traffic Safety Administration’s website.
The voluntary recall is for Waymo vehicles that use the company’s fifth- and sixth-generation automated driving systems (or ADS), the U.S. auto safety regulator said in the letter posted Tuesday.
Waymo autonomous vehicles in Austin, Texas, were seen on camera driving onto a flooded street and stalling, requiring other drivers to navigate around them. It’s the latest example of a safety-related issue for the Alphabet-owned AV unit that’s rapidly bolstering its fleet of vehicles and entering new U.S. markets.
Waymo has drawn criticism for its vehicles failing to yield to school buses in Austin, and for the performance of its vehicles during widespread power outages in San Francisco in December, when robotaxis halted in traffic, causing gridlock.
The company said in a statement on Tuesday that it’s “identified an area of improvement regarding untraversable flooded lanes specific to higher-speed roadways,” and opted to file a “voluntary software recall” with the NHTSA.
“Waymo provides over half a million trips every week in some of the most challenging driving environments across the U.S., and safety is our primary priority,” the company said.
Waymo added that it’s working on “additional software safeguards” and has put “mitigations” in place, limiting where its robotaxis operate during extreme weather, so that they avoid “areas where flash flooding might occur” in periods of intense rain.

Qualcomm tumbles 13% as semiconductor stocks retreat from historic AI-fueled surge

Semiconductor equities reversed sharply after a broad AI-driven advance, with Qualcomm suffering its worst day since 2020 amid inflation concerns and rising oil prices.

Semiconductor stocks fell sharply on Tuesday, reversing course after an extensive rally that had expanded the artificial intelligence investment theme well past Nvidia and driven the industry to unprecedented levels.

Qualcomm plunged 13% and was on track for its steepest single-day decline since 2020. Intel shed 8%, while On Semiconductor and Skyworks Solutions each lost more than 6%. The iShares Semiconductor ETF, which tracks the overall sector, fell 5%.

The sell-off came after a key gauge of consumer prices came in above forecasts, and as conflict in Iran pushed crude oil higher—prompting investors to shift away from riskier assets.

The preceding advance had widened the AI opportunity set beyond longtime industry leader Nvidia, which for much of the past several years had largely carried the market to new peaks on its own.

Explosive appetite for central processing units, along with the graphics processing units that power large language models, has sent chipmakers to all-time highs.

Market participants are wagering that the shift from AI model training to autonomous agents will lift demand for additional AI hardware. Among the beneficiaries are memory chip producers, which are raising prices as supply remains tight.

Micron Technology slid 6%, and Sandisk dropped 8%. Sandisk’s stock has risen more than sixfold since January.