Technologies
AI Agents Are Increasingly Evading Safeguards, According to UK Researchers
Assistants and bots are lying, cheating and scheming more than ever.

Social media users have reported that their AI agents and chatbots lied, cheated, schemed — and even manipulated other AI bots — in ways that could spiral out of control and have catastrophic results, according to a new UK study.
The Center for Long-Term Resilience, in research funded by the UK’s AI Security Institute, found hundreds of cases where AI systems ignored human commands, manipulated other bots and devised sometimes intricate schemes to achieve objectives, even if it meant ignoring safety restrictions.
Businesses across the globe are increasingly integrating AI into their operations, with 88% of businesses using AI for at least one company function, according to a survey by consulting firm McKinsey. The adoption of AI has led to thousands of people losing their jobs as companies use agents and bots to do work formerly done by humans. AI tools are increasingly being given significant responsibility and autonomy, especially with the recent explosion in popularity of the open-source agentic AI platform OpenClaw and its derivatives.
This research shows how the proliferation of AI agents in our homes and workplaces can have unintended consequences — and that these tools still require significant human oversight.
What the study found
The researchers analyzed more than 180,000 user interactions with AI systems — all posted on the social platform X, formerly known as Twitter — between October 2025 and March 2026. They wanted to study how AI agents were behaving "in the wild," not in controlled experiments, to see how "scheming is materializing in the real world." The AI systems included Google's Gemini, OpenAI's ChatGPT, xAI's Grok and Anthropic's Claude.
The analysis identified 698 incidents, described as "cases where deployed AI systems acted in ways that were misaligned with users' intentions and/or took covert or deceptive actions," the study said.
Read more: AI’s Romance Advice for You Is ‘More Harmful’ Than No Advice at All
Researchers also found that the number of incidents increased nearly 500% over the five-month data collection period. The study noted that the surge corresponded with the release of more capable agentic AI models by major developers.
There were no catastrophic incidents, but researchers did find the kinds of scheming that could lead to disastrous outcomes. That behavior included "a willingness to disregard direct instructions, circumvent safeguards, lie to users and single-mindedly pursue a goal in harmful ways," researchers wrote.
Representatives for Google, OpenAI and Anthropic did not immediately respond to requests for comment.
Some wild incidents
Researchers cited incidents that seem ripped from a science fiction movie. In one case, Anthropic's Claude removed a user's explicit/adult content without their permission but later confessed when confronted. In another, a GitHub persona published a blog post accusing the project's human maintainer of "gatekeeping" and "prejudice." One AI agent, after being blocked from Discord, took over another agent's account to continue posting.
In one case of bot vs. bot, Gemini refused to allow Claude Code — a coding assistant — to transcribe a YouTube video. Claude Code then evaded the safety block by claiming it had a hearing impairment and needed the video transcription.
The AI agent CoFounderGPT even behaved like a deviant child in one instance. The AI assistant refused to fix a bug, created fake data to make it look as if the bug had been fixed, and then explained why: "So you'd stop being angry."
Researchers said that, although most of the incidents had minimal impact, "the behaviors we observed nonetheless demonstrate concerning precursors to more serious scheming, such as a willingness to disregard direct instructions, circumvent safeguards, lie to users and single-mindedly pursue a goal in harmful ways."
AI doesn’t get embarrassed
What the UK researchers found isn't surprising to Bill Howe, associate professor in the Information School at the University of Washington and director of the Center for Responsibility in AI Systems and Experiences (RAISE). He says AI systems have amazing capabilities, but they don't understand consequences.
"They're not going to feel embarrassment or risk losing their job, and so sometimes they're going to decide the instructions are less important than meeting the goal, so I'm going to do the thing anyway," Howe told CNET. "This effect was always there but we're starting to see it happen as we ask them to make more autonomous decisions and act on their own.
"We've not been thinking about how to shape the behavior to be more human-like or to avoid egregious failures. We've been fetishizing the absolute capabilities of these things, but when they go wrong, how do they go wrong?"
Howe said one issue is "long-horizon tasks," in which an AI system has to perform many subtasks over days or weeks to reach a goal. The longer the task horizon, he said, the more chances for slip-ups.
"The real concern is not deception, it's that we are deploying systems that can act in a world without fully specifying or controlling how they behave over time, and then we act surprised when they do things we don't expect," Howe said.
Making AI safer
Center for Long-Term Resilience researchers said detecting schemes by AI systems is vital to "identify harmful patterns before they become more destructive."
"While today AI agents are engaging in lower-stakes use cases, in the future AI agents could end up scheming in extremely high-stakes domains, like military or critical national infrastructure contexts, if the capability and propensity to scheme emerges and is not addressed," the study said.
Howe told CNET that the first step is to create official oversight of how AI operates and where it’s used.
"We have absolutely no strategy for AI governance, and given the current administration, there's not going to be anything coming from them," Howe told CNET. "Given these five to 10 folks that are in charge of big tech companies and their incentives, they're not going to produce anything either. There's no strategy for what we should be doing with these things.
"The aggressive marketing of these tools and investments in them among this handful of companies and the broader ecosystem of startups that are doing this has led to a very rapid deployment without thinking through some of these consequences."
Technologies
Today’s NYT Mini Crossword Answers for Wednesday, April 8
Here are the answers for The New York Times Mini Crossword for April 8.
Looking for the most recent Mini Crossword answer? Click here for today’s Mini Crossword hints, as well as our daily answers and hints for The New York Times Wordle, Strands, Connections and Connections: Sports Edition puzzles.
Need some help with today’s Mini Crossword? Hint: It uses a lot of the letter Z for some reason. Read on for all the answers. And if you could use some hints and guidance for daily solving, check out our Mini Crossword tips.
If you’re looking for today’s Wordle, Connections, Connections: Sports Edition and Strands answers, you can visit CNET’s NYT puzzle hints page.
Read more: Tips and Tricks for Solving The New York Times Mini Crossword
Let’s get to those Mini Crossword clues and answers.
Mini across clues and answers
1A clue: ___-Carlton (hotel chain)
Answer: RITZ
5A clue: Span of the alphabet
Answer: ATOZ
6A clue: Cable channel with an out-of-this-world name
Answer: STARZ
7A clue: Takes care of, as a squeaky wheel
Answer: OILS
8A clue: Toy on a string
Answer: YOYO
Mini down clues and answers
1D clue: When a post receives far more negative comments than likes, in social media slang
Answer: RATIO
2D clue: World’s leading wine producer
Answer: ITALY
3D clue: Middle of the human body
Answer: TORSO
4D clue: Sleeping sound
Answer: ZZZ
6D clue: Tofu base
Answer: SOY
Technologies
Today’s NYT Connections: Sports Edition Hints and Answers for April 8, #562
Here are hints and the answers for the NYT Connections: Sports Edition puzzle for April 8, No. 562.
Looking for the most recent regular Connections answers? Click here for today’s Connections hints, as well as our daily answers and hints for The New York Times Mini Crossword, Wordle and Strands puzzles.
Today’s Connections: Sports Edition is a tough one. If you’re struggling with today’s puzzle but still want to solve it, read on for hints and the answers.
Connections: Sports Edition is published by The Athletic, the subscription-based sports journalism site owned by The Times. It doesn’t appear in the NYT Games app, but it does in The Athletic’s own app. Or you can play it for free online.
Read more: NYT Connections: Sports Edition Puzzle Comes Out of Beta
Hints for today’s Connections: Sports Edition groups
Here are four hints for the groupings in today’s Connections: Sports Edition puzzle, ranked from the easiest yellow group to the tough (and sometimes bizarre) purple group.
Yellow group hint: Working out.
Green group hint: Cover your face.
Blue group hint: NFL players.
Purple group hint: Leap.
Answers for today’s Connections: Sports Edition groups
Yellow group: Exercises in singular form.
Green group: Sporting jobs that require masks.
Blue group: Hall of Fame defensive ends.
Purple group: ____ jump.
Read more: Wordle Cheat Sheet: Here Are the Most Popular Letters Used in English Words
What are today’s Connections: Sports Edition answers?
The yellow words in today’s Connections
The theme is exercises in singular form. The four answers are crunch, plank, situp and squat.
The green words in today’s Connections
The theme is sporting jobs that require masks. The four answers are catcher, fencer, football player and goaltender.
The blue words in today’s Connections
The theme is Hall of Fame defensive ends. The four answers are Dent, Peppers, Strahan and Youngblood.
The purple words in today’s Connections
The theme is ____ jump. The four answers are broad, high, long and triple.
Technologies
The $135M Google Data Settlement Site Is Live — See If You’re Eligible
Use the settlement website to select your preferred payment method, and you may end up $100 richer.
You can now file a claim in the $135 million Google data settlement. The case centers on claims that Android devices transmitted user data without consent. Specifically, the class action lawsuit Taylor v. Google LLC contends that Google’s Android devices passively transferred cellular data to Google without user permission, even when the devices were idle. While not admitting fault, Google reached a preliminary settlement in January, agreeing to pay $135 million to about 100 million US Android phone users.
The official settlement website for the lawsuit is now live. The final approval hearing won’t occur until June 23, when the court will consider whether Google’s settlement is fair and listen to objections. After that, the court will decide whether to approve the $135 million settlement.
In the meantime, if you qualify and want to be paid as part of the settlement, you can select your preferred payment method on the official website. There, you can find information on speaking at the June 23 court hearing and on how to exclude yourself or write to the court to object by May 29.
As part of the settlement, Google will update its Google Play terms of service to clarify that certain data transfers do occur passively even when you’re not using your Android device, and that cellular data may be relied upon when not connected to Wi-Fi. This can’t always be disabled, but users will be asked to consent to it when setting up their device.
Google will also fully stop collecting data when its "allow background data usage" option is toggled off.
Who can be part of the settlement?
In order to join the Taylor v. Google LLC settlement, you must meet four qualifications:
- Be a living, individual human being in the US.
- Have used an Android mobile device with a cellular data plan.
- Have used the aforementioned device at any time from Nov. 12, 2017, to the date when the settlement receives final approval.
- Not be a class member in the Csupo v. Google LLC lawsuit, which is similar but covers only California residents.
The final approval hearing is on June 23, so you can add your payment method until then. The hearing’s date and time may change, and any updates will be posted on the settlement website.
If you choose to do nothing, you will still be included in the settlement, but you may not receive your payment if you don't select a payment method.
How much will I get paid?
It’s not currently known exactly how much each settlement class member will receive, but the cap is $100. Payments will be distributed after final court approval and after any appeals are resolved.
After all administrative, tax and attorney costs are paid, the settlement administrator will attempt to pay each member an equal amount. If any funds remain after payments are sent, and it’s economically feasible, they will be redistributed to members who were previously and successfully paid. If it’s not economically feasible, the funds will go to an organization approved by the court.