Technologies

AI Is Bad at Sudoku. It’s Even Worse at Showing Its Work

Researchers did more than ask chatbots to play games. They tested whether AI models could describe their thinking. The results were troubling.

Chatbots are genuinely impressive when you watch them do things they’re good at, like writing a basic email or creating weird, futuristic-looking images. But ask generative AI to solve one of those puzzles in the back of a newspaper, and things can quickly go off the rails.

That’s what researchers at the University of Colorado at Boulder found when they challenged large language models to solve sudoku. And not even the standard 9×9 puzzles. An easier 6×6 puzzle was often beyond the capabilities of an LLM without outside help (in this case, specific puzzle-solving tools).

A more important finding came when the models were asked to show their work. For the most part, they couldn’t. Sometimes they lied. Sometimes they explained things in ways that made no sense. Sometimes they hallucinated and started talking about the weather.

If gen AI tools can’t explain their decisions accurately or transparently, that should cause us to be cautious as we give these things more control over our lives and decisions, said Ashutosh Trivedi, a computer science professor at the University of Colorado at Boulder and one of the authors of the paper published in July in the Findings of the Association for Computational Linguistics.

“We would really like those explanations to be transparent and be reflective of why AI made that decision, and not AI trying to manipulate the human by providing an explanation that a human might like,” Trivedi said.



The paper is part of a growing body of research into the behavior of large language models. Other recent studies have found, for example, that models hallucinate in part because their training procedures incentivize them to produce results a user will like, rather than what is accurate, or that people who use LLMs to help them write essays are less likely to remember what they wrote. As gen AI becomes more and more a part of our daily lives, the implications of how this technology works and how we behave when using it become hugely important.

When you make a decision, you can try to justify it, or at least explain how you arrived at it. An AI model may not be able to accurately or transparently do the same. Would you trust it?

Why LLMs struggle with sudoku

We’ve seen AI models fail at basic games and puzzles before. OpenAI’s ChatGPT (among others) has been totally crushed at chess by the computer opponent in a 1979 Atari game. A recent research paper from Apple found that models can struggle with other puzzles, like the Tower of Hanoi.

It has to do with the way LLMs work and fill in gaps in information. These models try to complete those gaps based on what happens in similar cases in their training data or other things they’ve seen in the past. With a sudoku, the question is one of logic. The AI might try to fill each gap in order, based on what seems like a reasonable answer, but to solve it properly, it instead has to look at the entire picture and find a logical order that changes from puzzle to puzzle. 
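
To make the contrast concrete, here is a minimal sketch of what "looking at the entire picture" can mean for a conventional solver. It is not the researchers' tooling; the 2-by-3 box shape, the `PUZZLE` grid and the function names are assumptions made for illustration. At every step it scans the whole board, picks the empty cell with the fewest legal digits (so the fill order changes from puzzle to puzzle), and backs up when a guess leads to a dead end.

```python
from typing import List, Set

Grid = List[List[int]]  # 0 marks an empty cell

SIZE = 6                    # 6x6 grid, the size mentioned in the article
BOX_ROWS, BOX_COLS = 2, 3   # assumed box shape: 2 rows by 3 columns

# Hypothetical example puzzle, made up for this sketch (not from the paper).
PUZZLE: Grid = [
    [1, 0, 0, 4, 0, 6],
    [0, 5, 0, 0, 2, 0],
    [2, 0, 4, 0, 0, 1],
    [0, 6, 0, 2, 0, 0],
    [3, 0, 0, 0, 1, 0],
    [0, 1, 0, 3, 0, 5],
]


def candidates(grid: Grid, r: int, c: int) -> Set[int]:
    """Digits that do not clash with the row, column or box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(SIZE)}
    br, bc = r - r % BOX_ROWS, c - c % BOX_COLS
    for i in range(br, br + BOX_ROWS):
        for j in range(bc, bc + BOX_COLS):
            used.add(grid[i][j])
    return set(range(1, SIZE + 1)) - used


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return True if a solution was found.

    Assumes the given digits do not already contradict each other.
    """
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE) if grid[r][c] == 0]
    if not empties:
        return True
    # Survey the whole board and work on the most constrained cell first.
    r, c = min(empties, key=lambda rc: len(candidates(grid, rc[0], rc[1])))
    for value in sorted(candidates(grid, r, c)):
        grid[r][c] = value
        if solve(grid):
            return True
        grid[r][c] = 0  # undo the guess and try the next digit
    return False


if __name__ == "__main__":
    board = [row[:] for row in PUZZLE]  # copy so PUZZLE stays untouched
    if solve(board):
        for row in board:
            print(" ".join(str(v) for v in row))
```

The point of the sketch is the selection step: nothing is filled left to right just because it "seems reasonable." A cell is chosen only after every row, column and box has been taken into account.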

Chatbots are bad at chess for a similar reason. They find logical next moves but don’t necessarily think three, four or five moves ahead — the fundamental skill needed to play chess well. Chatbots also sometimes tend to move chess pieces in ways that don’t really follow the rules or put pieces in meaningless jeopardy. 

You might expect LLMs to be able to solve sudoku because they’re computers and the puzzle consists of numbers, but the puzzles themselves are not really mathematical; they’re symbolic. “Sudoku is famous for being a puzzle with numbers that could be done with anything that is not numbers,” said Fabio Somenzi, a professor at CU and one of the research paper’s authors.

I used a sample prompt from the researchers’ paper and gave it to ChatGPT. The tool showed its work, and repeatedly told me it had the answer before showing a puzzle that didn’t work, then going back and correcting it. It was like the bot was turning in a presentation that kept getting last-second edits: This is the final answer. No, actually, never mind, this is the final answer. It got the answer eventually, through trial and error. But trial and error isn’t a practical way for a person to solve a sudoku in the newspaper. That’s way too much erasing and ruins the fun.

AI struggles to show its work

The Colorado researchers didn’t just want to see if the bots could solve puzzles. They asked for explanations of how the bots worked through them. Things did not go well.

Testing OpenAI’s o1-preview reasoning model, the researchers saw that the explanations — even for correctly solved puzzles — didn’t accurately explain or justify their moves and got basic terms wrong. 
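
One way to picture what an accurate justification would look like is to check a claimed move against the board itself. The sketch below is only an illustration, not the evaluation used in the paper; the function names, the 2-by-3 box shape and the labels it returns are assumptions. It tests the simplest possible claim, "this cell must be this digit," against a single rule: the digit is justified only if it is the sole legal candidate for that cell.

```python
from typing import List, Set

Grid = List[List[int]]  # 0 marks an empty cell

SIZE = 6                    # 6x6 grid, the size mentioned in the article
BOX_ROWS, BOX_COLS = 2, 3   # assumed box shape: 2 rows by 3 columns


def candidates(grid: Grid, r: int, c: int) -> Set[int]:
    """Digits that do not clash with the row, column or box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(SIZE)}
    br, bc = r - r % BOX_ROWS, c - c % BOX_COLS
    for i in range(br, br + BOX_ROWS):
        for j in range(bc, bc + BOX_COLS):
            used.add(grid[i][j])
    return set(range(1, SIZE + 1)) - used


def audit_step(grid: Grid, r: int, c: int, value: int) -> str:
    """Classify the claim 'cell (r, c) must be value' against the board.

    Only the 'sole candidate' rule is checked; a move could still be
    forced by a more advanced technique that this sketch ignores.
    """
    if grid[r][c] != 0:
        return "invalid: cell is already filled"
    options = candidates(grid, r, c)
    if value not in options:
        return "invalid: breaks a row, column or box rule"
    if options == {value}:
        return "justified: only candidate for the cell"
    return "unjustified: legal, but other digits also fit"


if __name__ == "__main__":
    # Hypothetical board: the first row is missing only its last digit.
    board = [[1, 2, 3, 4, 5, 0]] + [[0] * SIZE for _ in range(SIZE - 1)]
    print(audit_step(board, 0, 5, 6))
    print(audit_step(board, 1, 0, 4))
```

An explanation can sound reasonable and still land in the "unjustified" or "invalid" bucket, which is the gap between plausible and faithful that the researchers describe.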

“One thing they’re good at is providing explanations that seem reasonable,” said Maria Pacheco, an assistant professor of computer science at CU. “They align to humans, so they learn to speak like we like it, but whether they’re faithful to what the actual steps need to be to solve the thing is where we’re struggling a little bit.”

Sometimes, the explanations were completely irrelevant. Since the paper’s work was finished, the researchers have continued to test newly released models. Somenzi said that when he and Trivedi were running OpenAI’s o4 reasoning model through the same tests, at one point, it seemed to give up entirely.

“The next question that we asked, the answer was the weather forecast for Denver,” he said.

(Disclosure: Ziff Davis, CNET’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Explaining yourself is an important skill

When you solve a puzzle, you’re almost certainly able to walk someone else through your thinking. The fact that these LLMs failed so spectacularly at that basic job isn’t a trivial problem. With AI companies constantly talking about “AI agents” that can take actions on your behalf, being able to explain yourself is essential.

Consider the types of jobs being given to AI now, or planned for in the near future: driving, doing taxes, deciding business strategies and translating important documents. Imagine what would happen if you, a person, did one of those things and something went wrong.

“When humans have to put their face in front of their decisions, they better be able to explain what led to that decision,” Somenzi said.

It isn’t just a matter of getting a reasonable-sounding answer. It needs to be accurate. One day, an AI’s explanation of itself might have to hold up in court, but how can its testimony be taken seriously if it’s known to lie? You wouldn’t trust a person who failed to explain themselves, and you also wouldn’t trust someone you found was saying what you wanted to hear instead of the truth. 

“Having an explanation is very close to manipulation if it is done for the wrong reason,” Trivedi said. “We have to be very careful with respect to the transparency of these explanations.”

Technologies

Can Chemicals Turn My Orange iPhone 17 Pink? Here’s What I Found Out

There are reports that some cosmic orange iPhone 17 Pro handsets are turning pink. I threw chemicals at my iPhone to see what would happen.

A recent Reddit thread suggests that it’s possible for a cosmic orange iPhone 17 Pro to turn vibrant pink. As PCMag’s Eric Zeman noted, it’s likely that the phone has been discolored by cleaning substances that affected the finish, turning it from vibrant orange to a wild hot pink. Sure, this might technically be a fault, but in all honesty I love pink phones and the idea of a hot pink iPhone 17 Pro filled me with joy. So I wanted to see if I could test the theory and see just what color-changing effects various household cleaners might have on my phone.

It’s important to note here that the iPhone 17 Pro I used was bought by CNET for the purposes of testing. Had I paid over $1,000 of my own money I wouldn’t be so reckless in smearing it with chemicals that could potentially irreparably harm it. And you shouldn’t either. If you need to clean your phone, do it safely. Disclaimer aside, let’s dive in.

The chemicals

I bought two chemicals to test this out. Zeman explains that it may be oxidation that caused the color to change and that hydrogen peroxide could do this. I couldn’t find this over the counter in the UK, so I instead bought an “oxy-active” stain remover spray that, among other things, contains “oxygen-based bleaching agents,” which sounded ideal. Apple also clearly states “don’t use products containing bleach or hydrogen peroxide” on its support page so, naturally, I bought some thick bleach too.

Oxy application

I started by spraying the oxy cleaner on a microfiber cloth until it was noticeably wet from the liquid and then liberally applied this all over the rear of the iPhone. The Reddit user with the affected phone showed that it only affected the metal parts, not the glass back panel, so I made sure to focus my attention on the sides and camera bar. 

With the phone well and truly doused in chemicals that have no business being anywhere near a phone, I left it to sit and think about what it had done for 30 minutes, after which I wiped it dry and inspected it closely. Disappointingly, my phone was still factory orange, rather than “what the hell have you done to your phone” pink. Time to move on.

Bleach blast

I opened the bleach and, trying hard not to think about my days as a middle school cleaner, applied a liberal blob of the stuff to a cloth and smeared it over the defenceless phone, concentrating again on the metal areas. I definitely should have worn protective gloves for all of this, so please make sure you take better care of yourself than I do if you do anything with bleach.

Again, I gave it a 30-minute settling in period before cleaning it off and inspecting the results. 

The phone remained as orange as ever, looking as box fresh as it was the day before when it was, indeed, box fresh. The orange color hadn’t changed and now almost 24 hours later there’s still no sign of discoloration of any kind. 

Is the pink iPhone 17 real?

I can’t say with any certainty whether the Reddit user’s images of a pink iPhone 17 Pro are real or not. The cuddly human side of me wants to take them at their word, while the journalist in me is sceptical. What I can say with certainty is that putting your orange iPhone into close contact with household cleaning products isn’t going to win you a funky, ultra-rare pink hue that you could sell on eBay for a small fortune. 

It’s possible that using pure peroxide could be the thing that does it, but to be honest, if you’re going out of your way to throw industrial-grade chemicals at your phone then you may as well just directly try and dye it. My goal here was to see how susceptible the orange model is to everyday household cleaners such as kitchen cleaner or bathroom bleach — the sort of things it might naturally come into contact with in routine use. And what I’ve found is that, no, it won’t ruin the nice orange color. But it’s probably still not good for your phone. 

Technologies

My Teen Loves Her Apple AirPods Pro 2 and You Will Too With This $100 Off Deal for Black Friday

Apple’s AirPods Pro 2 have everything you could want from a pair of wireless earbuds, plus a steep discount.

Black Friday deals: The Apple AirPods Pro 2 are some of the best personal audio gear on the market, even if they aren’t the latest model anymore. Sure, Apple’s AirPods Pro 3 are the newest earbuds in the lineup but the AirPods Pro 2 are still an excellent pick for most people.

They’re an even better buy this week during early Black Friday sales when you can get your hands on a pair of Apple AirPods Pro 2 at a discount. Right now, Walmart is shaving a massive $100 off the AirPods Pro 2, dropping the cost to $139. That’s one of the lowest prices we’ve seen — but we doubt this deal will stick around for long.



CNET’s key takeaways

My 13-year-old daughter loves her music and her privacy, and for years she has wanted a pair of AirPods. They’re not cheap, so I’d only been buying her budget options like the Amazon Echo Buds. Those kept seemingly disappearing, though, so I finally ponied up for the AirPods Pro 2.

I picked them up during last year’s sales, and they were definitely well-received. She’s happy, she uses them every day, and she hasn’t lost them yet. The AirPods Pro 2 are currently on sale at Walmart for $139, a nice price for a high-quality pair like these, and one of the lowest we’ve seen.

What about the AirPods Pro 3?

The AirPods Pro 3 weren’t available at the time I bought the AirPods Pro 2, but they were rumored, and I didn’t wait to see what they offered. As CNET’s resident headphone expert David Carnoy summarized in his AirPods Pro 3 and Pro 2 comparison, the newer model is “significantly improved in the four most important areas: fit, sound quality, noise cancellation and battery life.” They also have heart-rate monitoring, like the Beats Powerbeats Pro 2.

While these are undoubtedly all important things, a lot of people aren’t going to notice the differences or make the most of the new features. With the AirPods Pro 3 being newer, they’re on a smaller sale and are currently available at Amazon for $220, which is $30 off the list price.

Why I didn’t get the AirPods 4 instead

Why did I choose AirPods Pro 2 instead of the AirPods 4 with ANC? First, as I mentioned in another article about a different pair of earbuds I bought, I think sealed, in-ear buds are better than open-design models like the AirPods 4. The seal creates another layer of noise isolation and contributes to superior sound quality, and if you want to pay attention to the world you can always engage ambient sound mode, which Apple calls transparency mode.

Also a factor was that, at the time, Carnoy considered the Pro 2 the best Apple noise-canceling wireless earbuds: “While we’re quite impressed with those new models — and with the AirPods 4 ANC in particular — the AirPods Pro 2 remain arguably the best Apple AirPods you can buy if you don’t mind having silicone ear tips jammed in your ears,” he said.

My daughter uses earplugs all the time to help her sleep, so she definitely qualifies as somebody who’s comfortable stuffing things in her ears. Like her fingers, when I start using words like “sigma,” “skibidi” and “relatable” to try to relate to her.

I asked Carnoy about the Pro 2s potentially not fitting in her kid-size ears and he reassured me that the range of eartips that come with the Pro 2s “now include XS, so they should fit.”

Do AirPods make a great gift?

It took me years to finally understand, but yes, for someone looking for wireless earbuds, AirPods — especially the Apple AirPods Pro 2 — make the perfect gift, regardless of whether you’re a teenage girl. 

Technologies

If You’re Flying for the Holidays, This Bluetooth Dongle Transforms In-Flight Movies, and It’s 35% Off for Black Friday

Watch airplane movies just like you would at home with this game-changing device.

Air travel for the holidays can be stressful, especially when winter weather or flight delays force a change of plans, but one perk of flying still remains — watching new-release movies. However, in-flight entertainment on most airlines usually requires a wired set of earbuds. (And the ones the airline hands out are so bad they may as well not even be connected.) 

I’d far prefer to use my wireless, noise-canceling AirPods Pro, but they connect only via Bluetooth. There’s a simple tech solution that makes viewing movies on the plane feel more like watching them on your couch.

The AirFly is a simple Bluetooth dongle that allows me to connect my wireless earbuds directly to the airplane’s entertainment system, eliminating the need for adapters or wired workarounds. 

It’s become a must-pack item in my travel bag. Since I started using it, I’ve stopped dreading in-flight audio and finally get to enjoy movies on the plane. If you fly often, this little gadget could completely change how you travel. And the base-level AirFly SE is 35% off for Black Friday at Amazon.

The AirFly Pro lets me enjoy in-flight entertainment

The AirFly Pro from Twelve South is a minimally designed dongle that allows me to connect to the 3.5mm headphone jack in my airplane seat, enabling me to listen to in-flight entertainment on my noise-canceling earbuds.

All I have to do is pair the AirFly with the Bluetooth headphones I’m using, such as my AirPods Pro, plug the AirFly into the display in front of me, and I’m all set. I don’t even need to use my phone to connect the two devices.

There are several versions of the AirFly: the AirFly SE, which is currently on sale for $26 on Amazon and connects to just one set of headphones; the AirFly Pro at $55; the Pro V2 at $60; and the Pro 2 Deluxe at $70, which comes with an international headphone adapter and a suede travel case.

I use the AirFly Pro, which has been a game-changer for me on flights. I’ve never had to worry about battery life since the AirFly Pro lasts for over 25 hours and can be fully charged in just three hours. I can also pair two separate pairs of headphones to a single AirFly Pro, in case I’m with someone else on a flight and want to watch the same movie or show. 

And if that’s not enough, the AirFly Pro also doubles as an audio transmitter, allowing me to turn any speaker with a headphone jack, such as my old car stereo, into a Bluetooth speaker.

The AirFly Pro makes a great gift for any traveler

The AirFly Pro is the perfect present to give to someone who’s planning to travel this year. Besides my Anker MagSafe battery pack, the AirFly Pro has become my most treasured travel accessory when I fly, which is why I consider it one of those can’t-go-wrong gifts. 

For more travel gear, here are our favorite tech essentials to travel with and our favorite travel pillows.

Copyright © Verum World Media