{"id":677,"date":"2025-01-20T02:55:29","date_gmt":"2025-01-20T02:55:29","guid":{"rendered":"https:\/\/www.batteryone.co\/blog\/?p=677"},"modified":"2025-01-20T02:55:29","modified_gmt":"2025-01-20T02:55:29","slug":"apples-ai-stumbles-news-summaries-halted-amid-accuracy-concerns","status":"publish","type":"post","link":"https:\/\/www.batteryone.co\/blog\/archives\/677","title":{"rendered":"Apple\u2019s AI Stumbles: News Summaries Halted Amid Accuracy Concerns"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.batteryone.co\/search?keyword=Apple&amp;post_type=product\">Apple<\/a>\u2019s foray into artificial intelligence, <strong>Apple Intelligence<\/strong>, has been underwhelming, to say the least. The most glaring failure? Its <strong>news summaries<\/strong>, which faced widespread backlash for misreporting headlines and generating <strong>false information<\/strong>. The issue became so severe that Apple <strong>paused the entire feature this week<\/strong> until it can be fixed.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/san.com\/wp-content\/uploads\/2025\/01\/CLEAN-APPLE-AI-NEWS_Getty-Images_featuredImage_Fri-Jan-17-2025.jpg?w=1920\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">None of this should come as a surprise. <strong>AI \u201challucinations\u201d<\/strong>\u2014instances where AI models generate incorrect or misleading information\u2014are a well-documented issue with large language models (LLMs). To date, no one has found a true solution, and it&#8217;s unclear whether one even exists. 
But what makes Apple\u2019s situation particularly reckless is that <strong>its own engineers warned of these deficiencies<\/strong> well before the company launched its AI system.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">&gt;&gt;&gt; <a href=\"https:\/\/www.batteryone.co\/detail\/1747090\/S360X3B\">S360X3B<\/a> Replacement Battery for <a href=\"https:\/\/www.batteryone.co\/brand\/12\/Insta360\">Insta360<\/a> X3<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Apple Knew AI Models Weren\u2019t Ready<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Last October, a group of Apple researchers published a study evaluating the <strong>mathematical reasoning<\/strong> capabilities of leading LLMs. The yet-to-be-peer-reviewed research <strong>added to the growing consensus<\/strong> that AI models don\u2019t actually \u201creason\u201d in the human sense.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>&#8220;Instead,&#8221;<\/em> the researchers concluded, <em>&#8220;they attempt to replicate the reasoning steps observed in their training data.&#8221;<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In other words, these AI models aren\u2019t truly thinking\u2014<strong>they\u2019re just mimicking patterns they\u2019ve seen before<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Math Is Hard: How AI Fails at Simple Problems<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To test AI reasoning, Apple\u2019s researchers subjected 20 different models to thousands of math problems from the widely used <strong>GSM8K dataset<\/strong>. These problems weren\u2019t particularly difficult\u2014most could be solved by a well-educated middle schooler. A typical question might read:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>&#8220;James buys 5 packs of beef that are 4 pounds each. The price of beef is $5.50 per pound. 
How much did he pay?&#8221;<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The key test came when researchers <strong>changed the numbers<\/strong> in the problems to ensure the AI models weren\u2019t just memorizing answers. Even this minor tweak caused a small but <strong>consistent drop in accuracy<\/strong> across all models.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But when the researchers went further\u2014<strong>changing names and adding irrelevant details<\/strong> (such as mentioning that some fruits in a counting problem were &#8220;smaller than usual&#8221;)\u2014the results were disastrous. Some models saw accuracy drop by <strong>as much as 65%<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Even the <strong>best-performing model, OpenAI\u2019s o1-preview<\/strong>, saw a <strong>17.5% decline<\/strong>, while its predecessor, <strong>GPT-4o, dropped by 32%<\/strong>. These results exposed a critical weakness: AI struggles <strong>not just with reasoning, but with identifying relevant information<\/strong> for problem-solving.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">&gt;&gt;&gt; <a href=\"https:\/\/www.batteryone.co\/detail\/1747089\/U914479PHV\">U914479PHV<\/a> Replacement Battery for <a href=\"https:\/\/www.batteryone.co\/brand\/11\/iData\">iData<\/a> K3S<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI: More Copycat Than Thinker<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The study\u2019s conclusion was damning. &#8220;This reveals a critical flaw in the models&#8217; ability to discern relevant information for problem-solving,&#8221; the researchers wrote. &#8220;Their reasoning is not formal in the common sense term and is mostly based on pattern matching.&#8221;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Put simply, <strong>AI models are great at appearing intelligent<\/strong>, and they often deliver the right answers\u2014but only as long as they can <strong>copy and repackage<\/strong> solutions they&#8217;ve seen before. 
Once they <strong>can\u2019t rely on direct memorization<\/strong>, their performance crumbles.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This should have raised serious concerns about trusting an AI model to <strong>summarize news<\/strong>\u2014a process that involves <strong>rearranging words while preserving meaning<\/strong>. Yet, Apple <strong>ignored its own research<\/strong> and pushed forward with Apple Intelligence anyway.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then again, this <strong>trial-and-error approach<\/strong> has become standard practice across the AI industry. Apple\u2019s misstep may be frustrating, but it\u2019s hardly surprising.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apple\u2019s foray into artificial intelligence, Apple Intelligence, has been underwhelming, to say the least. The most glaring failure? Its news summaries, which faced widespread backlash for misreporting headlines and generating false information. The issue became so severe that Apple paused the entire feature this week until it can be fixed. 
None of this should come as a surprise. AI \u201challucinations\u201d\u2014instances where [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[7],"class_list":["post-677","post","type-post","status-publish","format-standard","hentry","category-news","tag-apple"],"_links":{"self":[{"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/posts\/677","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/comments?post=677"}],"version-history":[{"count":1,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/posts\/677\/revisions"}],"predecessor-version":[{"id":678,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/posts\/677\/revisions\/678"}],"wp:attachment":[{"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/media?parent=677"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/categories?post=677"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.batteryone.co\/blog\/wp-json\/wp\/v2\/tags?post=677"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}