Bonum Certa Men Certa

The Great LLM Delusion - Part IV: Academic Papers as Microsoft Marketing for LLMs

posted by Roy Schestowitz on Jan 31, 2024

Photo capturing two very different architectural styles for student buildings at the UCLA Campus, California

"Strange and misleading article about LLMs," as explained by an anonymous contributor

THIS series coincides with Microsoft hype and vapourware (much-needed distractions). This part is a guest post of sorts, an unedited version of something a reader sent us. The supporting material is in there.

To be clear, you can talk to a chatbot. The chatbot won't talk back to you. It'll just spew out words, some of them merely plagiarised based on words similar to what you said (which the chatbot does not grasp, it's more like a Web search, except there's no attribution/link to source). Chatbots might lower the bar for journalism - to the point where the Web as a whole will lose legitimacy, trust, value, etc. Then what? Going back to physical libraries? Saying you can compose a physical book using a chatbot is like saying you can make a very large meal by assembling trash and cooking parts of it (LLMs are "digital pagpag"). Given reports like "Scammy AI-Generated Book Rewrites Are Flooding Amazon", this is already a real problem. "These 'AI' stock increases based on fake increases in revenue," an associate has remarked, "appear funded by mass firings to appease the LARPers in the financial community. That can only go on so long before they run out of people to take care of the core income-generating activities, a line which I suspect they have already crossed."

Will Microsoft also start spewing out "papers" or "publications" made by its chatbots, in order to generate hype about chatbots? That probably would not work, as the quality would not meet basic criteria.

Without further ado, here is the contributor's message:


I stumbled upon a recent article you may find curious.

While reading comments on a post at Bruce Schneier's blog, I saw a user who posted the following link as a kind of "proof" that conversations with LLM-based chatbots can be "useful" and "interesting."

Of course, it sparked my interest at first, but as I started reading it, red flags started to pop up here and there.

I do not know much about Quanta Magazine's credibility. At first, I thought that it was some semi-crackpot pop science news site, but after a shallow search, I saw a good rank from a fact-checking site.

The article was published on January 22, 2024, and the research it discussed was released on October 26, 2023. Maybe it does not mean much (the gap is just a few months), but it is a bit suspicious that the research paper is apparently not peer reviewed (just published on arXiv and cited in ~2–3 sources), and that the article about it came out just as the "AI" swindle was unraveling.

It seems like the article is desperately trying to spark new interest in readers regarding LLMs and chatbots, saying that there is some evidence that there is "much more than just autocomplete."

Following are some dubious parts.

1. The article talks about the "understanding" of something by LLMs but presents no clear definition of it.

The thing that can pass as a semi-definition (from the research paper)—"combinations that were unlikely to exist in the training data"—is, in my opinion, misleading for ordinary people. Much like other misnomers in the field (e.g., "hallucinations").

I guess it may be more suitable to talk about "competence" instead, as in the "competence without comprehension" phrase from Dennett's writings.

2. The paper described in the article seems to support (or go in the direction of) the vague idea that if you shovel a lot of data and complexity into "AI" (LLM in this case), then "something" will emerge ("skills" and "ability to generalize" in this case, as stated in the paper and researcher's comments in the article). I find it concerning.

3. A "research scientist at Google DeepMind" is among the authors of the paper, so the research is probably not clearly independent of corporate influence.

4. “[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”

Wait, what? Why is this part inserted in the article at all? Some guy from Microsoft is eager to tell us that LLMs are "something more." No bullshit. What a surprise!

5. The paper starts with this passage: "With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents..."

I mean, what the fuck?! LLMs are not "shifting" anywhere; they are poorly shoehorned into use cases where a "general-purpose AI agent" is required (whatever it is, it does not exist in our reality anyway) by people who want to reap profits from selling half-assed "products" based almost entirely on lies! LLMs are definitely not suitable for general-purpose tasks other than text manipulation or some kinds of entertainment where facts, preciseness, and responsibilities do not matter at all.

One of the researchers acknowledges that it is not about accuracy.

"Arora adds that the work doesn’t say anything about the accuracy of what LLMs write. “In fact, it’s arguing for originality,” he said. “These things have never existed in the world’s training corpus. Nobody has ever written this. It has to hallucinate.”"

I need to make it clear: I have no competence to review the actual paper; this task requires actual experts in the field.

As far as I understand the paper, the researchers devised some abstractions to describe observations they had already made, and they try to construct a method for working with their own definitions and hypotheses. Those definitions have little in common with laymen's definitions (e.g., of terms like "understanding" and "creativity") and perceptions of the matter.

I tried to read the paper with an open mind to avoid at least some obvious biases. I have no problem with the paper itself; maybe it is actually useful research that will serve to advance the field (and not the companies of con artists). I cannot say for sure.

What bothers me are the misnomers and the misleading, vague terms and descriptions in the paper (to a lesser degree) and in the article based on it (a great deal). In my opinion, the article commits the crime of severely misinforming the reader.
