How to Use ChatGPT Without Brain-Rot
An MIT study on AI use provides a caution, but also a tip
Is AI rotting your brain? Maybe. But it might depend on how you’re using it.
A few months ago, MIT released a study that produced some pretty alarming headlines—“AI’s great brain-rot experiment,” for example. This week, I finally got around to actually reading this scientific paper, and while it is indeed concerning, there is a very important nuance that escaped the headlines.
Let me start by describing what the researchers had people do in the first part of the study, and what they found:
Students wrote timed essays based on different SAT-style prompts, like “Does true loyalty require unconditional support?” or “Must our achievements benefit others in order to make us truly happy?” But the students were split up into three different essay-writing groups, each with different conditions.
Some of the students were afforded only their brains, while others were allowed to search Google (with the AI disabled), and still others could use ChatGPT (but not Google). Every student wrote three essays—across three separate sessions—with a different essay prompt each time. While they were working, researchers recorded the electrical activity of their brains, basically to see how strongly different brain areas were talking to one another. The researchers also interviewed the students after each essay, asking them to recount information from what they wrote. Finally, both an AI judge and real-life teachers graded the essays.
There was bad news for the ChatGPT group, and also worse news for the ChatGPT group. The bad news was that their essays were more similar to one another—even using common phrasing and examples—and they were worse than those of the brain-only group, according to judges (who were unaware which group produced a given essay). But these are students, so who cares how good their essays are if they’re learning, right?
The worse news for team ChatGPT was that the students in that group—unlike those in the other two groups—couldn’t produce a single accurate quote from their own writing when interviewed by the researchers. And as for the measurements of brain activity, the “brain-only” group showed the strongest connectivity between brain regions, followed by the Google-search group, with the ChatGPT group bringing up the rear—precisely what you’d expect if the heavy cognitive lifting was outsourced to the tool.
So that’s not good. Fancy neurological measurements aside, the fact that the essay writers using ChatGPT could not remember the work they had just completed seems like a pretty good indication that they were not learning.
For students, in most cases the point of essay writing is not to write the most polished essay in the history of humanity, but to learn how to organize their thinking, synthesize ideas, and communicate effectively. The point is the act of writing, not the final product.1
In cognitive psychology, there is something known as “desirable difficulties.” These are obstacles that make learning more challenging, slower, and sometimes more frustrating in the short term, but deeper in the long term. One example of a desirable difficulty involves what is known as the “generation effect.” Simply: forcing yourself to come up with an answer to something before having one given to you primes your brain for subsequent learning. (Chapter four of Range—”Learning, Fast and Slow”—is all about desirable difficulties.) It would seem that, for these essay writers, ChatGPT was the precise opposite of a desirable difficulty. It allowed them to give answers before doing their own thinking. Let’s call it an “undesirable ease.” (Please submit better coinage in the comments below!)
All that said, there is hope embedded lower down in this paper.
After the three initial essay sessions, a subset of the students returned for a fourth round. This time, the students who had been using ChatGPT were told they now had to write without any tools at all. And the students who had previously been writing with just their own brains were allowed to use ChatGPT.
The returning students were asked to pick from essay prompts they’d seen before and to write another timed essay. To reiterate: this wasn’t a brand-new topic; it was one they had written about previously. The question was: What happens if you add AI after people have already done some unaided thinking, versus taking it away from people who’ve been leaning on it from the start?
The results split in a really revealing way. The former “brain-only” students actually did quite well with ChatGPT. Their new essays, on topics they’d already wrestled with on their own, were judged mostly above average across all the groups. Their content was better structured than in their earlier, brain-only essays, and they used ChatGPT more for information seeking than for spitting out answers to copy.
The measures of brain activity also showed more connectivity between brain regions among the students who went brain→ChatGPT than among those who went ChatGPT→brain. The brain-first group also remembered what they had written. The ChatGPT-first group, by contrast, struggled to recount what they’d written previously, and wrote about a less diverse set of ideas with less unique phrasing even after the AI was taken away. It was as if they were stuck in the furrow AI had plowed for them, even once it was gone.
The authors of the study argue that early, heavy reliance on the AI seemed to have encouraged “shallow encoding”: The work got done, but it didn’t get deeply integrated into memory. They also use the term “cognitive debt”: If you repeatedly lean on ChatGPT upfront, you defer effort now but pay later, with weaker critical thinking, poorer recall, and more superficial engagement with ideas.
It’s worth noting that this was a single study—and a “preprint” at that, meaning it hasn’t yet gone through peer review. But it generated a boatload of news articles, so I think it’s worth commenting on. It also fits conceptually with other work on undesirable ease in learning and how that impacts the brain. More on that in another post soon. For now, my takeaway is clear: brain first, ChatGPT only after.
Thank you for the time you took to read this post. If you’re a subscriber, you may have missed my last post (in which I was interviewed by Author Insider From The Next Big Idea Club). It was a cross-post, so it did not go to all subscribers.
And if you’re not subscribed, please consider doing that here:
If you appreciated this post, I’d appreciate it if you’d share it.
I’ll be doing another post soon building on this one.
Until then…
David
1. Justin Cerenzia made a thoughtful comment on this paragraph, so I want to share it here: “We’re seeing similar things as we pilot an LLM-assisted (question generation based on student writing) oral assessment platform (with a video component). But I might quibble with this part [that for students the point is the act of writing, not making the best finished product]. Especially in secondary schools, the point is the product. So they become competent at producing a product, but not, necessarily, at learning.”

Interesting. My son (7th grade) uses ChatGPT to figure out coding for physical computing... and to find answers about modifying a dirt bike. My husband uses ChatGPT to write his emails so they sound very polished. He's an eloquent talker, but not a writer. I'm the opposite (better writer than speaker) and hardly ever use ChatGPT - and wonder if I should use it after a first essay draft? I worry that it will change my "voice" in writing.
Will Storr posted something about Substack essays and ChatGPT and how people use it to post their essays - people generally like the ChatGPT essays more, but the essays all start sounding the same.
I actually find this paper more informative - clear and rigorous, and easier to interpret (without all the brain imaging stuff):
https://www.sciencedirect.com/science/article/pii/S0747563224002541
TL;DR: LLM use reduces friction, leading to more superficial engagement with the material, which in turn leads to shallower argumentation.
This study explores the cognitive load and learning outcomes associated with using large language models (LLMs) versus traditional search engines for information gathering during learning. A total of 91 university students were randomly assigned to either use ChatGPT3.5 or Google to research the socio-scientific issue of nanoparticles in sunscreen to derive valid recommendations and justifications. The study aimed to investigate potential differences in cognitive load, as well as the quality and homogeneity of the students' recommendations and justifications. Results indicated that students using LLMs experienced significantly lower cognitive load. [following is the key point] However, despite this reduction, these students [that used LLMs] demonstrated lower-quality reasoning and argumentation in their final recommendations compared to those who used traditional search engines.