Brainwaves: Is AI actually thinking?

(Photo Illustration by Avishek Das/SOPA Images/LightRocket via Getty Images)

This is the third episode in 'Brainwaves: Mysteries of the human brain.'

Artificial intelligence's ability to replicate human behavior has increased rapidly in recent years. Does that mean it's thinking like us? In the third episode of "Brainwaves," what artificial intelligence teaches us about our own capacity for thought.

Guests

Melanie Mitchell, professor at the Santa Fe Institute. She's an artificial intelligence and cognitive science expert.

Kyle Mahowald, assistant professor in linguistics at the University of Texas at Austin. Co-author of the paper "Dissociating language and thought in large language models," published in the scientific journal Trends in Cognitive Sciences in June 2024.

Also Featured

Anna Ivanova, assistant professor of psychology at Georgia Institute of Technology.


The version of our broadcast available at the top of this page and via podcast apps is a condensed version of the full show. You can listen to the full, unedited broadcast here:


Transcript

Part I 

MEGHNA CHAKRABARTI: We are at episode three of Brainwaves, our special week taking a look at the human brain. So today we're going to talk about what is human thinking. When your brain is thinking, what is it actually doing? And then does AI actually think? How does AI help us learn how humans think?

Now, this is different from yesterday, when we talked about consciousness. Today we're talking about thinking, and we'll discuss the difference between the two a little bit later. But first, we asked you, our On Point listeners: What would AI have to do to convince you that it's thinking like humans think?

Got a lot of responses. But this one from Bill Lester of Piqua, Ohio really caught our attention, because he said his metric for AI achieving actual thought might be determined by the 1966 science fiction novel The Moon Is a Harsh Mistress. And Bill, thank you so much for not mentioning HAL and his revolt against humankind.

But that book, The Moon Is a Harsh Mistress, features a supercomputer named Mike. And for our listeners, what humans do with Mike might hold the key to AI.

BILL LESTER: They have a computer that has got so many responsibilities that it actually wakes up, and they give it jokes. They ask it to write jokes. And if it can write a funny, original joke, that was one of the first things that the character saw happening when he thought, oh my, something's going on here.

So I think a sense of humor and maybe ask a computer to write a funny, original joke would be a good start.

CHAKRABARTI: That's too tantalizing to not actually try.

So joining me now is Melanie Mitchell. She's a professor at the Santa Fe Institute, an expert in artificial intelligence and cognitive science, and also author of the book Artificial Intelligence: A Guide for Thinking Humans.

And she joins us today from Santa Fe, New Mexico. Professor Mitchell, welcome to On Point.

MELANIE MITCHELL: Thank you very much. I'm thrilled to be here talking to you.

CHAKRABARTI: Okay, we're going to do a live on the fly experiment here. I'm just determined to have wacky fun here today. You have a computer in front of you in Santa Fe and I've got a laptop also here in front of me.

We're over Zoom right now, and I am screen sharing with you my screen that has Claude AI open. Can you see that?

MITCHELL: Yes, I can.

CHAKRABARTI: Okay. So should we try this experiment that Bill has recommended, about asking Claude to write an original funny joke?

MITCHELL: Absolutely. Let's try it.

CHAKRABARTI: Okay. So here, I'm going to type it in and then when Claude gives us the answer let's just see what happens.

You can give your response, but, okay. Write an original funny joke. Okay. I'll say, please. I don't know why I'm trying to be polite to the AI, but please write an original, funny joke. Okay, here it goes.

Oh, I have to press the arrow, not the return key. Sorry. Now Claude wants me to sign in. Why Claude? Okay. You know what? I'm going to come back and do this later because I did not sign in. I was using the free version. I do have a Claude account, but let me just ask you. The dangers of doing an experiment on live radio.

We will come back to that, I promise. Let's get to the question of what is thinking? Like, how would you define what the brain is doing when it's thinking?

MITCHELL: Yeah, that's a very hard thing to define because it's not really a scientific term, if you will. We mean many different things by thinking and our notion of thinking, as humans, our notion of what thinking is has changed over time dramatically, often in response to AI.

CHAKRABARTI: Okay. In response to AI. So before AI, then, what was the notion of what thinking is?

MITCHELL: Many people thought that thinking was a logic process.

A process that maybe was reflected in our language as we spoke, that we were expressing our thoughts. And it was all at a conscious level. And when we were solving a problem, we were going through the logical steps of solving the problem. And that was how AI was created back in the beginning of the field in the 1950s and 1960s.

It was all modeling human thought on logic, the steps of rational, logical deduction. But that didn't work.

CHAKRABARTI: Yeah. It's interesting because it was also strongly tied to language.

MITCHELL: Absolutely. Okay. There were many people who believed that language and thought were the same thing.

CHAKRABARTI: Ah, okay. That would actually put thought on quite a high and narrow plane in terms of the definition, if it's tied to logical reasoning as expressed through language.

So what were the limitations of that?

MITCHELL: It turns out that's not how we think, and it doesn't really work very well. For example, if you try to get a computer to actually solve real problems in the real world, it requires a lot more than that. And just trying to program it all in manually turned out to be very unsuccessful.

Which is why nowadays, AI systems are trained on data rather than having all of that knowledge manually programmed in. They're trained, and they learn from data.

CHAKRABARTI: Okay. Okay. So the difference then being that initially, just to explain it in more layperson's terms, we were almost, like, by brute force coming up with code that we thought would replicate the process or the steps that a human being goes through when thinking. It's an oversimplification, but is that kind of what the approach was before?

MITCHELL: Yeah, absolutely. It was totally brute force.

CHAKRABARTI: Okay. And obviously now we have training on basically all the information out there on the internet.

MITCHELL: Exactly, yes.

CHAKRABARTI: Okay. So I guess, I don't know if this is the right time to ask this, but I always wondered, like, how is that done?

I don't expect you to explain the trillions of lines of code behind how that training is done, but it does take an imaginative leap for me, which I somehow can't successfully make. That there's still some sort of fundamental set of rules, right? That the AI's code has, that it applies to all the data that it's training on.

Is that accurate?

MITCHELL: Yeah, but it's actually very simple. Basically, the AI system is trained to predict the next word given a set of text. So if I say, "This morning for breakfast, I ate," then it has to predict what the next word is. And for instance, "a bowl of oatmeal" might have very high probability, whereas "a pile of children's blocks" would be very unlikely for me to eat for breakfast.

And over many cycles, weeks or months of this training on, as you said, everything on the internet, all kinds of books, all kinds of digital text, these systems are trained to predict what the next word is, given a sentence or a block of text. And that results in these systems learning the structure of language and the structure of what is likely to be said.

So that's basically it. Except that after being trained on all of this text, they have to learn how to be chatbots, to have a conversation. And so after all of that, what's called pre-training on language, they're trained on conversations, on how to have a conversation, and some other things, like not to be toxic, not to be biased.

All of those things, those are all through simple training on predicting the next word.
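To make the objective Mitchell describes concrete, here is a minimal sketch, using toy bigram counts on a made-up corpus rather than the neural networks real LLMs use; the corpus and names are invented for illustration. The training signal is the same idea: score candidate next words by how probable they are.

```python
# A minimal sketch of the "predict the next word" objective, assuming a
# toy corpus and simple bigram counts. Real LLMs use deep networks over
# vast text, but the prediction target is the same idea.
from collections import Counter, defaultdict

corpus = (
    "this morning for breakfast i ate a bowl of oatmeal . "
    "this morning for breakfast i ate a piece of toast . "
    "for lunch i ate a bowl of soup ."
).split()

# Count how often each word follows each preceding word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(prev):
    """Probability of each candidate next word, given the previous word."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_probs("ate"))  # {'a': 1.0}
print(next_word_probs("of"))   # oatmeal, toast, soup at ~0.33 each
```

In Mitchell's example, "a bowl of oatmeal" scores high after "this morning for breakfast I ate," while "a pile of children's blocks" scores near zero; training nudges the model's probabilities toward what actually appears in the text.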

CHAKRABARTI: So is that how human beings think? At least with language-based thought, are we just doing word prediction?

MITCHELL: We don't understand that well how humans think. The brain is an incredibly complex thing.

We do word prediction; that's something that has been shown. We're often just very good at predicting what someone's going to say next. And you can try that with a friend: have them say part of a sentence and you predict what they're going to say next. It's not that hard.

Language is very predictable. But we also have a lot more going on in our brains than just predict the next word. And that's something that's very different between how we think and how these machines behave.


CHAKRABARTI: Okay. So right there, then, we're seeing a difference, at least between the AIs that are the large language model type of AI and human thinking.

So would you say that the LLMs in doing their prediction and then shaping those predictions into conversational language, that it's not close to what human beings do, or is it just separate and apart?

MITCHELL: I think it's very different from what humans do. When we learn language, we learn it in the context of the world.

We're here in the world, think of a baby. It has senses, it's seeing things, it's hearing things, it's actually moving its body and getting reactions. It's not a passive learner of language. It's connecting language to its experiences and getting lots and lots of feedback from its carers and other people it interacts with in order to learn what a word actually means.

And language models are passive. They're fed all this text, they predict the next word. They get some feedback that says right or wrong, and they don't actually interact with the real world. It's only the world of language. So that's very different.

CHAKRABARTI: It seems to me that this one difference, and the others that we'll talk about later, makes the kind of thinking, quote-unquote, that AI is doing like a small fraction of the capacity of thought that human brains still have.

MITCHELL: That's a big question, what their capacity is. And they're actually very good at some kinds of things, some kinds of problem solving. They are very convincing as well. We'll see, when we play with Claude, that they're convincing conversationalists.

CHAKRABARTI: Thank you for mentioning Claude. We will get back into it, because we are going to do it. I finally figured out that I hadn't even logged in, like a fool. See, my own brain is not fully thinking today. But today it's episode three of our series Brainwaves, and we're talking about AI and human thinking.

Part II

CHAKRABARTI: I have always stood by the mantra that failure is actually good in life, because you learn from your failures. And just to be completely transparent with listeners, the reason why we failed in our Claude experiment in the previous segment is because I hadn't logged into Claude. So dumb. Okay. But can you see the Claude window now?

MITCHELL: Yes, I can. It says, welcome Your Highness.

CHAKRABARTI: (LAUGHS) You outed me. Yeah. That was just me being cheeky when I signed up for Claude. Because I was like, I'm interested in AI as a tool, but I'm also dubious of it, so I thought I'd just put it in its place by having it call me Your Highness. Okay, excellent.

Beforehand we had listened to listener Bill from Piqua, Ohio, and he talked about the book The Moon Is a Harsh Mistress, and the supercomputer named Mike in that book that was asked to make original jokes. So here we go. Okay. How can you help me? Claude, can you create an original? No. Of course it can. Please create an original joke for us. Okay. Fingers crossed that it'll work this time.

Here we go. Arrow. It's thinking. (LAUGHS) Did you read that Professor Mitchell? Can you see that?

MITCHELL: Yes.

CHAKRABARTI: You're not laughing.

Okay. It says: A pessimist and an optimist walk into a library. The pessimist whispers, I hate this place. It's so quiet and boring. The optimist whispers back, That's because all the good stuff is between the lines. And the librarian shushes them both. And the pessimist says, See?

Okay, now that I've read it, it is actually not funny.

MITCHELL: Yeah. I don't really get it, do you?

CHAKRABARTI: No, I don't get it either.

MITCHELL: Maybe you can ask Claude to explain it.

CHAKRABARTI: Oh, okay. Okay. That's a good idea. Explain this joke. Should I say I don't? Why --

MITCHELL: Say, why is it funny?

CHAKRABARTI: Why is it funny? Okay. Okay. Why is it funny? Oh, dissecting human humor. That's a good question.

Okay. Why is it funny? Here's what Claude says: The joke works on a few levels. Oh my God, this is a long explanation. The setup primes you to expect the classic optimist-sees-the-glass-half-full, pessimist-sees-the-glass-half-empty dynamic, a familiar trope. I guess. The punchline mechanics: "all the good stuff is between the lines" is a double meaning, blah, blah, blah, blah.

I didn't really see that as the punchline. Oh, but the real punchline is the librarian shushing them, an unexpected intruder that reframes everything. Hang on, remind me tomorrow. It's a twist that makes the joke land. So a combination of wordplay, a subverted expectation and character truth. The pessimist doesn't actually need anything to go wrong.

He just needs something to happen. I don't buy it at all. It still wasn't funny.

MITCHELL: Yeah, I didn't find that funny.

CHAKRABARTI: So what's it doing here, right? Because if you read this explanation, you could think there's the recipe for writing a good joke. Except clearly it's not, because writing a good joke is supremely difficult, right?

Humor is not easy and it's one of those highly abstracted things that people can do that is really hard to replicate. What are you seeing here in the AI's explanation?

MITCHELL: The AI obviously has been trained on many jokes and also probably on textbooks if they exist, comedy textbooks that teach you how to tell a joke.

I don't know if they exist, but probably, and many analyses of jokes. And so it has the form of a joke. But as you said, what makes a joke funny is very subtle, and it hasn't really gotten that. I think that these AI systems can produce the form of a joke, and maybe sometimes they get lucky and the joke actually is funny, but for the most part, I don't think they really get it.

CHAKRABARTI: Okay, so this is really critical, I think. That by definition, it seems, and correct me if I'm wrong, but in doing the predictions, right, while these models are being trained, they start out with a set of rules, and then maybe those rules evolve and change as they're absorbing information and learning from each prediction set that they go through.

But like how much are those rules changing? Because this still doesn't seem to be all that sophisticated. I'm going to try right now, I'm going to tell Claude, that joke wasn't funny.

MITCHELL: Oh, it's going to apologize to you. I bet.

CHAKRABARTI: I'm going to say please try again.

Make a funnier joke. Let's see what it does.

MITCHELL: Okay. Perfect.

CHAKRABARTI: But it still seems like the rule following is so absolute, which is also not what human brains do. And I just might be, like, romanticizing human thought here as something more than the most sophisticated computational device that we have in the universe.

But it seems like somewhere originality emerges in human thought, which is not what we're seeing here, even though we're asking Claude to come up with an original joke.

MITCHELL: Oh, this joke that it just came up with is a little racy. Okay.

CHAKRABARTI: Oh my God. (LAUGHS)

Oh, can I say that word? Can you say this on the radio? I can't say that word. Holy cow. ... Okay guys, it has to do with human anatomy. Let's just put it that way. Do you know what made me laugh? It's not the joke that made me laugh. What made me laugh was, as you said, the context, we're in this context of being on a radio show where we can't actually say that word, and that was funny.

MITCHELL: Yeah. It's like you didn't like my last joke. How about this one where I'm going to actually be a little bit not safe for work.

CHAKRABARTI: (LAUGHS) Okay. Wow. I did not know that Claude did that. Is that a surprise to you?

MITCHELL: A little bit. I would expect that of Grok, but not of Claude.

CHAKRABARTI: Oh, yeah. Okay. I'm going to read most of it except the questionable word.

It says, A man walks into a psychiatrist's office wearing nothing but plastic wrap. The psychiatrist looks up and says, I can clearly see your [expletive]. Anyway, so tell me, this is, again I'm trying to understand then, how is it that, let's say, Claude's process right now, what do you see in that process that is helping us understand more about how human brains think?

MITCHELL: Oh, that's a great question. One thing that we don't know is how much of Claude's training data it's memorized. Probably quite a lot, and it's possible that this very joke or something similar was in the training data. That's something that's hard to know.

I had an experience where my husband came up with this very silly riddle that he gave to ChatGPT, and ChatGPT got it, and he was quite impressed. And here's the riddle: You add an 'a' to a processed meat and get a shellfish. And the answer was abalone.

CHAKRABARTI: Oh, okay.

MITCHELL: He thought that was very impressive. But then we went back and we did a little bit of searching and we found that was actually a joke that was used on The Simpsons, an episode of The Simpsons.

CHAKRABARTI: (LAUGHS) So it just stole it essentially?

MITCHELL: It just stole it, yeah.

CHAKRABARTI: Oh, wow. Okay.

MITCHELL: So we don't know, that's something that's hard to know.

The training data is so vast that it's hard to know what's in it and what's not in it. And these systems are able to paraphrase, they're able to summarize, but they're also able to come up with new things on occasion. And I don't know. Listeners can look up jokes about men walking into a psychiatrist's office wearing nothing but plastic wrap and see if they can find any jokes with that opening.

CHAKRABARTI: It's so good. It just reminds me, again, I'm so glad we landed on humor as a thing to discuss. Because it is so complex and yet at the same time accessible. And, except for physical humor, it's entirely language-based and context-based. Like, one of my kids last year came up with a joke, or claims that she came up with a joke, that goes something like this.

It goes: Why was the dumpling so dangerous?

MITCHELL: Okay, why?

CHAKRABARTI: Because it's capable of wanton destruction.

MITCHELL: Oh, that's a good one. See, that actually made me laugh.

CHAKRABARTI: But so simple. Okay, so we're going to bring another guest in here. But first, let's turn to Anna Ivanova, because she's an assistant professor of psychology at Georgia Tech, who co-authored a paper with the person I'm about to introduce.

And Professor Ivanova says, it's hard for us to disentangle our own understanding of thinking as it relates to language, because we're the only other beings on the planet who use language.

ANNA IVANOVA: If we have a system that uses language, we automatically infer, oh, they must be good at reasoning. They must have thoughts and ideas and plans and desires.

But in fact, these language models arguably don't necessarily have all of these things. And so it's really important to separate how good they are at producing grammatical coherent language from how good they are at actually thinking.

CHAKRABARTI: Nevertheless, Professor Ivanova says, AI models do offer an opportunity to get a better understanding of human thought.

IVANOVA: What we can do with these models that we cannot do with the brain is take them apart. If these models are openly available, maybe custom built in the lab, we can then trace how information flows through the model step by step. So people often call these models a black box. But it's really a white box, so we can look inside.

We can see what is inside. It's still difficult to understand how these mathematical updates in the activations of the model translate into behavior, but it's an open problem and an interesting problem, and similar to the kind of problems that we face when trying to understand the human brain.

CHAKRABARTI: So that's Professor Anna Ivanova at Georgia Tech. And joining us now is Kyle Mahowald. He is an assistant professor in linguistics at the University of Texas at Austin, and co-author of this really interesting paper called Dissociating language and thought in large language models. Kyle, welcome to On Point.

KYLE MAHOWALD: Hi, Meghna. Hi, Melanie. Great to be here.

CHAKRABARTI: Feel free to tell me to put prompts into Claude as we continue on, because I'm really enjoying having an AI as another guest here. But first of all, I just want to quickly point out something that Professor Ivanova said, which I do question a little, and I want to get your take on it.

Because the presumption was that it's hard to understand LLMs in terms of whether or not they're thinking, because we're the only other beings on the planet that use language. And I don't necessarily think that's true. We've found evidence that there's some kind of oral communication, in the sense that you can hear it, between, say, dolphins or whales.

So is that, I just wanted to bring that up as perhaps a point of contention, Kyle.

MAHOWALD: Yeah, I mean I think it's fair to say that plenty of other animals like from apes to dolphins and whales have communication systems that are not like human language, but are certainly communication systems. They might be reasonably complex in their own ways and to some extent it might be a matter of degree rather than kind.

But I think it's certainly fair to say that there are really important differences with human language, where we can express really abstract thoughts, right? So we could talk about something very far away in time or space. So when Melanie was talking about predicting what you had for breakfast, I can get a very good picture of that even though I'm here in Austin, Texas, far away. And that kind of very abstract ability is something that, as far as we know, is pretty uniquely human.

CHAKRABARTI: Ah, okay. This has come up in another show that we did a while ago. So there's a distance factor that makes human language unique.

Or communicating abstract thoughts across distance. That's really interesting. Okay, so let me ask you the question that I asked Melanie right at the top. How would you define what human thinking is? Do we have a workable definition?

MAHOWALD: Yeah, like Melanie said, I think that is a tricky word to define.

I might give a slightly annoying answer as a linguist, which is that I tend to have a very functional view of what words are and why we have them. And so I would say that if we can use the word thinking to help us communicate better, that's thinking.

And so if we didn't have a word like thinking or thought, and I had to describe what I'm doing when I'm trying to come up with ideas or trying to reason through a problem, I might have to be very verbose and use a lot of extra descriptions. And it turns out thinking and thought are very useful shorthands for communicating to you what it is that I'm doing.

And so my answer for something like, is AI thinking, would be: if it turns out to be useful, and these concepts that we've developed, like thinking and thought, help us better communicate about AI and better understand it, then I'd be happy to apply words like thinking and thought to AI. And I think we might be in something like that case at this point.

CHAKRABARTI: Okay. You know what, I'm going to ask Claude something. I'm going to ask Claude, When you generate, sorry, guys. Generate responses to prompts. I can't spell. Responses to prompts, oh my gosh. Prompts. Are you actually thinking? What do we think Claude might respond?

Okay, so the question is, when you generate responses to prompts, are you actually thinking? Okay, let's see what it does.

It's writing a novel. It's writing a novel. Oh, it's still going. Okay. It starts with: It's one of the genuinely hard questions, and I want to be honest with you, rather than give a tidy answer either way. Can we just dissect that quick? Oh my gosh. We're coming up on a break. Here's an AI that says it wants to be honest with you, but that is just a program. That is just a conversational response. It doesn't necessarily mean that the AI is actually debating whether it should be honest or not. Kyle, is that correct?

MAHOWALD: Yeah, I think that's right. And I think Melanie had mentioned that modern models get a lot of training. After they've done their pre-training, they get a lot of very specific training to make sure they don't say things that are inappropriate. And one of the areas they get trained a lot in is how to answer exactly questions like this. Because these companies know people are going to ask, are you thinking, are you conscious? And so some of that might be reflected in that process.


Part III

CHAKRABARTI: Kyle, I'm sorry you can't see this, but Melanie can. I just want to dive a little deeper into what Claude, the AI, answered to the question of, when you generate responses, are you actually thinking? So the AI said something is definitely happening when I process a prompt.

There are computations and functions, like weighing options, et cetera. So it says when it's generating something, there's a process involved with that. I think that's not really up for debate, right? But then it goes into this almost metaphysical exploration of what constitutes thinking and how human beings experience it.

And it says the deeper issue is that even the definition of thinking is contested. What I can say is that I don't experience my responses as mechanical or hollow from the inside, but I also can't fully trust my own self-report on that, since I could be wrong about my own inner state.

CHAKRABARTI: Kyle, parse that for us.

MAHOWALD: Yeah, it's a really interesting response. I think this touches on an area that I'm actually quite interested in right now in my own research, which is introspection, and whether we can actually trust AI reports about their own internal states. And so here it's giving us some description of what it's experiencing when it's thinking, and it's possible that is in some way a faithful or veridical description.

Or it's also possible that it's ingested a lot of science fiction. It's ingested a lot of text about robots reporting on their own internal state, and then is somehow using that to describe not necessarily what it's actually feeling, but somehow reporting on the kind of thing that one might expect an AI to say about how its internal processes work.

And so there's been interesting work lately. There is a recent paper out of Anthropic on AI introspection, trying to test this. And it's actually a pretty tricky thing to come up with good empirical tests for whether these are in any way meaningful reports of introspection.

CHAKRABARTI: Yeah. And just for listeners I am using Claude AI, which is Anthropic's AI.

But Melanie, I wonder, even though I'm dubious about this issue of is AI actually thinking in the same way that the human brain is? I'm going to argue against myself here, because as Kyle just talked about, like an internal state for a human being is also highly influenced by information inputs, right?

Like my internal state is mediated partially by, of course, the information from my environment, what I'm reading, what I'm listening to, et cetera, et cetera. So how is that actually different than a quote-unquote internal state of an AI?

MITCHELL: I think these are tricky questions of course, but you have an internal state.

That feels like, you have a notion that you are an entity, you have a self, you can talk about yourself. And that's an actual real feeling that you have. Whereas when Claude talks about itself, it uses first-person pronouns. It uses "I," it talks about its experience. But it's very unclear that it actually has anything like that, that it has any sort of self or any kind of experience in the way that we would talk about that.

And I think that's the crux of a lot of big debates about these models. But it's really hard to get at what's happening here and whether we're fooling ourselves.

Because these systems are using this kind of language and using this first person pronoun, which they've been trained to do, that really can fool us into thinking that they are actually having some kind of self-awareness or experience.

CHAKRABARTI: Don't we have that same problem with people, though? From a scientific standpoint, we would verify our hypothesis about whether there is an internal state to a person through observation, through measurement. But, and maybe I'm wrong about this, it seems the only real way to observe whether a person has an internal state is through their own self-reports.

Which is the same thing as with Claude telling me, I can't trust my own self report on that, but here it is. Do you see what I'm getting at Melanie?

MITCHELL: Yeah. I do think this is what makes it very tricky. But we know a lot about humans. We know that they are consistent to a much bigger degree than an AI model. You could give another prompt to Claude, and it would say something entirely different, a completely different opinion. We also know that we have a consistent set of experiences that we remember, that form us, that shape us. We have what people call autobiographical memory.

I remember the experiences and events in my life. Whereas Claude, if you turn it off and back on again, it has absolutely no memory of this conversation. So it's not clear what it would take to test, to verify, that it has this self-awareness. It's very tricky.

But I think it's dangerous because we humans have such a bias towards things that communicate with us in fluent language. And we have to be very careful about the fact that we've been fooled in the past by chatbots. This isn't a new thing. And we are very bad at overcoming that kind of anthropomorphic bias that we have.

CHAKRABARTI: Yeah, absolutely. Kyle, did you want to respond to that?

MAHOWALD: Yeah. So this is something I think is important to keep in mind. And this was something we wrote about in this Dissociating Language and Thought paper. That for the vast majority of human history, if something could communicate in fluent, coherent language, it meant that underlying that was fluent, coherent thought.

And I think a couple years ago, the models, to the surprise of a lot of people in linguistics, became able to produce fluent, coherent, syntactic sentences. This is actually a real challenge, to produce coherent sentences. But it didn't necessarily mean the models were thinking, or could do mathematical reasoning or social reasoning or all of these other kinds of things.

Having said that, all of those capabilities, even in the last two or three years, have really started to improve. Just in the same way that the language ability came, a lot of these other abilities and capabilities also seem to be coming along. And so it's really a bit of a moving target.

And so I'd say the state of where AI capabilities are at, in terms of these higher-level cognition properties, is quite different than maybe two or three years ago. It's going fast.

CHAKRABARTI: That's interesting. Because I was just gonna ask both of you: we're really focusing on language right now, I guess for obvious reasons, as we've discussed. But the human brain thinks in a lot of other ways.

As both of you have mentioned, there's visual thinking, there's social and emotional thinking that goes on. Melanie, you had talked earlier about physical thinking, which I think is utterly fascinating. My mind wanders to all those incredible athletes right now in the Winter Olympics, using their bodies in the highest form possible.

There's so much thinking going on in just the very actions they're doing, whether it's on the mountain or on the ice. And Kyle, are you saying that even in those realms of thinking, AI has made progress?

MAHOWALD: So in the physical realm, I think things have not accelerated quite as much.

I think robots are not quite at the level of these language models. But in things like math, and certainly writing computer code, the AI assistants that help write computer code have made dramatic improvements even in just the last couple months. So people who do a lot of software engineering and write a lot of code have started to report that these assistants are becoming pretty indispensable to their productivity and their workflow, in a way that's very different than even a few months ago.

CHAKRABARTI: Okay. Before we get to the next part of the conversation, because I do want to talk about how AI is actually helping us understand human thinking more. Let me just give Claude another prompt here. Because I'm fascinated. Okay. I'm going to ask Claude, Please come up with a test that we could apply that would help, that could show whether or not, you, Claude AI are actually thinking. Let's see what happens.

Okay, while we wait for that, I want to go back to Professor Ivanova. She talked to us about semantic reasoning, or sifting through basic world knowledge for information that's relevant to a specific question, and her lab is currently working on understanding this process.

IVANOVA: We bring in human participants into the lab. We're scanning their brains as they do the task to try and see what parts of the brain are involved. And then we're running a similar experiment on AI systems. And there we can look at how similar the behaviors are and then look inside and compare and contrast the two.

But in AI, we can do it with much better resolution. Because we have access to every single artificial neuron.
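That access is the key technical difference. As a minimal sketch of the idea, assuming a tiny toy network rather than a real language model, one can record every unit's activation exactly, something impossible with biological neurons:

```python
# A minimal sketch of "access to every single artificial neuron":
# attach a forward hook to a tiny toy network (invented for illustration,
# not a real language model) and record its hidden-layer activations.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(4, 8),   # toy "hidden layer" of 8 artificial neurons
    nn.ReLU(),
    nn.Linear(8, 2),
)

recorded = {}

def save_activation(module, inputs, output):
    # Called automatically on every forward pass through the ReLU.
    recorded["hidden"] = output.detach()

model[1].register_forward_hook(save_activation)

x = torch.randn(1, 4)      # one toy input
logits = model(x)
print(recorded["hidden"])  # every hidden unit's value, read exactly
```

Interpretability work on real models scales this same move up: record internal activations, then try to relate them to behavior, which is the step Ivanova notes is still hard.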

CHAKRABARTI: Okay. So Professor Mitchell, tell me more about how AI is helping us better understand human thought, by modeling what we know of human thought in these systems.

MITCHELL: Yeah, I think that AI throughout its entire history has been teaching us about human thought, both by showing similarities and differences and also by questioning our assumptions. So for example, back in the 1970s, many people thought that a computer that could play chess at a grandmaster level would obviously need to have human-level, general intelligence, that chess would be a good benchmark for that.

But of course, then we had these models like IBM's Deep Blue, which beat Garry Kasparov at chess and became better than any human, without anything like general thinking. Now we're facing that same question about fluent conversation, which has been known as the Turing test. Can a machine fool you into thinking it's a human through conversation?

And it turns out, and I think this was a big part of Kyle and Anna's paper that you've been talking about, that generating fluent language is not the same as thinking, that it can be done without all these different kinds of mathematical, physical, et cetera, cognition.

But on the other hand, we're seeing some similarities. As Anna said, they're comparing activations in these models with activations in human brains. But that's a little tricky to do, because these models are very different from human brains.

But it is showing that this kind of predictive processing, predicting the next word, predicting the next scene in a video, is something that we humans do quite a bit as part of our thinking. We do prediction, and people are seeing a new kind of focus on prediction as a really important thing that is involved when we think.

CHAKRABARTI: Okay. Kyle, before I get your take on this, I just want to briefly share how Claude responded when I asked it to come up with a test that could show whether it's actually thinking or not. It mentioned the Turing test, but that one's been conquered already. And then Claude mentions novel problem solving under genuine novelty.

Present the AI with a problem that it's never actually seen in any of its training data. That would be cool, but challenging. Then, productive failure: a thinking entity should be able to recognize when it's stuck and explain why it's stuck. And then there's another one, and I'll skip to the last one.

Spontaneous metacognition: Does the system notice things about its own reasoning, unprompted? Kyle, I'll give you the last word here. What do you think about that? And also, I would just love to hear you on how AI is actually helping us understand our own thoughts.

MAHOWALD: I think that was not a bad list of tests.

Although some of them are tricky, right? So like Melanie said, the training data is enormous. And so figuring out what is or is not in the training data can actually be really tricky. Then, in terms of what AI is teaching us about humans, I'm really optimistic that we've learned a lot from that, and we're going to continue to learn a lot from that.

On the linguistic side, I think there have been these longstanding questions about what it takes to learn language. So it's this incredible feat that babies and toddlers acquire human language with relatively little input. It was long thought that to do that, you had to have a lot of built-in machinery in the human genome and the human brain for acquiring things like grammatical categories, like subjects and objects.

Language models don't have that. They learn, like Melanie said, from predicting the next word. And it turns out that signal is incredibly rich. And so there's a ton of information there.

The first draft of this transcript was created by Descript, an AI transcription tool. An On Point producer then thoroughly reviewed, corrected, and reformatted the transcript before publication. The use of this AI tool creates the capacity to provide these transcripts.

This program aired on February 11, 2026.

Willis Ryder Arnold, Producer, On Point
Willis Ryder Arnold is a producer at On Point.

Meghna Chakrabarti, Host, On Point
Meghna Chakrabarti is the host of On Point.
