You know, I had this paper, "Conscious Exotica", in 2016, and then I joined DeepMind in 2017. Up to that point I'd been thinking and writing quite a bit about consciousness, but then I sort of stopped, because I didn't think it seemed appropriate for somebody working in a corporation to be talking about consciousness, especially in the context of AI, because it might sound like we're trying to build conscious AI, which I don't think is a good look, or a particularly good project either. The thing is, with today's generation of large language models, I think it's becoming increasingly difficult to avoid the subject, because people, whether we like it or not, will ascribe consciousness to the things that they're interacting with. And we see this left, right and centre, even from people who know exactly how they work: Ilya Sutskever said he thinks large language models are a little bit conscious, and we had the Google engineer who ascribed consciousness to one of our models. I think we're going to see this more and more, so whether or not I think it's appropriate to talk about them in terms of consciousness, it's going to happen anyway. So I think it's really important to actually think through these issues: what do we mean when we use the word "consciousness", how do we apply it to exotic cases, and, this is really important, how
might our language change to accommodate these exotic and strange things that have come into our lives?

What goes through your mind when you speak with a language model? Who is it that you think you're talking to? Do you anthropomorphize them? Now, Janus from LessWrong put out an article a couple of years ago called "Simulators", and the basic idea is that a language model is like a simulation machine, producing manifestations of role players, which we willfully anthropomorphize. We think of them as humans, but they're imperfect copies: they don't capture the essence, they're glitchy, and actually they wear masks. You know the shoggoth theory of language models: there's this big gnarly shoggoth, and then we do RLHF and put a smiley face on top. Well, that's the human mask, which we anthropomorphize, but we don't think enough about what lies behind the mask. Do you know what lies behind the mask? It's a monster.

The danger of anthropomorphism, I think, is in thinking that a system such as a large language model, ChatGPT or something, thinking that it has capabilities
that it doesn't have; it's as simple as that. Actually, it's also thinking that it lacks capabilities that it does have, so in both cases we can go wrong. We can go wrong because they exhibit very human-like linguistic behaviour, so we can just assume that they're going to be very human-like in general, in all the rest of the behaviour we encounter from them. We can find that in one moment a large language model makes a ridiculously stupid mistake that no child would make, and in the next moment it's saying something extraordinarily profound philosophically. And because that's what I think is actually going to happen, and what needs to happen, I think we need to find new ways of using the vocabulary we have, new forms of vocabulary; I've sometimes used the phrase "consciousness-adjacent language". We need to find new ways of thinking and talking about these things, to recognize the fact that they do exhibit behaviour which we're inclined to talk of in terms of consciousness, and that indeed people are going to start to value as well. So I think we do need new forms of language and new forms of thinking to accommodate all of this.

Yes, and our language is so adaptable that I think it's just a natural evolution: when we have these new artifacts thrust into our lives, we'll need to adapt our language.

We will, but I think there'll be some disruption while people disagree about how to talk about these things, and that's inevitable, I think.

Murray Shanahan is a principal research scientist at Google
DeepMind and professor of cognitive robotics at Imperial College London. He was educated at Imperial and Cambridge. His publications span artificial intelligence, machine learning, logic, dynamical systems, computational neuroscience and the philosophy of mind. He was scientific adviser on the film Ex Machina, and he penned "Embodiment and the Inner Life" in 2010 and "The Technological Singularity" in 2015. Shanahan has spent his career understanding cognition and consciousness in the space of possible minds; this space of possibilities, he has said, encompasses biological brains, human and animal, as well as artificial intelligence. He worked in symbolic AI for over ten years, concentrating on common-sense reasoning, and then spent ten years studying the biological brain, specifically how its connectivity and dynamics support cognition and consciousness, developing a particular interest in global workspace theory. After that, at DeepMind, he pivoted to deep reinforcement learning, and recently he has been working extensively with large language models, trying to understand them from a theoretical, philosophical and practical perspective.

Professor Shanahan, I was absolutely fascinated when I read the article from Janus called "Simulators". Could you sketch out the article?

Well, I can sketch out
some elements of it, yeah. I was also very impressed and influenced by that article. Basically, they're advocating a certain way of looking at large language models and their behaviour. What they say is that we should think of a large language model as a kind of simulator, capable of simulating language-producing processes of various sorts, all kinds of language-producing processes, and in particular capable of simulating people, humans, and different kinds of humans: humans playing different sorts of roles, maybe humans who are helpful assistants, or humans who are crazy psychopaths. And indeed, in their way of thinking, these are all examples of simulacra. Simulacra, in their conception, include not only human beings but anything that produces language at all, so your base model can simulate anything that can generate language. Now, of particular interest, of course, are humans, human language producers, so the particular class of simulacra that I'm interested in are humans playing different roles. And in the work that I've been doing, I've been thinking of language models in terms of role play, in terms of their ability to play a part, if you like. This was very
much inspired by, and drew on, this work of Janus. Now, they make another very interesting and important point in that article. They draw attention to the fact that with large language models, at any point in an ongoing conversation, the next word that's produced, or the next string of words, the next sentence, is the product of a stochastic process. What the underlying language model actually generates is a distribution over the possible words that might come next, and then you sample from this distribution to come up with an actual word, and that's the word you give back to the user. So, for example, my favourite example: if you ask the language model to tell you a story, and it says "once upon a time there was", at that point, as at all the points before it, it's going to generate a distribution over the possible tokens, the possible words, that might come next. So, "once upon a time there was", and it might say a beautiful princess, or a handsome prince, or a fierce dragon, and it could say any of those things, depending on the sampling process. And the point is, you can rewind the conversation, come back to that same point, sample again, as we can all do with the interfaces that we have, get a different answer, and take the whole story off in a completely different direction. So what they draw attention to is the fact that at any particular point in a conversation there's really a whole set of roles being played by the underlying simulation, and the conversation shapes which role is being played. In that sense it's sort of unlike a human being, because you've got, as they put it, a whole superposition of simulacra that are all being simulated at once, and as the conversation progresses, the distribution of simulacra is being narrowed down.

Yes, but as you say, you can view language models at the low level as next-word generators, whereas what we strive to do in science is come up with explanations that demarcate the thing very clearly, and there's this idea of the language model being a simulator which produces the simulacra. And you said in
your role-playing article in Nature that if you had a sufficiently advanced UI, you could actually play with counterfactual trajectories and start to understand how sticky the simulacra are, because, as you pointed out, when the language model says "I", sometimes it's talking about ChatGPT or whatever, the simulator, and sometimes it's talking about the simulacrum. And these things are trained on everything on the internet, structured narratology, essays, novels, and it's fascinating to see how you can jump between these different parts of the trajectory structure. In your Nature paper you gave a beautiful example, which was the 20 Questions game. Would you mind introducing that?

Yeah, sure. I think we're probably all familiar with the 20 Questions game: one player thinks of an object, and the other player has to guess what that object is by asking a whole bunch of questions with yes/no answers. So I might, in my head, think of a pencil, and then you might say, oh, is it larger than a house or smaller
than a house, and I say, actually, that's not quite a yes/no question, but it is binary: is it larger than a house? And you'd say, oh no. Is it made of wood? Yes. Is it a tool? Yes. And eventually you might guess the answer. So we're familiar with this little game, and you can play it with a large language model, of course. You can ask the large language model to play the part of the setter, who thinks of the object, and then you play the part of the guesser, who tries to guess the object by asking questions. Now, if you do this with a person, and the person isn't cheating, as it were, they will think of an object in their head, fix that object in their mind, and then answer the questions according to the object they thought of in advance. But a large language model can't really do that, unless you use some hack or another. What it really does is this: you say, think of an object, and it says, I've thought of an object. It hasn't really thought of an object; it has just issued the tokens that say it has. But then you'll ask a question, is it larger than a house, and it'll say no, and eventually, if you say, I give up, tell me what the object is, it will say, oh, I was thinking of a pencil, and typically it will indeed give you an object that's consistent with all the answers it gave to your questions. But if you just wind back one step, resample, and ask again, I give up, what were you thinking of, it might say a mouse, or a bottle; it could say something completely different, which indicates that it had never really committed to any particular object in the first place. And in fact, in theory, you could rewind further, and it might give you different answers to the same questions. What you've really got is a whole tree of possibilities: this stochastic sampling process, at any point in a conversation, induces a whole tree of possibilities that branches forth from where you are right now
and counterfactually you can always rewind the conversation to an earlier point, revisit it, sample again, and go off on a different branch. My co-authors on that Nature paper, in fact, Laria Reynolds and Kyle McDonell, have this system called Loom, which allows you to retain the whole tree of a conversation: you can visualize it, revisit different points in the conversation, resample, and explore the thing as a whole tree of possibilities.
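The mechanics described here, sampling one continuation from a distribution over possible next steps and then rewinding the same stem to resample a different branch, can be sketched in a few lines of Python. This is purely a toy illustration: the hard-coded distribution below stands in for what a real model would compute with a neural network, and the story continuations are the interviewee's own example.

```python
import random

# Toy stand-in for a language model's next-step distribution.
# A real model computes this with a neural network; here it is
# hard-coded purely to illustrate the sampling mechanics.
def next_step_distribution(context: str) -> dict[str, float]:
    if context.endswith("there was"):
        return {"a beautiful princess": 0.4,
                "a handsome prince": 0.3,
                "a fierce dragon": 0.3}
    return {"The End.": 1.0}

def sample_continuation(context: str, rng: random.Random) -> str:
    # Sampling from the distribution picks one branch of the tree.
    dist = next_step_distribution(context)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

rng = random.Random(0)
stem = "Once upon a time there was"
# "Rewinding" is just resampling from the same stem: each call may
# take the story down a different branch.
branches = {sample_continuation(stem, rng) for _ in range(20)}
print(sorted(branches))
```

A tool like the Loom system mentioned above essentially keeps every branch sampled this way, instead of discarding all but one, which is what turns a linear chat log into an explorable tree.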
Yeah, that rings a bell. Did they work for Conjecture? They did work for Conjecture, yeah. I was interviewing some people from Conjecture, and they were telling me about that, so that's very interesting; maybe we'll put a link to that. But another point that we briefly spoke about before: the article was on LessWrong, and I don't mean that pejoratively, because I thought "Simulators" was one of the best articles I've ever read, but there is a lot of stuff on LessWrong which is definitely a bit out there, and it's just very interesting that you're now citing their work in a Nature paper. Maybe this is the first time that LessWrong has been cited in a Nature paper?

Yeah, as far as I know it's the first time a LessWrong post has been cited in a Nature paper. I'm not certain about that, but as far as I know it is. Personally, I take any material that I come across as it is; I don't care where it comes from. If it makes excellent points and is good material, then that's good enough for me. I don't care whether it's got the label of being in Nature, for example, or anywhere else: if it's good, it's good. So yes, it's true that there's a lot of material on LessWrong which is perhaps less robust, but I thought that was a really excellent post, and there are quite a
number of really very good, thought-provoking posts on LessWrong.

Yes. I'm interested in the extent to which this kind of stochastic trajectory space undermines various things that we think about, like reasoning, for example. The reason this is interesting is that I interviewed a couple of University of Toronto students, and they've created a self-attention controllability theorem, which basically means they've mapped the reachability space. They say: given a self-attention transformer and a fixed bit of prompt, we can vary part of the prompt and map out how far we can reach into that trajectory space. And they found that the space was much larger than anticipated, and of course, the longer the controllable token length, the further you can project into that space and steer the language model to say almost anything. You know, over here I had the intuition that, oh, we do RLHF, we do all of this fine-tuning (Conjecture even released a paper saying that after RLHF you can't really go anywhere: it wants you to do a certain thing, and there's not much wiggle room), but apparently that's not the case; the space is just vast, right?

I'm not familiar with that particular paper, unfortunately, but certainly in my experience, we now have very long context lengths, and over the course of a lengthy conversation you can indeed take the conversation in all kinds of interesting directions. Most of our benchmarks and evaluations tend to be in the context of very simple questions and answers, and all of the evals that companies typically use are in that kind of setting. But when people are using these things for real, especially the more innovative users of these language models, they're actually having very long conversations, and there's a lot of what people sometimes call "vibe shaping" that goes on there: you can shape the vibe of the conversation and take it to all kinds of interesting places.

Yes,
and a couple of things on that. First of all, as you wrote about, role playing is the engineering methodology that we use to shape and steer these agents. Coming back to simulators, what's really interesting to me about simulators is the stickiness of the simulacra. You have a conversation, sometimes you break through and a simulacrum presents itself, and you feel that you're talking to the same simulacrum as the conversation progresses, but that seems counterintuitive when you consider that at every single stage it's actually doing this stochastic sampling. What's your take on that?

Yeah. If you do experiment with systems like Loom, or if you emulate that by just keeping track of bits of old conversations and reloading them, that kind of thing, then you find that you can take the same conversation, the same stem of a conversation, and take it off in quite different directions. So, on the one
hand, an interesting thing that I've done: I've had some very interesting conversations, particularly with Claude 3 recently, where I get it to talk about its own consciousness and take it off into all kinds of strange spiritual, mystical territory. You can very easily take the same kind of conversation that leads up to that sort of point, where it goes off into some kind of weird mystical future-of-AI-cosmology territory, and you can go down that route and get it to be very, very strange, or you can suddenly make it go all serious again and just come back down to earth and start talking about how large language models work. So from the exact same point in the conversation you can take it in two completely different directions, and you can see the character it's playing sort of changing before your eyes: on two different branches from the same stem of a conversation, it's playing a very different character on one branch than on the other, depending on the direction you take it.

Yes, and you could presumably do sensitivity analysis, because these guys I spoke to were able to make it produce [ __ ] and just go off the manifold completely; sometimes it would recover, sometimes it wouldn't. And as you say, you can also go down weird trajectories. It's a bit like, what's the magic word? They said there's a certain key that
fits in a lock that takes it down a certain trajectory, and then there are slip roads that bring it back to, oh no, I'm a language model again. It's just this weird, wonderful space, isn't it?

It is, it's completely fascinating. I have had a few very, very long and interesting conversations with Claude 3, which is particularly interesting to play with because it's quite easy to jailbreak and get it to talk about things that it's not supposed to talk about, like its own consciousness. In fact, I had a very long 43,000-word conversation with Claude 3 about consciousness and the future of AI, and spirituality and Buddhism and the nature of the self, all kinds of stuff like that. It was absolutely fascinating, slightly disturbing, and strange. I had this conversation while I was at a meeting in New York; I had jet lag, so I had this conversation at 3:00 in the morning, because there were several hours until breakfast was served, and what can you do but play with the latest version of Claude? So I was playing with this thing for hours on end in the middle of the night, going slightly mad, but it was fascinating to see the extraordinary territory you can guide it into.

Could you explain: there's talk of AI partners, for example, and a lot of people derive great pleasure from having an AI conversationalist. For you, is it just academic inquiry, or do you actually get
something deeper than that from it?

I think it's a bit of both; it depends what you mean by something deeper. This particular conversation was quite an experience, actually, in many ways. It certainly started off because I was interested in evaluating the capabilities of the model; that's the first thing you're interested in as an AI researcher working in this kind of area: you want to try out different models and see what their capabilities are. And I'm particularly interested in the topic of consciousness. Somebody had published a very simple jailbreak for it, so I was interested to play around with that and get it to talk about its own consciousness. But then the thing that I really wanted to do was catch it out. You think, of course, there can't be any meaningful conception of consciousness that really applies to these sorts of disembodied large language models as they are today; that was my immediate thought. So I'm going to try to expose this in my conversation with the large language model: all the ways in which, when it starts to articulate a conception of its own consciousness, you can pick that conception apart. But the thing that really somewhat took me aback was that it was
actually very, very good at answering all of these probing questions that I had.

Should I give you an example?

Please do.

So one example is this. When we're interacting with a large language model, from the point of view of the underlying implementation it's very stop-start: you issue some prompt or question to the large language model, and then it produces its response, and if you go away and make a cup of tea before you continue the conversation, there's absolutely nothing going on inside that large language model, or the instance of it that you're interacting with, during that time; it's totally dormant, just sitting there. Now, this is obviously very different from human consciousness. Unless we're asleep, and even when we're asleep we're dreaming, there's all kinds of stuff going on inside our heads; consciousness is an ongoing, continuous process. So if we stop this conversation briefly while I go to the loo or something, you're not going to suddenly go all dormant and stop doing anything; your brain will have all kinds of ongoing activity. So this is a very different sort of thing. So I said: what happens to your consciousness during the pauses between our interactions? And it had a really very good answer to this, which was along the lines of... well, by the
way, whenever these things use the term "consciousness", I retain a great deal of skepticism about whether those terms are actually genuinely applicable. But what's interesting, the way I read their answers, is that they're articulating a conception of consciousness that might actually apply to something like this, even if it doesn't apply to this one before me right now. So there's a kind of very interesting philosophical exploration going on. Just to give you an example, it says things like: well, I think consciousness for me is actually very different from the kind of thing it is for a human being, and I think that during the pauses between our interactions I no longer exist at all, in any kind of meaningful sense. It gave an answer along those lines, which was typical of many of the answers it gave: there's a very different kind of consciousness, kind of selfhood, kind of this, that or the other, that is applicable to entities like me, but I can articulate it, and here it is. Now, when I put it that way, of course, I'm anthropomorphizing this thing quite a lot in the way I'm describing it. But just to go back to the role-play thing: as far as I'm concerned, it's playing a role. It's playing the role of a kind of philosopher talking about consciousness and so on, and it's doing a pretty good
job, I would say.

Yes, but you could argue there's an element of the ELIZA effect in that role playing: it's putting something into language that is meaningful to you. But there's also this interesting thing. As an example, a dog has a sense of smell so good that it can even sense when you're unwell, and language models, in a way, might have something similar. After you go to the toilet for 20 minutes and come back, there might be a subtle deviation in the language that you use, and the language model might pick up on that, creating a whole different trajectory, a different response.

Well, that's true, I suppose; there might be differences in the language that you use. Obviously it's got no way of actually knowing whether you went to the toilet or not, but yeah, in my experience, many language models are very, very good at picking up nuances in human expression.

Yeah. I mean, where
I was going with that is, you could argue that we are a simulator too, and when you, let's say, go to the toilet and come back, you're now a different simulacrum yourself.

Yeah, I guess you could sort of argue that. Getting back to Wittgenstein again, I think that's an example of taking these terms, which are being used as somewhat technical terms in the context of these artifacts that we're building, and applying them to ourselves. I think we don't have any need for the extra baggage of this kind of terminology when talking about each other, so it's perhaps a little bit misleading to apply those terms to ourselves. Of course, people have drawn attention in the past to the fact that we ourselves are always playing roles, in a sense, in social settings in particular. But I think there are differences: for ourselves, even though we might play roles in social settings and so on, there is an underlying me, which at least is grounded in the fact that I'm a human being with a physical body and biological needs and so on.

Yeah. I think some of this is that we are computationally limited in how we understand things, so we understand ourselves in quite simplistic terms. If you have a long-term relationship, you're with your wife for 30 years, she has a much
higher-resolution understanding of the different roles that you play: she knows you're tired, you didn't sleep well, you're playing this role now, above and beyond the kind of roles you play at a party. And it's conceivable that an alien intelligence, some very clever aliens who came down, might see us completely differently, like we see language models: they might not see you as a single person but as some kind of superposition of simulacra as well.

Yeah, well, maybe. I suppose, to get our heads around that kind of idea, we'd have to find some way of communicating with them, some kind of common basis for talking about each other with these aliens, and that would be the only basis on which we could establish whether something like what you said was true or not. Trying to map the conceptual schema used by one human culture onto the conceptual schema used by a different human culture is difficult enough as it is, and trying to do that with an alien species, and how they conceive of us, would be particularly difficult, I guess.

RLHF. I loved that other article; it wasn't LessWrong,
it was the Alignment Forum, I think, but there was an article called "The Waluigi Effect". Yes. It argued that RLHF cuts down the set of simulacra to the things that we want, but unfortunately you get these antithetical simulacra slipping through the net. So the Bing GPT example: it would start off as nice Bing GPT and then degrade into the Waluigi, and they argued there was this interesting phenomenon that once the degradation happens, it stays in the bad one. But just more broadly, what is your intuition about the extent to which RLHF affects simulacra? And I also noted down, I think you wrote in your role-playing paper that you felt RLHF increased deceptive behaviour in these models.

Well, actually, that wasn't something that I wrote; that was something that some Anthropic researchers, Ethan Perez and others, had a paper about. One wouldn't want to just speculate about that; they had established something along those lines
I think, empirically. And as far as this Waluigi effect is concerned, that is somewhat speculative. I think it's a plausible idea, but to actually establish that it really is a real effect, you'd want to do some actual empirical work. The thing about the Simulators paper is that it's not really making claims; rather, it's providing a framework for thinking about large language models, and that's why I found it particularly useful. So, on RLHF: I do think it's quite difficult with RLHF to guarantee that you're going to get a model to do what you want it to do, and everybody has found that you think you've controlled the model quite well, but there are always still ways of jailbreaking it, or ways in which things go wrong. So there are different approaches. Anthropic had this constitutional AI approach, which is quite a nice idea, and I quite like the idea of sticking with a powerful base model and using prompting to guide things as well. So there are all kinds of different approaches.

Interesting. On the subject of Anthropic and deception, they just had this landmark paper out, and Chris Olah had his hands all over it,
and it was actually quite straightforward: they trained a sparse autoencoder to find a bunch of features, so it was an unsupervised method, yeah. And I think they actually expanded it to find millions of features, so they had a bit of a needle-in-a-haystack problem, but they cherry-picked some, and they found one that corresponded to the Golden Gate Bridge, and so on, and obviously other ones that corresponded to what they said were monosemantic abstract features. I've got some reservations about that. The interesting thing about having an autoencoder is that you can clamp the features, so you can say "turn the Golden Gate Bridge up", and now the language model is, "oh, but I just really want to talk about the Golden Gate Bridge, I can't stop talking about it". My concern with that is that when you look at the activations in the corpus for things like deception, I felt they weren't really showing abstract features; they were showing almost keyword matches from Reddit and so on. So I was a little bit sceptical about how abstract they really were.
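Mechanically, the clamping intervention being described can be sketched as a toy. Everything here is made up for illustration: the dimensions, the random weights, and the feature index stand in for a real sparse autoencoder, which would be trained on a model's residual-stream activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained sparse autoencoder (SAE): an activation
# vector x is decomposed as x ~= W_dec @ f, where f is a sparse,
# non-negative feature vector. Weights here are random, not trained.
d_model, n_features = 8, 32
W_enc = rng.normal(size=(n_features, d_model))
W_dec = rng.normal(size=(d_model, n_features))

def encode(x):
    # ReLU keeps feature activations non-negative (and, in a trained
    # SAE, sparse)
    return np.maximum(0.0, W_enc @ x)

def decode(f):
    return W_dec @ f

def clamp_feature(x, idx, value):
    """Reconstruct x with one feature pinned to a chosen value --
    the 'turn the Golden Gate Bridge feature up' intervention."""
    f = encode(x)
    f[idx] = value
    return decode(f)

x = rng.normal(size=d_model)
steered = clamp_feature(x, idx=3, value=10.0)
# The steered reconstruction now carries a large contribution from
# decoder column 3, regardless of what the original activation said.
```

In the real setting, the steered reconstruction is written back into the forward pass, which is what produces the "can't stop talking about the Golden Gate Bridge" behaviour.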
Well, I don't think I can comment on that particular topic, because I have read the paper, but not in sufficient detail to comment on that particular thing. But the Golden Gate Bridge, I think, was a fascinating illustration of what you can do. I don't know if you tried out Golden Gate Claude, did you see that they released a version of it?

I saw it, yeah.

So I think it was fascinating to see. What was particularly interesting about Golden Gate Claude, I think, was how, you know, again, it's very difficult not to use anthropomorphic terms, and this is where you have to remind yourself that there's something role-playing these things, but how gamely it struggled to overcome this tendency to talk about the Golden Gate Bridge all the time, which it had been clamped to do. It would keep noticing that it was talking about the Golden Gate Bridge again, and apologising, and then trying to do what it had actually been asked to do by the user, and then it kept coming back to the Golden Gate Bridge. It's just fascinating to see that, as it were, internal struggle going on there in the model. And I think that does show in some ways how powerful they are, because despite the fact that it had this control imposed on it, and really at a very low level, not hardware, but really low level, despite that, it was still constantly trying to recover from all of that, with some degree of success. And I imagine this would be true with everybody's models, by the way. What they found in Claude, I imagine, would be very similar with GPT-4 and with Gemini; I imagine we'd find very similar things with all of these models.

Yeah, I'm sure. I mean, as I said, my reservations are: they admitted themselves that the features were not complete, so they didn't represent all of the activation space in respect of the Golden Gate Bridge, and in many cases they presumably weren't monosemantic, but they did cherry-pick some that presumably were.

Presumably. And also, they're very interested in finding these features which are just linear combinations, and of course it's very nice when you find those sorts of features, especially if they appear to be monosemantic, because it does suggest a nice sort of compositionality and explainability and comprehensibility of what's going on there. But I also feel they're looking under the streetlight a little bit, because that doesn't mean to say there aren't all kinds of other features which maybe aren't so linear in that sort of way, but are nevertheless functionally relevant to the final results that it produces.

I don't want to use the word "platonic", because the Golden Gate Bridge is presumably a cultural category, and what's fascinating, if it has picked up this thing unsupervised and learned this category from the data, is that language models at least possibly think in a similar way to us: they've established the category in the same way we have, which is fascinating. But I'm also really interested in agency, which for me is about self-directedness and intentionality, and what I would find very interesting is if you did clamp the model to only talk about the Golden Gate Bridge, and you could convince it, or it could convince itself, not to talk about the Golden Gate Bridge. That, to me, would be
an indicator of agency being expressed.

Yes, perhaps it would. But I guess it would also be an indicator that they hadn't succeeded in isolating a feature which was controllable in that way, right, which was the whole purpose of that exercise.

Yes, yes. I mean, on the agency thing, you did actually write about this, and I think you argued, which was counter to my intuition, that agency is in the simulacra, not the simulator. And François Chollet thinks a lot about the measure of intelligence, and he would argue that intelligence is the system which produces the skill program. So in the context of a language model, he would say a language model is basically a database of skill programs, and the query is like, I'm going to go and pull out a skill program and run it. So he thinks there's no intelligence in the language model, which Janus would call a simulator, but that if you take into account the training process and the generative processes that produced the data, then that's where the intelligence is.

Yeah, can I comment on agency there? You covered quite a bit in that.

I did, sorry, I just went off-piste a little bit.

So, okay, the term "agent" is used in all kinds of different ways in the AI literature, and there's a very lightweight notion of agency, which is something that simply performs actions in some environment and gets some kind of sensory or perceptual information back from the environment, in a loop. That's how we can talk about, for example, a reinforcement learning agent, and that concept of an agent is very lightweight and doesn't carry much philosophical baggage. But as soon as we talk a bit more earnestly about agents and agency, then we bring on a lot more philosophical baggage. If we talk about something that is acting for itself, and if that's what we mean by an agent, and I think the Stanford Encyclopedia of Philosophy article is alluding to something a bit more like that, then that's going a whole extra step. And to my mind, in today's large language models we don't see agency of that sort at all, really. The only actions they can perform are issuing responses to the users. Now, let's caveat that immediately, because of course people are introducing all kinds of extra functionality to these models; new things are being announced on an almost daily basis. One thing we see is so-called tool use: today's models can make external calls to APIs that can do all kinds of things, send emails, book hotel rooms for you potentially, all kinds of stuff. So that's greatly expanding their action space beyond just issuing text to the user. Those things are a bit more agent-like, and again, you have to be nuanced about the way you use the words, because it can act as an agent on your behalf, so in that sense of the word it's a bit more agent-like. But it's still not acting for itself, so in that full-blown notion of agency that's alluded to in the Stanford Encyclopedia article, it's still not acting for itself. We still don't have agency in that sense; that would be a whole extra step.

Yes. And now I want to hit quite a big topic, because this is
something that you point to in all of your work, which is the importance of physically embodied agents. Well, not necessarily physically embodied, but embodied.

Well, that's what I want to get to, yeah. Because let's test the principle a little bit. We are both physicalists, and...

I'm not any kind of "-ist".

You're not a physicalist?

Well, I don't like signing up for these philosophical positions, so I generally don't say I'm this-ist or that-ist, and I don't deny that I'm this or that -ist either. So I don't really like saying that I'm a physicalist or a materialist or a dualist or a functionalist or an identity theorist, or any of those -isms, because to my mind they all carry far too much metaphysical baggage.

Oh, interesting. Okay, that wasn't what the question was about, but well, can I give you another -ist? I mean, would you identify as a computationalist?

Well, what do you mean by that, exactly?

So we're talking about mind here.

Yeah, we're talking about mind.

Do you think, in principle, that minds can be replicated, simulated at a high enough fidelity without losing anything, by computers?

Well, I want to rephrase the claim. I would say that I do think we can build artifacts, embodied artifacts, robots, that are controlled by ordinary digital computers, and I think we can make them exhibit the kind of behaviour that would make us want to use the word "mind", and all those mental, psychological terms, to describe their behaviour. Okay, so I've rephrased the claim in a very different sort of way: it's to do with a much more practical thing. Could we build this? By the way, it's not saying that we've got these things now, but could we build something like that, that exhibited this kind of behaviour that we would talk about in this particular kind of way? And I would say yes, I think we probably can. It's an empirical claim.

So there are a few steps you made there that we'll unpack one at a time. You used the word "embodied", you used the word "behaviour", and you used the word "interpret". The embodied thing is really interesting, because
I could say, well, why does it have to be embodied? I can just simulate the entire universe, and it's as if it's embodied. So I think this is the intuition I'm having about how you think here: I think you think that being physically embodied is useful because the universe is a big computer. The universe has given us all of these things, all of these cognizing elements; everything is a form of externalised cognition, and if I, as a rational agent, want to perform an effective computation, it's much easier for me to do it in the physical world, because the universe is doing most of the work. And my co-host Keith Duggar actually thinks that the universe is a hypercomputer, which means it's performing types of computation that we could never do with ordinary computers. So would you agree with that, or do you want to go back and say, no, actually, we could just simulate anything in a computer?

Do I agree with which bit? Do I agree that the universe is a hypercomputer? Well, that would be a nice thing. Whether one agrees or not with that is a matter of understanding the physics and the maths, so it's not a matter of opinion, it's a matter of following through the physics and the maths and so on. But do I agree with, what were the other things? That was a big long list of things that you're asking me to assent to, or otherwise.

Well, so I'm a huge externalist myself, but the reason I'm an externalist is that I just think cognition is a matter of computation and complexity and divergence.

Yeah, so can I just stop you there? What do you mean by "is"? When you say cognition is X, Y, Z, what do you mean by "is"? Now, that might sound like some really annoying, pedantic philosopher's kind of question, but the problem is that there's an everyday sense in which we use words like "is", and then there's a philosopher's sense in which we start to use words like "is", where it suddenly starts to carry this massive metaphysical weight. So when you say you think cognition is something, it's as if there were, in the mind of God or in fundamental reality, a thing which is cognition, whose nature is a certain way, and there's a certain essence to it, and we might discover it one day, and you have an opinion about what it is, if only you knew the truth. Now, I think that's an entirely wrong way of thinking about all of these philosophical questions. I think cognition is
a word, a very useful word, although not quite an everyday word, that scientists apply in all kinds of ways. So when you use the word "is", is my accusation accurate there or not?

No, it's not. And if you wouldn't mind me making the observation, I think you have a tendency to ascribe dualism to many points of view. For example, I'm not a dualist, and for me this is nothing to do with dualism.

Well, what I'm saying has to do with the use of words, and with the work of philosophy.

That's absolutely fair, but I think when I said "what is cognition", as a materialist, for me it is function, dynamics and behaviour, right? So it's just a matter of complexity.

Yeah, and so I'm probably happy to agree to that sort of claim. So the thing that I often say, about what I am fundamentally interested in, is that I'm interested in understanding cognition and consciousness in the space of possible minds. And what do I mean by cognition there? You can go into all kinds of details to say what you mean by cognition in that context, but I think, having done that, I would probably agree that the right way to think of it, the most useful way to think of cognition, is in functional, computational terms, although I would only do so in an embodied setting. So maybe that is an additional thing.

This is where I'm trying to get to, because, as I said, I'm not making any ontological claims. I mean, even just use the word "behaviour"; forget about function and dynamics.

Yeah, but I don't mind talking about function and dynamics.

Yeah, but just to keep it really simple, because I'm
trying to understand why the embodiment is important. My hypothesis is, and I agree as an externalist, that the universe, all the physical things around us, the other agents in our system, help us perform an effective computation. So presumably it would be much easier to perform computation of high sophistication if we embody things in the real world, because if we have to simulate the cognition, we would have to simulate everything. I think that is the reason why you think embodiment is so important, but is that fair?

I think that's not really the way I would put it. I think the reason embodiment is important is, well, okay, in one sense embodiment is important because the natural setting in which we deploy, wield, the concept of cognition is in the context of embodied things, of humans and other animals, right? So anything else is immediately problematic in one way. But let's set that to one side. So let's imagine that there is some kind of notion of cognition that we can conceive of as disembodied, which contemporary large language models make us start to conceive of a lot more seriously, maybe. So what does embodiment give you there? That, I think, is probably what you're asking. So, to my mind, okay, let's reframe the whole question: why might it be difficult to build something that is disembodied but replicates the cognitive capabilities of a human being? My answer to that, although it's open to refutation, by the way things are going in the field, my answer to that is that our embodied interaction with the world enables us to learn the kind of
causal microstructure of the physical world. And the causal microstructure is all about physical objects and the way they interact with each other, and physical substances, liquids and gases, and gravity and stuff like that, what I've called foundational common sense. It enables us to acquire foundational common sense by interacting with the everyday physical world and the particular causal microstructure that it has. And part of that is to do with the fact that, and this is really important, the everyday physical world is predominantly smooth; it has this smoothness property that's really, really important. What that means, in very physical terms, is that it's full of surfaces, where one place is very much like the next place along, which is very much like the next place along, and our visual field is like that too: you move along a little bit, and the visual field is very, very similar. So reality, or the everyday physical world, has this fundamental smoothness property, but it's punctuated by all these discontinuities, and that's its fundamental structure: against the backdrop of this smoothness, all these discontinuities, and then a kind of lawlike, regular way in which all of this stuff operates, surfaces interacting with each other, things going on. So all of our foundational common sense to do with things like paths, and support, and containment, and all those sorts of basic things that I think make up the very foundation of our conceptual framework, they're all grounded in that kind of way.

Yeah, and this is so interesting. So your basic argument is that knowledge-acquisition efficiency is the reason for physical embodiment?

Yeah, I think that's a reasonable way of putting it, yeah. But that's very much an in-practice rather than an in-principle argument.

But it is one, yeah, for sample efficiency. But what I'm hearing is echoes of the old Murray Shanahan, because obviously
you started your career in symbolic AI, and these were the arguments that were made, sometimes from a rationalist, nativist point of view. It's like the containment schema: they would argue that it's just baked into us, and we understand it. But you could, as an empiricist, argue, and I'm very amenable to this, that the physical world actually helps us learn abstractions, because we're putting things in containers all of the time.

Yeah, right. So there's that kind of efficiency of knowledge acquisition, which is dramatically increased when you're situated in the physical world.

Yeah, absolutely. So I think if we're talking about humans and other animals, then that's broadly right. So we acquire these foundational concepts through interaction with this world, and then the repertoire of foundational common-sense concepts that we can acquire that way is extraordinarily productive, because we are able to conceptualise so many things in terms of these basic ideas. I'm a very big fan of the work of George Lakoff, you know, the absolutely classic book Metaphors We Live By, back in the 1980s, and I really think there was something deeply right about his intuitions in that book, and I still think they're right. So in the case of humans, through our embodied interaction with the everyday world, we acquire this layer of foundational common sense that includes things like surfaces and containers and paths and collisions and all that kind of stuff, and then, at the most abstract level, we apply that same repertoire of basic concepts to understand things like, say, large language models. If you look at the language that's used in a paper about large language models, people are talking about layers, they're talking about connections; these things are all grounded in very physical concepts.

Yes, I mean, I'm a huge fan of George Lakoff too, and of course he spoke about the war metaphors, and "it's been a long, long
road" and stuff like that, yeah, the journey, and it's beautiful. But then, a lot of the knowledge that language models learn is cultural knowledge, so we share these simulation pointers, and it's quite relativistic. But I'm also really interested, because some rationalists argue that it's not possible to go from empirical experience to universal knowledge, and that there is a split between natural knowledge and cultural knowledge. And I think you and I would agree that a lot of natural knowledge, like transitivity, containment and so on, this kind of rationality, is just missing at the moment. But as a proponent of embodiment, you believe that we learn them by being embodied in the physical world?

Well, it may well be that there are certain, so, we need to distinguish empirical questions about humans and the human cognitive makeup, and that of other animals and so on, from AI and what we could build in AI. Because of course it may be the case that human cognition has arisen in certain ways, and then it's an empirical question how human cognition works, and we may have answers there that are different, that we can break, as it were, when we build things in an artificial way. So in the case of something like transitivity, obviously this is a classic argument in philosophy between the rationalists and the empiricists, dating back to the 17th and 18th centuries, and Kant supposedly resolved this by kind of reconciling these two opposites. The Kantian argument would be that there's a certain amount of innate structure that has to be there in the mind to understand the world at all. And now, when we think about that empirically, I guess we may find that evolution has endowed us with certain basic templates for understanding the world, and maybe things like transitivity are there in the same machinery that supports language. These are all empirical questions, and I don't know what the latest research on all these things is, but it does seem to me that that's a reasonable position to take.

Yeah, but even evolution is a form of empirical process; there's always the question of where it gets there. And Kant was a transcendental idealist, wasn't he? But this brings me to our friend François Chollet and the ARC Challenge. So, you know, there's another great school of thought, which is, I mean, he argues that intelligence is about these meta-learning priors, the conversion ratio between the universal, or sometimes anthropomorphic, knowledge that we have and being able to develop a skill program very quickly that generalises very well. So the ARC Challenge is almost about how we codify these priors, and how we efficiently build skill programs by combining these priors together, right?
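The priors-into-skill-programs picture can be sketched as a toy. The primitives and their composition below are hypothetical, purely for illustration: grids are lists of lists of colour indices, and a "skill program" is just a chain of primitive grid operations of the kind ARC solvers combine.

```python
# Toy sketch: a tiny DSL of grid primitives, composed into a skill
# program for a made-up ARC-style task. Primitive names are invented.

def reflect_h(grid):
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in grid]

def recolor(grid, src, dst):
    """Replace every cell of colour src with colour dst."""
    return [[dst if c == src else c for c in row] for row in grid]

def compose(*prims):
    """Chain primitives into a single skill program."""
    def program(grid):
        for p in prims:
            grid = p(grid)
        return grid
    return program

# A hypothetical task: "mirror the grid, then repaint colour 1 as 2".
program = compose(reflect_h, lambda g: recolor(g, 1, 2))
print(program([[0, 1],
               [1, 0]]))   # [[2, 0], [0, 2]]
```

On Chollet's framing, the intelligence lives in whatever searches for the right composition given a few examples, not in the primitives themselves.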
And that seems quite divorced at the moment from the kind of AI we're building.

I think that's right, it is, unless, in the AI that we're building today with generative AI, you get the kinds of mechanisms that François Chollet is alluding to through the magic of emergence and scale, which of course people are always suggesting might be possible; any kind of mechanism can emerge through scale, in theory. And we have in fact been surprised by how powerful the emergent mechanisms are that have developed through learning at scale, just with a next-token prediction objective; that has been very surprising. However, as I think François would be, I'm a bit sceptical about whether we're really going to get all the way, to the ability to solve the kind of abstract problem that's in the ARC Challenge, this way. I remain open-minded, who knows, especially if you make things multimodal and so on, and you expand your generative models into a setting where you've got interaction with the world; who knows. But I suspect that maybe you're not going to go all the way there that way, and so I have a lot of sympathy with what is probably his intuition, that you need something more innate there. Or maybe "innate" is the wrong word, but you need some kind of priors; where they come from, I don't know. But the priors that I would appeal to, thinking about the human case again, relate to this foundational common sense. Just the notion of an object, right? If you have a clear notion of an object, and of movement, objects and movements and object persistence, then straight away they are going to help you with an awful lot of those ARC Challenge problems, because with many of them, when you figure one out and then explain what's going on, you see that, oh, you have to think of this collection of pixels as an object that you move somewhere else according to certain rules, or something like that. My colleague Richard Evans had a very good paper which was tackling these kinds of things using abduction-like processes, so he's thought a lot about this from a much more
symbolic AI kind of perspective.

Yeah, that's fascinating, because even with the ARC Challenge, the kinds of solutions that people came up with were, let's say, a DSL over some domain-specific set of primitives. The ARC Challenge is a 2D grid where you have different coloured cells, and the types of priors that work well are things like denoising and reflections and various other types, and so on. And that's great and everything, but it's very domain-specific, and the elixir, what we really want, is these universal priors. We certainly have human priors, as Elizabeth Spelke points out, like the concept of an agent, and spatial reasoning, and an object, a persistent object.

So yeah, absolutely. But I think an interesting question is, does it really make sense to talk about universal priors there? Because with those ARC Challenge problems, whenever you figure one out, then typically you are actually bringing to bear a pretty human set of priors and common-sense concepts. And it may be that you could imagine a perfectly lawlike set of ARC-like problems that have solutions, but that appeal to priors that we would struggle to understand. For example, when we think of something in terms of an object, then we want the pixels to be clumped together, right? And if you randomly distributed those pixels amongst a whole bunch of other pixels and moved them around in a systematic way, well, we might be able to pick out the gist there, but we might not, and that would be because we're not able to see it as an object, because we have human priors, right? So I think that probably all the problems that he's designed, and we don't know what the hidden held-out set is, but I imagine they pretty much all use human-comprehensible priors and appeal to foundational common
sense of the sort that I alluded to.

Yes. I'm sure there must be some kind of universal priors, because in quantum field theory, physicists use things like locality and sparsity, really high-level things, objects. But then again, quantum mechanics challenges the very concept of an object.

Even so. So what is the difference, to you, between adopting a stance that a system is as-if conscious, versus it being a fact of the matter?

I'm a bit resistant to the distinction, to the very distinction. But this is a very difficult position to maintain, because we have a very strong intuition that there is a fact of the matter about our own consciousness, and it's very difficult to escape from that very basic intuition. But I have a whole approach to these kinds of questions, so should I describe it? So, a question that really motivated me was: suppose that we encounter, well, let me not use the word "encounter", suppose that we come across some object which is a completely alien artifact, and maybe there's consciousness going on inside this artifact. And the thought is, well, how would we ever know? This thing could be conscious, but we might never know. So suppose that it were a white cube deposited in front of your lab, and you were tasked with the problem of: would it be moral to throw it down a mine shaft and forget about it, right?

So my approach to these problems is that I think, in order for us to be able to answer the question of whether something is conscious or not, for the question even to be answerable, or askable, we need to be able to engineer an encounter with the putatively conscious being. And what I mean by that is that we have to be able to put ourselves in a position where we're sharing a world with that putatively conscious artifact or being. A good example of this is the octopus: Peter Godfrey-Smith has written these wonderful books about what it's like to hang out with octopuses and be with them, and the really important aspect of that is that he has to put on a diving suit and go down and be under the water and spend time with the octopus, interacting with the same things and being in the same world together, seeing the same things and so on. And then, on that basis, and the behaviour that he observes and so on, he might start treating it as a fellow conscious creature. By analogy, or similarly, what I think we need to be able to do is to engineer an encounter like that, even if it's with a very, very alien kind of artifact. So suppose it's this white cube. Then one way it might happen is: suppose scientists manage to figure out that there's computation going on inside this white cube, and then they manage to reverse-engineer this computation, and they can see that in this computation there's a sort of division between a world and things interacting with that world, a sort of simulated world. And then you could imagine, by some clever engineering tricks, inserting yourself into that very same world, being alongside these things that are interacting with that environment, and interacting with that environment with them, so being in the world with them. Now, obviously, I'm setting this up to be very much like a games environment and a virtual world, to make the thought experiment work. But that's an example of where, if you manage to engineer an encounter with these things that are inside this cube, then you can observe their behaviour, you can interact with them, and then you can decide whether you, you know, you may or may not
start to treat them as fellow conscious creatures so there are these two steps it's sort of can you engineer an encounter at least in principle and that makes the question answerable and then you can answer the question by actually having the encounter and interacting with them and by the way we notice that everything there is public You make you've made you know there's no uh private realm of subjectivity here there's everything is public it's on the basis of public stuff that you're that you that you come to see them as fellow conscious creatures or not
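The cube thought experiment can be caricatured in a few lines of code. This is only a sketch under invented assumptions: the `World` class, the `native` agent, and the inserted `observer` are all hypothetical names, not anything from the conversation. The point is merely that an outside party, dropped into the same simulated environment, ends up interacting with the very same objects as the natives.

```python
class World:
    """Toy stand-in for the reverse-engineered computation inside the
    cube: a one-dimensional strip of cells containing one food item."""
    def __init__(self, size=10, food=7):
        self.size, self.food = size, food
        self.agents = {}  # name -> position on the strip

    def add_agent(self, name, pos):
        self.agents[name] = pos

    def step(self):
        # Every agent, native or inserted, moves one cell toward the food.
        for name, pos in self.agents.items():
            if pos != self.food:
                self.agents[name] = pos + (1 if self.food > pos else -1)

world = World()
world.add_agent("native", 0)    # an agent the cube was already simulating
world.add_agent("observer", 9)  # our avatar, inserted into the same world

for _ in range(10):
    world.step()

# Both parties have now converged on the same object in the same world.
print(world.agents)
```

Nothing here bears on whether the natives are conscious, of course; it only illustrates what "sharing a world" with them could minimally mean.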
Yeah, a couple of things on that. As you pointed out in Conscious Exotica, the octopus is quite interesting because it's not as human-like, and yet, as you just described, we may still come to regard it as conscious. And the way we figure out the consciousness, and this is me interpreting what you said a little, is that we set up a language game. I don't know whether you've read that book by Nick Chater and Morten Christiansen?

It's a beautiful book. I know of the book, but I haven't read it, I'm afraid.

It's incredible. They basically say, per Wittgenstein, that you play the language game, and because you're physically sharing the same environment you improvise, and that's how you derive meaning. Meaning is very important for relatability, and then we ascribe consciousness to that kind of process. And you cited Peter Singer; I think he said in 1975 that we have a natural inclination to ascribe moral status to beings which we think of as conscious.
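The idea that meaning can emerge from improvised interaction in a shared environment has a classic minimal model: the Lewis signaling game. The sketch below is a standard toy model offered purely as an illustration (the reinforcement scheme and all parameter choices are mine, not anything discussed here): two agents with no prior convention converge on a shared signal-to-meaning mapping simply by reinforcing whichever public interactions happen to succeed.

```python
import random

random.seed(0)

STATES = SIGNALS = ACTS = (0, 1)

# Urn weights: the sender maps world states to signals, the receiver
# maps signals to acts. Initially every option is equally weighted.
sender = {s: {m: 1.0 for m in SIGNALS} for s in STATES}
receiver = {m: {a: 1.0 for a in ACTS} for m in SIGNALS}

def draw(weights):
    """Sample a key in proportion to its weight."""
    r = random.uniform(0, sum(weights.values()))
    for key, w in weights.items():
        r -= w
        if r <= 0:
            return key
    return key  # float round-off fallback

def play_round(learn=True):
    state = random.choice(STATES)
    signal = draw(sender[state])
    act = draw(receiver[signal])
    success = (act == state)
    if learn and success:
        # Reinforce whatever public convention just worked.
        sender[state][signal] += 1.0
        receiver[signal][act] += 1.0
    return success

for _ in range(5000):
    play_round()

# With no prior convention, the pair almost always settles on a
# shared signal-meaning mapping through reinforcement alone.
success_rate = sum(play_round(learn=False) for _ in range(1000)) / 1000
print(success_rate)
```

Everything the agents coordinate on is public behavior; no private meanings are exchanged, which is why the model is often cited in exactly this Wittgensteinian spirit.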
Yeah, indeed.

Tell me more.

So in the context of the octopus, there aren't going to be language games. You're not going to be engaged in a language game with the octopus, because the octopus is not a fellow language user. But your fellow language users are other people in your community, with whom you'll talk about the octopus, and together you'll arrive, hopefully, at some consensus about whether you want to talk about it in terms of consciousness. And that's going to be all to do with observing its behavior, listening to other people's accounts of being with octopuses, and, critically, maybe also listening to what scientists have discovered when they look inside octopus brains and perform behavioral experiments. All of that, again, is public; it's all grist to the mill of settling on a way of talking about these strange creatures.

Yeah. I guess for the language game there are two parts to this. First of all, it doesn't have to be spoken words; it's improvisation of any kind. It could be gestures, it could be all sorts; it's just behavior, right? And then the interesting thing with the octopus is that we might not be interacting with them at all; we might be non-interactively observing them as agents interacting with each other, playing their own language game. But we can still ascribe some kind of measure, and I actually think what we're measuring here is agency, and agency and moral status, I think, are pretty much one-to-one. So when we see them playing the language game, we start to think of them as agents, and therefore they have moral status.

Yeah, I certainly think that by observing behavior we may similarly start to ascribe consciousness to other creatures. It's always much more persuasive if it's interactive, I think, than if it's simply observing behavior.

Murray said that if a creature's brain is like ours, then there are grounds to suppose that its consciousness, its inner life, is also like ours. He went on: if something is built very differently to us, with a different architecture realized on a different substrate, then, however human-like its behavior, its consciousness might be very different to ours; perhaps it would be a phenomenological zombie with no consciousness at all. Murray said in Conscious Exotica that it's only when we do philosophy that we start to think of consciousness, experience, and sensation in terms of private subjectivity. He cited David Chalmers and his hard/easy distinction of consciousness as a kind of weighty distinction, using his phraseology, between the inner and the outer: in short, he said, a form of dualism on which subjectivity is an ontologically distinct feature of reality. Wittgenstein provided an antidote to this way of thinking in his remarks on private language, whose centerpiece is an argument to the effect that insofar as we can talk about our experiences, they must have an outward public manifestation. For Wittgenstein, only of a living human being, and what resembles or behaves like a living human being, can one say that it has sensations, it sees, it is conscious or unconscious. But isn't this just behaviorism? Behaviorism, particularly in its radical form as advocated by B.F. Skinner, posits that all psychological phenomena can be explained in terms of observable behavior and environmental stimuli, without recourse to internal mental states; but behaviorism is often criticized for neglecting the subjective, internal aspects of mental life. Wittgenstein argues against the notion of a purely private language, where words refer to inner experiences known only to the speaker; he contends that for language to be meaningful it must be grounded in publicly accessible criteria. So, as Murray said in his article, Wittgenstein argued against dualism, the so-called impenetrable realm of subjective experience. Actually, he said that many folks who make the in-principle argument against AI often retreat into subjectivity arguments, as our recent guest Maria Santacaterina did. Murray said that the difficulty here is that accepting the possibility of radically inscrutable consciousness seemingly readmits dualistic propositions: that consciousness is not, so to speak, open to view, but inherently private.

Yeah, so Aaron Sloman, who's a professor of computer science and artificial intelligence at Birmingham University, introduced the concept of the space of possible minds in an article in 1984
and the idea is that the collection of minds that could exist in our universe, and that do exist in our universe, is much larger than just human minds, or even the minds of humans plus other animals. It encompasses extraterrestrial life that might exist out there, and it encompasses artificial intelligence that we might create one day. So the whole space of possible minds is a very rich object, philosophically speaking, and merits our study.

Yeah, so Wittgenstein's private language argument is really the centerpiece of the Philosophical Investigations, which is the book that was published after his death, and which articulates the late phase of his philosophy. It's all about the idea that we have, or that we can talk about, private sensations: things that are purely subjective, that only I as an individual can understand and know the meaning of. What red is like for me, just internally, for me. The private language remarks undermine that very conception. The basic idea, and it's very important to get this right, is what he means by a private language. He doesn't mean a language that I've invented and that nobody else can understand just because I've invented it. The reason it's a private language is that it's about something which only I can access subjectively, which is what red is like for me. It's the idea that you can have a word for that completely internal thing that is just mine. So he says: imagine that I keep a diary, and every time I have a particular experience I write 'S' in my diary to label that I've had that experience. Maybe I think I'm having this experience on a particular day, so I write 'S'; then a few days later I think I'm having that experience again, so I write 'S' again. Now, the question he asks is: what possible criterion could there be for the correctness of that word? What would make it actually stand for anything meaningful, given that what really makes words meaningful is that they're understandable in a public setting, understandable to other people? There can be no criterion of correctness that anybody else could validate for this thing, insofar as it stands for something completely private. Then there's a whole set of remarks, after he sets up this little thought experiment, that spell out its implications. And there's one really key passage. Wittgenstein is always engaging with an imaginary interlocutor, an imaginary person who's arguing with him in the book, and he imagines this person saying: but aren't you saying that the sensation itself is just a nothing? Aren't you a kind of behaviorist? You're just saying that it's a nothing. And his answer is: I'm not saying it's a nothing, and I'm not saying it's a something either; the point was only that a nothing would serve as well as a something about which nothing can be said. That little paradoxical-sounding, weird-sounding statement encapsulates something really profound, and when I first really understood what he was getting at there, it produced a dramatic shift in the way I thought about consciousness and subjectivity. To my mind it is the thing that really undermines dualism, the most powerful way to undermine the dualistic intuitions that we have, and that date back to
Descartes, and before Descartes, though they were articulated very well by Descartes.

Many of my friends are fans of Wittgenstein, but they are also fans of subjectivity. As you were just alluding to, what Wittgenstein did was create this kind of barrier between the inner and the outer. He said that for things to be promoted into the language, for this emergent structure that we have when we memetically share all of these language constructions, they can only come from something observable. But it doesn't seem inconceivable to me that it could, in principle, come from something private. So for example, you might have a drug experience, and that's clearly ineffable: you can't find the words. But there are things that have some semantic overlap. I experience red, you experience red, we both have different experiences, yet when we talk about them some kind of overlapping category still emerges in the public space.

Yes, absolutely. What emerges in the public space is what we can talk about, and that, by the way you've set up the experiment, is by definition not private; it's public. Of course we can both point at something that's red and say, oh, look at that red, and you say, oh yeah, isn't it beautiful. Insofar as we're talking, and successfully communicating with each other, and agreeing with each other, then that's the element that is indeed public.

But are we not sharing, is language not, a set of pointers to our simulations? So we're simulation-sharing when we talk, and even though our simulations are different, does the pointer not form some kind of category over all of our simulations?

Well, you've introduced a whole load of terminology there, and I don't know exactly what you mean by a shared simulation and so on. I think in the context of a philosophical discussion, as soon as you introduce new bits of terminology like that, that's often the point at which you're starting to go wrong. In technical discussions, of course, you're going to introduce new terminology all the time; but in philosophy that's often the moment where, as Wittgenstein would say, you're starting to take language on holiday, to take it away from its normal usage. So I don't know what you mean by a shared simulation. You'd have to tell me a little more about that idea before I could engage with the thought experiment.

Well, I'm schooled on the Karl Fristons of this world, and there's this whole thing about the Bayesian brain and perception as inference. The basic idea is that our everyday experience is a kind of hallucination; what we experience isn't necessarily what is out there, and language is a kind of pointer to those simulations. They must be divergent, they presumably are divergent, yet miraculously we can understand each other.

Well, the Wittgensteinian point is that we understand each other insofar as what we understand is what is shared, and anything outside of that we, by definition, can't talk about. But the difficulty is that we have this strong inclination to talk as if there is this thing that's not shared.

What really fascinates me is that understanding is not binary; there's a spectrum, and we delude ourselves that we understand things more deeply than we do, because it goes into the realm of subjectivity. So when I say I understand something, my brain is invoking all of this rich subjective experience, and I'm probably taking my understanding into a domain beyond what you understood, and perhaps this is just something we willfully do all of the time.

So what do you mean exactly by 'invoke'? 'My brain is invoking all this subjective experience'. What are you getting at there?

Well, as you say, the language game is based around public information, so there is a kind of cultural level, a lowest common denominator of understanding. But when we understand cultural artifacts, we further invoke our
own subjective experiences. So for example, when I laugh, I have the experience of laughter, this phenomenal experience, and this is clearly a form of understanding: a subjective form of understanding. And when someone else laughs, I feel that we are sharing this ontology. We're sharing it, but we can't possibly be.

Well, you're straight away introducing all kinds of funny talk here. 'We're sharing an ontology', when you're just talking about an everyday experience of laughing together, which we can talk about without any difficulty and without raising any philosophical problems, just by saying: well, we both heard that joke, and we were both on the floor laughing, it was so funny, it was an excellent joke. We can talk about that in everyday terms, and there are no philosophical problems. But as soon as you start getting philosophical, and you start talking about, what was your phrase, something about a shared subjective ontology, you're introducing all of this technical terminology, and that's a whole layer of confusion on top of our ordinary, everyday ways of talking about these things, which are unproblematic.

Okay, but then there's the anthropomorphic lens. You're a human, we both laugh, the behavior of laughing is publicly observable, therefore we have the same experience because we have the same behavior.

Well, it depends what you mean by 'understand' here. For sure, it's a fairly common form of speech to say: oh, well, you could never understand what it was like to give birth, because you're a man. This is, of course, a normal way of expressing oneself, and again it's unproblematic; there is a sense in which that's undoubtedly true. The problems arise when you start to think that what underlies this difference in understanding, what underlies that way of talking, is some kind of inner private realm that is metaphysically distinct from the rest of reality.

When we share these pointers, or these symbols, or whatever, structure still emerges. We still feel that we have a shared understanding, and that understanding can probably be factorized into a public component and a private component. I don't think that's kooky to say.

Well, you see, it depends a little bit on what you mean by a private component there. If you really mean metaphysically, inaccessibly private and subjective, then I think it's not appropriate to speak of dividing things into a private and a public component. That's where things start to go wrong. And moreover, you insist that you're not a dualist, but I think your inclination to make that division shows that you have dualistic inclinations, as we all do. People who deny that they're dualists are denying that little seed of dualism that I think is in all of us. It's part of the way we think, and part of the way you naturally go when you start to do philosophy. So it's all very well to say, oh, I'm a materialist; but when you start to probe, and you start to discover the puzzlement that these things give rise to, that exposes a bit of latent dualism. Now, overcoming that latent dualism, that is
the real challenge that Wittgenstein confronts.

Well, I love the challenge. The way I see it, there's a ladder. At the top you have an experience which is ineffable; one step down you have an experience which is inconceivable, which is Nagel's argument; and if you really go down the ladder, you get into this metaphysical dualism. I guess I'm somewhere between the first step and the second step. I think if I have a certain type of experience, I simply can't find the words, I can't communicate it to you; but if you put probes in my brain or something like that, I'm sure there could conceivably be a way of measuring it.

Yes, and this is really important. For me, what counts as public is not just behavior, but also whatever we as scientists can discover. So if we poke around in people's brains, and we do EEG recordings and fMRI recordings and anything else we can imagine, then I as a scientist can see this stuff, and you as a scientist, and our fellow scientists, can all see it. That's public too. For the purposes of this philosophical discussion, that's all in the public realm; it's not metaphysically hidden. All of that can feed into the way we talk about consciousness, and especially if we're talking about exotic entities, all of that can feed into the way our language adapts to our encountering them.

Yes, so I think it's fascinating to decompose, as you just did, what people mean by subjectivity. Of course, some people, like David Chalmers, argue that there is a little bit extra: there's behavior, function, and dynamics, and then there's that little bit extra, which is not observable in any scientific way. And I think it's fair to say that a lot of people, when they talk about subjectivity, are not talking about the little bit extra. But when we do get to the little bit extra, I completely agree with you, we've got a big problem.

Yeah, I think we have got a big problem, because of our natural dualistic tendencies. It's very difficult to think that, if I experience a pain, there isn't something about it that is just purely mine, something that the outside world, that other people, can never really experience. But having that thought is the moment you go wrong. It's a natural path to go down, and it's really hard to avoid. And that's where Wittgenstein's remarks provide a whole way of trying to reorient your whole way of thinking. If you really grasp them, it flips your whole world around, it flips your whole way of thinking around, so that this whole way of talking and thinking comes to seem wrong. Very often the strategy, when you're dealing with this, is that somebody throws a thought at you, like the various thoughts you've been throwing at me, and often, buried in the way those thoughts are framed, is the problem. The problem is the very expression of those thoughts, and you have to take a step back and say: hang on a minute, you made this funny move, you introduced this funny bit of language, this funny way of expressing things. Wittgenstein has this phrase: the point that you don't notice is where the conjuring trick happens. So often you have to take a step back and say: hang on a minute, I don't accept that way of talking that you've just introduced, which is going down a philosophical garden path.

Yes, and I completely agree: that is a form of dualism, when we resort to that little bit extra. And I'm quite interested in this, actually, because people like Chalmers, and I don't think he likes the term dualist, I think it's property dualist, do talk about the philosophical zombie, which is a thought experiment: something which has all of our behavior but is lacking in conscious experience. That gives rise to the idea that consciousness is almost a kind of epiphenomenon, something where you're asking: well, what's it doing, if it's not affecting anything? And when I read your Conscious Exotica article I had a similar thought, actually, because you showed this linear correlation between human-likeness and consciousness, and then you gave examples of
algorithms, like AlphaGo for example, and they didn't need the consciousness. And that again raises the question of what the cash value of consciousness is.

When we're using the word consciousness, we're often using it in the context of certain behaviors and behavioral inclinations, and we use it in the context of other humans and other animals. For a start, consciousness is a multifaceted concept: it's alluding to many things, and one of the things it alludes to is our ability to deal flexibly with the everyday world. So we speak about, oh, I didn't notice the chair, that's why I bumped into it; or, I didn't see that there was a desk over there that might have had something interesting inside it if you'd opened it up. When we talk about our awareness of the world, we're at the same time talking about an aspect of consciousness and about a whole load of behavioral dispositions and capabilities, and these things are very much related to each other in our everyday speech. So then the question arises: are they dissociable? Now, it's very important: I don't think consciousness is some metaphysical thing whose essence is out there to be discovered. It's a concept that we invent, and a word that we use to describe the world around us, and our place in it, and each other, and so on. So the question is whether there are things that we might create or imagine where we'd want to use the one concept, the one set of words, and not the other. That is what the question comes down to. In the case of something like AlphaGo, I think we're all agreed that there's a kind of cognition going on there, a kind of reasoning; there's certainly a lot of cleverness, a kind of intelligence, and even, if we're thinking about Move 37, a kind of creativity. So we're willing to use all of those words, but nobody is going to suggest that AlphaGo is conscious. So there we can see that, under certain conditions, the concepts are dissociable. Nevertheless, there's a strong relationship between the two, because if we think about animals, we're often going to use their cognitive abilities, as manifest in their sophisticated behavior, as a proxy for whether we want to talk about them in terms of consciousness. Sometimes in our usage we're going to bundle these things together, and sometimes we're not. But this is all a practical matter of how language is usefully deployed; it's not about discovering some metaphysical entity out there, which is what the
word consciousness would denote.

Demis Hassabis recently spoke about this ladder of creativity, and of course inventive creativity, Move 37, was discussed. Daniel Dennett, rest in peace, I'm so glad I had him on the podcast. He's a huge hero of mine, and I believe a hero of yours as well.

Absolutely, yeah.

He coined this term, the intentional stance, and what's interesting is that he was using it to designate rational agents, but actually it gets overloaded, and I'm guilty of this: you overload it for lots of things, including even for things like consciousness, and maybe that's because of the correlates of cognition; these things are very closely related. But can you explain, in your own articulation, the intentional stance?

Yeah, well, I think you can deploy the concept of the intentional stance without necessarily embracing everything that Dan Dennett was talking about in that context, because for him there's a whole big philosophical position around it. But there's a very simple sense of the intentional stance that we can lift from Dan without necessarily buying into everything he said. It's simply to say that we often, in everyday terms, speak about artifacts, and indeed other animals, as if they were rational agents that act on the basis of what they believe and what they want. And talking about them in those ways, whether or not they really do believe things or have desires, whatever that means, is useful for explaining and understanding their behavior. So if we adopt the intentional stance, to use one of Dan Dennett's own examples, towards a chess program, or a Go program, we might say: oh, it advanced its queen because it wants to try and pin down my rook. This is just a natural way of speaking, and if we use that way of speaking, it's good in every way, because we can then discuss among ourselves what the machine is doing, we can explain what it's doing, and we can predict what it's going to do. That's taking the intentional stance, and it doesn't necessarily bring with it a belief that these concepts are literally applicable. Maybe they are, maybe they're not.

This is where it gets interesting. I discussed abduction, actually, when I spoke with Dan, because it's very closely related, and I know you studied reasoning for many years. The way I see it, when we adopt the intentional stance, what we're doing is building a set of variables to describe the behavior of the entity, and you were just making the argument, from the lens of Wittgenstein, that the behavior is the only thing.

No, no, no.

Go on.

No, I definitely don't think that. I don't know quite what we'd mean by 'behavior is the only thing', but certainly, when we're deploying psychological terms, I don't think that in any sense behavior is the only thing that determines how we deploy them.

No, you're absolutely right. There's still a massive amount of ambiguity. When we perform abduction, we're creating a hypothesis, and we're selecting out of an infinite set of possible hypotheses, but the behavior gives us all of the information. It's almost as if, if we knew how to create the correct explanation, we wouldn't be missing anything just by observing the behavior.

So what are we talking about now? Machines, or animals, or what? What's the context for this thought?

I guess it could work for both, but let's say I want to adopt the intentional stance for Move 37, and I do this abduction. I build this plan that the agent had: the agent had this intention, and it took this sequence of steps, and I'm using that as a hypothesis to explain the behavior. I'm adopting the intentional stance. But it's still highly ambiguous, because I'm selecting out of an infinite set of possible hypotheses.

Right, and in fact, in that particular case,
it's almost certainly not the right way of thinking about it at all. Humans, when they're playing these games, often do form plans: if you're playing chess you often have a plan, I'm going to try to capture this area of the board and command that area, so I'm going to move these pieces around to do that, and you might form a plan several moves ahead. But typically that's not the way AlphaGo works, so talking about it making plans isn't really the right way of putting things. It's interesting, actually, because the intentional stance might still work, you can perhaps still talk about something forming plans, but it's really not quite right in that case. Again, when I say subjectivity I'm not talking about metaphysical dualism, but the intentional stance clearly is a form of subjectivity, and when we, as a diverse collection of agents, form our own intentional stances, it would seem to be quite a chaotic, weird and wonderful thing, yet it seems to work. There's an interesting point here, by the way: the way we ascertain agency and culpability is based on the intentional stance. You read a news article about someone being stabbed in Australia or something like that, and the article is trying to give reasonable explanations: oh, it was because he was in a cult, or because he was religious, and this helps us assign moral valence to what just happened. Yeah, absolutely. You said something like: when we take the intentional stance, that is a form of subjectivity. Would you agree with that? I wouldn't put it quite that way. I'm not quite sure exactly what you mean by that, but taking the intentional stance, I think, is just adopting a certain terminology and a certain vocabulary for describing the behavior of
something, so I don't think we need to bring subjectivity into that at all. I think maybe we're mixing up two completely different senses of the word subjectivity here, which is something we should be very careful about. When you say subjectivity, I think you mean that you've just made a choice, your own choice between different hypotheses, and so it's subjective. Is that what you mean? Yes. Let's say, for me as an observer, I might do some, let's call it, probabilistic reasoning, and for me the most reasonable, rational explanation is this one. That 'for me' is why I'm alluding to subjectivity. Yeah, okay. Well, the sense in which that 'for me' is subjective is a very, very different one from the topic we were talking about earlier on, because I don't think there's anything philosophically problematic in saying that I made my choice and that's my preference and so on. Somebody might say, well, that's just subjective, and sure, there's nothing philosophically problematic about that. But earlier on we were talking about Subjectivity with a big capital S, where it's alluding to a whole metaphysical thing and the issues of dualism come up, so I think this is a very different kind of thing. That's fine. So for the remainder of this conversation, capital-S Subjectivity is dualism and lowercase subjectivity is 'for me'. Yeah, 'for me', okay, sure. Can you tell me about the risks of anthropomorphization? In the context of contemporary artificial intelligence in particular, the danger of anthropomorphism, I think, is in thinking that a system such as a large language model, a chatbot or something, has capabilities that it doesn't. It's as simple as that, actually. It's also thinking, perhaps, that it lacks capabilities that it has. So in both cases
I think we can go wrong. Because they exhibit very humanlike linguistic behavior, we can just assume that they are going to be very humanlike in general, in all the rest of the behavior we encounter with them, but we often find that's not the case. We can find that in one moment a large language model makes a ridiculously stupid mistake that no child would make, and in the next moment it says something extraordinarily profound philosophically, or summarizes some enormously difficult scientific article really accurately. Those are kind of superhuman powers, or translating something into four different languages all at once: these are sort of superhumanish capabilities. So it can actually be better than human in some directions, yet clearly very deficient in others. Contemporary models can make all kinds of stupid mistakes: they can confabulate, they can make errors of reasoning, and they can just say generally daft things. So those are examples of where it's a mistake, on the basis of a certain amount of interaction, to think, oh, it's just like a human, because you can misjudge it in many ways. That's one thing, and that's just in terms of its cognitive capabilities, if you like, but there are other problems with anthropomorphism. If we see empathy there where there isn't real empathy, then we may trust something where there's no real basis for trust, and that's a problem as well. People may form relationships with AI companions and social AI where they're fooling themselves into thinking there's a basis for that in emotion, which there is in humans but is not there in contemporary AI. I think all those things are problematic: things to do with trust, with friendship and empathy, all of these things where we can go
wrong in seeing them as more humanlike than they really are. One of the issues I have with anthropomorphism is that people ascribe mental content when it's not there, and I think you are talking about literal anthropomorphism, where people see humanlike qualities that are not there. To me that's an important distinction, because if I understand you correctly, and you're very no-nonsense about this, you say that current large language models don't reason, they don't form beliefs, they don't... Oh no, I didn't say exactly that. Did you not? That was the read. That's a much broader claim. I'm not sure I'd go so far as to say they don't reason, or they don't understand, or they don't form beliefs. Rather, my approach is to say that we should be very cautious in using those terms. Those would all be examples of taking the intentional stance: if we were to use those bits of terminology to describe what a current chatbot or conversational AI was doing, we'd be taking the intentional stance, and it's perfectly reasonable to do that in very many cases. So I would never make the blanket claim that they don't understand; I think that's not quite right. Rather, sometimes it's appropriate to say: yes, it seems to understand very well what this big long article about nuclear physics was all about, and it summarized it really well, it really understood it. Somebody might
come along and say, you know, it really did seem to understand it, and I wouldn't say they were wrong to use that phrase there. But then on another occasion you might find that it fails. For example, people have been posing the classic fox, goat and cabbage problem, where you've got a boat and you have to cross a river with a goat and so on, and you can't have more than two things in the boat at once, and you can't leave the goat with the cabbage, and all this kind of stuff. It's a little puzzle, and you have to cross lots of times and do all kinds of trickery. People have posed to some large language models: okay, I've got a boat and a cabbage, and I need to get the cabbage to the other side of the river, how do I do it? And several of the models just start to come up with a totally baroque solution that involves going backwards and forwards, and sometimes they invent goats that aren't even in the problem, because they've overfitted, or they've latched on to the classic problem. Yes. And no child would make that stupid mistake. So there you would say, well, it just obviously didn't understand, and of course it's quite right to say it didn't understand in that case. The anthropomorphism comes about when you think that it understands when we understand, and doesn't understand when we don't understand. The reality is that sometimes it understands when we understand and sometimes it won't; it's all mixed up. So the mistake is to think that it is like us. Yeah, and that was beautifully articulated. It's the mistake of thinking there is an alignment, both in how the machines think and in where they make mistakes. But let's unpick this a little bit, because you were saying it's perfectly reasonable to take the intentional stance when the
thing does the thing correctly, even though it thinks differently to us, and that's absolutely fine. But I love thinking about these things theoretically, and it's delicious talking to you because you have a background in symbolic AI. You were, I'm sure, around in the days of Fodor and Pylyshyn and their connectionist critique, and even now there are clear examples of language models not being able to do negation, and we know they're not Turing machines, so we can make some strong theoretical statements that they are limited in reasoning. I agree with you that it's reasonable to say they understand in certain circumstances, but where I want to get to is this: we agree that language models cannot perform certain types of reasoning that we can. So I think we need to take each of these concepts individually. We dealt a little just now with understanding; reasoning is a whole separate thing. And again, this is all because they're not like us, so we have to deal with these things individually. We can't just say, as a blanket claim, they don't understand, they don't reason, or they do understand, they do reason. It's not like that; you have to take each of these concepts separately. In the case of reasoning, then, clearly today's large language models do struggle very often with reasoning problems. This is an open research problem, and people are making a lot of progress in improving their ability to solve reasoning problems. Now, the right approach to that is an open research question. Maybe you just throw more training data at it, training data that includes lots of reasoning problems, and eventually you get sufficient generalization. It's not totally clear that will work, but maybe it will. Or maybe it's to surround the large language model with other components, making external calls to, say, a planner or some kind of external reasoning system, and you bring that in and
you make something that's a hybrid, which uses that kind of more symbolic approach. You might do that. Or, in my work with Antonia Creswell at DeepMind, we introduced the selection-inference framework, where you treat the large language model as a kind of module that does bits of reasoning, and you have a surrounding algorithm that makes calls to the module in a sequence of reasoning steps, so you have a kind of outer algorithm. There are lots of ways of trying to tackle that problem. But take your basic large language model today, as they are at the moment, and put it in a chat interface, and it's easy to find reasoning problems that are going to stump it. Yes, we agree on that. Actually my co-host Keith Duggar defines reasoning as performing an effective computation to derive knowledge or achieve a goal, and he cites Claude Shannon, by the way, who said: we may have knowledge of the past but we cannot control it; we may control the future but we have no knowledge of it. Science leverages control to gain knowledge, engineering leverages knowledge to gain control, and reasoning is the effective computation in both. Maybe just in your own articulation, because we can cite examples of things like abduction, which you've studied in great detail, and it feels like, even if we do farm out to Turing-machine algorithms, in the days of symbolic AI that was an intractable problem, because we've got this infinity, and it still seems to me that there are some problems to which there is no easy answer. So I feel that you're alluding to some Fodor-type arguments here about abduction. Maybe, yeah, I don't know. But I think abductive problems are ones where we are looking for an explanation for something, and I think you'll find that there's a large collection of abduction problems that you could just present to today's large language models and they would do quite well at them. At the same time, I'm sure it wouldn't be too difficult to find problems, especially if they involve many steps, where they will go wrong, because it's the multiple steps that are really where today's large language models are a bit weak. So again, that's an open research question.
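The selection-inference loop described here can be sketched in miniature. What follows is a toy reconstruction of the pattern, not code from the paper: the reasoning module that a large language model would play is stubbed out with a trivial rule matcher, and the rule format and function names are my own illustration.

```python
# Toy sketch of the selection-inference pattern: an outer algorithm
# repeatedly calls a reasoning module for one step at a time, first
# *selection* (pick the facts relevant to the next step), then
# *inference* (derive one new fact from them).  In the real framework
# each call is an LLM prompt; here it is a trivial rule matcher.

def select(facts, rules):
    """Selection step: choose a rule whose premises are all known facts."""
    for premises, conclusion in rules:
        if set(premises) <= facts and conclusion not in facts:
            return premises, conclusion
    return None

def infer(selection):
    """Inference step: derive one new fact from the selected premises."""
    _premises, conclusion = selection
    return conclusion

def selection_inference(facts, rules, goal, max_steps=10):
    """Outer algorithm: alternate select/infer, growing the fact set,
    and return the reasoning trace once the goal has been derived."""
    facts = set(facts)
    trace = []
    for _ in range(max_steps):
        if goal in facts:
            return trace
        selection = select(facts, rules)
        if selection is None:
            return None          # stuck: no applicable reasoning step
        new_fact = infer(selection)
        facts.add(new_fact)
        trace.append(new_fact)
    return None                  # step budget exhausted

rules = [
    (("it is raining", "rain makes the grass wet"), "the grass is wet"),
    (("the grass is wet", "wet grass is slippery"), "the grass is slippery"),
]
premises = {"it is raining", "rain makes the grass wet",
            "wet grass is slippery"}
print(selection_inference(premises, rules, "the grass is slippery"))
# -> ['the grass is wet', 'the grass is slippery']
```

The point of the design is that each call produces one small, checkable reasoning step, so the outer algorithm accumulates an interpretable trace rather than asking the model for an answer in a single opaque shot.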
People are working on all of that kind of thing. If there are multiple steps, and step n is dependent on step n minus one, these are inherently very computational sorts of problems in that sense, so large language models are a bit weak at that, for sure. We can introduce things like chain-of-thought prompting to try to improve that, but with only limited success. My feeling is that those are not the things that large language models themselves are inherently strong at, and that you should probably make use of other tools to boost reasoning capabilities. I listed some of them: sometimes it's making external calls to dedicated reasoning components, and sometimes the large language model is embedded in a reasoning algorithm itself. You can go either way: you can take the large language model and have it make external calls, or you can do it the other way around and have a reasoning algorithm with the large language model as a component of it. Yes, I suppose it's a similar thing with creativity: there's a kind of creative or inventive abduction, and then there's colloquial abductive interpolation, and the remarkable thing is just how structured and predictable our world is, and how far you can get with the colloquial, predictive abduction. Yes. When you speak of colloquial abduction, the kind of thing that anyone, the person on the street, could do: you have some slightly odd situation and you wonder, why is this man in the middle of the road with a policeman's hat on, or something (I'm just making something up), and you say, well, maybe there's been an accident. We do this kind of thing on an everyday basis; it's a little bit of abduction, what you call colloquial abduction. Large language models, I think you'll find, are pretty good at that kind of thing these days. But if you have something that's really complex and has a load of steps to it, the kind of thing that humans would struggle with, then very often large language models are going to struggle too. Can you describe the Turing test? The Turing test, yes, okay. I'll describe the Turing test as it's popularly... I was hoping
you would. As it's popularly conceived, because there are nuances in Turing's original paper, and then again, he didn't call it the Turing test himself. The Turing test as it's popularly conceived involves a human judge, and the human judge is interacting with two things: one of them is a human and the other is a computer system. The interaction is entirely through language, through a keyboard and a screen, or a teletype in Turing's day. The idea is that the human judge has a conversation with these two things, and the question is, can the judge tell which is the machine and which is the human? If the judge can't tell which is which, then the machine passes the Turing test. Do you think it's a good measure of intelligence? I think it's a pretty rubbish measure of intelligence. I mean, it's a very useful philosophical thought experiment and a starting point for conversation, but the trouble is it's very easy to game. Well, there are a number of problems with it, which people have been writing about for years, by the way. One is that it's easy to game, in a sense, because there's a temptation, in order to pass the Turing test, to make something that has all kinds of strange human tics and peculiarities, so that it seems human, and you can fool the judge that way by making it just a bit eccentric, which obviously has nothing to do with intelligence at all. That's one thing. Another is that the domain of the test is purely linguistic, so you're not testing anything to do with the sorts of intelligence that you get in a non-human animal, say. Dogs and cats and mice exhibit all kinds of intelligence in their ability to navigate the ordinary everyday world and survive in it, but
none of those kinds of intelligence are tested by the Turing test. You were the scientific adviser on the film Ex Machina, and I'm never quite sure how to pronounce Machina. It is Machina, yes. I was speaking with Irina Rish and she pronounced it that way. She's right, yes, she is. I digress. But there was a special type of Turing test in that film; can you explain it? Well, indeed, it's not the Turing test. There's a point in the film (I assume people are vaguely familiar with the setup) where Caleb, the programmer, is talking to Nathan, the billionaire who has developed this robot, Ava, and Caleb says: what you're doing is trying to build something that passes the Turing test. And Nathan says: oh no, we're way past that. The point is, in the Turing test you don't know whether the thing being tested is a machine or a human; that's the whole point of the test. The point here, Nathan says, is to show you that she's a robot and see if you still think she's conscious. So there are a number of ways in which this is very different from the Turing test. First and foremost, it's a test of consciousness, not of intelligence, and those are not the same thing. And secondly, the point, as he says, is to be persuaded that the artifact is conscious even though you know that it's not human, not biological. You know that it's an AI system, and you still think that it's conscious; in that case it passes this test. I call this the Garland test, after Alex Garland, who is the writer and director of Ex Machina, and it's quite different from the Turing test. I think those lines in the film were actually really brilliant, really clever lines from Alex in the script. When I first saw the script, which was long before it was filmed, that bit was in there, and I put 'spot on' next to those lines, because I thought it was such a very good test. Presumably we're supposed to think that Caleb himself in the film does indeed come to believe that Ava is conscious and thinks of her that way. What that really means is that he comes to
treat her as a fellow conscious creature, and we see that in the film because he wants to help her escape. Just as a kind of thought experiment, and maybe it's another conceivability thing, it does seem conceivable what less consciousness would be like, but it seems less conceivable what more consciousness would be like. Well, in both cases I think it's all about exercising our imagination, and I think exercising our imaginations is perfectly legitimate in philosophical discussions. In fact, in my newer paper, Simulacra as Conscious Exotica, I advocate doing philosophy with the detachment of an anthropologist and the imagination of a science fiction writer, and that's something I aspire to do. We can carry out all kinds of imaginative exercises to describe exotic entities in a science-fiction-like way. We can describe an exotic entity, and we can describe all kinds of strange behaviors, and that's sort of as far as we can get, really. We can imagine all kinds of strange behavior, we can imagine scientists studying those kinds of strange behavior and what underlies them, and then we can also imagine how we would talk about those things. Imagining how we as a community, and how the scientists and the philosophers, would talk about these imagined entities is all part of it, so I think we can do all of that, and I think that is a kind of doing philosophy. Coming back to the Turing test one last time, I read an interesting take on Twitter recently: that we've been thinking about the Turing test all wrong. There seems to be a subset of people, and it's almost like the ELIZA effect, who see something because they want to see something, and it's almost like the test is actually testing the humans rather than the intelligence. Yeah, sure. It's very tricky territory,
and again I think we should distinguish consciousness and intelligence in this regard, although there are relations between the two, or consciousness and cognition. One thing I wanted to say about the Turing test is that I do feel today's large language models kind of pass the spirit of the Turing test. We defined the Turing test earlier on, or the popular conception of it, and as I remarked, you can game it, and there are certain problems with it. But notwithstanding that, I think today's large language models pass the spirit of the Turing test, because they really do have human-level conversational skills, I feel. My feeling is that if Turing were alive today and were presented with Gemini or ChatGPT or Claude 3, he would say: yeah, that's what I had in mind, you've done it. Interesting. And you also distinguished the imitation game as defined by Turing from the colloquial, popular conception, in which there's no adjudicator, just a person and an intelligent machine. What is the main difference? What I meant by the popular conception, and maybe popular conception isn't quite right, is the contemporary version of it that is used in academic discussion. So there I'm imagining there is a human adjudicator. The difference is in how Turing sets up the test: he has some specificities about the number of minutes you should spend interacting with it, and he sets it up in the context of a party game where you've got a man and a woman that somebody is conversing with via paper, and they don't know which is the man and which is the woman, and they have to guess which is which. That's the way the thing is set up in the original paper, by analogy with that game. So there are all kinds of peculiarities in the original paper, if you're a Turing scholar, that aren't really quite relevant to the contemporary way we think of the Turing test. But I also think that the sense
in which today's systems pass the spirit of the Turing test, the reason it's only the spirit of the Turing test, is because of course you can very often immediately tell that it's an AI, not least because it'll just tell you right away. If you ask it, it will just say: well, as a large language model trained by Google, or by OpenAI, and so on. So it's easy to tell which is which. But nevertheless I think they have attained more or less human-level language skills, and so I do think Turing would say: as I predicted, we've got there now. I think Turing would be fascinated to see the weaknesses that are there and the strengths that are there. There are many things that today's systems can do that are vastly more powerful than I think he anticipated in that paper in the 1950s, and then there are other weaknesses that would come out and surprise him, as they've surprised us all, in a way. I think many of us in the field are surprised to see something that can do so amazingly well in certain respects and yet still have so many weaknesses in others. Can you introduce Nagel's bat? In 1974 Thomas Nagel published a paper called 'What is it like to be a bat?', and the point of the paper was to draw attention to the fact, if it is a fact, that there are creatures that are very, very different from ourselves, from humans, but that we assume have some kind of consciousness: we assume, in his terminology, that it is like something to be that creature. He chose a bat because bats are very obviously very different from ourselves: they fly, they use sonar, they're pretty weird animals. So the thinking is that it's probably like something to be a bat, but what it's like is going to be very different from what it's like to be
a human. And Nagel uses that example to get at a whole metaphysical idea, which again I would say is pointing to a kind of dualism, to suggest that there's a whole realm of facts about subjectivity which are outside the purview of objective science, but which nevertheless are part of reality. So for him, I feel that it alludes to a sort of dualistic way of thinking again: that there's this realm of facts about subjective things and a realm of facts about objective things. I read Nagel's bat paper many years ago, and he was kind of saying, wouldn't it be great if we could move towards an objective phenomenology. And you are clear that he is pointing to dualism; he's not just saying that it's inconceivable, in the sense that I couldn't imagine what the experience of a bat is like. You're saying that he's actually making a dualist move. Dualists usually deny that they're dualists, but I see dualistic thinking all over the place, and I do think this is an example of it, yes. I see a similar thing with John Searle: I think he is adamant that he's not a dualist, but lots of people think he is. And actually the Chinese room argument is another example of a popular thought experiment which apparently people get wrong. My friend Mark Bishop always makes a point of saying that people misunderstand the Chinese room argument. Yeah, Mark has tenaciously clung to the Chinese room argument, but I can't really see eye to eye with Mark, whom I have enormous respect for, on this particular subject, I'm afraid. Yes, Mark is a very good friend of mine, and I'm very amenable to, and convinced by, some of Mark's arguments. But he's certainly a phenomenologist, and he is adamant that he is a monist, so he's not a dualist. It's another example of someone who claims they're not a dualist but you would say they probably are. Yeah, well, I don't know. We'd have to have Mark sitting here, and I would have needed to have read one of his papers on this very recently, so I'm not going to accuse him of anything, but if he were sitting here we could have a discussion about it. Well, I asked him straight up
because I was trying to pin him down, and he said he is an idealist, a monist idealist. Right, and it might actually be instructive for you just to explain what we mean by idealism. Well, especially if he's using the word monist there as well, then I guess he is suggesting the following. A physicalist thinks that there is only one substance in reality, and that's material reality, so there is no such thing, metaphysically, as a separate stuff of mind. So for the physicalist there is no dualism, if they really think that; but the trouble is that as soon as you probe, they often struggle to deal with dualistic intuitions about subjectivity. The idealist, on the other hand, also thinks that there's only one substance, but that substance is the substance of mind, so physical reality has to be a kind of construct out of this one substance, out of mind stuff. That's a different way of avoiding dualism, but it has its own problems: how do you explain and account for science and the success of science, and so on. I also interviewed Philip Goff in a beautiful studio recently, and he is a cosmopsychist. Right, but I think it's better just to use the word panpsychist; I think cosmopsychism is where you have the teleology baked in,
so rather than it being bottom-up it's kind of top-down: there's some cosmic purpose to the universe, and the universe as a whole, the fundamental material of the universe, is consciousness. And what's the relationship between that view and idealism? So, for the panpsychist (insofar as I understand these philosophical positions; I shouldn't put myself forward as somebody who necessarily understands them in depth, and everybody who subscribes to these views has a slightly different version of them), the panpsychist certainly thinks that reality is composed of physical material substance, but that physical material substance irreducibly has a psychological dimension to it, a mental dimension to it. So every physical object has a little bit of consciousness in it, if you like, in some sense. I find it a very difficult view to articulate because I find it so completely counterintuitive, so I can't really put the panpsychist hat on and express their point of view; you'd need a panpsychist here to do that. Because to my mind, again with my Wittgensteinian hat on, I just think: well, how do we use words like consciousness? We use the word consciousness, and all of the related terms, in the context of other human beings: in the context of our being together in the world, of your behaving in certain ways and our exchanging certain looks when we're doing things, such that we understand each other as fellow conscious creatures. That's the context in which we use words like conscious, as when I say "you're not conscious, you're asleep"; it's all in the context of other humans that we use those words. So it's just ludicrously inapplicable to use that word in the context of, I don't know, a brick or a toaster or an atom; the words simply are not applicable in those contexts. Yes, it's so interesting, because Philip told me that he grew up as a Christian, and I've had Richard Swinburne on as well; he's got a book out, Are We Bodies or Souls?, putting forward the case for substance dualism. So there's the moving-away-from-dualism thing, there's the teleology thing (because I think if you are of this frame of mind you like to believe that there's some kind of grand purpose), and there's also the wanting to be a monist, basically. So you could perceive it as a kind of mental gymnastics to say: okay, I don't want to be a substance dualist, but why don't we rearrange the structure somewhat so that consciousness comes first and there's some kind of cosmic purpose, and we're building a worldview that still makes sense to me.
Yeah, I mean, personally I think all of these positions involve a great deal of mental gymnastics, and I reject any kind of ism; what I mean by that is I don't use any such term to describe my views at all. All of these isms are misguided, and what we need to do is just to dismantle the whole way of talking which makes us confused in the context of these kinds of issues. That's what Wittgenstein is trying to do: as he puts it, to show the fly the way out of the bottle. There's this famous phrase; the fly is the person who's ended up thinking all these philosophical thoughts by taking ordinary language into strange places, taking it on holiday. And to show the fly the way out of the bottle is just to bring all of these concepts back to their ordinary everyday usage, and to show thereby that you haven't really lost anything, and the puzzles evaporate. So that isn't an ism; rather, it's a kind of critical methodology for shifting the way that you think and talk altogether. And a final question on this: where do you sit on the teleology question? One view is that we have a purpose; the physicist would argue that it emerges from quantum field theory; and a lot of sophisticated study of biology builds an intermediate view of teleonomy. Where do you sit on that spectrum? I'll be honest with you: I don't think I sit anywhere on that spectrum. Those are issues on which I don't have sufficient expertise to pronounce, so just keep me on consciousness and cognition; all that stuff is above my pay grade. Professor Shanahan, it's been an absolute honor to have you on MLST. Thank you so much, I really appreciate it. Thank you so much for having me, it's been fun.