So, well, here we are with David isot Lewis again, more or less at the end of the year, on Christmas Eve, and having an interview, because artificial intelligence doesn't stop, right, David? It does not. There's no break for AI. Absolutely not. The last two weeks have been absolutely crazy weeks. In the comments of my YouTube channel someone says, "Are you okay? Are you tired? You look tired. Maybe your skin color is not so okay." And I thought: totally exhausted. I hope this last week
of the year will be relaxed, with nice days with the family, because the last two weeks were absolutely crazy. Fortunately, Anthropic didn't have any serious announcements, so that's good. Yeah, hopefully. I'm not sure. I'm not sure either, but I think they are in the mood for news, especially with the last announcement from OpenAI that put it above the rest again. So, I don't know, what do you think about the last weeks in general? So I kind of agree with a private comment you
had sent me: it seemed like the 12 Days of OpenAI were becoming the 12 days of Google. And then, of course, I was very disappointed on day 11. Remember, on day 11 they opened talking about agents, and then they said, "But that will be next year," and then they said, "Let's look at our desktop app." And it's like, I don't care about that, okay? I want to hear about your agentic systems. And then on day 12 they redeemed themselves. I was happy with day 11 because it was
a kind of day off for me from making a video, so it's okay that it was such a pitiful day. I thought they would go with something new in ChatGPT, like tasks or something like that, but they just released that minimal news. But the 12th day was weak. But we did know; remember that folder that said "AGI: don't show this on the live stream"? Yeah, that was great. And obviously this doesn't seem to be fake, right? This seems to have been real. Well,
they weren't faking it. When you look at Kevin's response, the product manager at OpenAI, yeah, it seemed like it was real. It didn't seem like this was a joke. I don't know, it seemed like a running joke that turned real, because on day 13 they didn't show it. It was, I don't know, kind of a joke but not a joke. But do you think o3 is AGI or not AGI, or does it not even matter whether it is or not? So in May,
in May of 2024, when we spoke, when I was in Andorra with you, that's what we talked about, right? AGI, when it's going to happen. And now I would argue that maybe that's not really critical, and here's my point: nobody gets hired for their general intelligence, right? That's game show knowledge. Who gets hired for game show knowledge? You could win a game show, but you're not going to get hired for that. So is that really essential anyway? And then we could see, with the improvements in the models,
going from like an IQ of 85 to some tests that showed 120, 125. And then with the o1 release, we figured, okay, o1, when it gets out of preview, is going to be about 130, which seems like where it's clocking in. And now we look at o3, which might be pushing, and that's just pure speculation, maybe 135. Because of the exponential curve on IQs, it's not going to be easy to increase IQ levels very much, the effective IQ level. It's like when they say inference, you know, is really
kind of, in a way, like IQ. It's not, but from a practical perspective, for most people, it really functions like IQ does. So we saw all that, and then, well, okay, if the average IQ is 100, which it is, and the average white-collar worker might be 110, and it's already operating at 125, what does AGI mean anyway, right? Does it really mean anything? And then, you know, there's Ilya, who started Safe Superintelligence since we last spoke. And if you saw
his presentation at NeurIPS, it's kind of like, AGI? Yeah, that's the last millennium, isn't it? It's like yesterday. Oh, by the way, the last time I wore my "Feel the AGI" from Ilya... now we have "Feel the..." You're absolutely crazy, I love you. So, you know, that's Ilya, that's what he's doing, and that's now the march, right? Obviously there's a lot that has to be done, and you saw how much it's going to cost to run: insanely expensive to make
it work at the high level of o3, at that level. But we know they're going to work on getting the cost down; that's going to be a big, big area. And you have test-time compute, test-time training, and so on. That's going to help with resource allocation and figuring out when to do inference and such. And then the other big thing I would say going forward is, think about this: the chain-of-thought paper came out in January 2022, three years ago. Three years ago. We've had then
Tree of Thoughts, Graph of Thoughts; we've had other "thoughts" models, but I think Tree of Thoughts and Graph of Thoughts would be the next two logical ones to come after chain of thought. So there's still a lot more to do. And then we also see models like R1 from DeepSeek; we see QwQ from Alibaba. Now, the chip embargo from the United States, the real tough one, took effect just last month, in mid-November. So figure, when you do red teaming and you do guardrails
and such, it's not going to affect any of their models from China for maybe another nine months, maybe even a year. It won't affect their models, so they'll be able to make progress to a pretty substantial level as well. But then, of course, that's when we really start getting the cost down, right? Getting the cost down at o3, or whatever is going to follow, o4, and then also looking toward ASI: how, if the costs are this ridiculous... ASI. And about ASI: in fact, we have ASI; we
have had it since the late 1990s: Deep Blue is an ASI. It's better at chess than any human could possibly be. Since then we have ASI in some other games, and my feeling is that we will have a bunch of ASIs for specific tasks, so we will just have to connect them in order to make them work properly on concrete ideas. So maybe we don't need AGI as a functional general intelligence like us. Why do we need that? Right, exactly. You know, it's going to be all domain-focused. I think everybody kind of agrees
that 2025 is going to be about domain-specific applications. So what does general intelligence even mean? People are just moving the goalposts. If they don't like AGI, they move the goalposts, right? AGI now means that you can defeat space aliens in a war; they'll move the goalposts. And then other people will say, oh, anything's AGI, right? I was surprised, just following everything. I've read maybe 35 posts from Substack, maybe a dozen from beehiiv, watched about 15 videos, maybe 20 videos, on
o3 from YouTube, and just looking at what everybody's saying, I would say the consensus is that yes, we've effectively achieved AGI. I would say that's the consensus from all the various sources: it's effectively AGI. But then what does that really mean if it's that expensive with o3? Not o3-mini, but even o3-mini is not cheap. But o3, if it's that expensive, it's like, okay, we achieved it; now we can roll it out cost-effectively in how many years? How long will it take?
You know, that's... I think that will be fast. From my experience in other areas and models, once we have it, the optimization is quite fast, at least two years. Yeah, I agree, I think it will be fast, but still, we're not sure. And then, if we're going to have Tree of Thoughts and then move to Graph of Thoughts, because that can handle nonlinearity better: instead of the linearity of chain of thought, we can have nonlinearity with Graph of Thoughts. Very good for that, very good
for creative applications. And then even Tree of Thoughts, where you get to look at different things simultaneously; not exactly nonlinearity, but you're looking at things simultaneously, looking at various options on your decision tree. And then the whole idea of simulated reasoning that comes with this, from chain of thought, the private chain of thought behind o3. So it will be quite an interesting year next year, I think, and the agentic systems will roll out, level three. No, I think we will discover
a different kind of intelligence, because until now we were comparing artificial intelligence to what we can do, and when we reach this level, where an artificial intelligence can do some specific things that we cannot imagine, I think we will discover that artificial intelligence thinks, or processes information, in a totally different way than we do. So maybe we'll find out new abilities of intelligence beyond what we can do, and in some aspects this is artificial superintelligence, even if it's not so useful or not so powerful. So it's
interesting. Well, I totally agree. I mean, will AI make us more intelligent? Is it going back and forth, where we then make the models more intelligent, or whatever? I think, since we last spoke, and I mentioned this to you privately, it's the AI scientist idea, and that's what my company's doing. We're focused on that, product development, not cute little Sora stuff; I have no interest in that, frankly. But looking at really how we can use it
for R&D and product development. And that got validated this year by the Nobel Prizes, right? The Nobel Prize in Chemistry, the Nobel Prize in Physics. It got validated by Hopfield and Hinton winning the physics prize, Hopfield for his work when he was at Caltech developing neural nets, the first neural nets, then obviously Geoff Hinton, and then of course Demis Hassabis getting the Nobel Prize in Chemistry for the work on AlphaFold and such. And that's what we're kind of focused on, and that's what we want, right? We
want cures for cancer, we want to be able to have nuclear fusion, we want to be able to solve climate problems, have climate repair, look at things that we just really haven't been able to look at. And I'm very optimistic. You know, I'm an effective accelerationist, so I'm for it. I don't remember who said it, but someone said we have to accelerate because it's the only possible solution to our problems, because with what we have now the only path, the only destiny, is
disaster, so we have to accelerate to find the solution. Yes, and I'm even kind of optimistic that we might find, like between America and China, ways that maybe it will help us even understand each other and even accept each other. For example, you put Chinese culture into a model, you put American culture into a model, and then they both can see objectively how that culture is going to react to a situation. Rather than saying "my culture is better" or "we're better," they might say, "Well, yeah, given
the way you guys are, maybe this is appropriate for you," and vice versa, where we can build some level of tolerance. I'm hopeful for that. You know, we'll see, we'll see. But at this point I am a little bit skeptical, because I think people tend to be comfortable and tend to want reinforcement of their ideas. So artificial intelligence can do that, but it can also create echo chambers, in a way that they just reinforce the way people think. So I don't know. It's possible, it's
desirable to have this kind of situation, but I don't think it will be the scenario in the future. But hopefully it can. Hopefully, yeah, we hope it will happen. Then of course there's the race. I agree with Eric Schmidt: frankly, I believe it's important that heterogeneous liberal democracies, which is really going to be two of them, right, America and the UK are really the only two on the planet to any great extent, that they actually kind of
lead the charge in AI. Even though we do a lot of dumb things in the US, we are a heterogeneous society. You know, we need an America. There's got to be a place that anybody from anywhere on the planet can go and be successful, and their religion and everything else will get accepted. Yeah, the first generation, it will be tough, but the second generation, their kids graduate from Harvard, and the third generation, they're president of the United States. So, you know, that's just the
American way. We know we're an immigrant culture; we accept that. We don't necessarily like all the immigrants we're getting at any particular time, and not all of them are good, but nevertheless we have that mentality. I think it's important that the US does kind of take a lead on that, but we'll see how that works out. I think the chip embargo, what happened from mid-November, that's going to slow down China quite a bit. It's going to be very hard for them to get around that, but
still, over the next year they could develop o3 capability. I believe that's possible. No, last year, every year, it's the same: OpenAI shows something in January and by July or August the competition seems to be quite there in some way. So I think this year will be the same, and if it's not China, it will be Meta, or it will be Elon Musk. What do you think about Elon Musk, now that Trump has won and he has plenty of ways to improve, or... You know, you may have heard Sam Altman just
made a comment that Elon Musk is kind of a bully, and I do feel that the way he's treating OpenAI is inappropriate. But if I had to tell you something, quite frankly, I think Elon Musk is the most important person on the planet, because he brings the future to the rest of us. He has visions and he just says, "Do it, okay? Just do it." And he does it. So that's a crazy point. He does. I always say the same: he's like a James Bond villain, the baddie of a
James Bond movie, in some kind of way. Yeah, it is, it is. I think that's a good analogy in the tech entrepreneurship world. I know he can do that, but that doesn't mean we agree with everything he does. I don't agree with everything he does. He's quirky, but so was Einstein, so was Isaac Newton; they just didn't have social media to blast all their quirkiness to the world, right? Einstein didn't have that and Isaac Newton didn't have that. So that just kind of goes, you
know, par for the course for these kinds of individuals, and da Vinci as well, probably the greatest inventor of the last millennium. So, you know, Musk is Musk and all, but I don't like the way he's handling OpenAI. But I don't think he's going to win. I don't, but again, who knows on the legal aspects. Of course, one thing to make sure your audience understands; you probably have said this. I translated your transcript, from your o3 video, but I don't know if
you know how that came about, how good it actually was. But I was going to make a comment here. I'm sorry, I lost my train of thought. Let's go on. Okay, it was something about o3. But about Elon Musk, I think that just today they have raised another six billion for xAI, you know. Yes. I know what I was going to say, which everybody that's watching this should remember: OpenAI cannot say they've achieved AGI, because it's in the charter that if they do, their commercial
relationships will be broken, which would include their commercial relationship with Microsoft, which is absolutely essential to them. I would say they need Microsoft more than Microsoft needs OpenAI, but I think it's a pretty good symbiotic relationship between the two. Here's something your viewers have probably never really contemplated: Microsoft's market cap, the value of Microsoft, has increased more due to their relationship with OpenAI than OpenAI's market cap, than their value in total. That's absolutely crazy. Think about that. And
so it's been very beneficial to Microsoft. And when I was at Microsoft, working on the BD team, we had a strategy where we would actually buy companies and we would calculate how it would affect our share price, and in essence they were free acquisitions. For most of them we could estimate, even if it was fractional, that the market cap was going to go up a certain amount. That's when Microsoft had the highest market cap of any company in the world; it was Microsoft and Cisco, if you can believe that. So
those were the two, and we would calculate that. And, you know, they're kind of still playing the same game, right? It's like, yeah, invest all this money in OpenAI, but our market cap goes up a lot more than what we invested in OpenAI. It's a free investment. Yeah, I think we talked about this kind of situation before: with money, in certain situations you can invest for free, even huge amounts of money. For us mortals, these are just not situations we see,
but at these levels it's what you say: it's a free investment at the end of the path, or the year. Yeah, it essentially is. So, going back to what you asked about what's happened over the last couple of weeks, I thought the Google announcements were interesting. Astra is kind of cute. I mean, I think it's practical in the sense you can use your phone and look at something and it can tell you what it is and the like. Mariner... you know, so here's another
thing: we have the five levels of AI, right? OpenAI created that framework. We need to have a framework for agents, because even though level three is agents, we need another framework within that. A browser that goes and clicks on certain sites is maybe the lowest-level agent, okay? It could even be a single agent; it doesn't even have to be a multi-agent system. Do you really need a multi-agent system, and to pay the extra compute and all of that, to find out instantly whether you can take a trip, or can you
wait 10 minutes? And I don't know if you ever played with this last year, but... what was it called? Cognosys, and AgentGPT. I was using those when they first came out, and I would show them to people. It was a single-agent system, and I could show them how it does research; it would show its reasoning, what it's actually looking for and such. It didn't do any real reasoning; it just showed kind of
the steps it would take, and they made sense, and you could approve or disapprove, or just let it do it. And we saw that, and it worked, and it worked well, but that's very simple. If we have a multi-agent system doing research, and it's looking at papers and running experiments, that's a different type of agent-based system. And so we need to have different levels, I think, in the agent world. As a matter of fact, something I probably never
mentioned to you, because it wasn't really appropriate in May, but now that agents are all the rage for 2025: I was actually the chair of the web agent session at the first International Conference on Autonomous Agents, back in February 1997. I was introduced to the conference organizers by Yoav Shoham, who's one of the founders of AI21 in Israel, and he wrote the first paper on software agents, I believe in 1991, so in that time frame, '91 to '93, something like that. So I've been involved with agent technology... well, you could say
I'm the guy that chaired the first session on web agents, back in February of 1997. So I'm glad to see it come to where it's at, but we certainly have a long way to go. But there's a lot of theory, at least, that's been developed, and that will also be interesting, because it's been all these years now, right? All these years since 1997, and you saw people like Pattie Maes doing stuff, and, as I've said, Yoav, and Oren Etzioni and so on, all these people that were like
the leaders of agent technology in the mid-1990s, and for many years, not just then. And now everybody's like, "Let's make it happen." So we might see some very interesting progress with agents. I am concerned with security issues, because obviously hackers are going to look for the weakest link, and that's going to be, I think, a big issue with a multi-agent system. But I think, yeah, we might see a lot of interesting developments. We didn't see many this year; this year was
a lot of false starts. Yeah, like Rabbit. But do you think that with the emergence of agents we will forget the classical front-end/back-end applications, from the point of view that the agent doesn't need a front end to operate? It's funny to see Project Mariner doing some stuff within the computer, but when I see this, it's like, why do they need that? They don't need the interface. They are not humans working with tools; they can be the tools. It's like a headless browser, right? You know, if you're going to do web scraping,
why do you need the actual browser? And that's a very good point. I think that's an excellent point. I could see that humans may want to know what the agent's doing, that it's clicking here, clicking there, rather than it going into the black void, you know, into the world of dark matter, and then coming back with an answer. So I can see that, from a human-comfort point of view, they're going to want to see what it's doing. But it
doesn't really need to, and that's your point. That's the point: we are trying to operate with tools as if they were substitutes for us, but they can operate in parallel, they can operate at different levels, they can operate among themselves in ways we don't know. And we can have, in fact, specific superintelligences talking with general superintelligences among themselves. That's so crazy for me. And sometimes I think we are just trying to replicate what we know works, but it's not efficient. It may not be
efficient. I think a lot of automation that tries to do what humans do with computers isn't efficient, so this is just carrying on along those lines. So we'll see. I suspect that as people maybe trust it more, we could see more of these headless-browser kinds of activities. We don't need to actually see it operating; just let it go. But yeah, we'll see. I think a lot of that's going to be how users are going to react to it, how comfortable
they'll be with the systems. And then, you know, the problem of adoption. Adoption of the technology, I think, is the biggest problem. When someone says artificial intelligence will change the world in five years, I say, well, it depends on us more than on the technology, because the process is so slow. Many, many companies don't use ChatGPT two years after the launch. So, yeah. Now it may change, with everything that's going on; they may see more use. I do believe
that the employee headcount issue is going to be a big deal. I've said, and I still maintain, that some of the most rampant implementations will be in the highest-paying jobs. So you've got software engineers, lawyers, accountants, management consultants, investment bankers; they have to be concerned, especially people that are doing work like contract law. They're not in court, they're not solicitors, they're not actually pleading cases before the court. And even then it could be a different situation; we're already talking about AI arbitrators in international commercial arbitration
situations. So we'll see how that goes, but certainly in contract law, things of that nature, AI's going to take over. But one of the things, because I have to deal with a lot of people on a daily basis about this, their careers and what they're looking at and such: it's not so much that I'm concerned at first that people will get fired; it's that people won't get hired. When you were talking, I was remembering this exact sentence you said the first time we
met, that the problem is not for those who are working; the problem is for those who are not yet working. And I think it is a problem for people who are working, eventually, but not at first, because it's easier. You know, Wall Street doesn't penalize you so much if you don't hire new people. But with Wall Street it's mixed: your profitability could go up if you fire people, that's true, but you're also going to get some media coverage that's not going to be favorable, depending on how the firings are done, whether they're
justified and so on. But not hiring people, yeah, that's going to be a big issue. So I think the younger people, who have not really developed the expertise, who really don't add a lot of value to most corporations when they come out of school, let's be honest about that, they don't, okay? Well, they're a good target, as well as the very highly paid employees. I tell people, look, it's not going to be that they're going to fire all lawyers, but one lawyer could replace two others, so
you have one instead of three. And I think that can certainly happen with what's going on, especially with all the legal AI that's going on. So I could see that. Yeah, go ahead. In this scenario, what would you recommend to young people who are finishing college and about to start working? What would you recommend to prepare for this future? Yeah, you may have seen Conor Grennan's video, from NYU, and he said English majors are the future of AI. You know, it's like, okay, and I understand
what he's saying: can you talk to the model? That might be a little bit simplistic. I would say, if you look at where the funding is, and if you look at what even OpenAI is saying about where it's going to have the most impact with o1, forget o3, with o1, it's going to be in math and physics, secondarily in the life sciences and so on. Then we have to look at, well, is there economic value in what they're doing? Yeah, to a certain extent. Maybe life sciences more,
because they can cure diseases, which we all want. Other things, yeah, it's harder to say. I would definitely say don't major in math or physics; for a student, I would say that's just a total waste of time. Maybe there will be some mathematicians, but, as I mentioned, when Ilya Sutskever was asked last year, at the Center for Brains, Minds and Machines symposium at MIT, "In five years, will an AI model create new non-trivial math?" he said,
"What makes you think the current model can't do that?" And that was 4, okay? We're talking about GPT-4, and Sutskever was saying this at that time. So, you know, these are problematic. Data science: I can't imagine there are data scientists in 10 years. How many people actually have people creating their spreadsheets? We do our own spreadsheets now, right? We do our own typing. Yeah, and not just spreadsheets: you can make your own dashboards from some CSV file in just 30 minutes talking with o1, and it works better than many of
the dashboards, and you don't have to talk to someone to say, "This is not exactly what I want, change it," and then wait for two days. It's just, "Well, I don't like it, change it," and five minutes is a long, long time for a new dashboard. Yeah, I mean, look at what we have with canvas now, right? With 4o canvas. And from the metrics I just saw... by the way, I want to recommend, for your viewers, they may not be familiar with this benchmark, I'm going to highly recommend it; it's
LiveBench. And what makes LiveBench somewhat unique: of all the major benchmarks right now, none of them have o1, okay? They have o1-preview, but they don't have o1. LiveBench does. So LiveBench tends to be pretty quick at getting to these models, so I would highly recommend people take a look at that. They also have categories, like reasoning ability; I actually have it on my computer here. Different categories, math and reasoning, coding and so on, and you can see where o1 is really
way ahead of preview, mini and such, and everything else. I mean, now you look at Anthropic, you look at 3.5 Sonnet, and it's not terribly impressive, even in creative writing; 4o with canvas is doing better on the benchmarks. To be fair to the rest of the companies, they haven't released their reasoning models yet, so we have to wait a little bit. I was hoping that OpenAI would release GPT-4.5 or something like that in these 12 days, because GPT-4o, I think, is a little bit above Claude 3.5 Sonnet and
maybe some of the Google models, so I miss that. They are all in on their reasoning models. It's normal if we take into account what o3 has shown to the world, but I have the feeling that they should have released or shown something like GPT-4.5, because it seems that it's not their battle anymore. They are focusing on reasoning models, and the standard models, which are really useful, the kind of models that give you the fast response, the kind
of technology you need to build some kinds of applications, for the time being seem to be on the second page. I agree. Now, I think you've heard the same rumors that I've heard over the last week, that they're having some trouble with the so-called GPT-5 and it might be delayed, as to what they're going to do. But they'll have their GPT model and they'll have their reasoning line, if it's Orion, or whatever the "o" stands for, who knows, whether the "o" in o1 and o3 is Orion. But whatever
it is, they'll have two separate lines for different functions, one very cheap, one very fast. I am surprised that they actually went ahead and released 4o with canvas for free. As long as you register, it's free. For free, though you run out of credits really, really fast, but it's more or less free. But think about this: Anthropic took away the free access to Sonnet, so now you have to use Haiku if you're using the free tier. Well, Haiku is not very good, so you're not getting a lot. And most people aren't going
to go to Google AI Studio and play with the Gemini models. Or they might get Gemini Advanced; they may pay the $20 for that, and that has extra features, right? Like Deep Research and so on, that you don't get otherwise. Well, it's quite surprising. I haven't made a video about it yet on the channel, but the Deep Research feature outperforms Perplexity, for example, clearly. I would say it's maybe a feature that
many people don't focus on, because it's a paid feature and it's not so fancy, but it works really, really well. If your viewers are not familiar with it, I would say have them check out Stanford STORM, which is from Stanford. It's like Deep Research; I would argue it's better than Deep Research. Some of the feedback on Deep Research shows that if you want to interact with it, it evidently doesn't do well in that capacity. It kind of produces the research, but it's hard to interact with it along the
way. Stanford STORM is from Stanford University. It produces articles based on a conversation. It will generate a lot of references; it'll generate like 500 to a thousand references fairly easily. From this you can just have it chat with itself, or you can say, "Okay, time out, let's do this: let's pretend you're from McKinsey. What would McKinsey say?" Or, if it's an IT thing, "Let's say you're from Gartner. What would Gartner say?" You can do that, or JP Morgan if you're doing a financial deal. So you can actually
ask it to take on different roles and just throw yourself in the conversation I find it's uh better than than Deep Research and it's free it's free and I have to try it Stanford STORM Stanford STORM uh the one downside just just to let everybody know it runs on 4o mini that's not the best model but it is open source and if you're capable of running it you could run it on o1 if you wanted to do that I'm not sure I would recommend that until you because it's a lot of tokens you're going to
throw if you're generating a thousand references but but still you could you could you could run it and get a better model but even four I I find even with 4o mini it does a really good job and it's not uncommon for me to do a to run to run that to run STORM on a particular topic and then I'll put it into o1 and that's how I seed o1 because you know I still maintain that of the three ways of getting the best results of seeding context and prompting seeding will be the best if
you seed it properly it kind of gets what you're really after the format may not be what you want and I usually say look this I want to see the format half of the time the format's better than the one I would have picked so you know so let's see and if I don't like it then I say okay follow this format okay that's fine but as we know prompting is getting less important although it's still important but it's getting less important and and then context yeah also context is getting you know we could make
an argument that's becoming less important I still think it's pretty important to do context mainly to get the kind of output you want but and provide examples and that actually is something that o1 uh should get o1 should get some examples but the seeding it so so STORM's great you can seed it you have something that's you won't get all you won't see all the references because of the way they have pop-up menus for the reference pop-ups I'm sorry not menus pop-ups for the references but but you could have 500 references worth of content throw
it into o1 well actually in all fairness I I should say this after about yeah after maybe maybe about 500 about up to about 500 references it will fit in the 32k context window for o1 but if you're using API access you could do you know close to 2,000 references if you wanted to and then let the reasoning ability of o1 work on on that material oh that's that's amazing I have I have to try to give a try to to the Stanford STORM because I did didn't know about this tool and it's great
it seems really really really interesting open source free the whole bit well open source I assume that you put your own API no no you just run it it's totally free it's a web interface I have to try it and maybe make a making a video about it because it's it seems just amazing that's yeah it is it is quite it's quite recently changed the way it looked you know the the fonts were kind of goofy they kind of made it look a little bit better you know as they're getting more you know how
it is the engineers design it and there's no nothing about there's no design yeah there's no design there's no aesthetics just get out we're engineers what do we care what you say that about that prompting it's uh day to day less important to to use these tools I totally agree and I think that the most important thing is the person who's behind using the tool knowing where he wants to reach the the the final point the goal and making the the structure of the steps uh in order and it's you just just ask the correct steps
it doesn't matter the problem you you get the the information in packages you just filter the information that is good correct it's if it's not perfect but if you are goinging or filtering or uh wiing the the the project you you get the the final goal normally quite fast and quite good yeah you know where I do see prompting becoming necessary for some people let's say they're on the pro uh not the pro they're on Plus and they're using o1 they got their 50 uses per week they're coming up on their 50 then you got
to have a really good prompt because you can't take three to five queries to get to say you know you know what I mean because you're going to run out so then you have to kind of throw it in even though o1 doesn't like that you kind of have to do that so but if you're on Pro it doesn't make any difference right because it's I'm on Pro at least this month yeah at least this month yeah everybody is right everybody's on Pro this month yeah we'll see what happens next month it was funny because
I I I have a a a good uh I don't know how it's a sponsor in in the channel that I have free to make things and when Pro was on the on the show I said okay let's have o1 pro that it's thanks to my sponsor that I could I made the video in order to to try Sora and and and the promote in to the channel and it was quite quite funny yeah yeah it's and yeah Sora it's a total disappointment I know that you the video part is not so interesting but it's
not it hasn't been what they promised at certain point but the o1 pro it's surprisingly good it's I I didn't think that it would be so so good the inference is a little bit better than o1 certainly much better than o1 preview so so it's it's some good steps OpenAI seems to be balancing the the test time reality whether it's compute or training and uh and also with well with chain of thought again I I think a lot of people should really if they're not familiar with tree of thoughts and graph of thoughts
they should take a look at that there's there's stuff beyond there there's there's ways beyond that but at least we have those two next in line especially tree of thoughts will likely be next so what would a tree of thoughts how much better could that be well it could be substantially better than than chain of thought so so we'll see how you know we have a lot to look forward to yeah this is yeah we have a lot to look forward I mean now that chain of thought worked okay now it's like okay that's three
years old guys let's move on okay let's let's go go to the next level which are your predictions for 2025 this so I do think agentic systems will be uh probably the biggest deal of 2025 and it's what the consumers will get the most out of and the reason I say that even though I think for my own business and such that o3 would be the most important for research and things of that nature that's enterprise applications for the consumer uh it's going to be agents it's going to be agent-based systems so whether they're single
or multi we'll see how that pans out but I think that's going to be the the big one uh as you know my company also does RAG we do RAG and and RAG and agentic systems go like this you have an agentic system you have RAG so so we're we're looking forward to that um and then will we have the AI scientist level four the innovator will we have that I believe we will I believe we uh we've seen enough progress in these areas we see I see a lot of papers coming out uh similar
to the AI scientist paper from Sakana uh I see a lot of that every week so there's going to be I think we're going to see some interesting advancements there but again that's not consumer facing that's Enterprise facing so and then obviously level five organizational I think that's going to take a while but that won't be next year So I do think we will see on the Enterprise level we'll see some progress uh toward level four uh I think for consumers level four probably doesn't matter that much and I mean it matters as somebody who
wants to get cured of cancer it matters but for them personally it's not going to have you know they're not going to play with with level four very much uh I don't think so at least not initially and and that will be expensive because people will buy o3 right they'll do that because it's worth it you know Noam Brown you know Noam Brown from from OpenAI and and Noam said once in one of his tweets and I'm probably gonna misquote it but but the essence is correct he said would you wait a week and
pay all the inference cost and all the compute if you could get a cure for a type of cancer and the answer is of course we would okay of course we would we're going to wait a week for a cure for cancer of course we will so so those kinds of things um I do suspect we're going to be we'll be pushing people it's going to there's enough financial payback for corporations in certain areas that it's going to be worthwhile and material science and many other areas besides you know pharmaceuticals and such fusion I think we'll
see that even renewable energy I've been looking at things like language models that are based on on a battery for an electric vehicle you know so so we can see such such developments going back to domain specific where the model is domain specific you know it's from from the ground up it's domain specific so I I suspect we'll see those but consumers it's agentic I think it's agentic uh I think we'll have more tools like canvas uh I think canvas between you and me I probably still like 3.5 Sonnet's writing better than canvas but
I really like the canvas tools where you have the slider and you can say okay do it at second level college or graduate school or whatever make it longer make it shorter I wish they had better controls on longer or shorter you know if it's like 700 words and you say shorter it comes to 300 words you know it's like how about 500 Words you know let's have a little little you know don't have such a dramatic difference but also and then the you know the final Touches the last you know the second from the
top after the emojis where you have final touches it does the final edits and such and it's it's you know it's it's good and if we had that kind of capability in in uh for for Sonnet um well for Claude I should say it may not be Sonnet right it might be Opus or whatever it's going to be next year I I really love that precisely OpenAI started to uh make these changes in the interfaces that we interact with uh artificial intelligence because in my opinion they were they were guilty about the abuse of conversational
interfaces they were so limited and it seems that with a chatbot you could make everything and it's not true so I'm really really happy that uh OpenAI corrects in some kind of point this mistake because with the proper interfaces and ways to use the these models everything it's quite smooth quite faster and the results are better because at the end it's humans working with technology so we cannot just just talk everything we want and hoping that uh an intelligence understand us perfect yeah I was just trying to look up one of the benchmarks but I
don't see it uh don't see it here it has this is the one with LiveBench with has o1 in the lead as you would expect reasoning's way off the chart compared to everybody else but there was one of them I saw and I don't remember which one it was then because it is not on LiveBench but it was on it was on creative writing and 4o was higher than 3.5 Sonnet on it now I don't feel that way but that's me you know a lot of that's subjective right a lot of that I thought
I have the same feeling that Sonnet it's better in in writing in more flexible style and adapting to the context of what you want to explain but I don't know how the benchmark it's it's made but my on it it's better at that Fi and I don't know if you played too much with h with 2.0 Flash Experimental and two FL and 2.0 Flash Thinking Experimental um I've seen people you know I played around a little bit with it looking at creative writing and it's creative writing is pretty good Jimmy nice yeah yeah I wasn't
expecting much I wasn't expecting much I always say that the style of writing of of Gemini I like it what I didn't like it what it writes factually but the style it's it's nice H in fact in in the leaderboard in the Chatbot Arena Google models where they haven't put the style correction always go up than other models because they s more I don't know if human but they write in an style that it was nice than nicer than than others so they put better score uh if we take it into account the style when
you retire the style the Gemini models always go a little bit down right yeah and as you know I've been a big fan of Gemini I've been using it a lot primarily because of the context windows and we've talked about that a lot and and it seems now I was telling you ways to get around it you had to kind of talk to it and say did you really analyze everything I I I uploaded then it says no I didn't only the tener know the first ever said then to go with the gun hey yeah
yeah put a gun to it said do it you know then it Does it but now it seems like you know when you ask it did you do only the top first 10 it says no I did it all then I'll ask it a question to see did it do it all and it did it did so so Gemini seems to be instruction following a little bit better okay know if it says good yeah I mean what's the point of having a large context window if it's only going to read the first 10 15% anyway
right I mean what's the point you're just deceiving yourself thinking that you got all the context there and it's like yeah it's there but I'm not paying attention to it so so what I always ask for quotes write the quote yes and then I found in which page it is and okay it's in the 500 page it's not bad it seems like you have going quite deep yeah exactly so so I've been pretty impressed with what uh Google's doing now uh let's see how it rolls out we don't have pricing on a lot of this
but it seems like the pricing for Flash for 2.0 Flash is very cheap compared to uh and looking at the benchmarks it's like that might be the best bargain in town right now right 2.0 Flash Experimental is probably the the best price performance of any model so so good for Google uh that they you know Google's got some money to throw at this so so they can and I'm glad they're choosing uh to do so uh we'll see how it goes with some of the other models uh I I I really wonder about the thinking
in their models but we'll see we'll see they you know they haven't said a lot about that they yeah I have read some news but I'm over saturated these days and I have it to to review but I think they have launched something related to thinking models or at least announced something in this this field too yeah I mean it says thinking so okay uh and then you can see like uh even even what is it 12 what is it 12 what's their which model is it their their Latest uh experimental 1206 the one from
uh this earlier this month that one has the 2 million context window 2 million token context window so you get the full context window of course you don't get that with Flash uh 2.0 Flash Thinking Experimental you only get the 32k context window so but at least there's a large context window that performs well on the benchmarks you know with two million tokens and I hope you know there DeepMind keeps on saying yeah it's going to be limitless okay DeepMind let's you know I like Den for yeah I'm waiting for that man I'm gonna
put you can imagine I have 7,000 ebooks I'm gonna put them all in context window you know it's a we we'll see how how that works out but uh yeah I'm obviously I'm I'm kidding about that but but yeah let's see let's see that I think it it does matter I notice when I when I use and take advantage of the full context window I do get more insights than I would with other models and that's even without with with lesser reasoning capabilities and it's like it's like Geoff Hinton gave an example it's it's just
because it's it's analyzed all that data you know it's analyzing everything you're giving it it kind of knows what you're looking for again my idea of seeding the model right you know telling you what you want by by giving it lots of lots of stuff to go and analyze and it and it does a good job it does a better I mean I would say in some ways it does equal job to o1 it's different it's a different kind of a job it will it will notice things that the other models won't notice I
think one it's more horizontal about the more more surface to cover and the other it's more deep I I'm Focus on that go right go deeper it's I think that the future it's tools that mix these models and decide when they have to use one or another in order in order in order to to Give at at at any step the correct information to the proper model and I think this will be tools we will see well yeah can see tools like like that in fact perplexity in a way it's that use all the models
and in in in each phase or in each step it uses one or another yeah so what are what are your projections we can talk about those let's talk about some of your forecast well okay my predictions it's first of All Europe will be on the wagon of the Train the last wagon another year in a row I don't know why but it's my feeling that's the I should interrupt and say so which new regulations will the EU put on AI next year I don't know I don't know I think we have enough regulations I
I hope that they could go backwards or or at least understand what they that what are they doing it's not the proper way to to proper innovation and but I don't know I I don't really know I I have to invest in dpn's companies right now I don't know uh the other prediction for me it's it's not an it's is a prediction because video tools I don't know you you are not showing in in the video tools but it's amazing what Veo 2 uh the Google model yeah I have been seen some videos make for users
not just demos that are absolutely stunning so my prediction is that I some months ago I said that in 2026 we will see a movie made with AI technology primarily with AI technology in in theaters I don't know if it maybe could be earlier even it's absolutely nuts what we have in in video now in fact I I will translate it and I will send you my Christmas Christmas uh uh a little video for Christmas that that I made it's it's amazing the the video part the the other thing is that I think the all
the education or formation inside Enterprises will boost I think much many companies will understand that there is a real opportunity here until now I think that they say well I try it and it doesn't works as as I expect so I think this will be start using seriously in companies so this will boost the adoption and at technical level I think it's time for agents it's we will as you said we are waiting for them since 2022 Since so time so long time ago but I think for myself since 1997 yeah but when talk may
you say oh and in in last year were were agents and the other time we talk you said oh agents this time it seems to be on the on the wardr the it's on a secondary plane so I think it's time to agent come here and I will see what happens with robotics uh I have seen an unmo of one of his new robots and it's absolutely stunning what Can do and moving through so different surfaces and and materials and I think it's time to to be some advancing practical robotical stuff so because it's something
we have been seen for the last two two three years but it's on the in a box we we don't see anything for the customer and I think we will see something interesting in robotics it's slower always the the robotic thing because the hardware it's expensive and it's difficult to scale the the industry But I think we have to see something in robotics we we want to see that okay yeah I pretty much would agree with with all of that um I do follow robotics fairly closely uh autonomous systems just like with agents right agents
are basically the foundation of that so so I do agree that we'll see a lot of advancements in in those fields yeah and it's hard to say what the the big one would be but maybe it would be in reasoning uh still but again that Won't be for the consumer that will be for Enterprise but but it will get a lot of media coverage if some robot it's practic useful and even it's some really specific point for some highend company but if it really happens we will see this in the media and it will be
some front pages for yeah and I wouldn't be surprised if we see you know what what's interesting if you look at Nature the the journal Nature they have a lot of articles on their various publications the Nature they have lots of different publications of course uh you know review article type journals and plus research focused entirely research focused and and Nature itself and if you look at the articles Nature Machine Intelligence Nature and such you'll see articles using 3.5 for material science fine tuned and getting great results now that's 3.5 it's two years ago Tey
Tey yeah and then think of where we're at and what we might see so I I'm not sure I want to go out limb and say we're going to get a major breakthrough next year but we might get some major breakthrough next year where people are using the more advanced models um and we just might might see that you it depends how quickly they can fine-tune it how how they can adapt Their existing models to fine-tuning uh other models uh we'll see but um that would that will also change things just like this year with
the Nobel prizes next year it's a real cure for something or something of some significance we are on the time of the predictions and I have here one point that we haven't talked about Willow what do you think about quantum have any kind of impact or not we are not quite yet there so so so let me give my take on that because I have followed quantum uh for a long time as well uh the problem with quantum is this and and this is again something your viewers may not be really aware of if you
look at all the papers that are published in AI AI historically has been very open AI research has been very open quantum has been completely the opposite of that uh where there's actually very few papers relative to the number of papers in AI even even before ChatGPT you know even when when it was slow even during the AI winters it was it was uh there were more many more AI papers than there were quantum and why is that most of the quantum research is classified so you got to think of this how many who
employs the most physicists in the world the US Department of energy Labs Department of energy as in nuclear weapons as in the Human Genome Project okay thing that's the department of Energy but they don't publish that many papers so what's going on we just don't know because it's classified and same with the Chinese they don't publish that either so the stuff you see that's in the Press it's from IBM it's from Google you know you'll see it from from from uh from certain companies but in general you're not going to see so what's the doe
doing with all these thousands of physicists what are they working on yeah Fusion okay I understand they're working on Fusion some other things as well but a lot of them are working on Quantum and they're actually the right people complex systems Santa Fe Institute Department what Santa Fe Institute where's it located near one of the key doe Labs I mean near two of them right where the atomic bomb was built okay so we just don't know so so that's why my that's my comment on Willow is this we I I fundamentally believe we Just don't
know what's happening mod um related to this because many times I think that Quantum and artificial intelligence at some point they can make a perfect mix and boost each other but that's what IBM says yeah but uh for what I have read about Quantum I don't really know how this could be easily happen uh with with the the way we built uh computer uh Computing in in Quantum so how do you think it could happen if it could Happen in an easy way or it's not easy the word but simple way so so the the
concern I would have there is that the programming the coding that goes into quantum is completely different exactly that's point yeah so I would really wonder how well our architectures and such that we have now would even translate to taking advantage of quantum maybe maybe maybe but maybe not so I would really wonder and I would hardly think they're optimized for quantum so I think there's just a lot I think that's the next big thing but we'll still have a lot of progress in AI the way it is whether it's you know we go beyond
transformers or whatever that that doesn't matter uh Yann LeCun probably is not having a good day these days anyway but anyway but besides besides the trans you know I'm quite uh I I I wouldn't say the the word fan but I I I listen to to him carefully and I said well I agree that maybe it's not exactly how the how he predicts but yeah it's always the it's like a I don't know how it's it's in English aalan it's the kill he says the the things that the others doesn't say to you so you uh
can can have the other point of view but this this last announcement of o3 I think it's quite surprising for for all of the community I I Gary Marcus now is just you know he's getting like desperate to to make to make points I saw I read his last two newsletters that he published on Substack and it's like yeah you're kind of like you're groping you're just you know just like randomly trying to explain things that are happening and and it's just and it's not right so I I mean from from my interpretation of what
he's saying it's not you know he's attacking uh ARC-AGI and he's attacking what happened and to a certain extent and it's just not that way it's not not really what happened I would say François was actually very anti OpenAI for quite a long time you know you know in fact and so it's at least until now it's quite on the position of Yann LeCun more or less for what I read from him so it's not uh suspect and look at o3 now that's on high compute but look at you know an o3 that
it beats the average human it's damn expensive but it already beats the average human so I mean that in that Case you would have to say yes if if you believe in nothing else that AGI by definition now I can understand where people might say well it's not really AGI if it cost you a lot of money to do that that isn't what AGI is supposed to be about and we can get the cost down okay if we can get there we got it we can work on the cost yeah that that that's the point
it's AGI it's AGI despite the cost it's that's a problem maybe for use it but if it works it works then by the way have you have you played with Devin at all no I haven't played with with yeah Devin Devin now now I wonder on a pricing model because of course everybody's concerned about pricing right we all we all got the shock it's like Pro is how much each month you know it's like okay 20 to 200 oh then maybe you saw a couple of weeks ago the CFO at OpenAI said well
you know we're gonna have 2,000 a month it's like oh okay that's kind of Cute so you're kind of letting us know there's something coming out right you know we don't know what but something you're thinking of if increasing the pricing uh means making better products instead of making the lower pricer products Dumber or less useful okay so but I I'm afraid that in the process the the lower product will be less useful oh of course I mean that that's to be expect just on the compute that You get uh all the advantages you get
with with the prices uh that they're talking about but but yeah she said that so but Devin's coming in at 500 for a team it's unlimited unlimited use I don't know how they're going to do that for unlimited use but uh we'll see you know I don't see Devin replacing programmers I don't think they I don't think they do either but it's more the idea that you're going to take one programmer who's so efficient then in in that sense you're replacing programmers no background noise here that I think that's the the whole point it's people
using technology and if people using technology one people or one person it's as useful as a team of 10 you have nine people less in the in the in the crew need it yeah if you can get that kind of a productivity gain I don't know if Devin's going to actually produce that much but even if it's even if it's one and a half times for $500 a month $6,000 a year that's not so bad for a software engineer so yeah I wish them good luck because I think they could be one of the first
that actually because they have a coding application a direct coding application versus you know o1 is good at coding o3 is good at coding but it's not like directly a coding application maybe they'll have that you know in some ways it seems like o3 is really heavily focused on math physics and coding and such um but Devin is a direct application of that so that will be interesting to watch I think next year adoption of Devin uh it's a good team smart guys we'll see I don't think it's so expensive because when you think think
about licenses of spe specific software in architecture wall or in some specific fields it is expensive and it's just a tool that someone has to make the real work with with it and this is a tool that it's maybe expensive but it does the work it doesn't need or doesn't need so much humans to work with it that's a a crucial difference and I know that there are many tools in in architecture that cost about more or less 200 300 a month and it's just a regular tool yeah and I think a lot of us
are going to watch next year I think the o3 mini pricing will be closely watched you know o3 yeah you know you got to have a lot of changes to get that pricing to a reasonable level but or have certain applications like pharmaceutical drug development something where there's extremely high payback for uh for using that but but o but o3 mini will be interesting to watch how how cheaply can they get that and that might be that might be something we'll see by the end of the year that will be super interesting in 2026 we
will have this this must be a traditional Christmas to making new new prediction yeah yeah pred it's it's really really fun I think we have cover I think we have everything we I have I have in the in the list so well for me we can uh end this special Christmas podcast here I don't know you have anything else to to talk about no just uh look forward to 2025 uh we'll have AI will not sleep and we'll have a lot of interesting things we know Claude we know Anthropic's got to come up with something
soon or become irrelevant so I'm sure they're going to release something soon and I think we've all been pretty impressed with what Gemini is doing of course we're all hoping that Google will actually have a real functioning product and not just good demos so so we'll see they they arrived late but they arrived with with muscle to to surprise us so uh so we'll see so it should be a very interesting very interesting year I'm looking Forward to the agents everything else so let's let's continue this dialogue and we'll see uh we'll see what happens
and of course Merry Christmas to all Feliz Navidad merry Merry Christmas to all all your viewers okay byebye byebye okay I start here and record