So Vahid Kazemi, someone who has a PhD in machine learning and is a member of technical staff at OpenAI, recently stated on X (formerly Twitter) that, in his opinion, they've already achieved AGI. So this video is going to be about how an OpenAI employee has basically come out and publicly stated that, in his opinion, they've already achieved AGI. That's a pretty crazy statement, because that would be a landmark event, but I think it's worth dissecting for the wider community so we can truly understand what's going on here. So he says: in my
opinion, we have already achieved AGI, and it's even more clear with o1, referring to the newer model that was released this fall. He says: we have not yet achieved better than any human at any task, but what we have is better than most humans at most tasks. Some say LLMs only know how to follow a recipe. But firstly, no one can really explain what a trillion-parameter deep neural network can learn. And even if you believe that, the whole scientific method can be summarized as a recipe: observe, hypothesize, and verify. Good scientists can produce better hypotheses based on their intuition, but that intuition itself was built by many trials and errors; there's nothing that can't be learned with examples. Now, he's basically breaking down how you solve problems, pointing out that good scientists can produce better hypotheses based on their intuition, but of course they've already gone through many trials and errors. And he's basically saying: look, you've built an o1 model that works in pretty much the same way, and even though it might not be better than every human at every task, it's better than most humans at most
tasks. And I do agree with this, because when you look at the benchmarks, it's a pretty beefy model. Now, this statement that we have already achieved AGI is of course going to ruffle some feathers, because I know for a fact that some people are going to disagree and some people will agree. However, I want to take you back down memory lane to when Sam Altman actually said, in 2023, that AGI had been achieved internally at OpenAI. And I think this date was really important, because I do remember that around this time there were leaks about Q*, which is of course the model that is currently embedded into o1. Now, whether or not AGI was achieved, some people could say it's a matter of opinion, but we did also recently hear the CEO of OpenAI actually state that we would be getting AGI in 2025. Remember that interview at Y Combinator? Sam Altman clearly stated it is what he's most excited for: "What are you excited about in 2025? What's to come?" "AGI. Yeah, excited for that." And one of the craziest things that Sam Altman has recently come out and said, in his
interview with The New York Times, was that AGI is coming sooner than you think, but it will kind of matter a lot less: "My guess is we will hit AGI sooner than most people in the world think, and it will matter much less. And a lot of the safety concerns that we and others express actually don't come at the AGI moment. It's like, AGI can get built, the world goes on mostly the same way, the economy moves faster, things grow faster, but then there is a long continuation from sort of what we call AGI to what we call superintelligence." And essentially what he means here is that o1 is pretty incredible at reasoning. If you've seen the benchmarks for o1, you'll know that they're pretty crazy, because this is pretty much above the human expert: you can see on the right-hand side here that o1-preview, on PhD-level questions, is performing at around the level of an expert human, which is a pretty significant result. And the reason I do agree with the sentiment that, when ASI gets here, or when AGI even gets here
completely and in full, most people won't react to it, is because I don't think the average person has a real use for AGI. In the sense that, if this model is able to do really complex and advanced mathematics, that doesn't really apply that much to the average person, which is why it's going to matter a lot less on a societal scale when we do achieve it. Of course, there are going to be some knock-on effects as it gets embedded into other technology, but I don't think the average person really has a use for PhD-level
science questions, competition math, or even competition code. Now, another indicator that we actually might be very close to AGI is the fact that Microsoft is renegotiating its contract with OpenAI, and OpenAI actually wants to remove the clause about AGI from its Microsoft contract to encourage more investment. If you aren't familiar with this contract, it basically says that once OpenAI achieves AGI, Microsoft no longer gets access to that technology. And of course this is something they're trying to change, because they need additional funding in order to keep the company going. Now, in this article they talk about how OpenAI defines AGI: "a highly autonomous system that outperforms humans at most economically valuable work." That definition is important, because the definition is going to be decided by OpenAI. Currently, the report actually states that OpenAI's board was still discussing the options and no decision has been made; however, Sam Altman still remains bullish that the company will achieve AGI in the near future. Honestly, I just believe this is something they're being deliberately vague on, so that they can truly bend and mold the definition of AGI: if they need additional funding from Microsoft they can get it, and if they don't need that funding they can quickly say, look, we haven't achieved AGI, or we have achieved AGI, and be off to the races with that. Now, if you remember, three weeks ago I actually spoke about how new research suggests that AGI was achieved, and basically in
this video I spoke about how MIT researchers managed to get state-of-the-art public validation accuracy of 61.9% on ARC, matching the average human score. In that video I was basically saying: look, maybe the AI community unknowingly passed the AGI threshold. That was referencing a paper titled "The Surprising Effectiveness of Test-Time Training," which goes into the details of how the ARC benchmark is one of the hardest benchmarks, and yet they managed to get human-level performance on it. And that's really important, because that specific benchmark is designed to be resistant to memorization. Now,
some individuals from that benchmark have actually commented on the recent o1 paradigm, and the o1 paradigm is arguably one of the craziest, because according to him it is arguably the biggest innovation we've had since GPT-2. And that's a very profound statement, because it means that everything that comes after this is going to be quite impactful in terms of the technology. So here he actually says that the two precursors to AGI, test-time compute and the ability to incorporate information from test time back into the model, are used in o1 models, which could make them pass the threshold of general intelligence: I do believe that o1 is truly the biggest improvement in generalization power we've seen in a commercial model, going all the way back to GPT-2. This idea of being able to do test-time search, where they're allowing the model to not just generate one chain of thought but multiple, and search over them and do backtracking, allows these models to more fully navigate the solution space around the prompt the user gave. Now, I think there's a debate of, well, is it AGI, and
I would claim no. I think there's one thing that's really missing from it, that conceptually limits it from reaching AGI, at least by the definition I just shared, which is that it's still operating over the pre-training distribution it was fundamentally trained on. The way they did this was they went and generated lots of synthetic chains of thought over formal domains like math and code and programming and things like that. There was likely informal pre-training as well, things maybe like goal and sub-goal breakdowns that got scored by humans, and they used that signal as a reward signal for the pre-training. But you're still fundamentally limited by what data got put into the pre-training. And I think, in order to conceptually relax the constraints on your architecture sufficiently, you both need some form of test-time search, like what o1 is doing, like what a lot of the top enterprises are doing, and you need the ability to incorporate information from test time back into your model and carry it forward. You need the ability to make contact with reality and learn from it, instead of having to relearn it every single time. When we actually take a look at that, of course, it's pretty crazy, because someone from the ARC benchmark stating that we don't have AGI, but that we're on the right track, is still a very good sign. Now, even if we don't have AGI at this current moment in time, I still think the technology is going to be very impactful, because when we look at how this scales, we know that there is basically no stopping it. Here you can see Noam Brown, the
person who worked on reasoning at OpenAI and on o1, actually talk about how there's pretty much no stopping this level of scaling, and it seems that the more compute we add, the more accuracy we get on these benchmarks: "Now I'll get to o1. So this is a figure that we released in our research blog post, and on the x-axis here you have test-time compute on a log scale, and on the y-axis we have accuracy on the AIME. This is the qualifier for the US Mathematical Olympiad team; it's a very difficult math test, and all the answers are integers. And you can see that as you scale up the amount of test-time compute, the amount of inference compute in o1, you go from 20% to over 80% on this exam, and there's actually no sign of this stopping. I mean, obviously it's not going to go past 100%, but it does seem like if we were to push this further, you would get even more performance on this exam." So my question now to you is: do you think that we actually have achieved AGI, like
this person working at OpenAI has said, that they've already achieved AGI? I personally think it's roughly 50/50, though I'd say we're about 70% of the way there. I think once we manage to get a system that can actually interact in real time and learn from its mistakes, that's going to be a completely different ball game, and I think we're definitely going to see that with potentially o2 or even o3. And even now, considering this new paradigm being opened up, I think things are going to move a little bit faster, because these companies definitely realize what is at stake here.
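To make the test-time-compute idea from the discussion above concrete, here's a minimal sketch of best-of-n sampling, one simple form of the test-time search that o1-style models are described as doing. The names `generate` and `verify` are hypothetical stand-ins for a sampled LLM call and a reward/verifier model, not OpenAI's actual implementation; raising `n` is literally the "spend more test-time compute on the same question" axis on the chart Noam Brown describes.

```python
def best_of_n(generate, verify, prompt, n=8):
    """Minimal sketch of test-time search: sample n candidate chains of
    thought for one prompt, score each with a verifier, and keep the
    best-scoring one. More samples (more inference compute) means more
    chances for at least one candidate to be correct."""
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=verify)

# toy demo: candidate "chains" are guesses at 7 * 6, and the verifier
# simply checks whether the final answer is arithmetically consistent
guesses = [40, 41, 42, 43, 44, 45, 46, 47]
gen = lambda prompt, seed: f"7 * 6 = {guesses[seed]}"
check = lambda chain: float(chain.endswith(f"= {7 * 6}"))
print(best_of_n(gen, check, "What is 7 * 6?"))  # prints "7 * 6 = 42"
```

Real systems go further than this sketch, searching over and backtracking through partial chains of thought rather than only ranking finished ones, but the compute-versus-accuracy trade-off is the same: a bigger search budget buys a better answer.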