Hey guys, I'm Sarah Wang, general partner on the a16z Growth team. Welcome back to our AI Revolution series, where we talk to industry leaders about how they're harnessing the power of generative AI. Our guest this episode is Alexandr Wang, the founder and CEO of Scale AI, a company that has become synonymous with GenAI and the data needed to power advances in large language models and beyond. With Scale's work across enterprise, automotive, and the public sector, Alex is also building the critical infrastructure that will allow any organization to use their proprietary data to build bespoke GenAI applications.

For those of you who don't know Alex, he is one of the most impressive CEOs we've ever met, and that's saying something, given a16z first met Alex when he was 21 and already the CEO of one of the fastest-growing companies at its scale, which he founded right before dropping out of MIT in 2016. In this conversation with a16z general partner David George, Alex discusses the three pillars of AI: models, compute, and data, and how creating abundant data is core to the evolution of GenAI. Alex also shares his learnings from the growth of Scale, his approach to leadership, and what he thinks growth-stage founder CEOs tend to get wrong about hiring. Let's get started.

We're very excited today to have Alex Wang, the founder and CEO of Scale AI, with us. Thanks for being here.

Thanks for having me.

I always love talking to you, and I always learn a ton. But maybe to start, why don't you just tell us a little bit about what you're building at Scale AI, and then we'll dive in?

Yeah, so at Scale we're building the data foundry for AI. Taking
a step back, AI boils down to three pillars. All the progress we've seen has come from compute, data, and algorithms. Among those three pillars, compute has been powered by folks like Nvidia, the algorithmic advancements have come from the large labs like OpenAI and others, and data is fueled by Scale. So our goal is to produce the frontier data necessary to fuel frontier-level advancements, in partnership with all the large labs, as well as to enable every enterprise and government to make use of their own proprietary data to fuel their frontier AI development.

So on this topic of frontier data, practically, how do you actually get it?

I think this will be one of the great human projects of our time, if that makes sense. The only model we have in the world for the level of intelligence we seek to create is humans, is humanity. So the production of frontier data looks a lot like a marriage between human experts, humanity, and technical and algorithmic techniques around the model, to produce huge amounts of this kind of data. And by the way, all the data we've produced to date, the internet, has looked like that too. The internet in many ways is this collaboration between machines and humans to produce large amounts of content and data. It'll look like the internet on steroids: what happens if the
internet, instead of just being a human entertainment device with this byproduct of data generation, were just this large-scale data generation experiment?

So you have a very unique perspective into the state of the industry. How would you characterize the state of the language models right now? I'd love to get into things like market structure, but just, what's the state of the industry right now?

Yeah, I think we're closing in on the end of maybe phase two of language model development. Phase one was the early years of almost pure research. Its hallmarks are the original Transformer paper and the original small-scale experiments on GPTs, all the way leading up probably until GPT-3. That was phase one: all research, very focused on small-scale tinkering and algorithmic advancements. And then phase two, which is maybe GPT-3 until now, is really the
initial scaling phase. We had GPT-3, which worked pretty well, and then OpenAI, to start with, really scaled these models up to GPT-4 and beyond. And then many companies, Google, Anthropic, Meta, xAI, have joined this race to scale these models up to incredible capabilities. So for the past three years or so, it's almost been more about execution than anything. It's a lot of just engineering: how do you actually make large-scale training work well, how do you make sure there aren't weird bugs in your code, how do you set up the larger clusters? A lot of executional work to get to where we are now, where we have a number of very advanced models. And I think we're entering a phase where the research is going to start mattering a lot more. There will be a lot more divergence between the labs in terms of what research directions they choose to explore, and which ones ultimately have breakthroughs at various times. So it's an exciting alternation between phases of raw execution and more innovation-powered cycles.

They've gotten to a point where, I wouldn't say there's abundant compute, but they've had as much compute as they've needed to get to the models where they're at, so that's not necessarily a constraint. And they've kind
of exhausted as much data as they possibly can, all the frontier labs. And so the next thing will be breakthroughs on that, and then advancing the ball on the data side. Is that fair?

Yeah, I think basically, yes. If you look at the pillars: on compute, we're obviously continuing to scale up the training clusters, so that direction is pretty clear. On algorithms, I think there has to be a lot of innovation there, frankly; that's where a lot of the labs are really working hard, on the pure research of it. And then data, you kind of alluded to it: we've run out of all the easily accessible and easily available data out there.

Common Crawl is all done; everybody has the same access to it.

Yeah, exactly. So a lot of people are talking about the data wall: we're hitting this wall where we've leveraged all the publicly available data. And so one of the hallmarks of this next phase is actually going to be data production: what is the method each of these labs is going to use to actually generate the data necessary to get to the next levels of intelligence, and how do we get toward data abundance? I think this is going to require a number of fields of advanced work and study. The first is really pushing on the
complexity of the data, moving toward frontier data. For a lot of the capabilities we want to build into the models, the biggest blocker is actually a lack of data. For example, agents have been the buzzword for the past two years, and basically no agent really works well. It turns out there's just no agent data on the internet; there's no pool of really valuable agent data just sitting around anywhere. So we have to figure out how to produce it at really high quality.

Give an example of what you'd have to produce.

We actually have some work coming out on this soon which demonstrates that right now, if you look at all the frontier models, they suck at composing tools. If they have to use one tool and then another tool, let's say they have to look something up, then write a little Python script, then chart something, if they use multiple tools in a row, they're just really, really bad at it. And that's something that's actually very natural for humans to do.

Yeah, but it's not captured anywhere, right? Is that the point?

That's the point, right. You can't actually go take a capture of somebody going from one window to another, into a different application, and feed that to the model so it learns.

Right, exactly.

Yeah, so these reasoning chains: when humans are solving complex problems, we naturally use a bunch of tools, we think about things, we reason through what needs to happen next, we hit errors and failures, and then we go back and reconsider. These reasoning chains, these agentic chains, that data just doesn't exist today. So that's an example of something that needs to be produced. But taking a big step back, what needs to happen on data: the first thing is increasing data complexity, moving toward frontier data. The second is data abundance, increasing data production.
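The tool-composition gap Alex describes (look something up, run a small Python script, then chart the result) hints at what the missing training data might look like: recorded trajectories of reasoning steps and chained tool calls, including errors and retries. Here's a minimal sketch of one such record; the schema, field names, and task are illustrative assumptions, not Scale's or any lab's actual format:

```python
import json

# Hypothetical schema for a single agentic training example: an ordered
# chain of thoughts and tool calls, including a failure and the recovery.
trace = {
    "task": "Look up Q3 revenue by segment and chart it",
    "steps": [
        {"type": "thought", "content": "First I need the raw figures."},
        {"type": "tool_call", "tool": "web_search",
         "input": "ACME Corp Q3 revenue by segment",
         "output": "Cloud: $1.2B, Devices: $0.8B"},
        {"type": "tool_call", "tool": "python",
         "input": "plt.bar(['Cloud', 'Devices'], [1.2, 0.8])",
         "output": None, "error": "NameError: name 'plt' is not defined"},
        {"type": "thought", "content": "Hit an error; import matplotlib first."},
        {"type": "tool_call", "tool": "python",
         "input": "import matplotlib.pyplot as plt; "
                  "plt.bar(['Cloud', 'Devices'], [1.2, 0.8])",
         "output": "figure rendered"},
        {"type": "answer", "content": "Chart of Q3 revenue by segment attached."},
    ],
}

# Serialize for a training corpus and check which tools were composed.
record = json.dumps(trace)
tools_used = sorted({s["tool"] for s in trace["steps"] if s["type"] == "tool_call"})
print(len(trace["steps"]), tools_used)  # 6 ['python', 'web_search']
```

The error-and-retry steps are the point: recovery behavior, which Alex notes barely exists on the public internet, is exactly what would have to be produced deliberately.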
Capturing more of what humans actually do in the field of work?

Yeah, capturing more of what humans do, and I think investing in things like synthetic data and hybrid data: utilizing synthetic data but having humans be part of that loop, so you can generate much higher-quality data. Basically, in the same way that with chips we talk a lot about chip foundries and how we ensure we have enough means of producing chips, the same thing is true for data: we need effectively data foundries, the ability to generate huge amounts of data to fuel the training of these models. And then the last leg of the stool, which is often underrated, is measurement of the models. For a while, the industry has just been: add a bunch more data and see how good the model is, add a bunch more data and see how good the model is.
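One way to make that loop more scientific than "add data, eyeball the model" is to score the model per capability category and target data production at the weakest ones. A toy sketch of that idea; the categories, outcomes, and threshold are made up for illustration:

```python
def weakest_categories(results, threshold=0.7):
    """Map {category: [pass/fail outcomes]} to the categories below threshold."""
    scores = {cat: sum(outcomes) / len(outcomes) for cat, outcomes in results.items()}
    return sorted(cat for cat, score in scores.items() if score < threshold)

# Made-up eval outcomes per capability category (True = the model passed).
eval_results = {
    "single_tool_use":       [True, True, True, False],    # 0.75, above threshold
    "multi_tool_chains":     [True, False, False, False],  # 0.25, needs data
    "long_horizon_planning": [False, True, False, False],  # 0.25, needs data
}

# These become the targets for the next round of data production.
print(weakest_categories(eval_results))  # ['long_horizon_planning', 'multi_tool_chains']
```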
But we're going to have to get pretty scientific about exactly what the model is not capable of today, and therefore exactly what kinds of data need to be added to improve the model's performance.

How much of an advantage do the big tech companies have, with their corpus of data, versus the independent labs?

Well, there are a lot of regulatory issues they have with utilizing their existing data corpuses. You can look at this (it was before all the generative AI work): at one point Meta did some research that utilized basically all the public Instagram photos, along with their hashtags, to train really good image recognition algorithms, and they had a lot of regulatory problems with that in Europe. It turned out to be a huge pain in the ass. So that's one thing that's difficult to reason through: to what degree, from a regulatory perspective, particularly in Europe, these companies are going to be able to utilize their data advantages. I think that one's TBD. The real way in which a lot of the large labs have dramatic advantages is that they have very profitable businesses that can provide near-infinite sources of capital for these AI efforts. That's something I'm watching pretty intently. I'm
very curious to see how it plays out. There's this whole question in the industry: are they overinvesting? If you listen to the earnings calls of the big tech companies, they're like, look, our risk is underinvesting. If they really nail this AI thing, they could generate another trillion dollars of market cap, probably very easily, if they really get ahead of the competition and productize it in a good way. A trillion dollars of market cap, kind of a no-brainer. And if they don't invest the extra, whatever it is, 20 or 30 billion of capex per year, they miss out on that. And then there's some real existential risk for each of them, too.

Yeah, in each of their forms.

Yeah, all their businesses are potentially deeply disruptable by AI technology, so the risk-reward for them is very obvious. That's the big-picture thinking. Then, on a more tactical level, I think all of them are going to be able to pretty easily recoup their capex investments, worst case, just by making their core businesses more efficient and effective.

Yeah, GPU utilization for Facebook advertising.

Yeah, if Facebook or Google make their advertising systems a little bit better, they can recoup billions of dollars just from the better performance. Apple can easily recoup the investments if it drives an upgrade cycle. These are things that I think are pretty clear. Look,
it's generally great for the industry that they're investing so much capital, because they're also in the business of renting this compute out, at least in the case of Google and Microsoft. And the models are making their way out: Llama 3.1 is open source, and so even the literal fruits of all the investment are becoming broadly accessible. The surplus generated from the open-source models is kind of insane.

That's insane. Okay, that's a great segue into market structure at the model layer. So what do you think actually happens? Are there the few players we've all identified now, the handful, and they all kind of compete? Do you think it's a profitable business? What impact does open source have on the quality of the businesses? Take us a couple of years ahead and give us your forecast.

Yes. We've seen, over even just the past year and a half, the pricing for model inference fall dramatically.

An order of magnitude?

Two orders of magnitude, over two years. It's this shocking thing: it turns out intelligence might be a commodity. But I think this huge lack of pricing power at the pure model layer certainly indicates that renting models out on their own may or may not be the best long-term business. I think it's likely to be a relatively mediocre long-term business.

Well, I guess it depends on the
breakthrough thing, which is the earlier point, right? To the extent that someone actually has a durable breakthrough, or multiple people have durable breakthroughs, that potentially changes the market structure.

So, two things. One, if Meta continues open-sourcing, that puts a pretty strong cap on the value you can capture at the model layer. And two, if at least a handful of the labs are able to achieve similar performance over time, that also dramatically changes the pricing equation. So we think, and it's not 100%, but chances are the pure model-renting business is not the highest-quality business. The much higher-quality businesses are going to be above and below it.

Yeah, of course.

So below: Nvidia is obviously an incredible business, but the clouds have really great businesses too, because it turns out it's pretty hard, logistically, to actually set up large clusters of GPUs, and so the cloud providers have pretty good margins when they rent them out.

And the traditional data center business is very much a scale game, right? So they're massively benefited relative to smaller players.

Yeah, exactly. So, picks and shovels: if you're under the model layer, I think there are great businesses there. And if you're above the model, if you're building applications, ChatGPT is a great business, and a lot of the apps in the startup realm are actually working pretty well. None of them are quite as big as ChatGPT, obviously, but
a lot of apps, if they nail early product-market fit, end up being pretty good businesses, great businesses even, because the value they generate for customers, if they get the whole user experience right, far exceeds the inference cost of the models. There's some cool stuff here: I think Anthropic's launch of Artifacts in Claude is the first pin drop of this major theme, which is that all the labs are going to be pushing much deeper product integrations to be able to drive higher-quality businesses. So that'll be the other story: we're going to see a lot of iteration at the product layer. The boring chatbot is not going to be the end product.

That would be a disappointing outcome.

Yeah, exactly. And the product innovation cycle is very hard to predict. I mean, OpenAI was surprised by how good, or how popular, ChatGPT was. I don't think it's super obvious to me, or to anyone in the industry frankly, exactly which products are going to be the ones that hit, and what's going to provide the next legs of growth. But you have to believe that an OpenAI or an Anthropic can build great application businesses for them to be long-term independent and sustainable.

Yeah, for sure. And then it's, what drives competitive advantage? Obviously you have the model, a tightly integrated product on top of it, and then the good old-fashioned moats from there, right? Workflows, integrations, all that stuff. And I think you can clearly see their thinking. Both OpenAI and Anthropic hired chief product officers within, I don't know, two months of each other.

Yeah, they're figuring it out.

And it's a sort of change of tune, where they're like, oh,
no, now we're very much focused on this. And it's like, okay, I think the realization has hit.

Yeah, exactly. Makes total sense. You've got an application business with some really interesting customers. What are you hearing from enterprises about how they're actually putting this into place?

I think what we've seen is, there was a huge amount of excitement from the enterprise. A lot of enterprises were like, we have to start doing something, we have to get ahead of this, we have to start experimenting with AI. I think that led them into this fast POC cycle where they're like, okay, what are all the low-hanging-fruit ideas we have?

Go buy AI stuff.

Yeah, and let's go try all of it. Some of those things are good and some of them aren't, but regardless, it's been this big frenzy. Many fewer of the POCs have made it to production than the industry overall expected, and a lot of enterprises are looking at it now, and the doomsday they thought might happen hasn't really happened. AI has not fully terraformed and transformed most of the major industries.

It's sort of marginal stuff: efficiency gains in support, and then some of the creative tasks and things like that.

Yeah, exactly. Otherwise it's pretty light. The thing we think a lot about is: which AI improvements, or AI transformations, or AI efforts that we're working on actually
can meaningfully drive the stock price of the companies we're working with? That's what we encourage all of our customers to really be thinking about, because at the end of the day the potential is there. There's latent potential for almost every enterprise to implement AI at a level that would meaningfully boost their stock price.

Mostly in the form of cost savings and efficiency gains?

Well, today mostly in the form of cost savings, but then also much better customer experiences. In a lot of industries where there's a lot more manual interaction with customers, you should be able to drive much better customer interactions if you had more standardization and were able to use more automation. And those would eventually make their way to gains in market share with respect to competitors. So that's what we're pushing our customers toward, and I see it with some of the CEOs we work with: they're all on board, and they understand that it's going to be a multi-year
investment cycle. They might not see gains next quarter, but if they actually pull through to the other side, they're going to see massive transformations. I think a lot of the frenzy around small use cases, the more marginal use cases, is good. I think it's exciting, and I think they should be doing it. But to me, that's not what we're all here to do.

Yeah. It's very much like the application layer is in phase one right now: there's some automation, but it's largely chatbots. My hope as a startup investor is that over time a window opens for startups where product innovation will help them win and beat the incumbents. My partner Alex Rampell has this phrase: is the startup going to get to distribution before the incumbent finds innovation? I think there's an opportunity for it, but it's
like the tech is too early right now. I don't know if you'd agree with that, but I think the tech is too early, again because it's mostly cost saving. If most of the benefit is on the cost-saving side, that's not really enough to disrupt a large incumbent that has already pushed its way through all the costs of growing and distribution. How valuable do you think the data inside of enterprises is? It's like, oh, JP Morgan has whatever, 15 petabytes of data, I don't remember the numbers. But is that overrated? How much of it is actually useful? Because today, most of that data has not given them some meaningful competitive advantage. So do you think that actually changes?

Yeah, I think AI is the first time you could see that potentially change. Obviously there was the whole big data wave, and big data boils down to better analytics, which is marginally helpful for business decision-making, but
not deeply transformative.

It doesn't massively change the way their products work.

Yeah, exactly. Whereas now you actually can imagine some massive transformation in the way the products work. Take any big bank: a lot of the valuable interactions between a user and a large bank like a JP Morgan or a Morgan Stanley are human-driven, people-driven. They try their best, with lots and lots of people, to ensure the quality of the experience is very high across the board, but obviously with any large manual process there's only so much you can do to assure that. So all of your prior customer interactions, and all the ways your business has worked historically, are the only available data for training models to do well at that particular task. If you think about wealth management, there's very little in-distribution data on the internet that you could train a model on.

So behind the walls, there's actually quite a bit. It's very rich.

Yeah, huge amounts of data. So a lot of the data is probably not super relevant to actually transforming your business, but some of the data is hyper-valuable. That said, enterprises have a lot of trouble and challenges actually utilizing any amount of the data they have.

Right, it's poorly organized, it's all over the place. They pay consulting firms tens of millions, hundreds of millions of dollars to do these data migrations, and even after that, no change in results.

Yeah, no change in results. So it's historically been a very difficult place for enterprises to really drive transformation. And in some ways this is the race: are they going to be able to figure out how to utilize and leverage their data faster than
some startup figures out how to somehow get access to, or create a massively different product with, a small subset of the data?

Yeah, exactly. Shifting gears to how you run your company and how you've built your company. One of the things you've talked about is a mistake you made during the go-go times of 2020 and 2021 around hiring, this notion that in order to scale you had to hire a ton. It's something we saw with all of our portfolio companies: there was this war for talent, and it meant, we've got to go hire, we've got to go hire, we've got to go hire. So what were the lessons you learned through that process, and how have you changed how you've done things afterward?

So over the past few years, we've basically kept our headcount flat. We've grown it very slightly as the business has grown, but the business itself has grown 5x, 6x; the business has grown dramatically. And the takeaway from this entire process:
It feels very logical that more people equals better results, more people equals more stuff getting done. But rather paradoxically, if you have a very high-performing team and a very high-performing org, it's near impossible to grow it dramatically without losing all of that high performance and the winning culture.

Yeah, reducing the communication and coordination overhead actually increases productivity.

That's definitely true, and I think it's actually something even deeper, which is that a very high-performing team of a certain size is almost like this very intricate sculpture, this interplay between all the people on the team. If you add a bunch of people into that, even if the people are great, it just screws the whole thing up. And no matter what, as you add people, you're going to have regression to the mean. If you observe companies that do scale headcount a lot, where that's pretty core to their financial results, they acknowledge that mean regression. If you think about the scaling of large sales teams, for example: sure, you acknowledge you're going to have that mean regression, but you operationalize so that you're a little bit above the mean, and if you're able to do that, the whole equation still works financially.

Yeah, and sales is different from product.

Yeah, totally, of course. But our observation is just that startups work because you have very high-performing teams, and you want to keep those high-performing teams
intact as long as you possibly can. I think a common startup failure mode is: you have something that works, but everybody in the company is really junior, so as things are scaling, all the wheels are kind of falling off. Your investors tell you, hey, you should hire some executives. You go through these searches, which are somehow uniquely soul-crushing every time, but you go through them, and if you're good at it, it works maybe half the time. So you go through the exec searches, you bring in execs, and then you give the execs a lot of rope. Your exec says, hey, we need to hire a massive team for us to hit our results, and you're like, yeah, I mean, I'm pretty experienced, you seem experienced, let's do what you say. And you let these big teams be built. And the reality is, I think this almost always results in ruin. That isn't to say you can't hire executives from the outside. But what you need to do when you hire executives from the outside is have them really get steeped in how the company works. Before they make any major sweeping suggestions, they get into the rhythm and the operations of the company, and they understand why the whole thing works in the first place, why the things that are working are working. Then they make thoughtful suggestions. Initially they take small steps, and
you trust and verify each of these small steps, and eventually maybe they can make more sweeping suggestions. But that should come at a point where they have a clear track record of making small steps that have been really beneficial.

That's interesting and very tangible: start small when you hire a big executive. It's a little bit counterintuitive, and it's not the way any of those executives want to go.

Yeah. There's kind of an exec fantasy I've noticed. And by the way, I think executives are great people, they're incredible. But there is a tendency toward an executive fantasy, particularly at Silicon Valley companies with young founders, which is: I'm going to come in and fix this whole thing, I'm going to make this a professional operation. But you're recruiting teammates at the end of the day. You're not recruiting some magic wand; you're recruiting a teammate who you believe, over an extended period of time, is going to have great judgment in making repeated decisions about the business. And this is where we've made mistakes: you're not buying some magical bag of goods that's going to bring a magic formula into your business that will all of a sudden make the whole thing work. On the flip
side, there's a founder fantasy, the founder-CEO fantasy, which is: I'm going to hire a bunch of incredible execs, they're all going to be pros, and they'll do all the stuff I don't want to do, and I'm going to be able to sit back and just watch the cogs turn and the machine work. That's also extremely unrealistic, because the flip side is also true: the reason you are a good founder-CEO is because you make very good decisions over and over and over again, over an extended period of time, and to pull yourself out of those decision-making loops would be kind of crazy.

That's a pattern we've seen a lot: I'm going to hire executives, I'm going to step back a bit, and then, when some big decisions go wrong, there's the realization: wait, this is the point of me being here.

Yeah. I think it can work if your industry is very stable, potentially. Look at any public company: when they change CEOs, the stock price moves like 2%, and it's like, oh, okay, it doesn't really matter, that CEO is a cog. But that's very different from a high-growth startup that's run by a founder.

Exactly, yeah. A lot of startups and a lot of companies are valuable because of an innovation premium. Investors believe that founder-led companies are going to out-innovate the
market. So your job is to out-innovate the market, and you'd better be in the strategic decisions being made. Yeah, for sure. How about MEI? You recently rolled out this concept. I think half of my X feed was praising you and, probably more than half, some portion of my X feed was yelling at you. Talk about the concept and what your observations have been from rolling it out so far. Yeah, so MEI: we basically rolled out this idea of merit, excellence, and intelligence, and
the basic idea is that in every role we're going to hire the best possible person regardless of their demographics, and we're not going to do any sort of quota-based optimization of our workforce to meet certain demographic targets. That doesn't mean we don't care about diversity; we actually care about having diverse pipelines and a diverse top of funnel for all of our roles, but at the end of the day, the best, most capable person for every job will be the one that we hire. It's one of
these things that was mildly controversial, but if we were to just take a big step back and ask who companies should be hiring, I think it's kind of common sense. Yeah, it feels kind of obvious; we've lost the plot. Companies should hire the most talented people. And there's obviously this big question of how much social responsibility companies have in what they do. My take is that I operate in a very competitive industry.
Scale's role is to help fuel artificial intelligence. It's a very important technology, and we need incredibly smart people, the best people, to be able to accomplish this. I think most people at Scale would say this was sort of implicitly true — it wasn't a departure from how many of us thought about what we do at Scale — but it was really valuable for us to codify it, because it means
everybody can have confidence that even though companies change over time, this is how we operate today and we're not going to change this quality. Well, this has been awesome. I want to close with an optimistic question and forecast, which is: what is your own view of, or definition of, AGI, and what is your expected timeline for when we reach it? Yeah, I like the definition that goes something like: 80-plus percent of the jobs that
people can do purely at a computer — so digitally focused jobs — are accomplishable by AI. It's not imminent, it's not immediately on the horizon, so it's on the order of four-plus years, but you can see the glimmers, and depending on the algorithmic innovation cycles that we talked about before, it could come much sooner. Yeah, that's awesome, very exciting. Well, Alex, thanks for being here. Great to chat with you as always, learned a ton,
really appreciate it. Yeah, thanks for having me. Awesome.