substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models, and I don't think OpenAI is very happy about this. So OpenAI is basically saying that it has evidence that China's DeepSeek used its models to train a competitor. They're alleging that DeepSeek used the outputs from OpenAI's models to train their own model, and this is pretty big news, because the reports about DeepSeek's performance relative to what it cost to actually create the model have already led to some very interesting developments in the stock market. Now, this right here is some pretty big news, because it means there is the potential for OpenAI to sue DeepSeek on the grounds of alleged intellectual property theft. One key piece of evidence floating around the internet right now is this screenshot. One of the things people have started to realize is that when you ask DeepSeek "who are you," oftentimes in random conversation it may refer to itself as a large language model developed
by OpenAI. This is because, if a model was trained on the outputs of OpenAI's models, then the responses it trained on would contain statements like "as an AI developed by OpenAI, XYZ," and that's essentially what we see here in DeepSeek's outputs. This is really important, because if DeepSeek were an original model that wasn't trained on the outputs of OpenAI, it should say it's a model developed by the company DeepSeek. But you can see that in many cases this model actually refers to itself as a model developed by OpenAI, which of course isn't going to look good in the context of all the news going around. Now, I do know that DeepSeek recently changed this: if you message it now and ask who developed you, it will say it was developed by DeepSeek, but previously it actually said it was a large language model developed by OpenAI, and that is probably because they did train on OpenAI's outputs. I can see here that the article also states
that OpenAI says it has found evidence that the Chinese AI startup DeepSeek used the US company's proprietary models to train its own open-source competitor, as concerns grow over a potential breach of intellectual property. What OpenAI is essentially alleging here is that they've seen evidence of model distillation. In machine learning, distillation means putting a large model's capabilities into a smaller model: you have a large "teacher" model trained on a comprehensive dataset, capturing intricate patterns of knowledge, and then a smaller "student" model, initialized to be more resource-efficient. The student model is trained using the outputs of the teacher model, which basically involves adjusting the student to mimic the teacher's output distributions, capturing the nuanced information the teacher model has learned. In this case, the teacher would be OpenAI's o1 model, or whatever model they used. And OpenAI has actually used this technique themselves: they've distilled very smart internal models, like one reportedly called GPT-5, or they might call it Orion
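To make the teacher-student mechanism just described concrete, here is a minimal sketch of a distillation loss in plain Python. The logit values and the temperature are made up for illustration; in real training, these would be the output logits of two neural networks, and the student's weights would be updated by gradient descent to drive this loss toward zero.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution, softened by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened distributions.

    Training the student to minimize this makes it mimic the teacher's full
    output distribution, not just the teacher's top answer.
    """
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher exactly has zero loss;
# a mismatched student has a positive loss.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [4.0, 1.0, 0.5]))       # → 0.0
print(distillation_loss(teacher, [0.5, 1.0, 4.0]) > 0)   # → True
```

A higher temperature flattens both distributions, which is the standard trick for exposing the teacher's "dark knowledge" about which wrong answers it considers nearly right.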
I'm honestly not sure what the model is called, but essentially they've been distilling that model's capabilities down into models like GPT-4o so they can serve them to us at a much cheaper cost. Take a look at what the Groq CEO Jonathan Ross says about distillation and how DeepSeek was able to rapidly advance its AI capabilities using OpenAI: "How this actually works is, if there's a really good model already, you just have it generate the data, and you go whoop, right up to where it is, and that's what they did. So it is true that they spent about $6 million, or whatever it was, on the training; they spent a lot more distilling, or scraping, the OpenAI model. OpenAI was effectively, accidentally, subsidizing the training of this model, because they were using OpenAI. And rumors are that OpenAI may not be completely profitable yet in terms of every token in the API; on the subscriptions, maybe, but not in the API. So on each token they generate, they were effectively losing a little bit of money, while DeepSeek was getting the training data for their model." Which is pretty crazy when you actually think about it. So overall, it seems like maybe DeepSeek was stealing from OpenAI. Now, the article continues, stating that this technique is used by developers to obtain better performance on smaller models by using the outputs from larger, more capable ones, allowing them to achieve similar results on specific tasks at a much lower cost, and that distillation is a common practice in the industry. But the concern was that DeepSeek
was using it to build, of course, their own rival, which is against OpenAI's terms of service, which we will get into in a moment. Now, the crazy thing about all of this is that OpenAI actually doesn't want to provide details of its evidence. In its terms of service, they talk about how you cannot use any output of its services to compete with OpenAI, but that's not the point I'm trying to focus on here. The point is that OpenAI hasn't provided details, and I think that's really important, because if they did, DeepSeek could potentially go and cover their tracks. For example, earlier I spoke about how DeepSeek sometimes refers to itself as OpenAI; they've now removed that, so anytime it refers to itself, it always says "I'm DeepSeek." But that was a key piece of evidence, because if you trained your model yourself, it shouldn't really say "I'm a large language model by OpenAI." Now, they will have some good defenses, stating that they trained on a large portion of the internet, and a large portion of the internet is going to contain pieces of text that mention OpenAI's models, so it's not like they're going to be stuck in the mud there. But take a look at what David Sacks said; he's the AI czar, the government-appointed leader responsible for overseeing and managing AI policies, regulations, and strategies, and in a recent interview he said it's quite possible that this is exactly what DeepSeek did to get the capabilities of their current model: "There's a technique in AI
called distillation, which you're going to hear a lot about, and it's when one model learns from another model. Effectively, what happens is that the student model asks the parent model a lot of questions, just like a human would learn, but AIs can do this asking millions of questions, and they can essentially mimic the reasoning process that they learn from the parent model. They can kind of suck the knowledge out of the parent model, and there's substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models, and I don't think OpenAI is very happy about this. I think one of the things you're going to see over the next few months is our leading AI companies taking steps to try and prevent distillation, and we'll see if the leading AI companies can prevent distillation by third-party companies; that would definitely slow down some of these copycats." So, with OpenAI's terms of service, you can clearly see here that they state that one of the things you simply cannot do is use their outputs to
develop models that compete with OpenAI, and of course, in this case, DeepSeek is a direct competitor to OpenAI, so if they did do this, it would be possible for OpenAI to potentially sue them on multiple different grounds. You can also see here that they state you cannot attempt to, or assist anyone to, reverse engineer, decompile, or discover the source code or underlying components of their services, including their models, algorithms, or systems. So this is clearly some kind of breach of the terms of service, and I wouldn't be surprised if OpenAI actually sues DeepSeek on those grounds. The article continues, stating that it's very common practice for startups and academics to use outputs from human-aligned commercial LLMs, like ChatGPT, to train another model; that's what someone with a PhD from the University of California, Berkeley said. They also said that means you get all of this human feedback from step three, that it's not surprising to them if DeepSeek supposedly did the same, and that stopping this practice precisely may be difficult. That's pretty true: how on Earth do you figure out when someone is just chatting with a model versus actually using the model to harvest specific outputs so they can build their own? This would actually be a really difficult thing to figure out, and it's not like OpenAI can just impose rate limits or whatever, because that would render the service pretty much useless to developers. So they are going to be in a bit of a pickle figuring out exactly how individuals are doing this. You can see right here in the article it
states that "we know China-based companies and others are constantly trying to distill the models of leading US AI companies"; that's what OpenAI said in its latest statement. They also said that they engage in countermeasures to protect their IP, including a careful process for deciding which frontier capabilities to include in released models, and that it is critically important that they work closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology. And of course, it wouldn't be the first time that China has copied some US technology; there have been numerous cases of China actually stealing trade secrets by various different means. On the 7th of March 2024, an ex-Google engineer was charged with stealing AI secrets: a former Google software engineer was charged by the US with stealing trade secrets about AI while secretly working for two Chinese companies. This guy stole more than 500 confidential files, which is pretty crazy. So this isn't the only time China has done something like this, and it wouldn't be surprising to me if
they managed to just steal, slash, distill this information from a publicly available model. Now, while all of this is going on, something really funny is happening: a lot of people are basically saying, okay, maybe they did distill the knowledge from OpenAI's o1 model into R1, but can OpenAI really even complain about this? Because recently, OpenAI accidentally deleted potential evidence in a training data lawsuit. If you aren't familiar with what they're referring to, I want you to understand it like this: OpenAI is basically saying, "hey DeepSeek, why are you copying our data?" But most people are like, wait a minute, how is OpenAI saying people are copying their data, when OpenAI basically just scraped all of the publicly available data on the internet and didn't pay anyone for that data at all? Most people are basically saying, look, OpenAI, you guys scraped all the data from the internet without giving anyone credit, and, with that being said, you also "accidentally," wink wink, erased potential evidence in a training data lawsuit. And it's crazy, because OpenAI engineers accidentally erased critical evidence
in a certain lawsuit about the training data, so this is something pretty incredible; I mean, it's a wild ride in AI right now. Something else I find super interesting is the benchmarks. I actually like to look at the LMSYS Chatbot Arena, because it's somewhat of an independent evaluation, considering it's based on actual humans. If you aren't familiar with how this benchmark works: you input one prompt into the Chatbot Arena and get two responses, and you are blind, so you don't know which response is from which model. You then rate which one you think is better, and over time we see which models people prefer when they essentially blind-test them. Right now, you can see that DeepSeek is actually tied for first with ChatGPT and Gemini's new model, so it's a pretty decent model, even despite me being skeptical about certain benchmarks. However, there are benchmarks like SimpleBench that actually test reasoning, real human-style reasoning; for example, the questions might ask about how many minutes it will take for ice to melt, or similar simple questions that would be really easy for a human, so I don't want to spoil them here. You can see the human baseline is around 85%, but DeepSeek R1 actually scores
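For anyone curious how those blind pairwise Chatbot Arena votes turn into a leaderboard, here is a simplified Elo-style rating update in Python. This is a sketch for intuition only: LMSYS's actual leaderboard uses a Bradley-Terry model fit over all votes, and the starting ratings and K-factor here are made-up values.

```python
def expected_score(r_a, r_b):
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(r_a, r_b, winner, k=32.0):
    """Update both ratings after one blind pairwise vote.

    winner: "a", "b", or "tie". An upset (low-rated model winning)
    moves the ratings more than an expected result does.
    """
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    e_a = expected_score(r_a, r_b)
    r_a += k * (score_a - e_a)
    r_b += k * ((1.0 - score_a) - (1.0 - e_a))
    return r_a, r_b

# Two models start equal; repeated blind votes for model A raise its rating.
ra, rb = 1000.0, 1000.0
for _ in range(10):
    ra, rb = update_elo(ra, rb, "a")
print(ra > rb)  # → True
```

Because every vote is a head-to-head comparison between anonymized responses, models near the top of the board, like the DeepSeek, ChatGPT, and Gemini tie mentioned above, end up within a few rating points of each other.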