[Music] Hi, welcome to another video. I did a video on testing DeepSeek V3, although at that time there weren't many details because there was no official announcement about the model; it had just been updated on their chat platform and API. Now we have proper details about the model, so let me tell you everything about it and how you can use it with Cline and Aider. But before we do that, let me tell you about today's sponsor, PhotoGenius AI. PhotoGenius AI is an all-in-one AI-powered art generator
that allows you to type anything and get stunning visuals instantly. PhotoGenius AI gives you all kinds of image generation models in one place, whether it be Flux, Stable Diffusion, Kandinsky, or any other image generation model that you can think of. Not just that, it also gives you advanced AI image editing with tools like an AI avatar generator, background removal, a logo generator, an emoji generator, or even an app icon generator. The best part is that it starts at only $10, and you can get an additional 25% off
these already great deals by using my coupon code king2. So make sure that you check out PhotoGenius AI through the link in the description and generate some cool stuff with it.

So, the model is a 671 billion parameter mixture-of-experts model. A mixture of experts basically means that the architecture comprises many smaller expert networks, and a router sends each token of the prompt to the few experts that can handle it best. It has 37 billion activated parameters, which means that only the selected experts run for each token, so the active part of the model is about 37 billion
in size. It was trained on about 15 trillion tokens, the model has open weights, and the license is pretty lenient, which means that you can use it for commercial purposes as well. As for the benchmarks, the model beats or matches the performance of Sonnet in most of them, which is really great for an open-source model. The model is built on the same architecture as DeepSeek V2. It also has multi-token prediction, which is similar to speculative decoding and basically increases the inference speed of the model by
up to two times, which is why the new model is pretty fast. Previously the DeepSeek models were slow in inference and many people complained about it, but it's now faster and much more usable. Another thing they have done is post-train the model on knowledge distilled from their R1 model. They say that they distilled R1's long chain-of-thought capabilities into DeepSeek V3, which makes it perform well on complex tasks because it can reason through a chain of thought, which is
awesome to see as well. There's also a technical paper that holds much more detail. One of the best parts is that they have even shared the training cost and the GPUs they trained it on, and the cost comes to roughly $5.5 million. This means that the model was trained for just three times what OpenAI splurged on just testing their o3 model on ARC-AGI, which shows you what real gains are versus publicity gains. There are a ton of details here that you can have a look at. The
instruct variant as well as the base variant are available on Hugging Face, and you can run them if you have enough hardware, which most of us don't. Anyway, the model is available on the DeepSeek chat platform for free, without any limits, to try out. They also have an API that is extremely cheap: until February 8th it costs only $0.014 with caching or $0.14 without caching per 1 million input tokens, and $0.28 per 1 million output tokens. After that it will cost $0.07 with caching or $0.27 without caching
per 1 million input tokens and $1.10 per 1 million output tokens. To be honest, even after the discount ends it's pretty cheap. I mean, it's about 15 times cheaper than Sonnet with pretty similar results, which is just insane. It's surely better than Haiku, and Haiku costs six times more than this. Anyway, for those who can't use the DeepSeek API, you can use Hyperbolic. Hyperbolic gives you $10 of free credits and has this new model, which means you can use it for quite a while before the credits run out. Also have a look at ghf, because
they should also start supporting this model soon enough, I believe, and it's free with some rate limits.

Now let me tell you how you can use this model with Cline and Aider and get some pretty awesome results. Starting with Cline: just update it, and in the settings choose the OpenAI Compatible API provider and enter the base URL of either DeepSeek or Hyperbolic here, which will be something like this. If you use DeepSeek's platform, enter the model name as deepseek-chat, and if you're using Hyperbolic, then enter
DeepSeek V3. Now enter the API key and you can start using it. It works pretty amazingly. I have been using it a lot and it's very similar to Sonnet in almost all tasks. Plus it has more recent knowledge, like Next.js 14, which means that it generally gives you better and more cohesive programs than Sonnet. It's fast as well now, which is also amazing. I have a basic Expo app and I'm asking it here to build a simple synth keyboard, and it's all on auto-approve, so
after a little bit of prompting I was able to build this. It took about a million tokens to build all this, which is pretty good considering Sonnet takes about the same number of tokens to build this, and it's super cheap. DeepSeek is also now extremely fast compared to Sonnet as well as the previous DeepSeek model. I think this model is now the new Sonnet, at least for me, because I don't need to care about how many tokens it's consuming. With 1 million tokens Claude would have cost me about $20 or $25, whereas
this is just under $1 or $2, which is insane value for this model, and I don't mind having to prompt a bit more. Plus it's very much like Sonnet in that it doesn't produce errors in your code as often, and it's a godsend to use with Cline. So I'm going to be using this model from now on instead of Sonnet, because this is literally Sonnet for a lower cost. I think Sonnet's time is now finally over, and this time it's by an open-source model.

Some of
you may also want to use it with Aider. What you can do is export the DeepSeek API key and then run Aider with DeepSeek, like this. If you use Hyperbolic or some other provider, you can export the OpenAI base URL like this, then export your API key, and then start Aider with the DeepSeek model. It scores above Sonnet on the Aider leaderboards, so you should have a good time with it. I think this is the best model that you should use instead of Sonnet
as well, because it's really what it claims to be. Finally open source is giving the closed-source models a run for their money. Overall it's pretty cool.

Anyway, let me know your thoughts in the comments. If you liked this video, consider donating to my channel through the Super Thanks option below, or you can also consider becoming a member by clicking the Join button. Also give this video a thumbs up and subscribe to my channel. I'll see you in the next video. Till then, bye. [Music]
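
The API pricing quoted in the video can be turned into a quick back-of-the-envelope estimate. What follows is a minimal Python sketch, not DeepSeek's actual billing logic. The per-million-token prices are the figures from the video, read as $0.014 / $0.14 / $0.28 during the promotion and $0.07 / $0.27 / $1.10 afterwards. The `Pricing` class, the `estimate_cost` helper, and the split of a roughly one-million-token session into 900,000 input and 100,000 output tokens are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per 1 million tokens (hypothetical helper type)."""
    input_cache_hit: float   # input tokens served from the cache
    input_cache_miss: float  # input tokens not served from the cache
    output: float            # generated tokens


# Figures as quoted in the video; treat them as assumptions, not a price list.
DEEPSEEK_PROMO = Pricing(input_cache_hit=0.014, input_cache_miss=0.14, output=0.28)
DEEPSEEK_STANDARD = Pricing(input_cache_hit=0.07, input_cache_miss=0.27, output=1.10)


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                  cache_hit_ratio: float = 0.0) -> float:
    """Return the estimated cost in USD for one session.

    cache_hit_ratio is the assumed fraction of input tokens billed
    at the cheaper cached rate (0.0 means no caching at all).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (hit_tokens * pricing.input_cache_hit
             + miss_tokens * pricing.input_cache_miss
             + output_tokens * pricing.output)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical session: about 1 million tokens, mostly input, no caching.
    for name, pricing in [("promo", DEEPSEEK_PROMO), ("standard", DEEPSEEK_STANDARD)]:
        cost = estimate_cost(pricing, input_tokens=900_000, output_tokens=100_000)
        print(f"{name}: ${cost:.3f}")
```

Under these assumptions both estimates land well below the "$1 or $2" mentioned in the video, and a higher `cache_hit_ratio` lowers them further. The real figure depends on the actual input/output split of a session.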