Okay. So there are three universal truths out there in the world: the sun goes down at the end of the day, water flows downhill, and LLM prices are supposed to go down, not up. Clearly, Anthropic didn't get the memo on that last one. So in this video, I'm going to look at Claude 3.5 Haiku, Anthropic's latest, greatest small model, as opposed to their latest, greatest medium model and, possibly, the latest, greatest big model that's just around the corner. We're going to look at some of the advantages of this new model, and some of the disadvantages that come along with the new pricing Anthropic has rolled out today.
And I'm going to talk about why this model may not actually be what we're thinking it is. So let's jump in and have a look. Earlier today, Anthropic rolled out the latest version of Haiku: Claude 3.5 Haiku. Like the updated Sonnet model, it's great to see that it's available not only on Anthropic's own platform, but also on Amazon Bedrock and Google Vertex AI from day one. If we remember back, when they announced the updated 3.5 Sonnet (not calling it 3.6, but sticking with the 3.5 Sonnet name), they also mentioned that they were going to drop the 3.5 Haiku model. And they talked about this model as being a serious contender against a lot of other models out there, not just the small ones. In fact, you can see in the benchmarks that they've actually put 3.5 Haiku in comparisons alongside the new 3.5 Sonnet, GPT-4o, and Gemini 1.5 Pro.
Now, no one's denying that this is obviously a really good model, and certainly in my early testing it looks very promising for a lot of things like agentic workflows, and for being able to do 90% of the tasks that you perhaps want an LLM for.
But obviously, the big thing that's got the entire community worked up is the increase in pricing. If we look at the original Claude 3 Haiku model, when it was announced it was such a game changer, and the reason it was such a game changer was that it was just so cheap. Don't forget, this was announced long before GPT-4o-mini, long before a lot of the other models that are out there now. It basically gave us a really effective model that could handle simple reasoning tasks, extraction tasks, all those kinds of things, and it was only 25 cents per million tokens in and $1.25 per million tokens out. Now, if we jump forward to the new 3.5 Haiku, suddenly all of those costs are four times what Claude 3 Haiku's were: $1 per million tokens in and $5 per million tokens out. So this is not just a slight increase; this is a 4x increase.
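To put that jump in concrete terms, here's a quick sketch of what those per-million-token prices mean for a bulk workload. The prices are the ones quoted above; double-check them against Anthropic's current price list before relying on them.

```python
# Per-million-token prices (USD) for the two Haiku generations, as quoted
# in the video; verify against Anthropic's current price list.
PRICES = {
    "claude-3-haiku":   {"in": 0.25, "out": 1.25},
    "claude-3-5-haiku": {"in": 1.00, "out": 5.00},
}

def cost_usd(model: str, tokens_in: int, tokens_out: int) -> float:
    """Total cost in USD for one workload at the listed prices."""
    p = PRICES[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

# Example: a bulk extraction job with 10M input and 2M output tokens.
old_cost = cost_usd("claude-3-haiku", 10_000_000, 2_000_000)    # $5.00
new_cost = cost_usd("claude-3-5-haiku", 10_000_000, 2_000_000)  # $20.00
print(f"old=${old_cost:.2f} new=${new_cost:.2f} ratio={new_cost / old_cost:.0f}x")
```

Because both the input and output rates went up by the same factor, the ratio is exactly 4x regardless of the input/output mix.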
So really, we've got to ask ourselves: are we getting a 4x increase in intelligence to justify this? Anthropic seems to think so. They mention that Haiku surpasses Claude 3 Opus on many benchmarks at a fraction of the cost, and that, as a result, they've increased pricing for Claude 3.5 Haiku to reflect its increase in intelligence. I think this is the most interesting bit here: as these models get smarter, are we prepared to pay more money for them?
Especially when competitors' models seem to be getting a lot smarter while their pricing often gets halved. Another thing to mention is that this new Claude 3.5 Haiku model cannot take any images in, so Anthropic is keeping the original Claude 3 Haiku available for use cases that benefit from image analysis and, obviously, the lower price point. So to be honest, I think I, like many people in the community, am not only surprised that this has happened, but also wondering about the implications for things like the Opus model when it comes around the corner. Is that going to be substantially more expensive than the standard models out there, like Gemini 1.5 Pro, GPT-4o, et cetera?
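That text-only limitation suggests a simple routing pattern: send image requests to the older, vision-capable Haiku and everything else to the new one. A minimal sketch, noting that the dated model IDs below are assumptions you should verify against Anthropic's model documentation:

```python
# Minimal routing sketch: Claude 3.5 Haiku launched text-only, so requests
# containing images fall back to the older, vision-capable Claude 3 Haiku.
# The dated model IDs are assumptions; verify against Anthropic's docs.
def pick_haiku(has_images: bool) -> str:
    if has_images:
        return "claude-3-haiku-20240307"    # older, supports image input
    return "claude-3-5-haiku-20241022"      # newer, text-only

print(pick_haiku(True))   # image request  -> old Haiku
print(pick_haiku(False))  # text-only job  -> new Haiku
```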
So for example, with OpenAI, we've seen the new o1 models be quite a lot more expensive than the GPT models, at least for the time being. If we come over and look at the Artificial Analysis Quality Index (they do really good analysis of a lot of models and give us a sense of where models fit in), they've got the new Claude 3.5 Haiku coming in just below the previous Claude 3 Opus, but certainly above Mistral NeMo and Llama 3.1 70B, which is really interesting. Now, they do show that it's still behind models like GPT-4o and Gemini 1.5 Pro, but surprisingly it's also behind GPT-4o-mini.
Now, if we come over to the Artificial Analysis site, we can see the actual breakdown of quality versus price. We can see that while 3.5 Haiku is a lot faster than many models, it's still slower than GPT-4o-mini, GPT-4o, and Gemini 1.5 Flash. If we look at price, sure enough, GPT-4o-mini and Gemini 1.5 Flash clearly beat it here, not to mention Gemini 1.5 Flash 8B, which I think is even a fraction of the price of 1.5 Flash; from memory, you're talking about a cent or a few cents per million tokens. So really, this model is going to have to justify its existence against those other models on your particular tasks. Now, it certainly may do that.
For things like coding and agentic use, my early testing is showing that this model is clearly pretty good at that kind of thing. I guess the sad thing is just: did it have to come with this large increase in pricing? So one of the big questions to ask is whether this new Haiku, and also the new Sonnet, are really the same size as their predecessors. Are we in fact seeing that one of the reasons the pricing has increased is that the new Haiku model is more akin to the old Sonnet model, and the new Sonnet model more akin to the old Opus model, in terms of size and compute? Unfortunately, with these proprietary models, we don't know.
In fact, even with the updated Sonnet 3.5, they're still sticking to calling it 3.5 and not 3.6 or something that would make the distinction clear. And this certainly has to make us wonder: when are we going to see the 3.5 Opus, exactly how powerful will it be, and, I guess now, how expensive is it going to be?
The trend we've seen time and time again has been: introduce a new model, reduce the cost of the old model, then reduce the cost of the new model over time. It's a formula that has worked really well for OpenAI, and one that Google and many others have adopted as well.
So over the next few days, I'll be testing this model out, specifically for agentic tasks, and doing some benchmarks of where it makes sense to use this model or the new Sonnet 3.5, and where it actually makes more sense now to stick with the old, cheaper Haiku model, which I've always been a fan of for stock uses like information extraction, curation of data, and just generally being the workhorse model. And I do think now, with this change, that workhorse role may shift away from Claude Haiku toward something like the Gemini Flash models for anything where you're really concerned about pricing and pushing large amounts of tokens, while sticking with Claude Sonnet, or perhaps the new o1 models, for your orchestration layer. So I'm going to go play with these over the next few days, and I'd love to hear what you guys think.
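That workhorse-versus-orchestrator split can be sketched as a simple price-table lookup. The per-million-input-token figures below are rough numbers from memory and from the Artificial Analysis comparison discussed above; treat them as placeholders, not an authoritative price list.

```python
# Sketch of a cost-aware "workhorse" picker for bulk, price-sensitive jobs
# (extraction, data curation). Per-million-input-token prices (USD) are
# rough figures and will drift; treat them as placeholders only.
WORKHORSE_CANDIDATES = {
    "gemini-1.5-flash-8b": 0.0375,
    "gemini-1.5-flash":    0.075,
    "gpt-4o-mini":         0.15,
    "claude-3-haiku":      0.25,
    "claude-3-5-haiku":    1.00,
}

def cheapest_workhorse(candidates: dict) -> str:
    """Return the model with the lowest per-million-input-token price."""
    return min(candidates, key=candidates.get)

print(cheapest_workhorse(WORKHORSE_CANDIDATES))
```

In practice you'd gate this on quality too (only pick the cheapest model that clears your accuracy bar on the task), but the point stands: at these prices, the new Haiku is no longer the default answer for the bulk tier.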
Please let me know in the comments: how do you feel about the pricing? One thing I should point out is that even at $5 per million output tokens, this is still insanely cheaper than what we were paying last year for models that were nowhere near this quality. So while the temptation is to take everything we've got for granted, maybe Anthropic is setting a new trend here: new models that are newer versions of old ones but cost more money, probably because they cost more to run. And you can imagine that training the Claude 4 family of models is going to be seriously expensive, perhaps even unsustainable if models are expected to keep getting cheaper and cheaper over time. So let me know what you think in the comments. As always, if you found the video useful, please click like and subscribe, and I will talk to you in the next video.
Bye for now.