As always, there is a lot going on in the creative AI space, so today we're going to do a quick roundup of all of the cool stuff coming out of not only the open-source scene, but the big platforms as well. I've also got a check-in with Midjourney and Runway, plus an interesting script-to-storyboard platform.
Kicking off, LTX has released a distilled version of their new 13B model. We talked a bit about the updated 13B model a few videos ago, and we'll be swinging back in just a minute to actually see it in action, but the focus here is on the 13B distilled model, essentially the turbo version. Like its big brother, the distilled version is open source. But if you don't run things locally, there are still a couple of different places where you can try it out. The big news with this turbo version is, of course, the speed, with LTX saying that it can achieve high-quality results in just four to eight steps, with multiscale rendering completed in as little as 12 seconds. Calling the original the undistilled version doesn't sound quite right, so let's call it barrel-aged. Like the barrel-aged version, this turbo model can be run on consumer-grade hardware. Does that mean it'll be outputting in 12 seconds on your Pentium 3 Dell potato with 2 gigs of VRAM? No, it does not. But by most reports, if you have a reasonably modern GPU, you should be fine. That said, faster does not always equal better.
But I do have to say, the LTX turbo model does hold up pretty well. In my opinion, it does better than Gen-4's turbo does. I don't use Gen-4 Turbo, only Gen-4 itself; Turbo is just way too unpredictable. I do speculate that part of the reason the turbo model is performing so well is something that's actually baked into the LTX-Video architecture, namely multiscale rendering, where it essentially starts off with a sketch version, let's call it, of your video output, and then slowly adds passes on top of it, refining it.
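To make that idea concrete, here is a purely illustrative coarse-to-fine toy in Python. This is not LTX's actual code; the smoothing "denoiser" and the scale schedule are stand-ins I made up just to show the shape of the trick, which is spending most of your steps at the cheap, low-resolution scales and only a few at full size.

```python
import numpy as np

def denoise(latent, steps):
    """Stand-in for a diffusion refinement pass: each step nudges the
    latent toward a locally smoothed version of itself."""
    for _ in range(steps):
        smoothed = (latent + np.roll(latent, 1, 0) + np.roll(latent, 1, 1)) / 3.0
        latent = 0.5 * latent + 0.5 * smoothed
    return latent

def upscale(latent, size):
    """Nearest-neighbor upscale: carry the coarse 'sketch' to the next resolution."""
    factor = size // latent.shape[0]
    return np.kron(latent, np.ones((factor, factor)))

def multiscale_render(scales=(64, 128, 256), steps=(8, 4, 2)):
    latent = np.random.randn(scales[0], scales[0])  # start from pure noise at low res
    for size, n in zip(scales, steps):
        latent = upscale(latent, size)  # bring the rough composition up a scale
        latent = denoise(latent, n)     # refine detail with a few cheap passes
    return latent

frame = multiscale_render()
print(frame.shape)  # (256, 256); most of the work happened at the small scales
```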
So, if you want to try out LTX turbo and you're not really savvy with running locally, you can of course run it over on Hugging Face, again because it is open source. Uploading this image, giving it the prompt "machine rotates," setting it to a 5-second generation, and running that, we end up with this as an output, and that looks pretty good. To note, on Hugging Face this did not generate in 12 seconds, as I was just using the free GPU, but it was still pretty fast: it actually generated in 41 seconds. Now, if you want to go fast, you can of course go over to the LTX Studio platform, where I uploaded this image of our cyberpunk Viking going out day-drinking with some rando. We generated this image in our last video on Runway's references. Running this in LTX-Video turbo, we ended up with these results, which look pretty solid. And that came in at 13.3 seconds, so just a little over 12 seconds.
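And if you are comfortable running things locally, here's a minimal sketch of the same image-to-video generation using Hugging Face's diffusers library, which ships an LTX image-to-video pipeline. To be clear, the distilled checkpoint id below is an assumption on my part; check LTX's release page for the exact repo name and their recommended settings.

```python
import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Assumed repo id for the 13B distilled checkpoint; verify against the release.
pipe = LTXImageToVideoPipeline.from_pretrained(
    "Lightricks/LTX-Video-13B-distilled", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("machine.png")  # the input still
video = pipe(
    image=image,
    prompt="machine rotates",
    width=704,
    height=480,
    num_frames=121,          # roughly a 5-second clip at 24 fps
    num_inference_steps=8,   # the distilled model's whole pitch: 4 to 8 steps
).frames[0]
export_to_video(video, "output.mp4", fps=24)
```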
Moving on to Wan 2.1, or WanX, as it was known for a very brief 48 hours (I will never forget): they have released Wan 2.1 VACE. And if you're wondering what VACE is, well, VACE is an all-in-one video creation and editing model, which does not actually acronym out to VACE. That would be "video and creation editing," and that doesn't make any sense. It does a lot of the stuff that we've seen on the bleeding edge of AI video: things like reference-to-video, subject recreation, outpainting, canvas expansion, and the swap-anything thing.
So yeah, VACE is now available for the Wan 2.1 model, and the results are really pretty impressive. We have things like character motion control: it looks like we have our initial image input here, and then, simply by providing an animated box moving to the right, we have control over where this kid stands up and walks to. I'll sketch what a driving video like that might look like below.
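Since the control input really does appear to be that simple, here's a hedged sketch of building that kind of driving video yourself with OpenCV: a white box sliding left to right over black. The resolution, frame rate, and outline style here are my own guesses, not VACE requirements, so check their docs for what the model actually expects.

```python
import cv2
import numpy as np

# Render a 5-second driving video: a white box sliding left to right.
W, H, FPS, SECONDS = 832, 480, 16, 5
writer = cv2.VideoWriter("control.mp4", cv2.VideoWriter_fourcc(*"mp4v"), FPS, (W, H))

box_w, box_h = 120, 240
total = FPS * SECONDS
for i in range(total):
    frame = np.zeros((H, W, 3), dtype=np.uint8)  # plain black background
    t = i / (total - 1)                          # 0.0 to 1.0 across the clip
    x = int(t * (W - box_w))                     # the box's left edge this frame
    cv2.rectangle(frame, (x, (H - box_h) // 2), (x + box_w, (H + box_h) // 2),
                  (255, 255, 255), thickness=2)
    writer.write(frame)
writer.release()
```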
We also have masked video inpainting: we have our source video here, live action, and this either 3D-animated or maybe action-figure character, and the masked replace here looks pretty good. Some other really interesting use cases here: this guy, obviously, painting the landscape. Is that the dude in the Parisian cafe from those original Sora videos? It does look like him. Anyhow, the cool thing here is that the actual driving video for this is actually this: somebody masked out the painting and replaced it with this, which is pretty cool. In general, the video-to-video stuff does look pretty good. Here is, I presume, actual footage, a drone-shot kind of thing, then run through Wan 2.1's VACE. Yeah, looks pretty neat. But one that I thought was really cool was this: this AI-generated clip of an eagle coming down to try to get a meal and coming up empty-handed is actually just this, plus a text prompt. That's pretty interesting, given that we've been seeing some interesting geometric-shape stuff coming out of the big LLMs like ChatGPT and Gemini. That might make for a really interesting kitbash: creating something via an LLM and then using it as driving video for something like Wan with VACE.
Code for Wan 2.1 VACE is available now in two flavors: a 1.3B model and a 14B model. Now, I don't know what the requirements are on the Wan VACE models, but I will say that the standard 2.1 model only required 8 gigs of VRAM. And again, if you don't run models locally, I wouldn't sweat it too hard; it's probably going to be implemented just about everywhere by, like, the end of the day. If you do want to tinker locally in the meantime, here's roughly the shape of it.
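As a heavily hedged sketch, assuming diffusers picks VACE up the way it did the base Wan 2.1 models: every name below (the pipeline class, the repo id, the call signature) is an assumption on my part, so treat this as the shape of the thing and verify against the current diffusers docs before running any of it.

```python
import torch
from diffusers import WanVACEPipeline  # assumed class name
from diffusers.utils import export_to_video, load_video

# Assumed repo id, following the Wan 2.1 naming pattern; a 14B flavor should exist too.
pipe = WanVACEPipeline.from_pretrained(
    "Wan-AI/Wan2.1-VACE-1.3B-diffusers", torch_dtype=torch.bfloat16
).to("cuda")

control = load_video("control.mp4")  # e.g. the moving-box clip sketched earlier
frames = pipe(
    prompt="a kid stands up and walks to the right",
    video=control,                   # driving/control video (assumed parameter name)
    num_frames=81,
    num_inference_steps=30,
).frames[0]
export_to_video(frames, "vace_out.mp4", fps=16)
```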
Super quick breaking story as I am recording this: Hunyuan has a new image model coming out. That's all the information that I have; that is all they put out. I do presume that we're going to find out a little more by the end of this week, as they seem to be dropping open-source updates every Friday, at least this month. So, yeah, we'll see. Moving over to the Google side of the street, it looks like we will see Veo 3 next week.
And it's got a little extra something going on, too, in advance of Google's big I/O event next week, which I will be covering. TestingCatalog uncovered some changelog notes here, including Imagen 4 as well as an Imagen 4 Ultra model, and additionally mentioned in the changelog is a Veo 3 preview model. Now, here is where things get interesting, because Legit uncovered new AI model references: obviously we have Veo 3 here, but we also have an unspecified Veo model, plus a "Veo Echo's pose custom image to video."
That sounds pretty interesting. What is it? I mean, I don't know, but it sounds awesome.
I mean, I guess we'll find out next week. I of course will be watching the event and rolling into coverage as soon as it wraps up. So, let's go check out a script-to-storyboard generator.
But first, a quick word from our friends at Spotter Studio. I'm willing to wager that anyone watching this channel is probably producing some kind of video, be it films (AI or not), music videos, or just some kind of experimental stuff. Basically, you are making something that is meant to go in someone else's eyeballs, which, yes, is a very weird thing to say. And I figure that most of you probably do have a YouTube channel, and maybe you're thinking about growing it into a sustainable source of income, which is actually doable; I'm kind of a testament to that. Well, that is where the sponsor of today's video can help you out. Spotter Studio is a creative and data-driven platform focused on helping out pro YouTubers and those on their way to going pro. And what's cool is that Spotter has a number of AI tools that will help you brainstorm, prioritize, and package your next video using personalized insights. For example, one of my least favorite things to do in the YouTube production process is titles and thumbnails, and I've got to do them two to three times a week. So, a great feature in Spotter Studio is just about that. For example, one of the ideas on the ever-growing to-do list is a video on the best AI lip-sync tools. Yeah, I want to see that video, too. So, with Spotter, I can workshop some ideas on the title, the thumbnail, and even the hook for the video. Let's start off with the title, which does give us a pretty good starting point. Now, what's cool here is that we can continue to workshop and ideate on it with a number of different options. Let's give the Power Up option a shot, where we do end up getting some more, I guess, buzzy titles.
I do like "Next-Level AI Lip-Sync Tools Revealed," so I'm going to go ahead and keep that one. From here, we can brainstorm some ideas for a thumbnail, and we end up with results like these. Now, to note: I know that on this channel we obviously cover a lot of AI imagery wizardry, so these images are more meant to serve as inspiration and, honestly, text placement. Hey, that guy's totally stealing my look; I'm going to take this guy. One thing I really do like about Spotter Studio is the "This, But" function. So, let's go ahead and brainstorm up a hook here, where we are provided with a number of ideas for a video hook. What I can do is come down to the "This, But" function and tell it to make it more about using lip-sync for cinematic AI films, and of course we end up with more refined results. Now, what's kind of interesting is that because Spotter knows my channel, the results end up becoming more personalized. It's really about rapidly coming up with packaging ideas to find that winning concept. You can try out Spotter Studio for 7 days, totally for free, at the link down below.
Hey, my thanks to Spotter Studio for sponsoring today's video. Moving on: one of the things that pops up in the comments from time to time is a request for script-to-storyboard generators, and we have seen a few stabs at them in the past. The latest one is called Rubberband, which essentially allows you to drop your script in, hit "create storyboard," and have it begin generating storyboards for your feature. What I think is interesting about this one is that it isn't going for an overly stylized, realistic look; it's trying to maintain more of a traditional, sketchy look.
This one is fairly new, so there aren't a lot of bells and whistles here, which is actually kind of refreshing, in all honesty. It is free to try out; I think they give you three scenes or so, and I definitely recommend at least giving it a shot. So, what I did was crib up, well, I should say I didn't crib it up, I had ChatGPT crib it up, a two-page, Game of Thrones inspired short scene called "The Shadow of Ashenmoor" that involves a king, a knight, and a dark wizard. Importing that into Rubberband, you can see that it goes through and tags relevant characters, locations, items, and so on. From there, it begins to generate storyboards. Again, there are some problems: our king obviously does not stay super consistent all the way through these generations, and it did put a random knight back here as well. But I will say it does a fairly serviceable job of breaking down your script and giving you the shots that you need. And it does look like you can edit your shots; there is an editor here where you can choose different shot types.
Then, ultimately, you can go through and add or rearrange shots if you need to. Is it perfect? Of course it is not. In fact, it's actually missing a few key storyboard things, but that's okay; I'll explain in a minute. First, though, I did want to show this. This is a real storyboard from, I believe, the first Avengers movie. For one, we have these arrows that indicate character movement; in this one, Iron Man is obviously moving toward the screen. One thing I do want to point out, and this is why I'm not too hung up on character consistency throughout, is that while you can tell this is Captain America, this is Iron Man, and this is Thor, they're not really intricately rendered. Storyboards, while I think they're pretty cool, are really never meant to be seen; they are 100% a previsualization tool. That said, what Rubberband has going on, especially if they really hone in on this, is something that has been asked for a number of times. So, if you're interested in this, head over, give it a shot, and, more importantly, provide them with feedback on what is lacking and what you would like to see. I definitely have firsthand experience knowing that a lot of the time, developers are sort of shooting in the dark and don't know what they should be prioritizing first. So again, if you're interested, give it a shot.
It's free; provide some feedback. Moving on, it does look like Midjourney has been tweaking the Omni Reference feature. It definitely isn't 100% there yet, but it's starting to get closer. For example, me as a detective in a noir crime alley ends up getting this. Midjourney me is always so angry. But it is definitely looking a lot better than it did a week ago, when it output a version of me that looked like the bully in a high school movie from the '80s. Is it absolutely hitting 100% all the time? No, it is not. For example, in the Andor-inspired image that we highlighted a while ago, I still kind of come off looking like, I don't know, the love child of Josh Brolin and Billy Bob Thornton. As a quick follow-up on my continual experimentation with this, I have found that you do end up with some pretty decent results if you really lower your stylization and turn your personalization off, with a prompt shaped something like the one below.
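For the curious, here's the rough shape of what I mean. The flag values are just my guesses at "low stylization, no personalization," not official Midjourney guidance, so tune them to taste:

```
/imagine a detective in a rain-soaked noir alley, cinematic lighting --oref <your_image_url> --ow 400 --stylize 50
```

Personalization is simply left off (no --p), --stylize sits well below its default of 100, and --ow is the Omni Reference weight.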
And yeah, that does end up looking like me; it's just a really kind of bland image, so it loses all of the Midjourney-ness of it all. I will of course be keeping an eye on this and continuing to experiment with it, and we'll definitely let you know when, at least in my estimation, they're over the beta hump. Yeah, it's okay.
I mean, such is life in AI: perpetual beta. Rounding out: is Runway's Act Two around the corner? Well, maybe. Act One, as I'm sure you're aware, is Runway's feature that allows you to use a driving video and then target another video source. Now, I've always liked Act One. It does have its quirks, but I do have to say that, at the end of the day, performance will always be better than prompt. In fact, because I never uploaded it here, here's a quick snippet from a short that I did utilizing Act One. "Your CV here, it's quite colorful." "My skills are transferable." "Yeah. Okay. But it does kind of read like you might be a professional assassin." "I prefer the term troubleshooter." "I get that." That's really funny. Now, of course, Act One is starting to look a bit on the dated side, namely because it generates utilizing Gen-3.
So, of course, the question is: what will it look like once it upgrades to Gen-4? Well, we don't necessarily have an answer to that, but Runway's Cristóbal Valenzuela, who has been showing up on the channel a lot lately, did tweet this out, beginning with an image of this character, followed by an image of this guy in a mocap suit, and then, finally bringing it all together, we get this. Now, obviously, these are stills and are not in motion, but this definitely gives off an Act Two kind of vibe, one that would allow for full-body replacement via reference characters as well. It doesn't look like we'll have very long to wait, as Runway's Nicolas Neubert just posted, "Tomorrow feels like a good day to drop something." Man, they love their Friday drops. Now, I should clarify: is this Act Two? I mean, I don't know, but they're dropping something tomorrow. So, I guess I'm going to go brew up another pot of coffee.
As always, I thank you for watching.