Google just unveiled its latest AI tools, and this time they're aiming to dominate the video and image generation game. With Veo 2 and an updated Imagen 3, Google is showing that AI-generated visuals are moving closer to professional-grade quality. There's also a new creative experiment called Whisk, which lets users generate images by remixing other images without relying on long, wordy prompts.

Veo 2 is where things start to get serious. Google claims its newest video generator understands real-world physics better, meaning the movements, lighting, and general flow of what it generates look more natural and
believable. It's a big step forward for AI video, which has struggled to produce results that don't feel awkward or artificial. The model has been trained to understand human movement and expression more accurately, so things like facial gestures or a character walking through a scene won't look as stiff or exaggerated as they sometimes do with other models. What makes Veo 2 stand out is its focus on the details that professional filmmakers care about. It's not just about slapping visuals together based on a text description; this model understands cinematography. Specific lenses, angles, effects: it's all in play now. If
someone prompts Veo 2 for a close-up with a shallow depth of field or asks for the softness of an 18mm lens, the model knows exactly what that means and delivers. On top of that, the outputs can reach up to 4K resolution, which is a huge leap in quality. Earlier AI-generated videos often looked low-res or blurry when pushed to larger screens, but Veo 2 is closing that gap. The model doesn't just create short clips either; it can extend sequences to minutes in length, making it more useful for creators who want longer, flowing visuals. And while AI
videos still have their quirks, like the infamous extra-fingers problem, Google says Veo 2 hallucinates those details far less often.

For now, Veo 2 is only available through Google Labs' VideoFX platform, and access is limited. Anyone interested needs to sign up for the waitlist, as Google is rolling it out slowly. The original Veo model is still available on Vertex AI, primarily for enterprise users. Videos created with Veo 2 also include a SynthID watermark, which helps identify them as AI generated. That's part of Google's focus on safety and preventing misuse, like AI deepfakes being
passed off as real content.

Now, the competition between AI video tools is heating up. OpenAI's Sora grabbed headlines earlier this year for its ability to generate detailed videos from text prompts, but the results have been inconsistent. Users noticed some physics-defying moments and anatomical oddities, and while Sora is impressive, it still has flaws. Google's own testing says that Veo 2 is preferred by human evaluators over Sora and other rival models. This is based on two metrics: how well the output matches the prompt, and overall preference, meaning which videos people liked more. That kind of
edge matters when content creators are deciding which tool to use. Google is positioning Veo 2 as a serious option for filmmakers, YouTube creators, and visual storytellers. One of the biggest early use cases has been on YouTube Shorts, where creators are using VideoFX to generate backgrounds quickly and save time during production. High-quality AI videos are becoming a powerful tool for creators who need professional results on a tighter budget or timeline.

Alongside Veo 2, Google has also rolled out a major upgrade to its Imagen 3 image generator. Imagen 3 improves on the previous version with
brighter visuals, richer details, and better adherence to prompts. The model can now handle a wider range of styles more accurately, whether that's photorealism, anime, impressionism, or abstract art. Imagen 3 also captures textures and lighting with greater precision, producing results that stand out when compared to other top image generators. Imagen 3 is already available through Google Labs' ImageFX tool and has been rolled out to over 100 countries. Like Veo 2, Imagen 3 outputs include SynthID watermarks to ensure they're recognizable as AI generated.

To add a creative twist to image generation, Google
has also introduced Whisk, an experimental tool that lets people generate visuals using other images as prompts. Instead of typing out a detailed description, users can feed Whisk a subject, a scene, and a style through images. The tool combines those elements to create new outputs, making the process faster and more visual. For instance, someone could upload a cartoon image of a bear, a photo of a snowy mountain, and a watercolor painting for the style, and Whisk would generate a visual blending those ideas. There's also an option to add text prompts for more refinement, but it's not required.
Whisk uses Imagen 3 alongside Google's Gemini model, which analyzes the input images and writes detailed descriptions for them. Those descriptions are then passed to Imagen 3 to generate the final result. It's a clever approach that simplifies the process for people who might struggle to write precise text prompts. Google calls Whisk a tool for rapid visual exploration, meaning it's built for creative brainstorming rather than perfect, polished outputs.

AI video and image generation have come a long way, but there's still work to do. Even the best models, including Veo 2 and Imagen 3, aren't immune to
quirks or imperfections. However, the improvements are undeniable. Google's focus on cinematic details in Veo 2 and the stylistic flexibility of Imagen 3 are big steps toward making AI tools more useful for professionals.

Other companies are pushing forward too. Runway ML, one of the early players in AI video, recently added advanced controls to its Gen-3 Alpha Turbo model. Pika Labs released Pika 2.0, which allows users to add their own characters to videos. Meanwhile, Luma AI expanded its Dream Machine and partnered with AWS to make its tools more accessible for enterprise use. The growing interest in AI
tools for video and image generation is starting to reshape creative industries. Some filmmakers and artists remain skeptical, especially after seeing AI results that don't quite hit the mark. For example, audiences at The Game Awards recently criticized a trailer that felt like AI slop, and many people still distrust AI's ability to replace human creativity. That skepticism hasn't stopped progress, though. Big names like James Cameron and Andy Serkis are already exploring AI's potential in filmmaking, showing that the industry is beginning to adapt.

Google's improvements to Veo and Imagen put them ahead in the race by focusing
on professional-grade tools. These updates give creators more options to produce polished video sequences with cinematic effects or high-quality AI-generated art. Tools like Veo 2, Imagen 3, and Whisk simplify the creative process while delivering impressive results. Veo 2 will expand to YouTube Shorts and other platforms next year, making it more accessible for creators. Imagen 3's rollout on ImageFX is already global, and Whisk adds an interesting layer for experimentation. Together, these tools are pushing AI-generated visuals closer to becoming mainstream in creative workflows. The focus is on providing creators with new ways to work, whether they're
producing short films, creating marketing visuals, or experimenting for fun. Tools like Veo 2 and Imagen 3 unlock significant potential, with improvements that continue to push back AI's limitations.

With OpenAI, Google, and other companies all racing to improve their models, AI-generated visuals are evolving at a pace we haven't seen before. Each new release brings more realism, more control, and better results, making it easier for creators to turn their ideas into reality. For now, access to Veo 2 remains limited, but Google's strategy of careful rollouts ensures they can fine-tune the tool and address any
lingering issues. As Veo 2 continues to improve and reach more users, it's going to be fascinating to see how creators use it and how it stacks up against OpenAI's Sora and other competitors.

Let me know your thoughts in the comments, and if you enjoyed the video, don't forget to like and subscribe. Thanks for watching, and see you in the next one.