Google DeepMind has launched Veo 2, its next-generation AI video generation model, just days after OpenAI made its video generation model Sora available to the public.
The new model can generate videos at up to 4K resolution and more than two minutes in length, significantly outpacing OpenAI's Sora, which is currently limited to 1080p resolution and 20-second clips. In Google's internal comparisons, 59% of human raters preferred Veo 2 over Sora Turbo, indicating a potential breakthrough in AI-generated video technology.
The second-generation model improves on Google’s first-generation Veo model in several respects. Veo 2 understands cinematographic language, allowing users to specify genres, lens types, and cinematic effects. Creators can request specific shot types, such as low-angle tracking shots or close-ups, and the model delivers nuanced visual interpretations.
The model demonstrates improved understanding of real-world physics, human movement, and expression. Google claims Veo 2 produces fewer "hallucinated" details like extra fingers or unexpected objects, enhancing the realism of generated videos. The technology can generate content across various styles, from hyper-realistic scenes to animated sequences.
Currently, Veo 2 is available through Google Labs' VideoFX platform on a limited, waitlist basis. To mitigate potential misuse, Google has implemented SynthID, an invisible watermarking technology, to identify AI-generated content.
While the model promises upgraded capabilities, Google acknowledges ongoing challenges in maintaining long-term coherence and character consistency in generated videos.