Google Unveils Gemini Omni, An AI Model That Can Create Videos From Any Input

By Amit Chowdhry May 21, 2026

Google announced Gemini Omni, a new family of AI models designed to combine Gemini’s reasoning capabilities with advanced content creation features, starting with video generation and editing. The first model in the lineup, Gemini Omni Flash, is rolling out to the Gemini app, Google Flow, YouTube Shorts, and the YouTube Create App.

Gemini Omni enables users to generate and edit videos using combinations of text, images, audio, and video inputs. According to Google, the platform supports conversational editing, in which each instruction builds on previous prompts while maintaining character consistency, scene continuity, and realistic physics.

The company said Gemini Omni can transform videos using natural language prompts, including changing objects, modifying environments, adding effects, and altering actions within scenes. Users can iteratively refine videos across multiple prompts without losing continuity from the original scene.

Google emphasized Gemini Omni’s ability to reason about physics and real-world concepts. According to the company, the model draws on an understanding of gravity, fluid dynamics, and kinetic energy, along with knowledge of history, science, and cultural context, to generate more realistic and meaningful video outputs. The company highlighted examples including chain-reaction animations, educational explainers about protein folding, and stylized alphabet-themed video sequences.

The platform also supports multimodal inputs, allowing creators to reference images, video clips, text prompts, and audio together to produce cohesive outputs. Initially, audio support will focus on voice references, with broader audio input support planned later.

Google additionally introduced avatar functionality that allows users to create digital versions of themselves using their own voice and likeness for video generation. The company noted that broader audio and speech editing capabilities are still being evaluated for responsible deployment.

As part of its AI safety initiatives, Google said all videos generated with Gemini Omni include SynthID digital watermarks. Users can check whether content is AI-generated through the Gemini app, Gemini in Chrome, and Google Search.

Gemini Omni Flash is being made available globally to Google AI Plus, Pro, and Ultra subscribers through the Gemini app and Google Flow. Google also said the model will be available at no cost to users on YouTube Shorts and the YouTube Create App beginning this week.

KEY QUOTES:

“We’re introducing Gemini Omni, where Gemini’s ability to reason meets the ability to create. Omni is our new model that can create anything from any input, starting with video.”

“Gemini Omni doesn’t just build scenes that look real, it reasons about what should happen next. It combines an intuitive understanding of physics with Gemini’s knowledge of history, science and cultural context, bridging the gap from photorealism to meaningful storytelling.”

Koray Kavukcuoglu, CTO, Google DeepMind and Chief AI Architect, Google