Introduction

Learn how to generate videos with audio in your app. Veo Video Generation uses Google Veo on Vertex AI to create videos from text or image inputs. It supports synchronized audio, multiple aspect ratios (vertical, square, landscape), and short clips of up to about 8 seconds, which suits it to marketing, product, and social media content.

What You Can Build

With Veo Video Generation, your app can support:
  • Text-to-Video with Audio – Generate videos with synchronized sound from user prompts.
  • Image-to-Video Creation – Animate images into dynamic video clips.
  • Product Demo Generators – Create short product videos from descriptions.
  • Marketing Video Tools – Generate engaging content for ads and campaigns.
  • Social Media Video Creators – Help users produce short-form video content quickly.

How It Works

When Veo Video Generation is enabled, your app sends user input to Google Veo via Vertex AI, which generates video clips with synchronized visuals and audio. Your app can:
  • accept text descriptions or image uploads
  • generate videos with built-in audio
  • support multiple aspect ratios (vertical, square, landscape)
  • produce short clips (up to ~8 seconds)
  • display or export generated videos
This allows users to create polished, audio-enabled video content directly within your app.
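
The capabilities listed above can be sketched as a request shape that your app checks before sending anything to Veo. This is a minimal sketch, not the actual Veo or Vertex AI API: `VideoRequest`, `validate_request`, the field names, and the hard 8-second cap are all hypothetical and only mirror the constraints described in this section.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: these labels mirror the orientations named in this guide,
# not the parameter values of any real API.
ASPECT_RATIOS = {"vertical", "square", "landscape"}
# Assumption: the guide says "up to ~8 seconds"; treated here as a hard cap.
MAX_DURATION_SECONDS = 8


@dataclass
class VideoRequest:
    """Hypothetical request shape for one generated clip."""
    prompt: Optional[str] = None        # text description of the scene
    image_path: Optional[str] = None    # uploaded image to animate
    aspect_ratio: str = "landscape"
    duration_seconds: int = 8
    with_audio: bool = True


def validate_request(req: VideoRequest) -> List[str]:
    """Return a list of human-readable problems; empty means the request is usable."""
    errors: List[str] = []
    has_prompt = bool(req.prompt and req.prompt.strip())
    if not has_prompt and not req.image_path:
        errors.append("Provide a text description or an image.")
    if req.aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"Unsupported aspect ratio: {req.aspect_ratio}")
    if not 1 <= req.duration_seconds <= MAX_DURATION_SECONDS:
        errors.append(f"Duration must be 1-{MAX_DURATION_SECONDS} seconds.")
    return errors
```

Checking input on the app side lets you show users a clear message before a generation request is made, which matters because each generation takes time.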

Example Prompts

You can use prompts like these to implement features:
  • Add a video generator with audio – Add a video generator to my app where users describe a scene and get a short video with audio back using Google Veo.
  • Add product demo videos – Add product demo videos to my app that are generated from text descriptions using Veo on Vertex AI.

Common Use Cases

Veo Video Generation is commonly used for:
  • marketing and promotional videos
  • product demos and showcases
  • social media content creation
  • storytelling and creative apps
  • short-form video generation tools

Best Practices

To get the best results:
  • write descriptive prompts including motion and sound cues
  • choose aspect ratios based on target platforms
  • keep clips short so they generate faster
  • allow users to regenerate and refine outputs
  • provide preview and download options
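
The first two practices above can be sketched as small helpers that assemble a descriptive prompt and pick an aspect ratio for a target platform. This is an illustrative sketch only: `build_prompt`, `aspect_for_platform`, the "Motion:" and "Audio:" labels, and the platform-to-ratio mapping are assumptions made for this example, not part of Veo or Vertex AI.

```python
from typing import Optional

# Assumption: a hypothetical mapping from where the clip will be shown
# to the orientations named in this guide. Adjust to your own platforms.
PLATFORM_ASPECT = {
    "stories": "vertical",
    "feed": "square",
    "web": "landscape",
}


def build_prompt(scene: str, motion: Optional[str] = None,
                 sound: Optional[str] = None) -> str:
    """Combine a scene description with optional motion and sound cues."""
    parts = [scene.strip().rstrip(".")]
    if motion:
        parts.append("Motion: " + motion.strip().rstrip("."))
    if sound:
        parts.append("Audio: " + sound.strip().rstrip("."))
    return ". ".join(parts) + "."


def aspect_for_platform(platform: str, default: str = "landscape") -> str:
    """Look up an aspect ratio for a platform, falling back to a default."""
    return PLATFORM_ASPECT.get(platform.strip().lower(), default)


prompt = build_prompt(
    "A barista pours latte art",
    motion="slow push-in on the cup",
    sound="milk steaming, soft cafe chatter",
)
ratio = aspect_for_platform("Stories")
```

Keeping the scene, motion, and sound cues as separate inputs also makes regeneration easier, since users can change one cue and try again without rewriting the whole prompt.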