Chapter 13: Vertex AI (Google's Veo 3) Integration for Text-to-Video¶
Video: Watch this chapter on YouTube (2:43:10)
Overview¶
This chapter demonstrates how to access Google's Veo 3 video generation model directly through Vertex AI on Google Cloud Platform, as an alternative to third-party platforms like WaveSpeed. Calling Vertex AI directly sends requests straight to Google's infrastructure and gives full control over the API calls.
Detailed Summary¶
Why Direct Vertex AI Access?¶
While platforms like WaveSpeed AI provide convenient unified APIs, direct Vertex AI access offers:
- Direct Google infrastructure: No intermediary
- Full API control: All parameters available
- Potentially lower costs: No platform markup (though still expensive)
- GCS integration: Output directly to Google Cloud Storage
Prerequisites¶
- Google Cloud Platform account
- Billing enabled
- Vertex AI API enabled
- OAuth2 credentials configured
Workflow Architecture¶
Step 1: Manual Trigger¶
Start with a simple Manual Trigger for testing the API connection.
Step 2: Understanding the API Endpoint¶
From the Vertex AI documentation (cloud.google.com/vertex-ai), the endpoint format is:
https://us-central1-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/us-central1/publishers/google/models/{MODEL_ID}:predictLongRunning
Parameters to Configure¶
- PROJECT_ID: Your Google Cloud project ID (not project number)
- MODEL_ID: The Veo model version, e.g., veo-3.0-fast-generate-001 (check the Vertex AI model list for the current ID)
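As a rough illustration of how the two parameters slot into the endpoint format, the sketch below assembles the URL from its parts. The function name `build_model_url` is hypothetical, and the method suffix after the colon is passed in rather than hard-coded; the value used in the example call is the one the Vertex AI Veo reference appears to document, which is an assumption here.

```python
from urllib.parse import quote


def build_model_url(project_id: str, model_id: str, method: str,
                    location: str = "us-central1") -> str:
    """Assemble a Vertex AI publisher-model URL (hypothetical helper)."""
    if not project_id or not model_id or not method:
        raise ValueError("project_id, model_id and method are all required")
    # The region appears twice: once in the host name, once in the path.
    host = f"https://{location}-aiplatform.googleapis.com"
    path = (
        f"/v1/projects/{quote(project_id, safe='')}"
        f"/locations/{quote(location, safe='')}"
        f"/publishers/google/models/{quote(model_id, safe='')}"
    )
    return f"{host}{path}:{method}"


# Example call with placeholder values; the method suffix is an assumption.
example_url = build_model_url("my-project-id", "my-model-id", "predictLongRunning")
```

Keeping the project ID and model ID as arguments makes it harder to paste a project number where the project ID belongs, which is the mistake the parameter list warns about.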
Step 3: OAuth2 Authentication Setup¶
Vertex AI requires Google OAuth2, not simple API keys.
Configuring Credentials in n8n¶
- HTTP Request node → Authentication
- Select Predefined Credential Type
- Choose Google OAuth2 API
Creating OAuth2 Credential¶
- Create new credential
- Need from Google Cloud Console:
- Client ID
- Client Secret
- Important: Add the scope for cloud platform access: https://www.googleapis.com/auth/cloud-platform
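For orientation, the scope ends up as a query parameter on the Google authorization URL that opens during sign-in. The sketch below shows roughly how such a URL is put together, assuming the standard Google OAuth2 authorization endpoint; the function names, client ID, and redirect URI are placeholders, and n8n builds the real URL itself from the credential fields.

```python
from urllib.parse import parse_qs, urlencode, urlparse

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_authorization_url(client_id: str, redirect_uri: str,
                            scopes=(CLOUD_PLATFORM_SCOPE,)) -> str:
    """Build an OAuth2 authorization-code URL (illustrative helper)."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",      # authorization-code flow
        "scope": " ".join(scopes),    # multiple scopes are space-separated
        "access_type": "offline",     # ask for a refresh token as well
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


def requested_scopes(url: str) -> list:
    """Return the scopes encoded in an authorization URL."""
    query = parse_qs(urlparse(url).query)
    return query.get("scope", [""])[0].split()
```

If the cloud-platform scope is missing from this list, the sign-in can still succeed but the resulting token will be rejected by Vertex AI, which is why the scope is worth checking before anything else.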
Enabling Vertex AI API¶
- Go to Google Cloud Console
- Search "Vertex AI API"
- Click Enable
Complete OAuth Flow¶
- Click "Sign in with Google"
- Select account
- Grant permissions
- See "Connection successful"
Step 4: Configure POST Request¶
URL Configuration¶
Full URL, with YOUR_PROJECT_ID replaced by your project ID:
https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/veo-3.0-fast-generate-001:predictLongRunning
JSON Body¶
Use raw JSON body format:
{
"instances": [
{
"prompt": "A serene walk down the beach at sunset, gentle waves lapping the shore"
}
],
"parameters": {
"aspectRatio": "16:9",
"durationSeconds": 8,
"sampleCount": 1
}
}
Notes:
- durationSeconds must be 8 (Veo 3 requirement)
- Omit the Cloud Storage output URI (storageUri in the Vertex AI Veo reference) to receive the video as Base64 in the response instead of having it written to GCS
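The JSON body above can also be generated programmatically, which makes the duration rule easy to enforce before a paid request is sent. This is a minimal sketch that mirrors the fields shown in this step; the function name `build_generate_body` is hypothetical, and the storage parameter is deliberately left out so the response would carry Base64 data.

```python
import json


def build_generate_body(prompt: str, aspect_ratio: str = "16:9",
                        duration_seconds: int = 8, sample_count: int = 1) -> dict:
    """Return the request body in the instances/parameters shape used above."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if duration_seconds != 8:
        # Mirrors the note above: this chapter's setup only accepts 8 seconds.
        raise ValueError("durationSeconds must be 8")
    return {
        "instances": [{"prompt": prompt}],
        "parameters": {
            "aspectRatio": aspect_ratio,
            "durationSeconds": duration_seconds,
            "sampleCount": sample_count,
        },
    }


# Serialized form, ready to paste into a raw JSON body field.
body_json = json.dumps(build_generate_body("A serene walk down the beach at sunset"), indent=2)
```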
Step 5: Wait Node¶
- Add Wait node
- Set 15 seconds
- Video generation takes time
Step 6: Poll for Results¶
Understanding the Response¶
The POST returns an operation name that must be polled.
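A small sketch of pulling that operation name out of the raw response text follows. It assumes the response is a Google long-running operation object whose identifier sits in a top-level `name` field; the function name is hypothetical.

```python
import json


def operation_name_from_response(raw_body: str) -> str:
    """Extract the operation name from the generation response (assumed shape)."""
    payload = json.loads(raw_body)
    name = payload.get("name") if isinstance(payload, dict) else None
    if not isinstance(name, str) or not name:
        raise ValueError("response does not contain an operation name")
    return name
```

In n8n the equivalent is an expression that reads the `name` field from the previous node's JSON output.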
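Polling Request Configuration¶
- Add HTTP Request node
- Method: POST (Vertex AI polls this operation with a POST, not a GET)
- URL: The same model URL as in Step 4, ending in :fetchPredictOperation instead of the generation method
- Body: JSON with the operation name from the POST response in an operationName field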
Same OAuth2 Authentication¶
Use the same Google OAuth2 credential.
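To make the polling call concrete, the sketch below builds (but does not send) the request. Several details are assumptions taken from the Vertex AI Veo reference rather than from this chapter: the `:fetchPredictOperation` suffix, the `operationName` body field, and the operation name containing the model path followed by `/operations/`. The function name and the bearer-token argument are illustrative; in n8n the OAuth2 credential supplies the token.

```python
import json
import urllib.request


def build_poll_request(operation_name: str, access_token: str,
                       location: str = "us-central1") -> urllib.request.Request:
    """Build the polling request for a video operation (assumed API shape)."""
    # Assumed name format: "<model path>/operations/<operation id>".
    model_path, separator, operation_id = operation_name.partition("/operations/")
    if not separator or not operation_id:
        raise ValueError("unexpected operation name format")
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"{model_path}:fetchPredictOperation")
    body = json.dumps({"operationName": operation_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # polling is a POST with a JSON body, not a GET
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; that step is omitted here because it needs a valid token.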
Step 7: If Loop for Status Check¶
- Add If node
- Condition type: Boolean
- Check: done equals true
- True branch: Continue to processing
- False branch: Wait 15s → Loop back to poll
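The Wait, poll, and If nodes together form a loop that can be sketched in a few lines. This is an illustration of the pattern rather than of n8n internals: `fetch_status` is a hypothetical callable standing in for the polling HTTP request, and the attempt cap is an added safeguard that the workflow described here does not have.

```python
import time
from typing import Callable


def wait_until_done(fetch_status: Callable[[], dict],
                    interval_seconds: float = 15.0,
                    max_attempts: int = 40,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the operation reports done == true, then return it."""
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        if status.get("done") is True:      # the If node's Boolean check
            if "error" in status:
                # A finished operation carries either a result or an error.
                raise RuntimeError(f"video generation failed: {status['error']}")
            return status                   # True branch: continue to processing
        if attempt < max_attempts:
            sleep(interval_seconds)         # False branch: wait, then poll again
    raise TimeoutError(f"operation not done after {max_attempts} polls")
```

An absent `done` field is treated the same as `false`, since an operation that is still running may not include the field at all.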
Step 8: Handling Base64 Response¶
Without a GCS output location, Vertex AI returns the video as Base64 in the response.
Extract to Field¶
- Add Set/Edit Fields node
- Create field b64
- Value: The Base64 video data from response
Convert Base64 to File¶
- Add Convert node
- Operation: Move Base64 to file
- Input: The b64 field
- Output: Binary file data
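Outside n8n, the same two steps (pick the Base64 field, turn it into a file) look roughly like the sketch below. The response path `response.videos[0].bytesBase64Encoded` is an assumption based on the Vertex AI Veo reference and should be checked against an actual response; the function names are hypothetical.

```python
import base64
from pathlib import Path


def extract_video_bytes(operation: dict, index: int = 0) -> bytes:
    """Decode the inline Base64 video from a finished operation (assumed shape)."""
    videos = operation.get("response", {}).get("videos", [])
    if index >= len(videos):
        raise ValueError("no video at that index in the response")
    encoded = videos[index].get("bytesBase64Encoded")
    if not encoded:
        # Typically the case when output was written to Cloud Storage instead.
        raise ValueError("response has no inline Base64 data")
    # validate=True rejects stray characters instead of silently skipping them.
    return base64.b64decode(encoded, validate=True)


def save_video(operation: dict, path: str) -> int:
    """Write the decoded video to disk and return the number of bytes written."""
    return Path(path).write_bytes(extract_video_bytes(operation))
```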
Step 9: Download/Deliver¶
Options for the final output:
- Download directly: For testing
- Upload to Drive: Store for later use
- Send via Telegram/Slack: Direct delivery
- Email with attachment: Gmail with binary attachment
GCS Alternative¶
For production, using Google Cloud Storage is recommended:
- Create GCS bucket
- Include the bucket's gs:// URI as the storage URI parameter in the API request
- Video saves directly to GCS
- Retrieve via GCS API or public URL
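For the retrieval step, a gs:// URI has to be split into bucket and object name before it can be used with the GCS API or turned into a URL. The sketch below does that and derives the `storage.googleapis.com` form of the address; the function names and example paths are illustrative, and the resulting URL only works without authentication if the object has been made publicly readable.

```python
from urllib.parse import quote


def split_gcs_uri(uri: str) -> tuple:
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError("expected a gs:// URI")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError("URI must contain both a bucket and an object name")
    return bucket, object_name


def public_object_url(uri: str) -> str:
    """Return the HTTPS address of an object, assuming it is publicly readable."""
    bucket, object_name = split_gcs_uri(uri)
    # quote() keeps '/' intact by default, so nested object paths stay readable.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"
```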
Cost Comparison: Direct vs WaveSpeed¶
| Factor | Vertex AI Direct | WaveSpeed |
|---|---|---|
| Setup complexity | Higher | Lower |
| Control | Full | Limited |
| Multi-model access | Google only | Many providers |
| Pricing | Google rates | Platform markup |
| Integration ease | OAuth required | API key |
When to Use Each Approach¶
Use Vertex AI Direct when:
- Already in Google Cloud ecosystem
- Need maximum control
- Want GCS integration
- Building enterprise solutions
Use WaveSpeed when:
- Need multiple model providers
- Want simple API key auth
- Rapid prototyping
- Cost comparison shopping
Key Takeaways¶
- Direct Vertex AI access is possible: n8n can call Google's APIs directly.
- OAuth2 is required: Google Cloud needs OAuth authentication rather than a simple API key.
- Scope configuration is critical: The cloud-platform scope must be added.
- Base64 without GCS: Remove the storage URI to receive the video as Base64 data.
- 8-second duration required: Veo 3 text-to-video requires a durationSeconds of 8.
- Polling pattern still applies: The request returns an operation name, which is polled until completion.
- Boolean status check: Vertex AI uses done: true instead of "completed".
- Conversion needed for Base64: Additional nodes are required to handle Base64 output.
- GCS simplifies production: Direct storage avoids Base64 handling.
- Platform choice depends on use case: Direct access vs unified platform is a tradeoff.
Conclusion¶
Direct Vertex AI integration demonstrates n8n's flexibility in connecting to any API, even complex OAuth-authenticated Google Cloud services. While this approach requires more setup than using a platform like WaveSpeed, it provides complete control and native Google Cloud integration. The pattern of OAuth authentication, async polling, and result handling applies to many Google Cloud AI services beyond video generation. For organizations already invested in Google Cloud, this direct approach may be preferable; for others, the simplicity of unified platforms justifies any additional cost. Both approaches produce the same high-quality Veo 3 output; the choice depends on infrastructure preferences and integration requirements.