Deployment Guide
This guide walks through deploying the chat widget backend to Google Cloud Run using the modular v3.0 architecture with Google ADK (Agent Development Kit).
Prerequisites
- Google Cloud account with billing enabled
- gcloud CLI installed and configured
- Gemini API key from Google AI Studio
- Basic Python knowledge
Architecture Overview
+-------------------------------------------------------------+
| Your Static Site (GitHub Pages, Netlify, etc.) |
| -- Chat Widget JavaScript |
+-------------------------------------------------------------+
|
| HTTPS POST /
v
+-------------------------------------------------------------+
| Cloud Run (agent-chat-proxy) |
| +-- Flask application with rate limiting |
| +-- Google ADK multi-agent orchestration |
| +-- Session memory for conversation continuity |
| +-- Security: CORS, prompt injection detection |
| | |
| | Delegates to sub-agents |
| v |
| +-------------------------------------------------------+ |
| | Agent Registry | |
| | +-- site_about - General site questions | |
| | +-- who_is_brandon - About the author | |
| | +-- page_context - Page-specific help | |
| | +-- project_info - Project details | |
| | +-- resume_skills - Career/skills info | |
| | +-- docs_navigation - Site navigation | |
| | +-- learning_coach - Algorithm learning | |
| +-------------------------------------------------------+ |
| | |
| | Each agent has: |
| | - Google Search sub-agent |
| | - URL Context sub-agent |
+-------------------------------------------------------------+
|
| API request
v
+-------------------------------------------------------------+
| Google AI (Gemini API via ADK) |
| -- Gemini 2.5 Flash model |
+-------------------------------------------------------------+
Project Structure
The v3.0 architecture uses a modular pattern for maintainability:
agent-chat-proxy/
+-- app/ # Flask application
| +-- __init__.py
| +-- main.py # Entry point, app factory
| +-- routes/ # HTTP endpoints
| | +-- chat.py # Main chat endpoint
| | +-- health.py # Health checks
| +-- security/ # Security layer
| | +-- cors.py # CORS configuration
| | +-- injection.py # Prompt injection detection
| | +-- rate_limit.py # Rate limiting
| | +-- safety.py # ADK safety settings
| +-- session/ # Session management
| +-- memory.py # Conversation memory
|
+-- agents/ # Agent definitions
| +-- __init__.py # Agent registry exports
| +-- base.py # Base agent class
| +-- registry.py # Agent registration
| +-- sub_agents/ # Individual agents
| +-- site_about.py
| +-- who_is_brandon.py
| +-- ...
|
+-- prompts/ # Static prompt files
| +-- root.txt # Root orchestrator
| +-- site_about.txt
| +-- ...
|
+-- config/ # Configuration
| +-- settings.py # Environment settings
| +-- models.py # Model configurations
|
+-- Dockerfile # Container config (uses uv)
+-- requirements.txt
+-- deploy.sh # Deployment script
+-- pyproject.toml
Step 1: Create the Backend
Option A: Clone the Template
# Clone the reference implementation
git clone https://github.com/BA-CalderonMorales/my-life-as-a-dev
cd my-life-as-a-dev/cloud/agent-chat-proxy
# Copy to your own directory
cp -r . ~/my-chat-proxy
cd ~/my-chat-proxy
Option B: Create from Scratch
Create the directory structure above. Key files:
config/settings.py
"""Application settings loaded from environment."""
import os
import pathlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Settings:
    gcp_project: str = field(
        default_factory=lambda: os.environ.get('GCP_PROJECT', 'your-project-id')
    )
    port: int = field(
        default_factory=lambda: int(os.environ.get('PORT', 8080))
    )
    # CORS - add your domains here
    allowed_origins_static: List[str] = field(default_factory=lambda: [
        'https://your-domain.github.io',
        'http://localhost:8001',
        'http://localhost:8000',
    ])
    # Codespaces pattern for dynamic CORS
    codespaces_pattern: str = r'https://.*-800[01]\.app\.github\.dev$'

    @property
    def prompts_dir(self) -> str:
        # prompts/ sits next to config/ at the project root
        return str(pathlib.Path(__file__).parent.parent / 'prompts')

    def get_prompt(self, name: str) -> str:
        prompt_path = pathlib.Path(self.prompts_dir) / f'{name}.txt'
        return prompt_path.read_text().strip()


settings = Settings()
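The two CORS settings above suggest a two-stage origin check: an exact match against the static list, then a regex match for Codespaces URLs. The following is a minimal sketch of that idea, not the contents of `app/security/cors.py`; the helper name `is_origin_allowed` is hypothetical and the constants are copied from the `Settings` example.

```python
import re
from typing import Iterable, Optional

# Values copied from the Settings example above.
ALLOWED_ORIGINS_STATIC = [
    'https://your-domain.github.io',
    'http://localhost:8001',
    'http://localhost:8000',
]
CODESPACES_PATTERN = r'https://.*-800[01]\.app\.github\.dev$'


def is_origin_allowed(
    origin: Optional[str],
    static_origins: Iterable[str] = ALLOWED_ORIGINS_STATIC,
    pattern: str = CODESPACES_PATTERN,
) -> bool:
    """Hypothetical helper: True if the Origin header value is on the allow-list."""
    if not origin:
        return False
    # Stage 1: exact match against the static list.
    if origin in static_origins:
        return True
    # Stage 2: re.match anchors at the start; the pattern's trailing $ anchors the end.
    return re.match(pattern, origin) is not None


print(is_origin_allowed('https://your-domain.github.io'))
print(is_origin_allowed('https://fuzzy-space-8000.app.github.dev'))
print(is_origin_allowed('https://evil.example.com'))
```

A request whose origin fails both stages is the likely source of the 403 Forbidden case listed under Troubleshooting.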
agents/base.py
"""Base agent class for all sub-agents."""
from abc import ABC, abstractmethod

from google.adk.agents import LlmAgent
from google.adk.tools import agent_tool
from google.adk.tools.google_search_tool import GoogleSearchTool
from google.adk.tools import url_context

from config.settings import settings


def create_search_agent(name_prefix: str) -> LlmAgent:
    return LlmAgent(
        name=f'{name_prefix}_google_search_agent',
        model='gemini-2.5-flash',
        description='Agent specialized in performing Google searches.',
        instruction='Use GoogleSearchTool to find information.',
        tools=[GoogleSearchTool()],
    )


def create_url_context_agent(name_prefix: str) -> LlmAgent:
    return LlmAgent(
        name=f'{name_prefix}_url_context_agent',
        model='gemini-2.5-flash',
        description='Agent specialized in fetching URL content.',
        instruction='Use UrlContextTool to retrieve content from URLs.',
        tools=[url_context],
    )


class BaseAgent(ABC):
    @property
    @abstractmethod
    def name(self) -> str: pass

    @property
    @abstractmethod
    def description(self) -> str: pass

    @property
    @abstractmethod
    def prompt_file(self) -> str: pass

    def get_instruction(self) -> str:
        return settings.get_prompt(self.prompt_file)

    def build(self) -> LlmAgent:
        # Each agent gets its own search and URL-context sub-agents, wrapped as tools
        return LlmAgent(
            name=self.name,
            model='gemini-2.5-flash',
            description=self.description,
            instruction=self.get_instruction(),
            tools=[
                agent_tool.AgentTool(agent=create_search_agent(self.name)),
                agent_tool.AgentTool(agent=create_url_context_agent(self.name)),
            ],
        )
Dockerfile (uses uv for fast builds)
FROM python:3.11-slim
WORKDIR /app
# Install uv for fast dependency installation
RUN apt-get update && apt-get install -y --no-install-recommends \
gcc curl \
&& curl -LsSf https://astral.sh/uv/install.sh | sh \
&& rm -rf /var/lib/apt/lists/*
ENV PATH="/root/.local/bin:$PATH"
COPY requirements.txt .
RUN uv pip install --system --no-cache -r requirements.txt
COPY . .
ENV PYTHONUNBUFFERED=1
ENV PORT=8080
EXPOSE 8080
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "--workers", "4", "--timeout", "120", "app.main:app"]
requirements.txt
flask>=3.0.0
gunicorn>=21.0.0
google-adk>=0.3.0
google-genai>=0.5.0
google-cloud-secret-manager>=2.20.0
python-dotenv>=1.0.0
Step 2: Configure Your Agents
Create Prompt Files
Each agent needs a prompt file in prompts/. Example:
# prompts/site_about.txt
You help visitors understand what this site is about.
This is [Your Name]'s personal documentation hub.
Key themes:
- [Theme 1]
- [Theme 2]
When someone asks what the site is about, share these highlights.
Keep it conversational and welcoming.
Create Agent Modules
# agents/sub_agents/site_about.py
from agents.base import BaseAgent


class SiteAboutAgent(BaseAgent):
    @property
    def name(self) -> str:
        return 'site_about'

    @property
    def description(self) -> str:
        return 'Handles questions about what this site is.'

    @property
    def prompt_file(self) -> str:
        return 'site_about'
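The value returned by `prompt_file` has to match a `<name>.txt` file in `prompts/`, otherwise loading the prompt fails when the agent is built. A minimal sketch of a pre-deploy check is shown below; the function name `missing_prompts` is hypothetical, and the demo runs against a throwaway directory rather than the real project.

```python
import pathlib
import tempfile
from typing import Iterable, List


def missing_prompts(prompts_dir: pathlib.Path, prompt_names: Iterable[str]) -> List[str]:
    """Return the prompt names that have no matching <name>.txt file."""
    return [
        name for name in prompt_names
        if not (prompts_dir / f'{name}.txt').is_file()
    ]


# Demo against a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = pathlib.Path(tmp)
    (demo_dir / 'site_about.txt').write_text('You help visitors...')
    result = missing_prompts(demo_dir, ['site_about', 'who_is_brandon'])
    print(result)  # ['who_is_brandon']
```

In a real project the second argument could be collected from each agent's `prompt_file` property, and a non-empty result could abort the deploy.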
Register in Registry
# agents/registry.py
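from agents.sub_agents import SiteAboutAgent, YourOtherAgent


class AgentRegistry:
    def __init__(self):
        # Assumed initializer: maps agent name -> agent instance
        self._agents = {}
        self._load_agents()

    def _load_agents(self):
        agent_classes = [SiteAboutAgent, YourOtherAgent]
        for cls in agent_classes:
            agent = cls()
            self._agents[agent.name] = agent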
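Because the registry is keyed by `agent.name`, two agents that return the same name would silently overwrite each other. The sketch below shows one possible guard against that. It is library-free and uses a hypothetical `StubAgent` in place of real `BaseAgent` subclasses; the `get` and `names` methods are illustrative additions, not part of the reference implementation.

```python
from typing import Dict, List


class StubAgent:
    """Stand-in for a BaseAgent subclass; only the name matters here."""

    def __init__(self, name: str) -> None:
        self.name = name


class AgentRegistry:
    """Illustrative registry that rejects duplicate agent names."""

    def __init__(self, agents: List[StubAgent]) -> None:
        self._agents: Dict[str, StubAgent] = {}
        for agent in agents:
            if agent.name in self._agents:
                raise ValueError(f'duplicate agent name: {agent.name}')
            self._agents[agent.name] = agent

    def get(self, name: str) -> StubAgent:
        return self._agents[name]

    def names(self) -> List[str]:
        return sorted(self._agents)


registry = AgentRegistry([StubAgent('site_about'), StubAgent('project_info')])
print(registry.names())  # ['project_info', 'site_about']
```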
Step 3: Set Up Google Cloud
Enable APIs
# Set your project
gcloud config set project YOUR_PROJECT_ID
# Enable required APIs
gcloud services enable \
run.googleapis.com \
secretmanager.googleapis.com \
cloudbuild.googleapis.com
Create Secret for API Key
# Create secret with your Gemini API key
echo -n "YOUR_GEMINI_API_KEY" | \
gcloud secrets create gemini-api-key --data-file=-
# Grant Cloud Run access
PROJECT_NUMBER=$(gcloud projects describe $(gcloud config get-value project) \
--format='value(projectNumber)')
gcloud secrets add-iam-policy-binding gemini-api-key \
--member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
Step 4: Deploy
Using the Deploy Script
# Make executable
chmod +x deploy.sh
# Dry run first
./deploy.sh --dry-run
# Full deploy
./deploy.sh
Manual Deploy
gcloud run deploy agent-chat-proxy \
--source . \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--memory 1Gi \
--cpu 1 \
--timeout 120s \
--min-instances 0 \
--max-instances 10 \
--set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
--set-env-vars="GCP_PROJECT=YOUR_PROJECT_ID"
Step 5: Verify Deployment
# Get service URL
SERVICE_URL=$(gcloud run services describe agent-chat-proxy \
--region=us-central1 \
--format='value(status.url)')
# Test health
curl $SERVICE_URL/health
# Expected: {"status":"healthy","version":"3.0.0-adk"}
# Test chat
curl -X POST $SERVICE_URL/ \
-H 'Content-Type: application/json' \
-H 'Origin: https://your-domain.github.io' \
-d '{"question":"What is this site about?"}'
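The same chat request can be sent from Python using only the standard library, which is convenient for a scripted smoke test. This is a sketch under assumptions: the function names are hypothetical, and the reply is assumed to be JSON because the response shape is not specified in this guide.

```python
import json
import urllib.request


def build_chat_request(service_url: str, origin: str, question: str) -> urllib.request.Request:
    """Build the same POST that the curl command above sends."""
    body = json.dumps({'question': question}).encode('utf-8')
    return urllib.request.Request(
        url=service_url.rstrip('/') + '/',
        data=body,
        headers={'Content-Type': 'application/json', 'Origin': origin},
        method='POST',
    )


def ask(service_url: str, origin: str, question: str, timeout: float = 120.0):
    """Send the request and decode the reply, assuming it is JSON."""
    request = build_chat_request(service_url, origin, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))
```

Calling `ask(SERVICE_URL, 'https://your-domain.github.io', 'What is this site about?')` performs the network call; the 120-second default mirrors the Cloud Run timeout used in the deploy command.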
Security Features
The v3.0 architecture includes multiple security layers:
| Feature | Description |
|---|---|
| CORS | Only approved origins can make requests |
| Rate Limiting | 30 req/min per IP, 1000 req/min global |
| Burst Protection | Max 10 requests in 5 seconds |
| Prompt Injection | Blocks manipulation attempts |
| Safety Settings | Google ADK harm category filters |
| DDoS Protection | Cloud Run infrastructure-level |
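To make the per-IP and burst limits in the table concrete, here is a minimal sliding-window sketch using the quoted numbers (30 requests per minute, at most 10 requests in 5 seconds). It is an illustration of the technique, not the contents of `rate_limit.py`; the class name and the `now` parameter (added so the behavior can be checked without sleeping) are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Per-client sliding-window limiter with a separate burst window."""

    def __init__(self, per_minute: int = 30, burst: int = 10, burst_seconds: float = 5.0) -> None:
        self.per_minute = per_minute
        self.burst = burst
        self.burst_seconds = burst_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have left the 60-second window.
        while hits and now - hits[0] >= 60.0:
            hits.popleft()
        # Count requests inside the shorter burst window.
        recent = sum(1 for t in hits if now - t < self.burst_seconds)
        if len(hits) >= self.per_minute or recent >= self.burst:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
burst_results = [limiter.allow('203.0.113.7', now=0.0) for _ in range(11)]
print(burst_results.count(True))  # 10
```

Counters like these live in process memory, so with several gunicorn workers or several Cloud Run instances each process would track its own counts; a shared store would be needed for strict enforcement.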
Rate Limit Headers
Responses include rate limit information:
Adding New Agents
- Create prompt: prompts/my_agent.txt
- Create module: agents/sub_agents/my_agent.py
- Register in agents/sub_agents/__init__.py
- Add to agents/registry.py
- Deploy: ./deploy.sh
See Update Agent Flows for details.
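The first two steps above are mechanical, so they can be scripted. The helper below is a hypothetical convenience, not part of the reference repository: it writes a placeholder prompt file and a module stub that follows the `SiteAboutAgent` pattern shown earlier. Registration in `agents/sub_agents/__init__.py` and `agents/registry.py` is still done by hand.

```python
import pathlib
import tempfile

MODULE_TEMPLATE = '''from agents.base import BaseAgent


class {class_name}(BaseAgent):
    @property
    def name(self) -> str:
        return {name!r}

    @property
    def description(self) -> str:
        return {description!r}

    @property
    def prompt_file(self) -> str:
        return {name!r}
'''


def scaffold_agent(root: pathlib.Path, name: str, description: str) -> str:
    """Create prompts/<name>.txt and agents/sub_agents/<name>.py; return the class name."""
    base = ''.join(part.capitalize() for part in name.split('_'))
    class_name = base if base.endswith('Agent') else base + 'Agent'
    prompt_path = root / 'prompts' / f'{name}.txt'
    module_path = root / 'agents' / 'sub_agents' / f'{name}.py'
    for path in (prompt_path, module_path):
        path.parent.mkdir(parents=True, exist_ok=True)
        if path.exists():
            # Refuse to overwrite an existing agent.
            raise FileExistsError(path)
    prompt_path.write_text(f'TODO: write the instruction for {name}.\n')
    module_path.write_text(
        MODULE_TEMPLATE.format(class_name=class_name, name=name, description=description)
    )
    return class_name


# Demo in a temporary directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_agent(pathlib.Path(tmp), 'my_agent', 'Handles my questions.')
    print(created)  # MyAgent
```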
Cost Optimization
| Setting | Value | Purpose |
|---|---|---|
| min-instances | 0 | Scale to zero when idle |
| max-instances | 10 | Prevent runaway costs |
| memory | 1Gi | Sufficient for ADK agents |
| CPU | 1 | Good balance |
| timeout | 120s | Allow complex agent chains |
Estimated cost: the Cloud Run free tier covers about 2 million requests per month. Gemini API usage is billed separately.
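A quick back-of-the-envelope calculation puts the request figure in context. It uses only numbers quoted in this guide (about 2 million free requests per month and the 1000 req/min global rate limit) and assumes a 30-day month. It ignores CPU and memory time, which Cloud Run also bills.

```python
FREE_REQUESTS_PER_MONTH = 2_000_000   # figure quoted above
GLOBAL_LIMIT_PER_MINUTE = 1000        # global rate limit from Security Features
MINUTES_PER_MONTH = 60 * 24 * 30      # assumes a 30-day month

# Average sustained rate that stays within the free request allowance.
free_avg_per_minute = FREE_REQUESTS_PER_MONTH / MINUTES_PER_MONTH

# Upper bound on monthly requests if the global limit were hit every minute.
max_requests_per_month = GLOBAL_LIMIT_PER_MINUTE * MINUTES_PER_MONTH

print(round(free_avg_per_minute, 1))  # 46.3
print(max_requests_per_month)         # 43200000
```

In other words, the free request allowance corresponds to roughly 46 requests per minute sustained around the clock, well below what the global rate limit would permit at saturation.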
Troubleshooting
| Issue | Solution |
|---|---|
| 403 Forbidden | Add origin to allowed_origins_static |
| Prompt file not found | Check filename matches prompt_file property |
| Rate limit exceeded | Wait or adjust limits in rate_limit.py |
| Cold start slow | Consider min-instances=1 |
| ADK import error | Verify google-adk in requirements.txt |
View Logs
gcloud logging read \
"resource.type=cloud_run_revision AND resource.labels.service_name=agent-chat-proxy" \
--limit 20 \
--project=YOUR_PROJECT_ID