Deployment Guide

This guide walks through deploying the chat widget backend to Google Cloud Run using a modular multi-agent architecture.

Prerequisites

Google Cloud account with billing enabled
gcloud CLI installed and configured
Gemini API key from Google AI Studio
Go 1.22+ (for local development)

Architecture Overview

+-------------------------------------------------------------+
|  Your Static Site (GitHub Pages, Netlify, etc.)             |
|  -- Chat Widget JavaScript                                  |
+-------------------------------------------------------------+
                              |
                              | HTTPS POST /
                              v
+-------------------------------------------------------------+
|  Cloud Run (agent-chat-proxy)                               |
|  +-- Go HTTP server with rate limiting                      |
|  +-- Multi-agent orchestration via function calling         |
|  +-- Session memory for conversation continuity             |
|  +-- Security: CORS, prompt injection detection             |
|                             |                               |
|                             | Delegates to sub-agents       |
|                             v                               |
|  +-------------------------------------------------------+  |
|  |  Agent Registry                                       |  |
|  |  +-- site_about          - General site questions     |  |
|  |  +-- who_is_brandon      - About the author           |  |
|  |  +-- page_context        - Page-specific help         |  |
|  |  +-- project_info        - Project details            |  |
|  |  +-- resume_skills       - Career/skills info         |  |
|  |  +-- docs_navigation     - Site navigation            |  |
|  |  +-- learning_coach      - Algorithm learning         |  |
|  +-------------------------------------------------------+  |
+-------------------------------------------------------------+
                              |
                              | API request
                              v
+-------------------------------------------------------------+
|  Google AI (Gemini API)                                     |
|  -- Gemini 2.0 Flash model                                  |
+-------------------------------------------------------------+

Project Structure

The backend uses a modular Go architecture for maintainability:

agent-chat-proxy/
+-- cmd/                          # Application entrypoints
|   +-- server/
|       +-- main.go              # Entry point
|
+-- internal/                     # Private application code
|   +-- api/                     # HTTP handlers
|   |   +-- chat.go              # Main chat endpoint
|   |   +-- health.go            # Health checks
|   +-- adk/                     # AI/Agent orchestration
|   |   +-- client.go            # Gemini client setup
|   |   +-- orchestrator.go      # Multi-agent routing
|   |   +-- executor.go          # Sub-agent execution
|   +-- agents/                  # Agent definitions
|   |   +-- registry.go          # Agent registration
|   |   +-- types.go             # Agent types
|   +-- prompts/                 # Static prompt files
|   |   +-- root.txt             # Root orchestrator
|   |   +-- site_about.txt
|   |   +-- ...
|   +-- config/                  # Configuration
|   |   +-- config.go            # Environment settings
|   +-- security/                # Security layer
|   |   +-- cors.go              # CORS configuration
|   |   +-- injection.go         # Prompt injection detection
|   |   +-- rate_limit.go        # Rate limiting
|   +-- session/                 # Session management
|       +-- manager.go           # Conversation memory
|
+-- Dockerfile                    # Container config
+-- go.mod                        # Go module definition
+-- go.sum                        # Dependency checksums
+-- deploy.sh                     # Deployment script

Step 1: Create the Backend

Option A: Use Your Own Implementation

Create a Go HTTP service that:

Accepts POST requests with {"question": "...", "session_id": "..."}
Routes to appropriate sub-agents via function calling
Returns {"answer": "...", "session_id": "..."}
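A minimal sketch of that request/response contract, using only the standard library. The type names, the blank-question check, the 64 KiB body cap, and the answerQuestion placeholder are assumptions made for illustration; the real service would route to sub-agents at that point.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ChatRequest mirrors the request body described above.
type ChatRequest struct {
	Question  string `json:"question"`
	SessionID string `json:"session_id"`
}

// ChatResponse mirrors the response body described above.
type ChatResponse struct {
	Answer    string `json:"answer"`
	SessionID string `json:"session_id"`
}

// parseChatRequest decodes a request body and rejects blank questions.
// The blank-question rule is an assumption, not something the guide specifies.
func parseChatRequest(body []byte) (ChatRequest, error) {
	var req ChatRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return ChatRequest{}, fmt.Errorf("invalid JSON: %w", err)
	}
	if strings.TrimSpace(req.Question) == "" {
		return ChatRequest{}, errors.New("question is required")
	}
	return req, nil
}

// answerQuestion is a hypothetical placeholder for the sub-agent routing step.
func answerQuestion(req ChatRequest) string {
	return "You asked: " + req.Question
}

// chatHandler accepts POST requests only and replies with a ChatResponse.
func chatHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	// Cap the body size so oversized payloads are rejected early (64 KiB is arbitrary).
	body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 64<<10))
	if err != nil {
		http.Error(w, "unreadable or oversized body", http.StatusBadRequest)
		return
	}
	req, err := parseChatRequest(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	resp := ChatResponse{Answer: answerQuestion(req), SessionID: req.SessionID}
	// The status line is already sent here, so an encode error cannot be reported to the client.
	_ = json.NewEncoder(w).Encode(resp)
}

func main() {
	// Exercise the handler in-process rather than opening a port.
	payload := `{"question":"What is this site about?","session_id":"abc123"}`
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	chatHandler(rec, req)
	fmt.Println(rec.Code, strings.TrimSpace(rec.Body.String()))
}
```

This sketch echoes the session ID it receives. Generating an ID when the client omits one is left out.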

Option B: Create from Scratch

Create the directory structure above. Key files:

internal/config/config.go

package config

import (
    "os"
    "time"
)

// Config holds all application configuration
type Config struct {
    Port             string
    Environment      string
    GoogleAPIKey     string
    CurrentModel     string
    MaxMessageLength int
    SessionTTL       time.Duration
    AllowedOrigins   []string
}

// Load creates a Config from environment variables
func Load() *Config {
    return &Config{
        Port:             getEnv("PORT", "8080"),
        Environment:      getEnv("ENV", "development"),
        GoogleAPIKey:     os.Getenv("GOOGLE_API_KEY"),
        CurrentModel:     getEnv("MODEL_ID", "gemini-2.0-flash"),
        MaxMessageLength: 2000,
        SessionTTL:       30 * time.Minute,
        AllowedOrigins: []string{
            "https://your-username.github.io",
            "http://localhost:8001",
        },
    }
}

func getEnv(key, fallback string) string {
    if value, ok := os.LookupEnv(key); ok {
        return value
    }
    return fallback
}
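Load as written does not check that required values are present, and because getEnv uses os.LookupEnv, a variable that is set but empty (for example MODEL_ID="") overrides the fallback. One possible guard is a validation step run at startup. The Validate method below is a hypothetical addition, not part of the config.go shown above, and the Config here is trimmed to the fields it checks.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config is a trimmed copy of the struct above, limited to the fields checked here.
type Config struct {
	Port         string
	GoogleAPIKey string
	CurrentModel string
}

// Validate reports the first missing required setting.
// Hypothetical helper: the original config.go does not define it.
func (c *Config) Validate() error {
	if c.GoogleAPIKey == "" {
		return errors.New("GOOGLE_API_KEY is not set")
	}
	if c.CurrentModel == "" {
		return errors.New("MODEL_ID resolved to an empty model name")
	}
	return nil
}

func main() {
	cfg := &Config{
		Port:         "8080",
		GoogleAPIKey: os.Getenv("GOOGLE_API_KEY"),
		CurrentModel: "gemini-2.0-flash",
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("config ok")
}
```

Failing at startup surfaces a missing secret in the Cloud Run deploy logs instead of on the first chat request.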

internal/agents/types.go

package agents

// Agent represents a distinct capability or persona within the system
type Agent struct {
    Name        string
    Description string
    PromptFile  string
    Instruction string // Loaded content from the prompt file
}

Dockerfile (Go multi-stage build)

FROM golang:1.22-alpine AS builder

WORKDIR /app

# Copy go mod files
COPY go.mod go.sum ./
RUN go mod download

# Copy source code
COPY . .

# Build
RUN CGO_ENABLED=0 GOOS=linux go build -o server ./cmd/server

# Final stage
FROM alpine:latest

WORKDIR /app

COPY --from=builder /app/server .
COPY --from=builder /app/internal/prompts ./internal/prompts

ENV PORT=8080

EXPOSE 8080

CMD ["./server"]

go.mod

module github.com/your-username/agent-chat-proxy

go 1.22

require (
    github.com/google/generative-ai-go v0.18.0
    github.com/google/uuid v1.6.0
    google.golang.org/api v0.200.0
)

Step 2: Configure Your Agents

Create Prompt Files

Each agent needs a prompt file in internal/prompts/. Example:

# internal/prompts/site_about.txt

You help visitors understand what this site is about.

This is [Your Name]'s personal documentation hub.

Key themes:
- [Theme 1]
- [Theme 2]

When someone asks what the site is about, share these highlights.
Keep it conversational and welcoming.

Register Agents

Agents are registered in the registry and exposed as function declarations for Gemini's function calling:

// internal/agents/registry.go
package agents

type Registry struct {
    RootAgent *Agent
    Agents    map[string]*Agent
}

func NewRegistry(promptsDir string) *Registry {
    r := &Registry{
        Agents: make(map[string]*Agent),
    }

    // Register agents
    agents := []Agent{
        {Name: "site_about", Description: "Handles general site questions", PromptFile: "site_about"},
        {Name: "project_info", Description: "Handles project questions", PromptFile: "project_info"},
        // Add more agents...
    }

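    for _, a := range agents {
        agent := a // copy so each map entry points at its own Agent value
        // loadPrompt is a helper defined elsewhere in this package; it reads
        // the agent's prompt file from promptsDir.
        agent.Instruction = loadPrompt(promptsDir, a.PromptFile)
        r.Agents[a.Name] = &agent
    }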

    return r
}
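The registry code calls loadPrompt without showing it. A minimal sketch of what such a helper might look like follows. It assumes the ".txt" suffix is appended to the PromptFile value and that a missing file yields an empty instruction plus a warning on stderr; the actual helper may behave differently on both points.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// loadPrompt reads <promptsDir>/<name>.txt and returns its trimmed contents.
// Assumptions: the ".txt" suffix is added here, and a missing file returns ""
// with a warning rather than stopping the program.
func loadPrompt(promptsDir, name string) string {
	path := filepath.Join(promptsDir, name+".txt")
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintf(os.Stderr, "prompt file not found: %s: %v\n", path, err)
		return ""
	}
	return strings.TrimSpace(string(data))
}

func main() {
	// Use a temporary directory so the example does not depend on the repo layout.
	dir, err := os.MkdirTemp("", "prompts")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	sample := "You help visitors understand what this site is about.\n"
	if err := os.WriteFile(filepath.Join(dir, "site_about.txt"), []byte(sample), 0o644); err != nil {
		panic(err)
	}

	fmt.Printf("%q\n", loadPrompt(dir, "site_about"))
	fmt.Printf("%q\n", loadPrompt(dir, "missing"))
}
```

Whichever convention is used, the PromptFile values in the registry and the filenames in internal/prompts/ have to agree, since a mismatch is what produces the "Prompt file not found" error listed under Troubleshooting.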

Step 3: Set Up Google Cloud

Enable APIs

# Set your project
gcloud config set project YOUR_PROJECT_ID

# Enable required APIs
gcloud services enable \
    run.googleapis.com \
    secretmanager.googleapis.com \
    cloudbuild.googleapis.com

Create Secret for API Key

# Create secret with your Gemini API key
echo -n "YOUR_GEMINI_API_KEY" | \
    gcloud secrets create gemini-api-key --data-file=-

# Grant Cloud Run access
PROJECT_NUMBER=$(gcloud projects describe $(gcloud config get-value project) \
    --format='value(projectNumber)')

gcloud secrets add-iam-policy-binding gemini-api-key \
    --member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
    --role="roles/secretmanager.secretAccessor"

Step 4: Deploy

Using the Deploy Script

# Make executable
chmod +x deploy.sh

# Dry run first
./deploy.sh --dry-run

# Full deploy
./deploy.sh

Manual Deploy

gcloud run deploy agent-chat-proxy \
    --source . \
    --platform managed \
    --region us-central1 \
    --allow-unauthenticated \
    --memory 1Gi \
    --cpu 1 \
    --timeout 120s \
    --min-instances 0 \
    --max-instances 10 \
    --set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
    --set-env-vars="GCP_PROJECT=YOUR_PROJECT_ID"

Step 5: Verify Deployment

# Get service URL
SERVICE_URL=$(gcloud run services describe agent-chat-proxy \
    --region=us-central1 \
    --format='value(status.url)')

# Test health
curl "$SERVICE_URL/health"
# Expected: {"status":"healthy","version":"3.0.0-adk"}

# Test chat (the Origin must match an entry in AllowedOrigins)
curl -X POST "$SERVICE_URL/" \
    -H 'Content-Type: application/json' \
    -H 'Origin: https://your-username.github.io' \
    -d '{"question":"What is this site about?"}'
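For reference, a handler that produces the expected health response could be as small as the sketch below. The struct and function names are illustrative assumptions; only the JSON shape and the version string come from the expected output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthStatus matches the JSON shape shown in the expected output.
type healthStatus struct {
	Status  string `json:"status"`
	Version string `json:"version"`
}

// healthBody returns the serialized health payload.
func healthBody() string {
	b, err := json.Marshal(healthStatus{Status: "healthy", Version: "3.0.0-adk"})
	if err != nil {
		// Marshalling a struct of two strings should not fail; this is a defensive fallback.
		return `{"status":"unhealthy"}`
	}
	return string(b)
}

// healthHandler writes the payload with a JSON content type.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, healthBody())
}

func main() {
	// Call the handler in-process instead of starting a server.
	rec := httptest.NewRecorder()
	healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```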

Security Features

The architecture includes multiple security layers:

Feature	Description
CORS	Only approved origins can make requests
Rate Limiting	Configurable per-IP and global limits
Burst Protection	Prevents rapid-fire requests
Prompt Injection Detection	Detects and blocks likely manipulation attempts
Safety Settings	Gemini harm category filters
DDoS Protection	Cloud Run infrastructure-level
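
As an illustration of the CORS row, an origin allowlist can be enforced with a small middleware. This is a sketch under assumptions rather than the project's cors.go: it uses exact string matching, lets requests without an Origin header through (so plain curl health checks still work), and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isAllowedOrigin reports whether origin exactly matches an entry in allowed.
func isAllowedOrigin(origin string, allowed []string) bool {
	for _, a := range allowed {
		if origin == a {
			return true
		}
	}
	return false
}

// withCORS rejects requests from unlisted origins with 403 and answers
// preflight OPTIONS requests with 204.
func withCORS(allowed []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if origin != "" {
			if !isAllowedOrigin(origin, allowed) {
				http.Error(w, "origin not allowed", http.StatusForbidden)
				return
			}
			w.Header().Set("Access-Control-Allow-Origin", origin)
			w.Header().Set("Vary", "Origin")
			w.Header().Set("Access-Control-Allow-Methods", "POST, OPTIONS")
			w.Header().Set("Access-Control-Allow-Headers", "Content-Type")
		}
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	allowed := []string{"https://your-username.github.io", "http://localhost:8001"}
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	handler := withCORS(allowed, ok)

	// Send one request from a listed origin and one from an unlisted origin.
	for _, origin := range []string{"https://your-username.github.io", "https://evil.example"} {
		req := httptest.NewRequest(http.MethodPost, "/", nil)
		req.Header.Set("Origin", origin)
		rec := httptest.NewRecorder()
		handler.ServeHTTP(rec, req)
		fmt.Println(origin, "->", rec.Code)
	}
}
```

Exact matching means scheme, host, and port all have to agree with the configured entry, which is the usual cause of an unexpected 403.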

Rate Limit Headers

Responses include rate limit information:

X-RateLimit-Limit: 30
X-RateLimit-Remaining: 29
X-RateLimit-Reset: 1767693633
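
One way such headers can be produced is a fixed-window counter keyed by client IP. The sketch below is an assumption about the approach, not the project's rate_limit.go: the type names, the fixed-window algorithm, and the reading of X-RateLimit-Reset as Unix seconds are all illustrative, and burst protection and global limits are not covered.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
	"time"
)

// window tracks how many requests a key has made and when its window ends.
type window struct {
	count int
	reset int64 // Unix seconds at which the window expires
}

// Limiter is a fixed-window, per-key request counter.
type Limiter struct {
	mu            sync.Mutex
	limit         int
	periodSeconds int64
	windows       map[string]*window
}

// NewLimiter allows limit requests per key in each periodSeconds window.
func NewLimiter(limit int, periodSeconds int64) *Limiter {
	return &Limiter{limit: limit, periodSeconds: periodSeconds, windows: make(map[string]*window)}
}

// Allow records one request for key at nowUnix. It returns whether the
// request is within the limit and the header values describing the window.
func (l *Limiter) Allow(key string, nowUnix int64) (bool, map[string]string) {
	l.mu.Lock()
	defer l.mu.Unlock()

	w, ok := l.windows[key]
	if !ok || nowUnix >= w.reset {
		// Start a fresh window on first sight of the key or after expiry.
		w = &window{reset: nowUnix + l.periodSeconds}
		l.windows[key] = w
	}

	allowed := w.count < l.limit
	if allowed {
		w.count++
	}

	headers := map[string]string{
		"X-RateLimit-Limit":     strconv.Itoa(l.limit),
		"X-RateLimit-Remaining": strconv.Itoa(l.limit - w.count),
		"X-RateLimit-Reset":     strconv.FormatInt(w.reset, 10),
	}
	return allowed, headers
}

func main() {
	limiter := NewLimiter(30, 60) // 30 requests per 60 seconds, chosen to match the sample headers
	_, headers := limiter.Allow("203.0.113.7", time.Now().Unix())
	for _, name := range []string{"X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset"} {
		fmt.Printf("%s: %s\n", name, headers[name])
	}
}
```

In an HTTP middleware, the returned map would be copied onto the response headers, and a false result would typically become a 429 Too Many Requests response.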

Adding New Agents

Create prompt: internal/prompts/my_agent.txt
Register in internal/agents/registry.go
Deploy: ./deploy.sh

See the skills documentation for details on adding agents.

Cost Optimization

Setting	Value	Purpose
min-instances	0	Scale to zero when idle
max-instances	10	Prevent runaway costs
memory	512Mi-1Gi	Go is memory-efficient
CPU	1	Good balance
timeout	120s	Allow complex agent chains

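Estimated cost: Cloud Run's free tier includes about 2 million requests per month, subject to its separate CPU and memory allowances; Gemini API usage is metered separately. Go's fast cold starts make min-instances=0 practical.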

Troubleshooting

Issue	Solution
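403 Forbidden	Add the requesting origin to AllowedOrigins in internal/config/config.go and redeploy
Prompt file not found	Check the filename and path under internal/prompts/ against the agent's PromptFile value
Rate limit exceeded	Wait until the time in X-RateLimit-Reset, or adjust the limits in internal/security/rate_limit.go
Cold start slow	Set min-instances=1 to keep one instance warm (rarely needed with Go, and it adds idle cost)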

View Logs

gcloud logging read \
    "resource.type=cloud_run_revision AND resource.labels.service_name=agent-chat-proxy" \
    --limit 20 \
    --project=YOUR_PROJECT_ID