AI agent deployment server: production infrastructure setup

A complete guide to setting up a production AI agent deployment server: VPS provisioning, Docker, Nginx, SSL, monitoring, CI/CD, and zero-downtime deployments.

TL;DR: I moved my first production agent from Railway to a Hetzner VPS when the platform bill hit ₹8,000/month. The server cost dropped to ₹670/month. This guide is the exact setup I use now.

When your agent outgrows platforms like Railway or Fly.io, you need a dedicated server. It gives you more control, lower cost at scale, custom networking, and the ability to run multiple agents on one machine.

This guide assumes you’ve already read the beginner’s hosting guide and have a working agent container. Now let’s build the production infrastructure around it.

Key takeaways:

A ₹830-1,700/month VPS ($10-20) handles most production AI agents

Docker Compose simplifies running the agent + supporting services

Nginx reverse proxy + Let’s Encrypt gives you SSL and custom domains

Blue-green deployment pattern enables zero-downtime updates

CI/CD with GitHub Actions automates the entire deploy pipeline

Always monitor costs. LLM API calls are the real expense, not the server

Step 1: Provision the VPS

Choose a provider and create a server:

Provider	Spec	Cost
Hetzner CX22	2 vCPU, 4 GB RAM, 40 GB SSD	€8/mo
DigitalOcean Basic	2 vCPU, 2 GB RAM, 60 GB SSD	$15/mo
Linode Shared 2GB	1 vCPU, 2 GB RAM, 50 GB SSD	$12/mo

All recommendations are for Ubuntu 24.04. After provisioning, SSH in as root (the commands below do not use sudo) and run the setup:

# Update system
apt update && apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com | sh

# Install Docker Compose
apt install -y docker-compose-plugin

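# Add your user to the docker group (if you are logged in as root, replace $USER
# with your non-root login; log out and back in for the change to apply)
usermod -aG docker $USER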

Step 2: Nginx reverse proxy with SSL

Nginx sits in front of your agent container, handling SSL termination, domain routing, and request buffering.

apt install -y nginx certbot python3-certbot-nginx

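Create the Nginx config at /etc/nginx/sites-available/agent (the path the deploy script in Step 4 edits) and symlink it into /etc/nginx/sites-enabled/: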

server {
    listen 80;
    server_name agent.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 120s;
    }
}

Enable SSL:

certbot --nginx -d agent.yourdomain.com

Step 3: Docker Compose setup

Create a docker-compose.yml for your agent and any supporting services:

version: '3.8'
services:
  agent:
    build: .
    ports:
      - "8080:8080"
    env_file:
      - .env
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
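The healthcheck above assumes the agent answers GET /health on port 8080 and that curl is installed in the image. If your agent has no such route yet, a minimal sketch using only Python's standard library could look like the following. The names HealthHandler and start_health_server are illustrative, not part of any framework, and a real agent would more likely add the route to the web framework it already uses.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and a small JSON body; anything else is 404."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so health pings do not flood the logs.
        pass


def start_health_server(port=8080, host="0.0.0.0"):
    """Start the health server in a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_health_server() once at agent startup is enough for the container healthcheck and for an external uptime monitor. A more useful check would also verify whatever the agent depends on (for example, that its API keys are loaded) before returning 200.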

The .env file holds your API keys and configuration:

ANTHROPIC_API_KEY=sk-ant..
OPENAI_API_KEY=sk-proj..
AGENT_LOG_LEVEL=info

Security: Set .env permissions to 600 so only the owner can read it.
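On the server this is a single chmod 600 .env. If you would rather have the agent check this itself at startup, here is a small sketch under the assumption that the agent is written in Python and runs on Linux; lock_down and is_owner_only are hypothetical helper names.

```python
import os
import stat


def lock_down(path):
    """Restrict a file to owner read/write (mode 0600) and return the resulting mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_owner_only(path):
    """True if neither group nor others have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

An agent could refuse to start when is_owner_only(".env") is False, which catches a secrets file that was copied to the server with default permissions.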

Step 4: Zero-downtime deployment

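Blue-green deployment keeps your agent running during updates. There are two containers: one active (blue) and one standby (green). Deploy the new version to the standby, health-check it, swap traffic over, then stop the old one. The old color becomes the standby for the next deploy.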
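The part that is easy to get wrong is deciding which color is live and which one to deploy to next. As a sketch of that logic, assuming blue listens on port 8081 and green on 8082 as in this step, and that the Nginx config contains a single proxy_pass line, the decision can be written as three pure functions. The function names are hypothetical.

```python
import re

# Assumed port layout for this step: blue on 8081, green on 8082.
PORTS = {"blue": 8081, "green": 8082}
PROXY_RE = re.compile(r"(proxy_pass\s+http://localhost:)(\d+)")


def live_color(nginx_conf):
    """Return the color Nginx currently proxies to, or None if it is neither."""
    match = PROXY_RE.search(nginx_conf)
    if not match:
        return None
    port = int(match.group(2))
    for color, color_port in PORTS.items():
        if color_port == port:
            return color
    return None


def standby_color(nginx_conf):
    """Pick the color to deploy to: blue if green is live, otherwise green."""
    return "blue" if live_color(nginx_conf) == "green" else "green"


def point_proxy_at(nginx_conf, color):
    """Return the config text with proxy_pass rewritten to the given color's port."""
    return PROXY_RE.sub(lambda m: m.group(1) + str(PORTS[color]), nginx_conf)
```

On the very first deploy Nginx still points at port 8080, so live_color returns None and standby_color falls back to green. After that the two colors alternate on every deploy.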

# docker-compose.blue.yml
services:
  agent-blue:
    build: .
    container_name: agent-blue
    ports:
      - "8081:8080"
    env_file:
      - .env
    restart: unless-stopped

# docker-compose.green.yml (same, but with agent-green as the service and container name, and port 8082)

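# Deploy script
CONF=/etc/nginx/sites-available/agent

# 1. Pick the standby color: green, unless Nginx already points at green (8082)
if grep -q 'localhost:8082' "$CONF"; then
  OLD=green; NEW=blue; PORT=8081
else
  OLD=blue; NEW=green; PORT=8082
fi

# 2. Build and start the new version on the standby port
docker compose -f docker-compose.$NEW.yml up -d --build

# 3. Health check
sleep 10
curl -f http://localhost:$PORT/health || exit 1

# 4. Swap Nginx to point to the new version (matches whatever port is live now)
sed -i -E "s#proxy_pass http://localhost:[0-9]+#proxy_pass http://localhost:$PORT#" "$CONF"
nginx -t && nginx -s reload || exit 1

# 5. Stop the old version
docker compose -f docker-compose.$OLD.yml down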
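The fixed sleep 10 in the deploy script is a guess at how long the container takes to start. One alternative is to poll the health endpoint until it answers or a retry budget runs out. Here is a sketch using only the standard library; wait_until_healthy and its default values are assumptions for illustration, not tuned numbers.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url, attempts=10, delay=2.0, timeout=5.0):
    """Poll url until it answers with HTTP 200; return True on success, False otherwise."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Not up yet: connection refused, timeout, or an HTTP error status.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A deploy step would call wait_until_healthy("http://localhost:8082/health") and abort before touching Nginx if it returns False, so a broken build never receives traffic.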

Step 5: Monitoring

What	Tool	Why
Health checks	UptimeRobot (free)	Ping /health every 5 minutes, alert on failure
Error tracking	Sentry	Catch crashes and exceptions
LLM cost tracking	Custom logger	Log token counts per request, alert on spikes
Server metrics	Netdata or Grafana	CPU, memory, disk, network
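The "custom logger" row is the only item here without an off-the-shelf tool, so here is one possible shape for it. This is a sketch, not a finished billing system: CostTracker is a hypothetical class, the per-1,000-token prices are parameters you must fill in from your provider's price list, and "spike" is assumed to mean a request costing more than spike_factor times the rolling average.

```python
import json
import logging
from collections import deque

logger = logging.getLogger("llm_cost")


class CostTracker:
    """Logs token usage per request and flags spikes against a rolling average."""

    def __init__(self, input_price_per_1k, output_price_per_1k, window=50, spike_factor=3.0):
        # Prices are per 1,000 tokens in your billing currency (assumed inputs, not real rates).
        self.input_price = input_price_per_1k
        self.output_price = output_price_per_1k
        self.recent = deque(maxlen=window)  # costs of the last `window` requests
        self.spike_factor = spike_factor
        self.total = 0.0

    def record(self, request_id, input_tokens, output_tokens):
        """Log one request as a JSON line; return True if its cost counts as a spike."""
        cost = (input_tokens * self.input_price + output_tokens * self.output_price) / 1000
        average = sum(self.recent) / len(self.recent) if self.recent else None
        spike = average is not None and average > 0 and cost > self.spike_factor * average
        self.recent.append(cost)
        self.total += cost
        logger.info(json.dumps({
            "request_id": request_id,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "cost": round(cost, 6),
            "total": round(self.total, 6),
        }))
        if spike:
            logger.warning("LLM cost spike: request %s cost %.6f (rolling average %.6f)",
                           request_id, cost, average)
        return spike
```

Call record() after each LLM response with the token counts the API reports. Routing the warning to Slack or email is then a matter of attaching a logging handler.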

Step 6: CI/CD with GitHub Actions

# .github/workflows/deploy.yml
name: Deploy Agent
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to VPS
        uses: appleboy/ssh-action@VERSION  # pin to a released tag
        with:
          host: ${{ secrets.VPS_HOST }}
          username: ${{ secrets.VPS_USER }}
          key: ${{ secrets.VPS_SSH_KEY }}
          script: |
            cd ~/agent
            git pull
            CONF=/etc/nginx/sites-available/agent
            # Deploy to green unless Nginx already points at green (8082)
            if grep -q 'localhost:8082' "$CONF"; then
              OLD=green; NEW=blue; PORT=8081
            else
              OLD=blue; NEW=green; PORT=8082
            fi
            docker compose -f docker-compose.$NEW.yml up -d --build
            sleep 10
            curl -f http://localhost:$PORT/health || exit 1
            sed -i -E "s#proxy_pass http://localhost:[0-9]+#proxy_pass http://localhost:$PORT#" "$CONF"
            nginx -t && nginx -s reload || exit 1
            docker compose -f docker-compose.$OLD.yml down

With this in place, every git push to main builds your agent, runs a health check, and deploys with zero downtime. No manual SSH required.

Read the beginner’s guide first if you’re new to hosting, then come back here when your agent outgrows platforms.

FAQ

What do I need for a production AI agent server? A VPS (2 GB RAM, 2 CPU cores), Docker for containerization, Nginx as a reverse proxy, and a monitoring setup (health checks, uptime monitoring, error logging). Expect ₹830-1,700/month ($10-20) for the server.

How do I deploy with zero downtime? Use Docker with a blue-green deployment pattern: run two containers, swap the Nginx proxy between them, then stop the old one. Your agent never goes offline during the swap.

How do I handle API key security on a server? Store API keys in a .env file with 600 permissions, never commit them to git. For team deployments, use a secrets manager like Doppler or 1Password CLI to inject keys at deploy time.

What monitoring should I set up? At minimum: a health check endpoint (pinged every 5 minutes on UptimeRobot’s free plan, or every 60s if your monitor allows it), uptime monitoring (UptimeRobot or Better Uptime), error tracking (Sentry), and cost tracking per LLM call. Add PagerDuty or Slack alerts for downtime.


This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents. Reach me at [email protected]