AI agent deployment server: production infrastructure setup
A complete guide to setting up a production AI agent deployment server, covering VPS provisioning, Docker, Nginx, SSL, monitoring, CI/CD, and zero-downtime deployments.
TL;DR: A production AI agent deployment server needs: a VPS (2 GB RAM, $10-20/mo), Docker for containerization, Nginx reverse proxy with SSL, health checks, and a CI/CD pipeline. This guide walks through setting up each piece from scratch.
When your agent outgrows platforms like Railway or Fly.io, you need a dedicated server. It gives you more control, lower cost at scale, custom networking, and the ability to run multiple agents or supporting services on one machine.
This guide assumes you’ve already read the beginner’s hosting guide and have a working agent container. Now let’s build the production infrastructure around it.
Key takeaways:
- A $10-20/month VPS handles most production AI agents
- Docker Compose simplifies running the agent + supporting services
- Nginx reverse proxy + Let’s Encrypt gives you SSL and custom domains
- Blue-green deployment pattern enables zero-downtime updates
- CI/CD with GitHub Actions automates the entire deploy pipeline
- Always monitor costs: LLM API calls are the real expense, not the server
Step 1: Provision the VPS
Choose a provider and create a server:
| Provider | Spec | Cost |
|---|---|---|
| Hetzner CX22 | 2 vCPU, 4 GB RAM, 40 GB SSD | €8/mo |
| DigitalOcean Basic | 2 vCPU, 2 GB RAM, 60 GB SSD | $15/mo |
| Linode Shared 2GB | 1 vCPU, 2 GB RAM, 50 GB SSD | $12/mo |
The commands below assume Ubuntu 24.04 and a root shell (prefix them with sudo otherwise). After provisioning, SSH in and run the setup:
# Update system
apt update && apt upgrade -y
# Install Docker
curl -fsSL https://get.docker.com | sh
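# Install the Docker Compose plugin (a no-op if the Docker script above already installed it)
apt install -y docker-compose-plugin
# Add your login user to the docker group; if you are running as root, replace $USER with that username
usermod -aG docker "$USER"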
Step 2: Nginx reverse proxy with SSL
Nginx sits in front of your agent container, handling SSL termination, domain routing, and request buffering.
apt install -y nginx certbot python3-certbot-nginx
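Create the Nginx config at /etc/nginx/sites-available/agent (the path the deploy script in Step 4 edits), and enable it with ln -s /etc/nginx/sites-available/agent /etc/nginx/sites-enabled/: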
server {
    listen 80;
    server_name agent.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 120s;
    }
}
Enable SSL:
certbot --nginx -d agent.yourdomain.com
Step 3: Docker Compose setup
Create a docker-compose.yml for your agent and any supporting services:
version: '3.8'
services:
  agent:
    build: .
    ports:
      - "8080:8080"
    env_file:
      - .env
    restart: unless-stopped
    healthcheck:
      # curl must be installed in the agent image for this check to run
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
The .env file holds your API keys and configuration:
ANTHROPIC_API_KEY=sk-ant..
OPENAI_API_KEY=sk-proj..
AGENT_LOG_LEVEL=info
Security: Set .env permissions to 600 (chmod 600 .env) so only the file's owner can read or write it.
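The permission rule is easy to forget after copying the file to a new server, so it can help to have the deploy script check it. The following is a minimal sketch, not part of the original setup: check_env_perms is a hypothetical helper name, and it relies on GNU stat as shipped on Ubuntu.

```shell
# Hypothetical helper: fail if the env file's mode is anything other than 600.
# Assumes GNU stat (the default on Ubuntu); BSD/macOS stat uses different flags.
check_env_perms() {
  local file="$1"
  local mode
  mode="$(stat -c '%a' "$file")"   # octal permission bits, e.g. 600
  if [ "$mode" != "600" ]; then
    echo "refusing to continue: $file has mode $mode, expected 600" >&2
    return 1
  fi
}
```

Calling check_env_perms .env || exit 1 at the top of a deploy script would stop the deploy before any container starts with a world-readable key file.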
Step 4: Zero-downtime deployment
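Blue-green deployment keeps your agent running during updates. It uses two containers: the one serving traffic is active (blue in the example below) and the other is the standby (green). You deploy the new version to the standby, switch Nginx over to it, then stop the old container, which becomes the standby for the next release.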
# docker-compose.blue.yml
services:
  agent-blue:
    build: .
    container_name: agent-blue
    ports:
      - "8081:8080"
    env_file:
      - .env
    restart: unless-stopped
# docker-compose.green.yml is the same, but with agent-green as the service and container name and host port 8082
# Deploy script (deploys green while blue is active; swap the two colors on the next release)
# 1. Build and start new version on standby port
docker compose -f docker-compose.green.yml up -d --build
# 2. Health check
sleep 10
curl -f http://localhost:8082/health || exit 1
# 3. Swap Nginx to point to new version (match any port: the live one may be 8080 or 8081)
sed -i -E 's#proxy_pass http://localhost:[0-9]+;#proxy_pass http://localhost:8082;#' /etc/nginx/sites-available/agent
# Validate the edited config before reloading
nginx -t && nginx -s reload
# 4. Stop old version
docker compose -f docker-compose.blue.yml down
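The script above hard-codes green as the standby and waits a fixed 10 seconds. One way to make it alternate colors and wait only as long as needed is sketched below. The function names active_port, standby_color, and wait_for_health are hypothetical, and the sketch assumes blue listens on 8081 and green on 8082, as in the compose files above.

```shell
# Sketch: choose the standby color from the port Nginx currently proxies to.
# Assumes blue = 8081 and green = 8082.

active_port() {
  # Print the port from the first proxy_pass line in the given Nginx config.
  sed -nE 's#^[[:space:]]*proxy_pass http://localhost:([0-9]+);.*#\1#p' "$1" | head -n 1
}

standby_color() {
  # If blue (8081) is live, deploy to green; in every other case deploy to blue.
  if [ "$(active_port "$1")" = "8081" ]; then
    echo green
  else
    echo blue
  fi
}

wait_for_health() {
  # Poll a URL until it returns a 2xx response or the attempts run out.
  local url="$1" attempts="${2:-10}" delay="${3:-2}"
  local i
  for i in $(seq 1 "$attempts"); do
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

In a deploy script, color="$(standby_color /etc/nginx/sites-available/agent)" would pick which compose file to bring up, and wait_for_health http://localhost:8082/health 15 2 || exit 1 could stand in for the sleep and curl pair.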
Step 5: Monitoring
| What | Tool | Why |
|---|---|---|
| Health checks | UptimeRobot (free) | Ping /health every 5 minutes, alert on failure |
| Error tracking | Sentry | Catch crashes and exceptions |
| LLM cost tracking | Custom logger | Log token counts per request, alert on spikes |
| Server metrics | Netdata or Grafana | CPU, memory, disk, network |
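For the custom LLM cost logger, a very small version can be built from a plain log file and awk. This is only a sketch under an assumed log format: one line per request containing a timestamp, the input token count, and the output token count. The function names and the format are hypothetical, so adapt them to whatever your agent actually logs.

```shell
# Hypothetical log format, one line per LLM request:
#   <ISO timestamp> <input_tokens> <output_tokens>

total_tokens() {
  # Sum input and output tokens across every line of the log file.
  awk '{ sum += $2 + $3 } END { printf "%d\n", sum }' "$1"
}

check_token_budget() {
  # Print a warning and return non-zero when the total exceeds the budget.
  local log="$1" budget="$2" total
  total="$(total_tokens "$log")"
  if [ "$total" -gt "$budget" ]; then
    echo "token spike: $total tokens logged, budget is $budget" >&2
    return 1
  fi
}
```

Run from cron against the current day's log, a failing check_token_budget could trigger whatever alert channel you already use.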
Step 6: CI/CD with GitHub Actions
# .github/workflows/deploy.yml
name: Deploy Agent
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to VPS
        # Pin to a release of the action you have verified
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.VPS_HOST }}
          username: ${{ secrets.VPS_USER }}
          key: ${{ secrets.VPS_SSH_KEY }}
          script: |
            # VPS_USER needs permission to edit the Nginx config and reload Nginx
            cd ~/agent
            git pull
            docker compose -f docker-compose.green.yml up -d --build
            sleep 10
            curl -f http://localhost:8082/health || exit 1
            sed -i -E 's#proxy_pass http://localhost:[0-9]+;#proxy_pass http://localhost:8082;#' /etc/nginx/sites-available/agent
            nginx -t && nginx -s reload
            docker compose -f docker-compose.blue.yml down
With this in place, every git push to main builds your agent, runs the health check, and deploys with zero downtime. No manual SSH session required.
Read the beginner’s guide first if you’re new to hosting, then come back here when your agent outgrows platforms.
Related Posts
- AI agent deployment guide: from localhost to production. The full production deployment playbook from containerization to monitoring
- AI agent logging and monitoring. Structured logging, metrics, and debugging patterns for production agents
- AI agent error handling patterns. Retry strategies, circuit breakers, and fallback behaviors for reliable agents
A guide to self-hosted AI agents in 2026 (https://doneclaw.com/blog/self-hosted-ai-agent-2026/) covers Docker deployment, infrastructure setup, and production monitoring.
This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents.