---
title: "Cloudflare + Flue: the open agent harness stack"
canonical: "https://agenticup.dev/posts/cloudflare-flue-agent-harness-stack/"
pubDate: "2026-06-19T00:00:00.000Z"
description: "Cloudflare + Flue ship a three-layer agent stack: Cloudflare Agents SDK (platform), Pi/Think (harness), and Flue (framework). Durable Objects give every agent its own SQLite database and zero cost when idle. Fibers checkpoint agent turns so they survive crashes. Code Mode runs LLM-generated TypeScript in <10ms isolates. No containers needed."
tags: [cloudflare, flue, agent-harness, durable-objects, pi-harness, agents-sdk, project-think, open-source, agent-platform]
---

**TL;DR:** Cloudflare and Flue ship a three-layer open agent stack. Cloudflare Agents SDK provides Durable Objects, durable execution fibers, code execution in V8 isolates, and a durable filesystem, all without containers. Flue wraps the Pi harness into a declarative framework with Markdown skills, multi-cloud deploy, and out-of-the-box integrations for Slack, GitHub, Linear, and Discord. The agent platform market now has two competing visions: managed harness (AWS AgentCore) and open composable stack (Cloudflare + Flue).

> **Key takeaways:**
> - Cloudflare + Flue is a three-layer stack: Agents SDK (platform) → Pi/Think (harness) → Flue (framework)
> - Durable Objects give every agent its own SQLite database. Zero cost when idle. 10,000 agents cost like 100 when usage is sparse
> - Fibers checkpoint agent turns mid-execution. A crash mid-turn means resume, not restart
> - Code Mode runs LLM-generated TypeScript in V8 isolates in under 10ms at $0.002 per load. No containers needed for most agent work
> - Flue is MIT-licensed open source from the Astro team. Deploys to Cloudflare, Node.js, GitHub Actions, GitLab CI, Render, Fly.io

---

The first time I ran an agent on a serverless function, it crashed mid-turn. The model had generated a plan, called two tools, and was waiting for a third result. Then the function timed out. The plan was gone. The tool results were gone. The agent rebooted with no memory of what it was doing. The user saw a spinner that never resolved.

I fixed it by adding a database. Then a queue. Then a state machine. Then I realized I was rebuilding what every platform should give you for free.

AWS solved this with AgentCore Harness, a managed service that wraps everything in two API calls. Cloudflare solved it differently. They turned Durable Objects into an agent runtime, built durable execution on top, and opened it so any framework can use it.

Flue is the first framework to ship on that foundation. [Cloudflare announced the integration](https://blog.cloudflare.com/agents-platform-flue-sdk/) on June 17, 2026, alongside Flue's 1.0 Beta and one day before AWS AgentCore went GA. Flue is an open-source TypeScript framework from the Astro team, built on the Pi harness, the same harness that powers OpenClaw.

Two visions, same week. This is the open one.

## Why does the agent runtime need to be different from everything else?

Traditional applications serve many users from one instance. A restaurant has a menu and a kitchen built to churn out dishes at volume. One instance, many customers.

Agents are one-to-one. Each agent is a unique instance serving one user, running one task, with its own state, its own conversation history, and its own tool configuration. A personal chef, not a restaurant kitchen.

That changes the scaling math. If a hundred million knowledge workers each use an agentic assistant at even modest concurrency, you need capacity for tens of millions of simultaneous sessions. At current per-container costs, that's unsustainable.

Cloudflare's answer: the actor model. Each agent is a Durable Object: an addressable entity with its own SQLite database, its own storage, and its own lifecycle. It consumes zero compute when hibernated. When something happens (an HTTP request, a WebSocket message, a scheduled alarm, an inbound email), the platform wakes the agent, loads its state, and hands it the event. The agent does its work, then goes back to sleep.
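
To make the wake, handle, hibernate cycle concrete, here's a minimal in-memory sketch. It is not the Agents SDK: `SleepyAgent`, `AgentRegistry`, and the event shape are names I made up for illustration, and a plain `Map` stands in for the agent's SQLite database.

```typescript
// Illustrative model only. None of these names come from the Agents SDK.
type AgentEvent = { kind: "http" | "websocket" | "alarm" | "email"; payload: string };

class SleepyAgent {
  // Persistent state: survives hibernation (stand-in for the agent's SQLite database).
  private readonly store = new Map<string, string>();
  // In-memory state: exists only while the agent is awake.
  private memory: { eventsSeen: number } | null = null;

  get awake(): boolean {
    return this.memory !== null;
  }

  handle(event: AgentEvent): number {
    // Wake: rebuild in-memory state from the persistent store.
    this.memory = { eventsSeen: Number(this.store.get("eventsSeen") ?? "0") };
    this.memory.eventsSeen += 1;
    this.store.set("eventsSeen", String(this.memory.eventsSeen));
    this.store.set("lastEvent", `${event.kind}:${event.payload}`);
    const seen = this.memory.eventsSeen;
    // Hibernate: drop in-memory state; only the store remains.
    this.memory = null;
    return seen;
  }
}

class AgentRegistry {
  private readonly agents = new Map<string, SleepyAgent>();

  // Name-based routing: the same name always reaches the same instance.
  get(name: string): SleepyAgent {
    let agent = this.agents.get(name);
    if (!agent) {
      agent = new SleepyAgent();
      this.agents.set(name, agent);
    }
    return agent;
  }
}

const registry = new AgentRegistry();
const first = registry.get("customer-42").handle({ kind: "http", payload: "hello" });
const second = registry.get("customer-42").handle({ kind: "alarm", payload: "daily-report" });
const other = registry.get("customer-7").handle({ kind: "email", payload: "invoice" });
```

The second event for `customer-42` sees the count left by the first, even though the agent held nothing in memory in between. That's the property the platform provides for real, with actual storage and actual eviction.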

| | VMs / Containers | Durable Objects |
|---|---|---|
| Idle cost | Full compute cost, always | Zero (hibernated) |
| Scaling | Provision and manage capacity | Automatic, per-agent |
| State | External database required | Built-in SQLite |
| Recovery | You build it (process managers, health checks) | Platform restarts, state survives |
| Identity / routing | You build it (load balancers, sticky sessions) | Built-in (name → agent) |
| 10,000 agents, 1% active | 10,000 always-on instances | ~100 active at any moment |

The last row is the economics that matter. If agents are one-to-one by nature, a runtime that charges per-provisioned-instance makes every agent a line-item decision. A runtime that charges per-active-instance makes agents a scaling variable. You can afford to spawn an agent per customer, per task, per workflow.
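
The arithmetic behind that last row, as a small sketch. The function and the two billing-model labels are mine, not Cloudflare's pricing terms, and real bills depend on request and duration pricing that this ignores.

```typescript
type BillingModel = "per-provisioned" | "per-active";

// How many instances you pay for at a given moment under each model.
function billedInstances(totalAgents: number, activePercent: number, model: BillingModel): number {
  if (totalAgents < 0 || activePercent < 0 || activePercent > 100) {
    throw new RangeError("totalAgents must be >= 0 and activePercent within 0..100");
  }
  return model === "per-provisioned"
    ? totalAgents
    : Math.ceil((totalAgents * activePercent) / 100);
}

const alwaysOn = billedInstances(10_000, 1, "per-provisioned"); // 10000
const hibernating = billedInstances(10_000, 1, "per-active"); // 100
const ratio = alwaysOn / hibernating; // 100
```

At 1% activity the gap is two orders of magnitude, and it widens as usage gets sparser.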

## How does durable execution work with fibers?

An agent turn isn't a single request. The model streams tokens, calls tools, waits for results, maybe asks a human for approval, or delegates work to a sub-agent. That sequence can take seconds or minutes. At any point the process can be interrupted or crash.

When that happens, all the agent state that was in memory is gone: the streaming connection, the pending tool calls, where the agent was in its turn. The conversation history might persist on disk, but the user sees a spinner that never resolves. The agent restarts from scratch, burning tokens on work it already did.

Fibers solve this by providing a native checkpointing mechanism inside the Durable Object:

```
Agent turn starts
    │
    ▼
runFiber("research") registers in SQLite
    │
    ├──► Call LLM → checkpoint with stash({step: 0, result})
    ├──► Execute tool → checkpoint with stash({step: 1, result})
    ├──► Call LLM again → checkpoint with stash({step: 2, result})
    │
    ▼
If crash at step 2 → onFiberRecovered() delivers {step: 1, result}
    │
    ▼
Resume from step 2, not step 0
```

The code pattern:

```typescript
import { Agent } from "agents";

export class ResearchAgent extends Agent {
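  // `resume` carries the stashed state after a recovery; it is omitted on a fresh start.
  async startResearch(topic: string, resume?: { findings: string[]; step: number }) {
    void this.runFiber("research", async (ctx) => {
      const findings = resume?.findings ?? [];
      // Continue after the last checkpointed step instead of starting over at 0
      for (let i = (resume?.step ?? -1) + 1; i < 10; i++) {
        // callLLM is a placeholder for your own model call
        const result = await this.callLLM(`Research step ${i}: ${topic}`);
        findings.push(result);
        // Checkpoint: if evicted, we resume from here
        ctx.stash({ findings, step: i, topic });
        this.broadcast({ type: "progress", step: i });
      }
      return { findings };
    });
  }

  async onFiberRecovered(ctx) {
    if (ctx.name === "research" && ctx.snapshot) {
      // The snapshot is whatever the last stash() call saved
      const { topic, findings, step } = ctx.snapshot;
      await this.startResearch(topic, { findings, step });
    }
  }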
}
```
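
If you want to see the checkpoint-and-resume behavior without deploying anything, here's a self-contained simulation. It assumes nothing about the SDK: `SnapshotStore` and `runSteps` are hypothetical stand-ins, and the "crash" is just a thrown error.

```typescript
// Hypothetical stand-ins, not the Agents SDK: a snapshot slot that survives a "crash".
type Snapshot = { step: number; findings: string[] };

class SnapshotStore {
  private saved: string | null = null; // serialized, as it would be on disk

  stash(snapshot: Snapshot): void {
    this.saved = JSON.stringify(snapshot);
  }

  load(): Snapshot | null {
    return this.saved === null ? null : (JSON.parse(this.saved) as Snapshot);
  }
}

const executed: number[] = []; // records every step that actually ran

function runSteps(store: SnapshotStore, totalSteps: number, crashAtStep?: number): string[] {
  const snapshot = store.load();
  const findings = snapshot ? [...snapshot.findings] : [];
  const firstStep = snapshot ? snapshot.step + 1 : 0;
  for (let step = firstStep; step < totalSteps; step++) {
    if (step === crashAtStep) throw new Error(`simulated crash during step ${step}`);
    executed.push(step);
    findings.push(`result-${step}`);
    store.stash({ step, findings }); // checkpoint after each completed step
  }
  return findings;
}

const store = new SnapshotStore();
let crashed = false;
try {
  runSteps(store, 4, 2); // steps 0 and 1 complete, then the process "dies"
} catch {
  crashed = true;
}
const recovered = runSteps(store, 4); // resumes at step 2 from the stashed snapshot
```

After both runs, `executed` holds each step exactly once. Steps 0 and 1 are never repeated, which is the token saving the diagram describes.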

The SDK keeps the agent alive automatically during fiber execution. For work measured in minutes, `keepAlive()` / `keepAliveWhile()` prevents eviction. For longer operations (CI pipelines, design reviews, video generation), the agent starts the work, persists the job ID, hibernates, and wakes on callback.

**The tradeoff:** Fibers add complexity to the agent code. You need to know what state to stash and how to recover it. For simple stateless agents, the overhead isn't worth it. For agents that run production workloads where a mid-turn crash wastes tokens and user trust, fibers are the difference between a toy and infrastructure.

**The tradeoff the framework did not predict:** AWS solves crash recovery implicitly: the microVM handles it. Cloudflare makes you write the recovery hook. More control, more work.

## Why code execution matters more than tool calling

Agent harnesses give models access to the outside world through tool definitions. Each tool is a JSON schema: name, description, parameters. The model reads them, picks one, calls it, reads the result.

The problem: tool surfaces grow fast. The model gets worse at selecting the right tool as the list gets longer and the context window fills with tool definitions. Cloudflare's internal MCP portal originally exposed 34 GitLab tools consuming roughly 15,000 tokens per request. That's 7.5% of a 200K window gone before the first question.
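
Here's that arithmetic spelled out, using only the figures quoted above. The extrapolation to 100 tools is my own assumption that overhead grows linearly at the same average definition size; real tool schemas vary.

```typescript
// Figures quoted above: 34 tools, roughly 15,000 tokens, a 200K-token window.
const toolCount = 34;
const toolDefinitionTokens = 15_000;
const contextWindowTokens = 200_000;

// Share of the window consumed before the user asks anything.
const windowShare = toolDefinitionTokens / contextWindowTokens; // 0.075, i.e. 7.5%

// Average cost of one tool definition (about 441 tokens).
const tokensPerTool = toolDefinitionTokens / toolCount;

// Assumption: overhead grows linearly with tool count at the same average size.
function toolOverheadShare(tools: number, perToolTokens: number, windowTokens: number): number {
  return (tools * perToolTokens) / windowTokens;
}

const shareAt100Tools = toolOverheadShare(100, tokensPerTool, contextWindowTokens); // about 0.22
```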

Code Mode inverts the pattern. Instead of N tools, give the model one tool that executes code. The model writes a TypeScript function that calls the APIs it needs, and the harness runs it.

The question is where that code runs. A container per tool call is expensive and slow. A V8 isolate starts in under 10 milliseconds and costs $0.002 per load. No kernel boot, no filesystem mount, no network stack initialization. Just a fresh JavaScript environment, your code, and a result.

```typescript
// The agent writes this at runtime
async () => {
  const files = await state.glob("src/**/*.ts");
  const results = [];
  for (const file of files) {
    const content = await state.readFile(file);
    const todos = content.match(/\/\/ TODO:.*/g);
    if (todos) results.push({ file, todos });
  }
  return results;
};
```

Cloudflare's Agents SDK provides `@cloudflare/codemode`, which wraps Dynamic Workers to execute LLM-generated code in an isolate. It creates a fresh Dynamic Worker for each snippet, runs it, and discards it. No credentials leak into the generated code: the agent uses bindings, and bindings never enter the execution context.

Flue uses `@cloudflare/codemode` on its Cloudflare target for exactly this. The agent writes JavaScript against the workspace and runs it with Code Mode.
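
To show the shape of the idea, here's a local sketch of "one tool that runs code". It does not use `@cloudflare/codemode`: `makeBindings`, `runSnippet`, and the `api` object are hypothetical, and `new Function` is not a security boundary. On Cloudflare the isolation comes from the Dynamic Worker, not from anything in this snippet.

```typescript
// Sketch only: `new Function` does NOT sandbox anything. It stands in for the isolate.

// The capability surface handed to generated code. It closes over the secret,
// so the snippet can call `fetchIssueTitle` but is never handed the token itself.
function makeBindings(secretToken: string) {
  const titles = new Map<number, string>([
    [1, "Crash on login"],
    [2, "Typo in docs"],
  ]);
  return {
    fetchIssueTitle(id: number): string {
      if (secretToken.length === 0) throw new Error("missing credentials");
      return titles.get(id) ?? "unknown issue";
    },
  };
}

type Bindings = ReturnType<typeof makeBindings>;
type RunResult = { ok: true; value: unknown } | { ok: false; error: string };

// One "execute code" tool instead of N narrowly scoped tools.
function runSnippet(source: string, bindings: Bindings): RunResult {
  try {
    const run = new Function("api", `"use strict"; return (${source})(api);`);
    return { ok: true, value: run(bindings) };
  } catch (err) {
    // Syntax and runtime errors come back as data the model can read and retry on.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// What a model might generate: plain code against the `api` object.
const generated = `(api) => [1, 2, 3].map((id) => api.fetchIssueTitle(id))`;
const good = runSnippet(generated, makeBindings("s3cr3t"));

// A snippet with a syntax error yields an error result instead of a value.
const broken = runSnippet(`(api) => {`, makeBindings("s3cr3t"));
```

The design choice worth noticing is that the generated code receives functions, not credentials. The token lives in the closure on the harness side.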

**The tradeoff:** Code Mode requires writing TypeScript. If you structure your agent around predefined tool calls (call this API, call that API), inverting to "write code for everything" changes the architecture significantly. It also means the model needs to generate syntactically correct TypeScript on every invocation: one syntax error and the isolate returns an error instead of a result.

## What does a durable filesystem look like without containers?

Coding agents live in the filesystem. They read files, search through code, understand diffs, write patches. But if the harness runs in a serverless environment, how do you get a filesystem that persists across executions?

The usual answer is a container. That works, but it's expensive for what agents mostly do. The majority of filesystem operations in an agent turn are text reads and writes. Grepping source code. Reading a config file. Writing a patch. You don't need a full Linux boot for that.

`@cloudflare/shell` gives your agent a durable virtual filesystem inside the Durable Object, backed by SQLite. It provides typed file operations: read, write, edit, search, grep, diff. These are the same operations that agent harnesses expose as tools.

Flue agents on the Cloudflare target write JavaScript against this virtual filesystem API. Keeping more operations inside the Durable Object makes execution more efficient and avoids container overhead entirely.
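
A rough picture of what "file operations over a database" means, as an in-memory sketch. This is not the `@cloudflare/shell` API: `VirtualWorkspace` and its method names are hypothetical, and a `Map` plays the role of the SQLite table.

```typescript
// Hypothetical stand-in for a durable workspace. A Map plays the role of the
// SQLite table; none of these names come from @cloudflare/shell.
type GrepHit = { path: string; line: number; text: string };

class VirtualWorkspace {
  private readonly files = new Map<string, string>();

  writeFile(path: string, content: string): void {
    this.files.set(path, content);
  }

  readFile(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`no such file: ${path}`);
    return content;
  }

  // Replace the first occurrence of `search`; fail loudly if it is absent.
  edit(path: string, search: string, replacement: string): void {
    const content = this.readFile(path);
    if (!content.includes(search)) throw new Error(`"${search}" not found in ${path}`);
    this.writeFile(path, content.replace(search, () => replacement));
  }

  // Return every matching line with its path and 1-based line number.
  grep(pattern: RegExp): GrepHit[] {
    // Drop the g/y flags so RegExp.test() keeps no state between lines.
    const matcher = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
    const hits: GrepHit[] = [];
    this.files.forEach((content, path) => {
      content.split("\n").forEach((text, index) => {
        if (matcher.test(text)) hits.push({ path, line: index + 1, text });
      });
    });
    return hits;
  }
}

const workspace = new VirtualWorkspace();
workspace.writeFile("src/a.ts", "// TODO: handle errors\nexport const a = 1;\n");
workspace.writeFile("src/b.ts", "export const b = 2;\n");
workspace.edit("src/b.ts", "= 2", "= 3");
const todoHits = workspace.grep(/TODO:/);
```

Nothing here needs a kernel or a mounted disk. Every operation is a string transformation plus a keyed read or write, which is why it fits inside a Durable Object.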

For agents that need a full OS (npm install, git, compilers), Cloudflare Containers provides that. Cloudflare is also building `@cloudflare/workspace` to keep the virtual filesystem of a Durable Object in sync with a container's, allowing seamless transition from lightweight Workers to a Linux environment only when needed.

**The tradeoff:** The virtual filesystem covers text operations well. Binary files, large datasets, and performance-sensitive I/O still need containers. The workspace sync pattern addresses this, but it's not GA yet.

## Can agents write their own workflows?

A harness manages a single turn. What happens when an agent needs to orchestrate a multi-step pipeline that runs consistently over time? A code review workflow that checks out the branch, runs tests, scans for security issues, and posts results. A research workflow that searches the web, reads sources, drafts a document, and iterates on feedback.

Claude Code recently shipped dynamic workflows: Claude writes a JavaScript script at runtime to hand off work to dozens of sub-agents, and the runtime executes it durably. Cloudflare's `@cloudflare/dynamic-workflows` provides the same pattern for any harness running on the Agents SDK.

Your agent generates a workflow at runtime. The Workflows engine persists each step, retries failures, and can sleep for hours or wait for external events like human approval. From the Agent class, `runWorkflow()` connects your agent to the Workflows engine. The agent kicks off the workflow and hibernates. The workflow calls back into the agent via RPC to report progress, update state, or request approval. When the workflow finishes, the agent wakes up with the result.
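
The two guarantees that matter here, persisted steps and retries, fit in a short sketch. This is a toy, not Cloudflare Workflows: `StepRunner` is a hypothetical name, it runs synchronously, and it has no sleep or external-event support.

```typescript
// Hypothetical mini-engine, not Cloudflare Workflows.
class StepRunner {
  // Completed step results keyed by step name (stand-in for durable storage).
  private readonly completed = new Map<string, unknown>();
  attempts = 0;

  step<T>(name: string, fn: () => T, maxAttempts = 3): T {
    // Replay: a step that already finished is never executed again.
    if (this.completed.has(name)) return this.completed.get(name) as T;
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      this.attempts++;
      try {
        const result = fn();
        this.completed.set(name, result);
        return result;
      } catch (err) {
        lastError = err; // retry until maxAttempts is exhausted
      }
    }
    throw lastError;
  }
}

let testRuns = 0;

// The kind of pipeline an agent might generate at runtime.
function reviewWorkflow(run: StepRunner): string {
  const branch = run.step("checkout", () => "feature/login");
  const tests = run.step("run-tests", () => {
    testRuns++;
    if (testRuns === 1) throw new Error("flaky test runner");
    return "42 passed";
  });
  return run.step("post-results", () => `${branch}: ${tests}`);
}

const runner = new StepRunner();
const firstRun = reviewWorkflow(runner); // run-tests fails once, then succeeds
const replay = reviewWorkflow(runner); // every step is served from the store
```

The replay executes no step bodies at all. That's what lets a real engine restart a workflow after a failure without redoing the checkout or re-running the tests.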

**The tradeoff:** Dynamic workflows are powerful but add operational complexity. Each workflow step needs error handling. The model generates the workflow definition, which means it can have bugs. You need monitoring and alerting on top. For simple agent tasks, a single `runFiber()` with checkpointing is sufficient.

## How does the Session API handle long conversations?

Agents that run for days or weeks need more than a flat list of messages. Cloudflare's experimental Session API models conversations as trees, where each message has a `parent_id`. This enables:

- **Forking:** Explore an alternative path without losing the original conversation. The agent can try a different approach, and if it fails, resume from the fork point.
- **Non-destructive compaction:** Summarize older messages rather than deleting them. The summary sits as an overlay on the tree. The original messages persist for reference.
- **Full-text search:** FTS5 across conversation history. Find that decision the agent made three days ago about which API to use.

```typescript
import { Agent } from "agents";
import { Session, SessionManager } from "agents/experimental/memory/session";

export class MyAgent extends Agent {
  sessions = SessionManager.create(this);

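  async onStart() {
    const session = this.sessions.create("main");
    // Linear history of the current branch
    const history = session.getHistory();
    // Placeholder: in practice, take the ID from a message in `history`
    const messageId = "<message-id-to-branch-from>";
    // Fork at that message to explore an alternative without losing the original
    const forked = this.sessions.fork(
      session.id,
      messageId,
      "alternative-approach"
    );
  }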
}
```
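
If the tree model is hard to picture, this in-memory sketch shows how `parent_id` links make forking cheap. It is not the Session API: `ConversationTree` and its methods are hypothetical, and only the `parent_id` field name comes from the description above.

```typescript
// Hypothetical in-memory model of a tree-shaped conversation; not the Session API.
type Message = { id: string; parent_id: string | null; text: string };

class ConversationTree {
  private readonly messages = new Map<string, Message>();
  private nextId = 1;

  // Append a message under `parentId` (null starts a new root).
  append(parentId: string | null, text: string): Message {
    if (parentId !== null && !this.messages.has(parentId)) {
      throw new Error(`unknown parent: ${parentId}`);
    }
    const message: Message = { id: `m${this.nextId++}`, parent_id: parentId, text };
    this.messages.set(message.id, message);
    return message;
  }

  // History of one branch: walk parent_id links from the leaf back to the root.
  history(leafId: string): string[] {
    const path: string[] = [];
    let current = this.messages.get(leafId);
    while (current) {
      path.unshift(current.text);
      current = current.parent_id === null ? undefined : this.messages.get(current.parent_id);
    }
    return path;
  }
}

const tree = new ConversationTree();
const ask = tree.append(null, "user: which API should we use?");
const rest = tree.append(ask.id, "agent: try the REST API");
const restFail = tree.append(rest.id, "agent: REST lacks batch support");
// Fork: a second child of `ask` explores an alternative without touching the first branch.
const graphql = tree.append(ask.id, "agent: try the GraphQL API");

const original = tree.history(restFail.id);
const alternative = tree.history(graphql.id);
```

Both branches share the first message and nothing is copied or deleted. A fork is just a new child under an older node.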

The Session API is the storage layer that the Think base class builds on. Think is Cloudflare's opinionated harness. It takes 3 lines to subclass: implement `getModel()`, `getSystemPrompt()`, and `getTools()`, and the agent loop runs with everything wired.

**The tradeoff:** Tree-structured sessions are more powerful than flat message lists, but also more complex to work with. If your agent only runs for 5-10 turns per session, the tree adds complexity without benefit. For long-running agents with branching exploration, it's essential.

## What is Agent Memory?

[Cloudflare announced Agent Memory](https://blog.cloudflare.com/introducing-agent-memory/) as a private beta on April 17, 2026. It's a managed service for persistent agent memory, with an opinionated API:

- **Ingest:** Bulk path, typically called when the harness compacts context. Dumps a conversation into the memory store.
- **Remember:** Model-initiated storage. The agent decides something is important and calls this explicitly.
- **Recall:** Full retrieval pipeline. Returns a synthesized answer from stored memories.

Agent Memory is a separate managed service, not built into the Agents SDK directly. It runs alongside your Durable Object, providing a higher-level memory abstraction than the raw SQLite store.

**The tradeoff:** You now have two storage concerns: the Durable Object's SQLite (for session state, fibers, conversation history) and Agent Memory (for long-term recall). Managing consistency between them is your responsibility.

## Flue: the framework that ties it together

Flue isn't a harness. It's a framework built on a harness. The distinction matters.

**Pi** is the harness: the agentic loop that calls the model, executes tools, manages context, and streams results. OpenClaw uses Pi. Flue uses Pi. You can use Pi directly without Flue.

**Flue** is the framework: the project structure, CLI, conventions, and integrations that make agents productive to build.

The key difference from most agent frameworks: Flue is declarative. You don't script what your agent does. You describe what it knows (its model, skills, sandbox, and instructions), and it solves whatever task you give it, autonomously.

```typescript
// A triage agent in about a dozen lines
import { createAgent } from "@flue/runtime";

export default createAgent(() => ({
  model: "anthropic/claude-sonnet-4-6",
  instructions: `You are a triage agent. When a bug report arrives:
1. Fetch the issue details using the GitHub channel
2. Reproduce the bug in the sandbox
3. Diagnose the root cause
4. Propose a fix`,
  skills: ["reproduce-bug", "diagnose-issue"],
  sandbox: "virtual",
}));
```

### Durable Streams

Flue's answer to crash recovery is Durable Streams. Each event in the execution history is written to an append-only log, built on the [Electric Durable Streams](https://electric.ax/blog/2026/04/08/data-primitive-agent-loop) pattern. Every prompt, tool response, and model choice is recorded in an immutable ledger. If a process dies, another picks up the log and continues from the exact step where the first one left off.

On the Cloudflare target, this maps to the Agents SDK's fiber system. Off Cloudflare (Node, CI), it maps to files on disk.
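
Here's a small sketch of how replaying an append-only log tells a new process where to resume. The event shapes, `AppendOnlyLog`, and `pendingToolCalls` are my own illustration, not Flue's log format or API.

```typescript
// Hypothetical event shapes; not Flue's actual log format.
type LoggedEvent =
  | { type: "prompt"; text: string }
  | { type: "tool_call"; callId: string; tool: string }
  | { type: "tool_result"; callId: string; output: string };

class AppendOnlyLog {
  private readonly events: LoggedEvent[] = [];

  // Events can be added but never edited or removed.
  append(event: LoggedEvent): number {
    this.events.push(event);
    return this.events.length - 1; // offset of the new event
  }

  read(): readonly LoggedEvent[] {
    return this.events;
  }
}

// Replay the log to find tool calls that were started but never finished.
function pendingToolCalls(log: AppendOnlyLog): string[] {
  const pending = new Map<string, string>();
  for (const event of log.read()) {
    if (event.type === "tool_call") pending.set(event.callId, event.tool);
    if (event.type === "tool_result") pending.delete(event.callId);
  }
  return Array.from(pending.values());
}

const log = new AppendOnlyLog();
log.append({ type: "prompt", text: "Summarize the latest release notes" });
log.append({ type: "tool_call", callId: "c1", tool: "search" });
log.append({ type: "tool_result", callId: "c1", output: "3 results" });
log.append({ type: "tool_call", callId: "c2", tool: "fetch_page" });
// The process dies here, before fetch_page returns.

const toResume = pendingToolCalls(log);
```

A recovering process reads the same log, sees that `search` already has a result, and only needs to redo `fetch_page`.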

### Channels

Flue ships pre-configured integrations for where users already work:

```
flue add channel slack     # Generates Markdown blueprint for Slack
flue add channel github    # Generates Markdown blueprint for GitHub
flue add channel linear    # Generates Markdown blueprint for Linear
flue add channel discord   # Generates Markdown blueprint for Discord
```

Each channel handles event verification and dispatch boilerplate automatically. The generated Markdown blueprint is something your own coding agent can read, modify, and integrate cleanly into your codebase.

### Frontend streaming

`@flue/react` provides React hooks that stream an agent's state, tool execution, and live messages directly into a frontend application. No custom real-time plumbing needed.

### Multi-cloud deployment

Flue deploys to Cloudflare Workers (each agent becomes a Durable Object), Node.js (long-lived process), GitHub Actions, GitLab CI/CD, Render, and Fly.io. The same agent code runs everywhere. The harness adapts, not the agent.

## How does Flue map to Cloudflare primitives?

When you deploy Flue to Cloudflare, every agent becomes a Durable Object. Flue uses:

| Flue concept | Cloudflare primitive | What it does |
|---|---|---|
| Agent module | `Agent` class extending Durable Object | Identity, state, lifecycle |
| Durable Streams | `runFiber()` / `stash()` / `onFiberRecovered()` | Crash recovery, checkpointing |
| Code tool | `@cloudflare/codemode` | Dynamic Workers for TypeScript execution |
| Workspace filesystem | `@cloudflare/shell` + SQLite-backed virtual FS | Read, write, grep, diff without containers |
| session() | Session API / Agent Memory | Tree-structured conversations, persistence |
| task() | Sub-agents via Facets | Child DO with isolated SQLite |
| workflow() | `@cloudflare/dynamic-workflows` | Multi-step durable pipelines |
| Integrations | Bindings (AI Gateway, Browser, Email, etc.) | Credential-safe access to ecosystem |

The mapping is clean because the Flue team built the Cloudflare target to mirror the Agents SDK primitives directly. You can dig into the Flue source code to see how Pi, the underlying harness, adapts to Cloudflare.

## What does this mean for your agent architecture?

The AWS AgentCore approach says: the harness is a managed service. You configure it, AWS runs it. You trade customization for speed.

The Cloudflare + Flue approach says: the harness is an open-source framework. You run it on platform primitives. You trade setup effort for control and portability.

Both arrived this week. Both are production-grade. The right choice depends on where you are in your agent journey and what you're willing to operate.

For teams that want to ship an agent in an afternoon and never think about the infrastructure: AWS AgentCore is the faster path.

For teams that want to understand every layer of their agent stack, deploy across clouds, and build agent infrastructure that compounds: Cloudflare + Flue is the more durable path.

The economics of Durable Objects change the conversation too. When each agent costs zero until it wakes up, "one agent per customer" becomes viable. That's a product decision, not just an infrastructure decision.

## FAQ

> **Is Flue production-ready?**
> Flue shipped 1.0 Beta on June 17, 2026. The Pi harness underneath powers OpenClaw in production. The Cloudflare integration uses GA primitives from the Agents SDK.
>
> **Can I use Flue without Cloudflare?**
> Yes. Flue deploys to Node.js, GitHub Actions, GitLab CI, Render, and Fly.io. Cloudflare is one target, not the only target.
>
> **What is the difference between Pi, Think, and Flue?**
> Pi is an open-source agent harness (the agentic loop). Think is Cloudflare's opinionated base class built on the Agents SDK. Flue is a framework built on Pi with CLI, integrations, and multi-cloud deploy.
>
> **Do I need containers to run Flue agents on Cloudflare?**
> No. Most agent work (file reads, code execution, shell commands) runs in V8 isolates via `@cloudflare/codemode` and `@cloudflare/shell`. Containers are available when you need npm install, git, or compilers.
>
> **How does Flue handle secrets?**
> Cloudflare uses bindings: credentials are injected at the platform level, never visible to LLM-generated code. Flue's `defineCommand()` pattern on other targets does the same: env vars bound at definition, injected at execution, never in the prompt.

## Related Posts

- [AWS just turned the agent harness into a managed service](/posts/aws-bedrock-agentcore-harness-managed/). The managed alternative to the Cloudflare + Flue approach, shipped the same week.
- [The 15 jobs every agent harness must do](/posts/agent-harness-15-jobs/). The framework that every agent harness implements, mapped across AWS, Cloudflare, and Flue.
- [Code as agent harness: a survey](/posts/code-as-agent-harness-survey/). How Claude Code, Codex, and Kiro implement the harness pattern before the platform era.
- [Claude Code harness architecture](/posts/claude-code-harness-architecture-98-percent/). The 98.4% non-model code that makes Claude Code work, and how Pi and Think build on the same ideas.

---

This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents. Reach me at hello@agenticup.dev