
The 6 Layers of an AI Agent Infrastructure Stack in 2026 (And the Missing 7th)

2026-05-27 · screencli
Tags: ai-agents, infrastructure, cloudflare, browser-automation, developer-tools

TL;DR: During Agents Week 2026 (April 13–17), Cloudflare assembled a six-layer infrastructure stack for AI agents: compute, sandboxes, orchestration, memory, browsing, and commerce. According to InfoQ, it’s the most complete update any cloud provider has shipped. But every team running agents in production hits the same wall — there’s no native layer for recording and sharing what the agent actually did. screencli is the open-source CLI that fills that gap with one command.

What is an AI agent infrastructure stack?

An AI agent infrastructure stack is the set of cloud primitives an autonomous agent needs to do real work: somewhere to run code, somewhere to keep state, a way to call out to the web, and a way to spend money. In 2025 these were stitched together by hand. In 2026, providers ship them as one product surface.

Cloudflare’s stack is the clearest example shipped to date. Here’s every layer, followed by the one that’s still missing.

The six layers (and the seventh)

# | Layer | Cloudflare product | What it does
1 | Compute | Dynamic Workers | Boot AI-generated code in milliseconds on V8 isolates
2 | Sandboxes | Cloudflare Sandboxes | Run untrusted code in isolated containers
3 | Orchestration | Dynamic Workflows | Coordinate long-running, multi-step agent runs
4 | Memory | Agent Memory (beta) | Persist and retrieve agent context across sessions
5 | Browsing | Browser Run | Drive headless Chrome on demand for web tasks
6 | Commerce | Stripe Projects | Let agents authorize payments and check out
7 | Recording / share | Missing: screencli | Capture what the agent did as a shareable video

1. Compute — Dynamic Workers

Cloudflare Dynamic Workers run AI-generated code in milliseconds. They use V8 isolates, the same primitive that powers Workers, and according to Cloudflare are roughly 100× faster to boot and 10–100× more memory-efficient than containers for the small, ephemeral tasks agents generate (lint, typecheck, API calls, glue code).

This matters because agents emit hundreds of small code blobs per session. Spinning up a fresh container for each one is wasteful; isolates make per-step execution practical.

2. Sandboxes — for untrusted execution

Sandboxes are the heavier sibling of Dynamic Workers. When an agent needs to run a full Linux process — pip installs, headless browsers, shell scripts — it gets a dedicated container with filesystem, network policy, and resource limits. Cloudflare’s Sandboxes launched alongside Dynamic Workers during Agents Week, completing the compute spectrum from millisecond isolates to long-running containers.

The pattern is: route trusted, fast code to isolates; route untrusted or stateful workloads to sandboxes.
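To make the routing rule concrete, here is a minimal sketch. The `Task` fields and the `route` function are hypothetical names invented for illustration; they are not part of any Cloudflare API, and a real router would likely weigh more signals than these three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of one unit of agent-generated work."""
    trusted: bool           # code comes from a source we trust
    needs_filesystem: bool  # e.g. pip installs or shell scripts
    long_running: bool      # outlives a short request


def route(task: Task) -> str:
    """Send trusted, fast, stateless code to an isolate; everything else to a sandbox."""
    if task.trusted and not task.needs_filesystem and not task.long_running:
        return "isolate"
    return "sandbox"


print(route(Task(trusted=True, needs_filesystem=False, long_running=False)))   # isolate
print(route(Task(trusted=False, needs_filesystem=False, long_running=False)))  # sandbox
```

Defaulting to the sandbox when any condition fails keeps the sketch conservative: an unknown workload gets the stronger isolation boundary.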

3. Orchestration — Dynamic Workflows

Dynamic Workflows coordinate multi-step agent runs that outlive a single request. Agents in production rarely complete a task in one shot: they plan, execute, validate, retry, and hand off to other agents. According to Salesmate’s 2026 trends report, most production deployments now rely on multi-agent orchestration with specialized roles (planner, executor, validator).

Dynamic Workflows replaces the bespoke state machine every team used to build. It durably checkpoints each step, so a workflow can pick up where it left off after an hours-long wait, a retry, or a restart.
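The checkpoint-and-resume idea can be sketched in a few lines. This is not how Dynamic Workflows is implemented; it is a toy version that assumes a local JSON file as the durable store, and `run_workflow` and `demo` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[dict], dict]]


def run_workflow(steps: list[Step], checkpoint: Path) -> dict:
    """Run steps in order, persisting state after each one so a rerun can resume."""
    if checkpoint.exists():
        saved = json.loads(checkpoint.read_text())
    else:
        saved = {"done": [], "state": {}}
    for name, step in steps:
        if name in saved["done"]:
            continue  # completed in an earlier run, so skip it
        saved["state"] = step(saved["state"])
        saved["done"].append(name)
        checkpoint.write_text(json.dumps(saved))  # persist before the next step
    return saved["state"]


def demo() -> tuple[list[str], dict]:
    """Crash on the first 'execute' attempt, then resume from the checkpoint."""
    calls: list[str] = []
    attempts = {"execute": 0}

    def plan(state: dict) -> dict:
        calls.append("plan")
        return {**state, "plan": "ready"}

    def execute(state: dict) -> dict:
        calls.append("execute")
        attempts["execute"] += 1
        if attempts["execute"] == 1:
            raise RuntimeError("simulated crash")
        return {**state, "result": "done"}

    steps: list[Step] = [("plan", plan), ("execute", execute)]
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = Path(tmp) / "workflow.json"
        try:
            run_workflow(steps, checkpoint)
        except RuntimeError:
            pass  # the first run dies midway
        final = run_workflow(steps, checkpoint)  # the second run resumes
    return calls, final


print(demo())
```

In the demo, `plan` runs only once even though the workflow is started twice, which is the property a durable orchestrator is meant to provide.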

4. Memory — Agent Memory

Agent Memory extracts and retrieves long-term context for agents. Cloudflare’s beta service uses a dual-pass ingestion pipeline to pull structured memories from conversations, and a five-channel parallel search with Reciprocal Rank Fusion to retrieve them at runtime. The point: agents stop hallucinating yesterday’s context and stop blowing the context window on irrelevant history.
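Reciprocal Rank Fusion itself is a small, well-known formula: each item scores the sum of 1 / (k + rank) over every ranked list it appears in. The sketch below shows the general technique, not Cloudflare's implementation; the constant k = 60 is the value commonly used in the literature and is an assumption here, and the memory IDs and the five sample channels are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists into one, best item first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by item ID for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Five hypothetical search channels, each returning memory IDs best-first.
channels = [
    ["m1", "m2", "m3"],
    ["m2", "m1"],
    ["m2", "m4"],
    ["m3", "m2"],
    ["m2"],
]
print(reciprocal_rank_fusion(channels))  # ['m2', 'm1', 'm3', 'm4']
```

Because only ranks are used, the channels do not need comparable scores, which is one reason the method suits mixing different kinds of search.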

This is the layer that turns “stateless chatbot” into “coworker who remembers.”

5. Browsing — Browser Run

Browser Run gives agents a headless Chrome instance, on demand, near the user. The May 13, 2026 rebuild on Cloudflare Containers delivered measurable gains: 4× concurrency and 50% faster.

According to Cloudflare, agents can now spin up 60 browsers per minute per binding with state managed transactionally. This is the layer where most “what did my agent see?” debugging starts.
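If you want your own code to stay under a launch budget like this one, a client-side sliding-window limiter is one option. This is a sketch under assumptions: the source does not say how the limit is enforced or measured, `LaunchLimiter` is a hypothetical class, and the clock is passed in explicitly so the behavior is easy to test.

```python
from collections import deque


class LaunchLimiter:
    """Allow at most `max_launches` launches in any `window_seconds` window."""

    def __init__(self, max_launches: int = 60, window_seconds: float = 60.0) -> None:
        self.max_launches = max_launches
        self.window_seconds = window_seconds
        self._launches: deque[float] = deque()  # timestamps of recent launches

    def try_launch(self, now: float) -> bool:
        """Record a launch at time `now` and return True if the budget allows it."""
        # Drop launches that have aged out of the window.
        while self._launches and now - self._launches[0] >= self.window_seconds:
            self._launches.popleft()
        if len(self._launches) < self.max_launches:
            self._launches.append(now)
            return True
        return False


limiter = LaunchLimiter()
results = [limiter.try_launch(now=i * 0.5) for i in range(61)]
print(results.count(True))  # 60
```

In production you would pass a monotonic clock reading (for example `time.monotonic()`) as `now` and queue or delay the launches that come back `False`.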

6. Commerce — Stripe Projects

Stripe Projects lets agents transact safely. Scoped payment authorizations, spending caps, audit logs — the primitives needed for an agent to book a flight, top up an API key, or pay a vendor without handing it your root credit card. This is the layer that turns “agent that recommends” into “agent that executes.”
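To show how those three primitives fit together, here is a toy model of a scoped authorization with a spending cap and an audit log. It is not the Stripe API: `ScopedAuthorization`, its fields, and the merchant names are all hypothetical, and a real system would enforce these checks server-side.

```python
from dataclasses import dataclass, field


@dataclass
class ScopedAuthorization:
    """Hypothetical payment scope an agent is allowed to spend within."""
    allowed_merchants: frozenset[str]
    cap_cents: int
    spent_cents: int = 0
    audit_log: list[dict] = field(default_factory=list)

    def charge(self, merchant: str, amount_cents: int) -> bool:
        """Approve the charge only if it is in scope and under the cap; log every attempt."""
        approved = (
            merchant in self.allowed_merchants
            and amount_cents > 0
            and self.spent_cents + amount_cents <= self.cap_cents
        )
        if approved:
            self.spent_cents += amount_cents
        self.audit_log.append(
            {"merchant": merchant, "amount_cents": amount_cents, "approved": approved}
        )
        return approved


auth = ScopedAuthorization(allowed_merchants=frozenset({"airline"}), cap_cents=50_000)
print(auth.charge("airline", 42_000))  # True: in scope and under the cap
print(auth.charge("airline", 10_000))  # False: would exceed the cap
print(auth.charge("casino", 100))      # False: merchant not in scope
```

Rejected attempts are logged too, since an audit trail is most useful when it shows what the agent tried, not only what succeeded.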

7. The missing layer — Recording and sharing

Every layer above produces work. None of them produces visual evidence. When your agent finishes a 47-step browser workflow on a customer’s behalf, you have logs and JSON traces of the run.

What you don’t have is the thing every stakeholder actually asks for: a video of what happened. Sales wants it for demos. DevRel wants it for tutorials. Support wants it for incident retros. End users want it as proof. Compliance wants it for audit. Agent observability platforms like AgentOps offer session replay for engineers, but the output is timestamped JSON, not a video your CEO can post on X.

The status quo is that someone manually re-runs the workflow with Loom or Screen Studio. That’s slow, breaks on every UI change, and defeats the whole point of having an agent in the first place.

How screencli fills the gap

screencli is an open-source CLI that gives any AI agent a screen recorder. One command produces a polished video — auto-trimmed, auto-zoomed to actions, with click highlights, cursor trails, gradient background, and a shareable link.

npx screencli record \
  --url https://your-app.com \
  --prompt "Sign up, create a new project, invite a teammate" \
  --share

Run it locally, in CI, or inside a Cloudflare Container next to Browser Run. Output: an MP4 and a screencli.sh/v/... link. No editor, no retakes.
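For CI, it can help to build the command programmatically so the URL and prompt come from pipeline variables. The sketch below only assembles the argument list using the three flags shown above (`--url`, `--prompt`, `--share`); `build_record_command` is a hypothetical helper, and no other screencli options are assumed.

```python
import shlex


def build_record_command(url: str, prompt: str, share: bool = True) -> list[str]:
    """Assemble the screencli invocation as an argument list (no shell quoting needed)."""
    command = ["npx", "screencli", "record", "--url", url, "--prompt", prompt]
    if share:
        command.append("--share")
    return command


command = build_record_command(
    "https://your-app.com",
    "Sign up, create a new project, invite a teammate",
)
# Print a copy-pasteable version of the command for the CI log.
print(shlex.join(command))
# To actually record, pass the list to subprocess.run(command, check=True).
```

Passing a list rather than a single string avoids quoting bugs when the prompt contains commas, quotes, or spaces.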

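Where it slots into the stack: as the seventh layer, recording what the browsing layer (Browser Run) and the layers beneath it did.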

The CLI is MIT-licensed, runs in any Node 20+ environment, and uses FFmpeg under the hood for composition — single-pass, no GPU required.

How the recording layer compares to alternatives

Columns compared: Tool | CLI-native | Agent-driven | Auto post-production | Open source | Built for headless
screencli: ✅ (MIT)
Loom:
Screen Studio: ✅ (manual)
OBS Studio: Partial
AgentOps (replay): N/A (JSON only); Partial

Loom and Screen Studio assume a human is driving. OBS is a recording engine, not a post-production pipeline. AgentOps gives you session traces, not videos. screencli is the only option built for the “agent finishes a task, we need a shareable artifact” workflow.

Why this matters now

According to Google Cloud’s 2026 trends report, the honeymoon for AI demos is over — every agent is expected to justify itself with measurable impact. That means showing the work, not just shipping the result. Teams that can produce a fresh agent-driven demo per release, per customer, or per PR will outrun teams that re-record by hand on every UI change.

Cloudflare’s six-layer stack handles the execution. screencli handles the proof.

FAQ

What is an AI agent infrastructure stack? The set of cloud primitives an autonomous agent needs to run: compute, sandboxing, orchestration, memory, browsing, commerce, and (in production) recording. Cloudflare assembled six of these into one product surface during Agents Week 2026.

What did Cloudflare announce during Agents Week 2026? Cloudflare shipped or upgraded six layers: Dynamic Workers (compute), Sandboxes, Dynamic Workflows (orchestration), Agent Memory (beta), Browser Run rebuilt on Containers (4× concurrency, 50% faster), and Stripe Projects (commerce). InfoQ called it the most complete agent infrastructure update from any cloud provider to date.

How is Browser Run different from browser-use or Playwright MCP? Browser Run is infrastructure — managed headless Chrome with per-region pools. browser-use, Stagehand, and Playwright MCP are agent frameworks that use a browser. You pair them: framework drives, Browser Run hosts.

Why isn’t recording part of cloud agent platforms yet? Agent observability platforms (AgentOps, Maxim, Galileo) cover session replay for engineers, but the output is JSON traces and trace UIs. There’s no native primitive in any major cloud for “give me a polished MP4 of what the agent did.” screencli fills that gap.

Can screencli run inside a Cloudflare Container? Yes. screencli is an MIT-licensed Node CLI that spawns FFmpeg. It runs anywhere Node 20+ runs, including Cloudflare Containers. The screencli cloud renderer itself runs on Cloudflare Containers.

Try it

npx screencli record --url https://your-app.com --prompt "Sign up and create a project" --share

One command, one link. screencli.sh


Related reading: How AI coding agents create demo videos, 5 ways to automate a browser with an AI agent, Claude can now use your computer — here’s the one thing it’s still missing.