01 Execution Model
Edge runtimes are not Node.js. They implement the WinterCG Web-compatible runtime API — a subset of the browser's global API surface. This means `fetch`, `Request`, `Response`, `URL`, `TextEncoder`, `crypto.subtle`, and `ReadableStream` are available, but not `fs`, `net`, `child_process`, or any native Node.js module.
Execution is event-driven. Each inbound HTTP request is dispatched to a `fetch` handler — an exported object with an async `fetch` method (Cloudflare Workers module syntax) or a default export function (Next.js / Vercel Edge Runtime model):
```ts
// Cloudflare Workers — module-syntax fetch handler
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname === '/api/data') {
      return handleData(request, env);
    }
    return new Response('Not Found', { status: 404 });
  }
};
```
Workers are isolated per-request using V8 isolates, not OS processes. Cold-start latency is typically sub-millisecond — orders of magnitude lower than Lambda cold starts. The tradeoff is tight memory and CPU budget constraints (see §8 Constraints).
02 Request Routing
Route matching at the edge is defined declaratively (config files) or imperatively (runtime code). Both approaches are used; the choice affects where logic executes and at what cost.
Declarative — config-based routing
Platforms like Vercel and Netlify read a `vercel.json` / `netlify.toml` manifest. Rewrites, redirects, and header rules are applied before any JavaScript runs — zero CPU cost.
```jsonc
// vercel.json — declarative rewrites
{
  "rewrites": [
    { "source": "/api/:path*", "destination": "https://api.internal/:path*" }
  ],
  "headers": [
    {
      "source": "/static/:path*",
      "headers": [
        { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
      ]
    }
  ]
}
```
Imperative — runtime routing
When routing depends on request data (auth tokens, geolocation headers, A/B buckets), it must be done in a Worker. Keep this path as lightweight as possible — a routing Worker that calls a heavyweight downstream service adds latency instead of removing it.
```ts
import { NextResponse, type NextRequest } from 'next/server';

export async function middleware(request: NextRequest) {
  const country = request.geo?.country ?? 'US';
  const locale = resolveLocale(country); // pure lookup, no I/O
  const url = request.nextUrl.clone();
  url.pathname = `/${locale}${url.pathname}`;
  return NextResponse.rewrite(url);
}
```
03 Caching Patterns
The edge cache is a shared, region-scoped HTTP cache. It respects standard `Cache-Control` semantics and can be programmatically read and written from Workers via the `caches` API.
Stale-While-Revalidate (SWR)
Serve a cached response immediately, then regenerate in the background. Requires the origin to emit the correct directive:
```http
Cache-Control: public, s-maxage=60, stale-while-revalidate=600
```
In Cloudflare's terminology this is Cache API + waitUntil — the response is returned to the client while the Worker schedules a background fetch to warm the cache.
```ts
export default {
  async fetch(req: Request, env: Env, ctx: ExecutionContext) {
    const cache = caches.default;
    const cacheKey = new Request(req.url);
    const cached = await cache.match(cacheKey);
    if (cached) {
      ctx.waitUntil(refreshCache(cacheKey, cache)); // background revalidation
      return cached;
    }
    const fresh = await fetch(req);
    ctx.waitUntil(cache.put(cacheKey, fresh.clone()));
    return fresh;
  }
};
```
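The `refreshCache` helper referenced above is not defined in the snippet; a minimal sketch (the body is an assumption) simply re-fetches the origin and overwrites the cached entry:

```ts
// Background revalidation: re-fetch the origin and overwrite the cache entry.
// Runs inside ctx.waitUntil(), so it never delays the client response.
async function refreshCache(cacheKey: Request, cache: Cache): Promise<void> {
  const fresh = await fetch(cacheKey);
  if (fresh.ok) {
    await cache.put(cacheKey, fresh);
  }
}
```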
Cache tagging / surrogate keys
Cloudflare Cache Tags and Fastly Surrogate-Key headers let you purge groups of responses without knowing their exact URLs. Tag responses at write time, purge by tag on data mutation:
```http
Cache-Tag: product-42, category-shoes, tenant-acme
```
```ts
// Purge via Cloudflare API on product update
await fetch(
  `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ tags: [`product-${id}`] })
  }
);
```
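The write-time half of the pattern is just a response header. A minimal sketch (helper name hypothetical) that annotates a response with tags before it is written to the cache:

```ts
// Copy the response (status and headers preserved, headers made mutable)
// and attach cache tags; the edge cache indexes the object under each tag.
function withCacheTags(res: Response, tags: string[]): Response {
  const tagged = new Response(res.body, res);
  tagged.headers.set('Cache-Tag', tags.join(', '));
  return tagged;
}
```

Used at write time, e.g. `cache.put(key, withCacheTags(originRes, ['product-42', 'tenant-acme']))`.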
Cache topology
| Layer | Scope | Invalidation | Notes |
|---|---|---|---|
| Browser cache | Single user | Hard reload / max-age | Uncontrollable after delivery |
| Edge cache (CDN) | PoP region | Tag purge / TTL | Shared across users in region |
| Tiered cache | Regional super-PoP | Cascading purge | Reduces origin load; adds ~5 ms hop |
| Origin cache | Global | Manual / TTL | Redis, Memcached, in-proc LRU |
04 SSR & Streaming
Edge SSR renders HTML at a CDN node rather than on an origin server. Combined with HTTP streaming, it lets the browser start parsing above-the-fold content while the edge is still fetching data for the rest of the page.
React Server Components + Edge Runtime
In Next.js 13+, setting `export const runtime = 'edge'` in a route segment moves rendering to the edge runtime. Only WinterCG-compatible APIs are available within that segment.
```tsx
// app/products/page.tsx
export const runtime = 'edge';

export default async function ProductsPage() {
  // Runs at the edge — fetch() only, no Node.js APIs
  const products = await getProducts(); // hits an HTTP API
  return <ProductList products={products} />;
}
```
Streaming with Suspense boundaries
Wrap slow data-fetching components in `<Suspense>`. React flushes the shell HTML immediately, then streams each Suspense boundary as its data resolves. The edge node acts as the streaming proxy — no full-page buffer required.
```tsx
export default function Page() {
  return (
    <>
      <AboveTheFold /> {/* flushed immediately */}
      <Suspense fallback={<Skeleton />}>
        <SlowComponent /> {/* streamed when ready */}
      </Suspense>
    </>
  );
}
```
Within a single component, avoid sequential await waterfalls — fetch independent data in parallel with Promise.all.
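A sketch of the parallel pattern (the endpoint paths and the injected fetcher are hypothetical):

```ts
// Two independent fetches resolved concurrently: total latency is
// max(A, B) instead of A + B. The fetcher is injected to keep this testable.
async function loadPageData(fetchJson: (path: string) => Promise<unknown>) {
  const [products, reviews] = await Promise.all([
    fetchJson('/api/products'),
    fetchJson('/api/reviews'),
  ]);
  return { products, reviews };
}
```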
05 Data Access
The biggest architectural constraint at the edge is data proximity. A Worker at a Cloudflare PoP in Frankfurt should not query a Postgres instance in us-east-1 — the TCP round-trip alone is 100+ ms, eliminating the edge latency benefit entirely.
Options by latency tier
| Data store | Latency | Best for |
|---|---|---|
| KV (Cloudflare KV, Vercel KV) | < 5 ms | Config, feature flags, session tokens, static content |
| Durable Objects / D1 (same PoP) | 1–10 ms | Per-user state, real-time coordination, transactional writes |
| Distributed SQL (Turso, Neon, PlanetScale) | 5–30 ms | Read-heavy relational data with regional replicas |
| Upstash Redis (nearest region) | 5–25 ms | Rate limiting, caching, pub/sub |
| Origin Postgres (single region) | 80–300 ms | Write-critical data — keep writes at origin, not edge |
Read-replica pattern
Separate read and write paths. All writes go to origin; reads are served from the nearest edge-compatible replica. This is the standard approach with HTTP-based drivers such as Prisma Accelerate or Neon's serverless driver:
```ts
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL!);

// HTTP-based query — works in WinterCG runtimes
const rows = await sql`SELECT * FROM products WHERE id = ${id}`;
```
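The split itself can be sketched without committing to any particular driver (the names, origin URL, and injected query runner below are assumptions, not a specific library's API):

```ts
// Read/write split: reads go through a replica-backed query runner,
// writes are forwarded to the single-region origin API.
type Sql = (query: string, params: unknown[]) => Promise<unknown[]>;

function makeProductStore(replicaSql: Sql, originUrl: string) {
  return {
    // Read path — nearest replica, low latency.
    get: (id: number) => replicaSql('SELECT * FROM products WHERE id = $1', [id]),
    // Write path — origin only; never write through the replica.
    update: (id: number, patch: object) =>
      fetch(`${originUrl}/products/${id}`, {
        method: 'PATCH',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify(patch),
      }),
  };
}
```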
06 State at the Edge
Edge Workers are stateless by default. In-memory variables are per-isolate and not shared between requests or PoP nodes. Any state that must persist or be shared requires an explicit backing store.
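To make the per-isolate caveat concrete: a module-level variable works as a best-effort memo cache, but each isolate (and each PoP) holds its own copy and eviction wipes it, so it must only hold re-derivable data — never correctness-critical state. A sketch:

```ts
// Per-isolate memo cache. Survives across requests handled by this isolate;
// other isolates and PoPs never see these entries.
const memo = new Map<string, unknown>();

async function cachedCompute<T>(key: string, compute: () => Promise<T>): Promise<T> {
  if (!memo.has(key)) {
    memo.set(key, await compute()); // note: concurrent misses may compute twice
  }
  return memo.get(key) as T;
}
```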
Cloudflare Durable Objects
Durable Objects provide a single-threaded, strongly-consistent execution environment tied to a named instance. Requests to the same DO instance are colocated and serialized — useful for real-time collaboration, rate limiters, and WebSocket hubs.
```ts
export class RateLimiter {
  state: DurableObjectState;
  count = 0;
  reset = Date.now() + 60_000;

  constructor(state: DurableObjectState) {
    this.state = state;
  }

  async fetch(request: Request): Promise<Response> {
    if (Date.now() > this.reset) {
      this.count = 0;
      this.reset = Date.now() + 60_000;
    }
    if (this.count++ >= 100) {
      return new Response('Rate limited', { status: 429 });
    }
    return new Response('OK');
  }
}
```
KV for distributed read state
KV is eventually consistent (typically converges within 60 s globally). It is not suitable for counters or any state requiring strong consistency. Use it for: feature flags, user preferences, A/B assignments, computed config blobs written from a trusted backend and read everywhere.
```ts
const flags = await env.FLAGS.get('feature-config', { type: 'json' });
const enabled = flags?.['new-checkout'] ?? false;
```
07 Middleware Pattern
Edge middleware intercepts every matching request before it reaches any handler. It is the correct place for cross-cutting concerns: auth, geo-fencing, bot detection, A/B assignment, and locale redirection.
The pattern composes as a pipeline. Each step either continues (`NextResponse.next()`), short-circuits (`new Response(...)`), or rewrites the request URL. Keep each step pure and free of slow I/O.
```ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export async function middleware(req: NextRequest) {
  // 1. Auth guard — reads JWT from cookie, no fetch()
  const token = req.cookies.get('__session')?.value;
  if (!token || !(await verifyJwt(token))) { // crypto.subtle — no I/O
    return NextResponse.redirect(new URL('/login', req.url));
  }

  // 2. Geo-fence
  const country = req.geo?.country;
  if (BLOCKED_COUNTRIES.has(country ?? '')) {
    return new Response('Unavailable in your region', { status: 451 });
  }

  // 3. A/B assignment — hash userId from JWT, no random()
  const res = NextResponse.next();
  const bucket = getBucket(token); // deterministic
  res.headers.set('x-ab-bucket', bucket);
  return res;
}

export const config = {
  matcher: ['/dashboard/:path*', '/api/:path*']
};
```
Avoid fetch() in middleware unless absolutely necessary. Every outbound request adds latency on the critical path. Encode decisions into the auth token at sign-in time and read them with `crypto.subtle` — a pure CPU operation that takes < 1 ms.
08 Constraints
Edge runtimes impose hard limits. Exceeding them results in termination, not graceful degradation. Design around these upfront.
| Constraint | Cloudflare Workers | Vercel Edge | Deno Deploy |
|---|---|---|---|
| CPU time | 10 ms (free) / 30 s (paid) | ~1.5 s | 50 ms / req |
| Memory | 128 MB | 128 MB | 512 MB |
| Bundle size | 1 MB (compressed) | 4 MB | No hard limit |
| Node APIs | None (WinterCG only) | Partial subset | Partial subset |
| Subrequests per req | 50 | Unlimited | Unlimited |
| WebSockets | Yes (Durable Objects) | No | Yes |
Bundle size
Tree-shake aggressively. Avoid heavy dependencies — `moment`, `lodash`, large schema validators. Prefer edge-specific alternatives: `date-fns` (tree-shakable), `zod` (lightweight), `hono` instead of `express`. Use `wrangler build --dry-run` to inspect bundle size before deploying.
CPU time
CPU time counts only active execution — not time spent awaiting I/O. Blocking synchronous operations (heavy regex, JSON.parse on large payloads, cryptographic key generation) consume CPU budget. Move precomputable work (key imports, compiled regex) to module-level initialization which runs once per isolate lifecycle.
```ts
// BAD — key imported on every request
async function verify(token: string) {
  const key = await crypto.subtle.importKey(...); // slow
  return crypto.subtle.verify(key, ...);
}

// GOOD — key imported once at module scope
const keyPromise = crypto.subtle.importKey(...);

async function verify(token: string) {
  const key = await keyPromise; // cached
  return crypto.subtle.verify(key, ...);
}
```
09 Platform Comparison
| Platform | Runtime | PoP count | Storage | Framework integration |
|---|---|---|---|---|
| Cloudflare Workers | WinterCG (V8) | 300+ | KV, D1, R2, DO | Hono, Remix, Next.js (via adapter) |
| Vercel Edge Runtime | WinterCG (V8) | ~100 | Vercel KV, Blob, Postgres | Next.js (native), SvelteKit, Nuxt |
| Deno Deploy | Deno (V8) | 35+ | Deno KV | Fresh, Hono |
| Fastly Compute | Wasm (WASI) | 90+ | Config store, KV | Rust, Go, JS via Wasm |
| AWS Lambda@Edge | Node.js / Python | CloudFront (~450) | External only | Any — full Node.js |
Cloudflare Workers has the most mature edge-native storage ecosystem (KV, D1, Durable Objects) and the largest PoP footprint. It is the strongest default choice for greenfield edge applications.
Vercel Edge Runtime is the path of least resistance if you are already on Next.js — `export const runtime = 'edge'` is all the migration needed for individual routes.
Lambda@Edge supports full Node.js but has cold-start overhead and execution billed per 50 ms — it behaves more like a regional Lambda than a true edge runtime and should be treated as such.