Setting the retry Interval in SSE Streams

Part of Event ID & Retry Mechanism Design.

When an EventSource connection drops, the browser reconnects automatically, but by default it uses a hardcoded delay of roughly 3 000 ms. That default is too short for overloaded servers and too long for low-latency dashboards. The retry field in the text/event-stream format lets the server take ownership of that delay by pushing a specific millisecond value that the browser must honour on every reconnect attempt.


Symptom & Developer Intent

You are shipping a real-time feed (a notification stream, a live log tail, or an AI token-by-token response) and one of the following is true:

  • Clients hammer a recovering server with rapid reconnects, turning a brief outage into a thundering herd.
  • A load-balancer health-check is failing and you want clients to wait 30 s before retrying rather than the browser’s 3 s default.
  • You are building a zero-downtime deploy path and need clients to back off during a rolling restart.
  • Conversely, the default 3 s is too slow for a trading dashboard that expects sub-second failover.

In each case the fix is the same: emit a retry: line with the correct millisecond value.


Root Cause Analysis

The wire format

The SSE wire format defines four field names: data, event, id, and retry. The retry field syntax is:

retry: <unsigned integer in ms>\n\n

A bare retry: line (no data:) is valid on its own: the browser applies the new interval and dispatches nothing to JavaScript. The spec's rule, in paraphrase:

If the field name is retry and the field value consists of only ASCII digits, the user agent interprets the value as a base-ten integer and sets the event stream's reconnection time to that integer. Otherwise the field is ignored.

This is a per-connection, persistent setting: once the browser receives retry: 5000, every subsequent reconnect on that EventSource instance uses 5 000 ms, until another retry: line overrides it.
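
To make the digits-only rule concrete, here is a minimal sketch of how a client-side parser might treat a single line of the stream. The names applyLine and state are hypothetical, not browser APIs, and the 3 000 ms starting value is an assumed default.

```javascript
// Sketch of a per-line field handler; only the retry field is modelled here.
function applyLine(state, line) {
  // Lines starting with ":" are comments and are ignored.
  if (line.startsWith(':')) return state;

  const colon = line.indexOf(':');
  const field = colon === -1 ? line : line.slice(0, colon);
  let value = colon === -1 ? '' : line.slice(colon + 1);
  // A single leading space after the colon is stripped.
  if (value.startsWith(' ')) value = value.slice(1);

  // retry is honoured only when the value is made up of ASCII digits.
  if (field === 'retry' && /^[0-9]+$/.test(value)) {
    return { ...state, reconnectionTimeMs: parseInt(value, 10) };
  }
  return state;
}

let state = { reconnectionTimeMs: 3000 }; // assumed browser default
state = applyLine(state, 'retry: 5000');  // accepted
state = applyLine(state, 'retry: 5s');    // ignored: not digits only
state = applyLine(state, 'retry: -1');    // ignored: "-" is not a digit
console.log(state.reconnectionTimeMs);    // 5000
```

The practical consequence for servers: emit plain digits with no unit suffix, sign, or decimal point, because a malformed value is silently dropped rather than reported.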

What the browser does without a retry field

Without a server-supplied retry, each browser engine applies its own compiled-in default. In practice all modern engines land between 2 000 and 5 000 ms. Critically, the spec permits but does not require exponential backoff, and browsers generally reuse the same flat delay every time. For a server restart that takes 45 s, that means up to 22 reconnect attempts per client (at a 2 000 ms delay) before the server is ready, which multiplied across thousands of clients produces a connection storm.
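
The arithmetic behind that figure can be sketched as follows. This is a rough upper bound that assumes every attempt fails immediately, so the flat delay is the only thing separating attempts; the fleet size of 5 000 clients is a hypothetical number chosen for illustration.

```javascript
// Upper bound on reconnect attempts during an outage with a flat retry delay.
function reconnectAttempts(outageMs, retryMs) {
  return Math.floor(outageMs / retryMs);
}

const perClient = reconnectAttempts(45_000, 2_000);
const clients = 5_000; // hypothetical fleet size
console.log(perClient, perClient * clients); // 22 110000

// With a server-supplied retry of 30 s, each client tries once in the same window.
console.log(reconnectAttempts(45_000, 30_000)); // 1
```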

Why server control beats client control

The EventSource API exposes no reconnectInterval constructor option. There is no standard way for JavaScript to set the delay; the spec intentionally omits it. Third-party polyfills (e.g. event-source-polyfill) add their own non-standard options, but those only work for the polyfill path and are invisible to the browser’s native implementation. The only portable, spec-compliant mechanism is the server-sent retry: field.


Step-by-Step Resolution

Step 1: Emit a baseline retry line at connection open

Send retry: as the first line after the browser connects. This establishes the delay before any real event is dispatched so the value is always set even if the connection drops immediately.

Node.js / Express

import express from 'express';

const app = express();

// Express SSE endpoint: sets a 5 s reconnect interval on connect
app.get('/events', (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
  });

  // Emit retry FIRST, before any data events
  res.write('retry: 5000\n\n');  // browser will wait 5 000 ms before each reconnect

  // Send a keepalive comment every 20 s to prevent proxy timeouts
  const keepalive = setInterval(() => res.write(': keepalive\n\n'), 20_000);

  req.on('close', () => clearInterval(keepalive));
});

Python / FastAPI + sse-starlette

from sse_starlette.sse import EventSourceResponse
from fastapi import FastAPI

app = FastAPI()

@app.get('/events')
async def stream():
    async def generator():
        # First yield sets the reconnect interval to 8 s
        yield {'retry': 8000}
        # produce_events() is a placeholder for your own async generator of payloads
        async for event in produce_events():
            yield {'data': event}

    return EventSourceResponse(generator())

Go

import (
    "fmt"
    "net/http"
    "time"
)

// Go SSE handler: sends retry on connect, then keepalive comments until the client disconnects
func sseHandler(w http.ResponseWriter, r *http.Request) {
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming not supported", http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    w.Header().Set("Connection", "keep-alive")

    fmt.Fprintf(w, "retry: 5000\n\n") // set reconnect interval immediately
    flusher.Flush()

    ctx := r.Context()
    ticker := time.NewTicker(20 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            fmt.Fprintf(w, ": keepalive\n\n")
            flusher.Flush()
        }
    }
}

Step 2: Dynamically escalate retry during server stress

Push a longer retry: value the moment you detect high load. Embed the retry field alongside real event data; the browser updates its internal timer on receipt.

// Node.js: escalate retry when the server is under pressure
function writeEvent(res, data, retryMs = null) {
  if (retryMs !== null) {
    res.write(`retry: ${retryMs}\n`);
  }
  res.write(`data: ${JSON.stringify(data)}\n\n`);
}

// Normal operation
writeEvent(res, { temp: 72.3 });

// Server approaching connection limit: tell clients to back off 30 s
if (connectionCount > HIGH_WATER_MARK) {
  writeEvent(res, { status: 'busy' }, 30_000);
}
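
One way to keep the escalation logic in a single place is to derive the retry value from the current load and write the retry: line only when the value changes. The sketch below is one possible shape, not the only one; the thresholds and the names retryForLoad and formatFrame are hypothetical.

```javascript
// Hypothetical thresholds; tune them to your own capacity numbers.
const HIGH_WATER_MARK = 8_000;
const NORMAL_RETRY_MS = 5_000;
const BUSY_RETRY_MS = 30_000;

// Maps the current connection count to the retry value clients should use.
function retryForLoad(connectionCount) {
  return connectionCount > HIGH_WATER_MARK ? BUSY_RETRY_MS : NORMAL_RETRY_MS;
}

// Serialises one frame. The retry line is included only when the value
// differs from the one this connection was last told to use.
function formatFrame(data, retryMs, lastSentRetryMs) {
  const retryLine = retryMs !== lastSentRetryMs ? `retry: ${retryMs}\n` : '';
  return `${retryLine}data: ${JSON.stringify(data)}\n\n`;
}

// A connection that was told 5 000 ms earlier, on a server now over the mark:
const frame = formatFrame({ status: 'busy' }, retryForLoad(9_000), 5_000);
console.log(JSON.stringify(frame));
```

Tracking the last value sent per connection avoids repeating an unchanged retry: line on every event, although repeating it is harmless.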

Step 3: Add server-side jitter to prevent thundering herd

A flat retry value causes all disconnected clients to reconnect simultaneously. Add jitter (±25 % of the base interval) per connection so reconnects spread across a window.

// Assign a jittered retry per connection
function jitteredRetry(baseMs, jitterFraction = 0.25) {
  const spread = baseMs * jitterFraction;
  return Math.round(baseMs - spread + Math.random() * spread * 2);
}

app.get('/events', (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
  });

  const retryMs = jitteredRetry(5000); // returns a value in 3 750–6 250
  res.write(`retry: ${retryMs}\n\n`);

  // ...rest of handler
});

For 1 000 simultaneous clients with a 5 s base and ±25 % jitter, reconnects are now spread across a 2.5 s window rather than a single instant.
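
A quick simulation can confirm the size of that window. The sketch below repeats the jitter function with one addition that is an assumption for testability, an injectable random source, and then samples 1 000 delays.

```javascript
// Same formula as jitteredRetry, with an injectable random source
// (the third parameter is an addition for deterministic checks).
function jitteredRetry(baseMs, jitterFraction = 0.25, random = Math.random) {
  const spread = baseMs * jitterFraction;
  return Math.round(baseMs - spread + random() * spread * 2);
}

// Sample one retry value per simulated client.
const delays = Array.from({ length: 1000 }, () => jitteredRetry(5000));
const min = Math.min(...delays);
const max = Math.max(...delays);
console.log(`earliest ${min} ms, latest ${max} ms, window ${max - min} ms`);

// Deterministic bounds: the extremes of the random source map to the edges.
console.log(jitteredRetry(5000, 0.25, () => 0));   // 3750
console.log(jitteredRetry(5000, 0.25, () => 0.5)); // 5000
```

The sampled window approaches but never exceeds 2 500 ms, since every value falls between 3 750 and 6 250 ms.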


Step 4: Restore a fast retry on recovery

After a rolling restart or a dependency outage clears, push a short retry: so clients catch up quickly. Combine this with a deployment-complete marker event.

// After restart: tell clients to use a short reconnect delay again
res.write('retry: 1000\n');
res.write('event: deployment-complete\n');
res.write('data: {"version":"2.4.1"}\n\n');

Pair this with idempotent event IDs and Last-Event-ID so clients can replay missed events from the correct offset after reconnecting.
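
A minimal sketch of that replay step, assuming events carry monotonically increasing numeric IDs and the server keeps a short in-memory buffer of recent events. The buffer contents and the name framesToReplay are hypothetical.

```javascript
// Hypothetical in-memory buffer of recent events, oldest first.
const recentEvents = [
  { id: 101, data: { msg: 'a' } },
  { id: 102, data: { msg: 'b' } },
  { id: 103, data: { msg: 'c' } },
];

// Returns the serialised frames a reconnecting client has not seen yet.
// lastEventIdHeader is the raw Last-Event-ID request header, or undefined
// for a first-time connection (which gets no replay in this sketch).
function framesToReplay(events, lastEventIdHeader) {
  const lastId = Number.parseInt(lastEventIdHeader ?? '', 10);
  const missed = Number.isNaN(lastId) ? [] : events.filter((e) => e.id > lastId);
  return missed.map((e) => `id: ${e.id}\ndata: ${JSON.stringify(e.data)}\n\n`);
}

console.log(framesToReplay(recentEvents, '101').length); // 2
```

Each replayed frame carries its own id: line so the browser's stored last event ID advances as the client catches up.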


Step 5: Tune retry for infrastructure constraints

Match the retry value to your stack. Nginx, AWS ALB, and Cloudflare each have their own idle-connection timeouts that interact with SSE.

Layer                            Default timeout   Recommended retry ceiling
Nginx proxy_read_timeout         60 s              45 s (send keepalive under 60 s)
AWS ALB idle timeout             60 s              50 s
Cloudflare HTTP/2 stream idle    100 s             90 s
Browser EventSource built-in     3 s               Override immediately on connect
Mobile networks (intermittent)   Varies            10–15 s with jitter

For buffer management and chunked transfer encoding concerns, also verify that your proxy does not buffer the retry line β€” the client never receives the updated value if the proxy holds the response in a buffer until a larger chunk accumulates.
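
As a rough illustration of both concerns, the sketch below picks a keepalive interval from the idle timeouts in the table and builds response headers that ask Nginx not to buffer the stream. The helper names and the 0.5 safety factor are assumptions; X-Accel-Buffering is specific to Nginx, and other proxies may need their own configuration.

```javascript
// Idle timeouts in seconds, taken from the table above; adjust to your stack.
const idleTimeoutsSec = { nginx: 60, awsAlb: 60, cloudflare: 100 };

// Picks a keepalive interval safely below the shortest idle timeout.
// safetyFactor is a hypothetical margin, not a vendor recommendation.
function keepaliveIntervalMs(timeoutsSec, safetyFactor = 0.5) {
  const shortest = Math.min(...Object.values(timeoutsSec));
  return Math.floor(shortest * safetyFactor * 1000);
}

// Response headers for an SSE endpoint behind Nginx.
// "X-Accel-Buffering: no" asks Nginx not to buffer this response.
function sseHeaders() {
  return {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'X-Accel-Buffering': 'no',
  };
}

console.log(keepaliveIntervalMs(idleTimeoutsSec)); // 30000
console.log(sseHeaders());
```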


Validation & Monitoring

Verify with curl

# Stream the endpoint and inspect raw frames; look for "retry:" in the first chunk
curl -N -H "Accept: text/event-stream" https://api.example.com/events

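Expected output: the retry line should arrive within the first 200 ms, and the first keepalive comment follows once the keepalive interval (20 s in the examples above) has elapsed: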

retry: 5000

: keepalive

Inspect in Chrome DevTools

  1. Open DevTools → Network → Filter: EventSource.
  2. Click the /events request → EventStream tab.
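  3. Each dispatched event is listed with its field values. A bare retry: frame dispatches no event and may not appear in this list, so confirm the retry line with curl if you do not see it here.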
  4. To simulate a disconnect: throttle to Offline in the Network panel, then restore connectivity and observe the timer before the browser retries (use the Timing tab on the new request).

Unit test (Node.js / Supertest)

import request from 'supertest';
import app from '../src/app.js';

test('SSE endpoint emits retry on connect', async () => {
  const chunks = [];
  await new Promise((resolve) => {
    const req = request(app)
      .get('/events')
      .set('Accept', 'text/event-stream')
      .buffer(false)
      .parse((res, cb) => {
        res.on('data', (chunk) => chunks.push(chunk.toString()));
        res.on('end', cb);
        setTimeout(() => res.destroy(), 500); // collect 500 ms then quit
      });
    req.then(resolve).catch(resolve);
  });
  const raw = chunks.join('');
  expect(raw).toMatch(/^retry: \d+\r?\n/m);
});

Verification Checklist


Frequently Asked Questions

Does the retry field affect reconnects triggered by network errors versus server closes?

Yes. The retry interval applies to all automatic reconnects regardless of why the connection ended: a TCP RST, a clean server close, or a proxy timeout. The browser's EventSource state machine uses the same stored reconnection time in every case.

Can JavaScript override the retry interval set by the server?

No. The EventSource constructor and its prototype expose no property to set the reconnect interval. The spec intentionally reserves that control for the server. Third-party wrappers like ReconnectingEventSource or event-source-polyfill add non-standard options, but they only apply when you are using that wrapper; the native browser implementation ignores them.

What happens if I send retry: 0?

The spec requires a non-negative integer and 0 is technically valid. In practice, browsers accept it and reconnect immediately (or within a single event-loop tick). This can be useful for a "reconnect instantly after deployment" signal but risks creating a tight reconnect loop if the server is still not ready. Send retry: 0 only when you are certain the endpoint is healthy.
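
One way to guard against that tight loop is to gate retry: 0 behind a health check on the server side. In this sketch the floor value and the name safeRetryMs are hypothetical, and how endpointHealthy is determined is left to your own readiness probe.

```javascript
// Hypothetical floor; pick a value your endpoint can absorb.
const MIN_RETRY_MS = 250;

// Allows an instant reconnect only when the endpoint is known to be healthy;
// otherwise raises the value to the floor.
function safeRetryMs(requestedMs, endpointHealthy) {
  if (requestedMs === 0 && endpointHealthy) return 0;
  return Math.max(requestedMs, MIN_RETRY_MS);
}

console.log(safeRetryMs(0, true));     // 0
console.log(safeRetryMs(0, false));    // 250
console.log(safeRetryMs(1000, false)); // 1000
```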

Does the retry value persist across page reloads?

No. The retry interval is stored in the EventSource object's internal state, which is discarded when the page navigates or the EventSource is garbage-collected. A new EventSource instance always starts with the browser's built-in default until it receives a fresh retry: line from the server.

How does retry interact with the Last-Event-ID header?

They are orthogonal. On reconnect the browser sends Last-Event-ID as a request header regardless of how long it waited (the retry interval). The server can use that header to replay missed events. Setting a long retry does not discard the last known event ID β€” the browser holds both values independently. See Event ID & Retry Mechanism Design for the full interaction model.


⚑ Production Directives

  • Always emit retry: as the first line on connect; never rely on the browser's hardcoded default.
  • Apply ±20–30 % random jitter per connection to spread reconnect storms across a time window.
  • Escalate retry dynamically (e.g. 30 000 ms) when the server exceeds its connection high-water mark; reset to a fast value after recovery.
  • Keep keepalive comment intervals below the shortest upstream proxy idle timeout (for example, at most about 55 s against Nginx's 60 s default proxy_read_timeout).
  • Verify the retry: line reaches the client un-buffered using curl -N before every production deploy.