Connection Pooling for SSE Servers Permalink to this section
Part of Backend Stream Generation & Connection Management.
SSE streams are long-lived: a single browser tab can hold a connection open for hours. At a few dozen concurrent users that is trivial. At ten thousand it means ten thousand simultaneous open TCP sockets, each consuming a file descriptor, a kernel socket buffer (~87 KB receive + ~87 KB send by default on Linux), and some application-layer state. Naively treating each request as an independent, short-lived transaction collapses the server in three predictable ways: OS file-descriptor exhaustion, thread-pool starvation, and repeated TLS handshake cost. Connection pooling for SSE is the discipline of sharing and reusing transport-layer resources across logical event streams so that none of these ceilings are hit in production.
How Connection Pooling Works for SSE Permalink to this section
A standard HTTP/1.1 SSE connection occupies one TCP socket per client for its entire lifetime. The OS represents each socket as a file descriptor; the default ulimit -n on Linux is 1024 per process. Raise it and you still need per-socket kernel buffers, per-socket epoll registrations, and—if you are using a threaded model—one thread or green thread per connection.
Connection pooling addresses this at two distinct levels:
Transport pooling (proxy → application): The reverse proxy (nginx, Caddy, HAProxy) maintains a pool of persistent HTTP/1.1 or HTTP/2 connections to your application process. Clients each get their own downstream TCP socket to the proxy, but the proxy multiplexes upstream using a bounded set of keep-alive connections. With HTTP/2 upstream you can multiplex many logical streams over a single TCP connection.
Application-level connection registry: Inside your application, you maintain a registry—often a Map<clientId, WritableStream>—that maps each logical subscriber to its response handle. The underlying I/O event loop (Node.js libuv, Go’s netpoll, Python’s asyncio selector) already handles many sockets on a single OS thread. The “pool” here is the controlled, observable set of active handles with enforced limits.
The key invariant: the number of upstream event-source connections (to Redis, Kafka, or your own service) must remain constant regardless of how many clients connect. Each Redis SUBSCRIBE from a new client wastes one Redis connection. Instead, one subscriber per channel fans out to all registered clients—the Redis Pub/Sub fan-out pattern.
Wire-Level Anatomy Permalink to this section
Every SSE connection begins with a standard HTTP request:
GET /api/events HTTP/1.1
Host: api.example.com
Accept: text/event-stream
Cache-Control: no-cache
Last-Event-ID: 1718900000042
Connection: keep-alive
The server responds with 200 OK and never closes the body:
HTTP/1.1 200 OK
Content-Type: text/event-stream; charset=utf-8
Cache-Control: no-cache
X-Accel-Buffering: no
Connection: keep-alive
Transfer-Encoding: chunked
: keepalive\n\n
id: 1718900000043\n
event: update\n
data: {"price":142.07}\n\n
The connection stays in ESTABLISHED state indefinitely. The pool manager’s job is to keep track of every such socket, enforce a per-process ceiling, and detect silent drops before the client’s EventSource reconnect timer fires.
Server-Side Implementation: Node.js Connection Registry Permalink to this section
Node.js is the most common SSE runtime because its event loop is already non-blocking. The pattern below implements a bounded connection registry with heartbeats, idempotent Last-Event-ID tracking, and graceful drain.
// sse-pool.js — production connection registry for Node.js SSE
import { randomUUID } from 'crypto';
const MAX_CONNECTIONS = parseInt(process.env.SSE_MAX_CONN ?? '8000', 10);
const HEARTBEAT_MS = 20_000; // 20 s — under most proxy idle timeouts
const DRAIN_TIMEOUT = 5_000; // 5 s to flush on shutdown
class SSEPool {
#clients = new Map(); // clientId → { res, channel, lastEventId }
#heartbeatTimer = null;
constructor() {
this.#heartbeatTimer = setInterval(() => this.#sendHeartbeats(), HEARTBEAT_MS);
// Prevent the timer from blocking process exit
this.#heartbeatTimer.unref();
}
/** Register a new SSE client; returns clientId or throws if pool is full. */
add(res, { channel = 'default', lastEventId = null } = {}) {
if (this.#clients.size >= MAX_CONNECTIONS) {
res.writeHead(503, {
'Retry-After': '30',
'Content-Type': 'text/plain',
});
res.end('Connection pool exhausted. Retry in 30s.');
throw new Error('SSE pool full');
}
const id = randomUUID();
// Standard SSE headers — disable every caching layer
res.writeHead(200, {
'Content-Type': 'text/event-stream; charset=utf-8',
'Cache-Control': 'no-cache',
'X-Accel-Buffering': 'no', // nginx: disable proxy_buffer
'Connection': 'keep-alive',
});
res.flushHeaders(); // send headers immediately; don't wait for body
// Disable Nagle — SSE frames are tiny, latency matters
res.socket?.setNoDelay(true);
this.#clients.set(id, { res, channel, lastEventId });
// Remove client on disconnect (TCP RST, tab close, network drop)
res.on('close', () => this.remove(id));
res.on('error', () => this.remove(id));
return id;
}
/** Remove a client and destroy its response handle. */
remove(id) {
const client = this.#clients.get(id);
if (!client) return;
this.#clients.delete(id);
try { client.res.destroy(); } catch (_) { /* already gone */ }
}
/** Broadcast a formatted SSE event to all clients on a channel. */
broadcast(channel, { id, event, data }) {
const payload = this.#format({ id, event, data });
for (const [clientId, client] of this.#clients) {
if (client.channel !== channel) continue;
const ok = client.res.write(payload);
if (!ok) {
// Back-pressure: the kernel send buffer is full.
// Evict slow consumer rather than blocking the loop.
this.remove(clientId);
} else if (id) {
client.lastEventId = id; // track for reconnect
}
}
}
#format({ id, event, data }) {
let msg = '';
if (id) msg += `id: ${id}\n`;
if (event) msg += `event: ${event}\n`;
msg += `data: ${typeof data === 'string' ? data : JSON.stringify(data)}\n\n`;
return msg;
}
#sendHeartbeats() {
const comment = ': keepalive\n\n';
for (const [id, { res }] of this.#clients) {
const ok = client.res.write(comment);
if (!ok) this.remove(id);
}
}
/** Graceful shutdown: notify clients, then drain within timeout. */
async drain() {
const goodbye = 'event: close\ndata: server_shutdown\n\n';
for (const { res } of this.#clients.values()) {
try { res.write(goodbye); } catch (_) { /* ignore */ }
}
await new Promise(resolve => setTimeout(resolve, DRAIN_TIMEOUT));
for (const id of this.#clients.keys()) this.remove(id);
clearInterval(this.#heartbeatTimer);
}
get size() { return this.#clients.size; }
}
export const pool = new SSEPool();
Wire it into an Express route:
// routes/events.js
import { pool } from './sse-pool.js';
import { redisSubscriber } from './redis.js'; // single shared subscriber
app.get('/api/events', (req, res) => {
const channel = req.query.channel ?? 'default';
const lastEventId = req.headers['last-event-id'] ?? null;
pool.add(res, { channel, lastEventId });
// No further work in this handler — events arrive via redisSubscriber.on('message')
});
// Separate process.on('SIGTERM') handler
process.on('SIGTERM', async () => {
server.close(); // stop accepting new connections
await pool.drain(); // flush in-flight SSE frames
process.exit(0);
});
Metric Exposure Permalink to this section
Expose pool depth to your observability stack:
app.get('/metrics', (req, res) => {
res.set('Content-Type', 'text/plain');
res.send([
`# HELP sse_active_connections Number of live SSE clients`,
`# TYPE sse_active_connections gauge`,
`sse_active_connections ${pool.size}`,
].join('\n'));
});
Server-Side Implementation: Go Connection Registry Permalink to this section
Go’s net/http http.Flusher pattern maps naturally to a pooled registry protected by a sync.RWMutex. See Go Streaming Patterns for SSE for full context; below is a pool-focused excerpt:
// pool.go
package sse
import (
"fmt"
"net/http"
"sync"
"time"
)
const (
maxConnections = 10_000
heartbeatEvery = 20 * time.Second
)
type client struct {
w http.ResponseWriter
flusher http.Flusher
channel string
done chan struct{}
}
type Pool struct {
mu sync.RWMutex
clients map[string]*client // uuid → client
}
func NewPool() *Pool {
p := &Pool{clients: make(map[string]*client)}
go p.heartbeatLoop()
return p
}
// Add registers a client. The caller must hold the HTTP handler goroutine
// open (block on <-client.done) so the connection stays alive.
func (p *Pool) Add(id string, w http.ResponseWriter, channel string) (*client, error) {
p.mu.Lock()
defer p.mu.Unlock()
if len(p.clients) >= maxConnections {
http.Error(w, "pool full", http.StatusServiceUnavailable)
w.Header().Set("Retry-After", "30")
return nil, fmt.Errorf("pool full")
}
flusher, ok := w.(http.Flusher)
if !ok {
return nil, fmt.Errorf("streaming not supported")
}
// Mandatory SSE headers
h := w.Header()
h.Set("Content-Type", "text/event-stream; charset=utf-8")
h.Set("Cache-Control", "no-cache")
h.Set("X-Accel-Buffering", "no")
h.Set("Connection", "keep-alive")
w.WriteHeader(http.StatusOK)
flusher.Flush()
c := &client{w: w, flusher: flusher, channel: channel, done: make(chan struct{})}
p.clients[id] = c
return c, nil
}
func (p *Pool) Remove(id string) {
p.mu.Lock()
c, ok := p.clients[id]
if ok {
delete(p.clients, id)
close(c.done)
}
p.mu.Unlock()
}
func (p *Pool) Broadcast(channel, payload string) {
p.mu.RLock()
defer p.mu.RUnlock()
for _, c := range p.clients {
if c.channel != channel { continue }
fmt.Fprint(c.w, payload)
c.flusher.Flush()
}
}
func (p *Pool) heartbeatLoop() {
ticker := time.NewTicker(heartbeatEvery)
defer ticker.Stop()
for range ticker.C {
p.Broadcast("", ": keepalive\n\n") // empty channel = all clients
}
}
The handler blocks on <-c.done; when the client disconnects, req.Context().Done() fires and the handler calls pool.Remove(id) which closes c.done.
Edge Cases and Network Interference Permalink to this section
SSE connections traverse multiple network hops, each of which can silently destroy a long-lived stream.
| Layer | Failure mode | Mitigation |
|---|---|---|
| Corporate HTTP proxy | Strips Connection: keep-alive, enforces 60 s idle timeout |
Heartbeat every 20 s; retry: directive ≤ 5000 ms |
| nginx (default config) | proxy_buffering on holds body until buffer fills |
proxy_buffering off; proxy_cache off; |
| AWS ALB | 60 s idle timeout (configurable to 4000 s) | Raise to 3600 s in listener settings; heartbeat < 60 s |
| Cloudflare (free) | 100 s HTTP response timeout | Upgrade to Pro/Business; or use Durable Objects streaming |
| CDN edge cache | May cache 200 text/event-stream responses |
Cache-Control: no-store, no-cache; Surrogate-Control: no-store |
| IPv6 NAT64 gateway | May re-sequence TCP segments, breaking chunked boundaries | Enforce Transfer-Encoding: chunked at the app layer; verify with curl |
| TLS termination proxy | X-Forwarded-Proto mismatch causes mixed-content block |
Terminate TLS at proxy, set Strict-Transport-Security |
nginx Configuration for SSE Pools Permalink to this section
upstream sse_backend {
keepalive 64; # maintain 64 idle keep-alive connections to app
server 127.0.0.1:3000;
}
server {
listen 443 ssl http2;
location /api/events {
proxy_pass http://sse_backend;
proxy_http_version 1.1; # required for keep-alive
proxy_set_header Connection ""; # clear hop-by-hop for upstream pool
proxy_set_header Host $host;
# Disable ALL buffering for SSE
proxy_buffering off;
proxy_cache off;
proxy_read_timeout 3600s; # match your SSE session length
chunked_transfer_encoding on;
# Tell Cloudflare, Varnish, and other CDNs not to buffer
add_header X-Accel-Buffering no always;
}
}
File-Descriptor Budget Permalink to this section
SSE servers exhaust file descriptors long before they exhaust CPU. Each connection consumes 1 FD minimum; with TLS and logging pipes, expect 3–5 FDs per client. The detailed process is covered in Tuning File-Descriptor Limits for SSE Connection Pools, but the fast version:
# Raise soft and hard limits for the SSE process user
echo "www-data soft nofile 65535" >> /etc/security/limits.conf
echo "www-data hard nofile 65535" >> /etc/security/limits.conf
# For systemd services, override in the unit file:
# [Service]
# LimitNOFILE=65535
# Verify at runtime:
cat /proc/$(pgrep -f 'node server.js')/limits | grep 'open files'
And the matching kernel parameters:
# /etc/sysctl.d/99-sse.conf
fs.file-max = 2097152 # system-wide FD ceiling
net.core.somaxconn = 65535 # listen backlog
net.ipv4.tcp_tw_reuse = 1 # reuse TIME_WAIT sockets for new connections
Performance and Scale Considerations Permalink to this section
Connection Count vs. Memory Permalink to this section
A Node.js process idle-holding 10,000 SSE connections uses roughly:
| Resource | Per-connection cost | 10,000 connections |
|---|---|---|
| OS socket buffers (default) | ~174 KB (87 KB rx + 87 KB tx) | ~1.7 GB kernel memory |
Node.js net.Socket object |
~4–8 KB heap | ~80 MB heap |
| TLS session state (if applicable) | ~8–16 KB | ~160 MB |
| Application registry entry | ~0.5–1 KB | ~10 MB |
Reduce kernel buffer consumption:
# Halve default socket buffers — fine for SSE (mostly server→client)
sysctl -w net.core.rmem_default=43690
sysctl -w net.core.wmem_default=43690
Event-Loop Saturation (Node.js) Permalink to this section
Broadcasting to 10,000 clients in a single synchronous loop blocks the event loop for the duration of all res.write() calls. Chunk the broadcast into micro-task batches:
async function broadcastBatched(clients, payload, batchSize = 500) {
const entries = [...clients.entries()];
for (let i = 0; i < entries.length; i += batchSize) {
const batch = entries.slice(i, i + batchSize);
for (const [id, { res }] of batch) {
if (!res.write(payload)) pool.remove(id); // back-pressure eviction
}
// Yield to the event loop between batches
await new Promise(resolve => setImmediate(resolve));
}
}
Back-Pressure and Slow Consumers Permalink to this section
When res.write() returns false, the kernel send buffer is full—the client is consuming events more slowly than they arrive. Options, in order of preference:
- Evict immediately — simplest; the
EventSourcereconnect on the client handles recovery. Best for non-critical streams. - Drop events with a skip counter — emit
event: skip\ndata: {"n":42}\n\nand continue. Good for dashboard metrics. - Buffer with a bounded queue — hold up to N events per client; evict if queue exceeds the limit. Adds memory pressure.
The interaction between pool-level back-pressure and token-bucket rate limiting is covered in Rate Limiting & Backpressure Handling.
Worker / Thread Scaling Permalink to this section
For CPU-bound workloads or very high connection counts, run multiple Node.js worker threads or Go goroutines, each owning a shard of the connection registry:
// cluster.js — shard pool by worker ID
import cluster from 'cluster';
import os from 'os';
const WORKERS = os.cpus().length; // or a fixed number like 4
if (cluster.isPrimary) {
for (let i = 0; i < WORKERS; i++) cluster.fork();
cluster.on('exit', (worker) => {
console.error(`Worker ${worker.process.pid} died; restarting`);
cluster.fork();
});
} else {
// Each worker runs an independent SSE pool shard
import('./server.js');
}
With Node.js cluster, each worker holds its own OS connections; a shared-nothing model. For multi-node fan-out, use Redis Pub/Sub as the inter-process event bus so all shards receive the same events.
Validation and Debugging Permalink to this section
Load Test the Pool Ceiling Permalink to this section
# Open 1000 concurrent SSE connections with wrk2 (event-stream aware)
wrk -t4 -c1000 -d60s --latency \
-H "Accept: text/event-stream" \
http://localhost:3000/api/events
# Simpler: bash loop with curl (count FDs after)
for i in $(seq 1 200); do
curl -sN -H "Accept: text/event-stream" http://localhost:3000/api/events &
done
ls /proc/$(pgrep -f 'node')/fd | wc -l
Confirm Heartbeats Arrive Permalink to this section
# One-liner: watch for keepalive comments in the stream
curl -sN -H "Accept: text/event-stream" http://localhost:3000/api/events \
| grep --line-buffered ":"
# Expected output every ~20 s:
# : keepalive
Check Proxy Buffering is Disabled Permalink to this section
# If the first event is delayed, proxy buffering is active.
# Test by timing to first byte:
curl -o /dev/null -s -w "TTFB: %{time_starttransfer}s\n" \
-H "Accept: text/event-stream" https://api.example.com/api/events
# Should be < 500 ms if headers flush immediately
Structured Logging for Pool Events Permalink to this section
// Emit JSON log lines that your log aggregator can count/alert on
const logPoolEvent = (event, clientId, extra = {}) => {
process.stdout.write(JSON.stringify({
ts: new Date().toISOString(),
event, // 'connect' | 'disconnect' | 'evict' | 'full'
clientId,
poolSize: pool.size,
...extra,
}) + '\n');
};
Alert thresholds to configure in your APM:
poolSize > 0.90 * MAX_CONNECTIONS→ warn (circuit breaker will fire at 100%)evictrate > 10/min → slow consumer problem; investigate back-pressure handlingfullevents any rate → either scale horizontally or raiseSSE_MAX_CONN
DevTools Verification Permalink to this section
- Open Chrome DevTools → Network tab → filter by
EventStream. - Select the SSE request; the EventStream sub-tab shows each
id,event,datafield. - Check Timing → “Waiting (TTFB)” should be < 500 ms; “Content Download” should grow steadily (not spike).
- In Firefox,
about:networking#httpshows active persistent connections; count them to verify proxy pooling is working.
⚡ Production Directives
- Set
SSE_MAX_CONNand respond with503 + Retry-After: 30when the pool is full — never let the server silently drop frames under overload. - Send a heartbeat comment (
: keepalive\n\n) every 20 s and setretry: 3000\n\nso clients reconnect in 3 s after a silent drop. - Disable proxy buffering at every layer:
proxy_buffering offin nginx,X-Accel-Buffering: noresponse header, and raise ALB/Cloudflare idle timeouts to at least 3600 s. - Tune OS FD limits to at least 3× your expected peak connection count before deploying (
LimitNOFILE=65535in the systemd unit). - Emit structured pool metrics (
poolSize,evictrate) and alert at 90% capacity to give time for horizontal scaling before the circuit breaker fires.
Production Checklist Permalink to this section
Frequently Asked Questions Permalink to this section
What is the practical maximum number of SSE connections per Node.js process?
With default kernel socket buffers (~174 KB each) and a 16 GB server, the kernel memory ceiling lands around 90,000 connections before you hit 1 GB of socket buffer memory. In practice, Node.js heap and V8 GC pressure typically cap you at 20,000–30,000 active connections per process before latency degrades. Run Node.js with --max-old-space-size=4096 and use Node's cluster module to shard across CPU cores. Each core can own ~5,000–10,000 connections cleanly.
Do I need a connection pool if I'm using HTTP/2?
HTTP/2 multiplexes many logical streams over a single TCP connection between client and server, which helps at the transport level. However, your application still needs a connection registry to track active subscribers, enforce per-tenant limits, send heartbeats, and handle graceful drain. The FD pressure is reduced (one TCP socket may carry many streams), but the application-layer pool remains necessary.
Why does my SSE stream break silently after exactly 60 seconds behind AWS ALB?
AWS ALB's default idle timeout is 60 seconds. If no bytes flow on the connection for 60 s, the ALB sends a TCP RST. Your SSE server never sees the disconnect; it keeps writing to a dead socket. Fix: raise the ALB idle timeout to 3600 s in the listener configuration, and send a heartbeat comment (: keepalive\n\n) every 20–30 s to keep the connection active regardless.
How do I handle the Last-Event-ID across horizontal scaling?
When a client reconnects it sends Last-Event-ID in the request headers. If the reconnect lands on a different server node, that node must be able to replay events from that ID. Store events in a Redis stream (XRANGE mystream <lastId> +) or a time-series database indexed by event ID. The pool manager reads the header, replays missed events, then switches to live fan-out. This is covered in detail in the Idempotent Event ID Generation guide.
Should I evict or queue slow SSE consumers?
For most use cases (live dashboards, price tickers, notification feeds) immediate eviction is correct: the EventSource reconnects, sends Last-Event-ID, and replays from the last known-good position. Queuing adds unbounded memory risk—a client paused behind a corporate proxy may queue millions of events. Only queue when the data cannot be replayed (e.g., one-time tokens) and the queue is strictly bounded with an explicit max depth and eviction policy.