Tuning File-Descriptor Limits for SSE Connection Pools
Part of Connection Pooling for SSE Servers.
Each persistent SSE connection holds one open file descriptor for the TCP socket and, depending on your architecture, one more for a backing pipe, epoll slot, or Redis subscriber socket. On a default Linux install the per-process limit is 1 024. At ~800 concurrent clients your server silently starts rejecting new connections with Error: EMFILE: too many open files (Node.js), OSError: [Errno 24] Too many open files (Python), or a connection-reset with no log entry at all. This guide walks through every layer where the limit is enforced—kernel, shell session, systemd unit, and container runtime—and gives you copy-paste commands to raise it correctly and verify the result.
Symptom & Developer Intent
You are running an SSE connection pool and notice one of these failure modes once active connections cross a threshold:
- Node.js / libuv: `Error: EMFILE: too many open files, accept` in stderr or your process monitor.
- Python (FastAPI / Starlette): `OSError: [Errno 24] Too many open files` thrown inside asyncio's event loop.
- Go / net/http: `accept tcp: accept4: too many open files` returned by the listener's Accept call.
- Nginx upstream: `(24: Too many open files) while connecting to upstream` in error.log.
- Silent drops: clients receive a connection reset, or the browser EventSource immediately fires onerror and retries, never establishing a stream.
The intent is to support N concurrent SSE connections—where N might be 5 000, 50 000, or 500 000—without EMFILE/ENFILE errors and without restarting the process.
Root Cause Analysis
How Linux accounts for file descriptors
Every open socket, file, pipe, or epoll instance counts against two limits, one per process and one system-wide, and each has its own kernel ceiling:
| Limit | Scope | Default |
|---|---|---|
| RLIMIT_NOFILE (soft) | Per process | 1 024 |
| RLIMIT_NOFILE (hard) | Per process ceiling | 4 096 (varies by distro) |
| fs.file-max | System-wide open FDs | Scales with RAM; often hundreds of thousands to several million |
| fs.nr_open | Per-process kernel ceiling | 1 048 576 |
An SSE server that holds C concurrent connections consumes at minimum C sockets. Add the listening socket, a Redis pub/sub connection per worker, any log-file handles, and your TLS contexts, and a realistic overhead per connection is 1.2–1.5 FDs on average. A process with nofile=1024 therefore caps out around 680–800 live SSE streams.
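The sizing arithmetic above can be sketched as a small helper. This is a rough estimate built on the 1.2–1.5 FDs-per-connection figure from the text; the function names, the `fixedOverhead` parameter, and the 70 % fill target are illustrative assumptions, not part of any library.

```javascript
// Hypothetical sizing helpers; names and defaults are illustrative assumptions.

// How many SSE connections fit under a given nofile soft limit.
function estimateMaxConnections(nofileSoft, fdsPerConnection, fixedOverhead = 0) {
  if (!(fdsPerConnection > 0)) {
    throw new RangeError("fdsPerConnection must be > 0");
  }
  const usable = Math.max(0, nofileSoft - fixedOverhead);
  return Math.floor(usable / fdsPerConnection);
}

// Smallest nofile value that keeps the target load at or below `fillTarget`
// (a fraction in (0, 1]) of the limit.
function requiredNofile(targetConnections, fdsPerConnection, fixedOverhead = 0, fillTarget = 0.7) {
  if (!(fillTarget > 0 && fillTarget <= 1)) {
    throw new RangeError("fillTarget must be in (0, 1]");
  }
  const expectedFds = targetConnections * fdsPerConnection + fixedOverhead;
  return Math.ceil(expectedFds / fillTarget);
}

console.log(estimateMaxConnections(1024, 1.5));   // 682
console.log(requiredNofile(50000, 1.5, 20, 0.7)); // 107172
```

With the default limit of 1 024 and the pessimistic 1.5 FDs per connection, the estimate is 682 streams, the lower end of the range quoted above. Running the calculation in reverse gives a starting value for the limit rather than a guarantee; measure real FD usage under load before settling on a number.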
Why raising it in your shell is not enough
Running ulimit -n 65535 in a terminal changes the limit only for that shell session and the processes it starts. When a process manager (systemd, Docker, Kubernetes, PM2) launches your server, the server inherits the limits configured in the manager's own unit file or runtime spec, not those of your interactive shell. This is the single most common reason engineers raise the limit and still see EMFILE errors.
The kernel’s system-wide ceiling
fs.file-max caps the total number of open FDs across all processes. On most modern kernels the default is large enough (often cat /proc/sys/fs/file-max returns 9–12 million), but a constrained VPS or container base image may set it much lower. If you are scaling past 100 000 concurrent SSE connections you need to verify this as well.
Step-by-Step Resolution
Step 1 — Diagnose current limits
Run these as the user that owns the server process:
# Soft and hard limit for the current shell
ulimit -Sn # soft (enforced)
ulimit -Hn # hard (ceiling you can raise to without root)
# Limits of a running process (adjust the pgrep pattern; -o picks the oldest match)
grep "open files" /proc/$(pgrep -of "node server")/limits
# System-wide current usage
sysctl fs.file-nr # used / free / max
cat /proc/sys/fs/file-max # absolute kernel ceiling
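If you would rather read the per-process limits programmatically than with grep, the "Max open files" row can be parsed as sketched below. The function name is hypothetical and the sample text only imitates the column layout of /proc/<pid>/limits; its values are illustrative.

```javascript
// Parses the "Max open files" row from the text of /proc/<pid>/limits.
// Returns { soft, hard } as numbers (Infinity for "unlimited"), or null if the row is absent.
function parseNofileLimits(limitsText) {
  const match = limitsText.match(/^Max open files\s+(\S+)\s+(\S+)/m);
  if (!match) return null;
  const toNumber = (field) =>
    field === "unlimited" ? Infinity : Number.parseInt(field, 10);
  return { soft: toNumber(match[1]), hard: toNumber(match[2]) };
}

// Sample content in the layout the kernel prints (values are illustrative).
const sample = [
  "Limit                     Soft Limit           Hard Limit           Units     ",
  "Max processes             63446                63446                processes ",
  "Max open files            1024                 4096                 files     ",
].join("\n");

console.log(parseNofileLimits(sample)); // { soft: 1024, hard: 4096 }
```

On Linux, pass it the contents of /proc/self/limits (for example via `readFileSync` from `node:fs`) to inspect the limits the server process actually received.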
Step 2 — Raise the limit for an interactive / dev session
# Raise soft limit to 65 535 for this shell (requires hard limit >= target)
ulimit -Sn 65535
# If the hard limit is too low, you need root:
sudo sh -c 'ulimit -Hn 1048576; exec su - youruser'
# Verify
ulimit -n # should print 65535
Step 3 — Set persistent per-user limits via /etc/security/limits.conf
This applies to PAM-authenticated logins (SSH, console). It does not apply to systemd services.
# /etc/security/limits.conf (append or replace existing nofile lines)
* soft nofile 65535
* hard nofile 1048576
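The new limits take effect at the next login, when pam_limits reads the file, so log out and back in. limits.conf cannot lift a process above the kernel's own ceilings; if the system-wide ceiling is the constraint, raise it separately:
sudo sysctl -w fs.file-max=2097152 # temporary system-wide raise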
Step 4 — Configure systemd service units
This is the correct place to set limits for any process launched by systemd. Edit your service file directly or use a drop-in:
# Create a drop-in override (preferred — survives package updates)
sudo mkdir -p /etc/systemd/system/my-sse-server.service.d/
sudo tee /etc/systemd/system/my-sse-server.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=1048576
EOF
# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart my-sse-server
# Verify the running process picked up the new limit
systemctl show my-sse-server | grep LimitNOFILE
# → LimitNOFILE=1048576
cat /proc/$(systemctl show --property MainPID --value my-sse-server)/limits \
| grep "open files"
LimitNOFILE in systemd sets both soft and hard to the same value; specify LimitNOFILE=soft:hard (e.g., 65535:1048576) if you want them to differ.
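The two accepted forms of the value can be illustrated with a small parser. This is a sketch of the rule described above, not systemd's own implementation; the function name is hypothetical, and treating the string "infinity" as "no limit" follows systemd's documented resource-limit syntax.

```javascript
// Hypothetical helper: interprets a LimitNOFILE= value as described in the text
// (single value = soft and hard; "soft:hard" = separate values).
function parseLimitNofile(value) {
  const toNumber = (field) => {
    if (field === "infinity") return Infinity;
    if (!/^\d+$/.test(field)) {
      throw new SyntaxError(`not a valid limit: "${field}"`);
    }
    return Number.parseInt(field, 10);
  };
  const parts = value.trim().split(":");
  if (parts.length > 2) {
    throw new SyntaxError(`too many fields in "${value}"`);
  }
  const soft = toNumber(parts[0]);
  const hard = parts.length === 2 ? toNumber(parts[1]) : soft;
  if (soft > hard) {
    throw new RangeError("soft limit cannot exceed hard limit");
  }
  return { soft, hard };
}

console.log(parseLimitNofile("1048576"));       // { soft: 1048576, hard: 1048576 }
console.log(parseLimitNofile("65535:1048576")); // { soft: 65535, hard: 1048576 }
```

A check like this is useful in deploy tooling that generates drop-in files, where a swapped soft and hard value would otherwise only surface when the unit fails to start.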
Step 5 — Tune kernel sysctl parameters
# /etc/sysctl.d/99-sse-fds.conf
# Comments must sit on their own lines; trailing text is read as part of the value.
# System-wide ceiling
fs.file-max = 2097152
# Per-process kernel ceiling; LimitNOFILE cannot be set above it
fs.nr_open = 1048576
# TCP tunables that affect connection lifecycle
# Shorten the FIN-WAIT-2 timeout for orphaned sockets
net.ipv4.tcp_fin_timeout = 15
# Allow reuse of TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Listen backlog
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 8192
# Apply immediately without reboot (run in a shell; not part of the file)
sudo sysctl --system
Verify:
sysctl fs.file-max fs.nr_open
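Before rolling these values out, it can help to confirm that the per-process and system-wide numbers are consistent with each other. The sketch below encodes two checks: the nofile hard limit must not exceed fs.nr_open, and the FDs you expect to hold across all processes should fit within fs.file-max. The function and field names are hypothetical; feed it the numbers from sysctl and your unit file.

```javascript
// Hypothetical consistency check for a planned set of FD limits.
// Returns a list of human-readable problems; an empty list means the plan is coherent.
function checkFdPlan({ fileMax, nrOpen, nofileHard, processes, expectedFdsPerProcess }) {
  const problems = [];
  if (nofileHard > nrOpen) {
    problems.push(`nofile hard limit ${nofileHard} exceeds fs.nr_open ${nrOpen}`);
  }
  if (expectedFdsPerProcess > nofileHard) {
    problems.push(`expected ${expectedFdsPerProcess} FDs per process exceeds nofile hard limit ${nofileHard}`);
  }
  const expectedTotal = processes * expectedFdsPerProcess;
  if (expectedTotal > fileMax) {
    problems.push(`expected ${expectedTotal} FDs system-wide exceeds fs.file-max ${fileMax}`);
  }
  return problems;
}

const plan = {
  fileMax: 2097152,
  nrOpen: 1048576,
  nofileHard: 1048576,
  processes: 4,
  expectedFdsPerProcess: 100000,
};
console.log(checkFdPlan(plan)); // []
```

The check is deliberately conservative: it ignores FDs held by unrelated processes on the host, so leave margin below fs.file-max rather than sizing to the exact total.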
Step 6 — Docker / OCI container runtime
Containers inherit nofile from the Docker daemon's defaults, not from your application's systemd unit or your shell. Set it explicitly:
# docker run
docker run \
  --ulimit nofile=1048576:1048576 \
  my-sse-server:latest

# docker-compose.yml
services:
  sse-server:
    image: my-sse-server:latest
    ulimits:
      nofile:
        soft: 65535
        hard: 1048576
For the Docker daemon itself (affects all containers on the host):
/etc/docker/daemon.json (JSON has no comment syntax, so keep the path out of the file):
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 1048576,
      "Soft": 65535
    }
  }
}
Restart the daemon after editing: sudo systemctl restart docker.
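Because the same soft/hard pair appears in the `--ulimit` flag and in daemon.json, generating both from one source avoids drift between them. The helper below is a hypothetical sketch; it only renders strings in the two formats used in this step and does not talk to Docker.

```javascript
// Hypothetical helper that renders one nofile pair in both Docker forms.
function dockerNofileConfig(soft, hard) {
  if (!Number.isInteger(soft) || !Number.isInteger(hard) || soft < 0 || hard < soft) {
    throw new RangeError("need integers with 0 <= soft <= hard");
  }
  return {
    // Flag for `docker run`
    runFlag: `--ulimit nofile=${soft}:${hard}`,
    // Fragment for /etc/docker/daemon.json
    daemonJson: JSON.stringify(
      { "default-ulimits": { nofile: { Name: "nofile", Hard: hard, Soft: soft } } },
      null,
      2
    ),
  };
}

const cfg = dockerNofileConfig(65535, 1048576);
console.log(cfg.runFlag); // --ulimit nofile=65535:1048576
console.log(cfg.daemonJson);
```

If daemon.json already contains other keys, merge the generated fragment into the existing object rather than overwriting the file.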
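Step 7 — Kubernetes
The Pod spec currently has no field for ulimits. Containers inherit nofile from the node's container runtime, so the limit is set at the node level. An init container cannot set it for you: ulimit changes only the process that calls it and that process's children, and the init container exits before the application container starts. Inside the application container you can still raise the soft limit up to the inherited hard limit by calling ulimit in the entrypoint:
# pod-spec fragment
containers:
  - name: sse-server
    image: my-sse-server:latest
    # Raise the limit in the same shell that execs the server.
    # Fails if 1048576 exceeds the hard limit inherited from the runtime.
    command: ["sh", "-c", "ulimit -n 1048576 && exec node server.js"]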
For node-level settings, configure containerd or CRI-O via their config files. Kernel-wide sysctls such as fs.file-max can be applied from a privileged DaemonSet:
# DaemonSet container command (requires privileged)
command: ["sysctl", "-w", "fs.file-max=2097152"]
Step 8 — Application-layer guard (Node.js example)
Even after raising OS limits, guard against misconfigured environments at startup:
// startup-check.js (ESM)
import { readFileSync } from "node:fs";

const MAX_REQUIRED = 65_535;

function checkFdLimit() {
  let raw;
  try {
    // /proc/self/limits is Linux-only; skip gracefully on other OSes
    raw = readFileSync("/proc/self/limits", "utf8");
  } catch {
    return; // not Linux, nothing to check
  }
  const match = raw.match(/Max open files\s+(\d+)/);
  // A non-numeric value ("unlimited") never trips the guard
  const soft = match ? parseInt(match[1], 10) : Infinity;
  if (soft < MAX_REQUIRED) {
    console.error(
      `[FATAL] nofile soft limit is ${soft}; need >= ${MAX_REQUIRED}. ` +
        `Add LimitNOFILE=${MAX_REQUIRED} to your systemd unit.`
    );
    process.exit(1);
  }
  console.log(`[OK] nofile soft limit: ${soft}`);
}

checkFdLimit();
Validation & Monitoring
Check limits took effect
# For a running PID
PID=$(pgrep -f "node server.js")
grep "open files" /proc/$PID/limits
# Max open files 65535 1048576 files
# Count currently open FDs for that process
ls -1 /proc/$PID/fd | wc -l
Load-test to confirm headroom
# Uses only curl and xargs; no extra tooling needed
# Open 2000 simultaneous SSE connections and hold each for 30 s
seq 1 2000 | xargs -P 2000 -I{} \
  curl -s -N --max-time 30 -H "Accept: text/event-stream" \
  http://localhost:3000/events > /dev/null &
# While that runs, watch FD consumption
watch -n1 "ls -1 /proc/$(pgrep -f 'node server')/fd | wc -l"
Prometheus / metrics
If your server exposes metrics, track the open-FD count as a gauge:
// Example: expose current FD count from Node.js
import { readdirSync } from "node:fs";

function openFdCount(pid = process.pid) {
  try {
    // Each entry in /proc/<pid>/fd is one open descriptor
    return readdirSync(`/proc/${pid}/fd`).length;
  } catch {
    // /proc/PID/fd requires the same uid or root, and exists only on Linux
    return -1;
  }
}
// Register as a Prometheus gauge and scrape via /metrics
Key alert thresholds: warn at 70 % of nofile soft limit; page at 90 %. This gives headroom for reconnection storms when clients retry after a deploy. For more on managing reconnect bursts, see Rate Limiting & Backpressure Handling and Event ID & Retry Mechanism Design.
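The thresholds translate into a simple classification step. The sketch below assumes you already have the open-FD count and the soft limit as numbers; the function name and return shape are hypothetical.

```javascript
// Hypothetical alert classifier using the 70 % / 90 % thresholds from the text.
function classifyFdUsage(openFds, softLimit, warnAt = 0.7, pageAt = 0.9) {
  if (!(softLimit > 0)) {
    throw new RangeError("softLimit must be > 0");
  }
  const ratio = openFds / softLimit;
  if (ratio >= pageAt) return { level: "page", ratio };
  if (ratio >= warnAt) return { level: "warn", ratio };
  return { level: "ok", ratio };
}

console.log(classifyFdUsage(30000, 65535).level); // ok
console.log(classifyFdUsage(50000, 65535).level); // warn
console.log(classifyFdUsage(60000, 65535).level); // page
```

In a Prometheus setup the same comparison would normally live in an alerting rule rather than in application code; the function is mainly useful for startup logs or health endpoints.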
System-wide FD exhaustion check
# columns: allocated / free / max
cat /proc/sys/fs/file-nr
# e.g.: 14368 0 2097152
# "free" is always 0 on modern kernels (not a concern)
Alert if allocated / max > 0.8.
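The ratio can be computed from the file's three counters as sketched below. The function name is hypothetical, and the input string mirrors the example values shown above.

```javascript
// Parses the three whitespace-separated counters in /proc/sys/fs/file-nr.
function parseFileNr(text) {
  const fields = text.trim().split(/\s+/).map(Number);
  if (fields.length !== 3 || fields.some((n) => !Number.isFinite(n))) {
    throw new SyntaxError(`unexpected file-nr content: "${text.trim()}"`);
  }
  const [allocated, unused, max] = fields;
  return { allocated, unused, max, ratio: allocated / max };
}

const usage = parseFileNr("14368\t0\t2097152\n");
console.log(usage.ratio > 0.8 ? "ALERT" : "ok"); // ok
```

On hosts where fs.file-max is set to a very large value, the maximum may not be exactly representable as a JavaScript number; the ratio is then effectively zero, which is still the right answer for alerting purposes.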
Verification Checklist
⚡ Production Directives
- Set LimitNOFILE=1048576 in every systemd unit that runs an SSE server. This is the single highest-impact change and cannot be replaced by /etc/security/limits.conf.
- Add a startup FD-limit guard that exits non-zero when the soft limit is below your minimum; catch misconfigured deploys before they reach production traffic.
- Alert at 70 % of the soft limit on a per-process gauge; reconnection storms after a rolling restart can momentarily double active connections.
- Tune net.ipv4.tcp_fin_timeout=15 and tcp_tw_reuse=1 alongside FD limits. Sockets in TIME_WAIT no longer hold an FD once the application has closed them, but they still occupy kernel memory and local ports during connection churn.
- In Docker, set ulimits at the container level; host-level daemon.json defaults are overridden per container, so set both as defence-in-depth. In Kubernetes, set them in the node's container runtime.
Frequently Asked Questions
Why does raising ulimit in my shell not fix the EMFILE error in production?
Shell ulimit changes apply only to that shell session and its direct children. Systemd, Docker, and Kubernetes all fork processes with limits inherited from their own configuration, not your interactive session. The correct fix depends on the supervisor: LimitNOFILE in a systemd unit, --ulimit in docker run, or a node-level containerd setting in Kubernetes.
How many file descriptors does one SSE connection actually use?
One TCP socket = one FD. If your SSE architecture subscribes each connection to a Redis channel directly, add one more FD per connection for the Redis socket (though a shared pub/sub fan-out pattern avoids this — see Redis Pub/Sub Fan-Out for SSE). Add ~10–20 FDs for the listening socket, TLS contexts, log files, and internal pipes. Budget 1.2–1.5 FDs per connection for sizing.
What is the maximum practical value for LimitNOFILE?
The kernel's per-process ceiling is fs.nr_open, which defaults to 1 048 576 (2^20) and can itself be raised with sysctl. In practice, 1 048 576 is a sensible production target for high-concurrency SSE servers; setting LimitNOFILE higher requires raising fs.nr_open first, which is rarely necessary unless you exceed ~800 000 concurrent connections on a single process.
Do I need to change fs.file-max as well as LimitNOFILE?
Only if your system-wide total (across all processes) approaches the fs.file-max value shown in /proc/sys/fs/file-nr. On a dedicated SSE server this is uncommon until you exceed ~500 000 connections. Run cat /proc/sys/fs/file-nr to check; if allocated / max > 0.5, raise fs.file-max in /etc/sysctl.d/.
My Go server hits EMFILE but the Node.js server on the same box does not — why?
Go's net/http server opens the listening socket, one goroutine stack (not an FD), and one FD per accepted connection — the same as Node.js. The discrepancy is usually the Go process running under a different systemd unit or user with a stricter LimitNOFILE, or the Go binary calling setrlimit at startup with a hardcoded value. Check /proc/$(pgrep -f mygoserver)/limits and compare against the Node.js process.