This article builds on the I/O fundamentals primer from the last deep-dive. If you haven't read it yet, start there first.

The Uptime Engineer

👋 Hi, I am Yoshik Karnawat

Learn how Nginx handles 100,000 connections while Apache chokes at 10,000

Facts About I/O Multiplexing

  • At 10,000 connections, epoll is roughly 1,400x faster than select(): 0.66 seconds vs 930 seconds

  • Servers that still rely on select() are capped at 1,024 concurrent connections by the default FD_SETSIZE

  • Nginx can handle 100,000+ connections per machine using epoll, while Apache with select/poll chokes at 10,000

  • Linux systems didn't have epoll until kernel 2.5.44 (2002)

Nginx, Redis, HAProxy - handling hundreds of thousands of concurrent connections without breaking a sweat.

Apache? Starts choking past 10,000.

The difference isn't threading. Not process pools. Not some optimization hack.

It's three system calls: select, poll, and epoll.

Most engineers know the names. Few understand what they actually do or why the choice between them determines if your system scales or dies under load.

The Problem They Solve

Before these existed, the server model was broken.

One process = one connection. Or worse: a blocking read() that froze the entire worker until data arrived.

Neither approach scales when you're managing thousands of connections.

What we needed: a way to monitor multiple file descriptors simultaneously and wake up only when any of them has work to do.

This is I/O multiplexing, and it evolved in three stages.

select() - The Original Bottleneck

select() uses three fd_set bitmaps to track file descriptors:

  • readfds - FDs to watch for readability

  • writefds - FDs to watch for writability

  • exceptfds - FDs to watch for exceptional conditions

You set bits for the FDs you care about. The kernel scans them and returns when events occur.
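
Here's a minimal sketch of a select() loop in C to make the mechanics concrete. The names listen_fd, client_fds, and nclients are assumed to be set up elsewhere (socket/bind/listen), and error handling is trimmed:

#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Assumed to exist elsewhere: a listening socket and the connected clients.
extern int listen_fd;
extern int client_fds[FD_SETSIZE];
extern int nclients;

void select_loop(void)
{
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);                       // the set must be rebuilt on EVERY iteration
        FD_SET(listen_fd, &readfds);
        int maxfd = listen_fd;

        for (int i = 0; i < nclients; i++) {
            FD_SET(client_fds[i], &readfds);     // FDs numbered >= FD_SETSIZE can't be added
            if (client_fds[i] > maxfd)
                maxfd = client_fds[i];
        }

        // The kernel copies the set in, scans every FD up to maxfd, and writes results back.
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
            continue;

        if (FD_ISSET(listen_fd, &readfds)) {
            int c = accept(listen_fd, NULL, NULL);          // new client
            if (c >= 0 && nclients < FD_SETSIZE)
                client_fds[nclients++] = c;
        }

        // We also scan every client ourselves to find the ready ones.
        for (int i = 0; i < nclients; i++) {
            if (FD_ISSET(client_fds[i], &readfds)) {
                char buf[4096];
                read(client_fds[i], buf, sizeof(buf));      // handle the data
            }
        }
    }
}

Every pass through that loop redoes the FD_ZERO/FD_SET bookkeeping and re-copies the set into the kernel - keep that in mind for what follows.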

Why it doesn't scale:

Hard FD limit: FD_SETSIZE is typically 1024. Any descriptor numbered higher than that simply can't be monitored.

O(n) complexity: Every call copies the FD set from user space into the kernel, linearly scans every file descriptor, and then forces you to rebuild the entire set.

For 1,000 connections, the kernel scans all 1,000 even if only one has traffic.

The rebuild penalty:

select() overwrites the FD sets in place: on return they contain only the ready descriptors. You must reconstruct the entire bitmask on every loop iteration.

Verdict: Portable, but performance degrades linearly with connection count.

poll() - Slightly Better, Still Limited

poll() replaces bitmasks with a dynamic array of struct pollfd:

struct pollfd {
    int   fd;       // File descriptor
    short events;   // Events to monitor
    short revents;  // Events that occurred
};
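
Here's a minimal poll() sketch under the same assumptions (a pre-existing listening socket listen_fd; fds, nfds, MAX_FDS, and poll_loop are illustrative names), showing the events/revents split and the array handed to the kernel on every call:

#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 65536

// Assumed to exist elsewhere: a listening socket and an array of watched FDs.
extern int listen_fd;
extern struct pollfd fds[MAX_FDS];
extern int nfds;

void poll_loop(void)
{
    fds[0].fd = listen_fd;                       // slot 0 watches the listening socket
    fds[0].events = POLLIN;
    nfds = 1;

    for (;;) {
        // The whole array is copied into the kernel and scanned on every call.
        if (poll(fds, nfds, -1) < 0)
            continue;

        if (fds[0].revents & POLLIN) {
            int c = accept(listen_fd, NULL, NULL);   // new client
            if (c >= 0 && nfds < MAX_FDS) {
                fds[nfds].fd = c;
                fds[nfds].events = POLLIN;           // events persists; no per-loop rebuild
                nfds++;
            }
        }

        // Still O(n): every entry is checked for the revents the kernel filled in.
        for (int i = 1; i < nfds; i++) {
            if (fds[i].revents & POLLIN) {
                char buf[4096];
                read(fds[i].fd, buf, sizeof(buf));   // handle the data
            }
        }
    }
}

Because events is what you ask for and revents is what the kernel reports back, the array survives across calls untouched apart from revents.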

Improvements over select():

  • No 1024 FD limit - can monitor tens of thousands

  • Cleaner API - no bitmask manipulation

  • Separates input/output - events you request, revents the kernel returns, so no rebuild needed

But fundamentally still broken:

poll() still requires copying the entire array to kernel space, linearly scanning every FD, then copying results back.

Still O(n) per call. At 10,000 connections, this kills performance.

Verdict: Better API than select(), same performance problem at scale.

epoll() - The Revolution

Linux introduced epoll specifically for high-concurrency networking.

The fundamental difference:

select/poll: You ask the kernel, "Is anyone ready?" and it scans everyone to answer.

epoll: The kernel tells you, "Here are the ones that changed." No scanning required.

This transforms the complexity from O(n) per loop to O(1) per event.

How epoll works (3-step API):

Step 1: Create an epoll instance (epoll_create1)
Step 2: Register file descriptors (epoll_ctl)
Step 3: Wait for events (epoll_wait)
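
A minimal sketch of those three steps in C, again assuming listen_fd is an already-bound, listening socket and skipping error handling:

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

extern int listen_fd;   // assumed: an already-bound, listening socket

void epoll_loop(void)
{
    // Step 1: create an epoll instance
    int epfd = epoll_create1(0);

    // Step 2: register the FDs we care about (done once, not per loop)
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    // Step 3: wait -- the kernel returns only the FDs that actually have events
    struct epoll_event events[1024];
    for (;;) {
        int n = epoll_wait(epfd, events, 1024, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                int c = accept(listen_fd, NULL, NULL);        // new client
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = c };
                epoll_ctl(epfd, EPOLL_CTL_ADD, c, &cev);      // register once
            } else {
                char buf[4096];
                read(events[i].data.fd, buf, sizeof(buf));    // handle the data
            }
        }
    }
}

Registration happens once with epoll_ctl; the loop itself only touches the FDs the kernel hands back.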

Why epoll scales:

No scanning required. The kernel tracks the interest set internally in a red-black tree and keeps a ready list of FDs with pending events.

You only pay a cost when events happen, not on every loop iteration.

This is why Nginx, Redis, and HAProxy use it.

Edge-triggered vs Level-triggered:

Level-triggered (LT): The kernel keeps notifying you as long as the FD is readable/writable. Safer and more intuitive; the default mode.

Edge-triggered (ET): The kernel notifies you only when the state changes. The most scalable mode, but it requires careful non-blocking I/O.
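
Here's a sketch of the edge-triggered pattern, assuming conn_fd has already been made non-blocking (e.g. with fcntl and O_NONBLOCK) and epfd is an existing epoll instance: register with EPOLLET, then on each notification drain the socket until read() reports EAGAIN, because the kernel won't repeat the notification for data you leave behind.

#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

// Assumed: epfd is an existing epoll instance, conn_fd is a non-blocking socket.
void watch_edge_triggered(int epfd, int conn_fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = conn_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, conn_fd, &ev);
}

// On an EPOLLIN event in ET mode, drain the socket completely:
// the kernel will not re-notify for data left in the buffer.
void on_readable(int conn_fd)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(conn_fd, buf, sizeof(buf));
        if (n > 0)
            continue;                // process the n bytes in buf, then keep reading
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                   // fully drained; wait for the next edge
        break;                       // n == 0 (peer closed) or another error
    }
}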

select → poll → epoll represents the shift from "scan everything every time" to "notify me when something changes".

When Should You Use What?

  • select()
    Good only for tiny, portable, cross-platform utilities.

  • poll()
    Fine for medium-sized systems, or when you need more FDs but don’t need extreme scale.

  • epoll()
    If you’re building anything that needs to survive spikes or manage thousands of connections, this is the only serious choice.

Fin: If you're building anything at scale on Linux, epoll is the only serious choice.

Until next time,
Stay up. Stay fast. Stay curious.
