Performance

What is IOPS (I/O Operations Per Second)?

IOPS measures the number of read or write operations a storage system can perform per second — the primary metric for evaluating storage performance under random-access workloads like databases and transactional applications.

Technical Overview

IOPS is always measured relative to a specific I/O profile: block size (4K, 8K, 128K, etc.), access pattern (random vs. sequential), read/write ratio, and queue depth. A storage system that delivers 1 million IOPS at 4K random reads may deliver only 200K IOPS at 4K random writes, and 50K IOPS at 128K sequential writes — these are different workloads that stress different components of the storage path. Storage vendors and benchmarks typically report peak IOPS at the most favorable conditions (small block, 100% read, high queue depth), so understanding the test parameters is critical when comparing specifications.
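To see why those figures describe different workloads rather than one "speed," it helps to convert each IOPS number to raw payload bandwidth at its block size. A minimal sketch, plugging in the illustrative figures from the paragraph above:

```python
def iops_to_bandwidth_gbps(iops: float, block_bytes: int) -> float:
    """Convert an IOPS figure at a given block size to GB/s of payload."""
    return iops * block_bytes / 1e9

# The same device looks very different depending on the I/O profile:
# 1M IOPS at 4K moves ~4.1 GB/s, while 50K IOPS at 128K moves ~6.6 GB/s.
print(iops_to_bandwidth_gbps(1_000_000, 4096))   # 4K random read
print(iops_to_bandwidth_gbps(200_000, 4096))     # 4K random write
print(iops_to_bandwidth_gbps(50_000, 131072))    # 128K sequential write
```

The 128K sequential case moves more bytes per second at far fewer operations per second, which is why a single peak-IOPS number without the test profile says little about a system.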

The relationship between IOPS, latency, and queue depth follows Little's Law: IOPS = Queue Depth / Latency. At a fixed latency of 100 µs, a storage path with queue depth 1 can achieve 10,000 IOPS; with queue depth 100, the same latency yields 1,000,000 IOPS. This is why NVMe's massive queue depth advantage translates directly into higher IOPS: the protocol can keep more operations in flight simultaneously, filling the latency pipeline and extracting full device parallelism.
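The Little's Law arithmetic above can be checked directly; a minimal sketch using the 100 µs latency figure from the text:

```python
def iops_from_littles_law(queue_depth: int, latency_seconds: float) -> float:
    """Little's Law for storage: concurrency = throughput x latency,
    so IOPS = queue depth / latency (latency in seconds)."""
    return queue_depth / latency_seconds

latency = 100e-6  # 100 microseconds per operation
print(iops_from_littles_law(1, latency))    # queue depth 1:   ~10,000 IOPS
print(iops_from_littles_law(100, latency))  # queue depth 100: ~1,000,000 IOPS
```

The same relation explains why deeper queues raise IOPS only while the device can actually service operations in parallel; once the device saturates, added queue depth shows up as added latency instead.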

For NVMe/TCP, IOPS is bounded by several factors: network bandwidth (a 25 GbE link at 4K blocks can theoretically carry ~760K IOPS before being bandwidth-limited), CPU processing overhead per I/O operation, storage device IOPS capacity, and the efficiency of the software stack. Modern NVMe SSDs can exceed 1.5 million 4K random read IOPS; NVMe/TCP targets serving those SSDs can approach that limit on a well-configured server with sufficient network bandwidth and CPU cores.
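The network-bandwidth bound mentioned above is simple back-of-envelope arithmetic. A minimal sketch (this counts payload bytes only and ignores TCP/IP and NVMe/TCP framing overhead, so real ceilings are somewhat lower):

```python
def network_iops_ceiling(link_gbps: float, block_bytes: int) -> float:
    """Upper bound on IOPS imposed by raw link bandwidth alone."""
    bytes_per_second = link_gbps * 1e9 / 8  # link rate in bytes/s
    return bytes_per_second / block_bytes

print(network_iops_ceiling(25, 4096))   # 25 GbE at 4K:  ~763K IOPS
print(network_iops_ceiling(100, 4096))  # 100 GbE at 4K: ~3.05M IOPS
```

At 100 GbE the link ceiling comfortably exceeds a single high-end SSD's ~1.5M IOPS, which is why well-configured NVMe/TCP targets can approach device-native performance.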

How It Relates to NVMe/TCP

IOPS is the headline performance metric where NVMe/TCP most dramatically outperforms legacy protocols. The combination of NVMe's multi-queue architecture, the low per-operation overhead of the NVMe command set (no SCSI CDB translation), and modern TCP offload capabilities allows NVMe/TCP to deliver IOPS that were previously achievable only with local NVMe storage. Comparing it with iSCSI, which in typical deployments is bandwidth- and queue-depth-limited to roughly 400K IOPS, demonstrates the practical impact of protocol choice on storage-limited application workloads.

Key Characteristics

  • Measurement unit: Operations per second (K IOPS = thousands, M IOPS = millions)
  • Key variables: Block size, read/write ratio, queue depth, access pattern
  • Benchmark tools: fio, iometer, vdbench, sysbench
  • Little's Law: IOPS = Queue Depth ÷ Latency (in seconds)
  • NVMe local: Up to 1.5–7M IOPS (device-dependent)
  • NVMe/TCP (practical): Up to ~1.5M IOPS over 100 GbE

IOPS by Protocol (4K Random Read, High Queue Depth)

Protocol / Medium          Typical IOPS   Notes
NVMe local (PCIe 4.0)      ~1.5M          Single high-end NVMe SSD
NVMe/TCP (25–100 GbE)      ~1.5M          Near-native with sufficient bandwidth
iSCSI (10–25 GbE)          ~400K          Limited by single queue depth
NFS (v4.1)                 100K–500K      Metadata overhead on random I/O