Queue depth is the number of I/O commands that can be submitted and outstanding simultaneously to a storage device or controller — higher queue depths enable greater parallelism and throughput for modern NVMe storage.
Queue depth, also called outstanding I/O or I/O depth, is the number of I/O commands a host can have simultaneously in flight to a storage target. When an application issues a read or write, that command enters a submission queue; the storage device processes it and returns a completion. If the device processes commands faster than they arrive, the queue sits mostly empty (low queue depth). If the host keeps the queue full, the device's internal parallelism is fully utilized. Modern NVMe SSDs are designed to exploit deep queues: their controllers drive many NAND dies in parallel, so a queue depth of 1 delivers only a fraction of the device's potential IOPS.
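The in-flight concept can be sketched from userspace. This is an illustrative Python sketch, not real NVMe queue plumbing: a thread pool of size `QUEUE_DEPTH` stands in for the submission queue, capping how many reads are outstanding at once (the file, block size, and I/O count are made up for the example):

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Illustration only: "queue depth" here is simply how many reads we
# allow in flight at once. Real NVMe queues live in the driver/device;
# a pool of QUEUE_DEPTH workers approximates the same concurrency.
QUEUE_DEPTH = 8
BLOCK = 4096
NUM_IOS = 64

# Scratch file large enough for our fixed-offset reads.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(BLOCK * NUM_IOS))
    path = f.name

fd = os.open(path, os.O_RDONLY)

def read_block(i):
    # pread (POSIX) reads at an explicit offset, so workers
    # don't fight over a shared file position.
    return len(os.pread(fd, BLOCK, i * BLOCK))

# At most QUEUE_DEPTH reads are outstanding at any moment.
with ThreadPoolExecutor(max_workers=QUEUE_DEPTH) as pool:
    results = list(pool.map(read_block, range(NUM_IOS)))

os.close(fd)
os.unlink(path)
print(sum(results))  # 262144 bytes read in total
```

Raising `QUEUE_DEPTH` from 1 to 8 lets eight reads overlap instead of serializing on each completion, which is exactly the effect deep hardware queues provide.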
The NVMe protocol was architected from the ground up for deep-queue workloads. The specification allows up to 65,535 I/O queues per controller and up to 65,535 commands per queue, for a theoretical maximum of over 4 billion outstanding commands. In practice, deployments use far fewer queues (typically one per CPU core), but even one queue per core at queue depth 64–256 provides dramatically more parallelism than legacy protocols.
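The spec arithmetic above checks out directly; the "realistic deployment" core count and per-queue depth below are assumptions for illustration:

```python
# NVMe spec limits quoted in the text.
MAX_QUEUES = 65_535        # I/O queues per controller
MAX_CMDS_PER_QUEUE = 65_535  # commands per queue

theoretical_max = MAX_QUEUES * MAX_CMDS_PER_QUEUE
print(f"{theoretical_max:,}")  # 4,294,836,225 -- "over 4 billion"

# Assumed realistic deployment: one queue per core, moderate depth.
cores, per_queue_depth = 32, 128
print(cores * per_queue_depth)  # 4096 outstanding commands
```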
Queue depth interacts with latency in a predictable way described by Little's Law: mean queue depth = throughput × mean latency, or equivalently, average latency = queue depth / throughput. At low queue depths, throughput is limited by the number of concurrent operations, not by device speed. At very high queue depths, queueing delay begins to dominate and average latency climbs. The optimal operating point, where IOPS are maximized while latency remains acceptable, is the saturation point of the storage path, and where it lies depends heavily on how much queue depth the path supports.
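A worked example of Little's Law with assumed numbers (a hypothetical device with a 100 µs mean service latency):

```python
# Little's Law: mean queue depth L = arrival rate λ × mean latency W.
# Rearranged for storage in steady state: IOPS = queue_depth / latency.
def iops(queue_depth, latency_s):
    return queue_depth / latency_s

latency = 100e-6  # assumed 100 µs mean latency, for illustration

print(iops(1, latency))   # QD=1  -> ~10,000 IOPS: latency-bound
print(iops(64, latency))  # QD=64 -> ~640,000 IOPS: parallelism-bound
```

At queue depth 1, each command must complete before the next is issued, so throughput is capped by latency alone; 64 in-flight commands multiply the ceiling 64-fold until the device itself saturates.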
Queue depth is one of the primary reasons NVMe/TCP outperforms iSCSI and Fibre Channel at scale. Both iSCSI and traditional FC expose only a single command queue per session or LUN, creating a fundamental concurrency bottleneck. NVMe/TCP inherits NVMe's multi-queue architecture, letting each host CPU core submit I/O independently without contention. For workloads like database OLTP, which demand high concurrency of small random I/Os, this multi-queue advantage can translate into 3–5× higher throughput and lower latency at the same network bandwidth.
| Protocol | Queues | Commands per Queue | Total Max Outstanding |
|---|---|---|---|
| NVMe/TCP | Up to 65,535 | Up to 65,535 | ~4 billion |
| iSCSI | 1 per session | 128 (typical CmdSN window) | 128 |
| Fibre Channel | 1 per LUN | 256 (typical task set size) | 256 |
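Plugging the table's ceilings back into Little's Law shows why the single-queue limits bite; the 500 µs round-trip latency and the 32-queue NVMe/TCP configuration below are assumptions chosen for illustration:

```python
# Max sustainable IOPS ≈ max outstanding commands / round-trip latency
# (Little's Law again, with the queue always full).
def max_iops(outstanding, latency_s):
    return outstanding / latency_s

rtt = 500e-6  # assumed 500 µs network round trip

print(max_iops(128, rtt))       # iSCSI session ceiling:  ~256k IOPS
print(max_iops(256, rtt))       # FC per-LUN ceiling:     ~512k IOPS
print(max_iops(32 * 128, rtt))  # NVMe/TCP, 32 queues at QD 128: ~8.2M
```

The single-queue protocols hit a hard IOPS ceiling set by command-window size and latency, while NVMe/TCP scales the ceiling with the number of queues.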