Series

Off-Heap Algorithms in Java

7 articles in this series

192 min total read
By Arthur Costa
Part 1

Off-Heap Algorithms in Java: The Ring Buffer Foundation

From a naive heap-based queue to an off-heap ring buffer with dramatically better throughput, tail latency, and GC behavior for high-frequency trading workloads.

Nov 15, 2025 · 20 min read
Arthur Costa
Part 2

Wait-Free SPSC Queues in Java

How to replace synchronized queue handshakes with a wait-free Single-Producer Single-Consumer ring buffer that uses precise memory ordering instead of locks.

Dec 23, 2025 · 18 min read
Arthur Costa
Part 3

Lock-Free MPSC Queues in Java

How to replace locked many-producer queues with a lock-free Multi-Producer Single-Consumer ring buffer coordinated entirely by CAS and sequence numbers.

Nov 17, 2025 · 18 min read
Arthur Costa
Part 4

MPMC Queues in Java: The Final Boss

How to build a dual-CAS Multi-Producer Multi-Consumer ring buffer in Java that scales on both ends without collapsing under lock contention.

Nov 18, 2025 · 18 min read
Arthur Costa
Part 5

Event Pipelines in Java: The LMAX Disruptor Pattern

How to chain SPSC queues into a high-throughput event pipeline, following the LMAX Disruptor pattern for multi-stage processing with sub-microsecond latency.

Nov 19, 2025 · 18 min read
Arthur Costa
Part 6

Wait-Free Telemetry: Never-Blocking Observability

Build wait-free telemetry buffers with overwrite semantics that never block producers, giving high-frequency trading systems observability without impacting their performance.

Jan 4, 2026 · 50 min read
Arthur Costa
Part 7

Sharded Processing: Per-Core Isolation for Zero Contention

Eliminate contention entirely with per-CPU-core sharded buffers, thread affinity, and isolated processing lanes for maximum parallelism.

Jan 4, 2026 · 50 min read
Arthur Costa