May in TigerLand

    Dear friends,

    We hope your May was marvelous. Last month, we reworked the Linux loop with an 8% throughput increase, simplified the Grid’s block ownership model, and improved how stalls are decided and applied. We enjoyed Swival’s TigerBeetle audit, saluted Prof. Hannes Mühleisen’s OLTP vs OLGP vs OLAP spectrum, and announced our full house of speakers for Systems Distributed, taking place in Boston, July 27-28. Tickets are available!

    Let’s go!

    “Style ought to prove that one believes in an idea;
    not only that one thinks it but also feels it.”

    Friedrich Nietzsche

    TigerBeetle runs everything on a single thread, alternating between replica.tick(), which advances state, and io.run_for_ns(), which submits queued I/O work, waits for completions, and fires callbacks. Every I/O operation is asynchronous and tracked via a caller-owned Completion struct that bundles the syscall to run, a pointer to its context, and the callback to invoke once the kernel finishes the operation. On Linux, I/O operations are submitted to the kernel via io_uring. Last month, we reworked the Linux loop invoked by io.run_for_ns().

    • Previously, run_for_ns tracked a separate timeout, including a workaround to avoid deadlocking due to an ancient kernel bug. Now, the timeout is embedded directly into the io_uring_enter syscall. This eliminates both the timeout bookkeeping and the workaround.

    • The next_tick function is now publicly exposed and allows scheduling a deferred callback without involving kernel I/O. Also, we now re-enter the kernel after each completion’s callback is invoked rather than only once after all callbacks are run. Together these changes result in an 8% throughput increase by pipelining I/O and CPU work more effectively.
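
    To make the loop shape concrete, here is a minimal sketch in Python rather than Zig. It is not TigerBeetle's code: the ToyIO class, its queues, and the run method are hypothetical, and the "kernel" is just an in-memory queue instead of io_uring. It only illustrates a caller-owned completion record, a next_tick that bypasses the kernel, and re-entering the kernel after each callback.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Completion:
    """Caller-owned record of one async operation (field names are illustrative)."""
    operation: Callable[[], Any]          # stands in for the syscall to run
    context: Any                          # opaque reference back to the caller's state
    callback: Callable[[Any, Any], None]  # invoked with (context, result)
    result: Any = None


class ToyIO:
    """In-memory stand-in for the kernel; nothing here touches io_uring."""

    def __init__(self):
        self.unqueued = deque()   # completions not yet handed to the "kernel"
        self.in_kernel = deque()  # submitted, awaiting completion
        self.deferred = deque()   # next_tick callbacks
        self.submit_calls = 0     # how often we "entered the kernel" with work

    def submit(self, completion):
        self.unqueued.append(completion)

    def next_tick(self, context, callback):
        # Deferred callback that never involves the (simulated) kernel.
        self.deferred.append(Completion(lambda: None, context, callback))

    def _enter_kernel(self):
        # Hand all queued work to the "kernel" in one call.
        if self.unqueued:
            self.submit_calls += 1
            self.in_kernel.extend(self.unqueued)
            self.unqueued.clear()

    def run(self):
        # Drain until idle. A real loop would be bounded by a timeout instead.
        while self.unqueued or self.in_kernel or self.deferred:
            while self.deferred:
                c = self.deferred.popleft()
                c.callback(c.context, None)
            self._enter_kernel()
            while self.in_kernel:
                c = self.in_kernel.popleft()
                c.result = c.operation()
                c.callback(c.context, c.result)
                # Re-enter after every callback, so work queued by this
                # callback is submitted before the next callback runs.
                self._enter_kernel()


log = []
io = ToyIO()


def on_done(context, result):
    log.append((context, result))
    if context == "first":
        # A callback may queue follow-up work.
        io.submit(Completion(lambda: 2, "second", on_done))


io.submit(Completion(lambda: 1, "first", on_done))
io.next_tick("tick", lambda context, _: log.append((context, None)))
io.run()
```

    In this toy, the follow-up operation queued by the first callback is submitted immediately rather than waiting for every other callback to finish, which is the pipelining idea described above.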

    The Grid is TigerBeetle’s abstraction over on-disk storage, providing read/write access to fixed-size blocks including an in-memory cache for hot blocks. Every component in TigerBeetle that touches on-disk data goes through the Grid. Now, the Grid’s cache owns all blocks.

    • Previously, each component allocated its own blocks and swapped them in and out of Grid ownership. The change allows allocating all blocks as one contiguous memory region rather than allocating them one-by-one.

    • On startup, the Grid allocates cache_count and stash_count blocks, where stash_count is the upper limit on the number of blocks that components may hold a reference to at once. Components like compaction and scans no longer need to copy blocks; instead, they can borrow them from the Grid. Even if a block is evicted from the cache, the component can keep a reference indefinitely because the block is moved to the stash.

    • The change results in a simpler ownership model and a small throughput increase from 697K to 712K transactions per second.
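
    The cache-plus-stash idea can be sketched as a toy in Python. This is an illustration under assumptions, not the Grid's real implementation: ToyGrid, its method names, the oldest-first eviction policy, and the reference counting are all hypothetical. Only cache_count and stash_count come from the description above.

```python
class ToyGrid:
    """Owns every block; evicted blocks that are still borrowed move to a stash."""

    def __init__(self, cache_count, stash_count, block_size=8):
        total = cache_count + stash_count
        self.block_size = block_size
        self.cache_count = cache_count
        self.stash_count = stash_count
        # One contiguous allocation, sliced into fixed-size blocks.
        self.memory = bytearray(total * block_size)
        self.free = list(range(total))  # unused slots
        self.cache = {}     # address -> slot, oldest entry first
        self.stash = {}     # address -> slot, evicted while still borrowed
        self.borrowed = {}  # address -> reference count

    def _block(self, slot):
        start = slot * self.block_size
        return memoryview(self.memory)[start:start + self.block_size]

    def _evict_oldest(self):
        address = next(iter(self.cache))
        slot = self.cache.pop(address)
        if address in self.borrowed:
            self.stash[address] = slot  # keep it alive for the borrower
        else:
            self.free.append(slot)

    def write(self, address, data):
        # Toy simplification: each address is written exactly once.
        assert len(data) == self.block_size
        assert address not in self.cache and address not in self.stash
        if len(self.cache) == self.cache_count:
            self._evict_oldest()
        slot = self.free.pop()
        self._block(slot)[:] = data
        self.cache[address] = slot

    def borrow(self, address):
        slot = self.cache.get(address, self.stash.get(address))
        assert slot is not None, "block not resident"
        if address not in self.borrowed:
            # stash_count bounds how many blocks may be held at once.
            assert len(self.borrowed) < self.stash_count, "stash limit reached"
        self.borrowed[address] = self.borrowed.get(address, 0) + 1
        return self._block(slot)

    def release(self, address):
        self.borrowed[address] -= 1
        if self.borrowed[address] == 0:
            del self.borrowed[address]
            if address in self.stash:
                self.free.append(self.stash.pop(address))


grid = ToyGrid(cache_count=2, stash_count=1, block_size=4)
grid.write(1, b"aaaa")
grid.write(2, b"bbbb")
block = grid.borrow(1)   # no copy: a view into the Grid's memory
grid.write(3, b"cccc")   # evicts address 1 from the cache into the stash
was_stashed = 1 in grid.stash
still_readable = bytes(block) == b"aaaa"
grid.release(1)          # the stashed slot becomes free again
```

    Because at most stash_count blocks can be borrowed at once, an evicted block that is still in use always has room in the stash, so the borrower never has to copy it.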

    TigerBeetle uses Log-Structured Merge (LSM) trees to store data on disk. Each LSM tree consists of multiple levels with exponentially increasing size. Incoming data is inserted into the top-most level, and, once a level is full, its data is compacted into the next level. Until now, TigerBeetle performed a k-way merge between two levels at the end of each bar. While this was a significant improvement over the previous compaction implementation, it still introduced a noticeable tail latency spike.

    • The latest change eliminates the final merge step at the end of each bar. Instead, merging is now performed incrementally and on demand during compaction, which significantly smooths tail latency.

    • This has a minor performance impact on scans, as they now need to perform an on-demand merge; however, the impact is barely measurable.

    • On tail latencies, however, the performance impact is profound, with up to a 3x reduction!
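
    The difference between one large merge at the end and merging on demand can be illustrated with a lazy k-way merge. The Python sketch below is a generic heap-based merge over in-memory sorted lists, not TigerBeetle's compaction code; merge_on_demand and the sample runs are hypothetical. The point is only that the consumer decides how much merge work happens per step.

```python
import heapq
from itertools import islice

_DONE = object()  # sentinel marking an exhausted run


def merge_on_demand(*runs):
    """Lazily k-way merge sorted runs; work happens only as items are pulled."""
    iterators = [iter(run) for run in runs]
    heap = []
    for index, iterator in enumerate(iterators):
        first = next(iterator, _DONE)
        if first is not _DONE:
            heap.append((first, index))
    heapq.heapify(heap)
    while heap:
        value, index = heapq.heappop(heap)
        yield value
        # Refill from the run we just consumed from, if it has more items.
        following = next(iterators[index], _DONE)
        if following is not _DONE:
            heapq.heappush(heap, (following, index))


run_a = [1, 4, 7, 10]
run_b = [2, 5, 8, 11]
run_c = [3, 6, 9, 12]

merged = merge_on_demand(run_a, run_b, run_c)
first_step = list(islice(merged, 4))   # merge only four items now
second_step = list(islice(merged, 4))  # continue where we left off
remainder = list(merged)
```

    Each step pays for only the items it pulls, so the cost is spread out instead of landing in a single spike.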

    Viewstamped Replication (VSR) is TigerBeetle’s consensus protocol, responsible for replication across a cluster of replicas consisting of one primary and multiple backups. The primary receives client requests, replicates them to backups via prepares, and informs the client once the request is processed on a majority of replicas. To prevent slow replicas from lagging behind, TigerBeetle can inject commit stalls which give backups time to catch up. Last month, we made several improvements to how stalls are decided and applied.

    • First, we fixed a bug where the primary would incorrectly treat a backup as up-to-date. The fix tracks the highest commit_min seen from each backup and uses that to accurately measure lag. 

    • Second, we refined the stall metric itself. Rather than comparing the primary’s commit_min against the backup’s commit_min, we now use op - commit_min on the backup. This is a more direct measure of how far behind a backup is in committing the prepares it holds. Unconditional stalls now consult the pipeline queue, giving idle primaries a grace period before stalling.

    • Finally, prepare timeout retries are now fanned out across backups rather than retried serially.
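
    As a rough illustration of the refined metric, the Python sketch below tracks the highest commit_min reported by each backup and computes lag as op - commit_min. Everything except op and commit_min is an assumption made for the example: the LagTracker class, the observe method, the lag_max threshold, and keeping the highest op seen are hypothetical and do not reflect TigerBeetle's actual code or tuning.

```python
class LagTracker:
    """Toy per-backup lag tracking on the primary (names are hypothetical)."""

    def __init__(self, lag_max):
        self.lag_max = lag_max     # assumed threshold, in ops
        self.commit_min_seen = {}  # backup -> highest commit_min reported
        self.op_seen = {}          # backup -> highest op reported (assumption)

    def observe(self, backup, op, commit_min):
        assert commit_min <= op
        # Messages may arrive late or reordered, so never move backwards.
        self.commit_min_seen[backup] = max(
            self.commit_min_seen.get(backup, 0), commit_min
        )
        self.op_seen[backup] = max(self.op_seen.get(backup, 0), op)

    def lag(self, backup):
        # How far the backup is behind in committing the prepares it holds.
        return self.op_seen[backup] - self.commit_min_seen[backup]

    def should_stall(self):
        return any(self.lag(b) > self.lag_max for b in self.commit_min_seen)


tracker = LagTracker(lag_max=32)
tracker.observe(backup=1, op=100, commit_min=90)
tracker.observe(backup=1, op=100, commit_min=80)  # stale report, ignored
lag_one = tracker.lag(1)
stall_before = tracker.should_stall()
tracker.observe(backup=2, op=100, commit_min=40)  # far behind on commits
lag_two = tracker.lag(2)
stall_after = tracker.should_stall()
```

    Measuring lag on the backup's own numbers means a backup that holds many prepares but has committed few of them shows up as lagging, regardless of where the primary's commit_min happens to be.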


    • We banned the assert keyword in Java as it can be en-/disabled with runtime flags. Thank you @MarioAriasC for pointing this out!

    • Jiefeng Li spotted a resource leak and potential overflow in the .NET, Java, and Go clients. Thanks for improving TigerBeetle with us, Jiefeng!


    Last month on IronBeetle, we continued the Pragmatics of Consensus mini-series where matklad and Tobi dive into our implementation of the Viewstamped Replication protocol. We covered the view change protocol, explaining what happens if the primary fails, including how replicas isolate an old primary and elect a new one, carrying over uncommitted pipeline state. Later, we opened the black box of how a newly elected primary decides what to include in its pipeline and how view changes are made durable, touching upon TigerBeetle’s fault model.

    Join us live every Thursday at 5pm UTC on Twitch, YouTube, and X!


    Swival’s Audit of TigerBeetle, May 12

    An audit of TigerBeetle by Swival with 10 initial findings uncovered no high or medium severity vulnerabilities; most were false positives, with a few bugs falling outside TigerBeetle’s security fault model. Merci, Frank!

    The OLTP, OLGP, OLAP Spectrum, May 14

    Over in columnar land, Prof. Hannes Mühleisen announced DuckDB Quack, and, in his talk, included a spectrum that we wholeheartedly enjoyed: OLTP vs OLGP vs OLAP. Jump to the mention at 24:32! The elephant is being squeezed on both sides, and, with Quack, we see DuckDB moving more and more to include OLGP.

    Protocol-Aware Deterministic Simulation Testing, May 14

    Building distributed systems is notoriously hard to get right! At Bug Bash 2026, Chaitanya presented TigerBeetle’s approach to testing distributed database systems, and now you can watch the replay! TigerBeetle’s protocol-aware approach differs from typical DST in providing visibility into internal consensus and storage state, allowing for deeper testing of invariants and simulation of complex handcrafted scenarios. 

    OKTO PAYMENTS: Strengthening Financial Infrastructure, May 22

    Following regulatory developments in Brazil, OKTO PAYMENTS undertook a significant infrastructure transformation to support scalability, resilience, and future growth, choosing TigerBeetle to strengthen its real-time transaction processing. Thanks, Kalebi!

    Andrew Kelley’s JetBrains Interview, May 27

    In an interview hosted by Vitaly Bragilevsky, Head of Rust Ecosystem at JetBrains, Andrew spoke about how the Zig Software Foundation continues to invest in human contributors and hold the line on quality, describing briefly how TigerBeetle benefits from Zig’s design decisions. The interview passed half a million views and is well worth watching. Andrew’s leadership of the ZSF and his careful roadmap for the language is a case study in how to design and steward a systems language for the generations to come.


    Systems Distributed, Jul 27–28

    In just over a month, we’ll gather in Boston for the 4th Systems Distributed conference. The full house of Systems Distributed speakers has been announced, including: the creators of Zig (Andrew Kelley), Resonate (Dominik Tornow), Zooko’s Triangle (Zooko Wilcox-O’Hearn), and TigerBeetle (Joran Dirk Greef); the (co-)authors of NASA’s Power of Ten Rules for Safety-Critical Code (Gerard Holzmann), Flexible Paxos (Heidi Howard), Viewstamped Replication Revisited (James Cowling), Protocol-Aware Recovery (Ram Alagappan and Aishwarya Ganesan), and Logic for Programmers (Hillel Wayne); and a former technical group supervisor from NASA’s Jet Propulsion Laboratory (Margaret Holzmann). Super-charged!

    Get your ticket to support the benefit, and we’ll see you next month for talks, harbor walks, and systems programming and systems thinking, all the way across the stack!


    ‘Till next time… protect it with fire!

    The TigerBeetle Team