March in TigerLand
Dear friends,
Hope your March was magnificent! We updated the TigerBeetle REPL, improved temporal locality to reduce memory bandwidth by 20%, and went on an expedition to reduce binary size. We also added single-page mode to docs, had a “board meeting”, and made a little appearance on Lex Fridman (thanks, Prime!).
Let’s go!
“Measure, Then Build” Remzi Arpaci-Dusseau
- TigerBeetle’s deterministic simulator (the VOPR) has been invaluable for uncovering various safety and availability bugs in our Viewstamped Replication (VSR) implementation, from overt violations to subtle bugs caused by complex interleavings of storage, network and process faults. Having recently achieved “the power of zero”, with all known safety and availability bugs resolved, we now set out to solve performance. To this end, we have begun investing in Deterministic Performance Testing (DPT) atop our existing simulation infrastructure, to uncover performance regressions and benchmark performance-related changes in VSR.
- We improved upon a known performance bug – elevated latencies in the presence of even a single replica failure. To alleviate this, we introduced alternating ring replication directions for alternating prepares, so that replicas can identify breaks in the ring topology and initiate repair more efficiently. Additionally, we fixed a bug where replicas were not resetting prepare_timeout, leading to pathologically bad performance in a busy cluster.
- With deterministic performance testing in place, we were able to reproduce this performance degradation!
- We also optimized the performance of our client request protocol to make it more resilient to network faults. Earlier, a single request message dropped between the client and the primary would cause client retries to round robin through the entire cluster before contacting the primary again. Now, client retries are directed only at the primary, and the client learns of view changes via round robin pings to the cluster. Check out the description of the PR to see the reduced number of messages and time for the optimized algorithm, using DPT!
- VOPR’s fault injection logic is careful to inject only faults that the TigerBeetle cluster can tolerate. For example, a prepare is corrupted on at most a minority of replicas. Despite this caution, fault injection could sometimes inject a rare combination of faults that a cluster cannot tolerate! To detect such false positives, we now check for correlated faults at the end of a VOPR run.
- We also fixed two upgrade race conditions wherein if a new binary is dropped in at an inopportune time (see the PRs for more details), the replica would either skip upgrading to the new version, or crash. While both bugs could be circumvented by manually restarting the replicas, they were squashed for good.
- Replicas now ignore deprecated commands, as opposed to crashing. This guards against the case where a packet replay from a replica running on an older version crashes a replica running on a newer version.
- Over the past few years, network and disk bandwidth have been growing faster than memory bandwidth. In fact, this trend can be traced even further back! Consequently, at TigerBeetle, several of our design decisions are in service of optimizing memory bandwidth utilization. One example is our use of zero-copy deserialization, such that a prepare message received over the network is directly written to the local disk, and then directly sent over the network to another replica, without any memory copies!
- Last month, we reduced memory bandwidth utilization by ~20% by using LIFO (Last In/First Out) instead of FIFO (First In/First Out) to store free blocks during compaction, for better temporal locality!
- For increased memory safety, we banned direct usage of Zig’s @memcpy in the TigerBeetle codebase. Instead, we augment @memcpy with pointer bounds checking for the memory buffers in our copy_disjoint utility. We also have similar wrappers over Zig’s copyForwards and copyBackwards utilities.
- TigerBeetle has a “zero dependencies” policy, apart from the Zig toolchain. Dependencies, in general, inevitably lead to supply chain attacks, safety and performance risk, and slow install times. For foundational infrastructure in particular, the cost of any dependency is further amplified throughout the rest of the stack. Occasionally, motivated by insulating TigerBeetle from churn in Zig’s standard library, and aligning APIs with TIGER_STYLE, we vendor Zig’s std APIs into stdx.
- Last month, we vendored the BitSetType to shave 38KiB off the Linux release binary! Specifically, the usage of initFull from the Zig standard library was a binary size footgun, since it entailed including the entire bitset in the text section of the ELF binary.
- Additionally, we vendored the Pseudo Random Number Generator (PRNG) to remove floating point from the API for determinism, and to isolate TigerBeetle from churn in PRNG algorithms. Finally, we vendored AEGIS for hash stability.
- Last month also saw some miscellaneous improvements to TigerBeetle documentation, metrics, and clients:
- We introduced a single page mode to our documentation. Additionally, we spruced up multiple sections, including the safety guarantees and architecture of TigerBeetle. Enjoy the new reading experience and get hacking!
- Additionally, we added new metrics for better observability into the state of TigerBeetle’s replicas: replica status, view number and op number, to name a few. Recall that these metrics are periodically emitted via statsd if --statsd=: is specified in the tigerbeetle start command. Otherwise, they are emitted via log statements to stdout.
- Finally, we added validation to disallow negative big integers in the Java and Golang clients, and bumped the oldest supported client version to 0.16.4. Note that the latter removes backward compatibility with various deprecated features and requires users to make sure their clients are running on at least version 0.16.4 before upgrading to 0.16.33.
- Ivo spotted and fixed a bug in our Python client’s sample code for using the create_transfers API for more than 8190 transfers. Great catch, and thank you for the fix, Ivo!
- Nils added syntax highlighting to the systemd and docker-compose snippets in our documentation. Good eye, Nils!
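To make the fault-injection rule mentioned earlier (a prepare is corrupted on at most a minority of replicas) a little more concrete, here is a minimal Python sketch of what an end-of-run check for correlated faults could look like. It is a toy under stated assumptions, not the VOPR's actual logic: the function name, the per-op map of faulty replicas, and the simple "strictly fewer than half" definition of a minority are all hypothetical.

```python
def find_intolerable_faults(
    replica_count: int,
    faulty_replicas_by_op: dict[int, set[int]],
) -> list[int]:
    """Return the ops that were faulted on more than a minority of replicas.

    Hypothetical helper: `faulty_replicas_by_op` maps an op number to the set
    of replica indices where that op's prepare was corrupted.
    """
    intolerable = []
    for op, faulty_replicas in sorted(faulty_replicas_by_op.items()):
        # Assumption: a "minority" is strictly fewer than half of the replicas.
        if len(faulty_replicas) * 2 >= replica_count:
            intolerable.append(op)
    return intolerable
```

In a three-replica run, an op corrupted on one replica passes this check, while an op corrupted on two replicas is reported, so the run could be treated as a false positive rather than a real bug.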
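For a feel of why LIFO reuse of free blocks helps temporal locality (the memory bandwidth item above), here is a small Python toy model. It assumes nothing about how TigerBeetle's compaction actually tracks free blocks: `FreeBlockPool` and `blocks_touched` are hypothetical names, and block indices stand in for real buffers.

```python
from collections import deque


class FreeBlockPool:
    """Hypothetical pool of reusable block buffers (illustration only)."""

    def __init__(self, block_count: int, lifo: bool) -> None:
        self.lifo = lifo
        # Block indices stand in for real memory buffers.
        self.free = deque(range(block_count))

    def acquire(self) -> int:
        # LIFO hands back the most recently released block, which is the one
        # most likely to still be in cache. FIFO hands back the block that
        # was released longest ago.
        return self.free.pop() if self.lifo else self.free.popleft()

    def release(self, block: int) -> None:
        self.free.append(block)


def blocks_touched(lifo: bool, rounds: int = 100) -> int:
    """Count distinct blocks used when one block is acquired and released per round."""
    pool = FreeBlockPool(block_count=8, lifo=lifo)
    touched = set()
    for _ in range(rounds):
        block = pool.acquire()
        touched.add(block)
        pool.release(block)
    return len(touched)
```

With LIFO, the same block is reused every round, so the working set stays at one block. With FIFO, the loop cycles through all eight blocks, touching eight times as much memory for the same work.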
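The PRNG item above mentions removing floating point from the API. One way to picture a float-free API is a `chance` call that takes a ratio of two integers instead of a probability like 0.25. The Python sketch below is only an illustration under that assumption: `IntegerPrng` and `chance` are hypothetical names, not the stdx API, and Python's built-in generator stands in for whichever algorithm was vendored.

```python
import random


class IntegerPrng:
    """Hypothetical float-free wrapper; not TigerBeetle's stdx API."""

    def __init__(self, seed: int) -> None:
        # The same seed always yields the same sequence of decisions.
        self._random = random.Random(seed)

    def chance(self, numerator: int, denominator: int) -> bool:
        """Return True with probability numerator/denominator, using integers only."""
        assert denominator > 0
        assert 0 <= numerator <= denominator
        # Draw an integer in [0, denominator) and compare it to the numerator.
        return self._random.randrange(denominator) < numerator
```

Because the caller passes integers and the comparison is on integers, there is no rounding behaviour to differ between platforms, and two generators created with the same seed make identical decisions.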
On IronBeetle, matklad discussed why it’s important that the end client generates the idempotency key, hosted a special “letters from readers” edition on managing errors (a deep philosophy!), and debugged a failing deterministic seed live.
Brian’s TigerBeetle Stream redefined the Rust client types to be ABI compatible with Zig, covered converting between C and Rust types, and encountered some interesting challenges while merging the Rust client branch with upstream TigerBeetle.
Join us live every week on Twitch or catch up on the TigerTube!
Interledger Summit, Oct 27 (2024) Joran’s talk from Interledger Summit 2024 is up! In seven years, OLTP increased by three orders of magnitude, yet existing infrastructure is general purpose and 20-30 years old. What will the future look like? Faithful time will tell. Tune in to the end to hear the live audience questions from Stefan Thomas as well as Kosta Peric, former Chief Architect of SWIFT.
ARCHITECTURE.md Under-the-hood explorers, we restructured the overview of TigerBeetle’s architecture. It now includes a problem statement, an overview, the motivation for design decisions, and a list of references that have inspired us. Bravo, matklad!
Understanding TigerBeetle (Part 1), Swana Simran, Mar 6 We enjoyed this overview of why the world needs TigerBeetle, including that: “while traditional DB’s speak SQL, TigerBeetle speaks Debit/Credit”. Thanks, Swana, and here’s to part 2!
Lex Fridman Podcast #461, Mar 22 On Monday, we learned that The Primeagen referenced us in his conversation with Lex Fridman over the weekend, in the context of mission-critical safety and the joy of assertions, following TigerStyle. Catch it at this timestamp. Thanks (again), Prime!
Hiring Goals Experience with TigerBeetle is becoming a job requirement. That’s pretty awesome!
Elixir Language Meetup, Milan (April 9) Get the scoop on Riccardo Binetti’s talk “TigerBeetlex: An Elixir and Zig Love Story”, accepted for ElixirConf in May, at the Milan meetup for the Elixir community. We wish we could be there!
J On The Beach (May 14-16) Federico Massimiliano Lorenzi is preparing a talk on TigerBeetle’s Multiversion Binaries for J On The Beach in May! A recommended rendezvous for Developers, Data Scientists and DevOps.
Systems Distributed ’25 (June 19-20) We announced SD’25 a month ago, tickets are moving, and we’re looking forward to spending time with you all in person in Amsterdam soon. In June. As @ludwigABAP would say, see you there Euro-moots!
‘Till next time… kick the drums and thrash!
The TigerBeetle Team
A secret project begins…!







