June in TigerLand

    Dear friends,

    Hope your June was a joy! We spoke about The Next 30 Years of OLTP at Money 20/20 in Amsterdam, hosted Systems Distributed ’24 in NYC, upgraded to Zig 0.13.0, landed new code, optimizations and bug fixes, were featured on the Nasdaq tower in Times Square, reached #1 on GitHub Trending, and found time to enjoy moments with friends, old and new.

    Let’s go!

    • This month, we focused on simplifying, cleaning, and rethinking parts of the code base. In other words, we added stuff, and ended up lighter!

    “I use the rubber a lot more than the pencil”

    Tom Jobim (Brazilian composer and creator of Bossa nova)

    • The upgrade to Zig 0.13.0 enhanced TigerBeetle’s developer experience with faster build times and nicer error messages. Hats off to the entire Zig team! ⚡ With the new compiler, it was then time to upgrade our build system.

    • By reorganizing modules and build actions, we stopped compiling unnecessary files during tests and fuzzers, further reducing compilation time in some cases.

    • We improved the way we generate our C ABI header file, making the process more idiomatic and intuitive.

    • TODOs and obsolete workarounds could be eliminated after the upgrade, and the new compiler’s ability to detect non-mutated variables is incredibly convenient!

    • As a result, we now have a build.zig file that is a pure delight to read! Special thanks to our friend Jonathan Marler who also lent a hand with his contributions.

    • We refactored an internal iterator to expose a mutable pointer instead of using @constCast. The iterator originally exposed a constant pointer, but deep in the stack we found that a mutable reference was needed. While the cast was safe and justified by a comment, exposing a mutable pointer is more direct and helps prevent misuse or unsound mutations. This is an example where communicating intent explicitly in code can be better than relying on comments.

    • We also made further gains in reducing memory footprint! You can now run TigerBeetle in --development mode with an RSS of under 1GB. Most of these gains come from the new ability to configure request size at runtime. The cluster now advertises the maximum request size to clients, allowing the operator to tune for smaller buffers in development or constrained environments.

    • Following our policy that no feature should ship without a few years of deterministic simulation testing (we literally run 100 CPU cores 24/7 with a time acceleration factor of up to 700x per core, simulating 2 centuries each and every day), we enhanced our simulator to cover our recently added get_account_transfers and get_account_balances operations. While we had unit tests and fuzzers for these operations, they’re now also integrated within our deterministic simulator, which is aware of the rules, can explore different scenarios, and can perform fine-grained assertions over the results, testing not only for correctness but also for liveness.

    • “You shall not pass!” said the wizard—and he meant it! Within minutes of running the simulator (actually within the simulated world it was the equivalent of several days), we found a critical bug where an early return could leave rejected transfers partially persisted on disk. 🤯
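
    As a rough check on the simulated-time figure quoted above, the arithmetic can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every one of the 100 cores sustains the full 700x factor all day (the text says "up to" 700x, so the result is an upper bound), and the names below are illustrative rather than taken from TigerBeetle.

```python
# Back-of-the-envelope estimate of simulated time per wall-clock day.
# Assumption: all cores sustain the maximum acceleration factor all day.
CORES = 100
MAX_ACCELERATION = 700
DAYS_PER_YEAR = 365.25


def simulated_years_per_day(cores: int = CORES, acceleration: int = MAX_ACCELERATION) -> float:
    """Simulated years covered during one wall-clock day."""
    # Each core simulates `acceleration` days for every real day it runs.
    simulated_days = cores * acceleration
    return simulated_days / DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"{simulated_years_per_day():.1f} simulated years per day")
```

    Under these assumptions the result is roughly 190 years, in line with the "2 centuries" quoted above.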

    • In April’s newsletter, we discussed how TigerBeetle uses Protocol-Aware Recovery for Consensus-Based Storage to maximize cluster availability through each replica’s shared durability. On that occasion, we introduced the Grid Scrubber, a background task that iterates through storage to find corrupted blocks and proactively asks other replicas for help recovering faults.

    • Now, we have an interesting new approach: if each replica starts scrubbing the storage from a different position, the cluster-wide throughput of bytes verified is much higher than if all replicas performed the operation in the same order. This approach significantly enhances the probability of finding and repairing corrupted blocks, and is possible only because TigerBeetle’s storage engine was designed to produce “byte-for-byte logically identical” data files (excluding sparse holes) across machines. 🩹
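
    The effect of staggered scrub origins on cluster-wide coverage can be illustrated with a toy model. The sketch below is not TigerBeetle's scrubber: it assumes each replica verifies one block per tick, scans cyclically, and that the origins are evenly spaced (how each replica actually picks its starting position is not described here). All names and numbers are hypothetical.

```python
# Toy model: each replica verifies one block per tick, scanning cyclically
# from its own origin. Evenly spaced origins are an assumption for illustration.

def distinct_blocks_verified(block_count: int, origins: list[int], ticks: int) -> int:
    """Number of distinct logical blocks verified by at least one replica."""
    verified = set()
    for origin in origins:
        # A replica cannot cover more than the whole grid, so cap the steps.
        for step in range(min(ticks, block_count)):
            verified.add((origin + step) % block_count)
    return len(verified)


BLOCK_COUNT = 1200
REPLICA_COUNT = 6
same_origins = [0] * REPLICA_COUNT
staggered_origins = [i * BLOCK_COUNT // REPLICA_COUNT for i in range(REPLICA_COUNT)]

if __name__ == "__main__":
    for ticks in (50, 100, 200):
        same = distinct_blocks_verified(BLOCK_COUNT, same_origins, ticks)
        staggered = distinct_blocks_verified(BLOCK_COUNT, staggered_origins, ticks)
        print(f"ticks={ticks}: same origin={same}, staggered={staggered}")
```

    In this model, six replicas starting from the same origin cover no more distinct blocks than a single replica would, while staggered origins cover six times as many per tick until the whole grid has been visited.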

    • Ahead of our upcoming multiversion deterministic upgrade process, which will soon allow seamless version upgrades with deterministic safety and reduced downtime, we introduced some features in preparation. Notably, we now guard any undocumented command line arguments behind an --experimental flag. This helps to keep the surface area small and ensures easy forward compatibility across upgrades.

    • Additionally, we added support for asynchronous I/O in operations such as opening and statting files. This capability will be required to inspect files in a non-blocking fashion when checking for a new version.
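
    The idea of guarding undocumented arguments behind an --experimental flag, mentioned above, can be sketched generically. This is a minimal illustration using Python's argparse, not TigerBeetle's actual command line parser (which is written in Zig): the program name, --port, and --tuning-knob are hypothetical, and only --experimental comes from the text.

```python
import argparse

# Hypothetical names of arguments that are undocumented and therefore gated.
EXPERIMENTAL_ONLY = ("tuning_knob",)


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv, rejecting gated arguments unless --experimental is given."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--experimental", action="store_true")
    parser.add_argument("--port", type=int, default=3000)  # hypothetical documented flag
    # Hypothetical undocumented flag: hidden from --help and gated below.
    parser.add_argument("--tuning-knob", type=int, default=None, help=argparse.SUPPRESS)
    args = parser.parse_args(argv)

    gated = [name for name in EXPERIMENTAL_ONLY if getattr(args, name) is not None]
    if gated and not args.experimental:
        flags = ", ".join("--" + name.replace("_", "-") for name in gated)
        raise ValueError(f"{flags} requires --experimental")
    return args
```

    With this shape, the documented surface stays small, and anything behind the guard can change between versions without breaking users who never opted in.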

    • Focusing on developer experience, our documentation page has received new recipes and clarified existing ones. We also released a new tutorial called Tigerlings. Inspired by the excellent Rustlings and Ziglings tutorial formats, Tigerlings allows users to learn how TigerBeetle works by fixing tiny broken examples!

    Clone the repository and enjoy! git clone https://github.com/tigerbeetle/tigerlings

    Joran walked onto the Mastercard stage at Money 20/20 in Amsterdam to speak to a packed crowd about “The Next 30 Years of Transactions Processing”.

    • In the last 7 years, the “transactions of everyday life” have increased 100x to 1,000x, even 10,000x across several sectors (e.g. cloud computing, energy, real time payments). Yet popular transaction databases are 20-30 years old, and even newer cloud databases are circa 2012, designed for a different workload and scale.

    • Research into maximizing durability (and therefore availability) has advanced since 2018 (e.g. storage fault-tolerance, high frequency trading architectures, deterministic simulation testing) but can be hard to retrofit. At the same time, extracting transaction performance from general purpose database designs is becoming more and more expensive.

    • How can we reset transaction infrastructure for the next 30 years? To adapt and apply specialization for three orders of magnitude more performance? How can we evolve our engineering methodologies for tighter tolerances and stricter safety standards?

    • How can we make the hard things look easy? For example, how can we move multi-cloud replication from the category of Disaster Recovery to the everyday, out-of-the-box experience?

    A huge thank you to Money 20/20 for having us, and for the opportunity to “look ahead”!

    TigerBeetle was named to Redpoint’s 2024 InfraRed list of the top 100 transformative companies in cloud infrastructure. You can see the list and read the full report here.

    The 2nd iteration of Systems Distributed was held in NYC on 27th and 28th June, as a benefit for the Zig Software Foundation. Special thanks to our friends at Antithesis and Convex for sponsoring!

    Folks who couldn’t make it—don’t worry, we’ve got you covered! All talks were recorded, and we’ll post the videos on our YouTube as soon as post-production is done! We want to get the ideas out there, and you can subscribe to get notified as soon as they’re up.

    If you can’t contain your excitement until then, here are some highlights! Spanning systems languages and compilers, to databases, distributed systems, and testing:

    • Testing: Kyle Kingsbury presented case studies of popular databases that violated their documented isolation levels while under test using Jepsen. Alex Petrov talked about how to get the best of randomized testing by applying novelty-seeking techniques to input generation. John Murray and Brian Lagoda talked about how they used Antithesis to test Antithesis.

    • Abstractions: James Cowling & Sujay Jayakar gave a beautiful talk on the power of abstractions, why SQL isn’t necessarily the best abstraction for application development, and how queries can be real imperative programs. Frank Denis, author of libsodium, took us on a tour of cryptographic primitives and building blocks in the context of authentication (slides are up!). Dominik Tornow explained how to go from async await to distributed async await using elegant abstractions like functions and promises. Amod Malviya discussed how systems thinking can encourage abstractions that make for a better, safer product experience for the end customer.

    • Replication & Consensus: Joran presented the art of consensus, how to move from thinking in terms of physical durability and availability, to logical durability and availability, and the relationship between these, as a utility function that every optimal consensus protocol should solve. Gwen Shapira went into detail about how Nile, a serverless Postgres offering, replicates DDLs across all database tenants across the globe.

    • Scaling (databases & businesses): Deepti Srivastava shared the lessons she learned from two decades of building data infra products and emphasized the importance of ease of adoption, use, and maintainability. Sammy Steele talked about lessons learned from 5 years of building databases at petabyte scale at Dropbox and Figma, and how Figma designed their in-house AWS RDS (Postgres) with horizontal sharding.

    • Programming Languages: Andrew Kelley talked about his experience of building a music player from scratch in Zig. Richard Feldman talked about how one can write distributed pure functions in Roc, and the practical considerations of distributing functions that are known to be pure.

    If you’ve been following us for a while, you may know of our conviction in the power of Deterministic Simulation Testing for shipping safer software sooner. In that spirit, we joined our friends at Polar Signals, Antithesis, Convex, Resonate, and Alex Petrov to announce the formation of the Deterministic Simulation Testing Alliance! (more details to come in time)

    With that, it’s a wrap for Systems Distributed ’24. See you next year!

    IronBeetle is our weekly live stream into TigerBeetle internals. There’s no better way to understand TigerBeetle than to walk through the code with matklad as your guide. Recent shows covered the layout of TigerBeetle’s storage grid, and TigerBeetle’s unique storage superblock. 🎉

    Mark your calendars🗓️ and join us live every Thursday.

    Thursdays / 10am PT / 1pm ET / 5pm UTC

    twitch.tv/tigerbeetle

    IronBeetle YouTube Playlist

    Till next time… wish you were there!

    The TigerBeetle Team

    An early sketch for the cover art for our talk on Durability and the Art of Consensus.
