How FoundationDB works and why it works

FoundationDB is a very impressive database. Its paper won the best industry paper award in SIGMOD'21. In this post, I will explain, in detail, how FDB works and discuss a few very interesting design choices they made. It's a dense paper packed with neat ideas. Many details (sometimes even proof…

MyRocks and Repeatable Read Isolation

tl;dr: Repeatable read isolation on MyRocks is effectively Snapshot isolation (unless it's affecting an unique index). According to MySQL reference, REPEATABLE READ This is the default isolation level for InnoDB. Consistent reads within the same transaction read the snapshot established by the first read. This means that if you…

BW-Tree

Traditional B-Tree has a few noticeable downsides e.g. heavy disk IO (for in-place update), and low space utilization (as B-Tree leaves a lot of space on the table to avoid frequent splits and merges). BW-Tree(https://www.microsoft.com/en-us/research/publication/the-bw-tree-a-b-tree-for-new-hardware/) is a B-Tree variant proposed by…

Notes on the Google Spanner Paper

This is my notes on the paper: Spanner: Google’s Globally-Distributed Database. I will first summarize the paper and try to explain how it works. At the end, I will list my opinion and questions about Spanner. What is Spanner It's a globally replicated database that hides sharding and replication.…

Notes on the Amazon Aurora Paper

This is my notes on the paper: Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. What's Amazon Aurora Functionally speaking, an instance of Aurora is same as an instance of MySQL. The differences are Aurora decouples compute from storage and it's fault-tolerant. It supports all ANSI isolation levels…