Notes on the Spanner: becoming a SQL system paper

I have wanted to read the SIGMOD 2017 paper "Spanner: becoming a SQL system" for a while, and I finally got a chance to finish it. I have mixed feelings. The paper mostly focuses on the DQL aspect of SQL and only briefly mentions DML and locking. I like…

Linearizability is more than Capturing Causality Everywhere

Linearizability is one of the strongest single-object consistency models, and implies that every operation appears to take place atomically, in some order, consistent with the real-time ordering of those operations (https://jepsen.io/consistency/models/linearizable). Quite often people make the mistake of over-simplifying linearizability as the ability of being…

IP as distributed data in the cloud

I previously wrote a post on reasoning about DNS as a distributed database. In the same spirit, today, let's take a look at IP as distributed data (the Internet would be a distributed database in this analogy). This is inspired by a very interesting blog post from Cloudflare, https://blog.cloudflare.…

Bet using a token bucket

This post is a slightly expanded version of my earlier tweet. In software systems, "betting" is actually a common practice. One of the most common bets people place is retry. When the first attempt doesn't work, we can just "try it again". Retry is a bet – you are betting on…

Notes on Amazon's DynamoDB USENIX ATC'22 Paper

This is a very practical paper. It focuses on practical matters such as admission control, non-uniform access patterns, metastability introduced by caches, etc. You won't find fancy distributed system algorithms in this paper. But it's an important paper which covers critical topics nonetheless. A system only delivers real value to…

Leave something to be somebody else's problem

This was one of my favorite posters at the office of Meta (formerly known as Facebook). I don't have a picture of it on my phone, so I found one on the internet. I like to take ownership of problems that I run into, both at work, and in my personal…

Why TIMEOUTs are hard to get rid of

For the entirety of this post, we assume a non-real-time system. First of all, why do we have TIMEOUT as a type of reply? What if we don't? This means, in some cases, either the client or the server (or both) can be "blocked" for an unbounded amount of time. With async…