Compare Small Versions that Wrap Around

Why use small versions? Almost all stateful services (data stores, caches, etc.) store a version along with each piece of data. The version can be a timestamp, a version number (a.k.a. a logical clock), or, even better, a Hybrid Logical Clock (HLC). If it's a data store, HLC…

Notes on the Amazon Aurora Paper

These are my notes on the paper Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. What's Amazon Aurora? Functionally speaking, an instance of Aurora is the same as an instance of MySQL. The differences are that Aurora decouples compute from storage and is fault-tolerant. It supports all ANSI isolation levels…

Non-blocking 2pc

2pc is a Blocking Protocol. Two-Phase Commit is a blocking protocol: it blocks when the Coordinator is not available. Not only is the transaction itself unable to make progress, but other transactions that conflict on the same set of keys are blocked as well. Non-blocking 2pc Alternative. Daniel Abadi proposed a non-blocking alternative to 2pc…

Proper Boost Installation on OSX

boost is THE C++ library out there. Many features were introduced in boost before being adopted into the C++ standard. I want it installed on my local Mac laptop so I can easily play with it. Homebrew formula not working: brew install boost is the first thing I did and…

Debugging Random Program Behavior

I was looking at a weird coredump the other day. According to the core, the program was trying to write to virtual address 0x6 and crashed in memcpy. There's a piece of code that looks like if (a == 1) { do_foo(); } else { do_bar(); } and, according to the coredump, a is indeed 1.…

Consistent Badge Count at Scale

Scalable Read Atomic Transaction for Partitioned Datastore. A Story: You are building a messaging app. You start with a single non-partitioned database, where you store the unseen message count and the actual messages in two different tables. It serves you well ... until more and more people are using your app and the…

Paxos at its heart is very simple

The Paxos algorithm, when presented in plain English, is very simple. -- Leslie Lamport 2001 [1] I am going to try to explain single-decree Paxos in a way that is, hopefully, easy to understand. To this day, IMO, John Ousterhout's lecture on Paxos is still THE BEST out there. The…

Paxos Replication vs. Leader-Follower Replication

If you are operating a stateful service, chances are you have to replicate your data. There are use cases for a single-replica database, where the user can tolerate data loss and the system is optimized for throughput, low latency, etc. But the vast majority of stateful services have to deal with replication. Replication…

Please stop calling databases CP or AP (Chinese translation)

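Authorized by Martin Kleppmann, this is a Chinese translation of his original English blog post. In his excellent blog post Notes on Distributed Systems for Young Bloods, Jeff Hodges recommends that we use the CAP theorem to critique systems. Many people have taken that advice, describing their systems as "CP" (consistent but not available under network partitions), "AP" (available but not consistent under network partitions), or sometimes "CA" (meaning "I still haven't read Coda's post from five years ago"). I agree with all of Jeff's points except the one about the CAP theorem, where I must disagree. The CAP theorem itself is too simplistic and too widely misunderstood to be of much use in describing systems. I therefore ask that we stop citing the CAP theorem and stop discussing it. Instead, we should use more precise terminology to reason about the trade-offs in our systems. (Yes, I realize the irony: I don't want people to keep discussing this topic, yet here I am with a blog post about this very topic.…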