Notes on Amazon's DynamoDB USENIX ATC'22 Paper

This is a very practical paper. It focuses on practical matters such as admission control, non-uniform access patterns, metastability introduced by caches, etc. You won't find fancy distributed system algorithms in this paper. But it's an important paper which covers critical topics nonetheless. A system only delivers real value to…

Leave something to be somebody else's problem

This was one of my favorite posters at the office of Meta (formerly known as Facebook). I don't have a picture of it on my phone, so I found one on the internet. I like to take ownership of problems that I run into, both at work and in my personal…

Why TIMEOUTs are hard to get rid of

For the entirety of this post, we assume a non-real-time system. First of all, why do we have TIMEOUT as a type of reply? What if we didn't? Then, in some cases, either the client or the server (or both) could be "blocked" for an unbounded amount of time. With async…

Fast SQL from Schemaless ingestion

This is a note on Louis Brandy and Nathan Bronson's talk at Systems @scale - https://www.facebook.com/atscaleevents/videos/5279535778820968/. Data generated by customers is often schemaless (e.g. JSON), but fast SQL for analytical queries is desired over this unstructured data. Having fast query performance…

Notes on Photon - Databricks' query engine over data lakes

This is a note on Databricks' SIGMOD '22 paper - Photon: A Fast Query Engine for Lakehouse Systems, which won the best industrial paper award. At a high level, the paper describes Photon, a vectorized query engine (written in C++) that adapts to the underlying unstructured data at run-time to…

When and How to Invalidate Cache

You can leave comments on HN – https://news.ycombinator.com/item?id=31709501. In this post, I will talk about one way to figure out when to invalidate cache entries. I will use a specific setup as an example, which should still be general enough, and at the end of…

Cache Invalidation

Cache made consistent: Meta’s cache invalidation solution. When it comes to cache invalidation, we believe we now have an effective solution to bridge the gap between theory and practice. Engineering at Meta, Lu Pan [https://engineering.fb.com/2022/06/08/core-data/cache-invalidation/]. We wrote a post on Meta/FB's engineering…

Feedback Loops

Any complex system, e.g. an economy, seems to have feedback loops. Digital systems are no exception. Feedback loops add a significant amount of complexity to a system, which makes it harder to manage. However, a philosophical point might be that it's exactly the presence of feedback loops that makes a…