Lu Pan - Lu's blog (Page 3)

Generic Fault Tolerant Distributed Critical Section is Impossible

By Lu Pan in distributed system on 04 Oct 2022

In this post, we will show that a generic distributed critical section can't satisfy both safety and liveness properties with even a single faulty client process. System Model: We assume a non-real-time system with no bound on process time or network delay. In the system, there exists a lock service…

Notes on Amazon's DynamoDB USENIX ATC'22 Paper

By Lu Pan in database on 13 Aug 2022

This is a very practical paper. It focuses on practical matters such as admission control, non-uniform access patterns, metastability introduced by caches, etc. You won't find fancy distributed system algorithms in this paper. But it's an important paper which covers critical topics nonetheless. A system only delivers real value to…

Leave something to be somebody else's problem

By Lu Pan in problem on 26 Jul 2022

This was one of my favorite posters at the office of Meta (formerly known as Facebook). I don't have a picture of it on my phone, so I found one on the internet. I like to take ownership of problems that I run into, both at work and in my personal…

Why TIMEOUTs are hard to get rid of

By Lu Pan in timeout on 14 Jul 2022

For the entirety of this post, we assume a non-real-time system. First of all, why do we have TIMEOUT as a type of reply? What if we don't? This means that, in some cases, either the client or the server (or both) can be "blocked" for an unbounded amount of time. With async…

Fast SQL from Schemaless ingestion

By Lu Pan in rockset on 01 Jul 2022

This is a note on Louis Brandy and Nathan Bronson's talk at Systems @scale - https://www.facebook.com/atscaleevents/videos/5279535778820968/. Data generated by customers is often schemaless (e.g. JSON), but fast SQL for analytical queries over this unstructured data is desired. Having fast query performance…

Notes on Photon - Databricks' query engine over data lakes

By Lu Pan in olap on 01 Jul 2022

This is a note on Databricks' SIGMOD '22 paper - Photon: A Fast Query Engine for Lakehouse Systems, which won the best industrial paper award. At a high level, the paper describes Photon, a vectorized query engine (written in C++) that adapts to the underlying unstructured data at run-time to…

When and How to Invalidate Cache

By Lu Pan in cache on 11 Jun 2022

You can leave comments on HN – https://news.ycombinator.com/item?id=31709501. In this post, I will talk about one way to figure out when to invalidate cache entries. I will use a specific setup as an example, which should still be general enough, and at the end of…

Cache Invalidation

By Lu Pan in cache on 08 Jun 2022

Cache made consistent: Meta's cache invalidation solution. When it comes to cache invalidation, we believe we now have an effective solution to bridge the gap between theory and practice. Engineering at Meta, Lu Pan [https://engineering.fb.com/2022/06/08/core-data/cache-invalidation/]. We wrote a post on Meta/FB's engineering…