Google's Spanner paper was published in 2012 which famously introduced the TrueTime API. PTP is the standard equivalent of TrueTime. Two years later, Kulkarni, et al. published the Hybrid Logical Clock paper – Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases. In the HLC paper, they list the downsides of TrueTime,
- TrueTime requires special hardware,
- Commit-wait introduces delay and reduces concurrency.
(For readers less familiar with Spanner, here's a summary I wrote years ago.)
The HLC paper claims that it can also provide consistent snapshots in distributed databases by capturing causality in every network communication and within each progress. It offers a set of theorems and corollaries to support it. It's pretty straightforward when explained in English. If event
e happens-before event
f, HLC(e) must be less than HLC(f). If HLC(e) == HLC(f),
f must be concurrent events. In this case, if one takes a snapshot at
t across multiple databases, you will get a consistent snapshot back, in a sense that no one would be able to detect any anomalies. Or is it so?
Externally vs. Internally consistent snapshots
There is nothing wrong or inaccurate about the HLC paper, except that there is a very important definition about "happens-before" in the Introduction section.
The causality relationship captured, called happened-before (hb), is defined based on passing of information, rather than passing of time.
Happens-before in this context is only referring to the captured causality relationship. What if there is happens-before relationship that's not captured? It's entirely out of the HLC scope; and users can observe anomalies when the system misses out-of-band communication and the causality it implies.
I call HLC-based snapshots Internally Consistent Snapshots.
TrueTime/PTP on the other hand can order events without communication; and it can be externally consistent. I call PTP/TrueTime based snapshots Externally Consistent Snapshots.
Notice that GPS and atomic clocks only need to be inside the system to provide External Consistency with PTP/TrueTime; while HLC's guarantee is strictly limited by the scope where causality can be captured. This is because all the components that require access to PTP/TrueTime can be internal. E.g. Only database servers perform the commit-wait; and we can put a client proxy server to mint a timestamp for linearizable reads even when the customers themselves do not have access to atomic clocks.
Use-cases of Internally Consistent Snapshots
Semantically speaking External Consistency is obviously better than Internal Consistency; but it has a cost. Internal consistency is still very useful in practice. Say you have 100 databases, these 100 HLCs define a total order. Internally, we pretend this total order is consistent with physical time. HLC-based snapshots will be consistent within this total order, and that is very powerful. You can build systems internally that leverage this total order and snapshot capability. E.g. you can build a push-based SQL execution engine, in which it can perform
join asynchronously against an internally consistent snapshot.