Saturday, June 06, 2020

Snapshots and Determinism

Laplace's Demon, Determinism, Materialized Views, and Debugging




We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.
— Pierre Simon Laplace, A Philosophical Essay on Probabilities


See:  https://www.thegreatcoursesplus.com/mysteries-of-modern-physics-time - Lecture 6: Reversibility and the Laws of Physics  

Systems are built on systems, layer upon layer, and it doesn't matter how complex the stack gets. If you don't have entropy, you can reverse time (debug).

Introducing entropy (in this context, data mutability) destroys reversibility: once a value has been overwritten in place, the earlier state is gone and cannot be replayed. I am not saying this can be applied to predict the future, just noting the impact that data mutability has on debugging. This is similar to how fixed windows are associated with correctness, as described in https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/
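To make the "reverse time" idea concrete, here is a minimal sketch of an append-only event log in Python. Nothing here comes from a particular product; `Event`, `EventLog`, and `state_at` are hypothetical names chosen for illustration. The point is only that when events are never mutated, the state at any earlier point can be rebuilt by replay.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)  # frozen: an event cannot be changed once recorded
class Event:
    seq: int               # position in the log, starting at 1
    key: str
    value: Optional[int]   # None marks a deletion of the key


class EventLog:
    """Hypothetical append-only log; existing events are never edited."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, key: str, value: Optional[int]) -> Event:
        event = Event(seq=len(self._events) + 1, key=key, value=value)
        self._events.append(event)
        return event

    def state_at(self, seq: int) -> Dict[str, int]:
        """Rebuild the state as it was right after event number `seq`."""
        state: Dict[str, int] = {}
        for event in self._events:
            if event.seq > seq:
                break
            if event.value is None:
                state.pop(event.key, None)
            else:
                state[event.key] = event.value
        return state


log = EventLog()
log.append("balance", 100)
log.append("balance", 70)
log.append("limit", 500)
log.append("balance", None)

# "Reverse time": inspect the state before the deletion happened.
print(log.state_at(2))
print(log.state_at(4))
```

With a mutable dictionary updated in place, only the final state would survive and the earlier ones could not be inspected while debugging.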

Also, if you have incomplete, incorrect, or mismatched information, then as per chaos theory, tiny errors will propagate and multiply through the system.
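A small numeric sketch of that sensitivity, using the logistic map as a stand-in. The logistic map is a standard toy model of chaotic behavior, not something taken from the systems discussed here, and the parameter `r = 3.9`, the starting value, and the size of the error are all assumptions picked for illustration.

```python
from typing import List


def logistic_trajectory(x0: float, r: float = 3.9, steps: int = 200) -> List[float]:
    """Iterate x -> r * x * (1 - x) and return every value visited."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two runs whose starting state differs by one part in ten billion.
exact = logistic_trajectory(0.2)
perturbed = logistic_trajectory(0.2 + 1e-10)

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(exact, perturbed)]

# First step at which the runs disagree by more than 0.1, if any.
first_large = next((i for i, g in enumerate(gaps) if g > 0.1), None)

print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
print("first step with gap > 0.1:", first_large)
```

The initial mismatch is far too small to notice, yet the two runs end up bearing no resemblance to each other.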

Snapshots are the key to capturing state carefully. A snapshot assures no entropy, because the captured state never changes afterward. Snapshots just need to be global in nature to be useful. That is an impossibility at the scope Laplace envisioned, but within the context of well-defined computations it is very possible.

What this calls for is views that represent state at a particular moment in time, with the ability to apply SQL to the view itself, not just to flat data. The view would be queryable.
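A rough sketch of such a point-in-time view, using Python's built-in `sqlite3` module as a stand-in database. This is an assumption made purely so the example is runnable; it is not MemSQL, and the table `account_events`, the helper `materialize_snapshot`, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Append-only base table: every change is a new row, nothing is updated.
conn.execute(
    "CREATE TABLE account_events (account TEXT, balance INTEGER, recorded_at INTEGER)"
)
conn.executemany(
    "INSERT INTO account_events VALUES (?, ?, ?)",
    [("alice", 100, 1), ("bob", 50, 2), ("alice", 80, 3), ("bob", 75, 5)],
)


def materialize_snapshot(conn: sqlite3.Connection, name: str, as_of: int) -> None:
    """Store the latest row per account as of `as_of` in a temp table."""
    # `name` is interpolated into SQL, so only accept a plain identifier.
    if not name.isidentifier():
        raise ValueError(f"unsafe snapshot name: {name!r}")
    conn.execute(f"DROP TABLE IF EXISTS temp.{name}")
    conn.execute(
        f"CREATE TEMP TABLE {name} "
        "(account TEXT PRIMARY KEY, balance INTEGER, recorded_at INTEGER)"
    )
    conn.execute(
        f"""
        INSERT INTO temp.{name} (account, balance, recorded_at)
        SELECT e.account, e.balance, e.recorded_at
        FROM account_events AS e
        JOIN (
            SELECT account, MAX(recorded_at) AS latest
            FROM account_events
            WHERE recorded_at <= ?
            GROUP BY account
        ) AS m ON m.account = e.account AND m.latest = e.recorded_at
        """,
        (as_of,),
    )


materialize_snapshot(conn, "snapshot_t3", as_of=3)

# Ordinary SQL applied to the snapshot itself, not to the raw events.
total_at_3 = conn.execute("SELECT SUM(balance) FROM snapshot_t3").fetchone()[0]

# Later events land in the base table; the snapshot is unaffected.
conn.execute("INSERT INTO account_events VALUES ('alice', 10, 6)")
total_at_3_again = conn.execute("SELECT SUM(balance) FROM snapshot_t3").fetchone()[0]

materialize_snapshot(conn, "snapshot_t6", as_of=6)
total_at_6 = conn.execute("SELECT SUM(balance) FROM snapshot_t6").fetchone()[0]

print(total_at_3, total_at_3_again, total_at_6)
```

Because the base table is only ever appended to, a snapshot for any past moment can be rebuilt on demand and will always come out the same.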

This is all part of the bigger picture and why Envoy is at the heart of this solution.

An open question: should a single compute instance synchronize operations for many different time periods, or should each compute instance be targeted to a single time slice? Either way, it is easier to achieve Laplace's vision if the data is immutable. The definition of a compute instance is flexible; it could be a graph structure.

You can either devote large portions of your system to changing the nature of this reality, or build systems that embrace it.





For Ingestion example see:  https://www.memsql.com/forum/t/ingestion-from-mongodb/173

MemSQL does not support materialized views in the current version, but they are on the roadmap.

See https://www.memsql.com/forum/t/materialized-views-in-the-future/2033 

Note that simple temp tables might be able to provide the materialization, and this might also be handled by parameterized queries. The idea here is that access to time-sliced data would be simple and would not require filtering the raw data: the data would be pre-aggregated into time slices.
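A minimal sketch of the temp table plus parameterized query idea, again using Python's built-in `sqlite3` module as a stand-in rather than MemSQL. The `page_hits` table, the 60-second slice width, and the `hits_in_slice` helper are assumptions made up for illustration.

```python
import sqlite3
from typing import Dict

SLICE_SECONDS = 60  # assumed width of one time slice

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_hits (page TEXT, hit_at INTEGER)")  # hit_at in seconds
conn.executemany(
    "INSERT INTO page_hits VALUES (?, ?)",
    [("home", 10), ("home", 50), ("docs", 70),
     ("home", 125), ("docs", 130), ("docs", 170)],
)

# Pre-aggregate once: one row per (slice, page) instead of one row per hit.
conn.execute(
    "CREATE TEMP TABLE hits_by_slice (slice INTEGER, page TEXT, hits INTEGER)"
)
conn.execute(
    """
    INSERT INTO hits_by_slice (slice, page, hits)
    SELECT slice, page, COUNT(*)
    FROM (SELECT hit_at / ? AS slice, page FROM page_hits)
    GROUP BY slice, page
    """,
    (SLICE_SECONDS,),
)


def hits_in_slice(conn: sqlite3.Connection, slice_index: int) -> Dict[str, int]:
    """Parameterized lookup of one slice; no timestamp range logic needed."""
    rows = conn.execute(
        "SELECT page, hits FROM hits_by_slice WHERE slice = ?",
        (slice_index,),
    ).fetchall()
    return dict(rows)


print(hits_in_slice(conn, 0))
print(hits_in_slice(conn, 2))
```

The reader of a slice supplies only a slice index; the work of deciding which raw rows belong to that slice was done once, when the temp table was built.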
