Friday, June 26, 2020

From Time Slicing to On Demand using the Data DSL




In previous posts I touched on type safety in data and on isolating compute resources to the scope of a single calculation or a set of related calculations. This led to the idea of the "resource graph".

It may seem that all data must then be fully replicated for every one of those calculations. That may well be the simplest solution, but because the Data DSL is defined as a separate concern, we are free to change its implementation independently of program execution.

All that is really needed is data isolation, which may not require total data duplication. By abstracting data handling out of the application layer, we are free to optimize the specifics of how data is retrieved and stored. The problem is now simpler: we are no longer mixing application logic that is trying to solve a computation (Execution DSL) with logic that is supplying data (Data DSL). We are free to optimize for data locality and to reuse static data efficiently.
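To make the separation concrete, here is a minimal sketch in Python. Every name in it (DataProvider, CopyingProvider, SharedSnapshotProvider, portfolio_value) is hypothetical and invented for illustration; none of it is the actual Data DSL. It only shows that a computation written against a narrow data interface does not care whether isolation is achieved by copying everything or by sharing one read-only snapshot.

```python
import copy
from types import MappingProxyType
from typing import Dict, Mapping, Protocol


class DataProvider(Protocol):
    """Hypothetical Data DSL boundary: the only thing a computation sees."""

    def dataset(self, name: str) -> Mapping[str, float]: ...


class CopyingProvider:
    """Simple strategy: every request gets its own full copy of the data."""

    def __init__(self, store: Dict[str, Dict[str, float]]) -> None:
        self._store = store

    def dataset(self, name: str) -> Mapping[str, float]:
        return copy.deepcopy(self._store[name])


class SharedSnapshotProvider:
    """Optimized strategy: copy once, then share read-only views of that copy."""

    def __init__(self, store: Dict[str, Dict[str, float]]) -> None:
        self._snapshot = {
            name: MappingProxyType(dict(rows)) for name, rows in store.items()
        }

    def dataset(self, name: str) -> Mapping[str, float]:
        return self._snapshot[name]


def portfolio_value(data: DataProvider) -> float:
    """Application logic (Execution DSL side): knows nothing about storage."""
    prices = data.dataset("prices")
    positions = data.dataset("positions")
    return sum(prices[symbol] * qty for symbol, qty in positions.items())


store = {"prices": {"A": 10.0, "B": 2.5}, "positions": {"A": 3, "B": 4}}
by_copy = portfolio_value(CopyingProvider(store))
shared = SharedSnapshotProvider(store)
by_snapshot = portfolio_value(shared)
```

Both providers give the computation the same isolated data, so swapping one for the other is purely an implementation decision behind the data interface.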

Remember, the Data DSL not only performs the query but also manages the data resources: it controls where the data is positioned and can guarantee that it will be on a particular node. This, in combination with the resource DSL provisioning the compute resources, is how data locality is achieved.
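As a rough illustration of how the two sides could cooperate, the sketch below has a data-side registry that records where each data set lives and a resource-side function that picks the node already holding most of what a computation needs. DataPlacement, pin, choose_node, and the node and data set names are all assumptions made up for this example; the real DSLs may work quite differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataPlacement:
    """Hypothetical Data DSL side: records which node holds each data set."""

    locations: Dict[str, str] = field(default_factory=dict)

    def pin(self, dataset: str, node: str) -> None:
        self.locations[dataset] = node


def choose_node(placement: DataPlacement, needed: List[str]) -> str:
    """Hypothetical resource DSL side: run where most needed data already lives."""
    counts: Dict[str, int] = {}
    for name in needed:
        node = placement.locations[name]
        counts[node] = counts.get(node, 0) + 1
    # Pick the node holding the largest number of the required data sets.
    return max(counts, key=lambda node: counts[node])


placement = DataPlacement()
placement.pin("prices", "node-1")
placement.pin("positions", "node-1")
placement.pin("calendar", "node-2")
chosen = choose_node(placement, ["prices", "positions", "calendar"])
```

Here the computation would be scheduled on node-1, since two of its three inputs are already pinned there.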

It was previously suggested that materialized views be constructed to provide this type safety in data, with each set of views corresponding to a point in time. The reasoning was a desire for lock-free access to data and a guarantee of immutability. But what if standard views could be used? With an in-memory database like MemSQL, it may not be necessary to "materialize" each set of views to achieve performance.
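The difference can be sketched with Python's built-in sqlite3 module, used here purely as a stand-in for an in-memory database and not as a claim about how MemSQL behaves. The table, view, and column names are hypothetical, and the sketch assumes the base table is append-only. SQLite has no materialized views, so a copied table plays that role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices (symbol TEXT, price REAL, loaded_at TEXT);
    INSERT INTO prices VALUES ('A', 10.0, '2020-06-25'), ('B', 2.5, '2020-06-25');

    -- Standard view: a stored query, evaluated on every read.
    CREATE VIEW prices_asof_0625 AS
        SELECT symbol, price FROM prices WHERE loaded_at <= '2020-06-25';

    -- Stand-in for a materialized view: the rows are physically copied.
    CREATE TABLE prices_mat_0625 AS
        SELECT symbol, price FROM prices WHERE loaded_at <= '2020-06-25';
""")

# A later load arrives. Existing rows are never updated (append-only assumption).
conn.execute("INSERT INTO prices VALUES ('A', 11.0, '2020-06-26')")

query = "SELECT symbol, price FROM {} ORDER BY symbol"
from_view = conn.execute(query.format("prices_asof_0625")).fetchall()
from_copy = conn.execute(query.format("prices_mat_0625")).fetchall()
```

Under the append-only assumption, the standard view still returns exactly the point-in-time rows that the copied table holds, without duplicating any data. Whether it performs well enough is a question for the underlying database.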

Or maybe some parts of the data set for a given computation are materialized and other parts are not. There are options here. This is the job of the Data DSL; it is not part of the application layer. It is a cross-cutting concern for ALL compute-centric applications (our focus).

Power through separation of concerns. This is what makes the complex simple: building larger systems out of simpler pieces.



