Positioning Data - laying down the infrastructure for compute
In addition to having a central database, using MemSQL as a caching mechanism for local data could reduce the latency of requests that would otherwise go to the main database.
- Node: a single-node MemSQL cluster acts as the cache for data on a node (see the ingest sketch after this list)
  - Append-only ingest from Kafka for lock-less updates and reads
  - Debezium feed from the central source
- Pod: the filesystem can house data that doesn't need to be accessed via SQL
  - Can be shared between containers in the pod
- Container: an in-memory cache inside containers - application-level data storage
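A minimal sketch of the node-level wiring, assuming a MySQL-flavoured central source, a Kafka Connect endpoint at kafka-connect.internal:8083, and hypothetical database, topic, and table names. The connector properties follow the classic Debezium MySQL connector naming (newer releases rename some of them), and the pipeline DDL follows MemSQL's CREATE PIPELINE ... LOAD DATA KAFKA form; the column mapping would need to match the actual message shape.

```python
# Sketch: wire the central source to a node-level MemSQL cache.
# Hosts, credentials, topic names, and table names are hypothetical.
import requests          # for the Kafka Connect REST API
import pymysql           # MemSQL speaks the MySQL wire protocol

# 1. Register a Debezium connector on Kafka Connect so changes in the
#    central database are streamed to Kafka topics.
debezium_config = {
    "name": "central-source-feed",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "central-db.internal",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "********",
        "database.server.id": "1001",
        "database.server.name": "central",             # topic prefix
        "table.include.list": "app.positions",
        "database.history.kafka.bootstrap.servers": "kafka.internal:9092",
        "database.history.kafka.topic": "schema-changes.central",
    },
}
resp = requests.post("http://kafka-connect.internal:8083/connectors",
                     json=debezium_config, timeout=10)
resp.raise_for_status()

# 2. On the node, create an append-only pipeline from the Kafka topic
#    into the single-node MemSQL cache.
conn = pymysql.connect(host="127.0.0.1", user="root", password="",
                       database="cache")
with conn.cursor() as cur:
    cur.execute("""
        CREATE PIPELINE positions_ingest AS
        LOAD DATA KAFKA 'kafka.internal:9092/central.app.positions'
        INTO TABLE positions
        FORMAT JSON
        (id <- id, qty <- qty);
    """)
    cur.execute("START PIPELINE positions_ingest;")
conn.commit()
```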
Data is provisioned from the central data source "to the curb" (think fiber-to-the-curb internet), where the curb is the Node. From the Node-level data source, data flows into the Pod and the individual Containers.
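To make that last hop concrete, a sketch of a container application serving reads from the node-level cache rather than the central database; since MemSQL speaks the MySQL wire protocol, an ordinary MySQL client library works. The host, credentials, and positions table here are assumptions.

```python
# Sketch: a container-level application reading "from the curb", i.e. from
# the node-local MemSQL cache instead of the central database.
import pymysql
from pymysql.cursors import DictCursor

# The node-level cache is reached over the MySQL wire protocol; in a
# Kubernetes-style layout this could be a host-local service or hostPort.
node_cache = pymysql.connect(host="node-cache.local", port=3306,
                             user="reader", password="********",
                             database="cache")

def lookup_position(position_id: int) -> dict | None:
    """Serve a read from the node-level cache; a miss returns None."""
    with node_cache.cursor(DictCursor) as cur:
        cur.execute("SELECT id, qty, updated_at FROM positions WHERE id = %s",
                    (position_id,))
        return cur.fetchone()   # fall back to the central source on a miss
```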
A design like this could provide a starting point for defining the physical layout of the data described by the Data DSL. Both relational and file-based data could be distributed down to the compute layer from centralized storage.
This base structure could be viewed as satisfying a crosscutting concern (data positioning) and could exist alongside the worker containers.
Note that there is a Rack context that currently has no representation in the data hierarchy in the diagram; it would, however, be possible to add a Rack-layer cache.
Update 8/9:
It should be possible to use a different database if:
- the database supports single-node instances
- data import from a Kafka feed - or other messaging system - is possible
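If the substitute database has no native Kafka ingest (the role MemSQL pipelines play above), a small sidecar consumer can bridge the feed. A sketch of that pattern, using kafka-python and SQLite purely as a stand-in for any single-node database; the topic, table, and connection details are assumptions.

```python
# Sketch: node-level ingest for a database without native Kafka pipelines.
# A small sidecar process consumes the feed and applies rows locally.
import json
import sqlite3                      # stand-in for any single-node database
from kafka import KafkaConsumer     # kafka-python

db = sqlite3.connect("/var/lib/node-cache/cache.db")
db.execute("CREATE TABLE IF NOT EXISTS positions (id INTEGER PRIMARY KEY, qty INTEGER)")

consumer = KafkaConsumer(
    "central.app.positions",
    bootstrap_servers=["kafka.internal:9092"],
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

# Append-only apply loop: every change event from the feed is upserted locally.
for message in consumer:
    event = message.value
    db.execute("INSERT OR REPLACE INTO positions (id, qty) VALUES (?, ?)",
               (event["id"], event["qty"]))
    db.commit()
```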
A database would hold data at the node... but it's still not available in-process. A container-level cache was described above, but what if the import went directly into the in-memory cache in the container? In this design that would be application-specific, not handled at the infrastructure layer, and the cache implementation would depend on the programming language, etc.
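As one illustration of that application-specific option, a minimal sketch in Python: a background thread consumes the Kafka feed straight into an in-process dictionary, so reads never leave the container. Topic and field names are assumptions, and the cache structure would naturally look different in other languages and runtimes.

```python
# Sketch: skip the node-level database and import the feed directly into an
# in-memory cache inside the container. Topic and field names are hypothetical.
import json
import threading
from kafka import KafkaConsumer   # kafka-python

cache: dict[int, dict] = {}        # application-level, in-process store
cache_lock = threading.Lock()

def follow_feed() -> None:
    consumer = KafkaConsumer(
        "central.app.positions",
        bootstrap_servers=["kafka.internal:9092"],
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    for message in consumer:       # append-only stream of change events
        event = message.value
        with cache_lock:
            cache[event["id"]] = event

# Run the import alongside the application; reads become plain dict lookups.
threading.Thread(target=follow_feed, daemon=True).start()

def lookup_position(position_id: int) -> dict | None:
    with cache_lock:
        return cache.get(position_id)
```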