In a previous post I presented the idea of "higher" level patterns existing in "2nd layer" and beyond code. I said patterns would emerge in this "2nd layer" - the orchestration layer. The first pattern identified was iteration, with its applicability to time slicing. Now a second pattern appears to be emerging: processing hierarchical data sets with modular logic and data locality.
The ability to cycle through different containers in a pod makes it possible to process hierarchical data sets that don't have homogeneous compute requirements - meaning that more than one type of operation is performed in the chain.
Parent levels could pass data via the pod to their children and vice versa. Processing of the hierarchy could proceed with any generic graph traversal algorithm: Breadth First, Depth First, dependency graph, etc. Data could be persisted to network storage for high availability but kept locally for speed during processing of the tree. This would keep the application logic modular, with each container performing a specific task. That in turn could help with versioning and maintenance, as changes could be isolated to single container definitions.
In previous posts I had discussed an inner and outer scope for orchestration. I think Macro and Micro are good terms for this concept.
Possible patterns for orderly execution of compute
Macro - this is for groups of pods
- Cycling data through pods to digest streaming data
- Circular Queue
- Graph Execution
- Depth First - sequential
- Breadth First - sequential
- Graph with dependencies - sequential
- Fan Out - parallel
- Fan In - parallel
- Mode of communication
- To System: via Operator Framework
- From Pod to Pod
- Network - (hits the router)
- File System - network share
Micro - this is inside a Pod; moving data through containers in a pod and changing containers as you go
- Cycling - bring sets of containers in and out of scope via Advanced StatefulSet
- Circular Queue
- Graph Execution
- Depth First
- Breadth First
- Graph with dependencies
- Fan Out - parallel
- Fan In - parallel
- Mode of communication
- To System - Kubernetes client
- From container to container
- Network - shared IP
- File System (disk) - shared scratch disk (RAM)
- IPC - only available in Micro mode
Where does this Macro code live? In an Operator. I would say it needs to be coded in Go, as Go has channels, which are well suited to implementing all of these patterns. Kubernetes infrastructure provides work queues and access to the etcd data store for fault tolerance - keeping the state of the machine at the Macro level.
Moving into the Micro level, you are at the level of the Pod. But this pod is self-aware; it knows it's part of a Macro structure. The brains of the pod are located in a quarterback sidecar container. The quarterback knows about the containers alongside it and has access to the Kubernetes API; this is where the Micro execution flow of containers can be defined.
It might be possible that the "pattern" code for the Macro and Micro APIs could share common sections. The "pattern" logic is the code implementing the Depth First algorithm, the Breadth First algorithm, etc.; this could be a type of flow language or API. One interesting consequence of having this flow language / API is that it could be generated or programmed by some higher level code. If that is possible, then entire systems could bootstrap themselves from data.
This would lead to the next level. It would be interesting to see if both of these layers could be abstracted further. If these very structures (sections of Micro and Macro logic) could themselves be orchestrated, then we start to get higher level power. This was mentioned in earlier posts as a progression to higher and higher levels of aggregation.