Whereas Operators are about adjusting at runtime, the CDK is about provisioning and is mainly static in that sense. What is needed is the ability to change the fabric at runtime: start with a base and then change it in response to the current compute needs. EKS supports auto-scaling groups, and it may be possible to update the number of requested nodes via Boto3 from outside the cluster, which would make the system more dynamic. In fact, additional auto-scaling groups could be provisioned and removed from outside the cluster according to an execution blueprint.
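As a rough sketch of what that outside call might look like, the snippet below uses Boto3 to resize an EKS managed node group. The cluster name, node group name, and sizes are placeholders, not anything from an existing deployment.

```python
import boto3

# Assumed names -- substitute the real cluster and node group.
CLUSTER_NAME = "fabric-cluster"
NODEGROUP_NAME = "worker-group"

def scale_nodegroup(desired: int, minimum: int = 1, maximum: int = 20) -> None:
    """Ask EKS to resize a managed node group from outside the cluster."""
    eks = boto3.client("eks")
    eks.update_nodegroup_config(
        clusterName=CLUSTER_NAME,
        nodegroupName=NODEGROUP_NAME,
        scalingConfig={
            "minSize": minimum,
            "maxSize": maximum,
            "desiredSize": desired,
        },
    )

# e.g. pre-warm to six nodes before a heavy stage begins
scale_nodegroup(desired=6)
```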
I am exploring creating a Django application that controls the deployment of EKS clusters to AWS via the CDK. It might be possible to expand this application to send the scaling calls via Boto3 rather than making them directly in an Operator. This could be cleaner than previous designs in that Operators would not need to reach outside of K8s to the cloud provider. The management application would call down to the K8s layer, and Operators would take over from there.
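A minimal sketch of the CDK side, assuming the Python CDK v2 bindings, might look like the stack below; the construct names and Kubernetes version are illustrative. The Django application would synthesize and deploy this stack, and the node group it defines is the one the Boto3 calls above would later resize.

```python
from aws_cdk import Stack, aws_eks as eks
from constructs import Construct

class FabricClusterStack(Stack):
    """Base EKS fabric that the Django management application would deploy."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Base cluster; the Kubernetes version here is illustrative.
        cluster = eks.Cluster(
            self, "FabricCluster",
            version=eks.KubernetesVersion.V1_27,
            default_capacity=0,  # capacity comes from the explicit node group below
        )

        # Managed node group that can later be resized from outside the cluster.
        cluster.add_nodegroup_capacity(
            "worker-group",
            min_size=1,
            max_size=20,
            desired_size=2,
        )
```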
This would give the system the ability to be more responsive than out-of-the-box auto-scaling can provide. As was mentioned several times in previous posts, the key to scaling is providing the extra resources with minimal delay to the flow. As with runners transferring a baton in a relay race, the next runner starts before the hand-off, not at the hand-off. This separation of concerns would allow that. The Django/CDK layer could create blueprints for execution and start up K8s nodes ahead of time so they would be ready to accept input at the correct point in the chain, with no lag waiting for the system to scale up.
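To make the relay-race idea concrete, here is a rough sketch of a blueprint-driven loop that requests the next stage's capacity while the current stage is still running, reusing the scale_nodegroup helper from the earlier snippet. The stage names and node counts are made up for illustration.

```python
# Hypothetical execution blueprint: each stage names the capacity it needs.
BLUEPRINT = [
    {"stage": "ingest",    "nodes": 2},
    {"stage": "transform", "nodes": 6},
    {"stage": "publish",   "nodes": 2},
]

def run_blueprint(blueprint, run_stage, scale=scale_nodegroup):
    """Run stages in order, pre-warming capacity for the stage that follows."""
    for i, step in enumerate(blueprint):
        # Keep enough nodes for this stage plus whatever the next one needs.
        # The EKS update call returns immediately, so the extra nodes come up
        # while the current stage is still running -- the baton is passed early.
        upcoming = blueprint[i + 1]["nodes"] if i + 1 < len(blueprint) else 0
        scale(desired=max(step["nodes"], upcoming))
        run_stage(step["stage"])
```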