Tuesday, December 03, 2019

Revisit the Alibaba Container Framework


As per the presentation by Fei Guo at Alibaba:

https://www.infoq.com/presentations/alibaba-kubernetes/?utm_source=email&utm_medium=devops&utm_campaign=newsletter&utm_content=10082019


"We enhanced the upstream services from two aspects; we call one the new controller, we call the other one statefulsets. The first thing we do is we introduce a so-called "in-place upgrade," which means if the container images’ only component is getting upgraded in a spec, the controller will not recreate a new pod. Instead, the Kubelet will just restart the pod with the new image, and the pod is kept the same. The benefit of this feature is that in this way, the pod states are preserved; the pod states, including the IP and all the PV configuration, they are all kept the same during the upgrade, and you avoid a lot of unnecessary rescheduling, because you still need to call the scheduler and go through the all the scheduling pods if you want to start a new pod. If you do the in-place upgrade, those offhand can be just eliminated."

"We leveraged the standard Kubernetes CRD controller and the scheduler plugin to implement all the features that we wanted to implement to extend to upstream Kubernetes. When we designed these new controllers, new CRDs, we put the scalability as our first constant when building the new components."

https://openkruise.io/en-us/docs/advanced_statefulset.html
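
The decision described in the quote ("if only the container images are getting upgraded, don't recreate the pod") can be sketched roughly as below. This is only a toy model of that rule, not the OpenKruise controller code: the dict layout and the names `only_images_changed` and `plan_update` are made up for illustration.

```python
import copy


def _without_images(spec):
    """Return a deep copy of a pod spec with every container image removed."""
    stripped = copy.deepcopy(spec)
    for container in stripped.get("containers", []):
        container.pop("image", None)
    return stripped


def only_images_changed(old_spec, new_spec):
    """True when the two specs differ in container images and nothing else."""
    # Everything except the images must be identical (names, env, volumes...).
    if _without_images(old_spec) != _without_images(new_spec):
        return False
    pairs = zip(old_spec.get("containers", []), new_spec.get("containers", []))
    return any(old.get("image") != new.get("image") for old, new in pairs)


def plan_update(old_spec, new_spec):
    """Pick an update strategy: 'noop', 'in-place' or 'recreate'."""
    if old_spec == new_spec:
        return "noop"
    if only_images_changed(old_spec, new_spec):
        # Pod object, IP and volumes stay; only the container is restarted.
        return "in-place"
    # Anything else means a new pod and a trip through the scheduler.
    return "recreate"


if __name__ == "__main__":
    old = {"containers": [{"name": "qb", "image": "qb:v1"}]}
    new = {"containers": [{"name": "qb", "image": "qb:v2"}]}
    print(plan_update(old, new))
```

The point of the sketch is that the expensive path (a new pod plus rescheduling) is taken only when something other than an image changes.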

We still want the same shells for our containers; we just want to replace the containers themselves.

This could be the solution for replacing containers in a pod. That capability was needed for the "Inside the pod" playbooks, where a quarterback (QB) container directed a set of worker containers to complete a task.

If the QB could be substituted along with the other offensive players without calling a timeout (that is, without rescheduling the whole pod), it would probably be faster.

If the PodGroup concept from Volcano could be combined with the container replacement capabilities of OpenKruise, it would enable the Pod version of the DSL for batch operations to be realized.
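
To make the idea a bit more concrete, here is a toy model of what "a group of pods whose containers get swapped in place" might look like. Nothing here is the Volcano or OpenKruise API: the `Pod` and `PodGroup` classes and the `swap_container` method are hypothetical names used purely to illustrate the combination.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy pod: the 'shell' (name, IP) plus a map of container name -> image."""
    name: str
    ip: str
    containers: dict


@dataclass
class PodGroup:
    """Toy stand-in for a group of pods that are handled together."""
    name: str
    pods: list = field(default_factory=list)

    def swap_container(self, container, image):
        """Replace one container's image in every pod that has it.

        The pods themselves are never recreated, so name and IP are kept.
        Returns the names of the pods that were touched.
        """
        touched = []
        for pod in self.pods:
            if container in pod.containers:
                pod.containers[container] = image
                touched.append(pod.name)
        return touched


if __name__ == "__main__":
    group = PodGroup("play-1", [
        Pod("pod-a", "10.0.0.1", {"qb": "qb:v1", "worker": "w:v1"}),
        Pod("pod-b", "10.0.0.2", {"worker": "w:v1"}),
    ])
    # Substitute the QB without a "timeout": only pod-a is touched.
    print(group.swap_container("qb", "qb:v2"))
```

In this model the QB (or any worker) is substituted across the whole group while every pod keeps its identity, which is the property the in-place upgrade is meant to give.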



