If there is one thing that defines Elon, IMHO it's his ability to aim at the right target. This cannot be an accident. Therefore, in order to improve my own aim, I have decided to learn more about how he thinks.
How to think like Elon:
https://www.youtube.com/watch?v=L-1F-dV66ko
https://www.youtube.com/watch?v=9SKyrqCvPtY
Update 2/4: As it turns out, this discussion on windowing has some nuances that are described in this article: https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102. It appears that, in order to aim at the right target, more research needs to be done, continuing down the path of first-principles thinking.
Question: What are we trying to accomplish? In figuring out what to aim at, I see that the benefit of reworking the idea of streaming inside K8s might be its ability to abstract away data mutability. That would be the paradigm shift.
Note: the streaming libraries I have looked at require that you conform to their APIs.
As I have been saying, the game changer here might be that you can use lower-level substrates to partition the problem… then you just write normal code with no API beyond "accept input, return output".
The lower-level architecture frees you from having to couple your code to those APIs.
This should give your code base added flexibility.
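To make the contrast concrete, here is a minimal sketch (in Python) of what "accept input, return output" code could look like. The record format and the transform are placeholders I made up; the point is that nothing in it depends on a streaming framework's API, so the same function could be packaged in a container and partitioned by the lower-level substrate.

```python
import json
import sys

def transform(record):
    # Plain business logic: accept input, return output.
    # No framework base class to subclass, no DoFn/ProcessFunction to implement.
    record["value"] = record.get("value", 0) * 2  # placeholder transform
    return record

def main():
    # Read newline-delimited JSON records from stdin, write results to stdout.
    # How records get partitioned across containers is left to the substrate (e.g. K8s).
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        print(json.dumps(transform(json.loads(line))))

if __name__ == "__main__":
    main()
```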
Looking deeper into streaming via these APIs… I am hesitant to give up the control that K8s gives you over scheduling, the location of resources, etc. But you don't want to throw the baby out with the bathwater, so I continue the research.
I don't want to have to conform to an API that automagically does the scheduling. As it appears to me, K8s may be able to do some of those things at a lower level, thereby removing the burden of API conformance from the upper-level code base. APIs do let the codebase benefit from abstractions, so I am not saying there is no benefit; it's just that in doing so they also couple you to the API. Maybe there is another way to isolate your code from the complexities inherent in the streaming problem?
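One way to picture "doing those things at a lower level" is to hand a plain worker like the one above to Kubernetes as a Job, letting K8s handle placement and retries instead of registering the code with a streaming framework. This is only a sketch using the official Kubernetes Python client; the image name, namespace, and resource sizes are assumptions, not values from a real deployment.

```python
from kubernetes import client, config

def submit_worker_job(name, image, namespace="default"):
    # Submit the plain "accept input, return output" worker as a Kubernetes Job.
    # Scheduling, placement, and retries are delegated to K8s; the worker code
    # itself stays free of any streaming-framework API.
    config.load_kube_config()  # use config.load_incluster_config() inside the cluster

    container = client.V1Container(
        name=name,
        image=image,  # hypothetical image containing the worker script
        command=["python", "worker.py"],
        resources=client.V1ResourceRequirements(
            requests={"cpu": "500m", "memory": "256Mi"}  # assumed sizing
        ),
    )
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(containers=[container], restart_policy="Never")
            )
        ),
    )
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)

# e.g. submit_worker_job("stream-partition-0", "example.registry/stream-worker:latest")
```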
Note: Why would I want to handle scheduling myself? I think it's part of the "uber language of compute". Your engine should be able to make scheduling decisions based on runtime input; pushing this out to an API prevents you from controlling that. I am thinking that AI can be applied to scheduling and resource management, e.g. figuring out when to start pods during a long-running sequence of steps in a graph. See previous posts on "ship to ship data transfer".
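As a rough illustration of making scheduling decisions from runtime input, the loop below polls an upstream step's Job and only launches the next pod in the graph once the upstream work has actually succeeded. The step names, the polling interval, and the simple "succeeded" check are assumptions for the sketch; decide_to_start is the hook where a learned model could weigh queue depth, data size, or cost instead of a fixed rule.

```python
import time
from kubernetes import client, config

def upstream_finished(job_name, namespace="default"):
    # Has the upstream step's Job completed successfully?
    status = client.BatchV1Api().read_namespaced_job_status(job_name, namespace).status
    return bool(status.succeeded)

def decide_to_start(upstream_job):
    # Placeholder policy: start the next step as soon as upstream is done.
    # An AI-driven scheduler would replace this with a model over runtime signals.
    return upstream_finished(upstream_job)

def run_next_step(upstream_job, submit_next, poll_seconds=10):
    # Engine loop: watch runtime state, launch the next pod when the policy says so.
    config.load_kube_config()
    while not decide_to_start(upstream_job):
        time.sleep(poll_seconds)
    submit_next()  # e.g. lambda: submit_worker_job("step-2", "example.registry/step-2:latest")
```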