https://hazelcast.com/blog/designing-an-evergreen-cache-with-change-data-capture/
Jet also provides a Debezium module through which it can process change events directly from the database and write them to its distributed key-value store. This avoids writing intermediate messages to Kafka and then reading them back again to populate a separate cache.
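A minimal sketch of such a pipeline, based on the MySQL CDC source introduced in Jet 4.2; the connection settings and the inventory.customers table follow Debezium's standard example database and are assumptions here:

import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.cdc.CdcSinks;
import com.hazelcast.jet.cdc.ChangeRecord;
import com.hazelcast.jet.cdc.mysql.MySqlCdcSources;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.StreamSource;

public class EvergreenCache {
    public static void main(String[] args) {
        // Debezium-based CDC source reading MySQL binlog events directly,
        // no Kafka broker in between (all connection values are placeholders)
        StreamSource<ChangeRecord> source = MySqlCdcSources.mysql("customers-source")
                .setDatabaseAddress("127.0.0.1")
                .setDatabasePort(3306)
                .setDatabaseUser("debezium")
                .setDatabasePassword("dbz")
                .setClusterName("dbserver1")
                .setDatabaseWhitelist("inventory")
                .setTableWhitelist("inventory.customers")
                .build();

        Pipeline pipeline = Pipeline.create();
        pipeline.readFrom(source)
                .withNativeTimestamps(0)
                // Write each change event into the distributed IMap "customers";
                // CdcSinks.map applies inserts/updates and removes deleted keys
                .writeTo(CdcSinks.map("customers",
                        r -> (Integer) r.key().toMap().get("id"),
                        r -> r.value().toJson()));

        JetInstance jet = Jet.bootstrappedInstance();
        jet.newJob(pipeline).join();
    }
}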
Hazelcast can also support distributed partitioning. I have been taking the approach that data should be partitioned via the data DSL into node-level islands, which provides data locality. This is different from a fully distributed data store; instead, it is sharded, localized data. It appears that filtering, either in the Jet pipeline or in the Debezium connector, could produce this sharded, localized data (see the sketch below).
Note that while Jet pipelines provide filtering capabilities, it is also possible to filter items in the CDC connector to reduce the load on the data pipeline.
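For the pipeline-side option, a filter stage can drop change records that do not belong to a node's island. Reusing the source from the sketch above; the "region" column and the routing rule are invented for illustration:

// Keep only the records this node's island owns;
// "region" is a hypothetical column used as the shard key
pipeline.readFrom(source)
        .withNativeTimestamps(0)
        .filter(r -> "EU".equals(r.value().toMap().get("region")))
        .writeTo(CdcSinks.map("customers-eu",
                r -> (Integer) r.key().toMap().get("id"),
                r -> r.value().toJson()));

Connector-side filtering corresponds to builder settings like setDatabaseWhitelist and setTableWhitelist shown earlier, which stop unwanted events before they ever enter the pipeline.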
A modified layout uses a Hazelcast instance as a sidecar container in each Pod. The link to the Central Data Source is not over Kafka but goes directly through Debezium.
Another modification could be an embedded Hazelcast instance inside each application container, but this would again be application specific: Hazelcast needs a JVM to run an embedded member.
In the diagram below the Pods could run any language with a Hazelcast client: Java, Scala, .NET Framework[1], C++, Python, Node.js, Go and Clojure.
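For example, a Java client in the application container could reach the sidecar member over the Pod's loopback address. A minimal sketch, assuming the member listens on Hazelcast's default port 5701 and the CDC pipeline maintains the "customers" map:

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class SidecarClient {
    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        // The sidecar member shares the Pod's network namespace,
        // so it is reachable on localhost (default port assumed)
        config.getNetworkConfig().addAddress("127.0.0.1:5701");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
        // Read from the CDC-maintained map; 1001 is an example key
        // from Debezium's sample inventory database
        IMap<Integer, String> customers = client.getMap("customers");
        System.out.println(customers.get(1001));
    }
}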
Note: there is a Hazelcast Operator:
https://github.com/hazelcast/hazelcast-operator/blob/master/hazelcast-operator/README.md
Example code for the pipeline is here:
https://hazelcast.com/blog/jet-4-2-is-released/