A Change-Data-Capture use-case: designing an evergreen cache - GeekcampSG 2020

Published on: Sunday, 4 October 2020

Skip to talk at 0:51 • Q&A at 25:42 and in the description

In this talk, Nicolas Frankel will describe an easy-to-set-up architecture that leverages CDC to keep a cache evergreen.

You might have read about Change-Data-Capture before. It’s been
described by Martin Kleppmann as turning the database inside out: it
means the DB can send change events (INSERT, DELETE and UPDATE)
that one can subscribe to. In contrast to Event Sourcing, which
aggregates events to produce state, CDC is about getting events out of
state. Once CDC is implemented, one can subscribe to its events and
update the cache accordingly. However, CDC is still in its early stages,
and implementations are quite specific.
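
To make the "subscribe to events and update the cache" idea concrete, here is a minimal sketch in plain Java. It is not the Hazelcast Jet or any other CDC library's API: the names EvergreenCacheSketch, ChangeEvent, Operation and onEvent are hypothetical, the row payload is reduced to a String, and the events are assumed to arrive already decoded, each carrying the row's primary key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout: this is an illustration, not a real CDC API.
class EvergreenCacheSketch {

    // The three kinds of row-level change a CDC source reports.
    enum Operation { INSERT, UPDATE, DELETE }

    // One change event: the operation, the row's primary key and the row itself.
    static final class ChangeEvent {
        final Operation operation;
        final long primaryKey;
        final String row; // stand-in for the full row payload; ignored for DELETE

        ChangeEvent(Operation operation, long primaryKey, String row) {
            this.operation = operation;
            this.primaryKey = primaryKey;
            this.row = row;
        }
    }

    // The cache, keyed by the row's primary key.
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    // Apply a single event so that the cache mirrors the table row by row.
    void onEvent(ChangeEvent event) {
        switch (event.operation) {
            case INSERT:
            case UPDATE:
                cache.put(event.primaryKey, event.row);
                break;
            case DELETE:
                cache.remove(event.primaryKey);
                break;
        }
    }

    // Returns the cached row, or null if the key is not cached.
    String get(long primaryKey) {
        return cache.get(primaryKey);
    }

    int size() {
        return cache.size();
    }
}
```

In this sketch the cache never queries the database: it only reacts to events, which is what keeps it in sync without polling or expiry timers. A real setup would also need an initial load of existing rows and some handling of event ordering, both left out here.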

Nicolas Frankel is a Developer Advocate with 15+ years of experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurance, large retail and the public sector). He usually works on Java/Java EE and Spring technologies, with focused interests such as Rich Internet Applications, Testing, CI/CD and DevOps. He currently works for Hazelcast. He also doubles as a teacher in universities and higher education schools and as a trainer, and triples as a book author.

Slides at: https://www.slideshare.net/nfrankel/geekcampsg-2020-a-changedatacapture-usecase-designing-an-evergreen-cache

Q: What's the granularity? Rows of specified tables? How are the rows identified? Schema-defined primary key?
A: Basically, you are sent the events that happened, the payload being the row itself, including the PK. You can do everything you want with it; in my demo, I'm updating row by row. Hope it answers the question. Here's the link to the full-fledged blog post: https://jet-start.sh/blog/2020/07/16/designing-evergreen-cache-cdc

Q: Is there any benchmark on the performance impact on the database server?
A: There’s no benchmark on the database, because the Jet job itself has no impact on it: the job is external to the database and reads the binary log. However, activating this binary log does have an impact, though it needs to be activated for replication anyway. There should be a benchmark for each database, but you can guesstimate a ~10% performance loss, due to making sure the file is written before changing the database state.

Q: Is it horizontally scalable?
A: Yes. By design, Jet distributes its jobs over nodes available in the network (and local cores!).

Visit https://geekcamp.sg for more information about GeekcampSG