View from Apache Flink on Evolution & Outlooks for the Modern Stateful Stream Processor | Ververica

Published on: Thursday, 25 July 2019

Get the slides:


Stream Processing has evolved quickly in a short time: a few years ago, stream processing was mostly simple real-time aggregations with limited throughput and consistency. Today, many stream processing applications have complex logic, strict correctness guarantees, high performance, low latency, and maintain large state without databases. Since then, stream processing has become much more sophisticated because the stream processors – the systems that run the application code, coordinate the distributed execution, route the data streams, and ensure correctness in the face of failures and crashes – have become much more technologically advanced. In this talk, we walk through some of the techniques and innovations behind Apache Flink, one of the most powerful open source stream processors.

In particular, we plan to discuss the evolution of stateful stream processing, Flink’s approach of fault-tolerance with distributed asynchronous and incremental snapshots, and how that approach looks today after multiple years of collaborative work with users running large scale stream processing deployments. Furthermore, we plan to discuss how stream processing is outgrowing its original space of real-time data processing and is becoming a technology that offers new approaches to data processing (including batch processing), real-time applications, and even distributed transactions.


Tzu-Li (Gordon) Tai is a Committer and PMC member of the Apache Flink project, and Software Engineer at Ververica. His contributions in Flink spans various components, including some of the most popular Flink streaming connectors (e.g. for Apache Kafka, AWS Kinesis, Elasticsearch, etc.), Flink's type serialization system, as well as several topics surrounding evolvability of stateful streaming applications. He is a frequent speaker at conferences such as Flink Forward, Strata Data, as well as many meetups related to Apache Flink or data engineering in general.

Data Council ( is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.