Stream Processing Fundamentals with Apache Beam - PyCon SG 2019

Published on: Monday, 11 November 2019

Speaker: Tanay Tummalapalli, Student

Apache Beam is a unified programming model for implementing batch and streaming data pipelines that run on a variety of distributed processing back-ends. It is particularly useful for embarrassingly parallel data processing tasks, as well as for ETL and pure data integration. This talk covers the fundamentals of stream processing with the Beam model and looks at examples that illustrate its use. The Python SDK of Apache Beam lets the Python data science community use its existing tools and software with Beam, leveraging its unified model for batch and streaming to write data processing pipelines that can run on distributed processing back-ends such as Apache Spark and Apache Flink.

About the speaker

Google Summer of Code '19 Student @ Apache Beam. Former Intern @ SocialCops, GigSync. Interested in Data Engineering, Databases, Stream Processing, Distributed Systems, and Machine Learning. 21 | INTP | Music Nerd

Produced by Engineers.SG