Building Data Pipelines with Monitoring & Observability
Jiaqi Liu
"Data pipelines, with many layers of transformations, machine learning logic, and movement from different data sources, can often be challenging to build and maintain. As a result, it's valuable to not only be able to test and monitor your code but also validate and audit the data that you are working with.
We'll discuss what it means to have observability in a data pipeline and which key features make a data pipeline easily testable and observable. We'll also look at how to identify time-series metrics that can be used to monitor the health of a data pipeline and how to build data lineage tracing into the core foundation of your data jobs."