Cloud Dataflow for Data Processing
Goal: In this module, you will learn how to develop and execute a variety of data processing patterns using Dataflow and how to manage clusters using the Dataproc service.
Objectives: Upon completing this module, you should be able to:
- Build a Dataflow pipeline
- Create a Maven project with the Dataflow SDK
- Create and execute a streaming pipeline using a Dataflow template
- Create a pipeline with Apache Beam
- Test a pipeline
- Create, manage, and delete clusters using the Dataproc service
- Run a job on a cluster
- Use APIs to automate jobs
Topics:
- Dataflow services
- Stream and batch processing
- Apache Beam SDK
- Monitoring using Stackdriver
- Data transformation with Cloud Dataflow
- Working with Dataproc
- Creating a cluster
- Managing a cluster
- Automation of jobs