Wednesday, December 12, 2018

GCP : Cloud Dataflow & Apache Beam SDK

Cloud Dataflow supports fast, simplified pipeline development via expressive Java and Python APIs in the Apache Beam SDK, which provides a rich set of windowing and session analysis primitives as well as an ecosystem of source and sink connectors. Plus, Beam’s unique, unified development model lets you reuse more code across streaming and batch pipelines.
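As a rough illustration of that unified model, below is a minimal sketch of a Beam pipeline written with the Python SDK. The bucket paths and the 60-second window size are placeholder assumptions, not values taken from this post.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window

def run():
    # Default options run the pipeline locally on the DirectRunner.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Read" >> beam.io.ReadFromText("gs://my-bucket/input.txt")   # placeholder path
            | "Split" >> beam.FlatMap(lambda line: line.split())
            | "Window" >> beam.WindowInto(window.FixedWindows(60))         # 60-second fixed windows (assumed size)
            | "PairWithOne" >> beam.Map(lambda word: (word, 1))
            | "CountPerWord" >> beam.CombinePerKey(sum)
            | "Format" >> beam.Map(lambda kv: "%s: %d" % kv)
            | "Write" >> beam.io.WriteToText("gs://my-bucket/counts")      # placeholder path
        )

if __name__ == "__main__":
    run()

The same transform chain could read from an unbounded source such as Pub/Sub instead of a text file, and the windowing, counting, and formatting steps would stay unchanged; that reuse across batch and streaming is what the unified model refers to.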

Below is a conceptual diagram of Apache Beam (diagram not reproduced here).
Below are some key concepts of Apache Beam:

Unified: Use a single programming model for both batch and streaming use cases.
Portable: Execute pipelines on multiple execution environments, as shown in the sketch after this list.
Extensible: Write and share new SDKs, IO connectors, and transformation libraries.
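
Here is a minimal sketch of what that portability looks like with the Python SDK: the same pipeline code is handed different pipeline options to pick a runner. The project, region, and bucket names are placeholders, and the DataflowRunner options are commented out because they require a real GCP project.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def build_and_run(options):
    # The pipeline definition does not change between runners.
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Create" >> beam.Create(["alpha", "beta", "gamma"])
            | "Upper" >> beam.Map(str.upper)
            | "Print" >> beam.Map(print)
        )

# Local development: the DirectRunner executes the pipeline on one machine.
build_and_run(PipelineOptions(["--runner=DirectRunner"]))

# Managed execution on Cloud Dataflow: same code, different options.
# The project, region, and temp_location values below are placeholders.
# build_and_run(PipelineOptions([
#     "--runner=DataflowRunner",
#     "--project=my-project",
#     "--region=us-central1",
#     "--temp_location=gs://my-bucket/tmp",
# ]))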



References:
https://cloud.google.com/dataflow/
