Apache Beam

Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Flink, Apache Spark, Google Cloud Dataflow and Hazelcast Jet.

Key Points

  1. Data pipelines connect data sources to data targets and transform the data in between, in batches or as event streams
  2. Beam provides a high-level, portable pipeline processing model that sits on top of data-service runtimes
  3. Beam input events arrive over time from sources such as Kafka; Spark and Flink act as runners that execute the pipeline rather than as data sources
  4. Beam processing is written with an SDK: SQL, Java, Python, Go and others
  5. Beam pipelines can apply transformations from existing libraries as well as new application logic
  6. Batch and streaming data processing jobs can be implemented once and run on any supported execution engine.
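
The separation between defining a pipeline and executing it (points 2 and 6) can be sketched in plain Python. This is only a conceptual model under stated assumptions: it does not use the Beam SDK, and the names build_pipeline, run_batch and run_streaming are hypothetical helpers made up for illustration, not Beam APIs. The point it shows is that the same list of element-wise transforms gives the same result whether a "runner" processes the whole input at once or one event at a time.

```python
from typing import Callable, Iterable, List

# A transform takes a collection of elements and returns a new collection.
Transform = Callable[[Iterable[str]], Iterable[str]]


def build_pipeline() -> List[Transform]:
    """Define the pipeline once, independent of how it will be executed."""
    return [
        lambda items: (x.strip() for x in items),   # clean each element
        lambda items: (x for x in items if x),      # drop empty elements
        lambda items: (x.upper() for x in items),   # transform each element
    ]


def run_batch(pipeline: List[Transform], source: Iterable[str]) -> List[str]:
    """Hypothetical batch runner: push the whole bounded input through each step."""
    data: Iterable[str] = source
    for transform in pipeline:
        data = transform(data)
    return list(data)


def run_streaming(pipeline: List[Transform], source: Iterable[str]) -> List[str]:
    """Hypothetical streaming runner: push events through one at a time."""
    output: List[str] = []
    for event in source:
        data: Iterable[str] = [event]
        for transform in pipeline:
            data = transform(data)
        output.extend(data)
    return output


pipeline = build_pipeline()
source = [" a ", "  ", "b"]
batch_result = run_batch(pipeline, source)
streaming_result = run_streaming(pipeline, source)
```

In Beam itself the runner is chosen through pipeline options rather than by calling a different function, and real runners also handle distribution, state and windowing, which this sketch leaves out.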


youtube: Apache Beam overview 2019
Beam GitHub: https://github.com/apache/beam
Beam docs: https://beam.apache.org/documentation/
Beam execution model: https://beam.apache.org/documentation/runtime/model/
Beam learning resources, interactive examples ***: https://beam.apache.org/documentation/resources/learning-resources/
Beam Programming Guide: https://beam.apache.org/documentation/programming-guide/


Apache Big Data projects
Data engineering education site (educba): https://www.educba.com/my-courses/dashboard/
Java quickstart: https://beam.apache.org/get-started/quickstart-java/
Python quickstart: https://beam.apache.org/get-started/quickstart-py/

Key Concepts

Apache Beam Overview 2

youtube Apache Beam overview 2019

Big Data - variety, volume, velocity, variance

Which data framework to use?

Beam Vision

Beam processing details

Parallel Do functions
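
A rough plain-Python model of what a parallel-do step computes, as a sketch rather than Beam code: a user function is applied to each element independently and may emit zero, one or many outputs. In the Beam Python SDK the corresponding transform is beam.ParDo with a beam.DoFn; the par_do and split_words names below are hypothetical and exist only for this illustration.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def par_do(elements: Iterable[T], fn: Callable[[T], Iterable[U]]) -> Iterator[U]:
    """Apply fn to every element independently and flatten its outputs.

    Because no element depends on another, a real runner is free to
    spread this work across many workers; here it simply runs in a loop.
    """
    for element in elements:
        yield from fn(element)


def split_words(line: str) -> List[str]:
    """Emit one output per word; an empty line emits nothing."""
    return line.lower().split()


lines = ["Beam runs batch", "", "Beam runs streaming"]
words = list(par_do(lines, split_words))
```

The empty line produces no output at all, which is the main difference between this kind of step and a strict one-to-one map.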

Per Key aggregations
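
Per-key aggregation can be pictured as two steps: group the values that share a key, then combine each group into a single result. The sketch below models that in plain Python, assuming the input is a list of (key, value) pairs; combine_per_key is a hypothetical helper, not the Beam API. In the Beam Python SDK the equivalent transforms are beam.GroupByKey and beam.CombinePerKey.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple, TypeVar

V = TypeVar("V")
R = TypeVar("R")


def combine_per_key(
    pairs: Iterable[Tuple[Hashable, V]],
    combine_fn: Callable[[List[V]], R],
) -> Dict[Hashable, R]:
    """Group values by key, then reduce each group with combine_fn."""
    grouped: Dict[Hashable, List[V]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: combine_fn(values) for key, values in grouped.items()}


scores = [("alice", 3), ("bob", 5), ("alice", 4)]
totals = combine_per_key(scores, sum)
best = combine_per_key(scores, max)
```

A distributed runner would shuffle the data so that all values for one key reach the same worker; the dictionary here stands in for that shuffle.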

Event-time windowing output
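
Event-time windowing assigns each event to a window based on the timestamp carried by the event, not on when it happens to arrive. The sketch below is a plain-Python model, assuming events are (timestamp_in_seconds, value) pairs and fixed, non-overlapping windows; fixed_window_start and sum_per_window are hypothetical names. In the Beam Python SDK the comparable step is beam.WindowInto with beam.window.FixedWindows. Triggers, watermarks and late-data handling are left out of this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fixed_window_start(timestamp: int, size: int) -> int:
    """Return the start of the fixed window that contains timestamp."""
    return timestamp - (timestamp % size)


def sum_per_window(events: Iterable[Tuple[int, int]], size: int) -> Dict[int, int]:
    """Sum values per event-time window, regardless of arrival order."""
    totals: Dict[int, int] = defaultdict(int)
    for timestamp, value in events:
        totals[fixed_window_start(timestamp, size)] += value
    return dict(sorted(totals.items()))


# The event stamped 59 arrives after the one stamped 62, yet it is still
# counted in the first window because windowing follows event time.
events = [(5, 1), (62, 2), (59, 3), (130, 4)]
window_totals = sum_per_window(events, size=60)
```

With 60-second windows the keys of window_totals are the window start times 0, 60 and 120.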


Where we are in March 2019


Apache Beam - Java Quickstart


Apache Beam - Python Quickstart


Potential Value Opportunities

Potential Challenges

Candidate Solutions

Step-by-step guide for Example

Recommended Next Steps

Related content