Real-Time Stream Processing

This two-day course teaches data engineers how to process unbounded streams of data in real time using popular open-source frameworks. We focus mostly on Apache Flink, a promising open-source stream processing framework that is increasingly used in production. Additionally, we provide short introductions to Spark Streaming, Apache Storm and Apache Samza, so that students know the existing alternatives, gain a wider perspective and can choose the best tool for their use cases.

During the course we simulate a real-world, end-to-end scenario – processing, in real time, logs generated by users interacting with a mobile application. The technologies that we use include Kafka, Flink, HDFS, YARN and Elasticsearch. All exercises are done on a remote multi-node Hadoop cluster.
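To give a feel for this kind of processing, here is a minimal sketch in plain Python of counting log events per fixed (tumbling) time window. It is an illustration only, not course material and not Flink API code: the `LogEvent` fields, the action names and the 10-second window size are all assumptions made up for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple


class LogEvent(NamedTuple):
    """Hypothetical log record produced by a mobile application."""
    timestamp_ms: int  # event time, in milliseconds
    user_id: str
    action: str


def tumbling_window_counts(events: Iterable[LogEvent],
                           window_ms: int) -> Dict[int, Counter]:
    """Count actions per non-overlapping window of `window_ms` milliseconds.

    Each event is assigned to the window that starts at the largest
    multiple of `window_ms` not greater than the event timestamp.
    """
    counts = defaultdict(Counter)
    for event in events:
        window_start = event.timestamp_ms - (event.timestamp_ms % window_ms)
        counts[window_start][event.action] += 1
    return dict(counts)


# Made-up sample data; a real stream would be unbounded.
sample_events = [
    LogEvent(1_000, "u1", "open"),
    LogEvent(4_500, "u2", "open"),
    LogEvent(9_999, "u1", "click"),
    LogEvent(10_000, "u3", "open"),
    LogEvent(12_300, "u2", "close"),
]

result = tumbling_window_counts(sample_events, window_ms=10_000)
for start in sorted(result):
    print(start, dict(result[start]))
```

A stream processing framework does the same grouping continuously and in a distributed fashion, and additionally has to deal with late or out-of-order events, which this sketch ignores.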

Target Audience

Data engineers who are interested in leveraging large-scale, distributed tools to process streams of data in real time. Some experience coding in Python, Java or Scala, plus basic familiarity with Big Data tools (e.g. Hadoop, Spark), is assumed.

Course Agenda

Day 1

  • Introduction to the use case – StreamRock
  • Apache Kafka
    • Key concepts
    • Daemons and cluster infrastructure
    • Hands-on exercises
  • Elasticsearch
    • Key concepts
    • Daemons and cluster infrastructure
    • Hands-on exercises
  • Apache Flink
    • Key concepts
    • Basic API
    • Time & Windows
    • Integration with Kafka and Elasticsearch
    • Hands-on exercises

Day 2

  • Apache Flink (cont’d)
    • Stateful operators
    • Hands-on exercises
    • Advanced features
    • Daemons and cluster infrastructure
    • Best practices
    • Hands-on exercises
  • Bonus 1 – Apache Storm
    • Key concepts
    • Live demo
  • Bonus 2 – Spark Streaming
    • Key concepts
    • Live demo
  • (Optional) Bonus 3 – Apache Samza
    • Key concepts
    • Live demo
  • Discussion – Comparison of stream processing tools

Our Approach

The training provides a carefully prepared mix of theory, exercises, demos, discussions, quizzes and … fun! We make sure that each participant is actively engaged in the hands-on exercises, discussions and teamwork.

More Information

Please contact us with any questions about our training courses, or if you would like to discuss a custom on-site training course.