robert0714/mastering-kafka-streams-and-ksqldb-2021


Mastering Kafka Streams and ksqlDB

Code repository for the upcoming O'Reilly book: Mastering Kafka Streams and ksqlDB by Mitch Seymour

Available Editions and Versions

| Edition | Kafka Streams version | ksqlDB version | Publication Date | Branch |
| --- | --- | --- | --- | --- |
| Early Release | 2.6.0 | 0.12.0 | May 2020 | early-release |
| 1st Edition | 2.7.0 | 0.14.0 | February 2021 | 1st-edition |
| main | 2.7.2 | 0.25.0 | May 2022 | master |

The Streams API is not compatible with Kafka clusters running older Kafka versions (0.7, 0.8, 0.9).

Confluent Platform and Apache Kafka Compatibility

| Confluent Platform | Apache Kafka® | Release Date | Standard End of Support | Platinum End of Support |
| --- | --- | --- | --- | --- |
| 7.1.x | 3.1.x | April 5, 2022 | April 5, 2024 | April 5, 2025 |
| 7.0.x | 3.0.x | October 27, 2021 | October 27, 2023 | October 27, 2024 |
| 6.2.x | 2.8.x | June 8, 2021 | June 8, 2023 | June 8, 2024 |
| 6.1.x | 2.7.x | February 9, 2021 | February 9, 2023 | February 9, 2024 |
| 6.0.x | 2.6.x | September 24, 2020 | September 24, 2022 | September 24, 2023 |
| 5.5.x | 2.5.x | April 24, 2020 | April 24, 2022 | April 24, 2023 |
| 5.4.x | 2.4.x | January 10, 2020 | January 10, 2022 | January 10, 2023 |
| 5.3.x | 2.3.x | July 19, 2019 | July 19, 2021 | July 19, 2022 |
| 5.2.x | 2.2.x | March 28, 2019 | March 28, 2021 | March 28, 2022 |
| 5.1.x | 2.1.x | December 14, 2018 | December 14, 2020 | December 14, 2021 |
| 5.0.x | 2.0.x | July 31, 2018 | July 31, 2020 | July 31, 2021 |
| 4.1.x | 1.1.x | April 16, 2018 | April 16, 2020 | April 16, 2021 |
| 4.0.x | 1.0.x | November 28, 2017 | November 28, 2019 | November 28, 2020 |
| 3.3.x | 0.11.0.x | August 1, 2017 | August 1, 2019 | August 1, 2020 |
| 3.2.x | 0.10.2.x | March 2, 2017 | March 2, 2019 | March 2, 2020 |
| 3.1.x | 0.10.1.x | November 15, 2016 | November 15, 2018 | November 15, 2019 |
| 3.0.x | 0.10.0.x | May 24, 2016 | May 24, 2018 | May 24, 2019 |
| 2.0.x | 0.9.0.x | December 7, 2015 | December 7, 2017 | December 7, 2018 |
| 1.0.0 | | February 25, 2015 | February 25, 2017 | February 25, 2018 |

Confluent for Kubernetes (CFK)

| CFK Version | Compatible Confluent Platform Versions | Compatible Kubernetes Versions | Release Date | End of Support |
| --- | --- | --- | --- | --- |
| 2.3.x | 7.0.x, 7.1.x | 1.18 - 1.23 (OpenShift 4.6 - 4.10) | April 5, 2022 | April 5, 2023 |
| 2.2.x | 6.2.x, 7.0.x | 1.17 - 1.22 (OpenShift 4.6 - 4.9) | Nov 3, 2021 | Nov 3, 2022 |
| 2.1.x | 6.0.x, 6.1.x, 6.2.x | 1.17 - 1.22 (OpenShift 4.6 - 4.9) | Oct 12, 2021 | Oct 12, 2022 |
| 2.0.x | 6.0.x, 6.1.x, 6.2.x | 1.15 - 1.20 | May 12, 2021 | May 12, 2022 |

Kafka Client Compatibility

| Spring Cloud Stream Version | Spring for Apache Kafka Version | Spring Integration for Apache Kafka Version | kafka-clients | Spring Boot | Spring Cloud |
| --- | --- | --- | --- | --- | --- |
| 3.1.x (2020.0.x) | 2.6.x | 5.4.x | 2.6.x | 2.4.x | 2020.0.x |
| 3.0.x (Horsham)* | 2.5.x, 2.3.x | 3.3.x, 3.2.x | 2.5.x, 2.3.x | 2.3.x, 2.2.x | Hoxton* |

upgrade-guide

Chapter Tutorials

  • Chapter 1 - A Rapid Introduction to Kafka
  • Chapter 2 - Getting Started with Kafka Streams
  • Chapter 3 - Stateless Processing (Sentiment Analysis of Cryptocurrency Tweets)
  • Chapter 4 - Stateful Processing (Video game leaderboard)
  • Chapter 5 - Windows and Time (Patient Monitoring / Infection detection application)
  • Chapter 6 - Advanced State Management
  • Chapter 7 - Processor API (Digital Twin / IoT application)
  • Chapter 8 - Getting Started with ksqlDB
  • Chapter 9 - Data Integration with ksqlDB and Kafka Connect
  • Chapter 10 - Stream Processing Basics with ksqlDB (Netflix Change Tracking - Part I)
  • Chapter 11 - Intermediate Stream Processing with ksqlDB (Netflix Change Tracking - Part II)
  • Chapter 12 - The Road to Production

Why read this book?

  • Kafka Streams and ksqlDB greatly simplify the process of building stream processing applications
  • As an added benefit, they are also both extremely fun to use
  • Kafka was the fourth-fastest-growing tech skill mentioned in job postings from 2014 to 2019, so sharpening your skills in this area has career benefits
  • By learning Kafka Streams and ksqlDB, you will be well prepared to tackle a wide range of business problems, including streaming ETL, data enrichment, anomaly detection, data masking, data filtering, and more

Support this book

A proposed Kafka maturity model

For a comparison, check out the Confluent white paper titled “Five Stages to Streaming Platform Adoption”, which presents a different perspective: a five-stage streaming maturity model with distinct criteria for each stage.

Use Cases

Kafka Streams is optimized for processing unbounded datasets quickly and efficiently, and is therefore a great solution for problems in low-latency, time-critical domains. A few example use cases include:

  • Financial data processing (Flipkart), purchase monitoring, fraud detection
  • Algorithmic trading
  • Stock market/crypto exchange monitoring
  • Real-time inventory tracking and replenishment (Walmart)
  • Event booking, seat selection (Ticketmaster)
  • Email delivery tracking and monitoring (Mailchimp)
  • Video game telemetry processing (Activision, the publisher of Call of Duty)
  • Search indexing (Yelp)
  • Geospatial tracking/calculations (e.g., distance comparison, arrival projections)
  • Smart Home/IoT sensor processing (sometimes called AIoT, or the Artificial Intelligence of Things)
  • Change data capture (Red Hat)
  • Sports broadcasting/real-time widgets (Gracenote)
  • Real-time ad platforms (Pinterest)
  • Predictive healthcare, vitals monitoring (Children’s Healthcare of Atlanta)
  • Chat infrastructure (Slack), chat bots, virtual assistants
  • Machine learning pipelines (Twitter) and platforms (Kafka Graphs)

The list goes on and on, but the common characteristic across all of these examples is that they require (or at least benefit from) real-time decision making or data processing. The spectrum of these use cases, and others you will encounter in the wild, is quite wide. On one end of the spectrum, you may be processing streams at the hobbyist level by analyzing sensor output from a Smart Home device. On the other end, you could use Kafka Streams in a healthcare setting to monitor and react to changes in a trauma victim’s condition, as Children’s Healthcare of Atlanta has done.

Kafka Streams is also a great choice for building microservices on top of real-time event streams. It not only simplifies typical stream processing operations (filtering, joining, windowing, and transforming data), but as you will see in “Interactive Queries”, it can also expose the state of a stream through a feature called interactive queries. The state of a stream could be an aggregation of some kind (e.g., the total number of views for each video in a streaming platform) or the latest representation of a rapidly changing entity in your event stream (e.g., the latest stock price for a given stock symbol).
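To make the two kinds of stream state mentioned above more concrete, here is a minimal conceptual sketch in plain Java. It does not use the Kafka Streams API: the class `StreamStateSketch` and its methods are hypothetical names chosen for illustration, and the in-memory maps only stand in for the state that a real application would keep in a state store and expose through interactive queries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Conceptual model only (assumption): plain Java, no Kafka dependency.
public class StreamStateSketch {

    // Aggregation state: total number of views seen so far per video ID.
    private final Map<String, Long> viewCounts = new HashMap<>();

    // Latest-value state: most recent price seen per stock symbol.
    private final Map<String, Double> latestPrices = new HashMap<>();

    // Called once per "view" event; increments the running count for the key.
    public void onView(String videoId) {
        viewCounts.merge(videoId, 1L, Long::sum);
    }

    // Called once per "price" event; a newer event overwrites the older value.
    public void onPrice(String symbol, double price) {
        latestPrices.put(symbol, price);
    }

    // Point lookup against the aggregation state (0 if the key was never seen).
    public long views(String videoId) {
        return viewCounts.getOrDefault(videoId, 0L);
    }

    // Point lookup against the latest-value state (empty if never seen).
    public Optional<Double> latestPrice(String symbol) {
        return Optional.ofNullable(latestPrices.get(symbol));
    }

    public static void main(String[] args) {
        StreamStateSketch state = new StreamStateSketch();

        // Simulate a small stream of events arriving in order.
        state.onView("video-1");
        state.onView("video-2");
        state.onView("video-1");
        state.onPrice("ACME", 100.0);
        state.onPrice("ACME", 101.5);

        // Query the current state, as a microservice endpoint might.
        System.out.println("video-1 views: " + state.views("video-1"));
        System.out.println("ACME latest price: " + state.latestPrice("ACME").orElse(Double.NaN));
    }
}
```

The point of the sketch is the difference between the two update rules: an aggregation combines each new event with the previous value for its key, while a latest-value view simply replaces it. Either way, the state can be read at any time with a lookup by key.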

Now that you have some idea of who is using Kafka Streams and what kinds of use cases it is well suited for, let’s take a quick look at Kafka Streams’ architecture before we start writing any code.

The 20 fastest-rising and sharpest-declining tech skills of the past 5 years: Kafka

https://www.techrepublic.com/article/the-20-fastest-rising-and-sharpest-declining-tech-skills-of-the-past-5-years/

Kafka Streams Official Examples

https://github.com/confluentinc/kafka-streams-examples
