Kafka has become a key data infrastructure technology, and we all have at least a vague sense that it is a messaging system. But what else is it? How can an overgrown message bus be getting this much buzz? Well, because Kafka is not merely a message bus: it is the center of a rich streaming data platform that invites detailed exploration.

In this talk, we'll look at the entire streaming platform provided by Apache Kafka and the Confluent community components. Starting with a lonely key-value pair, we'll build up topics, partitioning, replication, and the low-level Producer and Consumer APIs. We'll group consumers into elastically scalable, fault-tolerant application clusters, then layer on more sophisticated stream processing APIs like Kafka Streams and ksqlDB. We'll help teams collaborate around data formats with schema management, and we'll integrate with legacy systems without writing custom code. By the time we're done, the open-source project we thought was Big Data's answer to message queues will have become an enterprise-grade streaming platform, all in 60 minutes.

Timecodes
00:00 Intro
01:24 What are Events?
04:33 A platform for events
05:11 The Log
06:54 Topics
08:53 Partitions
09:37 Producer API
10:45 Message keys
14:10 Consumer API
19:20 Consumer Groups
22:30 Brokers
24:39 Demo 01 - Kafka producer/consumer
30:30 Kafka use cases
32:43 Kafka Connect
36:40 Schemas & Serialisation Formats
44:40 Stream processing introduction
46:50 Kafka Streams
48:03 ksqlDB
50:16 Demo 02
57:05 Recap & Resources

📚 Free eBooks: http://rmoff.dev/ndcoslo2020
✍️ Slides: https://rmoff.dev/kafka101
🎥 More Apache Kafka videos: https://www.youtube.com/rmoff
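
If you want to try the producer/consumer basics from the talk yourself, here is a minimal Java sketch using the standard Kafka client library. It assumes a broker running at localhost:9092 and a topic called "test-topic", and the group id "demo-group" is a made-up example; none of these come from the talk itself.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class HelloKafka {
    public static void main(String[] args) {
        // Assumed broker address; adjust for your own cluster.
        String bootstrap = "localhost:9092";

        // --- Produce: each record is a key-value pair; the key drives partitioning ---
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", bootstrap);
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("test-topic", "my-key", "hello, kafka"));
        }

        // --- Consume: consumers sharing a group.id split the topic's partitions ---
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", bootstrap);
        consumerProps.put("group.id", "demo-group"); // hypothetical group id
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("test-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> r : records) {
                System.out.printf("key=%s value=%s partition=%d offset=%d%n",
                        r.key(), r.value(), r.partition(), r.offset());
            }
        }
    }
}
```

Running a second copy of the consumer with the same group.id shows the consumer-group behaviour covered around 19:20: the partitions get shared between the two instances rather than each instance seeing every record.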