From the course: Apache Kafka Essential Training: Building Scalable Applications (2021)

What is Kafka?

- [Instructor] A good place to start this course is by reviewing some of Apache Kafka's features and capabilities. Kafka is an event streaming platform. Events, or messages, represent the actual data that is exchanged through Kafka. The terms "events" and "messages" are used interchangeably in Kafka's context. Kafka is a critical piece of the big data puzzle and plays an integral part in many big data pipelines. It is open source and can be downloaded and deployed free of cost. Commercial offerings are also available that provide support and serverless deployments. It is arguably the most popular messaging platform in the world.

In Kafka's world, there are data publishers, called producers, which push messages into Kafka. And there are subscribers, called consumers, which listen for and receive messages. Producers and consumers are the standard terms in the Kafka world for publishers and subscribers.

What capabilities does Kafka provide for data exchange? It collects messages from multiple producers concurrently. It stores the messages it receives persistently, which provides fault tolerance. It transports data from producers to consumers. With mirroring, it can also transport data across networks. It distributes data to multiple concurrent consumers for downstream processing. Finally, it tracks message consumption by each consumer. This ensures at-least-once delivery of messages, even if consumers go down and come back up again.

Before we dive into the course content, let's discuss the prerequisites in the next video.
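To make the consumption-tracking idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Kafka's client API: the `ToyLog` class, its `publish`, `poll`, and `commit` methods, and the consumer names are all hypothetical and exist only for illustration. It shows why storing messages and tracking each consumer's position gives at-least-once delivery: a consumer that fails before recording its progress is handed the same message again when it comes back.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for a stored message stream (illustration only, not Kafka)."""

    def __init__(self):
        self._messages = []                  # append-only storage of received messages
        self._committed = defaultdict(int)   # consumer name -> next offset to read

    def publish(self, message):
        """Producer side: append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer, max_messages=10):
        """Consumer side: return (offset, message) pairs starting at the committed offset."""
        start = self._committed[consumer]
        end = min(start + max_messages, len(self._messages))
        return [(i, self._messages[i]) for i in range(start, end)]

    def commit(self, consumer, offset):
        """Record that `consumer` has processed everything up to and including `offset`."""
        self._committed[consumer] = max(self._committed[consumer], offset + 1)


log = ToyLog()
for event in ["order-1", "order-2", "order-3"]:
    log.publish(event)

# The "billing" consumer processes the first message and commits it, then
# "crashes" after reading the second message but before committing it.
batch = log.poll("billing")
log.commit("billing", batch[0][0])

# After the restart, polling resumes from the last committed offset, so
# "order-2" is delivered a second time: at-least-once, not exactly-once.
redelivered = log.poll("billing")

# A second consumer keeps its own independent position in the same log.
shipping_view = log.poll("shipping")
```

The sketch keys positions by consumer name to mirror the description above. The trade-off it illustrates is that a consumer may see a message more than once, so downstream processing generally needs to tolerate duplicates.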
