From the course: Apache Kafka Essential Training: Getting Started (2021)

What is Kafka?

- [Narrator] Having reviewed the benefits of the publish-subscribe pattern in message queues, let's dive into the most popular technology in that domain: Apache Kafka. What is Apache Kafka? Kafka is an event streaming platform. Events, or messages, represent the actual data that is exchanged through Kafka; the two terms are used interchangeably in Kafka's context. Kafka is a critical piece of the big data puzzle and plays an integral part in many big data pipelines. It is open source and can be downloaded and deployed free of cost. There are also commercial options that provide support and serverless capabilities. It's arguably the most popular messaging platform in the world.

In Kafka's world, there are data publishers called producers, which push messages into Kafka, and subscribers called consumers, which listen to and receive messages. Producers and consumers are the standard terms in the Kafka world for publishers and subscribers.

What capabilities does Kafka provide for data exchange? It collects messages from multiple producers concurrently. It provides persistent storage of the messages received, which gives it fault tolerance. It transports data from producers to consumers. With mirroring capabilities, it can also transport data across networks. It distributes data to multiple concurrent consumers for downstream processing. Finally, it tracks message consumption by each consumer. This ensures at-least-once delivery of messages, even if consumers go down and come back up. We will discuss these features in more detail and see them in action throughout the course.
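
To make the capabilities above concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Kafka's client API: the `ToyTopic` class, its method names, and the consumer names `billing` and `shipping` are all hypothetical and chosen only for illustration. It shows messages from more than one producer being stored in an append-only log, each consumer reading the full stream independently, and per-consumer tracking that lets a restarted consumer pick up unprocessed messages again.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for one topic (hypothetical, not Kafka's API)."""

    def __init__(self):
        self.log = []                      # append-only message storage
        self.committed = defaultdict(int)  # consumer name -> next offset to read

    def publish(self, message):
        """Producer side: append a message and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def poll(self, consumer):
        """Consumer side: return (offset, message) pairs not yet committed."""
        start = self.committed[consumer]
        return list(enumerate(self.log[start:], start=start))

    def commit(self, consumer, offset):
        """Record that the consumer has processed everything up to offset."""
        self.committed[consumer] = offset + 1


topic = ToyTopic()
topic.publish("order-1")  # sent by one producer
topic.publish("order-2")  # sent by another producer

# Two independent consumers each receive the full stream.
billing = topic.poll("billing")
shipping = topic.poll("shipping")

# "billing" processes and commits only the first message, then goes down.
topic.commit("billing", billing[0][0])

# After coming back, "billing" resumes from its committed position, so
# "order-2" is delivered a second time instead of being lost.
redelivered = topic.poll("billing")
print(redelivered)
```

Because the position is committed only after a message has been processed, a failure between processing and committing causes a repeat delivery rather than a loss, which is the trade-off behind at-least-once delivery. Real Kafka adds partitions, replication, and durable on-disk storage, none of which this sketch models.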
