From the course: Stream Processing Design Patterns with Kafka Streams

What is stream processing?

- [Trainer] I will start off by introducing the concept of stream processing in this video. Stream processing is becoming more and more popular for handling big data and delivering actions in real time. Stream processing deals with the ability to understand and process a continuous stream of data and produce insights in real time. Two concepts stand out in this definition: a continuous stream of data, and processing in real time.
What are some of the key characteristics of stream processing that differentiate it from batch processing? First, stream processing deals with unbounded data sets. While processing a given record in the data set, it is not possible to know how many more records exist in the stream; the unbounded stream continues forever. Stream processing is done one record at a time. Each record is inspected, transformed, and analyzed. It is also possible to create windows that provide additional summaries. Computations on streams are real time: records are processed, and insights are generated and pushed to the next stage in real time. Stream processing has low latency, where the entire processing happens in sub-second time intervals. There should be no visible lag from raw data ingestion to delivery of insights. It should enable parallel processing in order to scale across large quantities of data in real time.
Stream processing is done by building pipelines. A pipeline consists of streaming inputs, processing jobs, and streaming outputs. This picture shows a typical processing pipeline. There will be multiple streaming inputs, possibly in different formats. Processing tasks would cleanse, process, and transform input data and then push it to streaming outputs. Inputs may be combined to deliver insights. Outputs of one processing task can become the input for another processing task. A network of tasks helps deliver the goals of the stream processing pipeline.
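To make the pipeline idea concrete, here is a minimal sketch in plain Java. It is not the Kafka Streams API, and every name in it (`StreamPipelineSketch`, `Reading`, `CleanseTask`, `WindowCountTask`) is hypothetical and chosen only for illustration. It assumes a simple sensor-reading record and a one-second tumbling window. A cleansing task handles one record at a time and pushes its output to a second task, which keeps a per-window count as a summary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Illustrative sketch only: plain Java, hypothetical names, not the Kafka Streams API.
final class StreamPipelineSketch {

    /** One record in the stream: a sensor name, a value, and an event time in milliseconds. */
    static final class Reading {
        final String sensor;
        final double value;
        final long timestampMs;

        Reading(String sensor, double value, long timestampMs) {
            this.sensor = sensor;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    /** Processing task 1: cleanses each record, then pushes it to the next task. */
    static final class CleanseTask implements Consumer<Reading> {
        private final Consumer<Reading> downstream;

        CleanseTask(Consumer<Reading> downstream) {
            this.downstream = downstream;
        }

        @Override
        public void accept(Reading r) {
            if (Double.isNaN(r.value)) {
                return; // drop records without a usable value
            }
            // Normalize the sensor name; the output of this task is the input of the next.
            downstream.accept(new Reading(r.sensor.trim().toLowerCase(), r.value, r.timestampMs));
        }
    }

    /** Processing task 2: counts records per tumbling window, keyed by window start time. */
    static final class WindowCountTask implements Consumer<Reading> {
        private final long windowMs;
        private final Map<Long, Integer> counts = new TreeMap<>();

        WindowCountTask(long windowMs) {
            this.windowMs = windowMs;
        }

        @Override
        public void accept(Reading r) {
            // Each record updates the summary as it arrives; no full data set is needed.
            long windowStart = (r.timestampMs / windowMs) * windowMs;
            counts.merge(windowStart, 1, Integer::sum);
        }

        Map<Long, Integer> counts() {
            return counts;
        }
    }

    public static void main(String[] args) {
        WindowCountTask windows = new WindowCountTask(1000L);
        CleanseTask cleanse = new CleanseTask(windows);

        // A finite list stands in for an unbounded stream in this sketch.
        List<Reading> input = Arrays.asList(
                new Reading(" A ", 1.0, 100L),
                new Reading("b", Double.NaN, 200L), // dropped by the cleansing task
                new Reading("a", 2.0, 999L),
                new Reading("a", 3.0, 1000L));

        input.forEach(cleanse); // one record at a time
        System.out.println(windows.counts());
    }
}
```

In a real deployment the list would be replaced by a streaming input that never ends, and the window counts would be pushed to a streaming output rather than printed. The chaining of tasks through a downstream consumer is one simple design choice among many.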
Having discussed the characteristics of stream processing, let's now look at the opportunities and challenges of this technology in the next video.
