World Journal of Advanced Research and Reviews

eISSN: 2581-9615 || CODEN: WJARAI || Impact Factor 8.2

Optimizing Real-Time Data Pipelines for Machine Learning: A Comparative Study of Stream Processing Architectures

Sreenivasulu Ramisetty 1, *, Thirupurasundari Chandrasekaran 2, Vamsi Krishna Eruvaram 3, Mohan Raja Pulicharla 4

1 Data Architect, USA
2 Sr Project Manager
3 Data Engineer, USA
4 Data Engineer Staff, USA
 
Review Article
World Journal of Advanced Research and Reviews, 2024, 23(03), 1653-1660
Article DOI: 10.30574/wjarr.2024.23.3.2818
DOI url: https://doi.org/10.30574/wjarr.2024.23.3.2818
 
Received on 04 August 2024; revised on 11 September 2024; accepted on 13 September 2024
 
In the era of big data and real-time analytics, optimizing data pipelines for machine learning is critical for timely and accurate insights. This study analyzes the performance and scalability of Apache Kafka Streams, Apache Flink, and Apache Pulsar in real-time machine-learning applications. Despite the wide use of these technologies, a comprehensive comparative analysis of their efficiency in practical scenarios is lacking. This research addresses that gap by providing a detailed comparison of these frameworks, focusing on latency, throughput, and resource utilization.
We conducted benchmarks and tests to assess each framework's performance in handling high-throughput data, delivering real-time predictions, and managing resource utilization. Our results show that Apache Flink achieves 25% lower end-to-end latency than Kafka Streams in high-throughput scenarios. Apache Pulsar excels in scalability, handling up to 1.5 million messages per second, whereas Kafka Streams shows 15% higher memory utilization.
These findings highlight the strengths and limitations of each framework. Kafka Streams integrates well with Kafka's messaging system but may have higher latency under heavy loads. Flink offers superior low-latency, high-throughput performance, making it suitable for complex tasks. Pulsar's advanced messaging features and scalability are promising for large-scale applications, though it requires careful tuning. This comparative analysis provides practical insights for choosing the optimal stream processing framework for machine learning pipelines.
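
As an illustration only, the metrics named in the abstract (end-to-end latency, throughput, and a relative latency reduction such as "25% lower") could be computed from per-message timestamps roughly as follows. This is a minimal sketch, not the authors' benchmark harness, which this page does not describe. The function names `summarize` and `relative_reduction`, the `(produced_at, consumed_at)` record shape, and the nearest-rank percentile choice are all assumptions made for the example.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(events):
    """Summarize a run from (produced_at, consumed_at) timestamps in seconds.

    Hypothetical record shape: each tuple marks when a message entered the
    pipeline and when its result left it.
    """
    latencies = [consumed - produced for produced, consumed in events]
    # Wall-clock span from the first produce to the last consume.
    window = max(c for _, c in events) - min(p for p, _ in events)
    return {
        "mean_latency_s": mean(latencies),
        "p95_latency_s": percentile(latencies, 95),
        "throughput_msgs_per_s": len(events) / window if window > 0 else float("inf"),
    }


def relative_reduction(baseline, candidate):
    """Fractional reduction of candidate versus baseline (0.25 means 25% lower)."""
    return (baseline - candidate) / baseline


if __name__ == "__main__":
    # Synthetic timestamps, invented for the example (not data from the study).
    run = [(0.0, 0.10), (1.0, 1.20), (2.0, 2.30), (3.0, 3.40)]
    print(summarize(run))
    print(relative_reduction(100.0, 75.0))
```

Reporting a tail percentile alongside the mean is a common choice for latency comparisons, since averages can hide the slow messages that matter most for real-time predictions.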
 
Real-time ML pipelines; Kafka Streams performance; Flink vs Kafka latency; High-throughput stream processing; Pulsar scalability ML; Stream processing comparison
 
https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2024-2818.pdf

Sreenivasulu Ramisetty, Thirupurasundari Chandrasekaran, Vamsi Krishna Eruvaram and Mohan Raja Pulicharla. Optimizing Real-Time Data Pipelines for Machine Learning: A Comparative Study of Stream Processing Architectures. World Journal of Advanced Research and Reviews, 2024, 23(3), 1653-1660. Article DOI: https://doi.org/10.30574/wjarr.2024.23.3.2818

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.

