World Journal of Advanced Research and Reviews

eISSN: 2581-9615 || CODEN: WJARAI || Impact Factor 8.2

Designing resilient, low-latency data pipelines for streaming big data analytics using Apache Kafka and Spark ecosystems

Uju Ugonna Uzoagu *

Department of Computer Science, College of Computing and Software Engineering, Kennesaw State University, USA.

Review Article

World Journal of Advanced Research and Reviews, 2025, 27(03), 1856-1873

Article DOI: 10.30574/wjarr.2025.27.3.3369

DOI url: https://doi.org/10.30574/wjarr.2025.27.3.3369

Received on 21 August 2025; revised on 26 September 2025; accepted on 29 September 2025

The exponential growth of real-time data streams from digital platforms, Internet of Things (IoT) devices, and enterprise applications has redefined the requirements for big data analytics. Traditional batch-processing architectures, while robust for historical analysis, increasingly fall short of the low-latency decision-making required in sectors such as finance, healthcare, telecommunications, and e-commerce. Consequently, resilient streaming data pipelines have become critical for fault-tolerant, scalable, and high-throughput analytics. This study explores the design and implementation of resilient, low-latency data pipelines for streaming big data analytics using the Apache Kafka and Apache Spark ecosystems. Kafka, a distributed publish-subscribe messaging system, provides durable, fault-tolerant, and scalable ingestion, while Spark Structured Streaming delivers near real-time analytical processing and integrates with machine learning workflows. Together, these technologies form a complementary foundation for streaming pipelines that handle large volumes of high-velocity data. The paper discusses architectural design principles, including partitioning strategies, replication for fault tolerance, stateful stream processing, and backpressure handling. It further evaluates techniques for end-to-end resilience, such as exactly-once semantics, checkpointing, and deployment in containerized environments such as Kubernetes for scalability. Case study insights report latency benchmarks and system performance under varying workloads, showing how the Kafka-Spark integration supports enterprise-grade analytics. By uniting resilience, scalability, and analytical depth, the proposed pipeline framework enables organizations to obtain real-time insights while remaining reliable under fluctuating conditions. The findings offer practical guidelines for architects, engineers, and decision-makers seeking to operationalize streaming analytics infrastructures that meet the growing demands of modern data-driven enterprises.
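Two of the mechanisms named in the abstract, key-based partitioning and checkpointing for exactly-once results, can be illustrated with a small toy model. The sketch below is not taken from the paper and uses neither Kafka nor Spark: `partition_for`, `build_log`, `CheckpointedCounter`, and `run_demo` are hypothetical names, and `zlib.crc32` is only a stand-in for a real partitioner. It assumes records within a partition are processed in offset order, and it shows one way that saving state together with input offsets lets a replay after a failure skip records that are already counted.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically.

    crc32 is an illustrative stand-in, not the hash a real broker client uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_log(keys, num_partitions):
    """Turn a sequence of keys into (partition, offset, key) records."""
    next_offset = defaultdict(int)
    log = []
    for key in keys:
        p = partition_for(key, num_partitions)
        log.append((p, next_offset[p], key))
        next_offset[p] += 1
    return log


class CheckpointedCounter:
    """Stateful per-key counter whose state and offsets are saved together."""

    def __init__(self):
        self.counts = defaultdict(int)  # working state
        self.offsets = {}               # partition -> next offset expected
        self._snapshot = ({}, {})       # last checkpoint: (counts, offsets)

    def process(self, partition, offset, key):
        # Skip records already reflected in the state (replay after recovery).
        if offset < self.offsets.get(partition, 0):
            return False
        self.counts[key] += 1
        self.offsets[partition] = offset + 1
        return True

    def checkpoint(self):
        # State and offsets are captured in one step, so they cannot disagree.
        self._snapshot = (dict(self.counts), dict(self.offsets))

    def recover(self):
        # Discard uncheckpointed work and return to the last snapshot.
        counts, offsets = self._snapshot
        self.counts = defaultdict(int, counts)
        self.offsets = dict(offsets)


def run_demo():
    log = build_log(["a", "b", "a", "c", "b", "a"], num_partitions=3)
    counter = CheckpointedCounter()
    for record in log[:4]:
        counter.process(*record)
    counter.checkpoint()
    for record in log[4:]:
        counter.process(*record)  # this work is lost in the simulated crash
    counter.recover()
    for record in log:            # replay the whole log from the start
        counter.process(*record)
    return dict(counter.counts)
```

Calling `run_demo()` yields one count per key equal to the number of times that key appears in the input, even though every record is delivered at least twice. Production systems add durable storage, idempotent or transactional sinks, and coordination across workers, none of which this sketch models.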

Keywords: Streaming data pipelines; Apache Kafka; Apache Spark; Big data analytics; Low-latency processing; Resilient architectures

https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2025-3369.pdf

Uju Ugonna Uzoagu. Designing resilient, low-latency data pipelines for streaming big data analytics using Apache Kafka and Spark ecosystems. World Journal of Advanced Research and Reviews, 2025, 27(3), 1856-1873. Article DOI: https://doi.org/10.30574/wjarr.2025.27.3.3369

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.


All statements, opinions, and data contained in this publication are solely those of the individual author(s) and contributor(s). The journal, editors, reviewers, and publisher disclaim any responsibility or liability for the content, including accuracy, completeness, or any consequences arising from its use.
