Scalable big data architectures for healthcare analytics using Spark and SQL-based pipelines

Jagadeeswar Alampally *

IQVIA Inc., USA.
 
Research Article
World Journal of Advanced Research and Reviews, 2021, 09(03), 429-434
Article DOI: 10.30574/wjarr.2021.9.3.0122
 
Publication history: 
Received on 21 February 2021; revised on 24 March 2021; accepted on 28 March 2021
 
Abstract: 
The growing volume of healthcare data poses challenges for data processing, storage, and analysis. This paper examines scalable big data solutions for healthcare analytics, focusing on the application of Apache Spark and SQL-based pipelines. The proposed architecture supports real-time analytics on large healthcare datasets by combining Spark’s distributed computing features with SQL-based data transformation. The paper describes the design and implementation of a scalable data pipeline tailored to healthcare applications and its potential to support real-time decision-making, predictive analytics, and health monitoring systems. A performance assessment demonstrates the scalability and performance of the architecture and its capability to process both structured and unstructured data, which opens the way to improved healthcare outcomes and operational efficiency.
 
Keywords: 
Big Data; Healthcare Analytics; Apache Spark; SQL Pipelines; Scalable Architectures; Real-Time Data Processing; Data Management; Healthcare Systems.
 