Multi-Class Lung Disease Classification Using Google's HeAR Foundation Model and MFCC Features: A Comparative Study of Performance and Data Efficiency

Brighton Mukundwi; Delvin Tadiwa Vengesai

doi:10.30574/wjarr.2026.30.2.1459

Brighton Mukundwi ¹,* and Delvin Tadiwa Vengesai ²

¹Department of Data Analytics and Visualisation, Faculty of Computer Science, Yeshiva University; ORCiD: 0009-0003-8516-9656
²Department of Biological Sciences and Ecology, Faculty of Science, University of Zimbabwe; ORCiD: 0009-0001-4948-5729

Research Article

World Journal of Advanced Research and Reviews, 2026, 30(02), 1902-1913

Article DOI: 10.30574/wjarr.2026.30.2.1459

DOI url: https://doi.org/10.30574/wjarr.2026.30.2.1459

Publication history

Received on 10 April 2026; revised on 20 May 2026; accepted on 22 May 2026

Abstract

Over 454 million individuals worldwide suffer from chronic respiratory conditions, with low- and middle-income countries bearing a disproportionate burden due to inadequate diagnostic facilities. Automated classification of respiratory sounds could support early disease detection at scale, yet foundation audio models remain underexplored in this field. This study evaluates Google's Health Acoustic Representations (HeAR) model for multi-class lung disease classification and directly compares it with conventional Mel-frequency cepstral coefficient (MFCC) features across five classifiers, including SVM, Gradient Boosting, and MLP. Using the Asthma Detection Dataset Version 2, audio was segmented into 3,602 two-second clips, balanced with SMOTE, and evaluated on a stratified held-out test set. In a data efficiency experiment, model performance was assessed using training data fractions ranging from 10% to 100%. The best HeAR-based model, an MLP, achieved a macro F1-score of 84.5% and an accuracy of 86.4%, while MFCC features combined with Gradient Boosting produced the highest overall performance, with 88% accuracy and an 87% macro F1-score. With linear classifiers, HeAR embeddings consistently outperformed MFCC features under limited-data conditions: at 10% training data, HeAR SVM achieved a macro F1-score of 70%, compared to 55.8% for MFCC SVM. While MFCC features with non-linear ensembles delivered superior peak performance on controlled single-source data, HeAR embeddings produced a more linearly separable feature space, which enables stable classification with less labelled data and makes them suitable for resource-limited clinical settings.
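
The data efficiency protocol and the macro F1 metric described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the per-class rounding rule, and the toy labels are assumptions made for illustration, and the abstract does not specify how the training fractions were drawn. The sketch draws a class-stratified subsample of the training indices for a given fraction and computes macro F1 as the unweighted mean of per-class F1 scores.

```python
import random
from collections import defaultdict


def stratified_fraction(labels, fraction, seed=0):
    """Return sorted indices of a class-stratified subsample.

    Assumption: each class keeps round(fraction * class_size) items,
    with at least one item per class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical class names, not the dataset's label set).
train_labels = ["asthma"] * 10 + ["healthy"] * 20
subset = stratified_fraction(train_labels, 0.1)
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

In a full experiment, a classifier would be refit on each subsample (10% through 100%) and scored on the same held-out test set, so that the curves for HeAR embeddings and MFCC features are directly comparable. Because macro F1 weights every class equally, it is less forgiving of poor minority-class performance than accuracy.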

Keywords

Health Acoustic Representations; MFCC Features; Respiratory Sound Analysis; Digital Health; Low-Resource Settings

Download Article PDF

https://wjarr.com/sites/default/files/fulltext_pdf/WJARR-2026-1459.pdf

How to cite this article

Brighton Mukundwi and Delvin Tadiwa Vengesai. Multi-Class Lung Disease Classification Using Google's HeAR Foundation Model and MFCC Features: A Comparative Study of Performance and Data Efficiency. World Journal of Advanced Research and Reviews, 2026, 30(02), 1902-1913. Article DOI: https://doi.org/10.30574/wjarr.2026.30.2.1459
