Sentiment analysis of passenger feedback on U.S. airlines using machine learning classification methods

Md Nurul Raihen; Sultana Akter

doi:10.30574/wjarr.2024.23.1.2183

Md Nurul Raihen ¹,* and Sultana Akter ²

¹Department of Mathematics and Computer Science, Fontbonne University, Saint Louis, MO, USA.

²Institute for Data Science and Informatics, University of Missouri Columbia, Columbia, MO, USA.

Research Article

World Journal of Advanced Research and Reviews, 2024, 23(01), 2260–2273

DOI url: https://doi.org/10.30574/wjarr.2024.23.1.2183

Publication history:

Received on 12 June 2024; revised on 18 July 2024; accepted on 20 July 2024

Abstract:

Twitter, a micro-blogging platform, has emerged as a novel information architecture. Every day, people worldwide publish about 200 million status messages, known as tweets, in which they express their opinions as concise text. This makes Twitter data, including consumer feedback tweets, useful for sentiment analysis. Sentiment analysis (or opinion mining) is a natural language processing (NLP) technique that classifies the tone of data as positive, negative, or neutral. This study applied multi-class sentiment analysis to tweets about six major US airlines (American, United, US Airways, Southwest, Delta, and Virgin America). Airlines are essential for travel, and this study is intended to help people choose the best ones. Analyzing airline evaluations can help travelers identify good airlines and apply the model to their own journeys, and it helps airlines identify their weaknesses so they can address them. The analysis was conducted with seven classification methods: Linear Discriminant Analysis, Quadratic Discriminant Analysis, Decision Tree, Random Forest, K-Nearest Neighbors, Gradient Boosting, and AdaBoost. Each was evaluated with split validation (80% training set, 20% test set) and with 10-fold cross-validation. The best-performing model was Random Forest with 10-fold cross-validation, which achieved an accuracy of 90.13% and outperformed all other models. The classification model with the lowest error rate could help airline companies improve their business by revealing why information is being misclassified. The project therefore aims to use machine learning techniques to estimate the reasons for misclassification, since the lowest error rate means the sentiment model makes the fewest wrong predictions.
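
The two validation schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes synthetic three-class labels in place of the airline tweets and uses a placeholder majority-class predictor in place of the seven classifiers. The helper names (`train_test_split_indices`, `k_fold_indices`, `majority_class`, `error_rate`) are hypothetical and exist only to show how an 80/20 split and 10-fold cross-validation partition the data and how an error rate is computed.

```python
import random
from collections import Counter

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def majority_class(labels):
    """Placeholder 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

def error_rate(y_true, y_pred):
    """Fraction of predictions that disagree with the true labels."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)

# Synthetic three-class labels (illustrative only, not the airline data).
rng = random.Random(42)
labels = [rng.choice(["negative", "neutral", "positive"]) for _ in range(200)]

# Split validation: 80% train, 20% test.
train_idx, test_idx = train_test_split_indices(len(labels))
pred = majority_class([labels[i] for i in train_idx])
split_error = error_rate([labels[i] for i in test_idx], [pred] * len(test_idx))

# 10-fold cross-validation: average the per-fold error rates.
fold_errors = []
for tr, te in k_fold_indices(len(labels), k=10):
    pred = majority_class([labels[i] for i in tr])
    fold_errors.append(error_rate([labels[i] for i in te], [pred] * len(te)))
cv_error = sum(fold_errors) / len(fold_errors)

print(f"split error: {split_error:.3f}, 10-fold CV error: {cv_error:.3f}")
```

In this sketch, accuracy is simply one minus the error rate, so a model reported at 90.13% accuracy would correspond to an error rate of 9.87% under the same definition. In cross-validation every sample is used for testing exactly once, which is why the averaged fold error is usually a steadier estimate than a single 80/20 split.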

Keywords:

Twitter; Airlines; Classification; Error Rate; Validation
