A data-driven approach to gas demand prediction in the USA using machine learning

Nayem Uddin Prince 1, *, Mohd Abdullah Al Mamun 2, Md Mehedi Hassan Melon 3, Anwar Hossain 4, Yasin Arafat 5 and Mohammad Amit Hasan 6

1 Student, Department of Computer Science and Engineering, Daffodil International University, Bangladesh.
2 Student, Department of Business Administration, BRAC University, Bangladesh.
3 Student, Department of Electrical and Automation Engineering, Nanjing Tech University, China.
4 Student, Department of Electrical and Electronic Engineering, University of Asia Pacific, Bangladesh.
5 Student, Department of Business Administration, North South University, Bangladesh.
6 Student, Department of Business Administration, University of Liberal Arts, Bangladesh.
 
Research Article
World Journal of Advanced Research and Reviews, 2020, 05(02), 193–203
Article DOI: 10.30574/wjarr.2020.5.2.0002
 
Publication history: 
Received on 05 January 2020; revised on 17 February 2020; accepted on 20 February 2020
 
Abstract: 
Natural gas is crucial for electricity generation, industrial production, and residential heating in the United States. Demand is difficult to forecast because it depends on economic fluctuations, energy price instability, and seasonal temperature variations. Conventional approaches such as time series analysis and linear regression may fail to capture the nonlinear interactions in gas demand data, which can lead to inaccurate forecasts. This work addresses the need for precise gas demand projections, which are essential for energy planning and resource management. It applies decision trees, linear regression, gradient boosting, and random forests to data from 10 major US states spanning 2000 to 2019. The random forest model predicted demand patterns accurately, achieving an R-squared of 99.67% and a root mean square error (RMSE) of 34.53. These findings indicate that machine learning can capture nonlinear relationships in gas demand data. The study provides a framework for improved demand forecasting that can help energy providers and legislators optimise resource allocation, increase cost-efficiency, and promote environmental sustainability.
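
The two evaluation metrics quoted above can be illustrated with a minimal sketch. It assumes the standard textbook definitions of RMSE and R-squared, since the abstract does not state how the metrics were computed. The function names and the demand values are hypothetical and are not taken from the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, expressed in the units of the target variable."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical demand observations and model predictions (illustration only).
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

print(f"RMSE: {rmse(actual, predicted):.2f}")                   # RMSE: 10.00
print(f"R-squared: {r_squared(actual, predicted) * 100:.2f}%")  # R-squared: 96.80%
```

RMSE is reported in the units of the demand variable, so a figure such as 34.53 is only meaningful relative to the scale of the data, whereas R-squared is unitless and is often quoted as a percentage.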
 
Keywords: 
R-squared; RMSE; XGBoost; MLP; Scatter plot; Demand; Null value; Label Encoding; Distribution; State; Statistical.
 