Effect of excessive neural network layers on overfitting

Caleb Isaac * and Kourosh Zareinia

University of Alabama, USA.
 
Review Article
World Journal of Advanced Research and Reviews, 2022, 16(02), 1246-1257
Article DOI: 10.30574/wjarr.2022.16.2.1247
Publication history: 
Received on 08 October 2022; revised on 21 November 2022; accepted on 25 November 2022
 
Abstract: 
Artificial intelligence has been transformed by deep neural networks, which can learn complex representations from data. However, an excessive number of layers may lead to overfitting; that is, the model memorizes the training data instead of generalizing to unseen inputs. Networks that are too complex tend to fit noise in addition to meaningful patterns. Vanishing and exploding gradients, increased computational cost, and the curse of dimensionality compound this problem and make deeper networks harder to train effectively.
In this paper, we discuss how neural network layers learn representations and why they become problematic when depth grows large. We further discuss regularization techniques (L1/L2 regularization, dropout, batch normalization, etc.) that mitigate overfitting and promote generalization. In addition, training efficiency and stability can be improved with adaptive learning rate optimizers, gradient clipping, and early stopping. Transfer learning is also introduced as a powerful way of leveraging pre-trained networks while avoiding excessively deep architectures.
We also examine real-world case studies in which deep networks failed to generalize and in which these failures were mitigated with neural architecture search (NAS), sparse networks, and meta-learning. The future of deep learning lies in designing efficient, adaptable, and generalizable models that deliver high performance at an appropriate depth. By applying best practices in architecture design and optimization, researchers can construct robust models that achieve strong accuracy without unnecessary complexity. 
 
Keywords: 
Overfitting in deep learning; Neural network layers; Regularization techniques; Transfer learning; Neural architecture search (NAS)
 