An Adaptive Approach for Training Deep Belief Networks
Abstract
Deep Belief Networks (DBNs) are stacks of networks, each layer of which learns distinct characteristics and attributes of the original data. Their layer-wise architecture allows DBNs to handle both supervised and unsupervised tasks. This article presents an adaptive approach for training DBNs and analyzes the various algorithms used in the training process. The paper begins with the pre-training phase, in which Restricted Boltzmann Machines (RBMs) play a central role. We review the Contrastive Divergence (CD) and Persistent Contrastive Divergence (PCD) algorithms, highlighting their respective advantages and disadvantages for initializing deep belief networks, with particular attention to their suitability for different data types and scales. Moving to the fine-tuning stage, the paper explores the use of backpropagation with gradient descent. Architectural variants of DBNs, such as CDBNs and RDBNs, and their respective areas of application are also discussed: CDBNs achieve over 95% accuracy on standard image classification benchmarks such as MNIST and ImageNet, while RDBNs achieve over 90% accuracy on sentiment analysis and 85% on speech recognition for longer audio sequences. We highlight the adaptation of DBNs to specific tasks, including classification, regression, clustering, and generative modeling. Finally, we compare the training complexity of the proposed algorithm with that of existing algorithms and find that it outperforms them, training the model in 500 seconds.
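
To make the pre-training step concrete, the following is a minimal Python sketch of CD-k for a binary RBM, with an optional persistent chain that turns the same update into PCD-k. It is an illustration of the standard algorithms reviewed above, not the paper's proposed adaptive method, and all names (RBM, cd_step, sample_h, sample_v, persistent_chain) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """A minimal binary Restricted Boltzmann Machine."""
    def __init__(self, n_visible, n_hidden, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def sample_h(self, v):
        """Hidden activation probabilities and a binary sample, given visibles."""
        p = sigmoid(v @ self.W + self.b_h)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        """Visible activation probabilities and a binary sample, given hiddens."""
        p = sigmoid(h @ self.W.T + self.b_v)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def cd_step(self, v0, lr=0.1, k=1, persistent_chain=None):
        """One CD-k (or PCD-k) gradient update on a mini-batch v0."""
        ph0, h0 = self.sample_h(v0)
        # CD restarts the Gibbs chain at the data each update;
        # PCD keeps a chain alive across mini-batches instead.
        h = h0 if persistent_chain is None else persistent_chain
        for _ in range(k):
            pv, v = self.sample_v(h)
            ph, h = self.sample_h(v)
        batch = v0.shape[0]
        # Positive (data) statistics minus negative (model) statistics.
        self.W   += lr * (v0.T @ ph0 - v.T @ ph) / batch
        self.b_v += lr * (v0 - v).mean(axis=0)
        self.b_h += lr * (ph0 - ph).mean(axis=0)
        return h  # returned so PCD can carry the chain forward
```

Calling cd_step with persistent_chain=None on every mini-batch gives standard CD-k; feeding the returned hidden state back in on the next call gives PCD-k, whose negative chain is never reset and therefore mixes over the whole training run.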