Contextual Sentiment Boosting through Lexicon Masking and Transformer Fine-Tuning
Abstract
Introduction: In today's advertising environment, sentiment analysis plays a critical role in gathering insights from consumer perspectives. Typical machine learning models rely on predefined features and may struggle with feature weighting, neglect important comments, and misinterpret domain-specific word meanings, especially when applied to small datasets.
Objectives: This study aims to overcome the limitations of traditional sentiment analysis by enhancing feature representation and leveraging deep learning. It also focuses on improving the extraction of sentiment-related features, domain adaptation, and the optimization of model parameters for better performance.
Methods: The proposed approach fine-tunes a BERT model with a focus on sentiment-relevant terms by increasing the fraction of such words that are masked during training, which allows more effective bidirectional contextual learning. Particle Swarm Optimization (PSO) is used for hyperparameter tuning to optimize model performance. Additionally, character-level and subword embeddings are used to handle unknown terms. Transfer learning is applied to improve classification by integrating domain-adapted features.
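As a rough illustration of the masking idea described in the Methods, the following minimal sketch raises the masking probability for tokens found in a sentiment lexicon. It is not the authors' implementation: the lexicon contents, the function name `mask_tokens`, and the 0.5 rate for sentiment terms are all assumptions made for illustration. Only the 0.15 default reflects the standard BERT masking rate. The sketch also simplifies BERT's masking scheme by always substituting `[MASK]` rather than sometimes keeping or randomly replacing the selected token.

```python
import random

# Hypothetical sentiment lexicon; the abstract does not name the one used.
SENTIMENT_LEXICON = {"great", "terrible", "love", "awful", "clean", "dirty"}


def mask_tokens(tokens, lexicon, p_sentiment=0.5, p_other=0.15,
                mask_token="[MASK]", rng=None):
    """Mask tokens for masked-language-model training.

    Tokens in `lexicon` are masked with probability `p_sentiment`
    (an assumed value); all other tokens use `p_other`.
    Returns (masked_tokens, labels), where labels[i] holds the original
    token at masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        p = p_sentiment if tok.lower() in lexicon else p_other
        if rng.random() < p:
            masked.append(mask_token)
            labels.append(tok)   # the model must predict this token
        else:
            masked.append(tok)
            labels.append(None)  # position ignored by the loss
    return masked, labels


if __name__ == "__main__":
    sentence = "the room was clean but the service was awful".split()
    masked, labels = mask_tokens(sentence, SENTIMENT_LEXICON,
                                 rng=random.Random(0))
    print(masked)
    print(labels)
```

In a real pipeline the same selection rule would operate on subword token IDs produced by the BERT tokenizer, and the unmasked positions would typically be assigned an ignore index in the loss rather than `None`.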
Results: The PoS Masking-PSO BERT model achieved 96.6% accuracy on hotel reviews and 95.2% on movie reviews, with F1-scores of 95.6% on both. Compared with baseline BERT, which scored 89% and 93.87%, performance improved significantly. Processing times dropped to 9 and 13 minutes. Optimal results were obtained with 12 encoder layers and 750 hidden units. PoS masking enriched feature context, while PSO enhanced model stability and accuracy.
Conclusions: Combining PoS masking and PSO with BERT strengthens sentiment analysis, effectively addressing domain variation and feature weighting. The model surpasses existing BERT approaches in accuracy and F1-score and reduces processing time, making it suitable for real-time applications. This modified architecture captures deep contextual features, offering a robust solution for cross-domain sentiment analysis.