Context-Aware Transformer Models for Ambiguous Word Classification in Code-Mixed Sentiment Analysis
Abstract
Introduction: In text-based sentiment analysis, words that carry more than one meaning introduce ambiguity, which makes sentiment difficult to determine. Deep learning models have achieved effective classification of ambiguous words, but traditional models suffer from limitations in accuracy and efficiency because they ignore contextual features and lack parallelization.
Objectives: To classify ambiguous words in code-mixed sentiment analysis using context-aware transformer models, improving accuracy, efficiency, and adaptability while addressing limitations in traditional methods.
Methods: This study employs transformer-based models (DistilBERT, IndicBERT, XLM-RoBERTa, TinyBERT) to classify ambiguous words in code-mixed text. Data preprocessing techniques, including stopword removal, stemming, and lemmatization, prepare inputs for training. Each model's performance is evaluated using accuracy, precision, recall, and F1-score, and the models are compared to identify the best-suited approach for ambiguous word classification.
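The preprocessing steps named above (tokenization, stopword removal, and stemming) can be sketched as follows. This is a minimal, dependency-free illustration, not the authors' pipeline: the stopword set and the suffix rules are hypothetical stand-ins, and a real code-mixed pipeline would use language-specific stopword lists and a proper stemmer or lemmatizer (e.g. from NLTK or spaCy) for each language in the mix.

```python
import re

# Illustrative stopword list (hypothetical): code-mixed text needs
# stopwords from every language present, e.g. English and Hindi.
STOPWORDS = {"the", "is", "a", "an", "and", "yeh", "hai", "to"}

def suffix_stem(token):
    # Crude rule-based stemming; a stand-in for a real stemmer/lemmatizer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    # Lowercase, keep alphabetic tokens, drop stopwords, then stem.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [suffix_stem(t) for t in tokens if t not in STOPWORDS]

# Example on a Hinglish-style code-mixed sentence.
print(preprocess("Yeh movie is amazing, acting was stunning"))
# → ['movie', 'amaz', 'act', 'was', 'stunn']
```

The cleaned token stream would then be re-joined and passed to the model's own subword tokenizer; transformer models such as DistilBERT apply WordPiece-style tokenization on top of whatever surface cleanup is done here.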
Results: The study demonstrates the effectiveness of transformer models for context-aware ambiguous word classification in code-mixed sentiment analysis. DistilBERT outperforms the other models, with an accuracy of 88.75%, precision of 92.87%, recall of 88.75%, and F1-score of 90.39%. Its lightweight architecture ensures faster inference and lower memory usage than IndicBERT, TinyBERT, and XLM-RoBERTa.
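For reference, the reported metrics are defined as below. In practice these are computed with a library such as `sklearn.metrics`; this is a dependency-free sketch of the per-class definitions, using a made-up two-class example (the labels and predictions are illustrative, not the study's data).

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def scores(y_true, y_pred, positive):
    # Confusion counts with one class treated as "positive".
    tp = sum(t == positive == p for t, p in zip(y_true, y_pred))
    fp = sum(p == positive != t for t, p in zip(y_true, y_pred))
    fn = sum(t == positive != p for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative labels only.
y_true = ["pos", "neg", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg"]
print(accuracy(y_true, y_pred))          # → 0.6
print(scores(y_true, y_pred, "pos"))     # precision, recall, F1 for "pos"
```

Multi-class results like those reported here are typically aggregated per class and then averaged (macro or weighted by class support); which averaging is used affects the headline numbers.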
Conclusions: The findings confirm DistilBERT's robustness and reliability for code-mixed language tasks, improving on conventional methods in both accuracy and efficiency for sentiment analysis. Context-aware transformer models are well suited to ambiguous word classification in code-mixed sentiment analysis, combining high precision, efficiency, and applicability to multilingual challenges.