VISNET: An Efficient Lightweight Hybrid Model for Early Detection of Breast Tumours in Ultrasound Images Using Vision Transformer and Convolutional Neural Networks


Archana Singh, Surya Prakash Mishra, Prateek Singh, Anshuka Srivastava

Abstract

Breast cancer is a leading cause of mortality among women worldwide, underscoring the critical need for early and accurate diagnosis. Ultrasound imaging, a widely used diagnostic tool, presents challenges such as noise, shadowing, low contrast, and variability in tumour presentation. Deep learning models, especially Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have achieved impressive results in medical image analysis. For breast tumour detection, we propose VISNET, an efficient lightweight hybrid model that combines EfficientNetB0 with a ViT. In our model, the transformer captures long-range dependencies, while the CNN extracts local features. An attention-based feature fusion module is incorporated to improve representation learning. To substantiate these claims, we trained and tested the model on two datasets. In experiments evaluated on seven metrics using the UDIAT (Spain) Breast Ultrasound Dataset B, our hybrid model outperforms state-of-the-art CNNs (ResNet50, VGG16, EfficientNetB0) and a ViT model, achieving 96.9% accuracy, 95.83% precision, 97.73% recall, 96.67% F1-score, 97.74% sensitivity, 97.72% specificity, and 98.67% AUC. On the Baheya (Egypt) dataset, it achieves 97.6% accuracy, 97.51% precision, 96.79% recall, 97.14% F1-score, 96.8% sensitivity, 96.79% specificity, and 99.82% AUC. In practice, VISNET lives up to its lightweight claim: it runs with minimal GPU support, offers a viable way to improve the accuracy of breast tumour classification, and yields faster results than existing heavier CNN models.
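The abstract describes a two-branch design: a CNN branch for local features, a ViT branch for long-range dependencies, and an attention-based fusion of the two feature vectors before classification. The sketch below is a hypothetical PyTorch illustration of that pattern only, not the authors' implementation: a tiny convolutional stem stands in for EfficientNetB0, a small `nn.TransformerEncoder` stands in for the ViT, and the gating fusion is one plausible form of attention-based feature fusion.

```python
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Gated (attention-weighted) fusion of CNN and ViT feature vectors.

    One illustrative choice of fusion; the paper's exact mechanism may differ.
    """

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, f_cnn, f_vit):
        a = self.gate(torch.cat([f_cnn, f_vit], dim=-1))  # per-channel weights in (0, 1)
        return a * f_cnn + (1 - a) * f_vit


class VISNETSketch(nn.Module):
    """Hypothetical hybrid: conv stem (stand-in for EfficientNetB0) + ViT branch."""

    def __init__(self, dim=64, num_classes=2):
        super().__init__()
        # CNN branch: local feature extraction, pooled to one vector per image.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # ViT branch: 16x16 patch embedding followed by transformer encoder layers.
        self.patch = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.vit = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.fuse = AttentionFusion(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        f_cnn = self.cnn(x)                                  # (B, dim)
        tokens = self.patch(x).flatten(2).transpose(1, 2)    # (B, num_patches, dim)
        f_vit = self.vit(tokens).mean(dim=1)                 # mean-pool tokens -> (B, dim)
        return self.head(self.fuse(f_cnn, f_vit))            # (B, num_classes)


model = VISNETSketch()
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 2])
```

The gate lets the classifier lean on whichever branch is more informative per channel, which is the intuition behind fusing local CNN texture cues with global transformer context.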
