Hybrid Approach to Classify and Detect Lung Cancer using Neural Network by Transfer Learning and Feature Extraction


Mithlesh Arya, Amita Gandhi, Punam Bhoyar, Varun Malik, Mudassir Khan, Kavita Arora

Abstract

In India, lung cancer is most common in men. Smoking is the main cause of lung cancer, although there are many other causes as well. Early identification can save the lives of lung cancer patients. Several tests can identify lung cancer, such as chest X-ray and computed tomography (CT). In this paper we use two datasets to validate our work. Dataset-I is numeric data collected from surveys, and dataset-II contains CT images of lungs. On the numeric data, which has 15 features, we apply a neural network and an ensemble classifier to classify the data into normal and cancerous classes. The Pearson correlation method is then used to select the most prominent features. The best performance is attained with the XGBoost ensemble classifier, which yields a precision of 95%, a recall of 98%, and an F1 score of 96%. On dataset-II, the CT image dataset, we use two approaches to classify the data into normal, benign, and malignant classes. First, the transfer-learning-based models VGG16 and VGG19 are applied to the dataset. The dataset is divided into three parts: training, validation, and test data. Each model is trained on 80% of the data, validated on 10%, and finally tested on the remaining 10%. The kappa score and F1 score obtained for lung cancer are 98% and 96.43% with VGG16, and 97.4% and 96.2% with VGG19. Second, we design a CNN for feature selection and apply this feature selection method to the flattened images. Random forest, gradient boosting, and support vector classifiers are then applied to the features, after which logistic regression is used to categorize the images into three classes: normal, benign, and malignant. The most prominent results are obtained after feature selection, with an F1 score of 1.00, precision of 1.00, and recall of 1.00. This paper enhances the accuracy of lung cancer identification and outperforms traditional computer vision methods across multiple performance metrics.
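The 80%/10%/10% split and the kappa and F1 scores reported in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names (`split_80_10_10`, `cohen_kappa`, `macro_f1`), the fixed shuffle seed, and the choice of macro-averaging for the multi-class F1 score are all assumptions made for illustration, since the abstract does not state how the split was drawn or how F1 was averaged.

```python
import random
from collections import Counter


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/validation/test (80/10/10).

    The seed and the plain (non-stratified) shuffle are assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10   # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # the remainder goes to the test set
    return train, val, test


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected if predictions were independent of the true labels.
    expected = sum(
        (true_counts[c] / n) * (pred_counts[c] / n) for c in true_counts
    )
    if expected == 1.0:
        return 1.0  # degenerate case: only one class present on both sides
    return (observed - expected) / (1.0 - expected)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With a three-class label list such as `["normal", "benign", "malignant", ...]`, both metrics equal 1.0 only when every prediction matches its true label, which is the situation the abstract reports after feature selection.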
