An Integrated Model of Natural Language Processing and Machine Learning (INM) for Autism Detection and classification from Random Symptom
Abstract
Autism Spectrum Disorder (ASD) is a neurological disability that affects the daily life of autistic children. Parents typically describe symptoms in their vernacular language and may report only one or two symptoms out of many, which makes the disorder difficult to recognize at an early stage. Although early detection is challenging, our proposed Integrated Natural Language Processing and Machine Learning (INM) model can help minimize the severity of the condition. We collected data on autistic children from the “Total Solution Rehabilitation Society, Hyderabad, India”. The dataset contains many similarly worded symptoms, which may confound a machine learning model for autism detection and classification. We experimented with several NLP techniques, namely Bag of Words (BoW), Bag of N-grams, TF-IDF, and One-Hot Encoding, to normalize the dataset using the highest cosine similarity index. BoW outperformed the others, and we replaced similar-meaning symptoms with a single unique symptom. The proposed model accepts any random symptom from a parent as a preliminary symptom. The framework then uses association rules to generate frequent symptom sets with minimum support and maximum confidence, identifying the symptoms most strongly associated with the preliminary symptom. Each frequent symptom set is then classified using several ML classification techniques, namely Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), AdaBoost (AB), and Linear Discriminant Analysis (LDA), to identify the autism type. Finally, the results are evaluated with several statistical metrics: Accuracy, Precision, Recall, F1-Score, and Matthews Correlation Coefficient (MCC). The experimental results show that the Random Forest classifier detected the autism type with a maximum accuracy of 90.48%, precision of 96.67%, recall of 91.53%, F1-score of 93.29%, and MCC of 87.94%.
The proposed INM model supports the decision-making of autism health care practitioners when examining ASD cases. The model was tested at the “Total Solution Rehabilitation Society, Hyderabad, India” on 1896 autistic children and achieved 99% accuracy in identifying ASD types.
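The symptom-normalization step described in the abstract (Bag of Words vectors compared by cosine similarity, with similar-meaning symptoms replaced by one unique symptom) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the example symptom phrases, and the 0.5 similarity threshold are all hypothetical and do not come from the study's dataset.

```python
import math
from collections import Counter


def bow_vector(text):
    """Bag of Words: map a phrase to its lowercase token counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_symptom(phrase, canonical_symptoms, threshold=0.5):
    """Return the canonical symptom most similar to `phrase`.

    If no canonical symptom reaches `threshold` (a hypothetical cut-off),
    the phrase is returned unchanged.
    """
    vec = bow_vector(phrase)
    best, best_score = phrase, 0.0
    for candidate in canonical_symptoms:
        score = cosine_similarity(vec, bow_vector(candidate))
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else phrase


# Hypothetical canonical symptoms, for illustration only.
canonical = ["poor eye contact", "delayed speech", "repetitive hand movements"]

print(normalize_symptom("very poor eye contact", canonical))
print(normalize_symptom("likes music", canonical))
```

In this sketch, a parent's phrasing that shares most of its words with a canonical symptom is mapped onto that symptom, while an unrelated phrase is left as it is. A real pipeline would likely need tokenization and stop-word handling suited to the vernacular input.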