AI/ML-Based Data Sensitivity Classification: A Technical Framework

Avinash Reddy Thimmareddy

doi:10.52783/jisem.v11i1s.14320

Published: Jan 5, 2026

Keywords:

Data Sensitivity Classification, Machine Learning, Natural Language Processing, Personally Identifiable Information, Protected Health Information

Abstract

Organizations face mounting challenges in identifying and protecting sensitive data as their information systems grow more complex. Conventional rule-based classification struggles with the varied schemas, ambiguous metadata, and dynamic data structures that typify modern enterprise environments. This framework introduces an automated system that uses artificial intelligence and machine learning techniques to classify data sensitivity at scale. It performs multi-dimensional feature extraction over column names, descriptions, table contexts, data types, and semantic embeddings. Transformer-based models such as BERT, together with classical techniques such as TF-IDF and ensemble classifiers, convert textual metadata into representations suitable for prediction. A multi-class architecture distinguishes Personally Identifiable Information (PII), Protected Health Information (PHI), and general sensitivity categories, aligning with regulatory requirements under GDPR, HIPAA, and CCPA. Rigorous evaluation using precision, recall, F1-score, and confusion-matrix analysis verifies production-quality performance on the imbalanced datasets characteristic of enterprise data catalogs. The framework substantially reduces manual classification effort while maintaining high accuracy, enabling automated policy enforcement, faster compliance initiatives, and more mature data governance. Applications in the healthcare, financial services, and technology sectors have shown significant benefits in the form of lower compliance risk, lower operational overhead, and improved data protection capabilities in support of digital transformation goals.
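
The evaluation step named in the abstract (precision, recall, F1-score, and a confusion matrix on imbalanced data) can be illustrated with a minimal sketch in plain Python. The label names, the toy predictions, and the helper functions below are hypothetical and chosen only for illustration; they are not the article's dataset or implementation. The point of the sketch is that overall accuracy can look acceptable while a rare class such as Protected Health Information is missed entirely, which is why per-class metrics matter on uneven catalogs.

```python
from collections import Counter

# Hypothetical label set for illustration; the article's exact classes may differ.
LABELS = ["PII", "PHI", "GENERAL"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, labels=LABELS):
    """Compute precision, recall, and F1 for each class from the matrix."""
    result = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # true class i, predicted other
        fp = sum(row[i] for row in matrix) - tp   # predicted class i, true other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        result[label] = {"precision": precision, "recall": recall, "f1": f1}
    return result


def macro_f1(metrics):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Toy, imbalanced example (made-up data): 8 general columns, 3 PII, 1 PHI.
y_true = ["GENERAL"] * 8 + ["PII"] * 3 + ["PHI"]
y_pred = ["GENERAL"] * 8 + ["PII", "PII", "GENERAL"] + ["GENERAL"]

matrix = confusion_matrix(y_true, y_pred)
metrics = per_class_metrics(matrix)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("confusion matrix:", matrix)
print("accuracy:", round(accuracy, 3))
for label in LABELS:
    print(label, {k: round(v, 3) for k, v in metrics[label].items()})
print("macro F1:", round(macro_f1(metrics), 3))
```

In this toy run accuracy is 10 of 12, yet recall for the single PHI column is zero and macro F1 is much lower than accuracy, which is the kind of gap that per-class reporting is meant to expose.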

Issue

Vol. 11 No. 1s (2026)

Section

Articles

Journal of Information Systems Engineering and Management
