Reinforcement Learning for Secure Applications: Integrating ML and Data Engineering for Cloud Security
Abstract
As cloud computing becomes increasingly integral to modern digital infrastructure, ensuring robust and adaptive security for cloud-based applications has become a critical challenge. Traditional rule-based and supervised machine learning approaches often fail to keep pace with dynamic, evolving threats. This study proposes a novel framework that integrates Reinforcement Learning (RL), Machine Learning (ML), and real-time Data Engineering to develop secure, autonomous applications for cloud environments. Utilizing Q-learning and Deep Q-Network (DQN) models, the framework dynamically detects and mitigates threats such as brute-force attacks, privilege escalations, and denial-of-service attempts. The RL agents are trained on real-time telemetry data streams processed through a scalable Kafka-Spark pipeline, enabling continuous learning and policy optimization. Comparative evaluations show that DQN achieves the highest detection performance, with an accuracy of 98.3% and an F1-score of 0.973, significantly outperforming traditional ML models. Statistical analysis confirms the superiority of RL agents across precision, recall, and true-positive rate. Additionally, the data engineering pipeline supports high-throughput, low-latency processing, which is essential for scalable deployment. The study concludes that integrating reinforcement learning with ML preprocessing and data engineering offers a transformative approach to cloud security, delivering intelligent, proactive, and self-adaptive protection mechanisms. This framework has broad implications for securing multi-cloud and containerized environments in real time, setting the foundation for future autonomous cybersecurity solutions.
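To make the abstract's RL component concrete, the sketch below shows a minimal DQN-style threat-response agent of the kind described: a small Q-network maps a telemetry feature vector to Q-values over mitigation actions, trained with experience replay and epsilon-greedy exploration (no separate target network, for brevity). All names here (the action set, feature dimension, reward values, and the synthetic environment stand-in) are illustrative assumptions and not the paper's implementation, which trains on real telemetry streamed through the Kafka-Spark pipeline.

```python
# Minimal DQN-style sketch of a threat-response agent.
# Assumptions (not from the paper): action set, STATE_DIM, reward shaping,
# and the synthetic environment that replaces the real telemetry stream.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

ACTIONS = ["allow", "rate_limit", "block"]   # hypothetical mitigation actions
STATE_DIM = 8                                # assumed telemetry feature count


class QNet(nn.Module):
    """Small MLP mapping a telemetry feature vector to Q-values per action."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)


def synthetic_step(state: np.ndarray, action: int):
    """Toy stand-in for the environment: rewards blocking 'malicious' states
    (here, states whose mean feature value exceeds a threshold)."""
    malicious = state.mean() > 0.6
    if malicious:
        reward = 1.0 if ACTIONS[action] == "block" else -1.0
    else:
        reward = 1.0 if ACTIONS[action] == "allow" else -0.2
    next_state = np.random.rand(STATE_DIM).astype(np.float32)
    return next_state, reward


def train(steps: int = 10_000, gamma: float = 0.95, eps: float = 0.1):
    qnet = QNet(STATE_DIM, len(ACTIONS))
    opt = optim.Adam(qnet.parameters(), lr=1e-3)
    buffer: deque = deque(maxlen=10_000)
    state = np.random.rand(STATE_DIM).astype(np.float32)

    for _ in range(steps):
        # Epsilon-greedy action selection over the current Q-estimates.
        if random.random() < eps:
            action = random.randrange(len(ACTIONS))
        else:
            with torch.no_grad():
                action = int(qnet(torch.from_numpy(state)).argmax())

        next_state, reward = synthetic_step(state, action)
        buffer.append((state, action, reward, next_state))
        state = next_state

        if len(buffer) >= 64:
            # Sample a replay minibatch and take one TD-learning step.
            batch = random.sample(buffer, 64)
            s, a, r, s2 = map(np.array, zip(*batch))
            s, s2 = torch.from_numpy(s), torch.from_numpy(s2)
            a = torch.from_numpy(a).long()
            r = torch.from_numpy(r).float()

            q = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                target = r + gamma * qnet(s2).max(dim=1).values
            loss = nn.functional.mse_loss(q, target)
            opt.zero_grad()
            loss.backward()
            opt.step()

    return qnet


if __name__ == "__main__":
    train()
```

In the full framework, `synthetic_step` would be replaced by features computed over the Kafka-Spark telemetry stream, with rewards derived from whether the chosen mitigation neutralised the observed threat.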