Optimization of Lightweight AI Models for Low-Power Predictive Analytics in the Fog-Edge Continuum
Abstract
The rapid proliferation of IoT and smart devices demands intelligent data processing close to the data source to meet real-time, low-latency requirements. In this paper, we investigate methods to optimize lightweight AI models for deployment on resource-constrained nodes in a fog-edge continuum. We propose an AI pipeline that combines advanced model compression (pruning, quantization, and knowledge distillation) with hardware-aware neural architecture search (NAS) and a tiered cloud-fog-edge deployment strategy. Our methodology includes power-aware scheduling of inference across heterogeneous devices and networks. We simulate this setup on typical edge hardware (e.g., Raspberry Pi, Jetson Nano) using frameworks such as TensorFlow Lite and ONNX. Simulated results demonstrate significant reductions in model size, latency, and power consumption while maintaining acceptable accuracy (e.g., 80–85% on a test predictive task). For instance, a pruned-and-quantized CNN achieved ~4× lower power use and a ~50% smaller memory footprint with only a ~3–5% loss in accuracy relative to the baseline. These optimizations enable real-time predictive analytics (e.g., smart-city traffic forecasting, industrial maintenance alerts) on edge devices under tight energy budgets. We discuss the trade-offs, present performance tables and graphs (accuracy vs. power, latency vs. size), and outline a deployment diagram spanning the cloud, fog, and edge tiers. Finally, we summarize open challenges and future directions, including automated edge-AI pipelines and integration with next-generation networking.
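To make the pruning-plus-quantization step concrete, the sketch below shows one common way to realize it with TensorFlow Model Optimization and the TensorFlow Lite converter. This is a minimal illustration, not the paper's exact pipeline: it assumes TensorFlow 2.x with the tensorflow-model-optimization package, and the model file name, sparsity schedule, and training details are hypothetical placeholders.

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Load a trained baseline model (hypothetical file name standing in
    # for the paper's baseline CNN).
    model = tf.keras.models.load_model("baseline_cnn.h5")

    # Step 1: magnitude pruning. The schedule ramps sparsity from 20% to
    # 80% of the weights over the fine-tuning steps (illustrative values).
    pruning_params = {
        "pruning_schedule": tfmot.sparsity.keras.PolynomialDecay(
            initial_sparsity=0.20,
            final_sparsity=0.80,
            begin_step=0,
            end_step=1000,
        )
    }
    pruned_model = tfmot.sparsity.keras.prune_low_magnitude(model, **pruning_params)
    pruned_model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    # Fine-tune on the task data with the pruning callback so the sparsity
    # schedule advances (x_train/y_train are assumed to exist):
    # pruned_model.fit(x_train, y_train, epochs=2,
    #                  callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

    # Strip the pruning wrappers so the exported graph contains plain layers.
    final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)

    # Step 2: post-training dynamic-range quantization. Weights are stored
    # as 8-bit integers, typically yielding a ~4x smaller model that runs
    # faster on CPU-only edge boards such as a Raspberry Pi.
    converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_bytes = converter.convert()

    with open("cnn_pruned_quantized.tflite", "wb") as f:
        f.write(tflite_bytes)

For full-integer (int8 activation) quantization, which some microcontroller and accelerator targets require, the converter additionally needs a small representative dataset to calibrate activation ranges; the dynamic-range variant shown above avoids that requirement at some cost in latency gains.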