Memory-Efficient Summarization of a Real-Time CCTV Surveillance System
Abstract
Video summarization plays a crucial role in efficiently analyzing the vast amounts of footage produced by CCTV surveillance systems. In this research project, we propose a comprehensive approach that leverages state-of-the-art deep learning algorithms for object detection and tracking to create condensed, informative summaries of surveillance videos. The primary components of our framework are the YOLOv5 model for real-time object detection and the Deep SORT algorithm for robust object tracking. First, YOLOv5 detects objects within the CCTV footage, producing bounding boxes and class labels. These detections are then passed to Deep SORT, which assigns a unique ID to each detected object and uses a Kalman filter to predict its motion and future coordinates. By comparing the predicted coordinates with the actual coordinates observed in subsequent frames, the system distinguishes moving objects from stationary ones and identifies the frames that contain dynamic activity. The selected frames are merged to generate a summarized surveillance video that highlights key events while discarding redundant content. Our implementation has yielded promising results, producing concise and meaningful video summaries. The proposed methodology not only improves the efficiency of video analysis but also reduces storage and bandwidth requirements in surveillance systems.
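The abstract describes a detect, track, and select pipeline. The sketch below illustrates one way such a pipeline can be assembled; it is not the paper's implementation. It assumes the ultralytics YOLOv5 model loaded via torch.hub, the open-source deep_sort_realtime package for Deep SORT, OpenCV for video I/O, and a simple per-track centroid-displacement test as a stand-in for the predicted-versus-actual coordinate comparison described above. File names and the motion threshold are illustrative.

```python
# Minimal sketch of the detect-track-select pipeline, under the assumptions
# stated above (torch.hub YOLOv5, deep_sort_realtime, OpenCV video I/O).
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

MOTION_THRESHOLD = 5.0  # illustrative: pixels of centroid movement between frames

detector = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained detector
tracker = DeepSort(max_age=30)  # Kalman-filter-based tracker with appearance features

cap = cv2.VideoCapture("cctv_input.mp4")  # hypothetical input file
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("cctv_summary.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

last_centroid = {}  # track_id -> previous (cx, cy)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # 1. Detect objects; results.xyxy[0] rows are [x1, y1, x2, y2, conf, cls].
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = detector(rgb)
    detections = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))

    # 2. Track objects; Deep SORT assigns persistent IDs via a Kalman filter.
    tracks = tracker.update_tracks(detections, frame=frame)

    # 3. Keep the frame only if at least one confirmed track has moved
    #    (simplified proxy for the predicted-vs-actual coordinate check).
    frame_has_motion = False
    for track in tracks:
        if not track.is_confirmed():
            continue
        x1, y1, x2, y2 = track.to_ltrb()
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        prev = last_centroid.get(track.track_id)
        if prev is not None:
            dx, dy = cx - prev[0], cy - prev[1]
            if (dx * dx + dy * dy) ** 0.5 > MOTION_THRESHOLD:
                frame_has_motion = True
        last_centroid[track.track_id] = (cx, cy)

    # 4. Merge selected frames into the summarized video.
    if frame_has_motion:
        writer.write(frame)

cap.release()
writer.release()
```

Writing only the motion-bearing frames is what keeps the summary short and is also the source of the storage and bandwidth savings: frames showing a static scene are never added to the output video.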