Design of a Mechanically Integrated System for Pictorial Detection and Document Information Extraction using U-Net
Abstract
This paper presents a novel approach to pictorial detection and information extraction from documents using the U-Net architecture, a convolutional neural network widely used for semantic segmentation. We adapt U-Net to the specific challenges posed by document images, aiming to improve the accuracy and efficiency of information extraction. During training, the network learns to identify and isolate relevant pictorial elements within a document; post-processing techniques then refine the extracted regions so that critical details are captured with high precision. We evaluate the proposed system through extensive experiments on diverse datasets, demonstrating its robustness and versatility across a range of document types. The results show that the approach effectively automates the extraction of information from documents with pictorial content, supporting applications in document analysis, information retrieval, and content understanding. This paper contributes to the growing body of research at the intersection of deep learning and document processing, offering a practical solution for extracting information from visual documents.
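To give a concrete sense of the kind of post-processing stage the abstract describes, the sketch below shows one common way to turn a per-pixel segmentation mask (such as a U-Net output) into bounding boxes of pictorial regions. The paper does not publish code, so this is a minimal illustration under assumed conventions: the function name `extract_regions`, the 0.5 threshold, and the use of 4-connected components are all choices made here for the example, not details taken from the paper.

```python
import numpy as np

def extract_regions(prob_mask, threshold=0.5):
    """Binarize a per-pixel probability mask and return bounding boxes
    (top, left, bottom, right) of each 4-connected foreground component.

    prob_mask: 2-D array of per-pixel foreground probabilities,
               e.g. the sigmoid output of a segmentation network.
    """
    binary = prob_mask >= threshold
    visited = np.zeros_like(binary, dtype=bool)
    boxes = []
    h, w = binary.shape
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not visited[sy, sx]:
                # Flood-fill one connected component, tracking its extent.
                stack = [(sy, sx)]
                visited[sy, sx] = True
                top, left, bottom, right = sy, sx, sy, sx
                while stack:
                    y, x = stack.pop()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not visited[ny, nx]):
                            visited[ny, nx] = True
                            stack.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

# Toy example: a 5x5 mask with one high-probability rectangular region.
mask = np.zeros((5, 5))
mask[1:3, 1:4] = 0.9
regions = extract_regions(mask)
```

In a full pipeline, each bounding box would then be cropped from the original document image and passed on for further information extraction.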