SMART DOCUMENT COMPANION TEXT DATA CLASSIFICATION IN DOCUMENTS USING AI
Author Name: Dineshbalaji K, Raghulraj M, Kirubaharan A, Monish S

Abstract: In today's digital era, managing heterogeneous documents effectively is a critical challenge for many applications. This paper presents the Smart Document Companion, a novel system that integrates robust Optical Character Recognition (OCR) preprocessing, embedding-based few-shot classification, and a transformer-based question answering module. Using advanced image preprocessing techniques and transformer models, the system extracts meaningful textual content from both image and PDF documents, classifies each document by its semantic similarity to a small labeled support set, and answers interactive queries directly from the document content. Experimental results demonstrate that the approach achieves competitive accuracy even with minimal training data, offering significant potential for improving document management workflows in real-world scenarios.
Key Words: Document Classification, Optical Character Recognition, Transformer Models, Few-shot Learning, Question Answering, Generative AI.

Published On: 2025-03-26
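
To illustrate the few-shot classification idea summarized in the abstract (assigning a document to the class whose labeled support examples it most resembles), here is a minimal sketch. It is not the authors' implementation. The `embed` function is a toy bag-of-words stand-in for the transformer embeddings the paper refers to, and the support set, labels, and function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a transformer embedding: a bag-of-words count vector.

    Assumption: a real system would call a sentence-embedding model here.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_prototypes(support_set):
    """Sum the embeddings of each label's support examples into one prototype."""
    prototypes = defaultdict(Counter)
    for text, label in support_set:
        prototypes[label].update(embed(text))
    return dict(prototypes)


def classify(text, prototypes):
    """Return the label whose prototype is most similar to the text, with its score."""
    query = embed(text)
    scores = {label: cosine(query, proto) for label, proto in prototypes.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical support set: two labeled examples per class (the "few shots").
SUPPORT_SET = [
    ("Invoice number 1042, total amount due, payment terms net 30", "invoice"),
    ("Invoice for services rendered, amount payable on receipt", "invoice"),
    ("Curriculum vitae: education, work experience, skills", "resume"),
    ("Resume summary with professional experience and education", "resume"),
]

PROTOTYPES = build_prototypes(SUPPORT_SET)
label, score = classify("Please pay the total amount on this invoice", PROTOTYPES)
print(label, round(score, 3))
```

In this sketch the query text (which in the described pipeline would come from the OCR stage) is matched against one prototype per class, so adding a new document category only requires a few labeled examples rather than retraining a classifier.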