Please use this identifier to cite or link to this item: http://ir.inflibnet.ac.in/handle/1944/495
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chaudhuri, Anirban Ray | en_US
dc.contributor.author | Singh, Debnath | en_US
dc.contributor.author | Nasipuri, Mita | en_US
dc.contributor.author | Basu, Dipak Kumar | en_US
dc.date.accessioned | 2005-05-10T06:54:33Z | en_US
dc.date.accessioned | 2010-04-08T08:47:52Z | -
dc.date.available | 2005-05-10T06:54:33Z | en_US
dc.date.available | 2010-04-08T08:47:52Z | -
dc.date.issued | 2005-02-02 | en_US
dc.identifier.isbn | 81-902079-0-3 | en_US
dc.identifier.uri | http://hdl.handle.net/1944/495 | en_US
dc.description.abstract | The transformation of a scanned paper document into an editable form suitable for further processing such as desktop publishing or archiving in a digital library is a complex process. It requires solutions to several problems – document analysis by acquiring knowledge of document layout by a Page Layout Analyzer (PLA), followed by document recognition, which mainly comprises text recognition by Optical Character Recognition (OCR). Besides these two, another important problem is document reconstruction by transforming content into an electronically editable format by keeping the original layout intact. Core OCR modules exist on different Indian scripts, but no such document reconstruction system is available for Indian scripts. The document reconstruction system reported in this paper is the first of its kind on Indian scripts and it addresses document reconstruction for Bengali document images. The system makes use of the knowledge of both document layout extracted by a PLA in a graphical user interface (GUI) and the results of text recognition steps performed by OCR for transformation of paper documents into Rich Text Format. | en_US
dc.format.extent | 528075 bytes | en_US
dc.format.mimetype | application/pdf | en_US
dc.language.iso | en | en_US
dc.publisher | INFLIBNET Centre | en_US
dc.subject | Indian Scripts | en_US
dc.subject | Desktop Publishing | en_US
dc.subject | Page Layout Analysis | en_US
dc.subject | Optical Character Recognition | en_US
dc.subject | Document Reconstruction | en_US
dc.subject | Encoding Standard | en_US
dc.subject | Indian Language | en_US
dc.title | A Document Reconstruction System for Transferring Bengali Paper Documents into Rich Text Format | en_US
dc.type | Article | en_US
Appears in Collections: CALIBER 2005: Kochi

Files in This Item:
File | Description | Size | Format
05cali_4.pdf | | 515.7 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.