EXTRACTION AND SEGMENTATION OF TEXT FROM IMAGE DOCUMENTS

Kumar, Vijay

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/12216

Title:	EXTRACTION AND SEGMENTATION OF TEXT FROM IMAGE DOCUMENTS
Authors:	Kumar, Vijay
Keywords:	ELECTRONICS AND COMPUTER ENGINEERING;EXTRACTION;SEGMENTATION;IMAGE DOCUMENTS
Issue Date:	2010
Abstract:	Document images are often obtained by digitizing paper documents such as books or manuscripts. Document image analysis systems are becoming increasingly visible in everyday life. The accuracy of any Optical Character Recognition (OCR) system depends heavily on segmenting text from the document image and on segmenting that text into lines, words, and characters. In this dissertation we study and propose a new method for text segmentation from document images using the Daubechies wavelet and 2-means classification. We also use morphological operations such as dilation and erosion. Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels from object boundaries. We obtain good accuracy compared with other text segmentation methods such as Haar wavelets, the Naive Bayes classifier method, and the decision tree method. We use the same input image for all of these methods and illustrate the corresponding output images. The proposed method for text segmentation from document images has been implemented in MATLAB. We have also studied and modified the proposed algorithm for segmenting text into lines, words, and characters for the Devanagari and Gurmukhi scripts: we describe line, word, character, and top-character segmentation for printed Hindi text in Devanagari script, and line and word segmentation for printed text in Gurmukhi script. Performance improvements have been obtained at various levels. We evaluate segmentation performance on five documents in Devanagari script and five documents in Gurmukhi script.
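The dilation and erosion operations mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's MATLAB implementation: it is written in Python with NumPy, and the 3x3 square structuring element, the function names `dilate` and `erode`, and the toy 5x5 input are all assumptions chosen for illustration.

```python
import numpy as np


def dilate(img: np.ndarray) -> np.ndarray:
    """Binary dilation with an assumed 3x3 square structuring element.

    A pixel becomes foreground if any pixel in its 3x3 neighbourhood
    is foreground, so objects grow by one pixel at their boundaries.
    """
    h, w = img.shape
    # Pad with background so border pixels have a full neighbourhood.
    padded = np.pad(img.astype(bool), 1, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def erode(img: np.ndarray) -> np.ndarray:
    """Binary erosion with an assumed 3x3 square structuring element.

    A pixel stays foreground only if every pixel in its 3x3
    neighbourhood is foreground, so objects shrink at their boundaries.
    """
    h, w = img.shape
    padded = np.pad(img.astype(bool), 1, constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out


# Toy input (hypothetical): a 3x3 foreground block centred in a 5x5 image.
image = np.zeros((5, 5), dtype=bool)
image[1:4, 1:4] = True

print(int(image.sum()), int(dilate(image).sum()), int(erode(image).sum()))
```

On this toy input, dilation grows the 3x3 block to fill the whole 5x5 image, while erosion shrinks it to its single centre pixel. In practice a library routine (for example, MATLAB's `imdilate` and `imerode`) would normally be used instead of hand-written loops.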
URI:	http://hdl.handle.net/123456789/12216
Other Identifiers:	M.Tech
Research Supervisor/ Guide:	Sarje, A. K.
metadata.dc.type:	M.Tech Dissertation
Appears in Collections:	MASTERS' THESES (E & C)

Files in This Item:

File	Description	Size	Format
ECDG20196.pdf		2.05 MB	Adobe PDF
