A REVIEW ON SEGMENTATION TECHNIQUES OF LINES, WORDS AND CHARACTERS ON GUJARATI HANDWRITTEN DOCUMENT USING OCR
Main Author: | Nilam Mistry*, Sameer Vashi, Vidhi Patel, Kunal Shah, Denish Rixawapla, Foram Rakholiya, Rakesh Savant |
---|---|
Format: | Article Journal |
Published: | 2016 |
Subjects: | |
Online Access: | https://zenodo.org/record/54779 |
Table of Contents:
- Optical character recognition (OCR) is a technique for converting a handwritten or printed document, by scanning it, into a digital format that a computer can process. OCR is an important and challenging task in many computer vision applications. Segmentation is generally the first stage in any attempt to analyse or interpret an image automatically. Here, segmentation means separating the document into lines, lines into words, and words into characters, which has been one of the major difficulties in handwritten text recognition. Segmentation plays a crucial role in most tasks requiring image analysis: the success or failure of a task is often a direct consequence of the success or failure of segmentation. Handwritten documents contain free-flowing text, writing styles differ between writers, and even the same writer's handwriting may differ from one time to another. That is why segmentation is difficult for handwritten text documents. This paper focuses on the Gujarati language, whose script contains many curves, overlapping characters, and slopes, so segmenting it is very difficult. In this paper we applied some segmentation techniques to handwritten Gujarati documents and reached some conclusions.
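
The abstract does not name the segmentation techniques that were applied. As an illustration only, the line, word, and character hierarchy it describes can be sketched with projection profiles, a common baseline that is assumed here and not taken from the paper. The function names, the `min_gap` parameter, and the toy `page` image are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink pixel, 0 = background


def find_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, where the profile is non-zero."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Text lines are runs of rows that contain at least one ink pixel."""
    row_profile = [sum(row) for row in image]
    return find_runs(row_profile)


def segment_columns(image: Image, top: int, bottom: int,
                    min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split rows top..bottom into column ranges.

    Ink runs separated by fewer than min_gap blank columns are merged,
    so a small min_gap yields characters and a larger one yields words.
    (min_gap is an assumed tuning knob, not a value from the paper.)
    """
    width = len(image[0])
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    merged: List[Tuple[int, int]] = []
    for start, end in find_runs(col_profile):
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Toy 5x8 "page" with two text lines (hypothetical data).
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
]

lines = segment_lines(page)
top, bottom = lines[0]
words = segment_columns(page, top, bottom, min_gap=3)
chars = segment_columns(page, top, bottom, min_gap=1)
```

This sketch only finds a boundary where a completely blank row or column exists. Overlapping characters or slanted lines leave no such gap, so the sketch would merge them, which is consistent with the difficulty the abstract describes for handwritten Gujarati text.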