OCR-based Text Extraction from Images

Vijayalakshmi V.; Amruta Ashok Naik; Chinmayee M.S

doi:10.17148/IJIREEICE.2022.10572

Volume 10, Issue 5, May 2022



Abstract: “OCR Based Text Extraction From Images” combines text recognition from images with text-to-speech conversion: it converts the text within an image into speech and reads it aloud. The text characters in an image are a main source of information for content-based indexing. The goal of text recognition is to convert the text in printed hardcopy documents into a desired format. However, these characters are difficult to detect and recognize because of their varying sizes and complex backgrounds. In the segmentation step, we model the distribution of the grayscale values of pixels. Finally, the segmented text regions are processed by OCR, a technology that extracts data from images and documents and converts it into computer-readable forms that can be edited or searched. Images are converted to text files, which are then converted to audio files.

Keywords: OCR, TTS, text detection, preprocessing, Gaussian blur, spell correction
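
The preprocessing and segmentation steps named above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes a 3x3 Gaussian kernel for the blur and substitutes Otsu's method as one common way to model the grayscale distribution as two classes (text and background). The function names gaussian_blur_3x3, otsu_threshold, and binarize are hypothetical, and the OCR and text-to-speech stages are left out because they would require external engines.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, each value in 0..255


def gaussian_blur_3x3(img: Image) -> Image:
    """Smooth with the kernel [[1,2,1],[2,4,2],[1,2,1]] / 16 (edges clamped)."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc // 16
    return out


def otsu_threshold(img: Image) -> int:
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]          # pixels with value <= t
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue             # no dark pixels yet
        w_fg = total - w_bg      # pixels with value > t
        if w_fg == 0:
            break                # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img: Image, t: int) -> Image:
    """Mark dark pixels (assumed to be text) as 1 and background as 0."""
    return [[1 if v <= t else 0 for v in row] for row in img]


if __name__ == "__main__":
    # Toy 4x4 "page": dark strokes (20) on a light background (220).
    page = [
        [220, 20, 20, 220],
        [220, 20, 220, 220],
        [220, 20, 20, 220],
        [220, 220, 220, 220],
    ]
    smoothed = gaussian_blur_3x3(page)
    mask = binarize(smoothed, otsu_threshold(smoothed))
    for row in mask:
        print(row)
```

In a full pipeline, the binary mask (or the cleaned image) would be passed to an OCR engine, and the recognized text to a text-to-speech engine. Which engines the authors used is not stated in the abstract.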

How to Cite:

[1] Vijayalakshmi V., Amruta Ashok Naik, Chinmayee M.S, “OCR-based Text Extraction from Images,” International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering (IJIREEICE), DOI: 10.17148/IJIREEICE.2022.10572

This work is licensed under a Creative Commons Attribution 4.0 International License.