Optical Character Recognition Development Using Python

Authors

  • Prakhar Sisodia, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Lucknow Campus, India
  • Dr. Syed Wajahat Abbas Rizvi, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Lucknow Campus, India

DOI:

https://doi.org/10.54060/jieee.2023.75

Keywords:

pytesseract, pdfplumber, matplotlib, opencv-python, scikit-learn

Abstract

Optical Character Recognition (OCR) converts scanned or digital images into editable text. It has become an increasingly important tool for data extraction and information retrieval because it turns scanned documents and digital images into text quickly and efficiently. In this paper, we explore the use of the Python programming language to implement OCR algorithms and systems. We provide a comprehensive overview of existing Python libraries and packages used for OCR, including the Tesseract engine and its pytesseract wrapper, along with their strengths and limitations. We also examine different OCR approaches and techniques, including template matching and feature extraction, as well as encryption and decryption of OCR-parsed files, and discuss their implementation in Python. Finally, we present a case study of a simple OCR system built using Python and evaluate its performance on a sample dataset. The results highlight the potential of Python for OCR implementation and demonstrate its feasibility for real-world applications.
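
To make the template-matching approach named in the abstract concrete, the following is a minimal sketch in plain Python and is not the authors' implementation. It assumes glyphs have already been segmented and binarized into small fixed-size grids; the 3x3 templates, the mismatch-count distance, and the names TEMPLATES, mismatch, and classify are all hypothetical choices made for illustration.

```python
# Illustrative template matching on tiny binary glyphs.
# Assumption: each glyph is a 3x3 grid where 1 = ink and 0 = background.
# The templates below are hypothetical and not taken from the paper.
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def mismatch(glyph, template):
    """Count the pixels at which two equally sized binary grids differ."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(glyph, template)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def classify(glyph, templates=TEMPLATES):
    """Return the label of the template with the fewest differing pixels."""
    return min(templates, key=lambda label: mismatch(glyph, templates[label]))


# A "T" with one ink pixel missing from its stem.
noisy_t = [[1, 1, 1],
           [0, 1, 0],
           [0, 0, 0]]

print(classify(noisy_t))
```

A practical system would more likely pass page images to an OCR engine such as Tesseract; the sketch only shows the nearest-template idea on which template matching rests.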

Published

2023-11-25

How to Cite

[1]
Prakhar Sisodia and Syed Wajahat Abbas Rizvi, “Optical Character Recognition Development Using Python”, J. Infor. Electr. Electron. Eng., vol. 4, no. 3, pp. 1–13, Nov. 2023.

Section

Research Article
