Getting started with Python OCR on Windows?

I have never used Python before, and I am not sure where to start. My goal is to take images of numbers on a multicolored background and reliably identify the correct characters. I looked into the tools needed for this and found the Anaconda Python distribution, which includes all the packages I might need, as well as tesseract-ocr and pytesser.

Unfortunately, I'm lost on how to begin. I'm using the PyCharm Community IDE and simply trying to follow this guide: http://www.manejandodatos.es/2014/11/ocr-python-easy/ to get a grasp on OCR.

This is the code I'm using:

from PIL import Image
from pytesser import *

image_file = 'menu.jpg'
im = Image.open(image_file)
text = image_to_string(im)
text = image_file_to_string(image_file)
text = image_file_to_string(image_file, graceful_errors=True)
print "=====output=======\n"
print text

and I believe the Anaconda distribution that I'm using has PIL, but I'm getting this error:

C:\Users\diego_000\Anaconda\python.exe C:/Users/diego_000/PycharmProjects/untitled/test.py
Traceback (most recent call last):
  File "C:/Users/diego_000/PycharmProjects/untitled/test.py", line 2, in <module>
    from pytesser import *
  File "C:\Users\diego_000\PycharmProjects\untitled\pytesser.py", line 6, in <module>
    import Image
ImportError: No module named Image

Process finished with exit code 1

Can anyone point me in the right direction?


The document you point to says to use

from PIL import Image

except that the pytesser.py in your traceback (line 6) uses

import Image

and so the interpreter properly says:

ImportError: No module named Image

It looks as if you reordered the lines

from PIL import Image
from pytesser import *

and that pytesser has an improperly coded dependency on PIL, but I can't be certain from the code you provided.
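
If the culprit really is the bare import Image on line 6 of your local pytesser.py, the simplest thing to try is changing that line to from PIL import Image. A slightly more defensive variant is sketched below. This is only a sketch under that assumption, not pytesser's actual code; first_importable and load_image_module are names I made up for illustration.

```python
import importlib


def first_importable(names):
    """Return the first module in `names` that imports cleanly, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def load_image_module():
    """Find PIL's Image module under either of its historical names."""
    # "PIL.Image" is the packaged layout (what `from PIL import Image` uses);
    # bare "Image" is the old top-level layout that the traceback shows
    # pytesser.py asking for.
    module = first_importable(["PIL.Image", "Image"])
    if module is None:
        raise ImportError("PIL/Pillow does not appear to be installed")
    return module
```

In pytesser.py you could then write Image = load_image_module() in place of the failing import. If that raises as well, PIL is probably not installed in the interpreter PyCharm is running, which would be a separate problem from the import style.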
