Flello World! An investigation into using Pytesseract

I write down quite a few notes: things I need to do and things I need to remember. I only do this when I don’t have my Trello board to hand, so I wondered how easy it would be to write a script that creates Trello items from a photo of my notes.

To do this I first looked at Pytesseract, a wrapper for Google’s Tesseract-OCR Engine.

Here are the steps I followed and the interesting… outcome…

Warning! Use the links below at your own risk, always check what you’re downloading!

First I installed Tesseract using the following link: https://tesseract-ocr.github.io/tessdoc/Home.html

Then I pip installed pytesseract and Pillow (a dependency):

pytesseract==0.3.8
Pillow==8.3.2

Then I wrote a simple block of code to open the image, parse it and print the result.

import pathlib

import pytesseract.pytesseract
from PIL import Image


def parse_image(path_to_image):
    with Image.open(path_to_image) as image:
        # Tell the wrapper where the Tesseract executable lives (default Windows install path)
        pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
        # Run OCR on the image and return the recognised text as a string
        return pytesseract.image_to_string(image)


if __name__ == "__main__":
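    # Parse test.JPG from the script's folder; [:-1] drops the final character,
    # which is typically the form feed Tesseract appends as a page separator
    print(parse_image(pathlib.Path(__file__).parent / "test.JPG")[:-1])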
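
The hard-coded path only works for a default Windows install. Here's a rough sketch of how the executable could be located instead, assuming Tesseract might already be on the PATH. `find_tesseract` and `WINDOWS_DEFAULT` are names I made up for illustration; they aren't part of pytesseract.

```python
import shutil
from pathlib import Path

# Default Windows install location used in the script above
WINDOWS_DEFAULT = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def find_tesseract(fallback=WINDOWS_DEFAULT):
    """Return a path to the Tesseract executable, or None if it can't be found."""
    # Prefer whatever "tesseract" resolves to on the PATH
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the known install location, if that file exists
    if Path(fallback).is_file():
        return str(fallback)
    return None
```

If the result isn't None it could be assigned to pytesseract.pytesseract.tesseract_cmd before calling image_to_string; if it is None, Tesseract probably isn't installed.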

Here’s my Post-it note:

Unfortunately, here’s my output:

Flello

World !

Not quite what I had hoped for, but still pretty amazing!
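
The output comes back as one string with blank lines in it, so before anything could become a Trello item it would need tidying into a list. Here's a minimal sketch of that step, assuming one note per line. `text_to_items` is a hypothetical helper of my own, and it doesn't touch Trello at all.

```python
def text_to_items(ocr_text):
    """Split raw OCR text into a list of non-empty, tidied lines."""
    items = []
    # splitlines() also treats a trailing form feed as a line break
    for line in ocr_text.splitlines():
        # Collapse runs of whitespace and trim the ends
        cleaned = " ".join(line.split())
        if cleaned:
            items.append(cleaned)
    return items


# The output above, as a raw string with a trailing form feed
print(text_to_items("Flello\n\nWorld !\n\x0c"))  # ['Flello', 'World !']
```

Each entry in the list could then become the title of one item, although the misread "Flello" shows the text would still need checking by eye first.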