Tesseract OCR has worked well for me in many cases,
but sometimes it splits columns into separate vertical blocks
instead of keeping the columns of data on the same line.
For example, I have a jpg showing a list of names and addresses,
with each name followed by its address on the same line.
In the OCR result, the names come out as one vertical list,
followed by the addresses as a second vertical list below the names,
instead of each address staying on the same line as its name.
I found a man page that suggested using --psm N,
with 10 different numeric options for N, but it did not work for me:
$: tesseract namesandaddresses.jpg names --psm 4
Resulting error message:
Tesseract Open Source OCR Engine v3.03 with Leptonica
read_params_file: Can't open 4
Any suggestions on how to fix this?
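
One possible explanation, offered as a guess rather than a confirmed fix:
the error says Tesseract tried to open "4" as a config file, which suggests
that v3.03 did not recognize --psm as an option. The 3.x man pages spell the
option with a single dash (-psm N), while newer man pages use --psm N.
The sketch below assumes the cutoff is major version 4; the helper name
psm_flag_for_version is made up for illustration. It only prints the
command it would run, so it is safe to try without Tesseract installed.

```shell
#!/bin/sh
# Hypothetical helper: choose the page-segmentation flag spelling from a
# Tesseract version string such as "3.03" or "4.1.1".
# Assumption: versions below 4 use "-psm", later ones use "--psm".
psm_flag_for_version() {
    major=${1%%.*}          # text before the first dot, e.g. "3"
    if [ "$major" -lt 4 ]; then
        echo "-psm"         # single dash, as in the 3.x man page
    else
        echo "--psm"        # double dash, as in newer man pages
    fi
}

# With the v3.03 install from the error message above:
flag=$(psm_flag_for_version "3.03")
echo "tesseract namesandaddresses.jpg names $flag 4"
```

If -psm 4 (single column of text of variable sizes) still separates the
names from the addresses, -psm 6 (a single uniform block of text) may be
worth trying as well.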
---------------------------------------------------
PLUG-discuss mailing list -
PLUG-discuss@lists.phxlinux.org
To subscribe, unsubscribe, or to change your mail settings:
https://lists.phxlinux.org/mailman/listinfo/plug-discuss