Have you tried using ocrfeeder? It uses tesseract but has tools that
let you get better output from it, by manually or automatically defining
areas.
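
On the error quoted below: the banner reports v3.03, and as far as I
know the 3.x releases spell the page segmentation option with a single
dash (-psm), while 4.x and later use --psm. That would explain why the
"4" was treated as a config file name. A minimal sketch for picking the
spelling from the version string follows; psm_flag is a hypothetical
helper name, and the version parsing in the usage note is an assumption
about the banner format, not something taken from the man page.

```shell
# psm_flag VERSION
# Print the page-segmentation flag spelling for a Tesseract version string.
# Assumption: 3.x releases accept "-psm", 4.x and later accept "--psm".
psm_flag() {
  # Strip an optional leading "v" (some builds print e.g. "v3.03").
  case "${1#v}" in
    3.*) printf '%s\n' "-psm" ;;
    *)   printf '%s\n' "--psm" ;;
  esac
}
```

With the file names from the quoted message, usage might look like
this (3.x prints its version banner on stderr, hence the 2>&1):

    version=$(tesseract --version 2>&1 | head -n 1 | awk '{print $2}')
    tesseract namesandaddresses.jpg names "$(psm_flag "$version")" 4

If mode 4 still splits the columns, trying the other mode numbers
listed in the man page is probably worthwhile.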
Brian Cluff
On 5/22/19 8:22 PM, Joe Lowder wrote:
> tesseract ocr has worked well for me in many cases,
> but sometimes it separates columns vertically instead
> of keeping columns of data on the same line.
>
> For example a jpg showing a list of names and addresses
> with the names followed by the addresses on the same line,
> but the ocr result has the list of names in a vertical line
> followed by the addresses in a vertical line below the names
> instead of on the same line as the names.
>
> I found a man page that suggested using --psm N
> with 10 different numeric options for N, but got no help.
>
> $: tesseract namesandaddresses.jpg names --psm 4
>
> Resulting error message:
> Tesseract Open Source OCR Engine v3.03 with Leptonica
> read_params_file: Can't open 4
>
> Any suggestions how to fix this?
>
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> https://lists.phxlinux.org/mailman/listinfo/plug-discuss