[smc-discuss] Boxfile for tesseract and Malayalam letters with expanded spacing

Baiju M baiju.m.mail at gmail.com
Sun Oct 20 08:13:26 PDT 2013


Hi,

I am trying to create a box file for Tesseract.  My current target is
to recognize the Rachana typeface. I am experimenting with LibreOffice to
create a sample TIFF file from some Malayalam text.

In LibreOffice, what happens when we apply
Format->Character->Position->Spacing->Expanded to Malayalam
characters? What logic does it use to identify a character?

Can I get something similar with Pango, or with any other tool that I can
use as a library (C/Python) or from the command line, and that does what
LibreOffice does?

So far I am fine with the results from LibreOffice, but I would like to use
something that I can automate.
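
To make the "identify a character" question concrete, here is a rough
Python sketch of one possible rule for grouping Malayalam code points
into units that could be spaced apart. It is only an assumption for
illustration and not LibreOffice's or Pango's actual logic: combining
marks and ZWJ/ZWNJ attach to the preceding unit, and a letter that
directly follows a virama joins the same unit. The name split_clusters
is made up for this example.

```python
import unicodedata

VIRAMA = "\u0d4d"               # Malayalam sign virama (chandrakkala)
JOINERS = {"\u200c", "\u200d"}  # ZWNJ and ZWJ

def split_clusters(text):
    """Approximate grouping of code points into spaceable units.

    Assumed rule, not taken from LibreOffice or Pango: a code point
    attaches to the previous unit if it is a combining mark, a joiner,
    or a letter that directly follows a virama.
    """
    clusters = []
    for ch in text:
        category = unicodedata.category(ch)
        attach = bool(clusters) and (
            category.startswith("M")
            or ch in JOINERS
            or (clusters[-1].endswith(VIRAMA) and category == "Lo")
        )
        if attach:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

if __name__ == "__main__":
    # Print each unit of a sample word on its own line.
    for unit in split_clusters("\u0d2e\u0d32\u0d2f\u0d3e\u0d33\u0d02"):
        print(unit)
```

A real shaping engine decides this from the font's glyphs (for example,
whether a conjunct ligature exists in Rachana), so a rule like this can
only be a first approximation.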

Regards,
Baiju M


