[smc-discuss] The politics of chillu letters
Santhosh Thottingal
santhosh.thottingal at gmail.com
Tue Jan 18 09:11:47 PST 2011
2011/1/18 Jayadevan Raja <jayadevanraja at gmail.com>:
> @Santhosh: Your reply was really informative, detailed and excellent,
> showing the major concerns of people who use unicode, around the world.
>
> But the implementation of Unicode string searches and comparisons in text
> processing software must take into account the presence of equivalent code
> points. In the absence of this feature, users searching for a particular
> code point sequence would be unable to find other visually indistinguishable
> glyphs that have a different, but canonically equivalent, code point
> representation.
Yes, Malayalam already has canonically equivalent code points, and they
are properly implemented in GNU/Linux (I wrote that for the GNU C Library).
Search and collation for Malayalam work on GNU/Linux based on these
rules.
You can try this on a recent GNU/Linux system.
e.g.: കോ === കേ + ാ === ക + ോ
Btw, AFAIK, MS Windows does not implement this.
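
A minimal sketch of the equivalence in the example above, using Python's
standard unicodedata module. This is only an illustration and not the GNU C
Library implementation; the helper name canonically_equal is made up for
this example.

```python
import unicodedata

# Two spellings of the same syllable "കോ" (ko):
precomposed = "\u0d15\u0d4b"      # ക + ോ (single vowel sign OO, U+0D4B)
two_part = "\u0d15\u0d47\u0d3e"   # ക + േ + ാ (vowel signs EE and AA)


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A raw code point comparison treats the two spellings as different strings.
print("raw comparison:       ", precomposed == two_part)

# Comparing the normalized forms treats them as the same text.
print("canonical comparison: ", canonically_equal(precomposed, two_part))

# NFC recombines the two-part spelling into the single vowel sign.
print("NFC of two-part form: ",
      [hex(ord(c)) for c in unicodedata.normalize("NFC", two_part)])
```

A search or collation routine that normalizes both the text and the query
like this will match either spelling.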
>
> So, aren't atomic chills and non-atomic (composite) chills in Malayalam
> canonically equivalent?
Canonical equivalence is not defined between the two types of chillu
(atomic and composite).
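
A small sketch of what that means in practice, again with Python's
unicodedata module. It assumes chillu N as the test case: the atomic form is
U+0D7B, and the composite form is taken to be ന + ് + ZERO WIDTH JOINER. The
helper name same_under is made up for this example.

```python
import unicodedata

atomic_chillu_n = "\u0d7b"                # ൻ, atomic chillu N
sequence_chillu_n = "\u0d28\u0d4d\u200d"  # ന + ് + ZWJ, composite chillu N


def same_under(form: str, a: str, b: str) -> bool:
    """Return True if a and b are identical after normalizing to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Neither canonical normalization form maps one representation to the other,
# so a normalizing search will not treat them as the same text.
for form in ("NFC", "NFD"):
    print(form, same_under(form, atomic_chillu_n, sequence_chillu_n))
```

So using NFC everywhere does not unify the two representations; software
that wants to treat them alike has to add its own mapping.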
> If we use "Normalization Form Canonical Composition"
> everywhere, and consider the chills to be composite characters, isn't the
> problem solved? Isn't the same issue there in almost all major scripts other
> than basic Latin?
Please read http://thottingal.in/blog/2008/06/02/canonical-equivalence-in-unicode-some-notes/
Thanks
Santhosh Thottingal