[Student-projects] GSOC'14: Spell checker
Shafeeq K
shafeeq94 at gmail.com
Thu Mar 6 08:34:22 PST 2014
Hi,
I'm Shafeeq, a second year CSE student from NSS College of
Engineering, Palakkad.
I am interested in this year's GSOC project "A spell checker for Indic
language that understands inflections". I've been reading up and doing
a little homework for this project, as suggested by the mentor.
I couldn't find any affix rules for Malayalam in the corresponding
affix file. Does that mean we currently rely only on the word list
for spell checking?
The Hunspell manual describes only twofold suffix stripping. Since it
was mentioned that Indic languages might require as many as 5 levels of
stripping, is Hunspell the way forward? I saw an experimental
indic-stemmer in SILPA. Couldn't we extend it to handle multilevel
suffix stripping?
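To make clear what I mean by multilevel stripping, here is a minimal
sketch of the idea. It is not taken from Hunspell or from the SILPA
indic-stemmer; the function name strip_suffixes, the max_levels
parameter, and the toy English suffixes and lexicon are all my own
placeholders, standing in for real Malayalam rules that I don't have yet.

```python
def strip_suffixes(word, suffixes, lexicon, max_levels=5):
    """Try to reduce `word` to a lexicon entry by removing up to
    `max_levels` suffixes. Returns (stem, stripped_suffixes) with the
    outermost suffix first, or None if no analysis is found."""

    def search(current, stripped):
        # Success: what is left is a known root word.
        if current in lexicon:
            return current, stripped
        # Give up on this branch once the level limit is reached.
        if len(stripped) == max_levels:
            return None
        for suffix in suffixes:
            # Never strip the whole word away.
            if current.endswith(suffix) and len(current) > len(suffix):
                result = search(current[:-len(suffix)], stripped + [suffix])
                if result is not None:
                    return result
        return None

    return search(word, [])


# Toy data (hypothetical, English stand-ins for real affix rules).
LEXICON = {"care", "hope"}
SUFFIXES = ["ness", "less", "ful", "es", "s"]

print(strip_suffixes("carelessnesses", SUFFIXES, LEXICON))
# prints ('care', ['es', 'ness', 'less'])
```

With max_levels=2 the same word is rejected, since it needs three
levels of stripping. A real implementation would also have to undo the
sound changes that happen at the joins, which this sketch ignores.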
On the agglutination of words and suffixes, I came across a paper [1]
while reading about it. Could you please suggest some other
documents as well?
Thanks.
[1]: http://aclweb.org/anthology//O/O12/O12-1028.pdf
Shafeeq