[Student-projects] Varnam can now stem

Kevin Martin youcancallmekevin at gmail.com
Sun Jun 29 08:22:12 PDT 2014


On Sun, Jun 29, 2014 at 6:53 PM, Vasudev Kamath <kamathvasudev at gmail.com>
wrote:

>
> Off topic, not related to this discussion.
>
> aboobacker sidheeque mk <aboobackervyd at gmail.com> writes:
>
> > [Translated from Malayalam:] We discussed this in chat yesterday, so I
> > thought I would post it to the mailing list as well
> > :-)
>
> Can you please translate this? I would suggest you refrain from
> writing comments or replies in Malayalam; there are mentors on this list
> who don't understand Malayalam.
>

> >
> > Take two similar words, ചിരിക്കുക and ഇരിക്കുക. If you stem them, the
> > output will be ചിര and ഇര, but the past tenses of these words are
> > ചിരിച്ചു and ഇരുന്നു respectively. Then how can this stem be used for
> > prediction? ുന്നു is not suitable for ചിര and ിച്ചു is not suitable
> > for ഇര. In Malayalam, verbs alone have ~30 different suffix patterns
> > (or paradigms).
> >
>
I thought about what you said yesterday. Strictly speaking, the goal of the
stemmer is not to find the past tense. But it is true that if ചിരിക്കുക
stems to ചിര, then the stem wouldn't benefit Varnam at all.
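
To make the mismatch concrete, here is a minimal Python sketch. It is not
Varnam's actual stemmer: the single suffix rule and the lookup table
PAST_SUFFIX_BY_STEM are hypothetical, built only from the two verbs
quoted above. It shows that the bare stem is not enough to pick a
past-tense suffix; some extra paradigm information per stem is needed.

```python
# Hypothetical rule: strip the infinitive ending shared by both verbs.
INFINITIVE_SUFFIX = "ിക്കുക"

def stem(word):
    """Strip the infinitive suffix if present, else return the word as is."""
    if word.endswith(INFINITIVE_SUFFIX):
        return word[: -len(INFINITIVE_SUFFIX)]
    return word

# Hypothetical paradigm table (an assumption, only the two quoted verbs):
# the stem alone does not tell us the past-tense suffix, so we look it up.
PAST_SUFFIX_BY_STEM = {
    "ചിര": "ിച്ചു",
    "ഇര": "ുന്നു",
}

def past_tense(word):
    """Return the past tense if the stem's paradigm is known, else None."""
    s = stem(word)
    suffix = PAST_SUFFIX_BY_STEM.get(s)
    return None if suffix is None else s + suffix

for verb in ("ചിരിക്കുക", "ഇരിക്കുക"):
    print(verb, "->", stem(verb), "->", past_tense(verb))
```

Both verbs lose the same ending, yet they need different past-tense
suffixes, so a prediction step would have to store something like a
paradigm id next to each stem.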

> > Similar case with nouns:
> > തിരുവനന്തപുരം -> തിരുവനന്തപുരത്ത്
> > മരം->മരത്തില്‍ (not മരത്തില്‍ )
>
>
I did not understand the example about മരം->മരത്തില്‍ . I do not think any
stemmer can stem nouns properly, as nouns can have foreign roots.
When tested on this article [1], the stemmer reaches an accuracy of
89%. However, this figure reflects the stemmer leaving words alone when
stemming is not necessary, rather than stemming properly where stemming
is necessary. But I noted that Malayalam nouns are usually (not always)
stemmed correctly.
e.g. കോഴിക്കോട്ടെ : കോഴിക്കോട്
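
The point about accuracy can be sketched in Python. This is only an
illustration under assumptions: accuracy_breakdown is a hypothetical
helper, and the toy word list is made up (it is not the test data from
the article). It reports accuracy over all words and, separately, over
only the words whose expected stem differs from the word itself.

```python
def accuracy_breakdown(samples, stem):
    """samples: list of (word, expected_stem) pairs.

    Returns (overall_accuracy, accuracy_on_words_needing_stemming).
    The second value is None if no word in samples needs stemming.
    """
    total_correct = 0
    needed = 0
    needed_correct = 0
    for word, expected in samples:
        ok = stem(word) == expected
        total_correct += ok
        if word != expected:  # this word actually needs stemming
            needed += 1
            needed_correct += ok
    overall = total_correct / len(samples)
    on_needed = needed_correct / needed if needed else None
    return overall, on_needed

# Made-up toy data: 9 words that need no stemming, 1 word that does.
toy = [("w%d" % i, "w%d" % i) for i in range(9)] + [("stemsX", "stems")]

# A "stemmer" that never changes anything.
identity = lambda w: w

print(accuracy_breakdown(toy, identity))
```

On this toy data a stemmer that does nothing at all still gets a high
overall score while failing every word that needed stemming, which is
why the second number is the more informative one.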

[1] ml.wikipedia.org/wiki/തച്ചോളി_ഒതേനൻ

>
> --l
> Vasudev Kamath
> http://copyninja.info
> Connect on ~friendica: copyninja at samsargika.copyninja.info
> IRC nick: copyninja | vasudev {irc.oftc.net | irc.freenode.net}
> GPG Key: C517 C25D E408 759D 98A4  C96B 6C8F 74AE 8770 0B7E
>
> _______________________________________________
> Student-projects mailing list
> Student-projects at lists.smc.org.in
> http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in
>
>