[Student-projects] GSoC Varnam

Kevin Martin youcancallmekevin at gmail.com
Thu Feb 27 12:38:48 PST 2014


Also, on the ideas page, the task was to infer the stem of a word that
carries a prefix. Is this a mistake? Shouldn't it be suffix instead?
Bharat is the stem, and bharateey is the stem with 'eey' added as a
*suffix*, right?
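
To make the suffix case concrete, here is a minimal sketch of what I mean by inferring the stem. It is only an illustration in Python, not Varnam's code: the function name strip_suffix and the suffix list are hypothetical, and a real stemmer for Malayalam would need learned or curated rules rather than a hard-coded list.

```python
def strip_suffix(word, suffixes):
    """Split word into (stem, suffix) using the longest matching suffix.

    Returns (word, "") when no suffix in the list matches.
    """
    # Try longer suffixes first so "eeyata" wins over "eey" when both could apply.
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Skip empty suffixes and require a non-empty stem to remain.
        if suffix and len(word) > len(suffix) and word.endswith(suffix):
            return word[:-len(suffix)], suffix
    return word, ""


# Hypothetical suffix list, made up for this example only.
SUFFIXES = ["eey", "eeyata"]

print(strip_suffix("bharateey", SUFFIXES))
print(strip_suffix("bharat", SUFFIXES))
```

With this list, bharateey splits into the stem bharat and the suffix eey, while bharat alone comes back unchanged because nothing in the list matches it.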


On Fri, Feb 28, 2014 at 1:48 AM, Kevin Martin
<youcancallmekevin at gmail.com> wrote:

> Hi,
>
> I'm Kevin, a 3rd year computer science student from the College of
> Engineering Trivandrum. I accidentally posted my initial inquiries to the
> discussion list of SMC and was thus delayed. I'm interested in improving
> the machine learning capabilities of Varnam. I'm a native speaker of
> Malayalam, which I believe is an added advantage. I have built Varnam on
> my machine using the instructions provided at Gitorious and have been
> trying to read the source. As directed by Navaneeth K N in a previous
> thread, I began by reading learn.c.
>
>
> However, I quickly ran into doubts. Though I think spending a few more
> hours reading the code carefully would reveal the answers, I'd feel
> more comfortable if someone could confirm my understanding:
>
> 1) Token: A token is an indivisible word, the basic building block.
> 'tokens' is an object (instance? I mean the non-OOP equivalent of an
> object) of the type varray. Does 'tokens' contain all the possible
> patterns of a token? For example, would മലയാളം മലയാളത്തിന്റെ മലയാളത്തിൽ മലയാള
> all go under the same varray instance 'tokens'? And each word (e.g. മലയാളം)
> would occupy a slot at tokens->memory, I suppose. Am I right in this regard?
>
> 2) I see the data type 'v_' used frequently. However, I could not find its
> definition! I missed it, of course. Running ctrl+f on a few source files
> did not turn up the definition, so I thought I would simply ask here. I
> would be really grateful if you could tell me where it is defined and why
> it is defined (what it does).
>
> regards,
>
> Kevin Martin Jose
>