[smc-discuss] GSOC2013 Proposal - Developing Acoustic and Language Model for Malayalam Recognition

Rahul A.R 2ar.rahul at gmail.com
Tue May 7 05:19:30 PDT 2013


I am sorry that I could not edit the proposal in time, since I had my
university exam on 6 May. Kindly consider this comment as my response
to the details requested above.



> "During this period I will be collecting all the voice data and text
> corpora required for the acoustic model and language model respectively.
> Grapheme-to-phoneme conversion and optimal text selection algorithms will
> be used to optimize the text corpora. Choosing appropriate speakers based
> on data statistics will also be done during this period."



> Can this be a bit more detailed? As per the given timeline, this activity
> is planned for 15 days. Is it that simple? I can see technical and
> logistical overhead that can make your time estimate wrong.



I have experience working on a project that involved modelling a
closed-vocabulary acoustic system.



Organisations such as CDAC and LDC-IL have already collected text and speech
corpora for continuous speech recognition in Malayalam, and I am planning to
obtain data from them. I have already contacted CDAC and collected voice data
(7 GB) from them, but I found that there is scope for further improvement in
their data. Judging from the sample data available on the LDC-IL website, the
data they have collected seems to be more refined and of better quality.



If either organisation is not willing to release its data under a
free license, I am planning to conduct a data collection drive. I have
experience collecting data for a closed-vocabulary acoustic system, and I
believe I can collect sufficient data in 15 days.



> What do you mean by graphemes to phoneme conversion? Is it TTS
> functionality? What kind of text selection algorithm is planned, and what
> exactly is the purpose of that? Giving some more detail on that will
> help us in understanding the complexity and reliability of the time planned.




After collecting the text data (news text), I will perform grapheme-to-phoneme
conversion to split the text into phonemes. The text corpus should contain
as many occurrences of the less frequent phones as possible, so that the
recorded utterances cover the maximum variation. For that we use an optimal
text selection algorithm. The algorithm we are planning to implement is based
on a paper submitted to the 2011 International Conference on Asian Language
Processing (Link: cse.iitkgp.ac.in/~pabitra/paper/ialp.pdf).
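
As a rough illustration of the selection step described above, here is a
minimal Python sketch. It is not the algorithm from the linked paper; it is
a generic greedy heuristic that repeatedly picks the sentence contributing
the most phones that are still rare in the selection. The function name
select_sentences, the phone labels and the toy corpus are all hypothetical,
and the input is assumed to have already been converted to phones by a
separate grapheme-to-phoneme step.

```python
from collections import Counter


def select_sentences(phonemized, budget):
    """Greedily pick up to `budget` sentences, favouring phones that are
    still rare in the selection so far.

    `phonemized` maps a sentence id to its list of phone symbols
    (assumed to come from a separate grapheme-to-phoneme step).
    """
    selected = []
    counts = Counter()            # phone counts over the chosen sentences
    remaining = dict(phonemized)  # copy, so the caller's dict is untouched

    def gain(item):
        _, phones = item
        # Each distinct phone seen n times so far is worth 1 / (n + 1);
        # dividing by sentence length avoids always preferring long sentences.
        score = sum(1.0 / (counts[p] + 1) for p in set(phones))
        return score / max(len(phones), 1)

    while remaining and len(selected) < budget:
        best_id, best_phones = max(remaining.items(), key=gain)
        selected.append(best_id)
        counts.update(best_phones)
        del remaining[best_id]
    return selected, counts


# Toy, made-up corpus: the phone labels are placeholders,
# not a real Malayalam phone set.
corpus = {
    "s1": ["a", "m", "a"],
    "s2": ["a", "m", "m", "a"],
    "s3": ["zh", "a"],
    "s4": ["k", "a", "zh", "i"],
}
selected, counts = select_sentences(corpus, budget=2)
print(selected, dict(counts))
```

In practice the budget would be set by the amount of speech that can be
recorded, and the gain function could be swapped for whatever criterion the
chosen paper defines.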



> "Once a working acoustic and language model has been formed, further
> language specific improvements can be performed. Consulting linguists for
> incorporating Malayalam grammar rules to improve the recognition accuracy
> of the speech recognition system is one such method."



> This activity is planned for 12 days. It is clearly ambitious. Or maybe
> your deliverables for this period are that simple. So can you give details
> on what the deliverable from these 12 days is and how it helps in the next
> timeline?




As planned, I will use August to address and solve as many as possible of the
language-specific problems mentioned in the project proposal. For that we need
the help of linguists, and I have set aside 12 days (July 16 - July 28) for
this. The time will be used for two things.



   1. Learn how to modify the Sphinx engine to make language-specific
   improvements.
   2. Understand and identify the important linguistic improvements that
   can be made.



NB: I have posted the same as a comment on google-melange too.


On Thu, May 2, 2013 at 4:23 PM, Rahul A.R <2ar.rahul at gmail.com> wrote:

> Thanks for the feedback. I have updated my proposal, pointing out some of
> the linguistic and language-specific challenges we will face during the
> course of our project. I have made the timeline as clear as possible.
> Kindly go through it.
>
> Link : http://wiki.smc.org.in/User:Ar_rahul/GSoC2013/
>
> Regards,
> A.R.Rahul
>
>
> On Wed, May 1, 2013 at 6:11 PM, Anivar Aravind <anivar.aravind at gmail.com> wrote:
>
>> As Deepa pointed out, your proposal still lacks an understanding of the
>> issues in language modeling and acoustic modeling, their complexities,
>> linguistic challenges, and possible issues that need to be
>> addressed. It only talks about your awareness of and familiarity with tools.
>>
>> Since there is not much understanding or domain study, the timeline also
>> lacks clarity.
>> You have only limited time to do background study and improve your
>> proposal.
>>
>> Anivar
>>
>>
>>
>> --
>> "[It is not] possible to distinguish between 'numerical' and
>> 'nonnumerical'
>> algorithms, as if numbers were somehow different from other kinds of
>> precise
>> information." - Donald Knuth
>> _______________________________________________
>> Swathanthra Malayalam Computing discuss Mailing List
>> Project: https://savannah.nongnu.org/projects/smc
>> Web: http://smc.org.in | IRC : #smc-project @ freenode
>> discuss at lists.smc.org.in
>> http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in
>>
>>
>