[Student-projects] Language model and Acoustic model for Malayalam (Deepa P.Gopinath) as GSOC Project

karan singla ksingla025 at gmail.com
Sun Mar 9 16:00:39 PDT 2014


Hello Deepa,

I am Karan, working at LTRC, IIIT-Hyderabad. I have worked on a project
co-funded by AT&T to build an ASR system for Hindi, and I have tried
adaptive acoustic modelling for Kannada and Malayalam (the results were
not great).


As you suggested, we can begin with a small, freely available speech
corpus for Malayalam:

http://festvox.org/databases/iiit_voices/

This is not sufficient, but it is enough to begin with. We will need to
record more data in the future.

For Acoustic Modelling:

There is a freely available phonetic dictionary for Hindi, in which Hindi
graphemes have been mapped to the American English phone set, as Sphinx is
built around the English phone set and we don't have enough speech data to
create a new model. So only adaptation is possible at first.

Malayalam is a Dravidian language, and I believe a phonetic dictionary for
Telugu is available in the speech lab at my university, but I need to check
whether they can share it. Adapting from Telugu would then be a better
option, as Telugu is "closer" to Malayalam than Hindi is.

After building a model with this dictionary, we need to generate phonetic
mappings for all the words in the transcription files of the speech corpus.
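
A minimal sketch of that dictionary-generation step, assuming one
utterance per line of plain text. The mapping tables are hypothetical and
deliberately tiny (they are not a real phone set), and the rule that a
consonant keeps an inherent vowel unless a vowel sign or virama follows is
a simplification:

```python
# Hypothetical grapheme-to-phone tables (assumptions for illustration only).
CONSONANTS = {
    "\u0d15": "K",   # MALAYALAM LETTER KA
    "\u0d2e": "M",   # MALAYALAM LETTER MA
    "\u0d32": "L",   # MALAYALAM LETTER LA
    "\u0d28": "N",   # MALAYALAM LETTER NA
}
VOWEL_SIGNS = {
    "\u0d3e": "AA",  # MALAYALAM VOWEL SIGN AA
    "\u0d3f": "IH",  # MALAYALAM VOWEL SIGN I
    "\u0d41": "UH",  # MALAYALAM VOWEL SIGN U
}
INDEPENDENT_VOWELS = {
    "\u0d05": "AH",  # MALAYALAM LETTER A
    "\u0d06": "AA",  # MALAYALAM LETTER AA
    "\u0d07": "IH",  # MALAYALAM LETTER I
}
VIRAMA = "\u0d4d"    # suppresses the inherent vowel
INHERENT_VOWEL = "AH"


def word_to_phones(word):
    """Map one word to a list of phones; raise KeyError on unknown characters."""
    phones = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
            # Keep the inherent vowel unless a vowel sign or virama follows.
            if nxt not in VOWEL_SIGNS and nxt != VIRAMA:
                phones.append(INHERENT_VOWEL)
        elif ch in VOWEL_SIGNS:
            phones.append(VOWEL_SIGNS[ch])
        elif ch in INDEPENDENT_VOWELS:
            phones.append(INDEPENDENT_VOWELS[ch])
        elif ch == VIRAMA:
            continue
        else:
            raise KeyError(ch)
    return phones


def build_dictionary(transcription_lines):
    """Return (entries, unmapped): word -> phones, plus words we could not map."""
    entries = {}
    unmapped = []
    for line in transcription_lines:
        for word in line.split():
            if word in entries or word in unmapped:
                continue
            try:
                entries[word] = word_to_phones(word)
            except KeyError:
                unmapped.append(word)
    return entries, unmapped


def format_entries(entries):
    """Render entries as 'word PH1 PH2 ...' lines, one word per line."""
    return [f"{word} {' '.join(phones)}" for word, phones in sorted(entries.items())]
```

Words that cannot be mapped are returned separately, so they can be
checked by hand instead of being silently dropped from the dictionary.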

For Language Modelling:

The transcriptions will be included for sure. I am not aware of any raw
text available in Malayalam. Is there any raw text data available?
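
To illustrate how the transcriptions would feed the language model, here
is a minimal sketch of n-gram counting, assuming one utterance per line
and whitespace tokenisation. The function name is hypothetical, and a real
language-modelling toolkit would also handle smoothing and back-off, which
this sketch does not attempt:

```python
from collections import Counter


def count_ngrams(sentences, n=2):
    """Count n-grams over sentences, padding with <s> and closing with </s>."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue  # skip empty lines
        tokens = ["<s>"] * (n - 1) + words + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts
```

Any additional raw Malayalam text could be passed through the same
function together with the transcriptions.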

Am I thinking along the right lines?

Hoping for a reply soon,
Karan Singla
LTRC, IIIT-Hyderabad