Saturday, 1 February 2014

VOWEL RECOGNITION USING NEURAL NETS

In recent years, voice recognition systems, and speech recognition systems more generally, have captured the interest of many people. By speech recognition we mean the translation of spoken words into text.

Companies like Google and Microsoft have shown strong interest in this technology, which is closely tied to their investments in handheld devices such as smartphones and tablets. I think the subject will remain popular in the coming years, since the trend is toward touch-less systems.

In this work I tried to understand how neural nets can be applied to vowel recognition in particular. Speech is one of the most important human abilities, and we learn most of it as children. Neural nets are loosely inspired by the way the human brain works, so I decided to apply this technology to a basic (but very complex) human task. My goal is to understand how suitable neural nets are in this field, and which strategies need to be taken into account.

As a result, the goal of this work is to verify whether a vowel can be recognized when it is pronounced on its own: if that works, we can then try to recognize it within a word.

Moreover, I tried to make things as accessible as possible: all you need is a browser (Chrome or Safari in particular). I used the Web Audio API (an emerging standard related to HTML5), which offers a wide range of tools. For the neural network side I used a very good library called brain-js, which worked very well.
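To give a rough idea of how the two pieces could fit together, here is a minimal sketch in TypeScript. It assumes that the features are the byte magnitudes (0 to 255) that the Web Audio API's AnalyserNode.getByteFrequencyData writes into a Uint8Array, and that each training sample is an object with an input array and a labeled output, which is the shape brain-js accepts for training data. The names VOWELS, normalizeSpectrum and makeSample are hypothetical and do not come from the application; the features actually used are described in the specification document, not here.

```typescript
// Hypothetical names for illustration only; not taken from the application.
const VOWELS = ["a", "e", "i", "o", "u"] as const;
type Vowel = (typeof VOWELS)[number];

// One training sample: a normalized spectrum plus a labeled output.
interface TrainingSample {
  input: number[];
  output: Record<string, number>;
}

// Scale byte magnitudes (0..255) into the 0..1 range.
// Assumption: `bins` was filled by AnalyserNode.getByteFrequencyData.
function normalizeSpectrum(bins: Uint8Array): number[] {
  const result: number[] = [];
  for (let i = 0; i < bins.length; i++) {
    result.push(bins[i] / 255);
  }
  return result;
}

// Pair a spectrum snapshot with the vowel that was being pronounced.
function makeSample(bins: Uint8Array, vowel: Vowel): TrainingSample {
  return { input: normalizeSpectrum(bins), output: { [vowel]: 1 } };
}
```

A list of such samples, a few per vowel, could then be handed to the network's training routine. Scaling the inputs to the 0..1 range keeps them in the interval that sigmoid-based networks handle best.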

DEMO

To use the application, just click this link. If you want to run it locally, you need a local web server (such as Apache, or WAMP for Windows users). Put the folder in the server's www folder and open the index page from localhost. If you want to try my training set (an Italian one), you can load it from within the application.

SPECIFICATION DOCUMENT 

If you want to keep reading, download the whole document here.