DSpace Repository

SPEAKER DEPENDANT ISOLATED HINDI WORD RECOGNITION

dc.contributor.author D, Shankar Babu
dc.date.accessioned 2014-12-05T06:16:18Z
dc.date.available 2014-12-05T06:16:18Z
dc.date.issued 2005
dc.identifier M.Tech en_US
dc.identifier.uri http://hdl.handle.net/123456789/13153
dc.guide Anand, R. S.
dc.description.abstract Although speech recognition products are already available in the market, their development is mainly based on statistical techniques that work only under very specific assumptions. This thesis examines how artificial neural networks can benefit a speaker-dependent isolated speech recognition system. Currently, most speech recognition systems are based on hidden Markov models (HMMs), a statistical framework that supports both acoustic and temporal modeling. Despite their state-of-the-art performance, HMMs make a number of suboptimal modeling assumptions that limit their potential effectiveness. Neural networks avoid many of these assumptions; they can also learn complex functions, generalize effectively, tolerate noise, and support parallelism, and they can readily be applied to acoustic modeling. A neural network has several theoretical advantages over a pure HMM system, including better acoustic modeling accuracy, better context sensitivity, more natural discrimination, and a more economical use of parameters. These advantages are confirmed experimentally by a neural network that we developed, based on speaker-dependent isolated Hindi words on the Resource Management database. Speech recognition involves recording the input speech signal, extracting the key features of the speech, converting the features into codes, and finally classifying the codes. The speech recognizer system comprises two distinct blocks, a Feature Extractor and a Recognizer. The Feature Extractor block uses Mel-frequency cepstral analysis, which translates the incoming speech into feature vectors, and the Recognizer block uses a neural network. In the course of developing this system, we explored two different ways to use neural networks for audio modeling: prediction and classification. We found that predictive networks yield poor results because of a lack of discrimination, but classification networks gave excellent results.
Finally, this thesis reports how we optimized the accuracy of our system with many natural techniques, such as expanding the input window size, normalizing the inputs, increasing the number of hidden units, converting the network's output activations to log likelihoods, optimizing the learning rate schedule by automatic search, backpropagating error from word-level outputs, and using gender-dependent networks. en_US
dc.language.iso en en_US
dc.subject ELECTRICAL ENGINEERING en_US
dc.subject SPEAKER DEPENDANT ISOLATED HINDI WORD RECOGNITION en_US
dc.subject SPEECH RECOGNITION en_US
dc.subject HIDDEN MARKOV MODELS en_US
dc.title SPEAKER DEPENDANT ISOLATED HINDI WORD RECOGNITION en_US
dc.type M.Tech Dissertation en_US
dc.accession.number G12332 en_US
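
Illustrative note: the abstract names three optimizations without saying how they were implemented: normalizing the inputs, expanding the input window, and converting output activations to log likelihoods. The sketch below shows one common way to do each in plain Python. The function names, the zero-mean/unit-variance scaling, the edge-padding rule, and the division of posteriors by class priors are all assumptions made for illustration, not details taken from the thesis.

```python
import math

def normalize_features(frames):
    """Scale each feature dimension to zero mean and unit variance (assumed scheme)."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(math.sqrt(var) or 1.0)  # constant dimension: avoid dividing by zero
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]

def stack_context(frames, left, right):
    """Expand the input window: join each frame with its neighbours.

    Frames beyond the start or end are replaced by the first or last frame
    (an assumed padding rule).
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for k in range(t - left, t + right + 1):
            window.extend(frames[min(max(k, 0), n - 1)])
        stacked.append(window)
    return stacked

def posteriors_to_log_likelihoods(posteriors, priors, floor=1e-10):
    """Turn class posteriors P(c|x) into scaled log likelihoods.

    By Bayes' rule, log p(x|c) = log P(c|x) - log P(c) + log p(x); the last
    term is the same for every class, so it is dropped here.
    """
    return [math.log(max(p, floor)) - math.log(max(q, floor))
            for p, q in zip(posteriors, priors)]

# Hypothetical toy data: three frames of two features each.
frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
windows = stack_context(normalize_features(frames), left=1, right=1)
scores = posteriors_to_log_likelihoods([0.5, 0.5], [0.25, 0.75])
```

With one frame of context on each side, every two-feature frame becomes a six-feature network input. In the last line, two classes with equal posteriors receive different scores because the class with the smaller prior is favoured once the prior is divided out.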