Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/16182
Title: SPOKEN LANGUAGE IDENTIFICATION USING DEEP NEURAL NETWORKS
Authors: Yadav, Manish Kumar
Keywords: Deep Neural Networks; Language Identification; Indian; Five South Asian
Issue Date: Jun-2018
Publisher: IIT ROORKEE
Abstract: This thesis studies the use of deep neural networks (DNNs) for automatic language identification (LID). The recent success of DNNs in speech processing and pattern recognition motivated us to apply them to language identification using MFCC, delta, and double-delta MFCC features. DNN architectures have properties that make them suitable for difficult tasks, among which automatic language identification can be highlighted. Their capability to model complex functions in high-dimensional spaces and to obtain a good representation of the input data makes these architectures and algorithms well suited to processing complex signals. This thesis studies various approaches that combine the fields of deep learning and automatic language recognition, to improve the LID task by obtaining a better representation of voice signals for classification, so that the language used in a voice signal can be identified. To this end, DNN, SVM, and KNN LID systems have been studied thoroughly and implemented experimentally. For this purpose, a completely new dataset of 8 Indian and five South Asian languages was collected, since a formal speech corpus is not available for these languages. The total speech data collected is about 51.24 hours. In this thesis, four major improvements are proposed over the state-of-the-art i-vector mechanism with GMM. First, we replace the GMM-based LID classifier with a five-layer DNN. Second, we use three acoustic features (MFCC, delta, and double-delta MFCC), which reduces computing costs. Third, we propose a direct approach that uses the DNN for both feature extraction and classification. Finally, we use long-term speech sequences of 10 to 15 seconds to improve accuracy; however, the number of frames is reduced to only 21 to avoid latency. When compared with SVM and KNN classifiers on the same dataset, the DNN outperforms both.
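As a rough illustration of the feature pipeline the abstract mentions (MFCC, delta, and double-delta features, with 21 frames fed to the network), the sketch below computes regression-based deltas and stacks a 21-frame window. It is not the thesis's implementation: the function names `delta`, `add_deltas`, and `stack_context`, the regression half-width `n=2`, the edge handling by repeating the first and last frame, and the reading of "21 frames" as a context window centred on one frame are all assumptions made for the example. The input is assumed to be an already-computed MFCC matrix, given as a list of per-frame coefficient lists.

```python
def delta(frames, n=2):
    """Regression-based delta of per-frame coefficients (assumed half-width n)."""
    num_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, n + 1):
                # Assumption: repeat the first/last frame at the edges.
                ahead = frames[min(t + k, num_frames - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_deltas(mfcc, n=2):
    """Concatenate static, delta, and double-delta coefficients per frame."""
    d1 = delta(mfcc, n)
    d2 = delta(d1, n)  # double delta = delta of the delta
    return [s + a + b for s, a, b in zip(mfcc, d1, d2)]


def stack_context(features, center, width=21):
    """Flatten `width` frames around `center` into one DNN input vector."""
    half = width // 2
    last = len(features) - 1
    stacked = []
    for t in range(center - half, center + half + 1):
        # Clamp indices so windows near the edges keep the same length.
        stacked.extend(features[min(max(t, 0), last)])
    return stacked
```

With, say, 13 MFCCs per frame, `add_deltas` would yield 39 values per frame and `stack_context` a 21 x 39 = 819-dimensional input vector; the actual coefficient count used in the thesis is not stated in this record.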
URI: http://localhost:8081/jspui/handle/123456789/16182
Type: Other
Appears in Collections: MASTERS' THESES (E & C)

Files in This Item:
File: G28086.pdf
Size: 4.54 MB
Format: Adobe PDF

