SPOKEN LANGUAGE IDENTIFICATION USING DEEP NEURAL NETWORKS

Yadav, Manish Kumar

dc.contributor.author	Yadav, Manish Kumar
dc.date.accessioned	2025-05-11T15:03:24Z
dc.date.available	2025-05-11T15:03:24Z
dc.date.issued	2018-06
dc.identifier.uri	http://localhost:8081/jspui/handle/123456789/16182
dc.description.abstract	This thesis studies the use of deep neural networks (DNNs) for automatic language identification (LID). The recent success of DNNs in speech processing and pattern recognition motivated us to apply them to language identification using MFCC, delta, and double-delta MFCC features. DNN architectures have properties that make them suitable for difficult tasks, among which automatic language identification can be highlighted. Their capability to model complex functions in high-dimensional spaces and to learn a good representation of the input data makes these architectures and algorithms well suited to processing complex signals. The thesis studies various approaches that combine the fields of deep learning and automatic language recognition, with the aim of improving the LID task by obtaining a better representation of voice signals for classification, so that the language used in a given voice signal can be identified. To this end, DNN, SVM, and KNN LID systems have been studied thoroughly and implemented experimentally. For this purpose, a completely new dataset of eight Indian and five South Asian languages has been collected, since a formal speech corpus is not available for these languages. The total speech data collected is about 51.24 hours. Four major improvements are proposed over the state-of-the-art i-vector mechanism with GMM. First, we replace the GMM-based LID classifier with a five-layer DNN. Second, we use three acoustic features (MFCC, delta, and double-delta MFCC), which reduces computing costs. Third, we propose a direct approach that uses the DNN for both feature extraction and classification. Finally, we use long-term speech sequences of 10 to 15 seconds to improve accuracy, although the number of frames is reduced to only 21 to avoid latency. When compared with the SVM and KNN classifiers on the same dataset, the DNN outperforms both.	en_US
dc.description.sponsorship	INDIAN INSTITUTE OF TECHNOLOGY ROORKEE	en_US
dc.language.iso	en	en_US
dc.publisher	I I T ROORKEE	en_US
dc.subject	Deep Neural Networks	en_US
dc.subject	Language Identification	en_US
dc.subject	Indian	en_US
dc.subject	Five South Asian	en_US
dc.title	SPOKEN LANGUAGE IDENTIFICATION USING DEEP NEURAL NETWORKS	en_US
dc.type	Other	en_US
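
The abstract names MFCC, delta, and double-delta MFCC as the acoustic features. As a minimal sketch of how delta features are commonly derived from a matrix of MFCC frames, the following uses the standard regression formula. It is not taken from the thesis: the function names `delta` and `add_deltas`, the regression width of 2, the choice of 13 coefficients per frame, and the random placeholder data are all assumptions made for illustration.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based delta along the time axis.

    `features` has shape (num_frames, num_coeffs). The width of 2 is an
    assumed, commonly used value, not one stated in the abstract.
    """
    num_frames = features.shape[0]
    # Repeat the first and last frames so edge frames have neighbours.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def add_deltas(mfcc: np.ndarray) -> np.ndarray:
    """Stack static, delta, and double-delta coefficients per frame."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return np.hstack([mfcc, d1, d2])


# Placeholder input: 21 frames of 13 coefficients (both sizes are
# illustrative; real MFCCs would come from a speech front end).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((21, 13))
features = add_deltas(mfcc)
print(features.shape)
```

With these assumed sizes, each frame carries three times as many values as the static MFCC vector alone, and a classifier such as the DNN, SVM, or KNN mentioned in the abstract would consume the stacked frames.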