Please use this identifier to cite or link to this item: http://localhost:8081/xmlui/handle/123456789/11519
Title: DEVELOPMENT OF ALGORITHM FOR MULTIPLE SPEAKER RECOGNITION
Authors: Anand, Vivek
Keywords: ELECTRICAL ENGINEERING;MULTIPLE SPEAKER RECOGNITION;BLIND SOURCE SEPARATION;SPEECH RECOGNITION
Issue Date: 2011
Abstract: Multiple speaker recognition is the computing task of verifying a speaker's identity using characteristics extracted from voice samples. It uses acoustic features of speech that have been found to differ between individuals. These acoustic patterns reflect both anatomy (e.g., the size and shape of the throat and vocal cords) and behavioural patterns (e.g., voice pitch, speaking speed, and accent). Conventional speaker identification and speech recognition algorithms do not perform well when multiple speakers are present in the background. For high-performance speaker identification and speech recognition in multiple-speaker environments, a speech separation stage is essential. The main aim of this dissertation is multiple speaker recognition, which consists of comparing a speech signal from a test speaker against a database of known (trained) speakers. The system is trained with a number of speakers and can then recognize the speaker. The task is divided into two steps: (1) source separation and (2) feature extraction and feature matching. In this work, mixtures of simultaneous speech are taken as source mixtures and are successfully separated into independent components using Independent Component Analysis (ICA), a powerful Blind Source Separation (BSS) technique based on higher-order statistics. ICA is a method of finding components in multidimensional data; it distinguishes the components on the basis of statistical independence and non-Gaussianity. A speaker recognition system is developed that distinguishes one person from another on the basis of speech features called Mel Frequency Cepstral Coefficients (MFCC). The large set of points, i.e. the MFCC coefficients, is divided using Vector Quantization (VQ) into groups, each having approximately the same number of points closest to it; that is, the data are compressed using VQ and then stored. Each group is represented by its centroid point, and the centroids together form the codebook. The system reports a match for the trained codebook that yields the lowest distortion distance to the tested codebook. The objective of multiple speaker recognition has been achieved, and the recognition accuracy is good.
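The feature-matching step described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes plain k-means in place of whatever codebook training the author used, it scores a test utterance by the average squared distance from its feature vectors to a trained codebook (a common variant of comparing codebooks directly), and it substitutes synthetic 2-D points for real MFCC vectors. All function names (`train_codebook`, `distortion`, `identify`, `fake_features`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, iters=20, seed=0):
    """Build a k-centroid codebook with plain k-means (an assumption;
    the source does not say which VQ training procedure was used)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            groups[nearest].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a group ends up empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def distortion(vectors, codebook):
    """Average distance from each vector to its nearest centroid."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(test_vectors, codebooks[name]))


def fake_features(center, n, seed):
    """Synthetic stand-in for MFCC vectors: noisy points around a center."""
    rng = random.Random(seed)
    return [[c + rng.gauss(0, 0.3) for c in center] for _ in range(n)]


# Train one codebook per (hypothetical) speaker, then match a test utterance.
training = {
    "speaker_a": fake_features([0.0, 0.0], 50, seed=1),
    "speaker_b": fake_features([3.0, 3.0], 50, seed=2),
}
codebooks = {name: train_codebook(vecs, k=4) for name, vecs in training.items()}
test_utterance = fake_features([3.0, 3.0], 20, seed=3)
result = identify(test_utterance, codebooks)
print(result)
```

In a real system the synthetic points would be replaced by MFCC vectors computed from each separated source, and the codebook size `k` would be chosen empirically.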
URI: http://hdl.handle.net/123456789/11519
Other Identifiers: M.Tech
Research Supervisor/ Guide: Anand, R. S.; Dewal, M. L.
metadata.dc.type: M.Tech Dissertation
Appears in Collections: MASTERS' THESES (Electrical Engg)

Files in This Item:
File: EEDG21219.pdf
Size: 3.95 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.