SPEECH MODELING AND COMPRESSION

Arif, Mohammad

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/17342

Full metadata record

DC Field	Value	Language
dc.contributor.author	Arif, Mohammad	-
dc.date.accessioned	2025-06-30T12:47:06Z	-
dc.date.available	2025-06-30T12:47:06Z	-
dc.date.issued	2013-05	-
dc.identifier.uri	http://localhost:8081/jspui/handle/123456789/17342	-
dc.description.abstract	Speech is the most desirable form of communication among individuals. In speech signal processing, speech modeling and compression are emerging areas of application. The speech signal is modeled for the purpose of synthesis, and compression is performed on speech signals taken from standard databases. Applications of speech analysis and synthesis include speech modification, speech coding, speech enhancement and speaker recognition. Data compression is the science of processing data from an input source to obtain a compressed form of the input data with at most some tolerable loss of quality. Such compression is required because of storage constraints, limited bandwidth and the limited capacity of a communication channel. Applications of data compression are found in communication systems, speech and image processing, pattern recognition, information retrieval, storage and cryptography. Speech compression is an area of signal processing that emphasizes minimizing the bit rate or reducing the redundancy of the input speech signal. The compressed signal may be used either for onward transmission and further processing, with some tolerable loss of the input information, or to improve the quality of the speech signal. In this research work, the synthesis of the speech signal based on certain models is considered. Various methods are proposed: a method based on DST, the SYMPES method, a model order reduction method and the HRTF method. The performance of these methods is evaluated in terms of some or all of the parameters MSE, SNR, PSNR, RSE, NRMSE, SD, MOS and Average MOS. Besides the synthesis of speech signals, a number of speech compression algorithms such as DE, HC, combined DE and HC, RLE and TP have been applied to the standard speech signals. In addition, the speech signals are compressed by various transform domain techniques such as FFT, DCT and DST. 
The above speech compression algorithms are applied to the standard speech databases with the objective of obtaining a CR as high as possible while retaining the quality of the compressed speech signal. Further, in the polynomial method, the order of the polynomial is reduced with an emphasis on almost error-free transmission of the speech signal. One of the methods for modeling the speech signal uses discrete transforms such as DST and FFT. The proposed discrete transform model generates the speech signal by extracting signal parameters such as amplitude and phase from the database. The designed discrete transform model is tested on a standard IPA speech database comprising Consonants, Conventions and Vowels of American English. Speech signal evaluation parameters such as MSE, SNR, PSNR, NRMSE and SD are computed to judge the effectiveness of the designed model in generating the speech signal. Another speech generating model, known as SYMPES, is implemented in a simplified manner that requires fewer steps, since it uses only the first term of all the computed energies, which contains the maximum energy among all the components. In this way it avoids the exhaustive number of steps involved in implementing the standard, systematic SYMPES method. This simplified method is tested on a few standard databases of input speech signals. The results and the computed performance parameters indicate that the simplified method models the speech signal effectively. A model order reduction approach such as the Schur form is used to reduce the order of a higher order system while retaining all of its inherent properties. Further, the input speech signal is applied separately to both systems, and finally the speech signals are reconstructed at the decoder end by a deconvolution operation for both the systems 
separately. The standard speech signal parameters such as MSE, SNR, PSNR, NRMSE and SD are computed for both systems. The computed values indicate that the reduced order model performs better than the higher order model for the speech signals considered. As proposed, the speech signal is also modeled by the HRTF method. This also results in satisfactory reconstruction of the speech signal at the decoder end in both noise-free and noisy channel environments. The improvements in the computed speech signal parameters signify the effective performance of the HRTF model for speech synthesis in both environments, that is, over channels with and without noise. A speech quality test is also performed. For this, 5 listeners were asked to give their opinion of the generated speech, recorded as numerical opinion scores (OS). After recording the listeners' OS, the MOS and Average MOS are computed. The computed MOS and Average MOS values indicate the effective performance of the HRTF model. In this research work, the speech compression aspect is also considered. A number of speech compression algorithms such as DE, HC, combined DE and HC, and RLE are implemented. The CR is computed at the encoder end. All of the speech compression techniques yield CR values corresponding to a reasonable level of compression. Performance parameters such as CR, MSE, SNR, PSNR and NRMSE are computed for the discrete transform domain techniques FFT, DCT and DST applied here for speech compression. It is found that DST performs better than FFT and DCT for the speech signals considered, as reflected in the improvement in the above performance parameters. The TP algorithm is also applied to various standard speech signals taken from IPA. It is found that this algorithm is also suitable for speech signal compression. 
It gives a CR of 2 for the speech signals considered. The MOS and Average MOS are also computed for the different speech signal patterns. The TP algorithm gives an excellent response in terms of the hearing quality of the speech signal at the encoder end. The coding of compressed samples at the encoder end is also performed by this algorithm, with evaluation of the MSE. The polynomial method is also implemented to reduce the number of coefficients based on degree and, as with the above algorithms, speech signal parameters are computed. The computed parameters show almost error-free transmission of the speech signal. In a nutshell, the work reported in this thesis is an effort to determine the suitability of various speech modeling techniques for generating the speech signal. An effort is also made to compress the speech signal to an appropriate level.	en_US
dc.description.sponsorship	INDIAN INSTITUTE OF TECHNOLOGY ROORKEE	en_US
dc.language.iso	en	en_US
dc.publisher	I I T ROORKEE	en_US
dc.subject	Among Individuals	en_US
dc.subject	Data Compression	en_US
dc.subject	Synthesizing	en_US
dc.subject	Speech Modification	en_US
dc.title	SPEECH MODELING AND COMPRESSION	en_US
dc.type	Other	en_US
Appears in Collections:	MASTERS' THESES (Electrical Engg)
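
The abstract evaluates every method with a common set of figures of merit (MSE, SNR, PSNR, NRMSE and CR) but, being an abstract, does not give their formulas. The following is a minimal sketch of commonly used definitions, not the thesis's own code. The function names are hypothetical, and the normalization choices (PSNR relative to the peak absolute sample of the original, NRMSE normalized by the energy of the mean-removed original) are assumptions; the thesis may define them differently.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equal-length sample sequences."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB: signal energy over error energy."""
    signal_energy = sum(o ** 2 for o in original)
    noise_energy = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    if noise_energy == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_energy / noise_energy)


def psnr_db(original, reconstructed):
    """Peak SNR in dB. Assumption: peak = largest absolute original sample."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    peak = max(abs(o) for o in original)
    return 10 * math.log10(peak ** 2 / error)


def nrmse(original, reconstructed):
    """Normalized RMSE. Assumption: normalized by mean-removed signal energy."""
    mean = sum(original) / len(original)
    spread = sum((o - mean) ** 2 for o in original)
    error = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return math.sqrt(error / spread)


def compression_ratio(original_size, compressed_size):
    """CR as original size over compressed size (same units for both)."""
    return original_size / compressed_size


if __name__ == "__main__":
    # Illustrative samples only; these are not data from the thesis.
    x = [0.0, 1.0, 0.0, -1.0]
    y = [0.0, 0.5, 0.0, -0.5]
    print(mse(x, y), snr_db(x, y), psnr_db(x, y), nrmse(x, y))
    print(compression_ratio(8, 4))
```

With these definitions, a method that keeps half of the samples has a CR of 2, which matches the value the abstract reports for the TP algorithm.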

Files in This Item:

File	Description	Size	Format
G23255.pdf		16.8 MB	Adobe PDF
