Synthetic, or artificial, speech has developed steadily over the last few decades. In particular, its intelligibility has reached a level adequate for most applications, especially those serving communication-impaired people. The intelligibility of synthetic speech can also be increased considerably by adding visual information. The objective of this work is to survey the current state of speech synthesis technology. Speech synthesis may be categorized as restricted (messaging) or unrestricted (text-to-speech) synthesis. The former is suitable for announcement and information systems, while the latter is needed, for example, in applications for the visually impaired. The text-to-speech procedure consists of two main phases, usually called high-level and low-level synthesis. In high-level synthesis, the input text is converted into a form from which the low-level synthesizer can produce the output speech. The three basic methods for low-level synthesis are formant, concatenative, and articulatory synthesis. Formant synthesis is based on modeling the resonances of the vocal tract and has been perhaps the most commonly used method in recent decades. However, concatenative synthesis, which is based on playing back prerecorded samples of natural speech, is becoming more popular. In theory, the most accurate method is articulatory synthesis, which models the human speech production system directly, but it is also the most difficult approach. This work attempts to develop a Hindi text-to-speech synthesizer with minimal errors using the concatenative approach.
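The two-phase procedure with a concatenative back end can be illustrated with a minimal sketch. It is not the synthesizer developed in this work: the unit labels, the `INVENTORY` table, the sample rate, and the function names `high_level`, `low_level`, and `synthesize` are all assumptions made for illustration, and short sine bursts stand in for prerecorded natural-speech samples.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed for this sketch


def tone(freq_hz, duration_ms=50):
    """Stand-in for a prerecorded unit: a short sine burst."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical unit inventory keyed by unit label. A real concatenative
# system would store recorded natural-speech segments here, not tones.
INVENTORY = {
    "na": tone(220),
    "ma": tone(330),
    "ste": tone(440),
}

# Short pause used when a unit is missing from the inventory.
SILENCE = [0.0] * (SAMPLE_RATE * 20 // 1000)


def high_level(text):
    """High-level phase (toy version): turn input text into unit labels.

    Here the text is simply lowercased and split on whitespace; a real
    front end would perform text normalization and unit segmentation.
    """
    return text.lower().split()


def low_level(units, inventory=INVENTORY):
    """Low-level phase: concatenate the stored samples for each unit."""
    samples = []
    for unit in units:
        samples.extend(inventory.get(unit, SILENCE))
    return samples


def synthesize(text):
    """Run both phases and return the output waveform as a list of floats."""
    return low_level(high_level(text))


waveform = synthesize("na ma ste")
print(len(waveform), "samples")
```

The sketch joins units end to end without any smoothing at the boundaries; practical concatenative systems usually need additional processing at the joins to avoid audible discontinuities.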