Analysis of Pitch and Duration in Speech Synthesis using PSOLA

Kavita Waghmare; Sangramsing Kayte; Bharti Gawali

Research Article

Communications on Applied Electronics

Foundation of Computer Science (FCS), NY, USA

Volume 4 - Number 4

Year of Publication: 2016

Authors: Kavita Waghmare, Sangramsing Kayte, Bharti Gawali

10.5120/cae2016652061

Kavita Waghmare, Sangramsing Kayte, Bharti Gawali. Analysis of Pitch and Duration in Speech Synthesis using PSOLA. Communications on Applied Electronics. 4, 4 (February 2016), 10-18. DOI=10.5120/cae2016652061

@article{ 10.5120/cae2016652061,

author = { Kavita Waghmare and Sangramsing Kayte and Bharti Gawali },

title = { Analysis of Pitch and Duration in Speech Synthesis using PSOLA },

journal = { Communications on Applied Electronics },

issue_date = { February 2016 },

volume = { 4 },

number = { 4 },

month = { February },

year = { 2016 },

issn = { 2394-4714 },

pages = { 10-18 },

numpages = {9},

url = { https://www.caeaccess.org/archives/volume4/number4/534-2016652061/ },

doi = { 10.5120/cae2016652061 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2023-09-04T19:54:06.443472+05:30

%A Kavita Waghmare

%A Sangramsing Kayte

%A Bharti Gawali

%T Analysis of Pitch and Duration in Speech Synthesis using PSOLA

%J Communications on Applied Electronics

%@ 2394-4714

%V 4

%N 4

%P 10-18

%D 2016

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech synthesis is the artificial production of speech by means of a speech synthesizer, and it can be achieved using various techniques. During synthesis, smoothing the signal at concatenation points is an important aspect to study. This paper investigates the effect of the pitch-marking process in the Time Domain Pitch Synchronous Overlap and Add (TD-PSOLA) method. The database consists of 60 sentences, in male and female voices, containing various phones, syllables, and phrases that provide prosodic effects. The analysis shows that the pitch-marking process affects the quality of the synthesized speech and smooths the signal at concatenation points.
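
As a rough illustration of the TD-PSOLA idea named in the abstract, the following is a minimal Python sketch and not the authors' implementation. It assumes the pitch marks are already available as sample indices, and the names `td_psola`, `pitch_factor`, and `duration_factor` are hypothetical choices made for this example. Each grain is a Hann-windowed segment of about two pitch periods centred on a pitch mark; grains are overlap-added at new synthesis marks whose spacing sets the pitch and whose mapping back to the original time axis sets the duration.

```python
import math


def hann(n):
    """Symmetric Hann window with n samples."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def td_psola(signal, marks, pitch_factor=1.0, duration_factor=1.0):
    """Overlap-add pitch-synchronous grains (illustrative sketch only).

    signal          -- list of float samples
    marks           -- strictly increasing pitch marks (sample indices), assumed given
    pitch_factor    -- > 1 places grains closer together (higher pitch)
    duration_factor -- > 1 produces a longer output
    """
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    if pitch_factor <= 0 or duration_factor <= 0:
        raise ValueError("factors must be positive")

    out_len = int(round(len(signal) * duration_factor))
    out = [0.0] * out_len
    norm = [0.0] * out_len  # running sum of window weights per output sample

    t_syn = marks[0] * duration_factor   # current synthesis mark
    t_end = marks[-1] * duration_factor  # stop after the last analysis mark
    while t_syn <= t_end:
        # Map the synthesis mark back to the analysis time axis and
        # pick the nearest analysis pitch mark.
        t_ana = t_syn / duration_factor
        k = min(range(len(marks)), key=lambda i: abs(marks[i] - t_ana))

        # Local pitch period around the chosen mark.
        if k + 1 < len(marks):
            period = marks[k + 1] - marks[k]
        else:
            period = marks[k] - marks[k - 1]
        if period <= 0:
            raise ValueError("pitch marks must be strictly increasing")

        # Copy a two-period Hann-windowed grain to the synthesis position.
        window = hann(2 * period + 1)
        center = int(round(t_syn))
        for j in range(-period, period + 1):
            src = marks[k] + j
            dst = center + j
            if 0 <= src < len(signal) and 0 <= dst < out_len:
                w = window[j + period]
                out[dst] += w * signal[src]
                norm[dst] += w

        # Spacing of synthesis marks controls the output pitch.
        t_syn += period / pitch_factor

    # Divide by the summed window weights to keep the amplitude stable.
    return [o / n if n > 1e-8 else 0.0 for o, n in zip(out, norm)]


# Toy input: a decaying pulse repeated every 100 samples, marks at each pulse.
period = 100
pulse_train = [math.exp(-(n % period) / 10.0) for n in range(1000)]
pitch_marks = list(range(100, 1000, 100))
raised = td_psola(pulse_train, pitch_marks, pitch_factor=2.0)
stretched = td_psola(pulse_train, pitch_marks, duration_factor=2.0)
```

In this sketch the quality of the result depends directly on where the pitch marks fall: marks that drift away from the true period boundaries make neighbouring grains misalign when they are overlap-added, which is one way to picture the sensitivity to pitch marking that the abstract reports.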

Index Terms

Computer Science

Information Sciences

Keywords

Text-to-speech (TTS), pitch, duration, PSOLA, pitch marking.