Analysis of Pitch and Duration in Speech Synthesis using PSOLA

Kavita Waghmare, Sangramsing Kayte, Bharti Gawali. Published in Signal Processing.

Communications on Applied Electronics
Year of Publication: 2016
Publisher: Foundation of Computer Science (FCS), NY, USA
Authors: Kavita Waghmare, Sangramsing Kayte, Bharti Gawali
DOI: 10.5120/cae2016652061

Kavita Waghmare, Sangramsing Kayte and Bharti Gawali. Article: Analysis of Pitch and Duration in Speech Synthesis using PSOLA. Communications on Applied Electronics 4(4):10-18, February 2016. Published by Foundation of Computer Science (FCS), NY, USA.

BibTeX

@article{key:article,
	author = {Kavita Waghmare and Sangramsing Kayte and Bharti Gawali},
	title = {Article: Analysis of Pitch and Duration in Speech Synthesis using PSOLA},
	journal = {Communications on Applied Electronics},
	year = {2016},
	volume = {4},
	number = {4},
	pages = {10-18},
	month = {February},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}
}

Abstract

Speech synthesis is the artificial production of speech by means of speech synthesizers, and it can be achieved using various techniques. During synthesis, smoothing at the concatenation points is an important aspect to study. This paper examines the effect of the pitch-marking process using the Time Domain Pitch Synchronous Overlap and Add (TD-PSOLA) method. The database consists of 60 sentences containing various phones, syllables, and phrases that provide prosodic effects, in male and female voices. The analysis shows that the pitch-marking process affects the quality of the synthesized speech and smooths it at the concatenation points.
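
For readers unfamiliar with TD-PSOLA, the idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this page does not describe. It assumes the pitch marks are already known and evenly spaced (a constant pitch period), and the names `hann`, `td_psola`, and `pitch_factor` are hypothetical. Each two-period, Hann-windowed grain centred on a pitch mark is overlap-added at re-spaced synthesis instants, which changes the pitch while the tapered windows keep the joins smooth.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def td_psola(signal, marks, pitch_factor):
    """Overlap-add pitch-synchronous grains at re-spaced instants.

    Assumption: `marks` are evenly spaced pitch marks (constant period).
    pitch_factor > 1 raises the pitch, pitch_factor < 1 lowers it.
    """
    period = marks[1] - marks[0]
    new_period = max(1, round(period / pitch_factor))
    window = hann(2 * period + 1)        # two-period grain centred on a mark
    out = [0.0] * len(signal)
    norm = [0.0] * len(signal)
    t = marks[0]
    while t <= marks[-1]:
        # use the analysis mark closest to this synthesis instant
        m = min(marks, key=lambda a: abs(a - t))
        for k in range(-period, period + 1):
            src, dst = m + k, t + k
            if 0 <= src < len(signal) and 0 <= dst < len(signal):
                w = window[k + period]
                out[dst] += w * signal[src]
                norm[dst] += w
        t += new_period
    # divide by the summed window weight so overlap does not change amplitude
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]

# Example: a sine with a 20-sample period and one pitch mark per period.
sig = [math.sin(2 * math.pi * i / 20) for i in range(200)]
marks = list(range(20, 181, 20))
shifted = td_psola(sig, marks, 2.0)      # grains now repeat every 10 samples
```

With `pitch_factor` set to 1.0 the grains land back on their original marks and the signal between the first and last mark is reproduced unchanged, which is a convenient sanity check. Real speech needs time-varying periods and a pitch-marking step, both of which this sketch leaves out.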

Keywords

Text-to-speech (TTS), pitch, duration, PSOLA, pitch-marking.