Two Stage Approaches for the Detection and Suppression of Typed Keystrokes in Speech Signals

Rizwan Ullah, Renjie Tong, Yawar Ali Sheikh, Zhongfu Ye. Published in Signal Processing.

Communications on Applied Electronics
Year of Publication: 2016
Publisher: Foundation of Computer Science (FCS), NY, USA
Authors: Rizwan Ullah, Renjie Tong, Yawar Ali Sheikh, Zhongfu Ye
DOI: 10.5120/cae2016652428

Rizwan Ullah, Renjie Tong, Yawar Ali Sheikh and Zhongfu Ye. Two Stage Approaches for the Detection and Suppression of Typed Keystrokes in Speech Signals. Communications on Applied Electronics 6(2):11-15, November 2016. BibTeX

@article{10.5120/cae2016652428,
	author = {Rizwan Ullah and Renjie Tong and Yawar Ali Sheikh and Zhongfu Ye},
	title = {Two Stage Approaches for the Detection and Suppression of Typed Keystrokes in Speech Signals},
	journal = {Communications on Applied Electronics},
	issue_date = {November 2016},
	volume = {6},
	number = {2},
	month = {Nov},
	year = {2016},
	issn = {2394-4714},
	pages = {11-15},
	numpages = {5},
	url = {http://www.caeaccess.org/archives/volume6/number2/674-2016652428},
	doi = {10.5120/cae2016652428},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

In recent decades, keystroke suppression has received particular attention because laptops and computers are increasingly used to capture audio in communication scenarios such as meetings and audio/video instant messaging. In many of these situations, the captured speech is corrupted by additive keystroke transient noise. Because keystroke transients are non-stationary, short in duration, and abrupt, suppressing them has remained a challenging task for many years. In this paper, two new two-stage approaches for the suppression of keystrokes are proposed. In the first stage, which is common to both methods, the speech is estimated using supervised sparse non-negative matrix factorization. In the second stage, keystrokes are detected and then suppressed by replacing the corrupted speech frames with the corresponding estimated speech frames obtained in the first stage. The two new techniques used in this second stage are the core contribution of this work. Experimental results show that the proposed approaches exhibit good performance without significantly degrading the quality of speech.
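The second-stage idea of replacing corrupted frames with first-stage estimates can be illustrated with a minimal sketch. The abstract does not specify the two detection techniques, so the energy-ratio rule below is only a stand-in assumption, and the names `frame_energy`, `replace_keystroke_frames`, and `ratio_threshold` are hypothetical. The first-stage speech estimate is taken as given rather than computed.

```python
from typing import List, Sequence, Tuple

Frame = List[float]


def frame_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def replace_keystroke_frames(
    noisy_frames: Sequence[Sequence[float]],
    estimated_frames: Sequence[Sequence[float]],
    ratio_threshold: float = 4.0,  # assumed value, not taken from the paper
) -> Tuple[List[Frame], List[bool]]:
    """Replace frames flagged as keystroke-corrupted with estimated speech.

    A frame is flagged when its energy exceeds `ratio_threshold` times the
    energy of the corresponding first-stage estimate. This detection rule
    is an illustrative assumption, not the paper's detector.
    """
    if len(noisy_frames) != len(estimated_frames):
        raise ValueError("frame sequences must have the same length")

    eps = 1e-12  # avoids division by zero for silent estimated frames
    output: List[Frame] = []
    flags: List[bool] = []
    for noisy, estimate in zip(noisy_frames, estimated_frames):
        ratio = frame_energy(noisy) / (frame_energy(estimate) + eps)
        corrupted = ratio > ratio_threshold
        flags.append(corrupted)
        # Keep the observed frame unless it looks corrupted.
        output.append(list(estimate) if corrupted else list(noisy))
    return output, flags


# Toy usage: the middle frame carries a large transient.
noisy = [[0.1, 0.1], [1.0, -1.0], [0.2, 0.1]]
estimate = [[0.1, 0.1], [0.1, 0.1], [0.2, 0.1]]
cleaned, flags = replace_keystroke_frames(noisy, estimate)
```

Only the flagged frames are overwritten, so uncorrupted speech passes through unchanged, which is in line with the stated goal of not significantly degrading speech quality.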

Keywords

Single-channel speech enhancement, short-time Fourier transform, supervised sparse non-negative matrix factorization, correlation, keystroke suppression, thresholding technique.