Investigation of speech disfluencies classification on different threshold selection techniques using energy feature extraction / Raseeda Hamzah and Nursuriati Jamil

Filled pause and elongation are two types of speech disfluency that need more suitable acoustic features to be classified correctly, since they are frequently misclassified. This work concentrates on developing an accurate and robust energy feature extraction for modelling filled pause and...

Bibliographic Details
Main Authors: Hamzah, Raseeda (Author), Jamil, Nursuriati (Author)
Format: Book
Published: Penerbit UiTM, 2019-06.

MARC

LEADER 00000 am a22000003u 4500
001 repouitm_43819
042 |a dc 
100 1 0 |a Hamzah, Raseeda  |e author 
700 1 0 |a Jamil, Nursuriati  |e author 
245 0 0 |a Investigation of speech disfluencies classification on different threshold selection techniques using energy feature extraction / Raseeda Hamzah and Nursuriati Jamil 
260 |b Penerbit UiTM,   |c 2019-06. 
500 |a https://ir.uitm.edu.my/id/eprint/43819/1/43819.pdf 
520 |a Filled pause and elongation are two types of speech disfluency that need more suitable acoustic features to be classified correctly, since they are frequently misclassified. This work concentrates on developing an accurate and robust energy feature extraction for modelling filled pause and elongation by investigating different energy features that use local maxima points of the speech energy. Method: In this paper, we extracted peak values from each frame of a voiced signal by implementing different thresholding techniques to classify filled pause and elongation. These energy features were evaluated with a statistical naïve Bayes classifier to assess their contribution to classification. Samples of sustained syllables and filled pauses in spontaneous speech were extracted from the Malaysian Parliamentary Debate Database of the year 2008. We performed an F-measure evaluation to investigate the significant differences in the means of filled pause and elongation samples. Results: The proposed LM-E improved classification, reaching up to 71% and 75% F-measure for elongation and filled pause, respectively. Conclusion: The best accuracies achieved in both filled pause and elongation classification varied with the thresholding technique applied when extracting the local maxima of the speech energy. The thresholding technique that contributed most is our proposed one, which uses an adaptive height as the threshold for extracting the local maxima of the speech energy (LM-E). 
546 |a en 
690 |a Extraction 
655 7 |a Article  |2 local 
655 7 |a PeerReviewed  |2 local 
787 0 |n https://ir.uitm.edu.my/id/eprint/43819/ 
787 0 |n https://mjoc.uitm.edu.my 
856 4 1 |u https://ir.uitm.edu.my/id/eprint/43819/  |z Link Metadata
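
Illustrative note: the abstract describes extracting local maxima of the speech energy above a threshold and scoring classification with the F-measure. The sketch below shows one way such a pipeline could look. It is not the authors' implementation. The function names, the sum-of-squares frame energy, and the "scaled mean" rule used as the adaptive height are all assumptions made for illustration; the record does not define how the adaptive height is computed, and the naïve Bayes step is omitted.

```python
def frame_energies(signal, frame_len, hop):
    """Short-time energy: sum of squared samples in each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def adaptive_height(values, scale=1.0):
    """Hypothetical adaptive threshold: scale times the mean of the values."""
    return scale * sum(values) / len(values)


def local_maxima(values, height):
    """Indices strictly greater than both neighbours and at least `height`."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1] and values[i] >= height
    ]


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1), 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy energy contour (made-up numbers, not data from the paper).
energies = [1, 3, 2, 8, 4, 5, 1]
threshold = adaptive_height(energies)      # mean of the contour
peaks = local_maxima(energies, threshold)  # peaks that clear the threshold
print(peaks)
```

In this toy contour the small peak at index 1 falls below the mean-based threshold and is discarded, which is the kind of filtering a height threshold is meant to provide. The peak values (or statistics derived from them) would then serve as features for a classifier.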