Speaker Recognition: Progression and challenges

Speaker recognition is one of the most widely studied topics in speech technology. Much research has been conducted, yet little progress has been made in the past five to six years; with the advancement of deep learning techniques in most areas of machine learning, deep learning has...

Bibliographic Details
Main Authors: Yusra Al-Irahyim (Author), Qasim Mahmood (Author)
Format: Book
Published: College of Education for Pure Sciences, 2021-09-01.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_6b87a0f3fda24e74a65d0323028a4cb3
042 |a dc 
100 1 0 |a Yusra Al-Irahyim  |e author 
700 1 0 |a Qasim Mahmood  |e author 
245 0 0 |a Speaker Recognition: Progression and challenges 
260 |b College of Education for Pure Sciences,   |c 2021-09-01T00:00:00Z. 
500 |a 1812-125X 
500 |a 2664-2530 
500 |a 10.33899/edusj.2021.129802.1150 
520 |a Speaker recognition is one of the most widely studied topics in speech technology. Much research has been conducted, yet little progress has been made in the past five to six years; with the advancement of deep learning techniques in most areas of machine learning, deep learning has replaced earlier methods in speaker recognition and verification. Deep learning is now the most advanced approach to verifying and identifying a speaker's identity. The algorithms used are x-vectors and i-vectors, which are considered the baseline in modern work. The aim of this study is to review deep learning methods applied to speaker identification and verification tasks, from older solutions (Gaussian mixture model, Gaussian mixture supervector model, and i-vector model) to new solutions using deep neural networks (deep belief network, deep corrective learning network), as well as the scoring metrics used to verify the speaker (cosine distance, probabilistic linear discriminant analysis) and the databases used for neural network training (TIMIT, VCTK, VoxCeleb2, LibriSpeech). 
546 |a AR 
546 |a EN 
690 |a speaker recognition 
690 |a speaker verification 
690 |a speaker selection 
690 |a deep learning 
690 |a Education 
690 |a L 
690 |a Science (General) 
690 |a Q1-390 
655 7 |a article  |2 local 
786 0 |n مجلة التربية والعلم, Vol 30, Iss 4, Pp 59-68 (2021) 
787 0 |n https://edusj.mosuljournals.com/article_168076_d41d8cd98f00b204e9800998ecf8427e.pdf 
787 0 |n https://doaj.org/toc/1812-125X 
787 0 |n https://doaj.org/toc/2664-2530 
856 4 1 |u https://doaj.org/article/6b87a0f3fda24e74a65d0323028a4cb3  |z Connect to this object online.