Author
Listed:
- ABDULKHALEQ Q. A. HASSAN
(Department of English, College of Science and Arts at Mahayil, King Khalid University, Abha, Saudi Arabia)
- MESHARI H. ALANAZI
(Department of Computer Science, College of Sciences, Northern Border University, Arar, Saudi Arabia)
- REEMA G AL-ANAZI
(Department of Arabic Language and Literature, College of Humanities and Social Sciences, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh 11671, Saudi Arabia)
- MUHAMMAD SWAILEH A. ALZAIDI
(Department of English Language, College of Language Sciences, King Saud University, P. O. Box 145111, Riyadh, Saudi Arabia)
- NOUF J. ALJOHANI
(Department of Language and Translation, University of Jeddah, Jeddah, Saudi Arabia)
- KHADIJA ABDULLAH ALZAHRANI
(Saudi Arabia Ministry of Education, Riyadh, Saudi Arabia)
- UMKALTHOOM ALZUBAIDI
(Department of Social Work, Al Nairyah University College, University of Hafr Albatin, Hafar Al Batin, Saudi Arabia)
- ANWER MUSTAFA HILAL
(Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia)
Abstract
Text-to-Speech (TTS), or speech synthesis, is the artificial generation of human-like speech from written text, and it is becoming increasingly popular in speech processing across various complex systems. A classical TTS system translates language text into a waveform. Several English TTS systems are mature and produce natural, human-like speech. Other languages, such as Arabic, have only recently been considered. Present Arabic speech synthesis solutions are slow and of low quality, and the naturalness of their synthesized speech is lower than that of English synthesizers. They also lack crucial primary speech factors, including rhythm, intonation, and stress. Several studies have tried to resolve these problems using concatenative techniques such as parametric or unit selection methods. This paper proposes an Applied Linguistics with Artificial Intelligence-Enabled Arabic Text-to-Speech Synthesizer (ALAI-ATTS) model. The ALAI-ATTS technique includes three essential components: data preprocessing through diacritization and phonetization, Extreme Learning Machine (ELM)-based speech synthesis, and Grey Wolf Fractals Optimization (GWO)-based parameter tuning. First, the data preprocessing step performs diacritization, where diacritics are restored to unvoweled text to ensure correct pronunciation, followed by phonetization, which translates the text into its phonetic representation. Then, the ELM-based speech synthesis model uses the processed dataset for speech generation. ELMs, well known for their excellent generalization performance and fast learning speed, are especially suitable for real-time TTS applications, balancing high-quality speech output and computational efficiency. Lastly, the GWO methodology is employed to tune the parameters of the ELM.
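As a rough illustration of the parameter-tuning step, the following is a minimal sketch of a basic Grey Wolf Optimizer in plain Python. It is not the paper's implementation, and it may differ from the GWO variant the authors use. The objective `toy_error` is a hypothetical stand-in; in a setting like the one described, it would presumably be replaced by the validation error of an ELM as a function of its tunable parameters. The function name, population size, iteration count, and bounds are all assumptions chosen for the example.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iters=200, seed=0):
    """Minimise `objective` inside box `bounds` with a basic Grey Wolf Optimizer.

    `bounds` is a list of (low, high) pairs, one per dimension.
    Requires n_wolves >= 3 because three leaders guide the pack.
    """
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for it in range(n_iters):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # copies, so updates below do not alter them
        score = objective(leaders[0])
        if score < best_score:
            best_pos, best_score = list(leaders[0]), score

        a = 2.0 * (1 - it / n_iters)  # decreases linearly from 2 towards 0
        for w in wolves:
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a   # step coefficient in [-a, a]
                    C = 2 * rng.random()           # random weight on the leader
                    dist = abs(C * leader[d] - w[d])
                    pulls.append(leader[d] - A * dist)
                # Move to the average of the three pulls, clipped to the box.
                w[d] = min(hi, max(lo, sum(pulls) / 3))

    # Account for the positions produced by the final update.
    final = min(wolves, key=objective)
    if objective(final) < best_score:
        best_pos, best_score = list(final), objective(final)
    return best_pos, best_score


def toy_error(params):
    """Hypothetical stand-in for a model's validation error (minimum at (3, -1))."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


if __name__ == "__main__":
    position, error = grey_wolf_optimize(toy_error, [(-10.0, 10.0), (-10.0, 10.0)])
    print("best parameters:", position, "error:", error)
```

The optimizer only needs a function that maps a parameter vector to a score, so the same loop could wrap any model whose parameters are tuned by trial evaluation.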
The simulation outcomes validate that the ALAI-ATTS technique considerably enhances the intelligibility and naturalness of Arabic synthesized speech compared to existing approaches. In the experiments, the ALAI-ATTS technique reported lower error values of 3.48, 0.15 and 1.37, 0.25 under WER and DER.
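The abstract does not define WER and DER. Assuming they denote word error rate and diacritic error rate, as is common in the Arabic diacritization literature, the sketch below shows one conventional way to compute each. The function names are hypothetical, and the DER variant here (the share of base characters whose attached diacritics differ from the reference) is only one of several definitions in use; the paper's exact evaluation protocol is not known from this listing.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Arabic diacritic marks U+064B..U+0652 (tanween, short vowels, shadda, sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Return (base_char, marks) pairs; marks are sorted so their order is irrelevant."""
    units = []
    for ch in text:
        if ch in DIACRITICS:
            if units:  # a mark with no preceding base character is ignored
                base, marks = units[-1]
                units[-1] = (base, "".join(sorted(marks + ch)))
        else:
            units.append((ch, ""))
    return units


def diacritic_error_rate(reference, predicted):
    """Share of non-space base characters whose diacritics differ from the reference."""
    ref = [u for u in split_diacritics(reference) if not u[0].isspace()]
    hyp = [u for u in split_diacritics(predicted) if not u[0].isspace()]
    if not ref:
        raise ValueError("reference must contain at least one character")
    if [b for b, _ in ref] != [b for b, _ in hyp]:
        raise ValueError("texts must share the same base characters")
    errors = sum(rm != hm for (_, rm), (_, hm) in zip(ref, hyp))
    return errors / len(ref)


if __name__ == "__main__":
    # kaf+fatha, ta+fatha, ba+fatha versus the same word with kasra on the ta.
    gold = "\u0643\u064E\u062A\u064E\u0628\u064E"
    guess = "\u0643\u064E\u062A\u0650\u0628\u064E"
    print("WER:", word_error_rate("a b c d", "a x c"))
    print("DER:", diacritic_error_rate(gold, guess))
```

Both functions return a fraction; whether the values quoted in the abstract are fractions or percentages is not stated there.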
Suggested Citation
Abdulkhaleq Q. A. Hassan & Meshari H. Alanazi & Reema G Al-Anazi & Muhammad Swaileh A. Alzaidi & Nouf J. Aljohani & Khadija Abdullah Alzahrani & Umkalthoom Alzubaidi & Anwer Mustafa Hilal, 2024.
"Integrating Applied Linguistics With Artificial Intelligence-Enabled Arabic Text-To-Speech Synthesizer,"
FRACTALS (fractals), World Scientific Publishing Co. Pte. Ltd., vol. 32(09n10), pages 1-13.
Handle:
RePEc:wsi:fracta:v:32:y:2024:i:09n10:n:s0218348x2540050x
DOI: 10.1142/S0218348X2540050X