Semi-development and Performance Evaluation of a Pronunciation Judgment System Using Free Machine Learning Services
DOI:
https://doi.org/10.61453/joit.v2025no25
Keywords:
Language Learning, Japanese Language, Pronunciation, Classification, Google Teachable Machine
Abstract
With the recent growth of the global community, communication skills in languages other than one's native tongue have become essential. Attending school is an effective way to learn a language, but time and financial constraints often make it difficult. If learners could acquire pronunciation through self-study, the learning process could be shortened significantly. Furthermore, being able to build such a practice system without specialized IT knowledge, such as programming, would be a great benefit in educational settings. This paper describes a pronunciation assessment system for language learning that uses a free machine learning service. The target language is Japanese. By training a model on speech data of homonyms that are difficult for non-native speakers to distinguish, we build a system that can assess the accuracy of pronounced words. For machine learning, we use Google Teachable Machine, a free service that allows a system to be built without specialized IT knowledge. Experiments show that the resulting system can assess the accuracy of native speaker pronunciation with a very high probability.
Copyright (c) 2025 Journal of Innovation and Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.