zikele

Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment

2409.07151v2

Title#

Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment

Abstract#

Second language (L2) learners can improve their pronunciation by imitating golden speech, especially when that speech aligns with their own speech characteristics. This study explores the hypothesis that learner-specific golden speech generated with zero-shot text-to-speech (ZS-TTS) techniques can be harnessed as an effective metric for measuring the pronunciation proficiency of L2 learners. Building on this exploration, the contributions of this study are at least two-fold: 1) the design and development of a systematic framework for assessing the ability of a synthesis model to generate golden speech, and 2) in-depth investigations of the effectiveness of using golden speech in automatic pronunciation assessment (APA). Comprehensive experiments conducted on the L2-ARCTIC and Speechocean762 benchmark datasets suggest that the proposed modeling approach can yield significant performance improvements on various assessment metrics relative to some prior methods. To the authors' knowledge, this study is the first to explore the role of golden speech in both ZS-TTS and APA, offering a promising regime for computer-assisted pronunciation training (CAPT).

Get the PDF#

View the Chinese PDF - 2409.07151v2
