zikele

Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment

2409.07151v2

Title#

Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment

Abstract#

Second language (L2) learners can improve their pronunciation by imitating golden speech, especially when that speech aligns with their own speech characteristics. This study explores the hypothesis that learner-specific golden speech generated with zero-shot text-to-speech (ZS-TTS) techniques can be used as an effective metric for measuring the pronunciation proficiency of L2 learners. Building on this exploration, the contributions of this study are at least two-fold: 1) the design and development of a systematic framework for assessing the ability of a synthesis model to generate golden speech, and 2) an in-depth investigation of the effectiveness of using golden speech in automatic pronunciation assessment (APA). Comprehensive experiments conducted on the L2-ARCTIC and Speechocean762 benchmark datasets suggest that the proposed modeling approach can yield significant performance improvements over some prior methods across various assessment metrics. To our knowledge, this study is the first to explore the role of golden speech in both ZS-TTS and APA, offering a promising approach for computer-assisted pronunciation training (CAPT).

Get the PDF#

View the Chinese PDF - 2409.07151v2
