HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks

2503.17141v2

Title

HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks

Abstract

Speech enhancement techniques have become core technologies in mobile devices and voice software. Still, modern deep learning solutions often require a large amount of computational resources, which makes their use on low-resource devices challenging. We present HiFi-Stream, an optimized version of the recently published HiFi++ model. Our experiments demonstrate that HiFi-Stream retains most of the quality of the original model while reducing its size and computational complexity relative to the original HiFi++, making it one of the smallest and fastest models available. The model is evaluated in a streaming setting, where it demonstrates superior performance compared to modern baselines.
