Title#
Text2Stereo: Repurposing Stable Diffusion for Stereo Generation with Consistency Rewards
Abstract#
In this paper, we propose a novel diffusion-based approach to generate stereo images given a text prompt. Since stereo image datasets with large baselines are scarce, training a diffusion model from scratch is not feasible. Therefore, we propose leveraging the strong priors learned by Stable Diffusion and fine-tuning it on stereo image datasets to adapt it to the task of stereo generation. To improve stereo consistency and text-to-image alignment, we further tune the model using prompt alignment and our proposed stereo consistency reward functions. Comprehensive experiments demonstrate the superiority of our approach in generating high-quality stereo images across diverse scenarios, outperforming existing methods.
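The abstract does not spell out how the stereo consistency reward is computed. As a purely illustrative sketch (the function name, the disparity input, and the photometric scoring below are assumptions, not the paper's actual reward), one common way to score left-right consistency is to warp one view toward the other with a disparity map and penalize the photometric residual:

```python
# Hypothetical stereo-consistency reward: warp the right view toward the left
# view using a disparity map and score photometric agreement. The paper does
# not disclose its exact reward; all names and logic here are assumptions.
import torch
import torch.nn.functional as F


def stereo_consistency_reward(left, right, disparity):
    """left, right: (B, 3, H, W) images in [0, 1]; disparity: (B, 1, H, W) in pixels.

    Returns one scalar reward per batch element; higher means the two views
    agree better once the horizontal shift is accounted for.
    """
    b, _, h, w = left.shape

    # Base sampling grid in the normalized [-1, 1] coordinates that grid_sample expects.
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=left.device),
        torch.linspace(-1.0, 1.0, w, device=left.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1).clone()

    # Shift x-coordinates by the disparity (converted from pixels to normalized units).
    grid[..., 0] = grid[..., 0] - 2.0 * disparity.squeeze(1) / max(w - 1, 1)

    # Sample the right image at the shifted locations to synthesize a "left" view.
    right_warped = F.grid_sample(
        right, grid, mode="bilinear", padding_mode="border", align_corners=True
    )

    # Photometric error between the real left view and the warped right view.
    photometric_error = (left - right_warped).abs().mean(dim=(1, 2, 3))

    # Negative error as the reward: perfectly consistent pairs score 0.
    return -photometric_error
```

In a reward fine-tuning loop, such a score could be combined with a prompt-alignment term (e.g., a CLIP-style similarity between the prompt and each generated view) to form the total reward, though the weighting and the source of the disparity map are design choices not specified in the abstract.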