zikele

Beyond ReLU: Chebyshev-DQN for Enhanced Deep Q-Networks

2508.14536v1

Title#

Beyond ReLU: Chebyshev-DQN for Enhanced Deep Q-Networks

Abstract#

The performance of Deep Q-Networks (DQN) is critically dependent on the ability of its underlying neural network to accurately approximate the action-value function. Standard function approximators, such as multi-layer perceptrons, may struggle to efficiently represent the complex value landscapes inherent in many reinforcement learning problems. This paper introduces a novel architecture, the Chebyshev-DQN (Ch-DQN), which integrates a Chebyshev polynomial basis into the DQN framework to create a more effective feature representation. By leveraging the powerful function approximation properties of Chebyshev polynomials, we hypothesize that the Ch-DQN can learn more efficiently and achieve higher performance. We evaluate our proposed model on the CartPole-v1 benchmark and compare it against a standard DQN with a comparable number of parameters. Our results demonstrate that the Ch-DQN with a moderate polynomial degree (N=4) achieves significantly better asymptotic performance, outperforming the baseline by approximately 39%. However, we also find that the choice of polynomial degree is a critical hyperparameter, as a high degree (N=8) can be detrimental to learning. This work validates the potential of using orthogonal polynomial bases in deep reinforcement learning while also highlighting the trade-offs involved in model complexity.
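The abstract does not spell out how the Chebyshev basis is wired into the network, so the following is only a minimal sketch of one plausible feature expansion, not the authors' implementation. It uses the standard recurrence T_0(x) = 1, T_1(x) = x, T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x). The function name `chebyshev_features`, the `tanh` squashing of the raw state into [-1, 1], and the choice to omit the constant term T_0 are all assumptions made for illustration.

```python
import math


def chebyshev_features(state, degree):
    """Expand each state component into Chebyshev terms T_1 .. T_degree.

    Assumption (not taken from the paper): raw components are squashed
    into [-1, 1] with tanh, because Chebyshev polynomials of the first
    kind are defined on that interval. T_0 = 1 is omitted on the
    assumption that a bias term downstream covers the constant.
    """
    features = []
    for value in state:
        x = math.tanh(value)        # hypothetical normalization step
        t_prev, t_curr = 1.0, x     # T_0(x) = 1, T_1(x) = x
        for _ in range(degree):
            features.append(t_curr)
            # Recurrence: T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)
            t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return features


# Example: a 4-dimensional observation (the size CartPole-v1 uses)
# expanded with degree N=4 gives 4 * 4 = 16 features.
example_state = [0.0, 0.5, -1.2, 2.0]
example_features = chebyshev_features(example_state, 4)
```

Under these assumptions, the resulting feature vector could be fed to an ordinary Q-value head in place of the raw observation. Raising `degree` grows the feature count linearly, which is one way to picture the degree trade-off the abstract reports between N=4 and N=8.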

Article Page#

Beyond ReLU: Chebyshev-DQN for Enhanced Deep Q-Networks

PDF Access#

View Chinese PDF - 2508.14536v1
