
Who Sees What? Structured Thought-Action Sequences for Epistemic Reasoning in LLMs

2508.14564v1

Title#

Who Sees What? Structured Thought-Action Sequences for Epistemic Reasoning in LLMs

Abstract#

Recent advances in large language models (LLMs) and reasoning frameworks have opened new possibilities for improving the perspective-taking capabilities of autonomous agents. However, tasks that involve active perception, collaborative reasoning, and perspective taking (understanding what another agent can see or knows) pose persistent challenges for current LLM-based systems. This study investigates the potential of structured examples derived from transformed solution graphs generated by the Fast Downward planner to improve the performance of LLM-based agents within a ReAct framework. We propose a structured solution-processing pipeline that generates three distinct categories of examples: optimal goal paths (G-type), informative node paths (E-type), and step-by-step optimal decision sequences contrasting alternative actions (L-type). These solutions are further converted into "thought-action" examples by prompting an LLM to explicitly articulate the reasoning behind each decision. While L-type examples slightly reduce clarification requests and overall action steps, they do not yield consistent improvements. Agents are successful in tasks requiring basic attentional filtering but struggle in scenarios that require mentalising about occluded spaces or weighing the costs of epistemic actions. These findings suggest that structured examples alone are insufficient for robust perspective-taking, underscoring the need for explicit belief tracking, cost modelling, and richer environments to enable socially grounded collaboration in LLM-based agents.
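
As a rough illustration of the pipeline the abstract describes, the sketch below shows one way a planner solution path could be turned into "thought-action" exemplars by prompting an LLM. This is not the authors' implementation: the `PlanStep` structure, the `to_thought_action_examples` function, the prompt wording, and the example scenario are all assumptions made for illustration, with the L-type contrast of alternative actions shown as an optional addition to the prompt.

```python
"""Minimal sketch (assumed, not the paper's code) of converting a planner
solution path into thought-action exemplars via an LLM prompt."""

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PlanStep:
    state: str                                # textual description of the current state
    action: str                               # action the planner chose in this state
    alternatives: list[str] = field(default_factory=list)  # other applicable actions (L-type contrast)


def to_thought_action_examples(
    plan: list[PlanStep],
    example_type: str,                        # "G", "E", or "L" (categories named in the abstract)
    ask_llm: Callable[[str], str],            # any text-completion function
) -> list[dict]:
    """Render each plan step as a {"thought": ..., "action": ...} exemplar."""
    examples = []
    for step in plan:
        # For L-type examples, surface the rejected alternatives so the LLM can contrast them.
        contrast = (
            f" Alternative actions were: {', '.join(step.alternatives)}."
            if example_type == "L" and step.alternatives
            else ""
        )
        prompt = (
            "You are an agent in a collaborative perspective-taking task.\n"
            f"Current state: {step.state}\n"
            f"The optimal action is: {step.action}.{contrast}\n"
            "In one or two sentences, explain why this action is chosen."
        )
        examples.append({"thought": ask_llm(prompt), "action": step.action})
    return examples


if __name__ == "__main__":
    # Stubbed LLM so the sketch runs without API access; a real call would replace it.
    fake_llm = lambda prompt: (
        "Moving to viewpoint-2 lets me check what the partner can actually see before acting."
    )
    plan = [
        PlanStep(
            state="Partner is behind an occluding wall; the red block is visible only to me.",
            action="move-to viewpoint-2",
            alternatives=["ask-clarification", "wait"],
        )
    ]
    for ex in to_thought_action_examples(plan, "L", fake_llm):
        print(ex)
```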


PDF Access#

View the Chinese PDF - 2508.14564v1
