Core Concepts
Large language models (LLMs) can be leveraged to enhance various aspects of the reinforcement learning (RL) paradigm, including sample efficiency, reward function design, generalization, and natural language understanding.
Abstract
This survey provides a comprehensive review of the emerging field of integrating LLMs into the RL paradigm, known as LLM-enhanced RL. It proposes a structured taxonomy to systematically categorize the functionalities of LLMs within the classical agent-environment interaction, including roles as information processors, reward designers, decision-makers, and generators.
For each role, the survey summarizes the methodologies, analyzes the specific RL challenges that are mitigated, and provides insights into future directions. As information processors, LLMs can extract meaningful feature representations or translate natural language-based information to formal specifications to reduce the burden on RL agents. As reward designers, LLMs can implicitly provide reward values or explicitly generate executable reward function codes based on their understanding of task objectives and observations. As decision-makers, LLMs can directly generate actions or indirectly provide action candidates and reference policies to guide the RL agent's decision-making process. As generators, LLMs can serve as world model simulators to synthesize accurate trajectories for model-based RL or provide policy explanations to improve interpretability.
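The reward-designer role described above can be made concrete with a minimal sketch: the LLM is prompted with a task description and returns executable reward-function code, which the RL loop then evaluates on each transition. Here `query_llm` is a hypothetical stand-in for a real LLM API call (stubbed with a fixed response), and executing model-generated code is assumed to happen in a trusted sandbox.

```python
# Sketch of the "LLM as reward designer" pattern, under the assumptions above.

def query_llm(prompt: str) -> str:
    # Stubbed response; a real system would call an LLM API here.
    return (
        "def reward_fn(state, action, next_state):\n"
        "    # Reward progress toward the goal position stored in state[-1].\n"
        "    return -abs(next_state[0] - next_state[-1])\n"
    )

def build_reward_fn(task_description: str):
    # Ask the LLM for reward code, then load it into a fresh namespace.
    code = query_llm(f"Write a Python reward function for: {task_description}")
    namespace: dict = {}
    exec(code, namespace)  # validate/sandbox model-generated code in practice
    return namespace["reward_fn"]

if __name__ == "__main__":
    reward_fn = build_reward_fn("move the agent to the goal position")
    # State vectors here are [agent_pos, goal_pos]; smaller gap -> higher reward.
    print(reward_fn([0.0, 3.0], None, [2.0, 3.0]))  # -> -1.0
```

In practice the generated code would be checked (parsed, unit-tested against sample transitions) before use, since the survey notes the LLM designs the reward from its understanding of the task objective rather than from ground truth.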
The survey also discusses the overall characteristics of LLM-enhanced RL, including its ability to process multi-modal information, facilitate multi-task learning and generalization, improve sample efficiency, handle long-horizon tasks, and generate reward signals. Finally, it analyzes the potential applications, opportunities, and challenges of this interdisciplinary field to provide a roadmap for future research.
Quotes
"With extensive pre-trained knowledge and high-level general capabilities, large language models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in aspects such as multi-task learning, sample efficiency, and task planning."
"The recent emergence of large language models (LLMs) has marked a significant milestone in the field of NLP and shown various powerful capabilities in many real-world applications such as medicine, chemical, and embodied control in robots."
"Benefiting from these capabilities, the applications of language models have been shifted from language modeling to task-solving, ranging from basic text classification and sentiment analysis to complex high-level task planning and decision-making."