M.Sc. Tim Grams, from the Machine Learning and Cognitive Software group at TU Clausthal, will give a presentation on
"Disentangling Exploration of Large Language Models by Optimal Exploitation"
as part of the VAC Colloquium.
Tim Grams has been a doctoral student in Prof. Dr. Christian Bartelt's research group since September 2023. His research interests lie in reinforcement learning, multi-turn reasoning and LLM agents. The aim of his work is to develop autonomous systems that can act intelligently in unknown environments.
Afterwards, we look forward to discussions over coffee and cookies.
The event is open to anyone interested.
Abstract:
Exploration is a crucial skill for in-context reinforcement learning in unknown environments. However, it remains unclear if large language models can effectively explore a partially hidden state space. This work isolates exploration as the sole objective, tasking an agent with gathering information that enhances future returns. Within this framework, we argue that measuring agent returns is not sufficient for a fair evaluation. Hence, we decompose missing rewards into their exploration and exploitation components based on the optimal achievable return. Experiments with various models reveal that most struggle to explore the state space, and weak exploration is insufficient. Nevertheless, we found a positive correlation between exploration performance and reasoning capabilities. Our decomposition can provide insights into differences in behaviors driven by prompt engineering, offering a valuable tool for refining performance in exploratory tasks.
The presentation will be given via Zoom: https://uni-rostock-de.zoom-x.de/j/70385377
