Aligning Reinforcement Learning Experimentalists and Theorists
Workshop at the Conference on Neural Information Processing Systems, date to be determined. San Diego, California.
Recent progress in reinforcement learning (RL) has powered breakthroughs in a variety of real-world problems (e.g., 1, 2, 3, 4, 5, 6), garnering considerable attention and investment. However, it has also exposed a significant gap between theoretical and experimental developments.
RL theory has grown significantly in the past two decades. Research has characterized the inherent difficulty of various settings and designed a wide variety of algorithms (e.g., 7, 8, 9) that achieve optimal performance. Furthermore, substantial progress has been made in understanding how to handle large state spaces using function approximation techniques, identifying key structural properties that enable efficient learning (e.g., 10, 11, 12).
Despite these theoretical guarantees, applying RL algorithms to complex problems remains challenging. Theoretical algorithms often target simplified settings, making them hard to apply to real-world complexities, and optimizing for worst-case scenarios, which include unlikely situations, can lead to algorithms that perform poorly on practical tasks. Conversely, while specialized algorithms achieve empirical success, their problem-specific design may prevent them from transferring to other settings, and the reliance on heuristics and engineering fixes (13) further widens the gap between theory and practice.
A prominent area that has seen a surge of interest in RL is generative language modeling. Pre-training these models can be viewed as a form of imitation learning (14), while post-training typically relies on RL algorithms for purposes such as instruction tuning via RL from human feedback or enhancing reasoning capabilities (15). While these successes make the practical utility of RL undeniable, the RL community finds itself at a crossroads: the algorithms employed are frequently variants of classical methods, and exploring beyond them presents a key challenge. At the same time, the success of these models prompts new questions for RL theory, suggesting that frameworks leveraging pre-trained models might offer a more effective paradigm than learning from scratch under traditional assumptions (16).
Following the success of the ICML 2024 edition, the Second Workshop on Aligning Reinforcement Learning Experimentalists and Theorists (ARLET) aims to bridge this gap and promote collaboration. By bringing together experts from both sides, we want to facilitate meaningful discussions and chart a path for future RL research. Motivated by the take-home messages from the previous edition, we seek to encourage (i) theorists to ask experimentalists for concrete problems to solve, and (ii) experimentalists to seek theoretical guidance on how to approach these problems.
Highlights
A panel to discuss the current state of RL

Idea track: send your problems

Research track: send your solutions

We are excited to host an idea track alongside the standard call for papers, which we hope will foster increased collaboration within the community. The talks will be followed by a panel discussion on the current state of RL. See the schedule for details.
Speakers
We are thrilled to have the following researchers joining us for the event.
Organizers
Contact
You can reach us by email at arlet.neurips2025@gmail.com, on X at @arlet_workshop, or on Bluesky at @arletworkshop.