Carlo Alfano
Publications
Multilingual Self-Taught Faithfulness Evaluators
The growing use of large language models (LLMs) has increased the need for automatic evaluation systems, particularly to address the …
Carlo Alfano, Aymen Al Marjani, Zeno Jonke, Amin Mantrach, Saab Mansour, Marcello Federico
Meta-Learning Objectives for Preference Optimization
Evaluating preference optimization (PO) algorithms on LLM alignment is a challenging task that presents prohibitive costs, noise, and …
Carlo Alfano, Silvia Sapora, Jakob Nicolaus Foerster, Patrick Rebeschini, Yee Whye Teh
Learning mirror maps in policy mirror descent
Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses …
Carlo Alfano, Sebastian Rene Towers, Silvia Sapora, Chris Lu, Patrick Rebeschini
A Novel Framework for Policy Mirror Descent with General Parametrization and Linear Convergence
Modern policy optimization methods in reinforcement learning, such as Trust Region Policy Optimization and Proximal Policy …
Carlo Alfano, Rui Yuan, Patrick Rebeschini
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
We analyze the convergence rate of the unregularized natural policy gradient algorithm with log-linear policy parametrizations in …
Carlo Alfano, Patrick Rebeschini
Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning
Cooperative multi-agent reinforcement learning is a decentralized paradigm in sequential decision making where agents distributed over …
Carlo Alfano, Patrick Rebeschini