Carlo Alfano

Applied Scientist

Amazon

Biography

I am an Applied Scientist at Amazon, working on training LLM-based evaluators.

I completed my PhD in the Department of Statistics at the University of Oxford, under the supervision of Patrick Rebeschini and George Deligiannidis. I was funded by EPSRC.

My research interests include reinforcement learning, LLM fine-tuning, optimization and learning theory. In particular, I focus on building and analyzing reinforcement learning algorithms using standard optimization tools, such as natural gradient descent and mirror descent.

Interests
  • Reinforcement Learning
  • LLM fine-tuning
  • Optimization
Education
  • DPhil in Statistics, 2020-2025

    University of Oxford

  • MSc in Statistical Sciences, 2019-2020

    University of Oxford

  • BSc in Statistics, Economics and Finance, 2016-2019

    Sapienza University of Rome

Publications

(2026). Multilingual Self-Taught Faithfulness Evaluators. To appear in Findings of the Association for Computational Linguistics: EACL 2026.

(2025). Meta-Learning Objectives for Preference Optimization. Advances in Neural Information Processing Systems (NeurIPS 2025).

(2025). Learning mirror maps in policy mirror descent. International Conference on Learning Representations (ICLR 2025).

(2023). A Novel Framework for Policy Mirror Descent with General Parametrization and Linear Convergence. Advances in Neural Information Processing Systems (NeurIPS 2023).

(2022). Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization. arXiv preprint: 2209.15382.

(2021). Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning. arXiv preprint: 2109.11692.

Experience

Applied Scientist
Amazon
Jun 2025 – Present, Spain
Focused on training LLM-based evaluators for LLMs with synthetic data and reinforcement learning.
 
Applied Scientist Intern
Amazon
Sep 2024 – Feb 2025, Luxembourg
Focused on building LLM-based evaluators for LLM faithfulness.
 
Teaching Assistant
University of Oxford
Oct 2020 – Jun 2023, United Kingdom

Taught Courses:

  • Algorithmic Foundations of Learning
  • Advanced Simulation Methods
 
Supervisor
UNIQ+ DeepMind internship at the University of Oxford
Jun 2022 – Sep 2022, United Kingdom

Awards

G-Research Grant for PhD students and postdocs in quantitative fields
EPSRC DTP full scholarship
Full scholarship holder
Honorable mention