Carlo Alfano

Applied Scientist

Amazon

Biography

I am an Applied Scientist at Amazon, working on training LLM-based evaluators.

I completed my PhD in the Department of Statistics at the University of Oxford, under the supervision of Patrick Rebeschini and George Deligiannidis. I was funded by EPSRC.

My research interests include reinforcement learning, LLM fine-tuning, optimization and learning theory. In particular, I focus on building and analyzing reinforcement learning algorithms using standard optimization tools, such as natural gradient descent and mirror descent.

Download my CV.

Interests
  • Reinforcement Learning
  • LLM fine-tuning
  • Optimization
Education
  • DPhil in Statistics, 2020-2025

    University of Oxford

  • MSc in Statistical Sciences, 2019-2020

    University of Oxford

  • BSc in Statistics, Economics and Finance, 2016-2019

    Sapienza University of Rome

Publications

(2025). Meta-Learning Objectives for Preference Optimization. Advances in Neural Information Processing Systems (NeurIPS 2025).

(2025). Multilingual Self-Taught Faithfulness Evaluators. arXiv preprint: 2507.20752.

(2025). Learning mirror maps in policy mirror descent. International Conference on Learning Representations (ICLR 2025).

(2023). A Novel Framework for Policy Mirror Descent with General Parametrization and Linear Convergence. Advances in Neural Information Processing Systems (NeurIPS 2023).

(2022). Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization. arXiv preprint: 2209.15382.

(2021). Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning. arXiv preprint: 2109.11692.

Experience

Applied Scientist
Amazon
Jun 2025 – Present, Spain
Focused on training LLM-based evaluators for LLMs with synthetic data and reinforcement learning.
 
Applied Scientist Intern
Amazon
Sep 2024 – Feb 2025, Luxembourg
Focused on building LLM-based evaluators for LLM faithfulness.
 
Teaching Assistant
University of Oxford
Oct 2020 – Jun 2023, United Kingdom

Taught Courses:

  • Algorithmic Foundations of Learning
  • Advanced Simulation Methods
 
Supervisor
UNIQ+ DeepMind internship at the University of Oxford
Jun 2022 – Sep 2022, United Kingdom

Awards

G-Research Grant for PhD students and postdocs in quantitative fields
EPSRC DTP full scholarship
Full scholarship holder
Honorable mention