Sample Efficient Agent

Haoyu Wang

Sample-efficient agentic RL for LLMs, world models, and more.

About

I am currently a research associate and an incoming PhD student at Nanyang Technological University (NTU), advised by Prof. Dacheng Tao, where I work on agentic RL for LLMs. Before that, I received my M.S. from Tsinghua University (SIGS), advised by Prof. Xueqian Wang, and my B.Eng. from Xi'an Jiaotong University.

My recent work focuses on agentic RL, safety reasoning, and lifelong alignment.

Research Focus

Efficient Trial and Error for LLMs

Using small sub-agents to help the LLM explore, while preserving the LLM's exploitation of its pretrained knowledge.

LLM Safety Alignment

Training and evaluation methods that make language models reliably follow safety principles under distribution shift.

Safety Reasoning

Eliciting and strengthening internal safety reasoning to improve robustness against jailbreak and adversarial prompts.

Selected Publications

SCOUT paper thumbnail

Language-based Trial and Error Falls Behind in the Era of Experience

ICML 2026

Haoyu Wang, Guozheng Ma, Shugang Cui, Yilun Kong, Haotian Luo, Li Shen, Mengya Gao, Yichao Wu, Xiaogang Wang, Dacheng Tao.

Lifelong safety alignment paper thumbnail

Lifelong Safety Alignment for Language Models

NeurIPS 2025

Haoyu Wang, Zeyu Qin, Yifei Zhao, Chao Du, Min Lin, Xueqian Wang, Tianyu Pang.

Safety reasoning paper thumbnail

Safety Reasoning with Guidelines

ICML 2025

Haoyu Wang, Zeyu Qin, Li Shen, Xueqian Wang, Dacheng Tao, Minhao Cheng.

Step-on-feet tuning paper thumbnail

Step-on-feet Tuning: Scaling Self-alignment of LLMs via Bootstrapping

ICML MHFAIA Workshop 2024

Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, et al.

LLM robustness paper thumbnail

Are Large Language Models Really Robust to Word-level Perturbations?

TMLR 2025

Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, et al.

News

  • 2026.05: Three papers accepted by ICML 2026.
  • 2025.09: One paper accepted by NeurIPS 2025.
  • 2025.05: Two papers accepted by ICML 2025.
  • 2025.02: One paper accepted by TMLR.
  • 2023.09: One paper accepted by NeurIPS 2023.

Experience and Education

  • 2026.08 - 2029.07: PhD Student, Nanyang Technological University.
  • 2025.09 - present: Research Associate, Nanyang Technological University.
  • 2024.10 - 2025.07: Associate Member, Sea AI Lab.
  • 2023.09 - 2024.10: Intern, Tencent AI Lab.
  • 2022.09 - 2025.06: M.S., Tsinghua University.
  • 2018.09 - 2022.06: B.Eng., Xi’an Jiaotong University.

Honors

  • Tsinghua Comprehensive Excellence Scholarship.
  • Tsinghua Big Data Practice Scholarship.
  • XJTU Excellent Student Scholarship.