Efficient Trial and Error for LLMs
Using small sub-agents to help the LLM explore, while preserving the exploitation of the LLM's pretrained knowledge.
AI Safety and Alignment
Sample efficient agentic RL for LLMs, world models, and more.
Google Scholar · LinkedIn · GitHub · Email
I am a research associate and an incoming PhD student at Nanyang Technological University (NTU), advised by Prof. Dacheng Tao. My current work is on Agentic RL for LLMs. Before that, I received my M.S. from Tsinghua University (SIGS), advised by Prof. Xueqian Wang, and my B.Eng. from Xi'an Jiaotong University.
My recent work focuses on safety reasoning, lifelong safety alignment, and improving model behavior with synthetic feedback and experience.
Training and evaluation methods that make language models reliably follow safety principles under distribution shift.
Eliciting and strengthening internal safety reasoning to improve robustness against jailbreak and adversarial prompts.