📝 Publications and Preprints

Language-based Trial and Error Falls Behind in the Era of Experience
Haoyu Wang, Guozheng Ma, Shugang Cui, Yilun Kong, Haotian Luo, Li Shen, Mengya Gao, Yichao Wu, Xiaogang Wang, Dacheng Tao
- An agentic framework of Sub-Scale Collaboration On Unseen Task (SCOUT)

Lifelong Safety Alignment for Language Models
Haoyu Wang, Zeyu Qin, Yifei Zhao, Chao Du, Min Lin, Xueqian Wang, Tianyu Pang
- The first Lifelong Safety Alignment framework for Large Language Models

Safety Reasoning with Guidelines
Haoyu Wang^, Zeyu Qin^, Li Shen, Xueqian Wang, Dacheng Tao, Minhao Cheng
- We provide insights into why Refusal Training generalizes poorly.
- We supply guidelines that prompt the LLM to perform safety reasoning, eliciting its latent knowledge to defend against jailbreak attacks.

Mastering Massive Multi-Task Reinforcement Learning via MoE Decision Transformer
Yilun Kong, Guozheng Ma, Qi Zhao, Haoyu Wang, Li Shen, Xueqian Wang, Dacheng Tao

Step-on-feet Tuning: Scaling Self-alignment of LLMs via Bootstrapping
Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

Are large language models really robust to word-level perturbations?
Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, Zhiqi Huang, Suwei Ma, Yongzhe Chang, Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, Dacheng Tao

Learning better with less: Effective augmentation for sample-efficient visual reinforcement learning
Guozheng Ma, Linrui Zhang, Haoyu Wang, Lu Li, Zilin Wang, Zhen Wang, Li Shen, Xueqian Wang, Dacheng Tao