I am Yaowen Ye (or Elwin, or 叶耀文), a first-year CS PhD student at UC Berkeley advised by Prof. Jacob Steinhardt and Prof. Stuart Russell. I work on understanding the limitations of human oversight of AI systems and developing scalable approaches to address them. Recently, I've been thinking about:
- What emergent risks can arise when many agents are deployed in shared environments? Can existing oversight methods scale to address them?
- Can we predict which generalizations will emerge during LLM training, especially surprising or problematic ones? How might automated analysis of training data and model internals help?
- LLM chatbots operate in an unusual environment: their human users. How can we design incentives that discourage manipulating this environment for higher reward?
- How can we make RL robust to imperfect reward functions? How can we prevent reward hacking?
Feel free to reach out if you're interested in my research! I also enjoy mentoring, so if you are an undergrad and think my advice might be helpful, I'd be happy to connect.
Before joining Berkeley, I did my undergrad at The University of Hong Kong, during which I also worked on cognitive reasoning, intuitive physics, learning on graphs, and recommender systems. I was fortunate to be advised by Prof. Yixin Zhu at the PKU Cognitive Reasoning Lab and Prof. Chao Huang at the HKU Data Intelligence Lab.
Links: [X] [Scholar] [Email] [Give me feedback!]