I am Yaowen Ye (or Elwin, or 叶耀文), a first-year CS PhD student at UC Berkeley advised by Prof. Jacob Steinhardt and Prof. Stuart Russell. I work on understanding the limitations of human oversight of AI systems and developing scalable approaches to address them. Recently, I've been thinking about:
- How can we implement the debate proposal in practice to realize its theoretical benefits? Can adversarial debate training produce models that are more truthful than RLHF-trained models?
- What emergent risks can arise when many agents are deployed in shared environments? Can existing oversight methods scale to address them effectively?
- How can we make RL robust to imperfect reward functions?
Feel free to reach out if you're interested in my research! I also enjoy mentoring, so if you're an undergrad who thinks my advice might be helpful, I'd be happy to connect.
Before joining Berkeley, I did my undergrad at The University of Hong Kong, where I also worked on cognitive reasoning, intuitive physics, learning on graphs, and recommender systems. I was fortunate to be advised by Prof. Yixin Zhu at the PKU Cognitive Reasoning Lab and Prof. Chao Huang at the HKU Data Intelligence Lab.
Links: [X] [Email] [Give me feedback!]