Publications


Avoiding Catastrophe in Continuous Spaces by Asking for Help

Benjamin Plaut, Hanlin Zhu, Stuart Russell

preprint, 2024


Efficient Prompt Caching via Embedding Similarity

Hanlin Zhu, Banghua Zhu, Jiantao Jiao

preprint, 2024


Towards Optimal Statistical Watermarking

Baihe Huang, Hanlin Zhu, Banghua Zhu, Kannan Ramchandran, Michael I. Jordan, Jason D. Lee, Jiantao Jiao

preprint, 2024


On Representation Complexity of Model-based and Model-free Reinforcement Learning

Hanlin Zhu*, Baihe Huang*, Stuart Russell

International Conference on Learning Representations (ICLR), 2024


End-to-end Story Plot Generator

Hanlin Zhu*, Andrew Cohen*, Danqing Wang, Kevin Yang, Xiaomeng Yang, Jiantao Jiao, Yuandong Tian

preprint, 2023


Learning Personalized Story Evaluation

Danqing Wang, Kevin Yang, Hanlin Zhu, Xiaomeng Yang, Andrew Cohen, Lei Li, Yuandong Tian

preprint, 2023


Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability

Hanlin Zhu, Amy Zhang

Conference on Neural Information Processing Systems (NeurIPS), 2023


Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao

Conference on Neural Information Processing Systems (NeurIPS), 2023


Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao

International Conference on Learning Representations (ICLR), 2023 (Spotlight)


Provably Efficient Reinforcement Learning via Surprise Bound

Hanlin Zhu, Ruosong Wang, Jason D. Lee

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023


Average-Case Communication Complexity of Statistical Problems

Cyrus Rashtchian, David P. Woodruff, Peng Ye, Hanlin Zhu (α-β order)

Conference on Learning Theory (COLT), 2021


Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and Graph Problems

Cyrus Rashtchian, David P. Woodruff, Hanlin Zhu (α-β order)

International Conference on Randomization and Computation (RANDOM), 2020


Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog

Ryuichi Takanobu, Hanlin Zhu, Minlie Huang

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019