My research group develops principled methods for machine learning: algorithms with theoretical guarantees, mathematical understanding of modern learning systems, and rigorous foundations for emerging applications in generative AI, reinforcement learning, and agents. Recent work spans optimization and sampling, LLM post-training and alignment, reward modeling, formal reasoning, web and multimodal agents, and embodied AI. We aim to bridge theory and practice across the topics below.
Machine Learning Theory
We study the mathematical theory of modern machine learning, including learning theory for overparameterized models and deep neural networks, generalization, and sample complexity. The goal is a rigorous understanding of why current methods succeed and where they break.
Optimization and Sampling Algorithms
We develop principled algorithms with provable guarantees for large-scale learning, spanning convex and nonconvex optimization, adaptive and structured-gradient methods, bilevel optimization, and Langevin- and diffusion-based sampling.
Reinforcement Learning and Adaptive Decision Making
We design sample-efficient reinforcement learning algorithms with provable guarantees under partial observability, distribution shift, corruption, and adversarial conditions. This includes robust offline RL and sequential decision making in dynamic environments, with applications to generative AI and agents.
LLM Post-Training, Alignment, and Reasoning
We pursue principled approaches to aligning and improving large language models: reward modeling, preference optimization, multimodal and reasoning fine-tuning, and test-time computation. The aim is to move beyond ad hoc recipes toward methods with strong theoretical justification.
Agents, Embodied Systems, and Robotics
We develop principled learning methods for agents that act in the world, including digital agents with tool use and multimodal capabilities (vision–language and GUI agents). We are also beginning to expand into embodied and robotic agents, combining reinforcement learning, multimodal perception, and physical reasoning for autonomous control.