Publications
2026
, "PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework With Variance Reduction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 48, (1), pp. 408-420, 2026. [doi]
, "An Improved Autoregressive Evaluation Paradigm for Large Language Models," ACM Trans. Intell. Syst. Technol., Association for Computing Machinery, New York, NY, USA, vol. 17, (4), April, 2026. [doi]
, "The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness," IEEE Transactions on Information Theory, 2026.
, "RM-R1: Reward modeling as reasoning," ICLR, 2026.
, "Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits," ICLR, 2026.
, "OpenGenAlign: A Preference Dataset and Benchmark for Trustworthy Reward Modeling in Open-Ended, Long-Context Generation," Findings of ACL, 2026.
2025
, "Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic," Transactions on Machine Learning Research, 2025.
, "Hessian-aware zeroth-order optimization," IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, 2025.
, "GEC: A unified framework for interactive decision making in MDP, POMDP, and beyond," Mathematics of Operations Research, 2025.
, "Entropy-Regularized Process Reward Model," Transactions on Machine Learning Research, 2025.
, "SEE-DPO: Self Entropy Enhanced Direct Preference Optimization," Transactions on Machine Learning Research, 2025.
, "Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning," Journal of Machine Learning Research, vol. 26, (128), pp. 1–47, 2025.
, "Fully First-Order Methods for Decentralized Bilevel Optimization," IEEE Transactions on Signal Processing, vol. 73, pp. 4734–4747, 2025.
, "AdaGrad under Anisotropic Smoothness: A Fine-Grained Analysis," The Thirteenth International Conference on Learning Representations, 2025.
, "Personalized Visual Instruction Tuning," The Thirteenth International Conference on Learning Representations, 2025.
, "Building Math Agents with Multi-Turn Iterative Preference Learning," The Thirteenth International Conference on Learning Representations, 2025.
, "Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods," Forty-second International Conference on Machine Learning, 2025.
, "MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving," Forty-second International Conference on Machine Learning, 2025.
, "Catoni Contextual Bandits are Robust to Heavy-tailed Rewards," Forty-second International Conference on Machine Learning, 2025.
, "EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents," Forty-second International Conference on Machine Learning, 2025.
, "Logarithmic Regret for Online KL-Regularized Reinforcement Learning," Forty-second International Conference on Machine Learning, 2025.
, "From Lists to Emojis: How Format Bias Affects Model Alignment," The 63rd Annual Meeting of the Association for Computational Linguistics, 2025.
, "ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting," The 63rd Annual Meeting of the Association for Computational Linguistics, 2025.
, "FANS: Formal Answer Selection for LLM Natural Language Math Reasoning Using Lean4," The 2025 Conference on Empirical Methods in Natural Language Processing, 2025.
, "Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability," The 2025 Conference on Empirical Methods in Natural Language Processing, 2025.
, "MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning," The 2025 Conference on Empirical Methods in Natural Language Processing, 2025.
, "MergeBench: A Benchmark for Merging Domain-Specialized LLMs," NeurIPS Datasets and Benchmarks Track, 2025.
, "Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL," NeurIPS, 2025.
, "GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents," NeurIPS, 2025.
, "ASGO: Adaptive Structured Gradient Optimization," NeurIPS, 2025.
, "Sharp Analysis for KL-Regularized Contextual Bandits and RLHF," NeurIPS, 2025.
2024
, "On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data," Transactions on Machine Learning Research, 2024.
, "Environment Invariant Linear Least Squares," Annals of Statistics, vol. 52, (5), pp. 2268–2292, 2024.
, "RLHF Workflow: From Reward Modeling to Online RLHF," Transactions on Machine Learning Research, 2024.
, "Fast Rates in Pool-Based Batch Active Learning," Journal of Machine Learning Research, vol. 25, (262), pp. 1–42, 2024.
, "On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training," Journal of Machine Learning Research, vol. 25, (356), pp. 1–46, 2024.
, "Spurious Feature Diversification Improves Out-of-distribution Generalization," The Twelfth International Conference on Learning Representations, 2024.
, "Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise," The Twelfth International Conference on Learning Representations, 2024.
, "Towards Robust Offline Reinforcement Learning under Diverse Data Corruption," The Twelfth International Conference on Learning Representations, 2024.
, "Reverse Diffusion Monte Carlo," The Twelfth International Conference on Learning Representations, 2024.
, "PerceptionGPT: Effectively Fusing Visual Perception into LLM," Conference on Computer Vision and Pattern Recognition 2024, 2024.
, "R-tuning: Teaching large language models to refuse unknown questions," NAACL' 24, 2024.
, "LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models," NAACL' 24 (Demo Track), 2024.
, "Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint," International Conference on Machine Learning, 2024.
, "The Non-linear $F$-Design and Applications to Interactive Learning," International Conference on Machine Learning, 2024.
, "Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning," International Conference on Machine Learning, 2024.
, "Faster Sampling via Stochastic Gradient Proximal Sampler," International Conference on Machine Learning, 2024.
, "Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption," International Conference on Machine Learning, 2024.
, "RAGTruth: A hallucination corpus for developing trustworthy retrieval-augmented language models," ACL, 2024.
, "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization," ECCV, 2024.
, "Mitigating the Alignment Tax of RLHF," The 2024 Conference on Empirical Methods in Natural Language Processing, 2024.
, "TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts," The 2024 Conference on Empirical Methods in Natural Language Processing, 2024.
, "MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance," The 2024 Conference on Empirical Methods in Natural Language Processing, 2024.
, "The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs," The 2024 Conference on Empirical Methods in Natural Language Processing, 2024.
, "Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts," Findings of EMNLP 2024, 2024.
, "LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Do CLIP Models Always Generalize Better than ImageNet Models?," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Online Iterative Reinforcement Learning from Human Feedback with General Preference Model," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs," The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
, "Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions," The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024.
2023
, "Black-box prompt learning for pre-trained language models," Transactions on Machine Learning Research, 2023.
, "Multi-Consensus Decentralized Accelerated Gradient Descent," Journal of Machine Learning Research, vol. 24, (306), pp. 1–50, 2023.
, "RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment," Transactions on Machine Learning Research, 2023.
, "Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game," ICLR, 2023.
, "Hashtag-Guided Low-Resource Tweet Classification," The Web Conference, 2023.
, "Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes," ICML, 2023.
, "Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories," ACL, 2023.
, "VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation," COLT, 2023.
, "Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency," COLT, 2023.
, "Catalyst Acceleration of Error Compensated Methods Leads to Better Communication Complexity," Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR, 2023.
, "Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage," NeurIPS, 2023.
, "A theoretical analysis of optimistic proximal policy optimization in linear Markov decision processes," NeurIPS, 2023.
, "Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training," NeurIPS, 2023.
, "Corruption-Robust Offline Reinforcement Learning with General Function Approximation," NeurIPS, 2023.
, "Posterior Sampling for Competitive RL: Function Approximation and Partial Observation," NeurIPS, 2023.
, "Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee," NeurIPS, 2023.
, "DetGPT: Detect What You Need via Reasoning," The 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
, "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data," Findings of EMNLP 2023, 2023.
, "Doolittle: Benchmarks and Corpora for Academic Writing Formalization," The 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
, "Plum: Prompt learning using metaheuristic," Findings of ACL, 2023.
2022
, "Convex Formulation of Overparameterized Deep Neural Networks," IEEE Transactions on Information Theory, vol. 68, (8), pp. 5340-5352, 2022.
, "Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning," SIAM Journal on Mathematics of Data Science, vol. 4, (2), pp. 834-857, 2022. [doi]
, "Weakly Supervised Disentangled Generative Causal Representation Learning," Journal of Machine Learning Research, vol. 23, pp. 1–55, 2022.
, "When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint," Journal of Machine Learning Research, vol. 23, (214), pp. 1–32, 2022.
, "Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums," International Conference on Learning Representations, 2022.
, "HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning," International Conference on Learning Representations, 2022.
, "Bayesian Invariant Risk Minimization," CVPR, 2022.
, "Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling," COLT, 2022.
, "Probabilistic Bilevel Coreset Selection," ICML, 2022.
, "Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets," ICML, 2022.
, "Sparse Invariant Risk Minimization," ICML, 2022.
2021
, "Estimating and inferring the maximum degree of stimulus-locked time-varying brain connectivity networks," Biometrics, Wiley Online Library, vol. 77, (2), pp. 379–390, 2021. [doi]
, "Finite-sample analysis for decentralized batch multiagent reinforcement learning with networked agents," IEEE Transactions on Automatic Control, IEEE, vol. 66, (12), pp. 5925–5940, 2021. [doi]
, "Mathematical Models of Overparameterized Neural Networks," Proceedings of the IEEE, vol. 109, (5), pp. 683–703, 2021.
, "A Stochastic Extra-Step Quasi-Newton Method for Nonsmooth Nonconvex Optimization," Mathematical Programming, 2021. [doi]
, "DeEPCA: Decentralized Exact PCA with Linear Convergence Rate," Journal of Machine Learning Research, vol. 22, (238), pp. 1-27, 2021.
, "Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation," ACL, 2021.
, "TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation," Findings of ACL, 2021.
, "DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural Networks," KDD, 2021. [code available]
, "Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks," COLT, 2021.
, "Efficient Neural Network Training via Forward and Backward Propagation Sparsification," NeurIPS, 2021.
, "Error Compensated Distributed SGD can be Accelerated," NeurIPS, 2021.
2020
, "End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, pp. 1317–1332, 2020.
, "Proximal Gradient Method for Nonsmooth Optimization over the Stiefel Manifold," SIAM Journal on Optimization, vol. 30, (1), pp. 210-239, 2020.
, "MAP Inference via L2-Sphere Linear Program Reformulation," International Journal of Computer Vision, 2020.
, "Local uncertainty sampling for large-scale multiclass logistic regression," Ann. Statist., vol. 48, (3), pp. 1770-1788, 2020. [doi]
, "Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization," ICML, 2020. [code available]
, "Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems," NeurIPS, 2020.
, "Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS," NeurIPS, 2020.
, "Decentralized Accelerated Proximal Gradient Descent," NeurIPS, 2020.
, "How to Characterize The Landscape of Overparameterized Convolutional Neural Networks," NeurIPS, 2020.
, "Improving Chinese Word Segmentation with Wordhood Memory Networks," Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 8274–8285, 2020.
, "ZEN: pre-training Chinese text encoder enhanced by n-gram representations," Findings of EMNLP, 2020. [code available]
2019
, "Robust Frequent Directions with Application in Online Learning," Journal of Machine Learning Research, vol. 20, (45), pp. 1-41, 2019.
, "Utilizing Second Order Information in Minibatch Stochastic Variance Reduced Proximal Iterations," Journal of Machine Learning Research, vol. 20, (42), pp. 1-56, 2019.
, "Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python," Journal of Machine Learning Research, vol. 20, (44), pp. 1-5, 2019.
, "Layer-Wise Learning Strategy for Nonparametric Tensor Product Smoothing Spline Regression and Graphical Models," Journal of Machine Learning Research, vol. 20, (119), pp. 1-38, 2019.
, "A framework of composite functional gradient methods for generative adversarial models," IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, vol. 43, (1), pp. 17–32, 2019. [doi]
, "DHER: Hindsight Experience Replay for Dynamic Goals," ICLR, 2019.
, "NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks," ICML, 2019.
, "DOUBLESQUEEZE: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression," ICML, 2019.
, "Divergence-Augmented Policy Optimization," NeurIPS, 2019.
2018
, "I-LAMM for Sparse Learning: Simultaneous Control of Algorithmic Complexity and Statistical Error," Annals of Statistics, vol. 46, pp. 814-841, 2018.
, "Pathwise coordinate optimization for sparse learning: Algorithm and theory," Ann. Statist., vol. 46, (1), pp. 180-218, 2018. [doi]
, "Near-Optimal Stochastic Approximation for Online Principal Component Estimation," Mathematical Programming, vol. 167, pp. 75–97, 2018.
, "Gradient Hard Thresholding Pursuit," Journal of Machine Learning Research, vol. 18, (166), pp. 1-43, 2018.
, "Bayesian Model Averaging With Exponentiated Least Squares Loss," IEEE Transactions on Information Theory, vol. 64, (5), pp. 3331-3345, 2018. [doi]
, "Learning to Remember Translation History with a Continuous Cache," Transactions of the Association for Computational Linguistics, vol. 6, pp. 407–420, 2018.
, "Graph-Guided Multi-Task Sparse Learning Model: a Method for Identifying Antigenic Variants of Influenza A(H3N2) Virus," Bioinformatics, vol. 105, pp. 769–782, 2018.
, "Sparse Generalized Eigenvalue Problem: Optimal Statistical Rates via Truncated Rayleigh Flow," Journal of the Royal Statistical Society: Series B, vol. 80, pp. 1057–1086, 2018.
, "A Convex Formulation For High-Dimensional Sparse Sliced Inverse Regression," Biometrika, vol. 105, (4), pp. 769–782, 2018.
, "Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization," ICML, 2018.
, "Composite Functional Gradient Learning of Generative Adversarial Models," ICML, 2018. [code available] [long version]
, "SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator," NIPS, 2018.
, "Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity," NIPS, 2018.
, "Communication Compression for Decentralized Training," NIPS, 2018.
2017
, "Hierarchical Contextual Attention Recurrent Neural Network for Map Query Suggestion," IEEE Transactions on Knowledge and Data Engineering, 2017. [doi]
, "A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization," Journal of Machine Learning Research, vol. 18, (115), pp. 1-52, 2017.
, "Efficient Distributed Learning with Sparsity," ICML, 2017.
2016
, "Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization," Mathematical Programming, vol. 155, pp. 105–145, 2016. [code available]
, "Towards More Efficient SPSD Matrix Approximation and CUR Matrix Decomposition," Journal of Machine Learning Research, vol. 17, pp. 1–49, 2016.
, "Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings," ICML, 2016. [code available]
, "Generalized Hierarchical Sparse Model for Arbitrary-Order Interactive Antigenic Sites Identification in Flu Virus Data," KDD' 16, 2016.
, "Exact Recovery of Hard Thresholding Pursuit," NIPS, 2016.
2015
, "Learning Sparse Low-Threshold Linear Classifiers," Journal of Machine Learning Research, vol. 16, pp. 1275–1304, 2015.
, "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks," NAACL' 15, 2015. [code available]
, "Stochastic Optimization with Importance Sampling for Regularized Loss Minimization," ICML' 15, 2015.
, "Adaptive Stochastic Alternating Direction Method of Multipliers," ICML' 15, 2015.
, "Crowd Fraud Detection in Internet Advertising," WWW' 15, 2015.
, "Local Smoothness in Variance Reduced Optimization," NIPS, 2015.
2014
, "Partial Gaussian Graphical Model Estimation," IEEE Transactions on Information Theory, vol. 60, pp. 1673–1687, 2014.
, "Random Design Analysis of Ridge Regression," Foundations of Computational Mathematics, Springer US, pp. 1-32, 2014. [doi]
, "Aggregation of affine estimators," Electron. J. Statist., vol. 8, pp. 302-327, 2014. [doi]
, "Learning Nonlinear Functions Using Regularized Greedy Forest," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, pp. 942–954, 2014. [code available]
, "Optimal computational and statistical rates of convergence for sparse nonconvex learning problems," Ann. Statist., vol. 42, (6), pp. 2164-2201, 2014. [doi]
, "A Proximal Stochastic Gradient Method with Progressive Variance Reduction," SIAM Journal on Optimization, vol. 24, pp. 2057–2075, 2014.
, "Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization," ICML' 14, 2014. [long version]
, "Communication-Efficient Distributed Optimization using an Approximate Newton-type Method," ICML' 14, 2014.
, "Compressed Counting Meets Compressed Sensing," COLT' 14, 2014.
2013
, "Multistage Convex Relaxation for Feature Selection," Bernoulli, vol. 19, pp. 2277–2293, 2013.
, "Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization," Journal of Machine Learning Research, vol. 14, pp. 567–599, 2013.
, "Truncated Power Method for Sparse Eigenvalue Problems," Journal of Machine Learning Research, vol. 14, pp. 899–925, 2013. [code available]
, "A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem," SIAM Journal on Optimization, vol. 23, (2), pp. 1062-1091, 2013. [doi]
, "A Joint Matrix Completion and Filtering Model for Influenza Serological Data Integration," PLoS ONE, vol. 8, (7), pp. e69842, 2013. [doi]
, "Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes," ICML'13, 2013.
, "Accelerated Mini-Batch Stochastic Dual Coordinate Ascent," NIPS' 13, 2013.
2012
, "A Spectral Algorithm for Learning Hidden Markov Models," Journal of Computer and System Sciences, vol. 78, (5), pp. 1460-1480, 2012.
, "Tail inequalities for sums of random matrices that depend on the intrinsic dimension," Electronic Communications in Probability, vol. 17, article 14, 2012.
, "A General Theory of Concave Regularization for High Dimensional Sparse Estimation Problems," Statistical Science, vol. 27, pp. 576–593, 2012.
, "Deviation Optimal Learning using Greedy Q-aggregation," Annals of Statistics, vol. 40, pp. 1878–1905, 2012.
, "Identifying antigenicity associated sites in highly pathogenic H5N1 influenza virus hemagglutinin by using sparse learning," Journal of Molecular Biology, 2012.
, "A tail inequality for quadratic forms of subgaussian random vectors," Electronic Communications in Probability, vol. 17, article 52, 2012.
, "Random Design Analysis of Ridge Regression," COLT'12, 2012. [long version]
, "A Proximal-Gradient Homotopy Method for the L1-Regularized Least-Squares Problem," ICML'12, 2012. [long version]
, "Selective Labeling via Error Bound Minimization," NIPS'12, 2012.
2011
, "Adaptive Forward-Backward Greedy Algorithm for Learning Sparse Representations," IEEE Transactions on Information Theory, vol. 57, pp. 4689–4708, 2011. [code available]
, "Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation," PLoS Comput Biol, Public Library of Science, vol. 7, (6), pp. e1001106, 2011. [doi]
, "Sparse Recovery with Orthogonal Matching Pursuit under RIP," IEEE Transactions on Information Theory, vol. 57, pp. 6215–6221, 2011.
, "Concepts and applications for influenza antigenic cartography," Influenza and Other Respiratory Viruses, vol. 5, (Suppl. 1), pp. 204–207, 2011.
, "Robust Matrix Decomposition with Sparse Corruptions," IEEE Transactions on Information Theory, vol. 57, pp. 7221–7234, 2011.
, "Learning with Structured Sparsity," Journal of Machine Learning Research, vol. 12, (103), pp. 3371-3412, 2011.
, "Efficient Optimal Learning for Contextual Bandits," UAI'11, 2011. [long version]
, "Greedy Model Averaging," NIPS' 11, 2011.
, "Learning to Search Efficiently in High Dimensions," NIPS' 11, 2011.
, "Spectral Methods for Learning Multivariate Latent Tree Structure," NIPS' 11, 2011.
2010
, "The Benefit of Group Sparsity," Annals of Statistics, vol. 38, pp. 1978–2004, 2010.
, "Analysis of Multi-stage Convex Relaxation for Sparse Regularization," Journal of Machine Learning Research, vol. 11, (35), pp. 1081-1107, 2010. [code available]
, "Trading Accuracy for Sparsity in Optimization Problems with Sparsity Constraints," SIAM Journal on Optimization, vol. 20, pp. 2807–2832, 2010.
, "A Computational Framework for Influenza Antigenic Cartography," PLoS Comput Biol, Public Library of Science, vol. 6, (10), pp. e1000949, 2010. [doi]
, "Improved Local Coordinate Coding using Local Tangents," ICML' 10, 2010.
, "Agnostic Active Learning Without Constraints," NIPS' 10, 2010.
, "Deep Coding Network," NIPS' 10, 2010.
2009
, "Some sharp performance bounds for least squares regression with $L_1$ regularization," Ann. Statist., vol. 37, (5A), pp. 2109-2144, 2009. [doi]
, "On the Consistency of Feature Selection using Greedy Least Squares Regression," Journal of Machine Learning Research, vol. 10, (19), pp. 555-568, 2009.
, "Sparse Online Learning via Truncated Gradient," Journal of Machine Learning Research, vol. 10, (28), pp. 777-801, 2009.
, "Classifying Search Queries Using the Web as a Source of Knowledge," ACM Transactions on the Web, vol. 3, pp. 1–28, 2009.
, "Fundamental Statistical Techniques," Handbook of Natural Language Processing, N. Indurkhya, F. Damerau (editors), Chapman & Hall/CRC, 2009.
, "Learning with Structured Sparsity," International Conference on Machine Learning 2009, 2009. [long version]
, "Learning Nonlinear Dynamic Models," ICML' 09, 2009.
, "A Spectral Algorithm for Learning Hidden Markov Models," COLT' 09, 2009. [long version]
, "Multi-label prediction via compressed sensing," NIPS' 09, 2009.
, "Nonlinear Learning using Local Coordinate Coding," NIPS' 09, 2009. [long version]
2008
, "Graph-based Semi-supervised Learning and Spectral Kernel Design," IEEE Trans. Info. Theory, vol. 54, pp. 275–288, 2008.
, "An Online Relevant Set Algorithm for Statistical Machine Translation," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, (7), pp. 1274–1286, 2008.
, "Statistical Analysis of Bayes Optimal Subset Ranking," IEEE Transactions on Information Theory, vol. 54, (11), pp. 5140-5154, 2008.
, "Sparse Online Learning via Truncated Gradient," NIPS'08, 2008.
2007
, "On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning," Journal of Machine Learning Research, vol. 8, pp. 1489–1517, 2007.
, "A Block Bigram Prediction Model for Statistical Machine Translation," ACM Transactions on Speech and Language Processing, vol. 4, 2007.
, "Two-view Feature Generation Model for Semi-supervised Learning," ICML'07, 2007.
, "Margin Based Active Learning," COLT'07, 2007.
, "Robust Classification of Rare Queries Using Web Knowledge," SIGIR'07, 2007.
2006
, "From $ε$-entropy to KL-entropy: Analysis of Minimum Information Complexity Density Estimation," The Annals of Statistics, vol. 34, pp. 2180–2210, 2006.
, "Information Theoretical Upper and Lower Bounds for Statistical Estimation," IEEE Trans. Info. Theory, vol. 52, pp. 1307–1321, 2006.
, "Subset Ranking using Regression," Proc. COLT'06, 2006. [long version]
, "A Discriminative Global Training Algorithm for Statistical MT," ACL'06, 2006. [long version]
, "Learning on Graph with Laplacian Regularization," NIPS'06, 2006. [long version]
2005
, "Boosting with Early Stopping: Convergence and Consistency," The Annals of Statistics, vol. 33, pp. 1538–1579, 2005.
, "Learning Bounds for Kernel Regression using Effective Data Dimensionality," Neural Computation, vol. 17, pp. 2077–2098, 2005.
, "A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data," Journal of Machine Learning Research, vol. 6, pp. 1817–1853, 2005.
, "Localized Upper and Lower Bounds for Some Estimation Problems," COLT 05, 2005.
, "Analysis of Spectral Kernel Design based Semi-supervised Learning," NIPS 05, 2005. [long version]
2004
, "Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization," The Annals of Statistics, vol. 32, pp. 56–85, 2004.
, "Text Categorization for a Comprehensive Time-Dependent Benchmark," Information Processing & Management, vol. 40, pp. 209-221, 2004.
, "Statistical Analysis of Some Multi-category Large Margin Classification Methods," Journal of Machine Learning Research, vol. 5, pp. 1225–1251, 2004.
, "Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms," ICML 04, pp. 919–926, 2004.
, "Focused Named Entity Recognition using Machine Learning," SIGIR 04, 2004.
, "On the Convergence of MDL Density Estimation," COLT 2004, pp. 315–330, 2004.
, "Column-Generation Boosting Methods for Mixture of Kernels," KDD 2004, 2004.
, "Support Vector Classification with Input Data Uncertainty," NIPS 04, 2004.
2003
, "Sequential Greedy Approximation for Certain Convex Optimization Problems," IEEE Transactions on Information Theory, vol. 49, pp. 682–691, 2003.
, "Leave-one-out Bounds for Kernel Methods," Neural Computation, vol. 15, pp. 1397–1437, 2003.
, "Generalization Error Bounds for Bayesian Mixture Algorithms," Journal of Machine Learning Research, vol. 4, pp. 839–860, 2003.
, "Greedy Algorithms for Classification - Consistency, Convergence Rates, and Adaptivity," Journal of Machine Learning Research, vol. 4, pp. 713–741, 2003.
, "Updating an NLP System to Fit New Domains: an empirical study on the sentence segmentation problem," Proceedings CoNLL-2003, pp. 56–62, 2003.
, "Named Entity Recognition through Classifier Combination," Proceedings CoNLL-2003, pp. 168–171, 2003.
, "A Robust Risk Minimization based Named Entity Recognition System," Proceedings CoNLL-2003, pp. 204–207, 2003.
, "On the Convergence of Boosting Procedures," ICML 03, pp. 904–911, 2003. [long version]
, "HowtogetaChineseName (Entity): Segmentation and Combination Issues," EMNLP 2003, pp. 200-207, 2003.
2002
, "On the dual formulation of regularized linear systems," Machine Learning, vol. 46, pp. 91–129, 2002.
, "On the Consistency of Instantaneous Rigid Motion Estimation," International Journal of Computer Vision, vol. 46, pp. 51–79, 2002.
, "Covering Number Bounds of Certain Regularized Linear Function Classes," Journal of Machine Learning Research, vol. 2, pp. 527–550, 2002.
, "Text Chunking based on a Generalization of Winnow," Journal of Machine Learning Research, vol. 2, pp. 615–637, 2002.
, "Recommender Systems Using Linear Classifiers," Journal of Machine Learning Research, vol. 2, pp. 313–334, 2002.
, "A Decision-Tree-Based Symbolic Rule Induction System for Text Categorization," IBM Systems Journal, vol. 41, pp. 428–437, 2002.
, "Two-sided Arnoldi and non-symmetric Lanczos Algorithms," SIAM Journal on Matrix Analysis and Applications, vol. 24, pp. 303–319, 2002.
, "Approximation Bounds for Some Sparse Kernel Regression Algorithms," Neural Computation, vol. 14, pp. 3013–3042, 2002.
, "The Consistency of Greedy Algorithms for Classification," COLT 02, pp. 319–333, 2002. [long version]
, "Statistical Behavior and Consistency of Support Vector Machines, Boosting, and Beyond," ICML 02, pp. 690–697, 2002. [long version]
, "Effective dimension and Generalization of Kernel Learning," NIPS 2002, 2002. [long version]
, "Data-Dependent Bounds for Bayesian Mixture Methods," NIPS 2002, 2002. [long version]
, "Experiments in High-Dimensional Text Categorization," SIGIR 2002, 2002. [long version]
2001
, "Text Categorization based on regularized linear classification methods," Information Retrieval, vol. 4, pp. 5–31, 2001.
, "Rank-one approximation to high order tensors," SIAM Journal on Matrix Analysis and Applications, vol. 23, pp. 534–550, 2001.
, "Empirical Study of Recommender Systems Using Linear Classifiers," The Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 16–27, 2001.
, "Some Sparse Approximation Bounds for Regression Problems," The Eighteenth International Conference on Machine Learning, pp. 624–631, 2001. [long version]
, "Text Chunking using Regularized Winnow," 39th Annual Meeting of the Association for Computational Linguistics, pp. 539–546, 2001. [long version]
, "A Sequential Approximation Bound for Some Sample-Dependent Convex Optimization Problems with Applications in Learning," 14th Annual Conference on Computational Learning Theory, pp. 65–81, 2001.
, "A Leave-one-out Cross Validation Bound for Kernel Methods with Applications in Learning," 14th Annual Conference on Computational Learning Theory, pp. 427–443, 2001. [long version]
, "A General Greedy Approximation Algorithm with Applications," Advances in Neural Information Processing Systems 14, T. G. Dietterich, S. Becker, Z. Ghahramani (editors), MIT Press, Cambridge, MA, 2001. [long version]
, "Generalization Performance of Some Learning Problems in Hilbert Functional Spaces," Advances in Neural Information Processing Systems 14, T. G. Dietterich, S. Becker, Z. Ghahramani (editors), MIT Press, Cambridge, MA, 2001.
2000
, "A method for reduced-order modeling and simulation of large interconnect circuits and its application to PEEC models including retardation," IEEE Trans. Circ. Sys., vol. 47, pp. 261–273, 2000.
, "A probability analysis on the value of unlabeled data for classification problems," ICML 00, pp. 1191–1198, 2000.
, "Active learning using adaptive resampling," The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 91–98, 2000.
, "Convergence of Large Margin Separable Linear Classification," Advances in Neural Information Processing Systems 13, pp. 357–363, 2000.
, "Regularized Winnow Methods," Advances in Neural Information Processing Systems 13, pp. 703–709, 2000.