Ph.D. in ICME, Stanford University | Email: peng.xu@numbersstation.ai
I am a principal engineer at Numbers Station, working on AI platforms for enterprise data automation. Previously, I was an applied scientist at Amazon Web Services AI Labs, working on Kendra. I received my Ph.D. from the Institute for Computational and Mathematical Engineering (ICME) at Stanford University, where I was co-advised by Chris Ré and Michael Mahoney. My research interests lie in randomized linear algebra, optimization, and machine learning. In particular, I am interested in leveraging tools from randomized linear algebra to provide efficient and scalable solutions for large-scale optimization and learning problems. Before joining Stanford, I received my B.S. in Mathematics from Fudan University.
Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner,
Danilo Ribeiro, Shen Wang, Xiaofei Ma, Rui Dong, Xiaokai Wei, Henry Zhu, Xinchi Chen, Zhiheng Huang, Peng Xu, Andrew Arnold, and Dan Roth,
arXiv preprint, 2022. [arXiv]
Contrastive Document Representation Learning with Graph Attention Networks,
Peng Xu, Xinchi Chen, Xiaofei Ma, Zhiheng Huang, Bing Xiang,
Findings of EMNLP, 2021. [arXiv]
Attention-guided Generative Models for Extractive Question Answering,
Peng Xu*, Davis Liang*, Zhiheng Huang, Bing Xiang,
arXiv preprint, 2021. [arXiv]
Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering,
Alexander Hanbo Li, Patrick Ng, Peng Xu, Henghui Zhu, Zhiguo Wang, Bing Xiang,
ACL, 2021. [arXiv]
Inexact Newton-CG Algorithms With Complexity Guarantees,
Zhewei Yao, Peng Xu, Fred Roosta, Stephen J. Wright, and Michael W. Mahoney,
arXiv preprint, 2021. [arXiv]
Multiplicative Position-aware Transformer Models for Language Understanding,
Zhiheng Huang, Davis Liang, Peng Xu, Bing Xiang,
arXiv preprint, 2021. [arXiv]
Embedding-based Zero-shot Retrieval through Query Generation,
Davis Liang*, Peng Xu*, Siamak Shakeri, Cicero Nogueira dos Santos, Ramesh Nallapati, Zhiheng Huang, Bing Xiang,
arXiv preprint, 2020. [arXiv][code]
Improve Transformer Models with Better Relative Position Embeddings,
Zhiheng Huang, Davis Liang, Peng Xu, Bing Xiang,
Findings of EMNLP, 2020. [arXiv]
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding,
Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang,
arXiv preprint, 2020. [arXiv]
Domain Adaptation with BERT-based Domain Classification and Data Selection,
Xiaofei Ma, Peng Xu, Zhiguo Wang, Ramesh Nallapati, and Bing Xiang,
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo), 2019. [paper]
Passage Ranking with Weak Supervision,
Peng Xu, Xiaofei Ma, Ramesh Nallapati, and Bing Xiang,
arXiv preprint, 2019. [arXiv]
ICLR LLD workshop, 2019. [paper]
Trust Region Based Adversarial Attack on Neural Networks,
Zhewei Yao, Amir Gholami, Peng Xu, Kurt Keutzer, and Michael W. Mahoney,
arXiv preprint, 2018. [arXiv]
Computer Vision and Pattern Recognition (CVPR), 2019.
Newton-MR: Newton's Method Without Smoothness or Convexity,
Fred Roosta, Yang Liu, Peng Xu, and Michael W. Mahoney,
arXiv preprint, 2018. [arXiv]
Inexact Non-Convex Newton-Type Methods,
Zhewei Yao, Peng Xu, Fred Roosta, and Michael W. Mahoney,
arXiv preprint, 2018. [arXiv]
GIANT: Globally Improved Approximate Newton Method for Distributed Optimization,
Shusen Wang, Fred Roosta, Peng Xu, and Michael W. Mahoney,
Neural Information Processing Systems (NeurIPS), 2018. [arXiv][spark code]
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study,
Peng Xu, Fred Roosta, and Michael W. Mahoney,
arXiv preprint, 2017. [arXiv][code]
SIAM International Conference on Data Mining (SDM), 2020. [paper]
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information,
Peng Xu, Fred Roosta, and Michael W. Mahoney,
arXiv preprint, 2017. [arXiv]
Mathematical Programming, 2019. [paper]
Accelerated Stochastic Power Iteration,
Peng Xu, Bryan He, Christopher De Sa, Ioannis Mitliagkas, and Christopher Ré.
arXiv preprint, 2017. [arXiv][code][blog]
International Conference on Artificial Intelligence and Statistics (AISTATS), 2018. [full paper]
Socratic Learning: Correcting Misspecified Generative Models using Discriminative Models,
Paroma Varma, Bryan He, Dan Iter, Peng Xu, Rose Yu, Christopher De Sa, and Christopher Ré.
arXiv preprint, 2017. [arXiv]
Sub-sampled Newton Methods with Non-uniform Sampling,
Peng Xu, Jiyan Yang, Fred Roosta, Christopher Ré, and Michael W. Mahoney,
Neural Information Processing Systems (NeurIPS), 2016. [paper][arXiv][code]
Teaching assistant at Stanford University for:
Statistical Learning Theory (CS229T), Winter 2015
Stochastic Methods in Engineering (CME 308), Spring 2015
Linear Algebra with Application to Engineering Computations (CME 200), Autumn 2015
Teaching assistant at PCMI Graduate Summer School 2016: The Mathematics of Data