Wenhan Xiong

About Me

I am a research scientist on the GenAI team at Meta. My day-to-day work aims at building open LLMs with leading performance. Before joining Meta, I received my PhD in NLP from the University of California, Santa Barbara, and my bachelor's degree from the University of Science and Technology of China.

My recent research interests lie primarily in long-context LLMs targeting tasks that require dense information flow and highly skilled expertise. This line of work typically involves meticulous data recipes, efficient parallel training frameworks, hardware-aware model architecture designs that go beyond theoretical speedups, and dedicated alignment methods. Please do not hesitate to reach out if you are also interested in these topics!

Contact: xwhan.cs@gmail.com

Selected Publications

(Full publication list at Google Scholar)

Effective Long-Context Scaling of Foundation Models
Wenhan Xiong†, Jingyu Liu†, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma
arXiv, 2023, [pdf]
Code Llama: Open Foundation Models for Code
Baptiste Rozière†, Jonas Gehring†, Fabian Gloeckle†, Sten Sootla†, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi⋄, Jingyu Liu, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve†
arXiv, 2023, [pdf] [code]
Adapting Pretrained Text-to-Text Models for Long Text Sequences
Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih
EMNLP 2023, Findings, [pdf] [code]
Simple Local Attentions Remain Competitive for Long-Context Tasks
Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad
NAACL 2022, [pdf]

Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
Wenhan Xiong*, Xiang Lorraine Li*, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz
ICLR 2021, [code] [pdf]

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
ICLR 2020, [pdf]
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning
Wenhan Xiong, Thien Hoang, William Yang Wang
EMNLP 2017, long oral [code] [pdf] [slides]

Media Coverage

"Meta quietly unveils Llama 2 Long AI that beats GPT-3.5 Turbo and Claude 2 on some tasks" - VentureBeat

"Meta's Llama 2 Long AI Model Stuns with Next-Level Responses, Outperforming Rivals" - TechTimes

"Meta launches own AI code-writing tool: Code Llama" - The Verge

"Meta to launch AI model for writing computer codes" - REUTERS

Services

Area Chair: EMNLP 2022

Session Chair: ACL 2021, NAACL 2021

Program committee/reviewer: ACL (2018, 2019, 2020, 2021), EMNLP (2018, 2019, 2020, 2021), NAACL (2019, 2021), NeurIPS (2021), ICLR (2021), ACL ARR