Yixuan Su (苏熠暄)
Email: ys484 AT outlook.com
Work Experience
- May 2023 - Present
Research Scientist at Cohere
- 2021 Summer
Research intern with Lei Shu and Yi Zhang
Amazon AWS AI, Seattle, US
- 2020 Summer
Research intern with David Vandyke
Apple Siri, Cambridge, UK
- 2018-2019
Research intern with Yan Wang and Xiaojiang Liu
Tencent AI Lab, Shenzhen, China
- 2017 Summer
Research intern with Lei Ji
Microsoft Research Asia, Beijing, China
Education
- Oct. 2019 - Nov. 2023
Ph.D. in Computation, Cognition and Language, University of Cambridge
Advisor: Prof. Nigel Collier
Thesis Committee: Prof. Andreas Vlachos and Prof. Mirella Lapata
- Oct. 2017 - Sep. 2018
M.Phil in Machine Learning, Speech and Language Technology, University of Cambridge
Advisors: Prof. Anna Korhonen and Dr. Simon Baker
- Sep. 2013 - Jun. 2017
B.S., Department of Engineering, Beijing Institute of Technology
I have also collaborated with many awesome people:
- Danqi Chen, Princeton University
- Dani Yogatama, DeepMind
- Lingpeng Kong, University of Hong Kong
News
- [2023/11/06] Completed my Ph.D. viva, and I am now officially a Dr.! Huge thanks to my advisor, Prof. Nigel Collier, and my thesis committee, Prof. Andreas Vlachos and Prof. Mirella Lapata!
- [2023/10] One paper accepted to TLLM 2023, one paper accepted to NeurIPS 2023, and two papers accepted to EMNLP 2023.
- [2023/05] Started my journey at Cohere!
- [2023/05/23] Released PandaGPT, the first foundation model capable of following instructions across six modalities.
- [2023/05/04] Released OpenAlpaca, a fully open-source instruction-following model based on OpenLLaMA.
- [2023/02/14] Our manuscript "Contrastive Search Is What You Need For Neural Text Generation" was accepted to TMLR 2023!
- [2022/11/22] Released a technical report, "An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation", that compares Contrastive Search with Meta's recently proposed Contrastive Decoding. [arxiv] [code]
- [2022/10/26] Released our new preprint "Contrastive Search Is What You Need For Neural Text Generation". Check it out! [arxiv] [code]
- [2022/09/14] Two papers (including SimCTG) accepted to NeurIPS 2022. See you in New Orleans!
- [2022/08/02] One paper accepted to CIKM 2022. See you in Atlanta!
- [2022/05/06] I am excited to release my latest work, MAGIC, the SOTA method on zero-shot multi-modal text generation tasks (e.g., zero-shot image captioning and visually grounded story generation). Check it out! [arxiv] [code]
- [2022/04/08] One paper accepted to NAACL 2022. See you in Seattle!
- [2022/02/24] Two papers accepted to ACL 2022. See you in Dublin!
Publications
(*: equal contribution)
Selected Publications
-
PandaGPT: One Model To Instruction-Follow Them All
[project page]
[online demo]
[paper]
[code]
Yixuan Su*, Tian Lan*, Huayang Li*, Jialu Xu, Yan Wang, and Deng Cai*
In Proceedings of the 1st Workshop on Taming Large Language Models (TLLM 2023)
-
OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA
[project page]
[model cards]
Yixuan Su*, Tian Lan*, and Deng Cai
-
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
[arxiv]
[paper]
[code]
Yixuan Su and Nigel Collier
In Transactions on Machine Learning Research (TMLR 2023)
-
Language Models Can See: Plugging Visual Controls in Text Generation
[arxiv]
[code]
Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, and Nigel Collier
arXiv:2205.02655
-
[NeurIPS'22] A Contrastive Framework for Neural Text Generation
[arxiv]
[paper]
[code]
Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, and Nigel Collier
In Advances in Neural Information Processing Systems (NeurIPS 2022 Spotlight).
-
[NAACL'22-Findings] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
[arxiv]
[paper]
[code]
Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, and Nigel Collier
In Findings of the North American Chapter of the Association for Computational Linguistics (NAACL 2022).
-
[ACL'22] Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System
[arxiv]
[paper]
[code]
Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022).
Full Publications (In Reverse Chronological Order)
-
Specialist or Generalist? Instruction Tuning for Specific NLP Tasks
[arxiv]
[paper]
[code]
Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, and Deng Cai
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).
-
Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization
[arxiv]
[paper]
[code]
Zihao Fu, Yixuan Su, Zaiqiao Meng, and Nigel Collier
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).
-
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
[arxiv]
[paper]
[code]
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, and Yixuan Su
In Advances in Neural Information Processing Systems (NeurIPS 2023).
-
PandaGPT: One Model To Instruction-Follow Them All
[project page]
[online demo]
[paper]
[code]
Yixuan Su*, Tian Lan*, Huayang Li*, Jialu Xu, Yan Wang, and Deng Cai*
In Proceedings of the 1st Workshop on Taming Large Language Models (TLLM 2023)
-
OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA
[project page]
[model cards]
Yixuan Su*, Tian Lan*, and Deng Cai
-
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
[arxiv]
[paper]
[code]
Yixuan Su and Nigel Collier
In Transactions on Machine Learning Research (TMLR 2023)
-
[GEM'22] Plug-and-Play Recipe Generation with Content Planning
[arxiv]
[code]
Yinhong Liu, Yixuan Su, Ehsan Shareghi, and Nigel Collier
In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2022)
-
Momentum Decoding: Open-ended Text Generation As Graph Exploration
[arxiv]
[code]
Tian Lan*, Yixuan Su*, Shuhang Liu, Heyan Huang, and Xian-Ling Mao
arXiv:2212.02175
-
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
[arxiv]
[code]
Yixuan Su and Jialu Xu
arXiv:2211.10797
-
Language Models Can See: Plugging Visual Controls in Text Generation
[arxiv]
[code]
Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, and Nigel Collier
arXiv:2205.02655
-
[NeurIPS'22] A Contrastive Framework for Neural Text Generation
[arxiv]
[paper]
[code]
Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, and Nigel Collier
In Advances in Neural Information Processing Systems (NeurIPS 2022 Spotlight).
-
[NeurIPS'22] Measuring and Reducing Model Update Regression in Structured Prediction for NLP
[arxiv]
[paper]
Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, and Yi Zhang
In Advances in Neural Information Processing Systems (NeurIPS 2022).
-
[CIKM'22] From Easy to Hard: A Dual Curriculum Learning Framework for Context-Aware Document Ranking
[arxiv]
[paper]
[code]
Yutao Zhu, Jian-Yun Nie, Yixuan Su, Haonan Chen, Xinyu Zhang, and Zhicheng Dou
In Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM 2022).
-
[NAACL'22-Findings] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
[arxiv]
[paper]
[code]
Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, and Nigel Collier
In Findings of the North American Chapter of the Association for Computational Linguistics (NAACL 2022).
-
[ACL'22] Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System
[arxiv]
[paper]
[code]
Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022).
-
[ACL'22] Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models
[arxiv]
[paper]
[code]
Zaiqiao Meng, Fangyu Liu, Ehsan Shareghi, Yixuan Su, Charlotte Collins, and Nigel Collier
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2022).
-
A Survey on Retrieval-Augmented Text Generation
[arxiv]
Huayang Li*, Yixuan Su*, Deng Cai*, Yan Wang*, and Lemao Liu*
arXiv:2202.01110
-
Exploring Dense Retrieval for Dialogue Response Selection
[arxiv]
[code]
Tian Lan, Deng Cai, Yan Wang, Yixuan Su, Xian-Ling Mao, and Heyan Huang
arXiv:2110.06612
-
[EMNLP'21-Findings] Plan-then-Generate: Controlled Data-to-Text Generation via Planning
[arxiv]
[paper]
[code]
Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier
In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021).
-
[EMNLP'21-Findings] Few-Shot Table-to-Text Generation with Prototype Memory
[arxiv]
[paper]
[code]
Yixuan Su, Zaiqiao Meng, Simon Baker, and Nigel Collier
In Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021).
-
[ACL'21] Dialogue Response Selection with Hierarchical Curriculum Learning
[arxiv]
[paper]
[code]
Yixuan Su*, Deng Cai*, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, and Yan Wang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021).
-
[ACL'21-Findings] Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation
[paper]
Yixuan Su, David Vandyke, Simon Baker, Yan Wang, and Nigel Collier
In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL 2021).
-
[TASLP'21] Prototype-to-Style: Dialogue Generation With Style-Aware Editing on Retrieval Memory
[arxiv]
[paper]
Yixuan Su, Yan Wang, Deng Cai, Simon Baker, Anna Korhonen, and Nigel Collier
In IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP 2021).
-
[EACL'21] Non-Autoregressive Text Generation with Pre-trained Language Models
[arxiv]
[paper]
[code]
Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, and Nigel Collier
In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL 2021).
-
Stylistic Dialogue Generation via Information-Guided Reinforcement Learning Strategy
[arxiv]
Yixuan Su, Deng Cai, Yan Wang, Simon Baker, Anna Korhonen, Nigel Collier, and Xiaojiang Liu
arXiv:2004.02202
Invited Talks
- January 2023, IR Group, University of Glasgow
- October 2022, NLG Student Webinar, Chinese Information Processing Society of China
- August 2022, MLNLP Webinar
- August 2022, NLP Group, Princeton University
- June 2022, Language Technology Lab, University of Cambridge
- April 2022, NLP Group, University of Washington
- April 2022, Language Team, DeepMind, London, UK
- March 2022, NLP Group, University of Oxford
- March 2022, NLP Group, Nara Institute of Science and Technology
- February 2022, NLP Group, Tencent AI Lab, Shenzhen, China
Students I Mentored
Professional Service
-
Program Committee Member/Reviewer:
NeurIPS (2023-), ACL (2020-), EMNLP (2021-), NAACL (2021), AAAI (2021), EACL (2021), TASLP (2022-), ACL Rolling Review (2021-)