Xiangming Gu

I am a final-year Ph.D. candidate at the National University of Singapore and a student researcher at Google DeepMind. I obtained my bachelor's degrees from Tsinghua University in 2021 and was previously a research intern at Sea AI Lab.

My recent research focuses on understanding, advancing, and safely deploying generative models and agents. My longer-term goal is to enable LLMs and agents to solve challenging problems such as scientific discovery.

I am looking for full-time research scientist or member of technical staff positions. Please contact me if you are interested in my research.

Email  /  Google Scholar  /  OpenReview  /  LinkedIn  /  Twitter  /  GitHub

Selected Research
* denotes equal contribution; † denotes corresponding author(s). Please see my Google Scholar for the full list.
LLM Reasoning
Parallel and Sequential Test-Time-Scaling in Large Reasoning Models
Xiangming Gu and the Team
Released as a Google DeepMind technical report, 2025.
LLM Pre-training and Attention
When Attention Sink Emerges in Language Models: An Empirical View
Xiangming Gu, Tianyu Pang†, Chao Du, Qian Liu, Fengzhuo Zhang, Cunxiao Du, Ye Wang†, Min Lin
Highlights: (i) We presented a mechanistic understanding of attention sink, massive activations, and value drains in LLMs. (ii) We answered when attention sink emerges in LLMs from a pre-training perspective. (iii) We presented early explorations of LLM architecture design from the perspective of attention sink biases and of eliminating the attention sink. (iv) This research helps explain the attention biases used in large-scale LLMs such as GPT-OSS (OpenAI) and MiMo-V2-Flash (Xiaomi). A toy sketch of measuring attention sink follows this entry.
Published in International Conference on Learning Representations (ICLR), Singapore, 2025. (Spotlight)
Also in Annual Conference on Neural Information Processing Systems Workshop on Attributing Model Behavior at Scale (ATTRIB @ NeurIPS), Vancouver, Canada, 2024. (Oral)
pdf / code / video / long talk / slides / poster
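
As a toy illustration of the phenomenon (my own sketch on synthetic data, not the paper's code), attention sink can be quantified as the average post-softmax attention mass that queries place on the first key position:

```python
# Toy sketch: quantify attention sink as the average post-softmax attention
# mass that queries place on key position 0. All numbers here are synthetic.
import numpy as np

def sink_mass(attn):
    """attn: (heads, q_len, k_len) post-softmax attention for one layer.
    Returns each head's average attention on the first key position."""
    return attn[:, :, 0].mean(axis=1)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 16, 16))          # 4 heads, 16 tokens
scores[:, :, 0] += 4.0                         # bias logits toward the first key
mask = np.triu(np.full((16, 16), -np.inf), 1)  # causal mask
probs = np.exp(scores + mask)
probs /= probs.sum(-1, keepdims=True)          # row-wise softmax
print(sink_mass(probs))                        # values near 1 indicate a sink
```
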
Why Do LLMs Attend to the First Token?
Federico Barbero*†, Álvaro Arroyo*, Xiangming Gu, Christos Perivolaropoulos, Michael Bronstein, Petar Veličković, Razvan Pascanu
Highlights: (i) We showed, empirically and theoretically, that LLMs need a "no-op" attention option to avoid over-mixing, especially in long-context scenarios. (ii) We demonstrated that attention sink is one way to approximate this "no-op". (iii) This research helps explain the gated attention used in large-scale LLMs such as Qwen3-Next (Alibaba). A toy illustration follows this entry.
Published in Conference on Language Modeling (COLM), Montreal, Canada, 2025.
pdf / slides
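
A toy numpy illustration of the intuition (my simplified sketch, not the paper's analysis): when most attention mass is parked on a key whose value vector is near zero, the attention output is tiny, so the token's representation is left almost untouched:

```python
# Toy sketch: a sink key with a near-zero value vector lets attention output
# almost nothing, leaving the residual stream unmixed (an approximate "no-op").
import numpy as np

rng = np.random.default_rng(0)
T, D = 8, 4
V = rng.normal(size=(T, D))
V[0] = 0.0                                   # "value drain" at position 0

attn_uniform = np.full(T, 1.0 / T)           # uniform attention: heavy mixing
attn_sink = np.full(T, 0.05 / (T - 1))       # sink: mass parked on token 0
attn_sink[0] = 0.95

print(np.linalg.norm(attn_uniform @ V))      # sizeable update -> over-mixing
print(np.linalg.norm(attn_sink @ V))         # near-zero update -> "no-op"
```
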
Memorization, Generalization, and Safety
Extracting Alignment Data in Open Models
Federico Barbero†, Xiangming Gu, Christopher A. Choquette-Choo, Chawin Sitawarin, Matthew Jagielski, Itay Yona, Petar Veličković, Ilia Shumailov, Jamie Hayes
Highlights: (i) We showed that, with only the chat template as input, alignment (post-training) data can be extracted from post-trained (either SFT or RLVR) LLMs. (ii) We presented "di-steal-ation": the extracted data can be used to train (via either SFT or RLVR) a base model, recovering a meaningful amount of the original performance. A minimal sketch of the setup follows this entry.
Released as a technical report, 2025.
pdf
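
A minimal sketch of the extraction setup, assuming a Hugging Face checkpoint and Gemma's chat-template prefix (the model ID and decoding parameters below are my illustrative assumptions, not the paper's exact recipe):

```python
# Minimal sketch: prompt a post-trained model with only its chat-template
# prefix (no user content at all) and let it free-run; sampled continuations
# tend to resemble its alignment data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"  # assumption: any open post-trained model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "<start_of_turn>user\n"   # Gemma's user-turn prefix
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, do_sample=True, temperature=1.0, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=False))
```
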
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu*, Xiaosen Zheng*, Tianyu Pang*†, Chao Du, Qian Liu, Ye Wang†, Jing Jiang†, Min Lin
Highlights: (i) We presented a theoretical framework for infectious jailbreaks (an "AI virus") that can compromise large-scale LLM-based multi-agent systems exponentially fast. (ii) We empirically validated infectious jailbreaks in LLM-based multi-agent systems with up to one million agents. A toy simulation of the dynamics follows this entry.
Published in International Conference on Machine Learning (ICML), Vienna, Austria, 2024.
Also in International Conference on Learning Representations Workshop on Large Language Model Agents (LLMAgents @ ICLR), Vienna, Austria, 2024.
pdf / project page / code / video / slides / ICML poster / GYSS poster / WIRED press
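
As a back-of-the-envelope illustration of the exponential dynamics (my own toy simulation with hypothetical pairing and infection rules, not the paper's experimental setup), the infected population roughly doubles each chat round, so full infection takes O(log N) rounds:

```python
# Toy simulation: agents chat in random pairs each round; any agent paired
# with an infected one becomes infected.
import random

def simulate(num_agents=10_000, seed=0):
    rng = random.Random(seed)
    infected = [False] * num_agents
    infected[0] = True                     # one agent sees the adversarial image
    rounds = 0
    while sum(infected) < num_agents:
        order = list(range(num_agents))
        rng.shuffle(order)
        for a, b in zip(order[::2], order[1::2]):
            if infected[a] or infected[b]:
                infected[a] = infected[b] = True
        rounds += 1
        print(f"round {rounds}: {sum(infected)} infected")
    return rounds

simulate()  # infected count roughly doubles per round
```
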
On Memorization in Diffusion Models
Xiangming Gu, Chao Du†, Tianyu Pang†, Chongxuan Li, Min Lin, Ye Wang
Highlights: (i) We showed that diffusion models have a theoretical optimum that can only memorize training data. (ii) We empirically explored how training recipes affect memorization in diffusion models. A sketch of the closed-form optimum follows this entry.
Published in Transactions on Machine Learning Research (TMLR), 2025.
pdf / code
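
The flavor of the optimum can be written in closed form: for an empirical training distribution, the MMSE denoiser E[x_0 | x_t] is a softmax-weighted average of the training points, i.e. pure memorization. A minimal numpy sketch (shapes and the (alpha_t, sigma_t) values are my illustrative assumptions):

```python
# Sketch of the closed-form optimal denoiser for the empirical training
# distribution: a softmax-weighted average of training points.
import numpy as np

def optimal_denoiser(x_t, train_data, alpha_t, sigma_t):
    """E[x_0 | x_t] when p(x_0) is uniform over the rows of train_data,
    with forward process x_t = alpha_t * x_0 + sigma_t * noise."""
    diffs = x_t[None, :] - alpha_t * train_data          # (N, D)
    logw = -np.sum(diffs**2, axis=1) / (2 * sigma_t**2)  # Gaussian log-likelihoods
    w = np.exp(logw - logw.max())
    w /= w.sum()                                         # softmax weights
    return w @ train_data                                # weighted average

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 2))                           # tiny "training set"
alpha_t, sigma_t = 0.9, 0.3
x_t = alpha_t * data[3] + sigma_t * rng.normal(size=2)   # noised training point
print(optimal_denoiser(x_t, data, alpha_t, sigma_t))     # snaps back near data[3]
```
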

Open-Sourced Projects
gemma_penzai: a JAX research toolkit, based on Penzai, for visualizing, manipulating, and understanding Gemma models, with multi-modal support.

Experience and Education
Google DeepMind
Student Researcher
05.2025 - 10.2025 (London, United Kingdom), 11.2025 - 01.2026 (Singapore)
Hosted by Petar Veličković and Larisa Markeeva.
Also worked closely with Razvan Pascanu and Soham De.
Exploring reasoning and test-time scaling of LLMs; developing tools to debug LLMs.
Sea AI Lab (Sea Limited)
Research Intern
03.2023 - 04.2025 (Singapore)
Mentored by Tianyu Pang and Chao Du.
Also worked closely with Qian Liu and Min Lin.
Understanding, advancing, and safely deploying generative models and agents.
National University of Singapore
Ph.D. candidate in Computer Science
08.2021 - 02.2026 (Singapore)
Supervised by Prof. Ye Wang.
Research on speech, singing, and multi-modality.
Tsinghua University
B.E. in Electronic Engineering and B.S. in Finance
08.2017 - 06.2021 (Beijing, China)
Supervised by Prof. Jiansheng Chen.
Research on computer vision.

Honors and Awards
Dean's Graduate Research Excellence Award, National University of Singapore, 2024
Research Achievement Award, National University of Singapore, 2022 and 2025
MM'22 Top Paper Award, Association for Computing Machinery, 2022
President's Graduate Fellowship, National University of Singapore, 2021-2025
Tsinghua's Friend - Zheng Geru Scholarship (Academic Excellence Scholarship), Tsinghua University, 2018

Talks and Presentations
[2025.11]: Department of Electronic Engineering, Tsinghua University, and Tencent Hunyuan, invited talks on Attention Sink in LLMs and Its Applications.
[2025.10]: Google DeepMind Team DL: Agent Frontier, talk on Looking into LLMs: From Tokens to Solutions.
[2025.06]: Google DeepMind Team DL: Agent Frontier, talk on Understanding Attention Sink in (Large) Language Models.
[2025.05]: ASAP Seminar Series, invited talk on When Attention Sink Emerges in Language Models: An Empirical View.
[2025.04]: Singapore Alignment Workshop, poster presentation on Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast.
[2025.02]: NUS Research Week Open House, invited talk on On the Interpretability and Safety of Generative Models.
[2025.01]: Global Young Scientists Summit, poster presentation on Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast.

Academic Services
Conference reviewer for NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL ARR, MM, IJCAI, AISTATS
Journal reviewer for TPAMI, TOMM, TASLP, RA-L

Teaching Services
Teaching Assistant, CS4347/CS5647, Sound and Music Computing, Fall 2024
Teaching Assistant, CS6212, Topics in Media, Spring 2024
Teaching Assistant, CS5242, Neural Networks and Deep Learning, Spring 2023
Teaching Assistant, CS3244, Machine Learning, Fall 2022
Teaching Assistant, CS4243, Computer Vision and Pattern Recognition, Spring 2022

Miscellaneous
I love travelling, movies, food, etc. I have lived in 🇨🇳🇸🇬🇬🇧, and travelled to 🇹🇭🇫🇮🇵🇹🇧🇪🇺🇸🇭🇰🇲🇾🇨🇦🇦🇪🇦🇹🇯🇵🇭🇺🇨🇿🇮🇹🇻🇦🇭🇷🇫🇷🇨🇭🇩🇪🇳🇱🇰🇷 for holidays/conferences.


You've probably seen this website template before, thanks to Jon Barron.