Xingdi Yuan · 袁 星柢
Principal Researcher · Microsoft Research, Montréal
I aim to develop machines that can read, write, and use language as a tool, in both static datasets and interactive worlds.
At the Froggy team, my recent work focuses on teaching agentic coding systems to understand codebases, leverage memory, and solve long-horizon tasks with tools. To achieve that, I create synthetic data, train models, build harnesses, and design evaluation pipelines.
Selected Publications
-
debug-gym: A Text-Based Environment for Interactive Debugging · 2025 · Equipping an LLM-based coding agent with debuggers such as pdb.
-
BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills · 2025 · Generating unintentional synthetic bugs by asking LLMs to design new features.
-
Gistify! Codebase-Level Understanding via Runtime Execution · 2025 · Generating an executable gist from a codebase and an entrypoint.
-
Evolving Programmatic Skill Networks · 2025 · Forming a compositional network of skills (executable programs) that evolves through experience.
-
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs · ACL 2025 (Outstanding Paper) · Using contextual entrainment to understand how LLMs become distracted by “irrelevant” contextual information in the prompt.
-
Can Language Models Serve as Text-Based World Simulators? · ACL 2024 · Can LLMs predict p(s_{t+1} | s_t, a_t) for text-based games?
-
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning · ICLR 2021 · Bridging ALFRED (visual) and TextWorld (text).
-
Learning Dynamic Knowledge Graphs to Generalize on Text-Based Games · NeurIPS 2020 · Building and maintaining a knowledge graph that serves as an agent's belief.
-
Interactive Fiction Games: A Colossal Adventure · AAAI 2020 · Super hard text-based adventure games designed for human players.
-
Interactive Machine Comprehension with Information Seeking Agents · ACL 2020 · Gamifying any machine reading comprehension task by making the document partially observed and equipping the agent with Ctrl+F.
-
Interactive Language Learning by Question Answering · EMNLP 2019 · Training information-seeking behaviors in text-based environments.
-
TextWorld: A Learning Environment for Text-based Games · Computer Games Workshop, ICML/IJCAI 2018 · A sandbox learning environment for the training and evaluation of agents on text-based games.
Experience
-
2017 – Present
Microsoft Research, Montréal
Principal Researcher
-
2015 – 2017
Maluuba
(Acquired by Microsoft)
Education
-
2015
M.S. Computer Science
New York University
-
2011
B.S. Communications Engineering
Beijing University of Technology
北京工业大学
Service
- Area Chair: ICML, NeurIPS, ARR
- Action Editor: TACL
- Outstanding Reviewer: EMNLP 2020, NeurIPS 2021, ICLR 2022, ICML 2022
- Workshop Organizer: Wordplay Workshop (NeurIPS 2020, NAACL 2022, ACL 2024, EMNLP 2025), KBRL Workshop (IJCAI 2020)
Fun Stuff
asciiko (アス子)
A deep ASCII art generator. Check out the YouTube demo or the open-source code.