Language Agents: Foundations, Prospects, and Risks

Yu Su¹  Diyi Yang²  Shunyu Yao³  Tao Yu⁴

¹The Ohio State University  ²Stanford University  ³Princeton University  ⁴University of Hong Kong
[email protected], [email protected], [email protected], [email protected]

1 Introduction

A heated discussion thread in AI and NLP is autonomous agents, usually powered by large language models (LLMs), that can follow language instructions to carry out diverse and complex tasks in real-world or simulated environments. There have been numerous recent proof-of-concept efforts on such agents, including ChatGPT Plugins,¹ AutoGPT,² and generative agents (Park et al., 2023), just to name a few. The public is also showing an unprecedented level of excitement. For example, AutoGPT received 147K stars in just 4 months, making it the fastest-growing repository in GitHub history, despite its experimental nature and many known, sometimes serious, limitations.

However, the concept of an agent has been part of AI since its dawn. So what has changed recently? We argue that the most fundamental change is the capability of using language. Contemporary AI agents use language as a vehicle for both thought and communication, a trait that was previously unique to humans. This dramatically expands the breadth and depth of the problems these agents can tackle autonomously. The capability of using language, bestowed by their LLM foundations, allows these agents to 1) use a wide range of tools and reconcile their heterogeneous syntax and semantics (Parisi et al., 2022; Schick et al., 2023; Qin et al., 2023a; Patil et al., 2023; Qin et al., 2023b; Mialon et al., 2023), 2) operate in complex environments and ground to environment-specific semantics (Brohan et al., 2023b; Yao et al., 2022a; Gu et al., 2023; Wang et al., 2023a; Deng et al., 2023; Zhou et al., 2023), 3) conduct complex language-driven reasoning (Wei et al., 2022; Shinn et al., 2023; Chen et al., 2023), and 4) form spontaneous multi-agent systems (Park et al., 2023; Liu et al., 2023b). Therefore, to distinguish them from earlier AI agents, we suggest that these AI agents capable of using language for thought and communication should be called "language agents," language being their most salient trait.

Figure 1: A conceptual framework for language agents.

Language played a critical role in the evolution of biological intelligence, and now artificial intelligence may be following a similar evolutionary path. This is remarkable and concerning at the same time. Despite the rapid progress, there has been a significant lack of systematic discussion regarding the conceptual definition, theoretical foundation, promising directions, and risks associated with language agents. This proposed tutorial endeavors to fill this gap by giving a comprehensive account of language agents based on both contemporary and classic AI research, drawing connections to cognitive science, neuroscience, and linguistics when appropriate.

2 Outline of Tutorial Content

This cutting-edge tutorial will be half-day and will cover a conceptual framework for language agents as well as important topic areas including tool augmentation, grounding, reasoning and planning, multi-agent systems, and risks and societal impact.

2.1 Overview [30mins]

What are language agents, and how do they differ from previous generations of AI agents? We

¹ https://openai.com/blog/chatgpt-plugins
² https://github.com/Significant-Gravitas/Auto-GPT
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, pages 17–24
November 12–16, 2024. ©2024 Association for Computational Linguistics
will start by discussing why the capability of using language for thought and communication, empowered by LLMs, is the defining trait of contemporary agents, drawing connections to the role language played in the evolution of biological intelligence (Dennett, 2013). We will then discuss a potential conceptual framework for language agents (Figure 1) and how each component (agent/embodiment/environment) differs from previous agents. One foundational construct is memory. We will discuss the resemblances and differences between a language agent/LLM's memory and human memory, including the storage mechanism (Kandel, 2007), long-term memory (an LLM's parametric memory/vector databases), and working memory (in-context learning), and how such memory may support general-purpose language-driven reasoning. We will wrap up this section by outlining the key technical and societal aspects that will be discussed in the rest of the tutorial.

2.2 Tool Augmentation [30mins]

Tool augmentation, or tool use (Schick et al., 2023; Mialon et al., 2023), is a natural extension of language agents due to their capability of using language for thought and communication. Language agents are starting to demonstrate the possibility of autonomously understanding and reconciling the heterogeneous syntax and semantics (e.g., XML vs. JSON) of different tools (i.e., using language for communication) and orchestrating tool execution results into a coherent reasoning process (i.e., using language for thought). At present, tool augmentation mainly serves three purposes:

• Providing up-to-date and/or domain-specific information (Nakano et al., 2021; Lazaridou et al., 2022; Guu et al., 2020).

• Providing specialized capabilities (e.g., high-precision calculation) that a language agent may not have or may not be best at (Schick et al., 2023; Shen et al., 2023; Cheng et al., 2023; Gao et al., 2022).

• Enabling a language agent to act in external environments (Liang et al., 2022; Wang et al., 2023a).

Two metrics are essential for practical tool augmentation: robustness, i.e., accuracy in using tools, and flexibility, i.e., ease of integrating a new tool. While existing efforts, e.g., ChatGPT Plugins, have made meaningful progress on flexibility, robustness still presents a significant challenge. This is particularly problematic for tools that produce side effects in the world (e.g., a tool for sending emails). We will discuss the challenges and opportunities around tool augmentation.

2.3 Grounding [30mins]

Most of the transformative applications of language agents involve connecting an agent to some real-world environment (e.g., through tools or embodiment), be it databases (Cheng et al., 2023), knowledge bases (Gu et al., 2023), the web (Deng et al., 2023; Zhou et al., 2023), or the physical world (Brohan et al., 2023a). Each environment is a unique context that provides possibly different interpretations of natural language. Grounding, i.e., the linking of (natural language) concepts to contexts (Chandu et al., 2021), thus becomes a central and pervasive challenge. There are two types of grounding related to language agents:

• Grounding natural language to an environment (Gu et al., 2023). This is also closely related to the meaning of natural language, which, as Bender and Koller (2020) put it, is the mapping from an utterance to its communicative intent.

• Grounding an agent's decisions in its own context (i.e., working memory), which includes external information from tools (Liu et al., 2023a; Yue et al., 2023; Gao et al., 2023; Cheng et al., 2023).

We will discuss current work on both types of grounding, the remaining challenges, and promising future directions.

2.4 Reasoning and Planning [30mins]

The simplest way for language agents to interact with external worlds is to generate the next action via the LLM (Nakano et al., 2021; Schick et al., 2023), but the mapping from context to action is often non-trivial, and such approaches often require fine-tuning to learn it. Inspired by prior work that leverages intermediate reasoning to improve LLM performance (Nye et al., 2021; Wei et al., 2022), approaches such as ReAct (Yao et al., 2022b) have started to leverage intermediate reasoning for better acting by flexibly analyzing environmental observations, making plans, tracking task status, recovering from exceptions, etc. Subsequent studies (Shinn et al., 2023; Chen et al., 2023) further leverage LLM reasoning for explicit self-evaluation,
critique, or reflection, to further improve agent performance. On the other hand, the simplest way for language agents to plan multiple steps of actions is to generate an action plan (Huang et al., 2022), but token-by-token autoregressive decoding makes it hard to forecast the planned future, backtrack from errors, or maintain a global exploration structure for planning. To this end, recent works have begun to enhance LMs with re-planning (Song et al., 2022) or tree search algorithms (Yao et al., 2023; Hao et al., 2023) to systematically explore and make decisions in the planning space, analogous to planning-based agents such as AlphaGo (Silver et al., 2016). We will also discuss the recent trend that blurs the boundary between reasoning and acting, which leads to a more unified methodology between reasoning and planning (e.g., Monte Carlo tree search applied to both reasoning (Hao et al., 2023) and action planning (Silver et al., 2016)).

2.5 Multi-Agent Systems [30mins]

When AI agents are equipped with the capability of using language for thought and communication, they start to enable multi-agent systems quite different from conventional ones (Ferber and Weiss, 1999): agents can now act and communicate with each other in a more autonomous fashion. On the one hand, agents may now be generated with minimal specification instead of being pre-programmed, and can continually evolve through use and communication to produce complex social behaviors (Park et al., 2023), collaborate on task solving (Wu et al., 2023; Qian et al., 2023; Hong et al., 2023), or debate for more divergent and faithful reasoning (Chan et al., 2023; Liang et al., 2023; Du et al., 2023). On the other hand, human users are also agents, and these artificial language agents can interact with human agents in much richer and more flexible ways than before. There are numerous emerging opportunities, such as providing guardrails and alignment for language agents (Bai et al., 2022) and resolving uncertainties (Yao et al., 2020). We will discuss the opportunities and challenges in this new generation of multi-agent and human-AI collaborative systems.

2.6 Risks and Societal Impact [30mins]

Despite being powerful across a wide range of tasks, language agents are very likely to suffer from key risks and societal harms (Wang et al., 2023b). The first aspect concerns hallucination. The aforementioned memory module, retrieval, or even tool augmentation can largely increase the faithfulness of model output, but hallucination issues might still exist and could lead to misleading, insecure, and even harmful output, especially in high-stakes scenarios, raising key concerns about the privacy and truthfulness of the resulting interaction. Bias and fairness remain another primary risk, as language agents might inherit biases from the training corpus. Simulated AI agents might perpetuate stereotypes or discriminate against certain groups of people (Schramowski et al., 2022). Other potential risks include the lack of transparency in AI agents' decision-making processes, the robustness of AI agents against manipulation by malicious actors (Zou et al., 2023), and the ethics of what AI agents can and cannot do. Our tutorial will provide a detailed walkthrough of these potential risks in AI agents (Aher et al., 2023), using a few representative case studies to demonstrate how such risks might affect downstream applications and how human-in-the-loop (Wu et al., 2022) or mixed-initiative agents can be leveraged to build more responsible language agents. More importantly, we will briefly discuss the multifaceted impact of language agents on user trust (Hancock et al., 2020; Liu et al., 2022) and their cultural and societal implications. We will also discuss efforts on evaluating and benchmarking language agents (Liu et al., 2023c,d).

3 Other Required Information

The proposed tutorial is considered a cutting-edge tutorial that gives a systematic account of the emerging topic of language agents. No prior tutorial at *CL conferences has covered this topic. A few recent tutorials cover related aspects of language agents, such as "ACL'23: Tutorial on Complex Reasoning over Natural Language" on reasoning, "ACL'23: Retrieval-based Language Models and Applications" on retrieval augmentation, and "EMNLP'23: Mitigating Societal Harms in Large Language Models" on societal considerations of LLMs. However, there is no comprehensive coverage of the foundations, prospects, and risks of language agents, a void this proposed tutorial aspires to fill.

3.1 Target Audience and Prerequisites

This tutorial is targeted at a broad audience interested in language agents. There are no strict prerequisites for the audience's background, but
having 1) basic knowledge of machine learning and deep learning and 2) basic knowledge of language models will help with deeper understanding.

3.2 Diversity and Inclusion

We deeply value diversity and strongly believe it can greatly help realize the tutorial's goal. We will ensure diversity in the following aspects:

Diversity of instructors. The instructor team has a diverse background, including faculty members and graduate students from four institutions spanning two continents and from different gender groups.

Diversity of participants. Language agents are an emerging multi-disciplinary research topic with a very high level of interest in both academia and industry, so we expect a diverse audience. To further promote awareness of the tutorial in underrepresented communities, we will work with affinity groups such as Black in AI, WiNLP, and LatinX in AI to broadcast the tutorial as well as solicit suggestions on the tutorial content.

Diversity of topics. Given the multi-disciplinary nature of language agents, the materials of this tutorial will cover both contemporary and classic AI/NLP research as well as related discussions from reinforcement learning, cognitive science, neuroscience, linguistics, human-computer interaction, and social science.

3.3 Tutorial Logistics

Estimated audience size. Based on prior tutorials and workshops we organized on related topics, we expect 100-150 attendees, including researchers and practitioners in related fields.

Open access. All materials will be released online on a dedicated website for the tutorial.

Preferred venue. We prefer to have the tutorial co-located with ACL 2024 or EMNLP 2024.

3.4 Breadth

At least 60% of the tutorial will center around work done by researchers other than the instructors. This tutorial categorizes promising approaches for language agents into several groups, and each of these groups includes a significant amount of other researchers' work.

4 Tutorial Instructors

Yu Su is a distinguished assistant professor of engineering at the Ohio State University. His research investigates the role of language as a vehicle for thought and communication in artificial intelligence. His work at Microsoft has been deployed as the official conversational interface for Microsoft Outlook. His work on language agents has won awards such as the Outstanding Paper Award at ACL'23 and COLING'22, and from the Amazon Alexa Prize Challenge. He has given 30+ invited talks internationally. Homepage: https://ysu1989.github.io/.

Diyi Yang is an assistant professor in the Computer Science Department at Stanford University. Her research focuses on human-centered natural language processing and computational social science. Diyi has organized four workshops at NLP conferences: the Widening NLP Workshops at NAACL 2018 and ACL 2019, the Causal Inference workshop at EMNLP 2021, the NLG Evaluation workshop at EMNLP 2021, and the Shared Stories and Lessons Learned workshop at EMNLP 2022. She gave a tutorial at ACL 2022 on Learning with Limited Data and a tutorial at EACL 2023 on Summarizing Conversations at Scale. Homepage: https://cs.stanford.edu/~diyiy/.

Shunyu Yao is a PhD student in the Princeton NLP Group, advised by Karthik Narasimhan and supported by the Harold W. Dodds Fellowship. His research focuses on various facets of developing language agents, such as reasoning, acting, learning, and benchmarking. Homepage: https://ysymyth.github.io.

Tao Yu is an assistant professor of computer science at The University of Hong Kong. He completed his Ph.D. at Yale University and was a postdoctoral fellow at the University of Washington. His research aims to build language model agents that ground language instructions into code or actions executable in real-world environments. Tao is the recipient of an Amazon Research Award and a Google Scholar Research Award. He has co-organized multiple workshops and a tutorial related to language agents at ACL, EMNLP, and NAACL. Homepage: https://taoyds.github.io/.

5 Ethics Statement

Language agents, with the ability to act autonomously in the real world, pose significant potential ethical and safety risks. A main purpose of this proposed tutorial is to systematically define and analyze the unique capabilities and associated risks of language agents. We have a dedicated section on risks and societal impact, and we also cover related discussion in every other section when appropriate.
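As an illustrative aside on the reasoning-and-acting pattern surveyed in Section 2.4: a ReAct-style agent alternates a free-form "thought," an environment-affecting "action," and a fed-back "observation" inside a growing textual context. The minimal sketch below shows only that control loop; the `llm` and `lookup` functions are hypothetical stand-ins (a canned policy and a toy key-value environment), not the actual ReAct implementation or any API of the systems cited above.

```python
# Minimal ReAct-style loop (a sketch; `llm` and `lookup` are hypothetical
# stand-ins, not a real model or tool API).

def llm(prompt: str) -> str:
    # Stand-in for a language-model call: a canned policy for the demo task.
    if "Observation: 42" in prompt:
        return "Thought: I have the answer.\nAction: finish[42]"
    return "Thought: I should look up the value.\nAction: lookup[answer]"

def lookup(key: str) -> str:
    # Stand-in tool: a tiny key-value "environment".
    return {"answer": "42"}.get(key, "not found")

def react_loop(question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(prompt)                     # model emits a thought + action
        prompt += step + "\n"                  # context accumulates the trace
        action = step.split("Action: ")[-1].strip()
        if action.startswith("finish["):
            return action[len("finish["):-1]   # terminate with an answer
        if action.startswith("lookup["):
            obs = lookup(action[len("lookup["):-1])
            prompt += f"Observation: {obs}\n"  # feed the result back as context
    return "no answer"

print(react_loop("What is the answer?"))  # prints 42
```

In a real agent, `llm` would query an actual LLM and the action space would cover the tools of Section 2.2; the part the sketch aims to convey is only the loop structure of generating a thought and action, executing the action, and appending the observation.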
References

Gati V Aher, Rosa I Arriaga, and Adam Tauman Kalai. 2023. Using large language models to simulate multiple humans and replicate human subject studies. In International Conference on Machine Learning, pages 337–371. PMLR.

Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, et al. 2022. Constitutional AI: Harmlessness from AI feedback. arXiv preprint arXiv:2212.08073.

Emily M. Bender and Alexander Koller. 2020. Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5185–5198, Online. Association for Computational Linguistics.

Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, et al. 2023a. RT-2: Vision-language-action models transfer web knowledge to robotic control. arXiv preprint arXiv:2307.15818.

Anthony Brohan, Yevgen Chebotar, Chelsea Finn, Karol Hausman, Alexander Herzog, Daniel Ho, Julian Ibarz, Alex Irpan, Eric Jang, Ryan Julian, et al. 2023b. Do as I can, not as I say: Grounding language in robotic affordances. In Conference on Robot Learning, pages 287–318. PMLR.

Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, and Zhiyuan Liu. 2023. ChatEval: Towards better LLM-based evaluators through multi-agent debate. arXiv preprint arXiv:2308.07201.

Khyathi Raghavi Chandu, Yonatan Bisk, and Alan W Black. 2021. Grounding 'grounding' in NLP. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4283–4305, Online. Association for Computational Linguistics.

Xinyun Chen, Maxwell Lin, Nathanael Schärli, and Denny Zhou. 2023. Teaching large language models to self-debug. arXiv preprint arXiv:2304.05128.

Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, and Tao Yu. 2023. Binding language models in symbolic languages. ICLR.

Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang, Huan Sun, and Yu Su. 2023. Mind2Web: Towards a generalist agent for the web. arXiv preprint arXiv:2306.06070.

Daniel C Dennett. 2013. The role of language in intelligence. Sprache und Denken/Language and Thought, page 42.

Yilun Du, Shuang Li, Antonio Torralba, Joshua B Tenenbaum, and Igor Mordatch. 2023. Improving factuality and reasoning in language models through multiagent debate. arXiv preprint arXiv:2305.14325.

Jacques Ferber and Gerhard Weiss. 1999. Multi-agent systems: An introduction to distributed artificial intelligence, volume 1. Addison-Wesley Reading.

Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, and Graham Neubig. 2022. PAL: Program-aided language models. arXiv preprint arXiv:2211.10435.

Tianyu Gao, Howard Yen, Jiatong Yu, and Danqi Chen. 2023. Enabling large language models to generate text with citations. arXiv preprint arXiv:2305.14627.

Yu Gu, Xiang Deng, and Yu Su. 2023. Don't generate, discriminate: A proposal for grounding language models to real-world environments. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4928–4949, Toronto, Canada. Association for Computational Linguistics.

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Mingwei Chang. 2020. Retrieval augmented language model pre-training. In International Conference on Machine Learning, pages 3929–3938. PMLR.

Jeffrey T Hancock, Mor Naaman, and Karen Levy. 2020. AI-mediated communication: Definition, research agenda, and ethical considerations. Journal of Computer-Mediated Communication, 25(1):89–100.

Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, and Zhiting Hu. 2023. Reasoning with language model is planning with world model. arXiv preprint arXiv:2305.14992.

Sirui Hong, Xiawu Zheng, Jonathan Chen, Yuheng Cheng, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, et al. 2023. MetaGPT: Meta programming for multi-agent collaborative framework. arXiv preprint arXiv:2308.00352.

Wenlong Huang, Pieter Abbeel, Deepak Pathak, and Igor Mordatch. 2022. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In International Conference on Machine Learning, pages 9118–9147. PMLR.

Eric R Kandel. 2007. In search of memory: The emergence of a new science of mind. WW Norton & Company.

Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, and Nikolai Grigorev. 2022. Internet-augmented language models through few-shot prompting for open-domain question answering. arXiv preprint.
Jacky Liang, Wenlong Huang, F. Xia, Peng Xu, Karol Hausman, Brian Ichter, Peter R. Florence, and Andy Zeng. 2022. Code as policies: Language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9493–9500.

Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, and Shuming Shi. 2023. Encouraging divergent thinking in large language models through multi-agent debate. arXiv preprint arXiv:2305.19118.

Nelson F Liu, Tianyi Zhang, and Percy Liang. 2023a. Evaluating verifiability in generative search engines. arXiv preprint arXiv:2304.09848.

Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M Dai, Diyi Yang, and Soroush Vosoughi. 2023b. Training socially aligned language models in simulated human society. arXiv preprint arXiv:2305.16960.

Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, et al. 2023c. AgentBench: Evaluating LLMs as agents. arXiv preprint arXiv:2308.03688.

Yihe Liu, Anushk Mittal, Diyi Yang, and Amy Bruckman. 2022. Will AI console me when I lose my pet? Understanding perceptions of AI-mediated email writing. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, pages 1–13.

Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, et al. 2023d. BOLAA: Benchmarking and orchestrating LLM-augmented autonomous agents. arXiv preprint arXiv:2308.05960.

Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, et al. 2023. Augmented language models: A survey. arXiv preprint arXiv:2302.07842.

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, et al. 2021. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332.

Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, et al. 2021. Show your work: Scratchpads for intermediate computation with language models. arXiv preprint arXiv:2112.00114.

Aaron Parisi, Yao Zhao, and Noah Fiedel. 2022. TALM: Tool augmented language models. arXiv preprint arXiv:2205.12255.

Joon Sung Park, Joseph C O'Brien, Carrie J Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. 2023. Generative agents: Interactive simulacra of human behavior. arXiv preprint arXiv:2304.03442.

Shishir G Patil, Tianjun Zhang, Xin Wang, and Joseph E Gonzalez. 2023. Gorilla: Large language model connected with massive APIs. arXiv preprint arXiv:2305.15334.

Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, and Maosong Sun. 2023. Communicative agents for software development. arXiv preprint arXiv:2307.07924.

Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, et al. 2023a. Tool learning with foundation models. arXiv preprint arXiv:2304.08354.

Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. 2023b. ToolLLM: Facilitating large language models to master 16000+ real-world APIs. arXiv preprint arXiv:2307.16789.

Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761.

Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A Rothkopf, and Kristian Kersting. 2022. Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3):258–268.

Yongliang Shen, Kaitao Song, Xu Tan, Dong Sheng Li, Weiming Lu, and Yue Ting Zhuang. 2023. HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face. arXiv preprint arXiv:2303.17580.

Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. arXiv preprint arXiv:2303.11366.

David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489.

Chan Hee Song, Jiaman Wu, Clayton Washington, Brian M Sadler, Wei-Lun Chao, and Yu Su. 2022. LLM-Planner: Few-shot grounded planning for embodied agents with large language models. arXiv preprint arXiv:2212.04088.
Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, and Anima Anandkumar. 2023a. Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291.

Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. 2023b. A survey on large language model based autonomous agents. arXiv preprint arXiv:2308.11432.

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, and Denny Zhou. 2022. Chain of thought prompting elicits reasoning in large language models. arXiv preprint arXiv:2201.11903.

Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang, and Chi Wang. 2023. AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework. arXiv preprint arXiv:2308.08155.

Xingjiao Wu, Luwei Xiao, Yixuan Sun, Junhang Zhang, Tianlong Ma, and Liang He. 2022. A survey of human-in-the-loop for machine learning. Future Generation Computer Systems, 135:364–381.

Shunyu Yao, Howard Chen, John Yang, and Karthik Narasimhan. 2022a. WebShop: Towards scalable real-world web interaction with grounded language agents. Advances in Neural Information Processing Systems, 35:20744–20757.

Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L Griffiths, Yuan Cao, and Karthik Narasimhan. 2023. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601.

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2022b. ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629.

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, and Yu Su. 2020. An imitation game for learning semantic parsers from user interaction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6883–6902, Online. Association for Computational Linguistics.

Xiang Yue, Boshi Wang, Kai Zhang, Ziru Chen, Yu Su, and Huan Sun. 2023. Automatic evaluation of attribution by large language models. arXiv preprint arXiv:2305.06311.

Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, et al. 2023. WebArena: A realistic web environment for building autonomous agents. arXiv preprint arXiv:2307.13854.

Andy Zou, Zifan Wang, J Zico Kolter, and Matt Fredrikson. 2023. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043.

Appendix

A Past Tutorials/Workshops by the Instructors

The instructors of the proposed tutorial have given tutorials or co-organized workshops at leading international conferences as follows:

Yu Su:

• ACL'21: Workshop on Natural Language Processing for Programming

• ACL'20: Workshop on Natural Language Interfaces

• WWW'18: Tutorial on Scalable Construction and Querying of Massive Knowledge Bases

• CIKM'17: Tutorial on Construction and Querying of Large-scale Knowledge Bases

Diyi Yang:

• EACL'23: Tutorial on Summarizing Conversations at Scale

• ACL'22: Tutorial on Learning with Limited Data

• EMNLP'21: Workshop on Causal Inference & NLP

• NAACL'18 & ACL'19: Widening NLP Workshop

Tao Yu:

• ACL'23: Tutorial on Complex Reasoning over Natural Language

• NAACL'22: Structured and Unstructured Knowledge Integration Workshop

• EMNLP'20: Interactive and Executable Semantic Parsing Workshop

B Recommended Reading List

The audience is recommended (but not required) to read the following papers before the tutorial to facilitate more engagement during the tutorial:

• Daniel C Dennett. The role of language in intelligence. (Dennett, 2013)
• Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. (Schick et al., 2023)

• Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, and Denny Zhou. Chain of thought prompting elicits reasoning in large language models. (Wei et al., 2022)

• Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. ReAct: Synergizing reasoning and acting in language models. (Yao et al., 2022b)

• Gati V Aher, Rosa I Arriaga, and Adam Tauman Kalai. Using large language models to simulate multiple humans and replicate human subject studies. (Aher et al., 2023)

• Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. A survey on large language model based autonomous agents. (Wang et al., 2023b)

• Yu Gu, Xiang Deng, and Yu Su. Don't generate, discriminate: A proposal for grounding language models to real-world environments. (Gu et al., 2023)

• Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, and Tao Yu. Binding language models in symbolic languages. (Cheng et al., 2023)

• Joon Sung Park, Joseph C O'Brien, Carrie J Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. Generative agents: Interactive simulacra of human behavior. (Park et al., 2023)

• Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A Rothkopf, and Kristian Kersting. Large pre-trained language models contain human-like biases of what is right and wrong to do. (Schramowski et al., 2022)

• Emily M. Bender and Alexander Koller. Climbing towards NLU: On meaning, form, and understanding in the age of data. (Bender and Koller, 2020)