625-056
OCTOBER 11, 2024
MICHAEL PARZEN
JO ELLERY
Prompt Engineering
AI has become a fixture in our lives, often in the form of text-based assistants based on large
language models1 like OpenAI’s ChatGPT or Google’s Gemini, or image generators like Stability AI's
Stable Diffusion or OpenAI’s DALL-E. Users interact with these models by submitting requests, called
prompts, to which the model responds.
Getting these models to give the desired output may not be entirely straightforward, however.
Users often find themselves repeating instructions or struggling with output that misses the mark.
Using AI assistants and pre-trained models well, then, requires the use of prompt engineering,
or choosing the right request for a given task. With correct prompts, we can improve the accuracy,
nuance and efficiency of our interactions with these models. In essence, models like these are predictors
of the text or image that we want to follow from our request. Prompt engineering is the art (or science)
of making these predictions match our desires.2
Basic Principles of Prompting
In addition to formal prompt engineering techniques, there are several best practices prompters
should follow when interacting with generative AI. We’ll see some of these principles reprised in our
specific prompting techniques.
1) Be specific and clear. When asking the AI a question or to perform a task, include all relevant
information. Try to limit the scope of the question to ensure you get the most accurate answer.
Poor: Tell me about the 1800s.
Better: Tell me about the 1800s in France.
Best: Explain the history of the Napoleonic wars in France, with a focus on the political and
social impacts of the wars.
2) Specify the structure of the desired output. If you want the AI to provide slides, bullet points,
or a detailed explanation, you need to say so.
1 Large language models (LLMs) are AI models built on a type of neural network called a transformer and trained on large
amounts of data to understand and reproduce human language.
2 For a more detailed coverage of prompt engineering and its place in AI, see Liu et al. 2023 in ACM Computing Surveys.
Professor Michael Parzen and Doctoral Student Jo Ellery prepared this note as the basis for class discussion.
Copyright © 2024 President and Fellows of Harvard College. To order copies or request permission to reproduce materials, call 1-800-545-7685,
write Harvard Business School Publishing, Boston, MA 02163, or go to www.hbsp.harvard.edu. This publication may not be digitized, photocopied,
or otherwise reproduced, posted, or transmitted, without the permission of Harvard Business School.
This document is authorized for educator review use only by Sajid Iqbal, University of Engineering and Technology UET Lahore until May 2025. Copying or posting is an infringement of
copyright. [email protected] or 617.783.7860
3) Break down complex tasks. For complex, multi-part tasks, don’t ask for everything at once.
Instead, break the task into steps. This helps to ensure that you can course-correct if something
goes wrong along the way.
Prompt Engineering Techniques
There are several specific techniques which can help to generate better responses from AI.
Providing Examples
One way of encouraging the model to give your desired result is to provide it with an example of a
similar answer in your prompt. For example, suppose you want to translate several sentences into
Chinese. Naively, you might ask the model:
Translate the following into Mandarin Chinese: I would like to have dinner with you.
There are several problems you might run into with this question. For example, the model
could return pinyin (romanization of Chinese) when you wanted characters, or vice versa; or it could
return traditional instead of simplified characters, or vice versa; or the translation could simply be
wrong. Instead, we can provide the model an example to help it answer in a more useful way. We
would put the following into the model:
Translate the following into Mandarin Chinese:
Input: I like dogs. Output: Wǒ xǐhuān gǒu. 我喜欢狗。
Input: I want to go to China. Output: Wǒ xiǎng qù zhōngguó. 我想去中国。
Now translate: I would like to have dinner with you.
By providing examples, we indicate that we want both pinyin and characters, and that we want to
use simplified characters. Additionally, providing examples can help to ensure a correct translation.
This technique can be applied to almost anything, from asking the model to solve word problems or
coding problems to generating sample text for a variety of purposes.
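The example-driven pattern above can be sketched in code. The following is a minimal sketch of assembling a few-shot prompt string before sending it to a model; the function name `build_few_shot_prompt` and its argument structure are illustrative assumptions, not part of any particular model's API.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, example input/output pairs, and a new query
    into a single prompt string."""
    lines = [instruction]
    for inp, out in examples:
        lines.append(f"Input: {inp} Output: {out}")
    lines.append(query)
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate the following into Mandarin Chinese:",
    [("I like dogs.", "Wǒ xǐhuān gǒu. 我喜欢狗。"),
     ("I want to go to China.", "Wǒ xiǎng qù zhōngguó. 我想去中国。")],
    "Now translate: I would like to have dinner with you.",
)
print(prompt)
```

Because the examples carry the formatting conventions (pinyin plus simplified characters), the same builder works for any task where a few demonstrations pin down the desired output style.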
Chain-of-Thought Prompting
Chain-of-thought prompting asks the model to present the steps it takes in coming to an answer.
Because generative AI text models predict the next word in a reply iteratively, they often have
trouble with more complex questions that require behind-the-scenes
thinking. For example, consider the following problem, drawn from Thinking Fast and Slow by Daniel
Kahneman:
A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the
ball cost?
This problem often trips up human answerers and requires some algebra to solve. In chain-of-
thought prompting, we ask the model to show its work when solving such problems. For example, we
might instead provide the prompt:
A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the
ball cost? Show all of the steps you take to find the answer.
We might also combine this prompting style with an example, in order to show the model exactly
how we want the problem to be solved. For example, we could use the following as a prompt.
Q: Mary is making widgets with a machine. She can operate two machines at once, but her
second machine will only be half as effective. She is currently operating one machine and
making forty widgets per day when working for eight hours. How many widgets would she
make per day if we gave her a second machine and she worked ten hours a day?
A: Mary makes five widgets per hour with her first machine. She would make 2.5 widgets per
hour with her second machine, for a total of 7.5 widgets per hour. Working ten hours a day,
she would make 75 widgets.
Q: Tom is making widgets with a machine. He can operate four machines at once, each with
equal efficiency. He is currently operating one machine and making ten units per day when
working for eight hours. How many widgets would he make working for six hours a day with
four machines?
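The "show your work" instruction, with an optional worked example, can be sketched as a small prompt builder. This is a hypothetical helper, assuming a plain-text Q/A format like the one above; `chain_of_thought_prompt` and its argument shapes are not a standard API.

```python
def chain_of_thought_prompt(question, worked_example=None):
    """Wrap a question so the model is asked to show its reasoning steps.
    Optionally prepend a worked question/answer pair as a demonstration."""
    parts = []
    if worked_example is not None:
        example_q, example_a = worked_example
        parts.append(f"Q: {example_q}\nA: {example_a}")
    parts.append(f"Q: {question} Show all of the steps you take to find the answer.")
    return "\n\n".join(parts)

prompt = chain_of_thought_prompt(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?"
)
print(prompt)
```

Passing the widget problem and its worked answer as `worked_example` reproduces the combined example-plus-chain-of-thought prompt shown above.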
Step-by-Step Prompting
Finally, we might want to ask the AI a question which requires complex knowledge, obscure data,
or several steps to solve. In this case, we can use step-by-step prompting to help arrive at a correct
answer. In particular, suppose we have a question which requires the model to combine several pieces
of context, like:
Where are the headquarters of the company where the world's second richest man works?
Entering this into ChatGPT as of August 2024 gives an answer of Elon Musk, Tesla, and Austin,
Texas. However, as of August 2024, Jeff Bezos was in fact the world’s second richest person (per Forbes).
The model received too many questions at once, and thus failed to produce a correct answer. Instead,
we can prompt it to search for the correct information by using multiple sequential prompts:
Who is the world’s second richest man right now?
Where does he work?
Where is that company's headquarters?
These sequential prompts instead generate a correct answer.
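The sequential approach can be sketched as a loop that feeds each answer into the next prompt. Here `ask` is a stand-in for a real model call, replaced by a canned lookup table only so the example runs deterministically; the function names, templates, and canned answers are all illustrative assumptions.

```python
# Stand-in "model": a lookup table used only so the example runs without a
# live API. In practice, ask() would call a real assistant.
canned = {
    "Who is the world's second richest man right now?": "Jeff Bezos",
    "Where does Jeff Bezos work?": "Amazon",
    "Where is Amazon's headquarters?": "Seattle, Washington",
}

def step_by_step(ask, templates):
    """Run prompts in order, substituting the previous answer into each."""
    answer = ""
    for template in templates:
        answer = ask(template.format(previous=answer))
    return answer

result = step_by_step(canned.get, [
    "Who is the world's second richest man right now?",
    "Where does {previous} work?",
    "Where is {previous}'s headquarters?",
])
print(result)
```

The key design choice is that each step's prompt is narrow enough to answer reliably, and the chain, not the model, carries the intermediate state forward.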
Types of Prompts
In addition to the above prompting techniques, it is also useful to consider various types of prompts
one might give generative models. Beyond asking basic questions, we can also generate useful
responses with the following techniques.
1) Role prompting. Suppose we want to generate text with a specific style. We might ask the
model to generate that text, taking on a given role. For example, we might ask it to, as a business
leader, generate a pitch for a new product.
2) Prompt constraints and context. It is often useful to ask a question and provide additional
constraints or context to the model. For example, you might touch on exactly what you want to
see in a complete answer, or tell the model the context of your question (asking as a business
owner or as a potential investor, for example).
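Both prompt types above can be sketched using the system/user message structure common to chat-style model APIs. This is a minimal sketch; `role_prompt`, its argument names, and the exact wording are illustrative assumptions rather than any vendor's required format.

```python
def role_prompt(role, task, constraints=()):
    """Build a chat-style message list: a system message assigning the role,
    then a user message carrying the task and any constraints or context."""
    content = task
    if constraints:
        content += " Constraints: " + "; ".join(constraints) + "."
    return [
        {"role": "system", "content": f"You are {role}."},
        {"role": "user", "content": content},
    ]

messages = role_prompt(
    "an experienced business leader",
    "Write a pitch for a new reusable water bottle.",
    ("keep it under 150 words", "address potential investors"),
)
print(messages)
```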
Many types of prompts can be used to induce large language models to perform tasks that would
otherwise require specific, specially trained machine learning models. Note, however, that these
general models often do not perform as well as specifically trained models.
1) Summarization. Suppose we have a lot of text to work through, like review data or worker
evaluations, or even news or financial reports. We might ask an AI assistant to summarize that
text for us in a short description or even a bulleted list.
2) Text classification. Many machine learning models are trained to classify text into specific
groups. We can also ask large language models to classify sets of text; for example, we can ask
it to give the topic of an article from a limited set of possible topics.
3) Sentiment analysis. Models like BERT are often used for sentiment analysis, but they require
specific training. For many applications, we can apply large language models like ChatGPT by
giving the model data (like customer reviews) and asking for the sentiment of the text.
4) Named entity recognition. Again, specialized ML models are often used to draw out specific
proper nouns from a text, a process known as named entity recognition. We can also ask large
language models to perform this task.
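Tasks like classification and sentiment analysis share one pattern: a prompt that restricts the model's answer to a fixed label set. The following is a minimal sketch of such a prompt builder; `sentiment_prompt` and the label names are illustrative assumptions.

```python
LABELS = ("positive", "neutral", "negative")

def sentiment_prompt(review, labels=LABELS):
    """Build a classification-style prompt that constrains the model's
    answer to a fixed set of labels."""
    return (
        "Classify the sentiment of the following customer review. "
        "Answer with exactly one of: " + ", ".join(labels) + ".\n\n"
        "Review: " + review
    )

prompt = sentiment_prompt("The widgets broke after two days.")
print(prompt)
```

Swapping the label set for article topics, or the instruction for "List every person and organization mentioned," adapts the same pattern to text classification or named entity recognition.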
Challenges and Limitations
Prompt engineering can be quite difficult, and it is an inexact science. Additionally, there is an
element of randomness in large language models, which means that very small changes in wording
can result in vastly different responses. As such, crafting good prompts is incredibly important when
interacting with generative AI.
Even with excellent prompts, there are some fundamental limitations of using large language
models for some tasks. As alluded to in the previous section, many tasks are commonly allocated to
specially trained, single-purpose models which have seen multiple examples—often thousands of
examples—of the task at hand. Such specially trained models will often perform better on specific tasks
than the general-purpose large language models like GPT.
Additionally, a language model used via an API or interface like ChatGPT is constantly being
updated, which means that anything it produces is unlikely to be reproducible. In essence, asking the
same model the same question on two different days may give two different answers, with no clear
ability to discern why. This means that past performance is no guarantee of future results when it
comes to using models like GPT.
Finally, the limitations of prompt engineering include the limitations of large language models
themselves. LLMs are often frozen in time, meaning that questions about current events, or those
relying on recent cultural developments, may be answered incorrectly. LLMs may hallucinate and
provide incorrect outputs even when carefully instructed. Finally, LLMs may contain undesirable bias
picked up from their training data.