Supervised Fine-Tuning (SFT)
In AI training, Supervised Fine-Tuning (SFT) is a process in which a pre-trained model is refined on a smaller,
more specific dataset with trainer-provided labels. It is usually the first stage in which AI Trainers refine
the model with tailored data. Prior to SFT, the model is pre-trained using a Training Set.
How does SFT work?
During the SFT process, the model is trained on thousands to millions of conversations that AI Trainers
create from scratch, “playing” both sides, so that the AI sees many examples of human interactions. These
conversations imitate different types of users who speak and think in different ways and want different
things from their respective conversations.
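To make the idea of a trainer "playing" both sides concrete, here is a minimal sketch of how one such conversation might be stored and turned into training examples. The class names, the "user"/"bot" role labels, and the way each bot response is paired with the preceding history are assumptions made for illustration, not a description of any particular client's pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Turn:
    role: str  # "user" or "bot" (hypothetical labels)
    text: str


@dataclass
class SFTConversation:
    persona: str    # who the simulated user is
    objective: str  # what the user wants from the conversation
    turns: List[Turn] = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        """Append a turn, enforcing user/bot alternation starting with the user."""
        expected = "user" if len(self.turns) % 2 == 0 else "bot"
        if role != expected:
            raise ValueError(f"expected a {expected} turn, got {role}")
        self.turns.append(Turn(role, text))

    def training_pairs(self) -> List[Tuple[str, str]]:
        """Pair each bot response with the full history that precedes it."""
        pairs = []
        for i, turn in enumerate(self.turns):
            if turn.role == "bot":
                history = "\n".join(f"{t.role}: {t.text}" for t in self.turns[:i])
                pairs.append((history, turn.text))
        return pairs


# Illustrative usage with placeholder text.
conv = SFTConversation(persona="busy parent", objective="plan a weeknight dinner")
conv.add("user", "What can I cook in 30 minutes?")
conv.add("bot", "A stir-fry is a quick option. Do you have any dietary restrictions?")
for history, response in conv.training_pairs():
    print(history, "->", response)
```

Each pair holds the conversation so far and the bot response the trainer wrote for it, which is the kind of labeled example a supervised step can learn from.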
What is an SFT Conversation?
When people interact, they don’t all interact in exactly the same way and expect
exactly the same thing as a result of the interaction. In the real world, when humans
talk to each other, we have access to a lot of secondary information that allows us
to interpret people’s words, such as body language, tone of voice, gestures, eye
contact, etc. An AI, however, cannot access this extra data. SFT tasks are vital in
helping the model learn what successful conversations look like and how they
function. Think of it as the way an AI learns to align more closely with normal
human interaction.
WRITING A BASIC SFT CONVERSATION
Plan
Step 1: Develop a User Personality and Objective.
When you’re writing an SFT conversation, you’re functionally writing creative non-fiction. You’re creating
the personality of the user and the objective for the user to be there, but the content you provide to the
user has to be factually accurate and consistent with the real world. Most of our clients mandate the user
be an average American (be sure to review your client-specific Style Guide).
Step 2: Collect information.
To craft engaging conversations on diverse topics, you'll need to gather information from the internet to
provide accurate answers to user queries. Keep these rules in mind:
Do NOT copy and paste from websites straight into your conversation.
Do NOT use information from untrustworthy sites.
Do NOT use info from books or your own brain.
ALWAYS include any research you used.
Step 3: Plan the Conversation.
Develop the Conversation "Plotline": what will happen in your conversation from beginning to end.
TIPS:
Ensure the conversation flows logically; avoid abrupt topic changes.
Clarify technical terms or concepts in bot responses without introducing entirely new information on every turn.
Don't overly stress about "world-building"; align the depth of information with the user's personality preferences.
Write
Step 4: Write the Conversation.
In general, humans are more satisfied with a conversation if it has sufficient Depth (the degree to which
the objective has been met and content has been conveyed). Depth can be built by going beyond
surface-level responses, contributing to your overall flow.
Review
Step 5: Make sure Turns connect fluidly.
Once you finish writing your conversation, you need to review it to ensure not only that it's free from
errors, but also that it has the flow of a natural conversation. Flow pertains to the smooth and logical
progression of a conversation, ensuring it maintains a natural sequence of ideas and responses between
user prompts and bot responses.
Step 6: Final Check
Check for plagiarism & English production.
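A few mechanical parts of this review can be sketched in code: checking that turns alternate between user and bot, and flagging bot responses that reuse long runs of words from a research source. The function names, the (role, text) tuple format, and the 8-word threshold below are assumptions chosen for illustration. A script like this cannot judge flow or writing quality, so it supplements a human read-through rather than replacing it.

```python
import re
from typing import List, Set, Tuple


def word_ngrams(text: str, n: int = 8) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_spans(response: str, source: str, n: int = 8) -> List[str]:
    """List n-word runs that appear verbatim in both response and source."""
    shared = word_ngrams(response, n) & word_ngrams(source, n)
    return sorted(" ".join(gram) for gram in shared)


def review(turns: List[Tuple[str, str]], sources: List[str], n: int = 8) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for i, (role, text) in enumerate(turns):
        # Turns are expected to alternate, starting with the user.
        expected = "user" if i % 2 == 0 else "bot"
        if role != expected:
            problems.append(f"turn {i + 1}: expected {expected}, got {role}")
        if not text.strip():
            problems.append(f"turn {i + 1}: empty text")
        # Only bot responses are compared against the research sources.
        if role == "bot":
            for src in sources:
                if copied_spans(text, src, n):
                    problems.append(f"turn {i + 1}: verbatim overlap with a source")
                    break
    return problems
```

The threshold is a trade-off: a lower value of n flags common phrases that are not really copied, while a higher value misses lightly edited copy-paste.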
Highlights to remember!
When formulating the initial user query, keep it broad.
Avoid relying on conversational niceties like greetings or thanks.
Develop a clear plotline so the conversation flows.
Consider personal experiences and perspectives for inspiration.
Don't worry about including excessive world-building.
Approach writing the conversation like a dialogue in a play.
Avoid being too harshly critical of your own writing.