[pull] main from Instruction-Tuning-with-GPT-4:main #1

Merged: 4 commits, Apr 7, 2023
Update README.md
ChunyuanLI authored Apr 7, 2023
commit ac1010b49b030d207f79658cbdeb5ccb221071a7
31 changes: 20 additions & 11 deletions README.md
@@ -22,34 +22,43 @@ This is the repo for the GPT-4-LLM, which aims to share data generated by GPT-4

**Usage and License Notices**: The data is intended and licensed for research use only. The dataset is CC BY NC 4.0 (allowing only non-commercial use) and models trained using the dataset should not be used outside of research purposes.


- [Overview](#overview)
- [GPT-4 Data Release](#data-release)
- [How Good is the Data?](#how-good-is-the-data)
- [Fine-tuning with the Data](#fine-tuning-with-the-data)

## :fire: News

* **[2023.04.06]** Paper and data are released.




## Overview
Large Language Models (LLMs) have shown impressive generalization capabilities, such as in-context learning and chain-of-thought reasoning. To enable LLMs to follow natural language instructions and complete real-world tasks, researchers have been exploring instruction-tuning methods. To advance the state of the art in instruction tuning for LLMs, we present the first attempt to use GPT-4 to generate instruction-following data for LLM fine-tuning.

## Data Release

* [`alpaca_gpt4_data.json`](./data/alpaca_gpt4_data.json) contains 52K instruction-following examples generated by GPT-4 with the prompts from Alpaca.
This JSON file has the same format as the Alpaca data, except that the output is generated by GPT-4:

- `instruction`: `str`, describes the task the model should perform. Each of the 52K instructions is unique.
- `input`: `str`, optional context or input for the task.
- `output`: `str`, the answer to the instruction as generated by `GPT-4`.


* [`alpaca_gpt4_data_zh.json`](./data/alpaca_gpt4_data_zh.json) contains 52K instruction-following examples generated by GPT-4 with Alpaca prompts translated into Chinese by ChatGPT. This JSON file has the same format.

* [`comparision_data.json`](./data/comparision_data.json) contains ranked responses from three models (GPT-4, GPT-3.5 and OPT-IML), where GPT-4 was asked to rate the quality of each response.

    - `user_input`: `str`, the prompt used for querying the LLMs.
    - `completion_a`: `str`, a model completion that is ranked higher than `completion_b`.
    - `completion_b`: `str`, a different model completion that has a lower quality score.

* [`unnatural_instruction_gpt4_data.json`](./data/unnatural_instruction_gpt4_data.json) contains 9K instruction-following examples generated by GPT-4 with prompts from Unnatural Instruction. This JSON file has the same format as the Alpaca data.
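
As a minimal sketch of how the files described above might be read, the snippet below loads a file and checks the documented fields. It assumes each file is a single top-level JSON array of objects, which is not stated explicitly here; `load_records`, `to_preference_pair`, and the two field tuples are hypothetical names introduced for illustration and are not part of this repo.

```python
import json
from pathlib import Path

# Field names as documented in the Data Release section.
ALPACA_FIELDS = ("instruction", "input", "output")
COMPARISON_FIELDS = ("user_input", "completion_a", "completion_b")


def load_records(path, required_fields):
    """Load a JSON file and check that every record has the given string fields.

    Assumption: the file is one top-level JSON array of objects.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    for index, record in enumerate(records):
        for field in required_fields:
            if not isinstance(record.get(field), str):
                raise ValueError(
                    f"record {index}: field {field!r} is missing or not a string"
                )
    return records


def to_preference_pair(record):
    """Return (prompt, preferred, rejected) from a comparison record.

    `completion_a` is documented as the higher-ranked completion.
    """
    return record["user_input"], record["completion_a"], record["completion_b"]
```

For example, `load_records("data/alpaca_gpt4_data.json", ALPACA_FIELDS)` would return the Alpaca-format records, and mapping `to_preference_pair` over `load_records("data/comparision_data.json", COMPARISON_FIELDS)` would give (prompt, preferred, rejected) triples of the kind a reward-model training script could consume.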

## How Good is the Data

Human evaluation of model generations was performed on Amazon Mechanical Turk, following the Helpfulness, Honesty and Harmlessness criteria of [Anthropic AI](https://arxiv.org/abs/2112.00861). The results are summarized as follows:
- Two instruction-tuned LLaMA models were compared, fine-tuned on data generated by GPT-4 and GPT-3 respectively.