Showing 8 changed files with 81 additions and 21 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
.DS_Store | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,33 +1,91 @@ | ||
# Project | ||
|
||
> This repo has been populated by an initial template to help get you started. Please | ||
> make sure to update the content to build a great experience for community-building. | ||
|
||
As the maintainer of this project, please make a few updates: | ||
<h1 align="center"> | ||
<img src="./docs/static/images/rho_logo.png" width="100" alt="rho-logo" /> | ||
<br> | ||
Rho-1: Not All Tokens Are What You Need | ||
</h1> | ||
|
||
- Improving this README.MD file to provide a great experience | ||
- Updating SUPPORT.MD with content about this project's support experience | ||
- Understanding the security reporting process in SECURITY.MD | ||
- Remove this section from the README | ||
<div align="center"> | ||
|
||
## Contributing | ||
![](https://img.shields.io/badge/Model-Release%20Soon-blue) | ||
![](https://img.shields.io/badge/Code%20License-MIT-green) | ||
|
||
</div> | ||
|
||
<p align="center"> | ||
<a href="https://microsoft.github.io/rho/rho-1.pdf"><b>[📜 Paper]</b></a> • | ||
<!-- <a href="https://arxiv.org/abs/TODO"><b>[📜 Paper]</b></a> • --> | ||
<!-- <a href="https://huggingface.co/TODO"><b>[🤗 HF Models]</b></a> • --> | ||
<a href="https://github.com/microsoft/rho"><b>[🐱 GitHub]</b></a> | ||
<!-- <a href="https://twitter.com/TODO"><b>[🐦 Twitter]</b></a> --> | ||
</p> | ||
|
||
<p align="center"> | ||
<img src="./docs/static/images/acc_vs_tokens_1b_7b.png" width="1000"> | ||
<br> | ||
<em>Figure 1: Rho-1 is trained with Selective Language Modeling (SLM). SLM improves average few-shot accuracy on GSM8k and MATH by over 16%, achieving the baseline performance 5-10x faster.</em> | ||
</p> | ||
|
||
|
||
## 🔥 News | ||
|
||
<!-- - [2024/04/12] 🔥🔥🔥 Rho-Math-v0.1 models released at [🤗 HuggingFace](https://huggingface.co/TODO)! --> | ||
- [2024/04/11] Rho-1 paper and repo released. | ||
- [2024/04/11] 🔥🔥🔥 Rho-1-1B is the first 1B LLM that achieves over 40% accuracy on the MATH dataset. | ||
|
||
|
||
## 💡 Introduction | ||
|
||
Rho-1 employs Selective Language Modeling (SLM), which selectively trains on clean and useful tokens that align with the desired distribution. | ||
|
||
- When continually pretrained on the 15B-token OpenWebMath corpus, Rho-1 yields an absolute improvement in few-shot accuracy of up to 30% on 9 math tasks. | ||
- After fine-tuning, Rho-1 1B and 7B achieve state-of-the-art results of 40.6\% and 51.8\% on the MATH dataset, respectively, matching DeepSeekMath with only 3\% of the pretraining tokens. | ||
|
||
### Selective Language Modeling (SLM) | ||
|
||
<p align="center"> | ||
<img src="./docs/static/images/example.png" width="1000"> | ||
<br> | ||
<em>Figure 2: | ||
<b>Upper:</b> Even an extensively filtered pretraining corpus contains token-level noise. | ||
<b>Left:</b> Previous Causal Language Modeling (CLM) trains on all tokens. | ||
<b>Right:</b> Our proposed Selective Language Modeling (SLM) applies the loss only to useful, clean tokens.</em> | ||
</p> | ||
|
||
<p align="center"> | ||
<img src="./docs/static/images/pipeline.png" width="1000"> | ||
<br> | ||
<em>Figure 3: <b>The pipeline of Selective Language Modeling.</b> | ||
SLM optimizes language model performance by concentrating on valuable, clean tokens during pre-training. | ||
It involves three steps: | ||
(Step 1) Initially, train a reference model on high-quality data. | ||
(Step 2) Then, score each token's loss in a corpus using the reference model. | ||
(Step 3) Finally, train the language model selectively on tokens that show higher excess loss compared to the reference loss.</em> | ||
</p> | ||
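
The three steps in the Figure 3 caption can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the released Rho-1 code: the function names, the `keep_ratio` parameter, and the top-fraction selection rule are assumptions made for this example, and the per-token losses are supplied as plain lists rather than computed by real models.

```python
def excess_losses(train_losses, ref_losses):
    """Step 2 (sketch): excess loss = training-model loss minus reference-model loss."""
    return [t - r for t, r in zip(train_losses, ref_losses)]


def select_tokens(train_losses, ref_losses, keep_ratio):
    """Step 3a (sketch): 0/1 mask keeping the tokens with the highest excess loss.

    `keep_ratio` is a hypothetical hyperparameter: the fraction of tokens kept.
    """
    excess = excess_losses(train_losses, ref_losses)
    k = max(1, int(len(excess) * keep_ratio))
    # Indices of the k tokens with the largest excess loss.
    top = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)[:k]
    mask = [0] * len(excess)
    for i in top:
        mask[i] = 1
    return mask


def selective_loss(train_losses, mask):
    """Step 3b (sketch): average the training loss over the selected tokens only."""
    kept = sum(mask)
    return sum(loss * m for loss, m in zip(train_losses, mask)) / kept


# Toy per-token losses for a 4-token sequence (made-up numbers).
train_losses = [2.0, 0.5, 3.0, 1.0]   # current language model
ref_losses = [0.5, 0.4, 2.9, 0.2]     # reference model trained on high-quality data

mask = select_tokens(train_losses, ref_losses, keep_ratio=0.5)
print(mask)                                # [1, 0, 0, 1]
print(selective_loss(train_losses, mask))  # 1.5
```

In this toy example the third token has the highest raw loss (3.0) but is not selected, because the reference model finds it almost as hard (2.9). Its excess loss is small, which is the kind of token the caption describes SLM as skipping.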
|
||
|
||
|
||
## 🚀 Quick Start | ||
|
||
Code and models will be released within the next few days. Stay tuned! | ||
|
||
|
||
## 🍀 Contributing | ||
|
||
This project welcomes contributions and suggestions. Most contributions require you to agree to a | ||
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us | ||
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com. | ||
|
||
When you submit a pull request, a CLA bot will automatically determine whether you need to provide | ||
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions | ||
provided by the bot. You will only need to do this once across all repos using our CLA. | ||
|
||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). | ||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or | ||
contact [[email protected]](mailto:[email protected]) with any additional questions or comments. | ||
## ☕️ Citation | ||
|
||
If you find this repository helpful, please consider citing our paper: | ||
``` | ||
arXiv on hold, update soon. | ||
``` | ||
|
||
|
||
## Trademarks | ||
## 🌟 Star History | ||
|
||
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft | ||
trademarks or logos is subject to and must follow | ||
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general). | ||
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. | ||
Any use of third-party trademarks or logos are subject to those third-party's policies. | ||
[![Star History Chart](https://api.star-history.com/svg?repos=microsoft/rho&type=Date)](https://star-history.com/#microsoft/rho&Date) |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
The code is currently being cleaned up and will be released in the next few days. |