🎉upload paper & readme
ZubinGou committed Apr 11, 2024
1 parent d37b489 commit 5512496
Showing 8 changed files with 81 additions and 21 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
100 changes: 79 additions & 21 deletions README.md
@@ -1,33 +1,91 @@
# Project

> This repo has been populated by an initial template to help get you started. Please
> make sure to update the content to build a great experience for community-building.

As the maintainer of this project, please make a few updates:
<h1 align="center">
<img src="./docs/static/images/rho_logo.png" width="100" alt="rho-logo" />
<br>
Rho-1: Not All Tokens Are What You Need
</h1>

- Improving this README.MD file to provide a great experience
- Updating SUPPORT.MD with content about this project's support experience
- Understanding the security reporting process in SECURITY.MD
- Remove this section from the README
<div align="center">

## Contributing
![](https://img.shields.io/badge/Model-Release%20Soon-blue)
![](https://img.shields.io/badge/Code%20License-MIT-green)

</div>

<p align="center">
<a href="https://microsoft.github.io/rho/rho-1.pdf"><b>[📜 Paper]</b></a> •
<!-- <a href="https://arxiv.org/abs/TODO"><b>[📜 Paper]</b></a> • -->
<!-- <a href="https://huggingface.co/TODO"><b>[🤗 HF Models]</b></a> • -->
<a href="https://github.com/microsoft/rho"><b>[🐱 GitHub]</b></a>
<!-- <a href="https://twitter.com/TODO"><b>[🐦 Twitter]</b></a> -->
</p>

<p align="center">
<img src="./docs/static/images/acc_vs_tokens_1b_7b.png" width="1000">
<br>
<em>Figure 1: Rho-1 is trained with Selective Language Modeling (SLM). SLM improves average few-shot accuracy on GSM8k and MATH by over 16%, achieving the baseline performance 5-10x faster.</em>
</p>


## 🔥 News

<!-- - [2024/04/12] 🔥🔥🔥 Rho-Math-v0.1 models released at [🤗 HuggingFace](https://huggingface.co/TODO)! -->
- [2024/04/11] Rho-1 paper and repo released.
- [2024/04/11] 🔥🔥🔥 Rho-1-1B is the first 1B LLM that achieves over 40% accuracy on the MATH dataset.


## 💡 Introduction

Rho-1 employs Selective Language Modeling (SLM), which selectively trains on clean and useful tokens that align with the desired distribution.

- When continually pretrained on the 15B-token OpenWebMath corpus, Rho-1 yields an absolute improvement in few-shot accuracy of up to 30% across 9 math tasks.
- After fine-tuning, Rho-1 1B and 7B achieve state-of-the-art results of 40.6\% and 51.8\% on the MATH dataset, respectively, matching DeepSeekMath with only 3\% of the pretraining tokens.

### Selective Language Modeling (SLM)

<p align="center">
<img src="./docs/static/images/example.png" width="1000">
<br>
<em>Figure 2:
<b>Upper:</b> Even an extensively filtered pretraining corpus contains token-level noise.
<b>Left:</b> Previous Causal Language Modeling (CLM) trains on all tokens.
<b>Right:</b> Our proposed Selective Language Modeling (SLM) applies the loss only to useful, clean tokens.</em>
</p>
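
The difference between the two objectives in Figure 2 can be illustrated with a minimal sketch. It assumes per-token losses are already available and that a boolean mask marks the tokens to keep. The function names `clm_loss` and `slm_loss`, the example values, and the choice to average over the selected tokens are illustrative assumptions, not code from this repository.

```python
from typing import Sequence


def clm_loss(token_losses: Sequence[float]) -> float:
    """Causal LM objective: average the loss over every token."""
    return sum(token_losses) / len(token_losses)


def slm_loss(token_losses: Sequence[float], keep: Sequence[bool]) -> float:
    """Selective LM objective: average the loss over selected tokens only."""
    kept = [loss for loss, k in zip(token_losses, keep) if k]
    if not kept:
        # No token selected: nothing to train on in this sequence.
        return 0.0
    return sum(kept) / len(kept)


# Hypothetical per-token losses; the third token stands in for a noisy token.
losses = [1.0, 2.0, 9.0, 3.0]
keep = [True, True, False, True]

clm = clm_loss(losses)        # (1 + 2 + 9 + 3) / 4
slm = slm_loss(losses, keep)  # (1 + 2 + 3) / 3
print(clm, slm)
```

In this toy case the noisy token dominates the CLM average, while the SLM average ignores it entirely.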

<p align="center">
<img src="./docs/static/images/pipeline.png" width="1000">
<br>
<em>Figure 3: <b>The pipeline of Selective Language Modeling.</b>
SLM optimizes language model performance by concentrating on valuable, clean tokens during pre-training.
It involves three steps:
(Step 1) Initially, train a reference model on high-quality data.
(Step 2) Then, score each token's loss in a corpus using the reference model.
(Step 3) Finally, train the language model only on tokens with high excess loss, that is, tokens whose training loss is much higher than their reference loss.</em>
</p>
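
Steps 2 and 3 of the pipeline can be sketched as follows. The sketch assumes the per-token losses of the training model and the reference model have already been computed (Step 1 and the model forward passes are out of scope). The function `select_tokens`, the `keep_ratio` parameter, and the example numbers are hypothetical and are not taken from the released code.

```python
from typing import List, Sequence


def select_tokens(
    train_losses: Sequence[float],
    ref_losses: Sequence[float],
    keep_ratio: float,
) -> List[bool]:
    """Return a mask marking the tokens with the highest excess loss."""
    if len(train_losses) != len(ref_losses):
        raise ValueError("loss sequences must have the same length")
    # Excess loss: how much worse the training model is than the reference.
    excess = [t - r for t, r in zip(train_losses, ref_losses)]
    n_keep = max(1, int(len(excess) * keep_ratio))
    # Indices sorted by excess loss, largest first (ties keep original order).
    ranked = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)
    chosen = set(ranked[:n_keep])
    return [i in chosen for i in range(len(excess))]


# Hypothetical losses for four tokens. The last token has a high loss under
# both models, so its excess loss is small and it is not selected.
train_losses = [2.0, 5.0, 1.0, 6.0]
ref_losses = [1.5, 1.0, 0.9, 5.9]

mask = select_tokens(train_losses, ref_losses, keep_ratio=0.5)
print(mask)
```

Ranking by the difference rather than by the raw training loss is the point of the reference model: a token that is hard for both models is more likely noise than something worth learning.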



## 🚀 Quick Start

Code and models will be released within the next few days. Stay tuned!


## 🍀 Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## ☕️ Citation

If you find this repository helpful, please consider citing our paper:
```
arXiv on hold, update soon.
```


## Trademarks
## 🌟 Star History

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.
[![Star History Chart](https://api.star-history.com/svg?repos=microsoft/rho&type=Date)](https://star-history.com/#microsoft/rho&Date)
Binary file added docs/rho-1.pdf
Binary file not shown.
Binary file added docs/static/images/acc_vs_tokens_1b_7b.png
Binary file added docs/static/images/example.png
Binary file added docs/static/images/pipeline.png
Binary file added docs/static/images/rho_logo.png
1 change: 1 addition & 0 deletions rho-1/README.md
@@ -0,0 +1 @@
The code is currently being cleaned up and will be released in the next few days.
