## 1. Pre-requisites: create a venv and install onnxruntime-genai

For CPU:
```bash
python -m venv .venv && source .venv/bin/activate
pip install requests numpy --pre onnxruntime-genai
```

For GPU (CUDA), install the CUDA build plus Olive for model conversion:

```bash
python -m venv .venv && source .venv/bin/activate
pip install requests numpy --pre onnxruntime-genai-cuda "olive-ai[gpu]"
```
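
To confirm the install landed in the venv, a minimal import check (this assumes the wheel exposes `__version__`, as recent onnxruntime-genai releases do):

```python
# Hedged sanity check: a successful import means the onnxruntime-genai wheel and
# its native library loaded in this venv; __version__ is assumed to be exposed.
import onnxruntime_genai as og

print("onnxruntime-genai version:", og.__version__)
```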

## 2. Acquire model

Download a prebuilt ONNX model from Hugging Face:

```bash
huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include "deepseek-r1-distill-qwen-1.5B/*" --local-dir .
```

Or choose your own model and convert it to ONNX with Olive:

```bash
olive auto-opt --model_name_or_path deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --output_path ./deepseek-r1-distill-qwen-1.5B --device gpu --provider CUDAExecutionProvider --precision int4 --use_model_builder --log_level 1
```
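
Either route should leave an ONNX GenAI model folder (with `genai_config.json` and the ONNX weights) on disk. Below is a hedged sketch of a quick load check; `MODEL_DIR` is illustrative, so point it at whichever directory huggingface-cli or Olive actually produced (for example, the CPU int4 subfolder used in step 3):

```python
# Hedged load check: og.Model reads genai_config.json from the given folder.
# MODEL_DIR is an assumption for illustration -- adjust it to the folder that
# was actually downloaded or exported.
import onnxruntime_genai as og

MODEL_DIR = "deepseek-r1-distill-qwen-1.5B"
model = og.Model(MODEL_DIR)
tokenizer = og.Tokenizer(model)
print("Loaded model and tokenizer from", MODEL_DIR)
```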

## 3. Play with the model

Download the example chat script and run it against the model. On GPU:

```bash
curl -s -o model-chat.py https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m deepseek-r1-distill-qwen-1.5B -e gpu --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
```

On CPU, pointing at the int4 CPU variant of the downloaded model:

```bash
curl -s -o model-chat.py https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/ -e cpu
```
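
`model-chat.py` is a thin wrapper over the onnxruntime-genai Python API. If you would rather call the API directly, here is a minimal sketch of the same loop, assuming a recent onnxruntime-genai build with the `append_tokens`-style Generator API used in the repo's Python examples (older releases set `params.input_ids` and call `compute_logits` instead); the model path and chat template mirror the GPU command above:

```python
# Minimal streaming-generation sketch (not the full model-chat.py); names follow
# recent onnxruntime-genai Python examples and may differ between versions.
import onnxruntime_genai as og

MODEL_DIR = "deepseek-r1-distill-qwen-1.5B"  # same path passed to -m above
CHAT_TEMPLATE = "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"

model = og.Model(MODEL_DIR)
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()            # incremental detokenizer for streamed output

params = og.GeneratorParams(model)
params.set_search_options(max_length=2048)    # cap on prompt + generated tokens

prompt = CHAT_TEMPLATE.format(input="What is 1 + 1?")
generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))

while not generator.is_done():
    generator.generate_next_token()
    token = generator.get_next_tokens()[0]
    print(stream.decode(token), end="", flush=True)
print()
```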