[DOCS] Update docs to keep in sync with current state (#641)
We are in the process of upgrading to q4f16_1.
This PR keeps the docs in sync, as the prebuilts for RedPajama are
currently only at q4f16_0.

A follow-up docs PR can be sent once the prebuilts are updated and we
have confirmed that the flow works.

The how-to-compile-model flow is kept at q4f16_1, as it already works
and does not depend on the prebuilts.
tqchen authored Aug 2, 2023
1 parent f0f5c74 commit b9d7e18
Showing 10 changed files with 56 additions and 49 deletions.
6 changes: 3 additions & 3 deletions android/MLCChat/app/src/main/assets/app-config.json
@@ -1,16 +1,16 @@
{
"model_libs": [
"vicuna-v1-7b-q4f16_1",
-"RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"RedPajama-INCITE-Chat-3B-v1-q4f16_0"
],
"model_list": [
{
"model_url": "https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int4/",
"local_id": "vicuna-v1-7b-q4f16_1"
},
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
],
"add_model_samples": []
2 changes: 1 addition & 1 deletion docs/compilation/compile_models.rst
@@ -30,7 +30,7 @@ The easiest way to use MLC-LLM is to clone the repository and compile models
# clone the repository
git clone [email protected]:mlc-ai/mlc-llm.git --recursive
# enter to root directory of the repo
cd mlc-llm
cd mlc-llm
Verify Installation
^^^^^^^^^^^^^^^^^^^
49 changes: 28 additions & 21 deletions docs/compilation/distribute_compiled_models.rst
@@ -8,6 +8,13 @@ This page describes how to distribute the model you compiled so others can use t
For demonstration purposes, we show how to compile the `RedPajama-3B instruct model <https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-3B-v1>`_
(which has different weights from the RedPajama chat model).

.. note::

We use the quantization option `q4f16_0` throughout this example
because it is what the existing prebuilts ship with (we are upgrading
the prebuilts to `q4f16_1`). If you do not need to try out the prebuilts
and would like to compile the library from scratch, we recommend `q4f16_1`.


If you have not compiled the RedPajama-3B instruct model,
you can use the following command to compile it:
@@ -18,19 +25,19 @@ you can use the following command to compile it:

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target metal --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target metal --quantization q4f16_0
.. group-tab:: Linux - CUDA

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target cuda --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target cuda --quantization q4f16_0
.. group-tab:: Vulkan

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target vulkan --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target vulkan --quantization q4f16_0
.. contents:: Table of Contents
@@ -44,12 +51,12 @@ To begin with, we can check that we have the compilation artifact ready on the d

.. code:: shell
-~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1
-RedPajama-INCITE-Instruct-3B-v1-q4f16_1-metal.so # ===> the model library
+~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0
+RedPajama-INCITE-Instruct-3B-v1-q4f16_0-metal.so # ===> the model library
mod_cache_before_build_metal.pkl # ===> a cached file for future builds
params # ===> containing the model weights, tokenizer and chat config
-~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params
+~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params
mlc-chat-config.json # ===> the chat config
ndarray-cache.json # ===> the model weight info
params_shard_0.bin # ===> the model weights
@@ -64,7 +71,7 @@ Step 2. Update MLC Chat Configuration JSON
------------------------------------------

You can **optionally** customize the chat config file
-``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/mlc-chat-config.json`` (checkout :ref:`configure-mlc-chat-json` for more detailed instructions).
+``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json`` (check out :ref:`configure-mlc-chat-json` for more detailed instructions).
You can also simply use the default configuration and skip this step.

For demonstration purposes, we update ``mean_gen_len`` to 32 and ``max_gen_len`` to 64.
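This config tweak can also be scripted. A minimal sketch, assuming the ``params/mlc-chat-config.json`` layout shown in Step 1 (the helper name ``update_chat_config`` is ours, not part of MLC-LLM):

```python
import json
from pathlib import Path

def update_chat_config(config_path, **overrides):
    """Load an mlc-chat-config.json, apply the given overrides, and save it."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config.update(overrides)  # e.g. mean_gen_len=32, max_gen_len=64
    path.write_text(json.dumps(config, indent=2))
    return config
```

For example, ``update_chat_config("dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json", mean_gen_len=32, max_gen_len=64)`` would apply the values used in this walkthrough.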
@@ -79,8 +86,8 @@ Step 3. Specify the Model Lib
An MLC chat app needs to look up the model library in order to run the model.
In the case of the RedPajama-3B instruct model, we already have a prebuilt model lib for the RedPajama-3B chat model that shares the
same model architecture and quantization mode as the instruct model.
-We can edit ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/mlc-chat-config.json``
-and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4f16_1"``.
+We can edit ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json``
+and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4f16_0"``.

.. note::

@@ -93,25 +100,25 @@ and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --reuse-lib RedPajama-INCITE-Chat-3B-v1-q4f16_1 --target [your target] --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --reuse-lib RedPajama-INCITE-Chat-3B-v1-q4f16_0 --target [your target] --quantization q4f16_0
In this way, `mlc_llm.build` does not produce the model library for the instruct model, and in `mlc-chat-config.json`
-the ``model_lib`` field is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
+the ``model_lib`` field is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_0``.

Please note that only models with the same architecture, compiled with the same quantization mode, can reuse and share a model library.
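The ``model_lib`` edit described in Step 3 can likewise be done programmatically. A sketch under the same assumptions as before (the helper name ``set_model_lib`` is ours):

```python
import json
from pathlib import Path

def set_model_lib(config_path, model_lib):
    """Point an mlc-chat-config.json at a (shared) prebuilt model library."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["model_lib"] = model_lib  # name of the library the runtime should load
    path.write_text(json.dumps(config, indent=2))
    return config
```

Calling ``set_model_lib`` on the instruct model's config with ``"RedPajama-INCITE-Chat-3B-v1-q4f16_0"`` reuses the chat model's library, as described above.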


We should distribute the generated model lib if we want to build a new model architecture or try out customized compilation optimizations.
-In this case, we should keep the ``model_lib`` field as ``"RedPajama-INCITE-Instruct-3B-v1-q4f16_1"``.
-You can upload the model library ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-metal.so``
+In this case, we should keep the ``model_lib`` field as ``"RedPajama-INCITE-Instruct-3B-v1-q4f16_0"``.
+You can upload the model library ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/RedPajama-INCITE-Instruct-3B-v1-q4f16_0-metal.so``
and ask others to download it to the `dist/prebuilt/lib` directory so the CLI app can pick it up.


Step 4. Upload the Compiled Model Weights
-----------------------------------------

As a next step, we need to upload the model weights.
-We only need to upload the files in ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params``.
+We only need to upload the files in ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params``.
If you also want to host the compiled models on Hugging Face, you can follow the instructions below:

.. code:: shell
@@ -121,11 +128,11 @@ If you also want to host the compiled models on Hugging Face, you can follow the
git lfs install
git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo
cd my-redpajama3b-weight-huggingface-repo
-cp path/to/mlc-llm/dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/* .
+cp path/to/mlc-llm/dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/* .
git add . && git commit -m "Add redpajama-3b instruct model weights"
git push origin main
-Here we provide an `example distributed RedPajama-3B instruct model repository <https://huggingface.co/mlc-ai/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/tree/main>`_ which you can refer to.
+Here we provide an `example distributed RedPajama-3B instruct model repository <https://huggingface.co/mlc-ai/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/tree/main>`_ which you can refer to.
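Before pushing to Hugging Face, it can help to sanity-check that the ``params`` directory contains the artifacts listed in Step 1. A small sketch (the required file names follow the listing above; the helper name is ours):

```python
from pathlib import Path

def missing_upload_files(params_dir):
    """Return the expected params-directory artifacts that are absent."""
    params = Path(params_dir)
    required = ["mlc-chat-config.json", "ndarray-cache.json"]
    missing = [name for name in required if not (params / name).is_file()]
    # The weights are sharded into one or more params_shard_*.bin files.
    if not any(params.glob("params_shard_*.bin")):
        missing.append("params_shard_*.bin")
    return missing
```

An empty return value means the directory matches the layout shown earlier; anything listed is missing and would break downloads.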

---------------------------------

@@ -150,10 +157,10 @@ The steps needed to run models in CLI are similar to the steps to download the p
# Download the model weights
cd dist/prebuilt
-git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo RedPajama-INCITE-Instruct-3B-v1-q4f16_1
+git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo RedPajama-INCITE-Instruct-3B-v1-q4f16_0
cd ../..
# Run CLI
-mlc_chat_cli --local-id RedPajama-INCITE-Instruct-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Instruct-3B-v1-q4f16_0
Download the Distributed Models and Run in iOS App
@@ -163,8 +170,8 @@ For iOS app, model libraries are statically packed into the app at the time of a
Therefore, the iOS app supports running any models whose model libraries are integrated into the app.
You can check the :ref:`list of supported model libraries <prebuilt-models-ios>`.

-To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_1`` model library.
-Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of `mlc-chat-config.json` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
+To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_0`` model library.
+Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of `mlc-chat-config.json` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_0``.

Now we can download the model weights in iOS app and run the model by following the steps below:

10 changes: 5 additions & 5 deletions docs/deploy/ios.rst
@@ -54,18 +54,18 @@ in the root of the MLC-LLM.
cd dist/prebuilt
git lfs install
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
Validate that the files and directories exist:

.. code:: bash
>>> ls -l ./dist/prebuilt/lib/*-iphone.tar
-./dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_1-iphone.tar
+./dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_0-iphone.tar
./dist/prebuilt/lib/vicuna-v1-7b-q3f16_0-iphone.tar
->>> ls -l ./dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+>>> ls -l ./dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
# chat config:
mlc-chat-config.json
# model weights:
@@ -107,15 +107,15 @@ run the following command under the ``./ios`` directory:
.. code:: bash
cd ./ios
-open ./prepare_params.sh # make sure builtin_list only contains "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+open ./prepare_params.sh # make sure builtin_list only contains "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
./prepare_params.sh
The outcome should be as follows:

.. code:: bash
>>> ls ./dist/
-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+RedPajama-INCITE-Chat-3B-v1-q4f16_0
Step 4. Build iOS App
^^^^^^^^^^^^^^^^^^^^^
2 changes: 1 addition & 1 deletion docs/get_started/mlc_chat_config.rst
@@ -6,7 +6,7 @@ Configure MLCChat in JSON
This page explains the components of a chat configuration and how to customize them for your own purposes.

Each MLC Chat runtime can be configured via an ``mlc-chat-config.json`` file under the directory of each compiled model (e.g.
-`RedPajama chat config <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/blob/main/mlc-chat-config.json>`__)
+`RedPajama chat config <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/blob/main/mlc-chat-config.json>`__)
which contains the chat configuration. You can customize the chat configuration by modifying this file.
Additionally, the runtimes also provide APIs to optionally override some of the configurations.

4 changes: 2 additions & 2 deletions docs/get_started/try_out.rst
@@ -51,9 +51,9 @@ and you can try out prebuilt models on the following platforms:
# You can try more models, for example:
# download prebuilt weights of RedPajama-3B
cd dist/prebuilt
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
-mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0
.. note::
If you are using Windows or Linux, make sure you have the latest Vulkan driver installed.
16 changes: 8 additions & 8 deletions docs/prebuilt_models.rst
@@ -40,12 +40,12 @@ Prebuilt Models for CLI
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/mlc-chat-vicuna-v1-7b-q3f16_0>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__
* - `rwkv-raven-1b5-q8f16_0`
- `RWKV <https://github.com/BlinkDL/RWKV-LM>`__
- * Weight storage data type: uint8
@@ -117,12 +117,12 @@ Prebuilt Models for iOS
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/mlc-chat-vicuna-v1-7b-q3f16_0>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__

The `downloadable iOS app <https://apps.apple.com/us/app/mlc-chat/id6448482937>`_ has builtin RedPajama-3B model support.
To add a model to the iOS app, follow the steps below:
@@ -184,7 +184,7 @@ For example, if you compile `OpenLLaMA-7B <https://github.com/openlm-research/op
- * Weight storage data type: int3
* Running data type: float16
* Symmetric quantization
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- GPT-NeoX
- * Weight storage data type: int4
* Running data type: float16
@@ -210,12 +210,12 @@ Prebuilt Models for Android
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int4>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__

------------------

@@ -264,7 +264,7 @@ MLC-LLM supports the following model architectures:
* - ``minigpt``
- `MiniGPT <https://huggingface.co/Vision-CAIR/MiniGPT-4>`__
- `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/minigpt.py>`__
-
-
* - ``gpt_bigcode``
- `GPTBigCode <https://huggingface.co/docs/transformers/model_doc/gpt_bigcode>`__
- `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_bigcode.py>`__
10 changes: 5 additions & 5 deletions ios/MLCChat/app-config.json
@@ -1,18 +1,18 @@
{
"model_libs": [
"Llama-2-7b-chat-hf-q3f16_1",
-"RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"RedPajama-INCITE-Chat-3B-v1-q4f16_0"
],
"model_list": [
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
],
"add_model_samples": [
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
]
}
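The ``app-config.json`` edits above preserve a simple invariant: in these configs, each entry's ``local_id`` matches a library name listed in ``model_libs``, so the statically packed library can be found. A hedged consistency check (the invariant is inferred from the configs shown here, and the helper name is ours):

```python
import json

def unresolved_model_libs(app_config_text):
    """Return local_ids that do not match any packed model_libs entry."""
    config = json.loads(app_config_text)
    libs = set(config.get("model_libs", []))
    entries = config.get("model_list", []) + config.get("add_model_samples", [])
    return [e["local_id"] for e in entries if e["local_id"] not in libs]
```

Running this over the edited config should return an empty list; a non-empty result flags an entry whose library is not packed into the app.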
2 changes: 1 addition & 1 deletion ios/prepare_params.sh
@@ -7,7 +7,7 @@ mkdir -p dist

declare -a builtin_list=(
"Llama-2-7b-chat-hf-q3f16_1"
-# "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+# "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
# "vicuna-v1-7b-q3f16_0"
# "rwkv-raven-1b5-q8f16_0"
# "rwkv-raven-3b-q8f16_0"
4 changes: 2 additions & 2 deletions site/index.md
@@ -82,9 +82,9 @@ mlc_chat_cli --local-id vicuna-v1-7b-q3f16_0

# Download prebuilt weights of RedPajama-3B
cd dist/prebuilt
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
-mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0

# Download prebuilt weights of RWKV-raven-1.5B/3B/7B
cd dist/prebuilt
