[DOCS] Update docs to keep in sync with current state (#641)
We are in the process of upgrading to q4f16_1.
This PR keeps the docs in sync, as the prebuilts for RedPajama are
currently only at q4f16_0.

A follow-up docs PR can be sent once the prebuilts are updated and we
have confirmed that the flow works.

The how-to-compile-model flow is kept at q4f16_1, as it already works
and does not depend on the prebuilts.
tqchen authored Aug 2, 2023
1 parent f0f5c74 commit b9d7e18
Showing 10 changed files with 56 additions and 49 deletions.
6 changes: 3 additions & 3 deletions android/MLCChat/app/src/main/assets/app-config.json
@@ -1,16 +1,16 @@
{
"model_libs": [
"vicuna-v1-7b-q4f16_1",
-"RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"RedPajama-INCITE-Chat-3B-v1-q4f16_0"
],
"model_list": [
{
"model_url": "https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int4/",
"local_id": "vicuna-v1-7b-q4f16_1"
},
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
],
"add_model_samples": []
2 changes: 1 addition & 1 deletion docs/compilation/compile_models.rst
@@ -30,7 +30,7 @@ The easiest way to use MLC-LLM is to clone the repository and compile models
# clone the repository
git clone [email protected]:mlc-ai/mlc-llm.git --recursive
# enter to root directory of the repo
cd mlc-llm
cd mlc-llm
Verify Installation
^^^^^^^^^^^^^^^^^^^
49 changes: 28 additions & 21 deletions docs/compilation/distribute_compiled_models.rst
@@ -8,6 +8,13 @@ This page describes how to distribute the model you compiled so others can use t
For demonstration purposes, we show how to compile the `RedPajama-3B instruct model <https://huggingface.co/togethercomputer/RedPajama-INCITE-Instruct-3B-v1>`_
(which has different weights from the RedPajama chat model).

.. note::

We use the quantization option `q4f16_0` throughout this example
because it is what the existing prebuilts ship with (we are upgrading
the prebuilts to `q4f16_1`). If you do not need to try out the prebuilts
and would like to compile the library from scratch, we recommend `q4f16_1`.


If you have not compiled the RedPajama-3B instruct model,
you can use the following command to compile it:
@@ -18,19 +25,19 @@ you can use the following command to compile it:

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target metal --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target metal --quantization q4f16_0
.. group-tab:: Linux - CUDA

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target cuda --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target cuda --quantization q4f16_0
.. group-tab:: Vulkan

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target vulkan --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --target vulkan --quantization q4f16_0
.. contents:: Table of Contents
@@ -44,12 +51,12 @@ To begin with, we can check that we have the compilation artifact ready on the d

.. code:: shell
-~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1
-RedPajama-INCITE-Instruct-3B-v1-q4f16_1-metal.so # ===> the model library
+~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0
+RedPajama-INCITE-Instruct-3B-v1-q4f16_0-metal.so # ===> the model library
mod_cache_before_build_metal.pkl # ===> a cached file for future builds
params # ===> containing the model weights, tokenizer and chat config
-~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params
+~/mlc-llm > ls dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params
mlc-chat-config.json # ===> the chat config
ndarray-cache.json # ===> the model weight info
params_shard_0.bin # ===> the model weights
@@ -64,7 +71,7 @@ Step 2. Update MLC Chat Configuration JSON
------------------------------------------

You can **optionally** customize the chat config file
-``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/mlc-chat-config.json`` (checkout :ref:`configure-mlc-chat-json` for more detailed instructions).
+``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json`` (check out :ref:`configure-mlc-chat-json` for more detailed instructions).
You can also simply use the default configuration and skip this step.

For demonstration purposes, we update ``mean_gen_len`` to 32 and ``max_gen_len`` to 64.
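This config tweak can also be scripted. A minimal sketch, assuming the ``params/mlc-chat-config.json`` layout shown in Step 1 (the helper name ``update_chat_config`` is ours, not part of MLC-LLM):

```python
import json
from pathlib import Path

def update_chat_config(config_path, **overrides):
    """Load an mlc-chat-config.json, apply the given overrides, and save it."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config.update(overrides)  # e.g. mean_gen_len=32, max_gen_len=64
    path.write_text(json.dumps(config, indent=2))
    return config
```

For example, ``update_chat_config("dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json", mean_gen_len=32, max_gen_len=64)`` would apply the values used in this walkthrough.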
@@ -79,8 +86,8 @@ Step 3. Specify the Model Lib
An MLC chat app needs to look up the model library in order to run the model.
In the case of the RedPajama-3B instruct model, we already have a prebuilt model lib for the RedPajama-3B chat model that shares the
same model architecture and quantization mode as the instruct model.
-We can edit ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/mlc-chat-config.json``
-and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4f16_1"``.
+We can edit ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/mlc-chat-config.json``
+and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4f16_0"``.

.. note::

@@ -93,25 +100,25 @@ and update the value of field ``model_lib`` to ``"RedPajama-INCITE-Chat-3B-v1-q4

.. code:: shell
-python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --reuse-lib RedPajama-INCITE-Chat-3B-v1-q4f16_1 --target [your target] --quantization q4f16_1
+python3 -m mlc_llm.build --hf-path togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --reuse-lib RedPajama-INCITE-Chat-3B-v1-q4f16_0 --target [your target] --quantization q4f16_0
In this way, `mlc_llm.build` does not produce the model library for the instruct model, and in `mlc-chat-config.json`
-the ``model_lib`` field is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
+the ``model_lib`` field is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_0``.

Please note that only models with the same architecture, compiled with the same quantization mode, can reuse and share a model library.
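The ``model_lib`` edit described in Step 3 can likewise be done programmatically. A sketch under the same assumptions as before (the helper name ``set_model_lib`` is ours):

```python
import json
from pathlib import Path

def set_model_lib(config_path, model_lib):
    """Point an mlc-chat-config.json at a (shared) prebuilt model library."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["model_lib"] = model_lib  # name of the library the runtime should load
    path.write_text(json.dumps(config, indent=2))
    return config
```

Calling ``set_model_lib`` on the instruct model's config with ``"RedPajama-INCITE-Chat-3B-v1-q4f16_0"`` reuses the chat model's library, as described above.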


We should distribute the generated model lib if we want to build a new model architecture or try out customized compilation optimizations.
-In this case, we should keep the ``model_lib`` field as ``"RedPajama-INCITE-Instruct-3B-v1-q4f16_1"``.
-You can upload the model library ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/RedPajama-INCITE-Instruct-3B-v1-q4f16_1-metal.so``
+In this case, we should keep the ``model_lib`` field as ``"RedPajama-INCITE-Instruct-3B-v1-q4f16_0"``.
+You can upload the model library ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/RedPajama-INCITE-Instruct-3B-v1-q4f16_0-metal.so``
and ask others to download it to the `dist/prebuilt/lib` directory so the CLI app can pick it up.


Step 4. Upload the Compiled Model Weights
-----------------------------------------

As a next step, we need to upload the model weights.
-We only need to upload the files in ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params``.
+We only need to upload the files in ``dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params``.
If you also want to host the compiled models on Hugging Face, you can follow the instructions below:

.. code:: shell
@@ -121,11 +128,11 @@ If you also want to host the compiled models on Hugging Face, you can follow the
git lfs install
git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo
cd my-redpajama3b-weight-huggingface-repo
-cp path/to/mlc-llm/dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/params/* .
+cp path/to/mlc-llm/dist/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/params/* .
git add . && git commit -m "Add redpajama-3b instruct model weights"
git push origin main
-Here we provide an `example distributed RedPajama-3B instruct model repository <https://huggingface.co/mlc-ai/RedPajama-INCITE-Instruct-3B-v1-q4f16_1/tree/main>`_ which you can refer to.
+Here we provide an `example distributed RedPajama-3B instruct model repository <https://huggingface.co/mlc-ai/RedPajama-INCITE-Instruct-3B-v1-q4f16_0/tree/main>`_ which you can refer to.
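Before pushing to Hugging Face, it can help to sanity-check that the ``params`` directory contains the artifacts listed in Step 1. A small sketch (the required file names follow the listing above; the helper name is ours):

```python
from pathlib import Path

def missing_upload_files(params_dir):
    """Return the expected params-directory artifacts that are absent."""
    params = Path(params_dir)
    required = ["mlc-chat-config.json", "ndarray-cache.json"]
    missing = [name for name in required if not (params / name).is_file()]
    # The weights are sharded into one or more params_shard_*.bin files.
    if not any(params.glob("params_shard_*.bin")):
        missing.append("params_shard_*.bin")
    return missing
```

An empty return value means the directory matches the layout shown earlier; anything listed is missing and would break downloads.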

---------------------------------

@@ -150,10 +157,10 @@ The steps needed to run models in CLI are similar to the steps to download the p
# Download the model weights
cd dist/prebuilt
-git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo RedPajama-INCITE-Instruct-3B-v1-q4f16_1
+git clone https://huggingface.co/my-huggingface-account/my-redpajama3b-weight-huggingface-repo RedPajama-INCITE-Instruct-3B-v1-q4f16_0
cd ../..
# Run CLI
-mlc_chat_cli --local-id RedPajama-INCITE-Instruct-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Instruct-3B-v1-q4f16_0
Download the Distributed Models and Run in iOS App
@@ -163,8 +170,8 @@ For iOS app, model libraries are statically packed into the app at the time of a
Therefore, the iOS app supports running any models whose model libraries are integrated into the app.
You can check the :ref:`list of supported model libraries <prebuilt-models-ios>`.

-To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_1`` model library.
-Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of `mlc-chat-config.json` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
+To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_0`` model library.
+Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of `mlc-chat-config.json` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_0``.

Now we can download the model weights in iOS app and run the model by following the steps below:

10 changes: 5 additions & 5 deletions docs/deploy/ios.rst
@@ -54,18 +54,18 @@ in the root of the MLC-LLM.
cd dist/prebuilt
git lfs install
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
Validate that the files and directories exist:

.. code:: bash
>>> ls -l ./dist/prebuilt/lib/*-iphone.tar
-./dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_1-iphone.tar
+./dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_0-iphone.tar
./dist/prebuilt/lib/vicuna-v1-7b-q3f16_0-iphone.tar
->>> ls -l ./dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+>>> ls -l ./dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
# chat config:
mlc-chat-config.json
# model weights:
@@ -107,15 +107,15 @@ run the following command under the ``./ios`` directory:
.. code:: bash
cd ./ios
-open ./prepare_params.sh # make sure builtin_list only contains "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+open ./prepare_params.sh # make sure builtin_list only contains "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
./prepare_params.sh
The outcome should be as follows:

.. code:: bash
>>> ls ./dist/
-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+RedPajama-INCITE-Chat-3B-v1-q4f16_0
Step 4. Build iOS App
^^^^^^^^^^^^^^^^^^^^^
2 changes: 1 addition & 1 deletion docs/get_started/mlc_chat_config.rst
@@ -6,7 +6,7 @@ Configure MLCChat in JSON
This page explains the components of a chat configuration and how to customize them for your own purposes.

Each MLC Chat runtime can be configured via an ``mlc-chat-config.json`` file under the directory of each compiled model (e.g.
-`RedPajama chat config <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/blob/main/mlc-chat-config.json>`__)
+`RedPajama chat config <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/blob/main/mlc-chat-config.json>`__)
which contains the chat configuration. You can customize the chat configuration by modifying this file.
Additionally, the runtimes also provide APIs to optionally override some of the configurations.

4 changes: 2 additions & 2 deletions docs/get_started/try_out.rst
@@ -51,9 +51,9 @@ and you can try out prebuilt models on the following platforms:
# You can try more models, for example:
# download prebuilt weights of RedPajama-3B
cd dist/prebuilt
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
-mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0
.. note::
If you are using Windows or Linux, make sure you have the latest Vulkan driver installed.
16 changes: 8 additions & 8 deletions docs/prebuilt_models.rst
@@ -40,12 +40,12 @@ Prebuilt Models for CLI
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/mlc-chat-vicuna-v1-7b-q3f16_0>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__
* - `rwkv-raven-1b5-q8f16_0`
- `RWKV <https://github.com/BlinkDL/RWKV-LM>`__
- * Weight storage data type: uint8
@@ -117,12 +117,12 @@ Prebuilt Models for iOS
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/mlc-chat-vicuna-v1-7b-q3f16_0>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__

The `downloadable iOS app <https://apps.apple.com/us/app/mlc-chat/id6448482937>`_ has builtin RedPajama-3B model support.
To add a model to the iOS app, follow the steps below:
@@ -184,7 +184,7 @@ For example, if you compile `OpenLLaMA-7B <https://github.com/openlm-research/op
- * Weight storage data type: int3
* Running data type: float16
* Symmetric quantization
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- GPT-NeoX
- * Weight storage data type: int4
* Running data type: float16
@@ -210,12 +210,12 @@ Prebuilt Models for Android
* Running data type: float16
* Symmetric quantization
- `link <https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int4>`__
-* - `RedPajama-INCITE-Chat-3B-v1-q4f16_1`
+* - `RedPajama-INCITE-Chat-3B-v1-q4f16_0`
- `RedPajama <https://www.together.xyz/blog/redpajama>`__
- * Weight storage data type: int4
* Running data type: float16
* Symmetric quantization
-- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1>`__
+- `link <https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0>`__

------------------

@@ -264,7 +264,7 @@ MLC-LLM supports the following model architectures:
* - ``minigpt``
- `MiniGPT <https://huggingface.co/Vision-CAIR/MiniGPT-4>`__
- `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/minigpt.py>`__
-
-
* - ``gpt_bigcode``
- `GPTBigCode <https://huggingface.co/docs/transformers/model_doc/gpt_bigcode>`__
- `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_bigcode.py>`__
10 changes: 5 additions & 5 deletions ios/MLCChat/app-config.json
@@ -1,18 +1,18 @@
{
"model_libs": [
"Llama-2-7b-chat-hf-q3f16_1",
-"RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"RedPajama-INCITE-Chat-3B-v1-q4f16_0"
],
"model_list": [
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
],
"add_model_samples": [
{
-"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1/",
-"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+"model_url": "https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/",
+"local_id": "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
}
]
}
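The ``app-config.json`` edits above preserve a simple invariant: in these configs, each entry's ``local_id`` matches a library name listed in ``model_libs``, so the statically packed library can be found. A hedged consistency check (the invariant is inferred from the configs shown here, and the helper name is ours):

```python
import json

def unresolved_model_libs(app_config_text):
    """Return local_ids that do not match any packed model_libs entry."""
    config = json.loads(app_config_text)
    libs = set(config.get("model_libs", []))
    entries = config.get("model_list", []) + config.get("add_model_samples", [])
    return [e["local_id"] for e in entries if e["local_id"] not in libs]
```

Running this over the edited config should return an empty list; a non-empty result flags an entry whose library is not packed into the app.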
2 changes: 1 addition & 1 deletion ios/prepare_params.sh
@@ -7,7 +7,7 @@ mkdir -p dist

declare -a builtin_list=(
"Llama-2-7b-chat-hf-q3f16_1"
-# "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
+# "RedPajama-INCITE-Chat-3B-v1-q4f16_0"
# "vicuna-v1-7b-q3f16_0"
# "rwkv-raven-1b5-q8f16_0"
# "rwkv-raven-3b-q8f16_0"
4 changes: 2 additions & 2 deletions site/index.md
@@ -82,9 +82,9 @@ mlc_chat_cli --local-id vicuna-v1-7b-q3f16_0

# Download prebuilt weights of RedPajama-3B
cd dist/prebuilt
-git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_1
+git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
cd ../..
-mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_1
+mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0

# Download prebuilt weights of RWKV-raven-1.5B/3B/7B
cd dist/prebuilt
