[Refactor] Move text from README to documentation (#392)
yzh119 authored Jun 12, 2023
1 parent 2e643e6 commit 17081f9
Showing 11 changed files with 343 additions and 278 deletions.
46 changes: 2 additions & 44 deletions README.md
@@ -54,51 +54,9 @@ As a starting point, MLC generates GPU shaders for CUDA, Vulkan and Metal. It is

We rely heavily on the open-source ecosystem, more specifically on [TVM Unity](https://discuss.tvm.apache.org/t/establish-tvm-unity-connection-a-technical-strategy/13344), an exciting recent development in the TVM project that enables a Python-first interactive MLC development experience, allowing us to easily compose new optimizations in Python and incrementally bring our app to the environment of interest. We also leverage optimizations such as fused quantization kernels, first-class dynamic shape support, and diverse GPU backends.

## Building from Source
## Get Started with MLC-LLM

There are two ways to build MLC LLM from source: use a Hugging Face URL to download the model parameters directly, or use a local directory that already contains the parameters.

### Hugging Face URL

To download the weights from an existing Hugging Face repository for a supported model, you can follow the instructions below:

```shell
# Create a new conda environment and install dependencies
conda create -n mlc-llm-env python
conda activate mlc-llm-env
pip install torch transformers # Install PyTorch and Hugging Face transformers
pip install -I mlc_ai_nightly -f https://mlc.ai/wheels # Install TVM

# Install Git and Git-LFS if you haven't already.
# They are used for downloading the model weights from Hugging Face.
conda install git git-lfs
git lfs install

# Clone the MLC LLM repo
git clone --recursive https://github.com/mlc-ai/mlc-llm.git
cd mlc-llm

# Create the local build directory and compile the model
# This will automatically download the parameters, tokenizer, and config from Hugging Face
python build.py --hf-path=databricks/dolly-v2-3b
```

After a successful build, the compiled model will be available at `dist/dolly-v2-3b-q3f16_0` (the exact path varies with your model type and specified quantization). Follow the platform-specific instructions to build and run MLC LLM for [iOS](https://github.com/mlc-ai/mlc-llm/blob/main/ios/README.md), [Android](https://github.com/mlc-ai/mlc-llm/blob/main/android/README.md), and [CLI](https://github.com/mlc-ai/mlc-llm/tree/main/cpp/README.md).
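
The `dist/` path above follows a `<model>-<quantization>` naming convention. As a minimal sketch of how that path is composed (the model and scheme names are taken from the example above; the exact convention may vary across versions):

```shell
# Sketch: how the compiled-artifact path is assembled from the model name
# and quantization scheme (names from the example above; illustrative only).
MODEL=dolly-v2-3b
QUANT=q3f16_0                      # quantization scheme used by the build
OUT_DIR="dist/${MODEL}-${QUANT}"
echo "$OUT_DIR"                    # dist/dolly-v2-3b-q3f16_0
```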

### Local Directory

If you have a local directory that has the model parameters, the tokenizer, and a `config.json` file for a supported model, you can instead run the following build command:

```shell
# Create the local build directory and compile the model
python build.py --model=/path/to/local/directory

# If the model path is in the form of `dist/models/model_name`,
# we can simplify the build command to
# python build.py --model=model_name
```
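
The shortcut mentioned in the comments works because a bare model name can be resolved against `dist/models/`. A rough sketch of that resolution rule, under the assumption that `build.py` uses this layout (its actual logic may differ):

```shell
# Sketch of the path-shortcut rule: a bare model name is resolved against
# dist/models/ (assumed layout; build.py's actual logic may differ).
MODEL_ARG=model_name
case "$MODEL_ARG" in
  /*|dist/*) MODEL_PATH="$MODEL_ARG" ;;                # already a path
  *)         MODEL_PATH="dist/models/$MODEL_ARG" ;;    # bare-name shortcut
esac
echo "$MODEL_PATH"                                     # dist/models/model_name
```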

Similarly, the compiled model will be available at `dist/dolly-v2-3b-q3f16_0`, where the exact path varies with your model type and specified quantization. Follow the platform-specific instructions to build and run MLC LLM for [iOS](https://github.com/mlc-ai/mlc-llm/blob/main/ios/README.md), [Android](https://github.com/mlc-ai/mlc-llm/blob/main/android/README.md), and [CLI](https://github.com/mlc-ai/mlc-llm/tree/main/cpp/README.md).
Please check our [documentation](https://mlc.ai/mlc-llm/docs/) to get started with MLC-LLM.

## Links

106 changes: 2 additions & 104 deletions android/README.md
@@ -1,108 +1,6 @@
# Introduction to MLC-LLM for Android

<p align="center">
<img src="../site/gif/android-demo.gif" height="700">
</p>
# MLC-LLM Android

We are excited to share that we have enabled Android support for MLC-LLM. Check out [the instruction page](https://mlc.ai/mlc-llm/#android) to download and install our Android app. Check out the [announcing blog post](https://mlc.ai/blog/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices) for the technical details of how we brought MLC-LLM to Android.

## App Build Instructions

1. Install TVM Unity.
We have made some local changes to TVM Unity, so we use the mlc/relax repo. We will migrate the changes back to TVM Unity soon.

```shell
git clone https://github.com/mlc-ai/mlc-llm.git --recursive
cd mlc-llm/3rdparty/tvm
mkdir build
cp cmake/config.cmake build
```

In `build/config.cmake`, set `USE_OPENCL` and `USE_LLVM` to `ON`.

```shell
make -j
export TVM_NDK_CC=/path/to/android/ndk/clang
# For example:
# export TVM_NDK_CC=/Users/me/Library/Android/sdk/ndk/25.2.9519653/toolchains/llvm/prebuilt/darwin-x86_64/bin/aarch64-linux-android24-clang
```
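
The long clang path in the example follows a predictable layout under the NDK root. A hedged sketch for composing it; the host tag and API level below are illustrative assumptions that vary by machine:

```shell
# Sketch: compose the NDK clang path from the NDK root (layout used by
# recent NDKs; host tag and API level here are illustrative assumptions).
NDK_ROOT=${ANDROID_NDK:-/path/to/android/ndk}
HOST_TAG="$(uname | tr 'A-Z' 'a-z')-x86_64"   # e.g. darwin-x86_64, linux-x86_64
API_LEVEL=24
CC_PATH="$NDK_ROOT/toolchains/llvm/prebuilt/$HOST_TAG/bin/aarch64-linux-android${API_LEVEL}-clang"
echo "$CC_PATH"
```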

2. Install [Apache Maven](https://maven.apache.org/download.cgi) for our Java dependency management. Run command `mvn --version` to verify that Maven is correctly installed.

3. Build TVM4j (Java Frontend for TVM Runtime).

```shell
cd jvm; mvn install -pl core -DskipTests -Dcheckstyle.skip=true
```

4. Follow the instructions [here](https://github.com/mlc-ai/mlc-llm#building-from-source) to build the model using either a Hugging Face URL or a local directory. If opting for a local directory, you can follow the instructions [here](https://huggingface.co/docs/transformers/main/model_doc/llama) to get the original LLaMA weights in the HuggingFace format, and [here](https://github.com/lm-sys/FastChat#vicuna-weights) to get Vicuna weights.

```shell
# From mlc-llm project directory
python3 build.py --model path/to/vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
# If the model path is `dist/models/vicuna-v1-7b`,
# we can simplify the build command to
# python build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
```

5. Build libraries for Android app.

```shell
export ANDROID_NDK=/path/to/android/ndk
# For example:
# export ANDROID_NDK=/Users/me/Library/Android/sdk/ndk/25.2.9519653
cd android && ./prepare_libs.sh
```

6. Download [Android Studio](https://developer.android.com/studio), and use Android Studio to open folder `android/MLCChat` as the project.
1. Install Android SDK and NDK either inside Android Studio (recommended) or separately.
2. Specify the Android SDK and NDK path in file `android/MLCChat/local.properties` (if it does not exist, create one):

```shell
sdk.dir=/path/to/android/sdk
ndk.dir=/path/to/android/ndk
```

For example, a good `local.properties` can be:

```shell
sdk.dir=/Users/me/Library/Android/sdk
ndk.dir=/Users/me/Library/Android/sdk/ndk/25.2.9519653
```

7. Connect your Android device to your machine. In the menu bar of Android Studio, click `Build - Make Project`.

8. Once the build is finished, click `Run - Run 'app'`, and you will see the app launched on your phone.

<p align="center">
<img src="../site/img/android/android-studio.png">
</p>

## Use Your Own Model Weights

By following the instructions above, the installed app will download weights from our pre-uploaded HuggingFace repository. If you do not want to download the weights from the Internet and instead wish to use the weights you built, please follow the steps below.

* Steps 1-8: same as [section "App Build Instructions"](#app-build-instructions).

* Step 9. In `Build - Generate Signed Bundle / APK`, build the project to an APK for release. If it is the first time you generate an APK, you will need to create a key. Please follow [the official guide from Android](https://developer.android.com/studio/publish/app-signing#generate-key) for more instructions on this. After generating the release APK, you will get the APK file `app-release.apk` under `android/MLCChat/app/release/`.

* Step 10. Enable “USB debugging” in the developer options of your phone settings.

* Step 11. Install [Android SDK Platform-Tools](https://developer.android.com/studio/releases/platform-tools) for ADB (Android Debug Bridge). The platform tools will already be available under your Android SDK path if you have installed the SDK (e.g., at `/path/to/android-sdk/platform-tools/`). Add the platform-tools path to your `PATH` environment variable. Run `adb devices` to verify that ADB is installed correctly and that your phone is listed as a device.

* Step 12. In command line, run
```shell
adb install android/MLCChat/app/release/app-release.apk
```
to install the APK to your phone. If it errors with message `adb: failed to install android/MLCChat/app/release/app-release.apk: Failure [INSTALL_FAILED_UPDATE_INCOMPATIBLE: Existing package ai.mlc.mlcchat signatures do not match newer version; ignoring!]`, please uninstall the existing app and try `adb install` again.
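
A hedged sketch of that recovery path; the package id `ai.mlc.mlcchat` is taken from the error message above, and the snippet is guarded so it is a no-op on machines without `adb`:

```shell
# Sketch: uninstall the old, differently-signed build, then reinstall.
# Requires adb and a connected device to have any real effect.
PKG=ai.mlc.mlcchat
APK=android/MLCChat/app/release/app-release.apk
if command -v adb >/dev/null 2>&1; then
  adb uninstall "$PKG" || true   # ignore failure if the app is absent
  adb install "$APK" || true
else
  echo "adb not found; install Android platform-tools first"
fi
```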

* Step 13. Push the tokenizer and model weights to your phone through ADB.
```shell
adb push dist/models/vicuna-v1-7b/tokenizer.model /data/local/tmp/vicuna-v1-7b/tokenizer.model
adb push dist/vicuna-v1-7b/float16/params /data/local/tmp/vicuna-v1-7b/params
adb shell "mkdir -p /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/Download/"
adb shell "mv /data/local/tmp/vicuna-v1-7b /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/Download/vicuna-v1-7b"
```
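
After pushing, it can help to confirm the files landed where the app expects them. A small sketch, guarded so it does nothing without `adb`; the destination path is the one used in the commands above:

```shell
# Sketch: verify the pushed tokenizer and weights on the device
# (requires adb and a connected device; no-op otherwise).
DEST=/storage/emulated/0/Android/data/ai.mlc.mlcchat/files/Download/vicuna-v1-7b
if command -v adb >/dev/null 2>&1; then
  adb shell "ls $DEST/tokenizer.model" || true
  adb shell "ls $DEST/params | head -n 3" || true
else
  echo "adb not found; skipping verification"
fi
```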
This folder contains the source code for building the Android application of MLC-LLM. Please check out our [documentation](https://mlc.ai/mlc-llm/docs/tutorials/runtime/android.html) on how to build and use MLC-LLM for Android.

* Step 14. Everything is ready. Launch MLCChat on your phone and you will be able to use the app with your own weights. You will find that no weight download is needed.
4 changes: 2 additions & 2 deletions docs/index.rst
@@ -66,7 +66,7 @@ Tutorials
- :doc:`tutorials/runtime/javascript` for WebLLM
- :doc:`tutorials/runtime/android` for Android
- :doc:`tutorials/runtime/ios` for iOS
- :doc:`tutorials/runtime/rest` for REST API with Python
- :doc:`tutorials/runtime/python` for running models in Python
**Note.** The TVM Unity compiler is not a dependency for running any MLC-compiled model.
@@ -114,7 +114,7 @@ Community
tutorials/runtime/javascript.rst
tutorials/runtime/android.rst
tutorials/runtime/ios.rst
tutorials/runtime/rest.rst
tutorials/runtime/python.rst

.. toctree::
:maxdepth: 1
2 changes: 1 addition & 1 deletion docs/install/software-dependencies.rst
@@ -7,7 +7,7 @@
:depth: 2
:local:

While we have included most of the dependencies in our pre-built wheels/scripts, there are still some platform-dependent packages that you will need to install on your own. In most cases, you won't need all the packages listed on this page. If you're unsure about which packages are required for your specific use case, please check the :ref:`navigation panel <navigation>` first.
While we have included most of the dependencies in our pre-built wheels/scripts, there are still some platform-dependent packages that you will need to install on your own. In most cases, you won't need all the packages listed on this page.

Conda
-----
2 changes: 2 additions & 0 deletions docs/install/tvm.rst
@@ -16,6 +16,8 @@ To help our community to use Apache TVM Unity, a nightly prebuilt developer pack

Please visit the installation page for installation instructions: https://mlc.ai/package/.

.. _tvm-unity-build-from-source:

Option 2. Build from Source
---------------------------

159 changes: 159 additions & 0 deletions docs/tutorials/runtime/android.rst
@@ -1,2 +1,161 @@
🚧 Run Models in Android
========================

.. image:: https://github.com/mlc-ai/mlc-llm/raw/main/site/gif/android-demo.gif
:width: 400
:align: center

We are excited to share that we have enabled Android support for
MLC-LLM. Check out `the instruction
page <https://mlc.ai/mlc-llm/#android>`__ to download and install our
Android app. Check out the `announcing blog
post <https://mlc.ai/blog/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices>`__
for the technical details of how we brought MLC-LLM to Android.

App Build Instructions
----------------------

1. Install TVM Unity by following :ref:`tvm-unity-build-from-source`.

   Note that our pre-built wheels do not support OpenCL; you need to build TVM Unity
   from source and set ``USE_OPENCL`` to ``ON``.

2. Setup ``TVM_NDK_CC`` environment variable to NDK compiler path:

   .. code:: bash

      # Replace /path/to/android/ndk/clang with your NDK compiler path, e.g.
      # export TVM_NDK_CC=/Users/me/Library/Android/sdk/ndk/25.2.9519653/toolchains/llvm/prebuilt/darwin-x86_64/bin/aarch64-linux-android24-clang
      export TVM_NDK_CC=/path/to/android/ndk/clang

3. Install `Apache Maven <https://maven.apache.org/download.cgi>`__ for
our Java dependency management. Run command ``mvn --version`` to
verify that Maven is correctly installed.

4. Build TVM4j (Java Frontend for TVM Runtime) under the TVM Unity directory.

   .. code:: shell

      cd jvm; mvn install -pl core -DskipTests -Dcheckstyle.skip=true

5. Follow the instructions
   `here <https://github.com/mlc-ai/mlc-llm#building-from-source>`__ to
   build the model using either a Hugging Face URL or a local
   directory. If opting for a local directory, you can follow the
   instructions
   `here <https://huggingface.co/docs/transformers/main/model_doc/llama>`__
   to get the original LLaMA weights in the HuggingFace format, and
   `here <https://github.com/lm-sys/FastChat#vicuna-weights>`__ to get
   Vicuna weights.

   .. code:: shell

      # From mlc-llm project directory
      python3 build.py --model path/to/vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
      # If the model path is `dist/models/vicuna-v1-7b`,
      # we can simplify the build command to
      # python build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768

6. Build libraries for Android app.

   .. code:: shell

      # For example:
      # export ANDROID_NDK=/Users/me/Library/Android/sdk/ndk/25.2.9519653
      export ANDROID_NDK=/path/to/android/ndk
      cd android && ./prepare_libs.sh

7. Download `Android Studio <https://developer.android.com/studio>`__,
and use Android Studio to open folder ``android/MLCChat`` as the
project.

1. Install Android SDK and NDK either inside Android Studio
(recommended) or separately.

2. Specify the Android SDK and NDK path in file
``android/MLCChat/local.properties`` (if it does not exist, create
one):

      .. code:: shell

         sdk.dir=/path/to/android/sdk
         ndk.dir=/path/to/android/ndk

For example, a good ``local.properties`` can be:

      .. code:: shell

         sdk.dir=/Users/me/Library/Android/sdk
         ndk.dir=/Users/me/Library/Android/sdk/ndk/25.2.9519653

8. Connect your Android device to your machine. In the menu bar of
Android Studio, click ``Build - Make Project``.

9. Once the build is finished, click ``Run - Run 'app'``, and you will see the app launched on your phone.

.. image:: https://github.com/mlc-ai/mlc-llm/raw/main/site/img/android/android-studio.png

Use Your Own Model Weights
--------------------------

By following the instructions above, the installed app will download
weights from our pre-uploaded HuggingFace repository. If you do not want
to download the weights from the Internet and instead wish to use the
weights you built, please follow the steps below.

- Steps 1-9: same as `section "App Build
  Instructions" <#app-build-instructions>`__.

- Step 10. In ``Build - Generate Signed Bundle / APK``, build the
project to an APK for release. If it is the first time you generate
an APK, you will need to create a key. Please follow `the official
guide from
Android <https://developer.android.com/studio/publish/app-signing#generate-key>`__
for more instructions on this. After generating the release APK, you
will get the APK file ``app-release.apk`` under
``android/MLCChat/app/release/``.

- Step 11. Enable “USB debugging” in the developer options of your
  phone settings.

- Step 12. Install `Android SDK
  Platform-Tools <https://developer.android.com/studio/releases/platform-tools>`__
  for ADB (Android Debug Bridge). The platform tools will already be
  available under your Android SDK path if you have installed the SDK
  (e.g., at ``/path/to/android-sdk/platform-tools/``). Add the
  platform-tools path to your ``PATH`` environment variable. Run
  ``adb devices`` to verify that ADB is installed correctly and that
  your phone is listed as a device.

- Step 13. In command line, run the following command to install APK to your phone:

  .. code:: bash

     adb install android/MLCChat/app/release/app-release.apk

.. note::

   If it errors with the message

   .. code:: text

      adb: failed to install android/MLCChat/app/release/app-release.apk: Failure [INSTALL_FAILED_UPDATE_INCOMPATIBLE: Existing package ai.mlc.mlcchat signatures do not match newer version; ignoring!]

   please uninstall the existing app and try ``adb install`` again.

- Step 14. Push the tokenizer and model weights to your phone through
ADB.

  .. code:: bash

     adb push dist/models/vicuna-v1-7b/tokenizer.model /data/local/tmp/vicuna-v1-7b/tokenizer.model
     adb push dist/vicuna-v1-7b/float16/params /data/local/tmp/vicuna-v1-7b/params
     adb shell "mkdir -p /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/Download/"
     adb shell "mv /data/local/tmp/vicuna-v1-7b /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/Download/vicuna-v1-7b"

- Step 15. Everything is ready. Launch MLCChat on your phone and you
  will be able to use the app with your own weights. You will find
  that no weight download is needed.
