[DOCS] high level reorganization (#428)
tqchen authored Jun 17, 2023
1 parent 8e6ac81 commit 0dd9a12
Showing 12 changed files with 141 additions and 162 deletions.
2 changes: 1 addition & 1 deletion android/README.md
@@ -2,5 +2,5 @@

We are excited to share that we have enabled Android support for MLC-LLM. Check out [the instruction page](https://mlc.ai/mlc-llm/#android) for instructions to download and install our Android app. Check out the [announcing blog post](https://mlc.ai/blog/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices) for the technical details of how we made MLC-LLM possible on Android.

This folder contains the source code for building the Android application of MLC-LLM. Please check out our [documentation](https://mlc.ai/mlc-llm/docs/tutorials/runtime/android.html) on how to build and use MLC-LLM for Android.
This folder contains the source code for building the Android application of MLC-LLM. Please check out our [documentation](https://mlc.ai/mlc-llm/docs/deploy/android.html) on how to build and use MLC-LLM for Android.

@@ -1,14 +1,10 @@
Build Android Package
=====================
Android App
===========

.. contents:: Table of Contents
:local:
:depth: 2

.. image:: https://github.com/mlc-ai/mlc-llm/raw/main/site/gif/android-demo.gif
:width: 400
:align: center


The MLC LLM Android package can be installed in two ways: either from the pre-built package or by building it from source. If you're an Android user interested in trying out models, the pre-built package is the way to go. On the other hand, if you're a developer aiming to incorporate new features into the package, building the Android package from source is necessary.

90 changes: 88 additions & 2 deletions docs/tutorials/app_build/cli.rst → docs/deploy/cli.rst
@@ -1,5 +1,5 @@
Build MLCChat-CLI
=================
CLI and C++ API
===============

MLCChat CLI is the command line tool to run MLC-compiled LLMs out of the box. You may install it from the prebuilt package we provide, or compile it from source.

@@ -161,3 +161,89 @@ Once ``mlc_chat_cli`` is installed, you are able to run any MLC-compiled model o
...
Have fun chatting with MLC-compiled LLM!

Advanced Topic: Integrate Models in C++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

MLC-compiled models can be integrated into any C++ project using TVM's C/C++ API without going through the command line.

**Step 1. Create libmlc_llm.** Both static and shared libraries are available via the :ref:`CMake instructions <mlcchat_build_from_source>`, and the downstream developer may link either one into the C++ project as needed.

**Step 2. Call into the model in your C++ project.** Use the ``tvm::runtime::Module`` API from the TVM runtime to interact with MLC LLM without MLCChat.

.. note::
`DLPack <https://dmlc.github.io/dlpack/latest/c_api.html>`_ that comes with TVM is an in-memory representation of tensors in deep learning. It is widely adopted in
`NumPy <https://numpy.org/devdocs/reference/generated/numpy.from_dlpack.html>`_,
`PyTorch <https://pytorch.org/docs/stable/dlpack.html>`_,
`JAX <https://jax.readthedocs.io/en/latest/jax.dlpack.html>`_,
`TensorFlow <https://www.tensorflow.org/api_docs/python/tf/experimental/dlpack/>`_,
etc.

Using MLCChat APIs in your own programs
---------------------------------------

Below is a minimal example of using MLCChat C++ APIs.

.. code:: c++

   #define TVM_USE_LIBBACKTRACE 0
   #define DMLC_USE_LOGGING_LIBRARY <tvm/runtime/logging.h>

   #include <tvm/runtime/packed_func.h>
   #include <tvm/runtime/module.h>
   #include <tvm/runtime/registry.h>

   // DLPack is a widely adopted in-memory representation of tensors in deep learning.
   #include <dlpack/dlpack.h>

   #include <cassert>
   #include <string>

   void ChatModule(
       const DLDeviceType& device_type,  // from dlpack.h
       int device_id,                    // which one if there are multiple devices, usually 0
       const std::string& path_model_lib,
       const std::string& path_weight_config
   ) {
     // Step 0. Make sure the following files exist:
     // - model lib  : `$(path_model_lib)`
     // - chat config: `$(path_weight_config)/mlc-chat-config.json`
     // - weights    : `$(path_weight_config)/ndarray-cache.json`
     using tvm::runtime::PackedFunc;

     // Step 1. Call `mlc.llm_chat_create`.
     // This global function is registered once `libmlc_llm` is successfully
     // loaded or linked as a shared or static library.
     const PackedFunc* llm_chat_create = tvm::runtime::Registry::Get("mlc.llm_chat_create");
     assert(llm_chat_create != nullptr);
     tvm::runtime::Module mlc_llm = (*llm_chat_create)(
         static_cast<int>(device_type),
         device_id);
     // Step 2. Obtain all available functions in `mlc_llm`.
     PackedFunc prefill = mlc_llm->GetFunction("prefill");
     PackedFunc decode = mlc_llm->GetFunction("decode");
     PackedFunc stopped = mlc_llm->GetFunction("stopped");
     PackedFunc get_message = mlc_llm->GetFunction("get_message");
     PackedFunc reload = mlc_llm->GetFunction("reload");
     PackedFunc get_role0 = mlc_llm->GetFunction("get_role0");
     PackedFunc get_role1 = mlc_llm->GetFunction("get_role1");
     PackedFunc runtime_stats_text = mlc_llm->GetFunction("runtime_stats_text");
     PackedFunc reset_chat = mlc_llm->GetFunction("reset_chat");
     PackedFunc process_system_prompts = mlc_llm->GetFunction("process_system_prompts");
     // Step 3. Load the model lib containing the optimized tensor computation.
     tvm::runtime::Module model_lib = tvm::runtime::Module::LoadFromFile(path_model_lib);
     // Step 4. Inform MLC LLM to use `model_lib`.
     reload(model_lib, path_weight_config);
   }
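
Once the module is created, driving text generation typically follows a prefill/decode loop. Below is a minimal sketch of such a loop, assuming the conventional semantics of the ``prefill``, ``decode``, ``stopped``, and ``get_message`` functions obtained above; treat the exact signatures as assumptions rather than documented API.

.. code:: c++

   #include <iostream>
   #include <string>

   // Sketch of a generation loop; assumes `prefill`, `decode`, `stopped`, and
   // `get_message` were obtained via GetFunction as shown above.
   void Generate(tvm::runtime::PackedFunc prefill,
                 tvm::runtime::PackedFunc decode,
                 tvm::runtime::PackedFunc stopped,
                 tvm::runtime::PackedFunc get_message,
                 const std::string& prompt) {
     prefill(prompt);           // feed the user prompt into the model
     while (true) {
       bool done = stopped();   // TVMRetValue converts to bool
       if (done) break;
       decode();                // emit one token per call
     }
     std::string response = get_message();  // the accumulated response text
     std::cout << response << std::endl;
   }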
.. note::

   MLCChat CLI can be considered a `single-file <https://github.com/mlc-ai/mlc-llm/blob/main/cpp/cli_main.cc>`_ project that serves as a good example of using MLC LLM in any C++ project.


**Step 3. Set up compilation flags.** To compile the code above, you will have to set up compiler flags properly in your own C++ project:

- Make sure the following directories are on the include path, where ``TVM_HOME`` is ``/path/to/mlc-llm/3rdparty/tvm``:

- TVM runtime: ``${TVM_HOME}/include``
- Header-only DLPack: ``${TVM_HOME}/3rdparty/dlpack/include``
- Header-only DMLC core: ``${TVM_HOME}/3rdparty/dmlc-core/include``

- Make sure to link either the static or the shared ``libtvm_runtime`` library, which is provided via :ref:`CMake <mlcchat_build_from_source>`.
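
With these flags in place, invoking the helper defined earlier might look like the sketch below. The device enum comes from ``dlpack.h``; the artifact paths are hypothetical placeholders, so substitute the model lib and weight directory produced by your own build.

.. code:: c++

   int main() {
     // Hypothetical paths for illustration only; point these at your compiled
     // model lib and at the directory holding mlc-chat-config.json and
     // ndarray-cache.json.
     ChatModule(kDLCPU, /*device_id=*/0,
                "dist/my-model/my-model-lib.so",
                "dist/my-model/params");
     return 0;
   }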
17 changes: 7 additions & 10 deletions docs/tutorials/app_build/ios.rst → docs/deploy/ios.rst
@@ -1,18 +1,14 @@
Build iOS Package
=================
iOS App and Swift API
=====================

.. contents:: Table of Contents
:local:
:depth: 2

.. image:: https://mlc.ai/blog/img/redpajama/ios.gif
:width: 400
:align: center
The MLC LLM iOS app can be installed in two ways: through the pre-built package or by building it from source. If you're an iOS user looking to try out the models, the pre-built package is recommended. However, if you're a developer seeking to integrate new features into the package, building the iOS package from source is required.

The MLC LLM iOS package can be installed in two ways: through the pre-built package or by building it from source. If you're an iOS user looking to try out the models, the pre-built package is recommended. However, if you're a developer seeking to integrate new features into the package, building the iOS package from source is required.

Use Pre-built iOS Package
-------------------------
Use Pre-built iOS App
---------------------
The MLC LLM app is accessible on the App Store at no cost. You can download and explore it by simply clicking the button below:

.. image:: https://linkmaker.itunes.apple.com/assets/shared/badges/en-us/appstore-lrg.svg
@@ -137,8 +133,9 @@ Make sure to select a target device or simulator for the build.
After a successful build, you can run the iOS app on your device or
simulator to use the LLM model for text generation and processing.


Build your own App with MLC Swift API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------------------------

We also provide a Swift package that you can use to build
your own app. The package is located under `ios/MLCSwift`.
@@ -1,28 +1,30 @@
WebLLM Javascript APIs
======================
WebLLM and Javascript API
=========================

.. contents:: Table of Contents
:local:
:depth: 2

MLC-Chat also provides Javascript bindings (`WebLLM <https://www.npmjs.com/package/@mlc-ai/web-llm>`_) that allow you to use MLC-Chat in your web application.
WebLLM is an MLC Chat web runtime (`WebLLM <https://www.npmjs.com/package/@mlc-ai/web-llm>`_)
that allows you to build chat applications directly in the browser.

Install WebLLM NPM Package
--------------------------
Try out Prebuilt Webpage
------------------------

.. code:: bash
To get started, you can try out the `WebLLM prebuilt webpage <https://mlc.ai/webllm>`__.

npm i @mlc-ai/web-llm
A WebGPU-compatible browser and a local GPU are needed to run WebLLM.
You can download the latest Google Chrome and use `WebGPU Report <https://webgpureport.org/>`__
to verify the functionality of WebGPU on your browser.

🚧 API References
-----------------

Please refer to the source code of the ChatModule at `this link <https://github.com/mlc-ai/web-llm/blob/main/src/chat_module.ts>`_ to examine the function interface.

Use WebLLM API in your own program
----------------------------------
Use WebLLM NPM Package
----------------------

WebLLM is available as an npm package.
Below is a simple example of using the WebLLM API in your own TypeScript program.
You can follow the instructions in `get-started <https://github.com/mlc-ai/web-llm/tree/main/examples/get-started>`__
to run the example.

.. code:: typescript
@@ -63,3 +65,7 @@ Below is a simple example to use WebLLM API in your own Typescript program.
}
main();
Build a Chat App
----------------
You can find a complete chat app example in `simple-chat <https://github.com/mlc-ai/web-llm/tree/main/examples/simple-chat>`__.
4 changes: 2 additions & 2 deletions docs/tutorials/runtime/rest.rst → docs/deploy/rest.rst
@@ -1,5 +1,5 @@
Rest APIs
=========
Rest API
========

.. contents:: Table of Contents
:local:
File renamed without changes.
@@ -1,15 +1,17 @@
.. _get_started:

Try out MLC LLM on your device
==============================
Try out MLC Chat
================

We have prepared packages for you to try out MLC LLM locally, and you can try out prebuilt models on your device:
Welcome to MLC LLM. To get started, we have prepared prebuilt packages
for you to try out the MLC Chat app built with MLC LLM,
and you can try out prebuilt models on the following platforms:

.. tabs::

.. tab:: iOS

The MLC LLM app is now accessible on the App Store at no cost. You can download and explore it by simply clicking the button below:
The MLC Chat app is now accessible on the App Store at no cost. You can download and explore it by simply clicking the button below:

.. image:: https://linkmaker.itunes.apple.com/assets/shared/badges/en-us/appstore-lrg.svg
:width: 135
@@ -23,11 +25,11 @@ We have prepared packages for you to try out MLC LLM locally, and you can try ou
:width: 300
:align: center

MLC LLM on iOS
MLC Chat on iOS

.. tab:: Android

The MLC LLM Android app is free and available for download and can be tried out by simply clicking the button below:
The MLC Chat Android app is free and available for download and can be tried out by simply clicking the button below:

.. image:: https://seeklogo.com/images/D/download-android-apk-badge-logo-D074C6882B-seeklogo.com.png
:width: 135
@@ -69,7 +71,7 @@ We have prepared packages for you to try out MLC LLM locally, and you can try ou

Once the parameters have been fetched and stored in the local cache, you can begin interacting with the model without the need for an internet connection.

You can use `WebGPU Report <https://webgpureport.org/>`__ to verify the functionality of WebGPU on your browser.
A WebGPU-compatible browser and a local GPU are needed to run WebLLM. You can download the latest Google Chrome and use `WebGPU Report <https://webgpureport.org/>`__ to verify the functionality of WebGPU on your browser.

.. figure:: https://mlc.ai/blog/img/redpajama/web.gif
:width: 300
@@ -78,16 +80,4 @@ We have prepared packages for you to try out MLC LLM locally, and you can try ou
MLC LLM on Web


Customize MLC-Chat Configuration
--------------------------------

The behavior of the chat can be customized by modifying the chat configuration file. To learn more about customizing the chat configuration JSON, you can refer to the following tutorials which provide a detailed walkthrough:

- :doc:`/get_started/mlc_chat_config`

Model Prebuilts
---------------

To use different pre-built models, you can refer to the following tutorials:

- :doc:`/tutorials/prebuilts/prebuilt_models`
24 changes: 11 additions & 13 deletions docs/index.rst
@@ -1,11 +1,11 @@
👋 Welcome to MLC LLM
===================================
=====================

`Discord <https://discord.gg/9Xpy2HGBuD>`_ | `Demo <https://mlc.ai/mlc-llm>`_ | `GitHub <https://github.com/mlc-ai/mlc-llm>`_
`Discord <https://discord.gg/9Xpy2HGBuD>`_ | `GitHub <https://github.com/mlc-ai/mlc-llm>`_

🚧 This document is currently undergoing heavy construction.

👉 👉 :doc:`Try out MLC LLM on your devices. </get_started/try_out_mlc_llm>`
👉 👉 :doc:`Get started by trying out MLC Chat. </get_started/try_out>`

Machine Learning Compilation for LLM (MLC LLM) is a universal deployment solution that enables LLMs to run efficiently on consumer devices, leveraging native hardware acceleration like GPUs.

@@ -88,7 +88,6 @@ The underlying compiler techniques employed by MLC LLM are outlined in the follo
month = oct,
}
..
|

If you are interested in using Machine Learning Compilation in practice, we highly recommend the following course:
@@ -100,21 +99,20 @@ If you are interested in using Machine Learning Compilation in practice, we high
:caption: Get Started
:hidden:

get_started/try_out_mlc_llm.rst
get_started/try_out.rst
get_started/mlc_chat_config.rst
get_started/proj_overview.rst

.. toctree::
:maxdepth: 1
:caption: Build Apps
:caption: Build and Deploy Apps
:hidden:

tutorials/proj_overview.rst
tutorials/runtime/cpp.rst
tutorials/runtime/javascript.rst
tutorials/runtime/rest.rst
tutorials/app_build/cli.rst
tutorials/app_build/ios.rst
tutorials/app_build/android.rst
deploy/javascript.rst
deploy/rest.rst
deploy/cli.rst
deploy/ios.rst
deploy/android.rst

.. toctree::
:maxdepth: 1
2 changes: 1 addition & 1 deletion docs/tutorials/compilation/compile_models.rst
@@ -94,7 +94,7 @@ By executing the compile command above, we generate three parts that are needed
- the model library,
- and chat config.

We give a detailed introduction to these three parts in :doc:`the project overview page </tutorials/proj_overview>`.
We give a detailed introduction to these three parts in :doc:`the project overview page </get_started/proj_overview>`.
Before proceeding, you can check and identify each part using the commands below:

.. tabs::