Neural Magic’s DeepSparse is an inference runtime that can be deployed directly from a public Docker image. DeepSparse supports various CPU instance types and sizes, allowing you to quickly deploy the infrastructure that works best for your use case, based on cost and performance.
If you are interested in configuring and launching an instance with DeepSparse in Python, follow the step-by-step guide below.
We recommend installing the Azure CLI for easy access to Azure's functionality, although it is not required.
Create an Azure subscription to gain access to a subscription id.
git clone https://github.com/neuralmagic/deepsparse.git
cd deepsparse/examples/azure-vm
pip install -r requirements.txt
The azure-vm.py script creates an Azure resource group, launches an Ubuntu instance, and returns the public IP address so you can SSH into the instance after it finishes staging. It also contains a bash script that automatically installs Docker and pulls Neural Magic's public DeepSparse image into your instance.
To execute the script, run the following command and pass in your subscription id from step 1, your VM's location, the VM type, a resource group name, your virtual machine's name, and the password for logging in to your instance:
python azure-vm.py create-vm --subscription-id <SUBSCRIPTION-ID> --location <LOCATION> --vm-type <VM-TYPE> --group-name <GROUP-NAME> --vm-name <VM-NAME> --pw <PASSWORD>
To leverage CPU-optimized instances, we recommend using the Fsv2-series instances, which support AVX-512 instructions. Here's an example command for launching a VM in the US East location using an F4s-v2 instance (4 vCPUs and 8 GB of RAM):
python azure-vm.py create-vm --subscription-id <sub-id> --location eastus --vm-type Standard_F4s_v2 --group-name deepsparse-group --vm-name deepsparse-vm --pw Password123!
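If you prefer to launch the script from Python rather than a shell, the command above can be assembled as an argument list. This is a minimal sketch: the helper name `build_create_vm_command` is hypothetical and not part of the DeepSparse repository; only the flags come from the command shown above. Passing the arguments as a list avoids shell quoting issues with special characters in the password.

```python
import sys


def build_create_vm_command(subscription_id, location, vm_type,
                            group_name, vm_name, password):
    """Return the argument list for `azure-vm.py create-vm`.

    Hypothetical helper for illustration; the flag names mirror the
    CLI command shown in this guide.
    """
    return [
        sys.executable, "azure-vm.py", "create-vm",
        "--subscription-id", subscription_id,
        "--location", location,
        "--vm-type", vm_type,
        "--group-name", group_name,
        "--vm-name", vm_name,
        "--pw", password,
    ]
```

From the `deepsparse/examples/azure-vm` directory, the resulting list can be handed to `subprocess.run(cmd, check=True)`.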
PRO-TIP: The password passed into the CLI command must satisfy the following conditions:
- Contains an uppercase character.
- Contains a lowercase character.
- Contains a numeric digit.
- Contains a special character.
- Control characters are not allowed.
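To catch a rejected password before launching the VM, the conditions above can be checked locally. The following is a rough sketch, not Azure's own validation: the function name `password_problems` is hypothetical, "special character" is assumed to mean ASCII punctuation, and Azure may enforce additional rules (such as length limits) that are not listed here.

```python
import string
import unicodedata


def password_problems(password):
    """Return a list of the listed conditions that `password` violates."""
    problems = []
    if not any(c.isupper() for c in password):
        problems.append("missing uppercase character")
    if not any(c.islower() for c in password):
        problems.append("missing lowercase character")
    if not any(c.isdigit() for c in password):
        problems.append("missing numeric digit")
    # Assumption: "special character" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        problems.append("missing special character")
    # Unicode category "Cc" covers control characters such as tab or newline.
    if any(unicodedata.category(c) == "Cc" for c in password):
        problems.append("contains control character")
    return problems
```

An empty list means the password satisfies every condition listed above.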
After running the script, your instance's public IP address will be printed out in the terminal. Pass the IP address into the following CLI command to SSH into your running instance:
ssh testuser@<PUBLIC-IP>
After entering your password, get root access:
sudo su
We recommend giving your instance 2-3 minutes to finish executing the bash script. To make sure the DeepSparse image has been pulled, run the following command:
docker images
You should see a downloaded DeepSparse image. If the command shows an empty table, the bash script hasn't finished executing yet.
Once the image is downloaded, you can use DeepSparse. Here's an example of benchmarking a pruned-quantized version of BERT trained on SQuAD from Docker:
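The same check can be scripted by inspecting the text that `docker images` prints. This is a small sketch under stated assumptions: the function name `has_deepsparse_image` is hypothetical, the output is assumed to use Docker's default table layout (one header row, repository name in the first column), and the image's repository name is assumed to contain `deepsparse`.

```python
def has_deepsparse_image(docker_images_output, keyword="deepsparse"):
    """Return True if any row of `docker images` output names a matching repository."""
    lines = docker_images_output.strip().splitlines()
    # Skip the header row; an empty table has no further rows.
    for line in lines[1:]:
        fields = line.split()
        # The first column of the default table is the repository name.
        if fields and keyword in fields[0].lower():
            return True
    return False
```

On the instance, the input can be captured with `subprocess.run(["docker", "images"], capture_output=True, text=True).stdout` and the check repeated until it returns `True`.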
docker run -it deepsparse_docker deepsparse.benchmark zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned95_obs_quant-none -i [64,128] -b 64 -nstreams 1 -s sync
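To delete your instance when you are done with it, pass your subscription id, resource group name, and VM name into the following command:
python azure-vm.py delete-vm-rg --subscription-id <SUBSCRIPTION-ID> --group-name <GROUP-NAME> --vm-name <VM-NAME>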