Deep Learning Open-Source
Huawei MindSpore AI Development Framework Page 1
The development state provides unified APIs (Python APIs) for all scenarios, including
unified model training, inference, and export APIs, as well as unified data processing,
enhancement, and format conversion APIs.
The development state also supports Graph High Level Optimization (GHLO), including
hardware-independent optimization (such as dead code elimination), automatic
parallelism, and automatic differentiation. These functions also support the design
concept of unified APIs for all scenarios.
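Dead code elimination, named above as one hardware-independent optimization, can be
illustrated on a toy graph. The sketch below is not MindSpore's IR or its pass
infrastructure. The dictionary-based graph format and the function name
eliminate_dead_nodes are hypothetical and serve only to show the idea: nodes that no
output depends on are dropped.

```python
def eliminate_dead_nodes(graph, outputs):
    """Return a copy of graph containing only nodes reachable from outputs.

    graph maps a node name to the list of node names it consumes as inputs.
    This format is a made-up illustration, not a real framework IR.
    """
    live = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in live:
            continue
        live.add(node)
        stack.extend(graph.get(node, []))
    # Keep the surviving nodes in their original order.
    return {name: inputs for name, inputs in graph.items() if name in live}


toy_graph = {
    "x": [],
    "w": [],
    "matmul": ["x", "w"],
    "unused_add": ["x", "x"],  # computed but never consumed by an output
    "relu": ["matmul"],
}
pruned = eliminate_dead_nodes(toy_graph, outputs=["relu"])
print(list(pruned))
```

Only the nodes that feed the relu output survive; unused_add is removed without changing
the result of the computation.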
MindSpore Intermediate Representation (IR) in the execution state has a native
computational graph and provides a unified IR. MindSpore performs pass optimization
based on the IR.
The execution state includes hardware-related optimization, parallel pipeline execution
layer, and in-depth optimization related to the combination of software and hardware
such as operator fusion and buffer fusion. These features support automatic
differentiation, automatic parallelism, and automatic optimization.
The deployment state uses the device-edge-cloud collaborative distributed architecture
with deployment, scheduling, and communication at the same layer, so it can implement
on-demand collaboration across all scenarios.
To put it simply, MindSpore integrates easy development (AI algorithm as code), efficient
execution (supporting Ascend/GPU optimization), and flexible deployment (all-scenario
on-demand collaboration).
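MindSpore proposes three technological innovations, namely a new programming paradigm, a
new execution mode, and a new collaboration mode, to help developers develop and deploy AI
applications more simply and more efficiently.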
Automatic differentiation is the soul of a deep learning framework. With it, we only
need to define forward propagation and can leave all the complex derivation and backward
propagation to the framework. Automatic differentiation generally refers to the
method of automatically calculating the derivative of a function. In machine learning,
these derivatives are used to update the weights. In the wider natural sciences, these
derivatives can also be used for various subsequent calculations. Figure 5-5 shows the
development history of automatic differentiation.
There are three automatic differentiation technologies in mainstream deep learning
frameworks at present:
Conversion based on static computational graphs: The network is converted into a static
data flow graph during compilation, and then the chain rule is applied to the data
flow graph to implement automatic differentiation. For example, TensorFlow can use static
compilation to optimize network performance, but network setup and debugging are
complex.
Conversion based on dynamic computational graphs: Operator overloading is
used to record the operations of the network during forward execution. Then, the chain rule
is applied to the dynamically generated data flow graph to implement automatic
differentiation. For example, PyTorch is easy to use, but it is difficult to achieve optimal
performance with it.
Conversion based on source code: Based on a functional programming framework, this
technology performs the automatic differentiation transformation on the IR (the program
expression used in the compilation process) through just-in-time (JIT) compilation. It
supports complex control flow scenarios, high-order functions, and closures. The automatic
differentiation technology of MindSpore is based on source code conversion. It also supports
automatic differentiation of control flows, so models are as easy to build as in PyTorch.
In addition, MindSpore can perform static compilation optimization on neural
networks, so the performance is excellent. Table 5-1 compares automatic differentiation
technologies and Figure 5-6 compares the performance and programmability.
[Table 5-1: only the MindSpore row is recoverable] SCT | √ | √ | √ | √ | MindSpore
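To make the dynamic-graph approach described above concrete, the following minimal
sketch records operations through operator overloading during the forward pass and then
applies the chain rule backwards. It illustrates the general technique only. It is not
how MindSpore works (MindSpore uses source code conversion), and the class name Var and
its methods are hypothetical.

```python
class Var:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Build a topological order, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


x = Var(3.0)
w = Var(2.0)
b = Var(1.0)
y = w * x + b * x  # only the forward pass is written; the graph is recorded
y.backward()
print(x.grad, w.grad, b.grad)
```

Here y = w*x + b*x, so dy/dx = w + b = 3, dy/dw = x = 3, and dy/db = x = 3. The user
writes only the forward expression, which is the convenience that automatic
differentiation provides.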
To put it simply, the automatic differentiation technology of MindSpore has the following
advantages:
1. In terms of programmability, the universal Python language is used, and it is based
on the primitive differentiability of IR.
2. In terms of performance, compilation is optimized, and inverse operators are
automatically optimized.
3. In terms of debugging, abundant visual interfaces are available, and dynamic
execution is supported.
Figure 5-10 shows a computer vision example. The ResNet50 V1.5 neural network is
trained on the ImageNet2012 dataset with the optimal batch size. The figure shows
that MindSpore on Ascend 910 trains much faster than other frameworks on other
mainstream training cards. This indicates that Huawei's collaborative optimization of
software and hardware enables efficient execution in the MindSpore framework.
Different hardware has different precision and speeds, as shown in Figure 5-11.
The diversity of hardware architectures leads to deployment differences and
performance uncertainties across scenarios. The separation of training and inference
leads to model isolation.
In the new mode, all-scenario on-demand collaboration can be implemented to obtain
better resource efficiency and privacy protection, ensuring security and reliability. A
model can be developed once and deployed across devices. Models can be large or small and
can be flexibly deployed, bringing a consistent development experience.
Three key technologies for the new collaboration mode in MindSpore are as follows:
The unified model IR absorbs upper-layer differences between language
scenarios. User-defined data structures are compatible, providing a consistent
deployment experience.
The underlying hardware of the framework is also developed by Huawei. Graph
optimization based on software and hardware collaboration can shield
scenario differences.
Device-cloud collaboration based on federated meta learning breaks the boundary between
device and cloud and implements real-time updates of the multi-device collaboration
model.
The ultimate effect of the three key technologies is that, in a unified
architecture, the deployment performance of models is consistent in all scenarios,
and the precision of personalized models is significantly improved, as shown in
Figure 5-12.
The vision and value of MindSpore is to provide an AI computing platform that features
efficient development, excellent performance, and flexible deployment, helping the
industry lower the threshold of AI development, release the computing power of Ascend
AI processors, and facilitate inclusive AI, as shown in Figure 5-13.
Executable File Installation Dependencies:
- Python 3.7.5
- For details about other dependency items, see the requirements.txt.
Source Code Compilation and Installation Dependencies:
Compilation Dependencies:
- Python 3.7.5
- wheel >= 0.32.0
2. Run the following command in the root directory of the source code to compile
MindSpore.
Before running the preceding command, ensure that the paths where the executable
files cmake and patch are stored have been added to the environment variable PATH.
The build.sh script executes the git clone command to obtain the code of
third-party dependencies. Ensure that the network settings of Git are
correct.
If the build machine has enough computing resources, add -j{Number of threads} to
increase the number of build threads. For example, bash build.sh -e cpu -z -j12.
3. Run the following commands to install MindSpore:
chmod +x build/package/mindspore-{version}-cp37-cp37m-linux_{arch}.whl
pip install build/package/mindspore-{version}-cp37-cp37m-linux_{arch}.whl
4. Run the following command. If no loading error message such as "No module
named 'mindspore'" is displayed, the installation is successful.
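The verification command itself is not reproduced in this text. As a hedged alternative,
the sketch below checks from Python whether a module can be located. The helper name
is_installed is hypothetical, and "mindspore" is assumed to be the import name, as in the
code samples of this document.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "mindspore" is the import name assumed here.
if is_installed("mindspore"):
    print("mindspore found")
else:
    print("mindspore is not installed in this environment")
```

A missing module is reported as a message here rather than as the "No module named"
error that a failing import would raise.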
asnumpy()
size()
dim()
dtype()
set_dtype()
tensor_add(other: Tensor)
tensor_mul(other: Tensor)
shape()
__str__ (conversion into strings)
The names of these tensor operations are self-explanatory. For example, asnumpy()
converts the tensor into a NumPy array, and tensor_add() adds another tensor to the
tensor.
Table 5-3 describes other components of MindSpore.
Component Description
- ExpandDims - Squeeze
- Concat - OnesLike
- Select - StridedSlice
- ScatterNd …
- AddN - Cos
- Sub - Sin
- Mul - LogicalAnd
- MatMul - LogicalNot
- RealDiv - Less
- ReduceMean - Greater
…
- Conv2D - MaxPool
- Flatten - AvgPool
- Softmax - TopK
- ReLU - SoftmaxCrossEntropy
- Sigmoid - SmoothL1Loss
- Pooling - SGD
- BatchNorm - SigmoidCrossEntropy
…
- ControlDepend
6. Unsupported syntax
Currently, the following syntax is not supported in network constructors: break, continue,
pass, raise, yield, async for, with, async with, assert, import and await.
import numpy as np
from mindspore import Tensor
from mindspore.nn import Cell
from mindspore.ops import operations as P

class ExpandDimsTest(Cell):
    def __init__(self):
        super(ExpandDimsTest, self).__init__()
        # Create the operator once, when the cell is constructed.
        self.expandDims = P.ExpandDims()

    def construct(self, input_x, input_axis):
        # The axis is passed in as an argument of the network.
        return self.expandDims(input_x, input_axis)

expand_dim = ExpandDimsTest()
input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32))
expand_dim(input_x, 0)
class ExpandDimsTest(Cell):
    def __init__(self, axis):
        super(ExpandDimsTest, self).__init__()
        self.expandDims = P.ExpandDims()
        self.axis = axis
6. Validate the model, load the test dataset and trained model, and verify the result
precision.
5.2.4.2 Preparation
Before you start, check whether MindSpore has been correctly installed. If MindSpore is
not installed, install it by referring to 5.2.1 Environment Setup. In addition, you should
have basic knowledge of Python programming and of mathematics such as probability and
matrices.
Now, let's start the MindSpore experience.
The MNIST dataset used in this example consists of 28 x 28 pixel grayscale images in
10 classes. It has a training set of 60,000 examples and a test set of 10,000 examples.
Download the MNIST dataset at http://yann.lecun.com/exdb/mnist/. This page provides
four download links for the dataset files. The first two files are required for training,
and the last two files are required for testing.
Download the files, decompress them, and store them in the workspace
directories ./MNIST_Data/train and ./MNIST_Data/test.
The directory is as follows:
└─MNIST_Data
├─test
│ t10k-images.idx3-ubyte
│ t10k-labels.idx1-ubyte
│
└─train
train-images.idx3-ubyte
train-labels.idx1-ubyte
To make the sample easier to use, a function that automatically downloads the dataset
is added to the sample script.
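Before training, it can help to confirm that the directory layout shown above is in
place. The following sketch does that with the standard library only. The helper name
missing_mnist_files and the constant EXPECTED_FILES are hypothetical and are not part of
the sample script.

```python
from pathlib import Path

# File names taken from the directory listing in this section.
EXPECTED_FILES = {
    "train": ["train-images.idx3-ubyte", "train-labels.idx1-ubyte"],
    "test": ["t10k-images.idx3-ubyte", "t10k-labels.idx1-ubyte"],
}


def missing_mnist_files(root):
    """Return the expected MNIST files that are absent under root."""
    root = Path(root)
    return [
        str(Path(split) / name)
        for split, names in EXPECTED_FILES.items()
        for name in names
        if not (root / split / name).is_file()
    ]


missing = missing_mnist_files("./MNIST_Data")
print("missing files:", missing)
```

An empty list means that all four decompressed files are where the sample expects them.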
Before compiling code, you need to learn basic information about the hardware and
backend required for MindSpore running.
You can use context.set_context() to configure the information required for running, such
as the running mode, backend information, and hardware information.
Import the context module and configure the required information.
import argparse
from mindspore import context

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='MindSpore LeNet Example')
    parser.add_argument('--device_target', type=str, default="Ascend", choices=['Ascend', 'GPU', 'CPU'],
                        help='device where the code will be implemented (default: Ascend)')
    args = parser.parse_args()
    context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target,
                        enable_mem_reuse=False)
    ...
The sample is configured to run in graph mode. Configure the hardware
information based on your environment. For example, if the code runs on the Ascend
AI processor, set --device_target to Ascend. If the code runs on the CPU or GPU, set
--device_target accordingly. For details about parameters, see the API description for
context.set_context().
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C
import mindspore.dataset.transforms.vision.c_transforms as CV
from mindspore.dataset.transforms.vision import Inter
from mindspore.common import dtype as mstype
    # apply DatasetOps
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)  # 10000 as in LeNet train script
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)
    return mnist_ds
where
batch_size: indicates the number of data records in each batch. In this example, each
batch contains 32 data records.
repeat_size: indicates the number of times the dataset is repeated.
Generally, perform the shuffle and batch operations, and then perform the repeat
operation to ensure that data during an epoch is unique.
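The effect of this ordering can be sketched in plain Python. The generator below is an
illustration under simplifying assumptions, not the MindSpore dataset API: the name
shuffle_batch_repeat is hypothetical, and the whole dataset is shuffled in memory
instead of through a shuffle buffer.

```python
import random


def shuffle_batch_repeat(records, batch_size, repeat_size, seed=0):
    """Yield batches: shuffle each pass, batch it, then repeat the passes."""
    rng = random.Random(seed)
    for _ in range(repeat_size):
        shuffled = list(records)
        rng.shuffle(shuffled)
        # Equivalent of drop_remainder=True: ignore the incomplete last batch.
        full = len(shuffled) - len(shuffled) % batch_size
        for start in range(0, full, batch_size):
            yield shuffled[start:start + batch_size]


batches = list(shuffle_batch_repeat(range(10), batch_size=4, repeat_size=2))
print(len(batches))        # 2 full batches per pass, 2 passes
steps_per_epoch = 60000 // 32
print(steps_per_epoch)     # batches per epoch for the MNIST training set
```

Because repetition happens last, no record appears twice within one pass. With 60,000
training images and a batch size of 32, one epoch has 1875 steps.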
MindSpore supports multiple data processing and enhancing operations, which are
usually used together. For details, see section "Data Processing and Data Enhancement".
You need to initialize the parameters of the fully connected layers and convolutional
layers.
TruncatedNormal: parameter initialization method. MindSpore supports multiple
parameter initialization methods, such as TruncatedNormal, Normal, and Uniform. For
details, see the description of the mindspore.common.initializer module in the MindSpore
API.
The following is the sample code for initialization:
import mindspore.nn as nn
from mindspore.common.initializer import TruncatedNormal

def weight_variable():
    """
    weight initial
    """
    return TruncatedNormal(0.02)
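For intuition about what a truncated normal initializer produces, the sketch below draws
samples from a normal distribution and redraws any value that falls too far from zero.
It assumes truncation at two standard deviations, a common convention. Whether MindSpore
uses exactly this bound is not stated in this text, and the function truncated_normal is
a hypothetical stand-in, not the MindSpore class.

```python
import random


def truncated_normal(sigma, size, bound=2.0, seed=None):
    """Draw size samples from N(0, sigma^2), redrawing values beyond bound*sigma."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < size:
        value = rng.gauss(0.0, sigma)
        if abs(value) <= bound * sigma:  # assumed cut-off: two standard deviations
            samples.append(value)
    return samples


weights = truncated_normal(sigma=0.02, size=1000, seed=0)
print(len(weights), max(abs(w) for w in weights) <= 0.04)
```

With sigma set to 0.02 as in weight_variable(), every initial weight under this
assumption lies within [-0.04, 0.04], which keeps initial activations small.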
To use MindSpore for neural network definition, inherit mindspore.nn.cell.Cell. Cell is the
base class of all neural networks such as Conv2d.
Define each layer of a neural network in the __init__() method in advance, and then
define the construct() method to complete the forward construction of the neural
network. According to the structure of the LeNet network, define the network layers as
follows:
class LeNet5(nn.Cell):
    """
    Lenet network structure
    """
    #define the operator required
    def __init__(self):
        super(LeNet5, self).__init__()
        self.batch_size = 32
        self.conv1 = conv(1, 6, 5)
        self.conv2 = conv(6, 16, 5)
        self.fc1 = fc_with_initialize(16 * 5 * 5, 120)
        self.fc2 = fc_with_initialize(120, 84)
        self.fc3 = fc_with_initialize(84, 10)
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
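The input size 16 * 5 * 5 of the first fully connected layer follows from the layer
sizes. The arithmetic below is a sketch under assumptions that this text does not show:
the input images are resized to 32 x 32, both convolutions use a 5 x 5 kernel with stride
1 and no padding, and each convolution is followed by the 2 x 2 max pooling. The helper
conv_out is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                   # assumed input width after resizing
size = conv_out(size, kernel=5)             # conv1: 32 -> 28
size = conv_out(size, kernel=2, stride=2)   # max_pool2d: 28 -> 14
size = conv_out(size, kernel=5)             # conv2: 14 -> 10
size = conv_out(size, kernel=2, stride=2)   # max_pool2d: 10 -> 5
flattened = 16 * size * size                # conv2 has 16 output channels
print(size, flattened)
```

Under these assumptions the flattened feature vector has 16 * 5 * 5 = 400 elements,
which matches the first argument of fc1. With an unresized 28 x 28 input, the same
arithmetic would give 16 * 4 * 4 instead.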
if __name__ == "__main__":
    ...
    #define the loss function
    net_loss = SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True, reduction='mean')
    ...
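What a softmax cross-entropy loss with integer labels and mean reduction computes can be
written out in a few lines. This is a plain-Python sketch of the mathematical definition,
not the MindSpore implementation. The function name softmax_cross_entropy is
hypothetical.

```python
import math


def softmax_cross_entropy(logits_batch, labels):
    """Mean cross-entropy between softmax(logits) and integer class labels."""
    losses = []
    for logits, label in zip(logits_batch, labels):
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_sum - logits[label])  # equals -log softmax(logits)[label]
    return sum(losses) / len(losses)


loss = softmax_cross_entropy([[2.0, 0.5, 0.1], [0.0, 0.0, 0.0]], [0, 2])
print(round(loss, 4))
```

The first sample is classified fairly confidently and contributes a small loss. The
second has uniform logits and contributes log(3), the loss of a three-class random guess.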
Define the optimizer.
Optimizers supported by MindSpore include Adam, AdamWeightDecay, StepLRPolicy,
and Momentum. The popular Momentum optimizer is used in this example.
if __name__ == "__main__":
    ...
    #learning rate setting
    lr = 0.01
    momentum = 0.9
    #create the network
    network = LeNet5()
    #define the optimizer
    net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
    ...
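The idea behind the momentum optimizer can be shown on a one-parameter problem. The
update below is one common formulation of momentum. The exact rule used by nn.Momentum
is not given in this text, and the function momentum_step is hypothetical.

```python
def momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """Apply one momentum update in place and return the parameters."""
    for i, grad in enumerate(grads):
        velocity[i] = momentum * velocity[i] + grad  # accumulate past gradients
        params[i] -= lr * velocity[i]                # step against the velocity
    return params


# Minimise f(w) = w^2, whose gradient is 2w, starting from w = 1.0.
params, velocity = [1.0], [0.0]
for _ in range(200):
    grads = [2.0 * params[0]]
    momentum_step(params, grads, velocity)
print(abs(params[0]) < 1e-3)
```

The velocity term carries information from earlier steps, which smooths the trajectory
compared with plain gradient descent at the same learning rate of 0.01 and momentum of
0.9 used in the example above.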
In the command used to run the training script:
lenet.py: indicates the script file that you write according to the tutorial.
--device_target CPU: specifies the running hardware platform. The parameter can be CPU,
GPU, or Ascend. Choose the value that matches the hardware you actually run on.
Loss values are output during training, as shown in the following figure. Although loss
values may fluctuate, in general they gradually decrease and the accuracy gradually
increases. Loss values may differ from run to run because training involves randomness.
The following is an example of loss printing during training:
checkpoint_lenet-1_1875.ckpt
where
checkpoint_lenet-1_1875.ckpt: is the saved model parameter file. The file name format is
checkpoint_{network name}-{epoch No.}_{step No.}.ckpt.
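The file name format can be parsed mechanically, for example to find the latest
checkpoint. The following sketch uses only the standard library. The names CKPT_PATTERN
and parse_checkpoint_name are hypothetical helpers and not part of MindSpore.

```python
import re

# checkpoint_{network name}-{epoch No.}_{step No.}.ckpt
CKPT_PATTERN = re.compile(
    r"^checkpoint_(?P<network>.+)-(?P<epoch>\d+)_(?P<step>\d+)\.ckpt$"
)


def parse_checkpoint_name(file_name):
    """Split a checkpoint file name into network name, epoch, and step."""
    match = CKPT_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {file_name}")
    return match["network"], int(match["epoch"]), int(match["step"])


parsed = parse_checkpoint_name("checkpoint_lenet-1_1875.ckpt")
print(parsed)
```

For the file shown above, this yields the network name lenet, epoch 1, and step 1875.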
if __name__ == "__main__":
    ...
    test_net(args, network, model, mnist_path)
where
load_checkpoint(): This API is used to load the checkpoint model parameter file and
return a parameter dictionary.
checkpoint_lenet-1_1875.ckpt: indicates the name of the saved checkpoint model file.
load_param_into_net: This API is used to load parameters to the network.
Use the run command to run your code script.
python lenet.py --device_target=CPU
where
lenet.py: indicates the script file that you write according to the tutorial.
--device_target CPU: specifies the running hardware platform. The parameter can be CPU,
GPU, or Ascend. You can specify the hardware platform based on the actual running
hardware platform.
Command output similar to the following is displayed:
The model accuracy data is displayed in the output content. In the example, the accuracy
reaches 97.4%, indicating a good model quality.
5.3 Summary
This section describes the Huawei-developed deep learning framework MindSpore. It first
introduces the three technological innovations of the MindSpore design concept (new
programming paradigm, new execution mode, and new collaboration mode) together with the
resulting advantages: easy development, efficient execution, and flexible deployment.
It then introduces the development and application of MindSpore,
using an actual image classification example to illustrate the development
procedure.
5.4 Quiz
1. MindSpore is a Huawei-developed AI computing framework that implements device-
edge-cloud on-demand collaboration across all scenarios. It provides unified APIs for
all scenarios and provides end-to-end capabilities for AI model development,
execution, and deployment in all scenarios. What are the main features of the
MindSpore architecture?
2. AI developers in the industry face challenges such as a high
development threshold, high operating costs, and difficult deployment. What are the
three technological innovations proposed by MindSpore to help developers develop
and deploy AI applications more easily and more efficiently?
3. Challenges to model execution under strong chip computing power include memory
wall problem, high interaction overhead, and difficult data supply. Some operations
are performed on the host, and some are performed on the device, so the interaction
overhead is much greater than the execution overhead, leading to the low
accelerator usage. What is the solution of MindSpore?
4. Use MindSpore to recognize MNIST handwritten digits.