Deep Learning Open-Source

Uploaded by

John Smith
Huawei AI Academy Training Materials

Deep Learning Open-Source Framework MindSpore

Huawei Technologies Co., Ltd.


Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their
respective holders.

Notice
The purchased products, services, and features are stipulated by the contract made between
Huawei and the customer. All or part of the products, services, and features described in this
document may not be within the purchase scope or the usage scope. Unless otherwise specified in
the contract, all statements, information, and recommendations in this document are provided
"AS IS" without warranties, guarantees, or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made
in the preparation of this document to ensure accuracy of the contents, but all statements,
information, and recommendations in this document do not constitute a warranty of any kind,
express, or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base Bantian, Longgang, Shenzhen 518129

Website: https://e.huawei.com
Huawei MindSpore AI Development Framework Page 1

Contents

5 Deep Learning Open-Source Framework MindSpore
5.1 MindSpore Development Framework
5.1.1 MindSpore Architecture
5.1.2 MindSpore Design Concept
5.1.3 MindSpore Advantages
5.2 MindSpore Development and Application
5.2.1 Environment Setup
5.2.2 MindSpore Components and Concepts
5.2.3 Constraints on Network Construction Using Python Source Code
5.2.4 Implementing an Image Classification Application
5.3 Summary
5.4 Quiz
5 Deep Learning Open-Source Framework MindSpore

This chapter describes MindSpore, Huawei's AI development framework, including its
structure and design roadmap, the features with which it addresses the problems and
difficulties of AI computing frameworks, and its development and application.

5.1 MindSpore Development Framework


MindSpore is a Huawei-developed AI computing framework that implements on-demand
device-edge-cloud collaboration across all scenarios. It provides unified APIs for all
scenarios and end-to-end capabilities for AI model development, running, and
deployment.
With the device-edge-cloud collaborative distributed architecture, MindSpore uses a
new paradigm of native differentiable programming and a new AI-Native execution mode
to achieve better resource efficiency, security, and reliability. In addition, it lowers the
threshold for AI development in the industry and unleashes the computing power of Ascend
processors, contributing to inclusive AI.

5.1.1 MindSpore Architecture


The MindSpore architecture consists of the development state, execution state, and
deployment state. It can be deployed on CPUs, GPUs, and Ascend processors
(Ascend 310/Ascend 910), as shown in Figure 5-1.
Figure 5-1 MindSpore architecture

The development state provides unified APIs (Python APIs) for all scenarios, including
unified model training, inference, and export APIs, as well as unified data processing,
enhancement, and format conversion APIs.
The development state also supports Graph High Level Optimization (GHLO), including
hardware-independent optimization (such as dead code elimination), automatic
parallelism, and automatic differentiation. These functions also support the design
concept of unified APIs for all scenarios.
In the execution state, MindSpore Intermediate Representation (IR) expresses
computational graphs natively and provides a unified IR. MindSpore performs pass
optimization based on the IR.
The execution state includes hardware-related optimization, parallel pipeline execution
layer, and in-depth optimization related to the combination of software and hardware
such as operator fusion and buffer fusion. These features support automatic
differentiation, automatic parallelism, and automatic optimization.
The deployment state uses the device-edge-cloud collaborative distributed architecture
with deployment, scheduling, and communication at the same layer, so it can implement
on-demand collaboration across all scenarios.
To put it simply, MindSpore integrates easy development (AI algorithm as code), efficient
execution (supporting Ascend/GPU optimization), and flexible deployment (all-scenario
on-demand collaboration).

5.1.2 MindSpore Design Concept


To address the challenges faced by AI developers in the industry, such as high
development threshold, high operation cost, and difficult deployment, MindSpore
proposes three technical innovations: a new programming paradigm, a new execution
mode, and a new collaboration mode. Together they help developers develop and deploy
AI applications more simply and efficiently.

5.1.2.1 New Programming Paradigm


The design concept of the new programming paradigm is put forward to deal with the
challenges of the development state.
For the development state, the challenges are as follows:
1. High requirements for skills: Developers are required to understand AI, have
theoretical knowledge related to computer systems and software, and have strong
mathematical skills, so there is a high development threshold.
2. Difficult tuning of the black box: It is difficult to optimize parameters because of the
black box and unexplainable features of AI algorithms.
3. Difficult parallel planning: With the current technology trend where the data volume
and the model are larger and larger, parallel computing is inevitable, but parallel
planning depends heavily on human experience. It requires the understanding of
data, model and the distributed system architecture.
The concept "AI algorithm as code" of the new programming paradigm lowers the
threshold for AI development. The new AI programming paradigm based on
mathematical native expressions allows algorithm experts to focus on AI innovation and
exploration, as shown in Figure 5-2.

Figure 5-2 New programming paradigm of MindSpore

5.1.2.2 Automatic Differentiation Technology


The core of the AI framework and one of the decisive factors of a programming
paradigm is the automatic differentiation technology used in the AI framework. The deep
learning model is trained through forward and backward propagation. As shown in
Figure 5-3, the forward propagation follows the direction of the black arrow, and the
backward propagation follows the direction of the red arrow. The backward propagation
is based on the chain rule of the composite function, as shown in Figure 5-4.
Figure 5-3 Forward propagation and backward propagation

Figure 5-4 Chain rule
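The relationship between forward propagation, backward propagation, and the chain rule can be illustrated with a small worked example. The sketch below is not taken from MindSpore; the function y = sin(x^2) and the names forward and backward are chosen here purely for illustration. It computes the derivative by the chain rule and compares it with a finite-difference estimate.

```python
import math

def forward(x):
    """Forward propagation: u = x^2, then y = sin(u)."""
    u = x * x
    y = math.sin(u)
    return u, y

def backward(x, u):
    """Backward propagation via the chain rule: dy/dx = (dy/du) * (du/dx)."""
    dy_du = math.cos(u)   # derivative of sin(u) with respect to u
    du_dx = 2.0 * x       # derivative of x^2 with respect to x
    return dy_du * du_dx

x = 1.5
u, y = forward(x)
grad = backward(x, u)

# Central finite difference as an independent check of the chain-rule result.
eps = 1e-6
numeric = (forward(x + eps)[1] - forward(x - eps)[1]) / (2 * eps)
print(grad, numeric)
```

A framework with automatic differentiation generates the equivalent of backward on its own, so only forward has to be written by hand.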

Automatic differentiation is the soul of the deep learning framework, with which we only
need to focus on forward propagation and leave all complex derivation and backward
propagation processes to the framework. Automatic differentiation generally refers to the
method of automatically calculating the derivative of a function. In machine learning,
these derivatives can be used to update the weight. In the wider natural sciences, these
derivatives can also be used for various subsequent calculations. Figure 5-5 shows the
development history of automatic differentiation.

Figure 5-5 Development history of automatic differentiation


There are three automatic differentiation technologies in mainstream deep learning
frameworks at present:
Conversion based on static computational graphs: The network is converted into a static
data flow graph during compilation, and then the chain rule is applied to the data
flow graph to implement automatic differentiation. For example, TensorFlow can use
static compilation to optimize network performance, but building and debugging a
network is complex.
Conversion based on dynamic computational graphs: Operator overloading is used to
record the operations of the network during forward execution. The chain rule is then
applied to the dynamically generated data flow graph to implement automatic
differentiation. For example, PyTorch is easy to use, but optimal performance is difficult
to achieve.
Conversion based on source code: Built on a functional programming framework, this
technology performs the automatic differentiation transformation on the IR (the program
representation used during compilation) through just-in-time (JIT) compilation. It
supports complex control flow scenarios, higher-order functions, and closures. The
automatic differentiation technology of MindSpore is based on source code conversion. It
also supports automatic differentiation of control flows, so models are as easy to build
as in PyTorch. In addition, MindSpore can perform static compilation optimization on
neural networks, so performance is excellent. Table 5-1 compares the automatic
differentiation technologies, and Figure 5-6 compares their performance and programmability.
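To make the second approach concrete, the following is a minimal sketch of operator-overloading automatic differentiation on a dynamically recorded graph. It is a toy written for this section, not the mechanism of PyTorch or MindSpore; the class Var and the function backward are hypothetical names, and only scalar addition and multiplication are covered.

```python
class Var:
    """Scalar value that records how it was computed (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        # d(a + b)/da = 1, d(a + b)/db = 1
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b, d(a * b)/db = a
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Apply the chain rule over the recorded graph in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x = Var(3.0)
y = Var(4.0)
z = x * y + x * x      # the graph is recorded while this line executes
backward(z)
print(z.value, x.grad, y.grad)  # dz/dx = y + 2x, dz/dy = x
```

Because the graph is rebuilt on every forward execution, ordinary Python control flow works without special handling, which is the ease-of-use benefit described above. The cost is that the framework never sees the whole program ahead of time and therefore has fewer opportunities for global optimization.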

Table 5-1 Comparison of automatic differentiation technology


Automatic Differentiation Type   General   Fast        Portable    Differentiable   Typical Framework

Graph                            No        √           √           Partially        TensorFlow

OO                               √         Partially   Partially   √                PyTorch

SCT                              √         √           √           √                MindSpore
Figure 5-6 Performance and programmability comparison of automatic differentiation technology

To put it simply, the automatic differentiation technology of MindSpore has the following
advantages:
1. Programmability: Networks are written in general-purpose Python, and differentiation
builds on the native differentiability of the IR.
2. Performance: Compilation is optimized, and backward operators are automatically
optimized.
3. Debugging: Abundant visual interfaces are available, and dynamic execution is
supported.

5.1.2.3 Automatic Parallelism


Currently, deep learning models are so large that they must be parallelized, and the
parallelization is done manually. Developers have to design the model segmentation and
take the cluster topology into account, which makes it difficult to develop the model and
to ensure and optimize performance.
MindSpore automatic parallelism takes serial algorithm code and automatically implements
distributed parallel training while maintaining high performance.
Generally, parallel training can be divided into model parallel training and data parallel
training. Data parallel training is easy to understand: each sample completes forward
propagation independently, and the results are then aggregated. In contrast, model
parallel training is more complex, because developers must manually write every part
that needs to be parallelized using "parallel thinking".
MindSpore provides a key innovative technology, that is, automatic graph segmentation.
The entire graph is segmented based on the input and output data dimensions of the
operator, that is, each operator in the graph is segmented to the clusters to complete
parallel computing. Data parallelism and model parallelism are combined. Cluster
topology awareness scheduling allows the cluster topology to be perceived, and
automatic scheduling of subgraphs to be executed to minimize the communication
overhead, as shown in Figure 5-7.
MindSpore automatic parallelism aims to build a training mode that integrates data
parallelism, model parallelism, and hybrid parallelism. It automatically selects a model
segmentation mode with the minimum cost to implement automatic distributed parallel
training.
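The data parallel case described above can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not MindSpore code: the model is a single weight w with a mean squared error loss, the workers are simulated sequentially in one process, and the batch size is assumed to be divisible by the number of workers. The names grad_mse and data_parallel_grad are hypothetical.

```python
def grad_mse(w, samples):
    """Gradient of mean((w * x - t)^2) with respect to w over (x, t) pairs."""
    return sum(2.0 * (w * x - t) * x for x, t in samples) / len(samples)


def data_parallel_grad(w, samples, num_workers):
    """Split the batch into equal shards, compute local gradients, then average."""
    shard_size = len(samples) // num_workers  # assumes an even split
    shards = [samples[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    local_grads = [grad_mse(w, shard) for shard in shards]  # independent per worker
    return sum(local_grads) / num_workers                   # aggregation step


batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 1.5
full = grad_mse(w, batch)
parallel = data_parallel_grad(w, batch, num_workers=2)
print(full, parallel)
```

With equal-sized shards, the averaged local gradients match the full-batch gradient up to floating-point rounding, which is why data parallelism needs no change to the model itself. Model parallelism has no such simple recipe, and that gap is what automatic graph segmentation is meant to close.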
Figure 5-7 Automatic graph segmentation

The fine-grained operator segmentation of MindSpore is complex. However, developers
only need to use the top-level API for efficient computing and do not have to be
concerned with the underlying implementation.
In general, the new programming paradigm not only implements "AI algorithm as code",
but also lowers the threshold for AI development and enables efficient development and
debugging. For example, the new programming paradigm can efficiently complete
automatic differentiation, and achieve automatic parallelization and debug-mode switch
with one line.
A developer implemented Transformer, a classic algorithm in the natural language
processing (NLP) field, using the MindSpore framework. During development and
debugging, the combination of dynamic and static execution made the debugging process
transparent and simple. The final MindSpore implementation has about 2000 lines of code,
roughly 20% fewer than the 2500 lines in TensorFlow, yet efficiency is improved by
over 50%.

5.1.2.4 New Execution Mode


The design concept of the new execution mode is proposed to meet the challenges of the
execution state.
The challenges of the execution state are as follows:
1. AI computing complexity and computing power diversity: CPU cores, Cube units, and
Vector units; scalar, vector, and tensor operations; mixed-precision operations; dense
and sparse matrix calculation.
2. When multiple cards are running, performance does not increase linearly with the
number of nodes, and the parallel control overhead is high.
The new execution mode uses the Ascend Native execution engine: On-Device execution
is available, as shown in Figure 5-8. The mode offloads graphs to devices, and
implements deep graph optimization, maximizing the computing power of Ascend.
Figure 5-8 On-Device execution

Two core technologies of On-Device execution are as follows:


1. Graph sink execution maximizes the computing power of Ascend. The challenges to
model execution under strong chip computing power include the memory wall problem,
high interaction overhead, and difficult data supply. When some operations are
performed on the host and the others on the device, the interaction overhead is much
larger than the execution overhead, resulting in low accelerator usage.
MindSpore uses chip-oriented deep graph optimization to minimize the synchronization
waiting time and maximize the parallelism of data, computing, and communication. It
sinks the entire data and computational graph to the Ascend chip to achieve the best
effect. Training performance improves tenfold compared with on-host graph
scheduling.
2. Massive distributed gradient aggregation is driven by data. The challenges to
distributed gradient aggregation under strong chip computing power are the
synchronization overhead of central control and the frequent synchronization required
when a single ResNet50 iteration takes only 20 ms. The traditional method needs three
synchronizations to complete All Reduce, while the data-driven method performs
All Reduce autonomously without causing control overhead. MindSpore uses adaptive
graph segmentation optimization driven by gradient data to implement
decentralized All Reduce, consistent gradient aggregation, and full pipelining of
computing and communication, as shown in Figure 5-9.
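For readers unfamiliar with All Reduce, the sketch below simulates ring All Reduce, one widely used scheme that works without a central coordinator. It is a generic illustration and not MindSpore's data-driven implementation. The function name ring_all_reduce is hypothetical, the workers are simulated in a single process, and each gradient vector is assumed to have exactly one element per worker so that every chunk is a single number.

```python
def ring_all_reduce(worker_grads):
    """Sum gradient vectors across n workers arranged in a ring.

    Assumes each vector has exactly n elements (one chunk per worker).
    Returns the per-worker vectors after the operation.
    """
    n = len(worker_grads)
    data = [list(g) for g in worker_grads]

    # Phase 1 (reduce-scatter): in step s, worker i sends chunk (i - s) % n to
    # its right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        sends = [(i, (i - s) % n, data[i][(i - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            data[(i + 1) % n][chunk] += value

    # Worker i now holds the complete sum for chunk (i + 1) % n.
    # Phase 2 (all-gather): the completed chunks travel once around the ring.
    for s in range(n - 1):
        sends = [(i, (i + 1 - s) % n, data[i][(i + 1 - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            data[(i + 1) % n][chunk] = value

    return data


result = ring_all_reduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
print(result)
```

Each worker talks only to its neighbour, so no central node has to collect and redistribute the gradients.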
Figure 5-9 Decentralized and autonomous All Reduce

Figure 5-10 shows an example of computer vision. The neural network ResNet50 V1.5 is
used for training based on ImageNet2012 dataset with the optimal batch size. It shows
that the speed of the MindSpore framework based on Ascend 910 is much higher than
that in other frameworks and other mainstream training cards. Therefore, the
optimization technology of Huawei software and hardware collaboration can be used to
implement efficient operation in the MindSpore framework.

Figure 5-10 Comparison between MindSpore and TensorFlow

5.1.2.5 New Collaboration Mode


The design concept of the new collaboration mode addresses the challenges of the
deployment state:
• Requirements, objectives, and constraints vary across device, edge, and cloud
application scenarios. For example, models on mobile phones are expected to be
lightweight, while the cloud may require higher precision.
• Different hardware has different precision and speeds, as shown in Figure 5-11.
• The diversity of hardware architectures leads to deployment differences and
performance uncertainties across scenarios. The separation of training and inference
leads to model isolation.
In the new mode, all-scenario on-demand collaboration can be implemented to obtain
better resource efficiency and privacy protection, ensuring security and reliability. A
model can be developed once and deployed across devices. Models can be large or small
and can be flexibly deployed, bringing a consistent development experience.
Three key technologies for the new collaboration mode in MindSpore are as follows:
• The unified model IR adapts to upper-layer differences in different language
scenarios. User-defined data structures are compatible, providing a consistent
deployment experience.
• The underlying hardware of the framework is also developed by Huawei. The graph
optimization technology based on software and hardware collaboration can shield
scenario differences.
• Device-cloud collaboration based on federated meta learning breaks the boundary
between device and cloud, and implements real-time update of the multi-device
collaboration model.
The ultimate effect of the three key technologies is that, in a unified architecture, the
deployment performance of models in all scenarios is consistent, and the precision of
personalized models is significantly improved, as shown in Figure 5-12.

Figure 5-11 Deployment challenge


Figure 5-12 On-Demand collaboration and consistent development

The vision and value of MindSpore is to provide an AI computing platform that features
efficient development, excellent performance, and flexible deployment, helping the
industry lower the threshold of AI development, release the computing power of Ascend
AI processors, and facilitate inclusive AI, as shown in Figure 5-13.

Figure 5-13 MindSpore vision and value

5.1.3 MindSpore Advantages


5.1.3.1 Easy Development
• Automatic differentiation: unified programming of network and operator,
functional/algorithm native expression, and automatic generation of backward network
operators
• Automatic parallelism: automatic model segmentation achieves the optimal
efficiency of model parallelism
• Automatic optimization: the same set of code is used for dynamic and static graphs
5.1.3.2 Efficient Execution

• On-Device execution leverages the great computing power of Ascend.
• The pipeline is optimized to maximize the parallel linearity.
• It implements deep graph optimization and adapts to the computing power and
precision of the AI core.

5.1.3.3 Flexible Deployment

• Device-edge-cloud collaborative computing enables better privacy protection.
• The unified device-edge-cloud architecture implements one-time development and
on-demand deployment.
MindSpore is on a par with open-source frameworks in the industry, and gives priority to
Huawei-developed chips and cloud services.
Upward: It can interconnect with third-party frameworks and third-party ecosystems
(training frontend interconnection and inference model interconnection) through Graph
IR. In addition, it can be extended by developers.
Downward: It can interconnect with third-party chips, helping developers broaden
MindSpore application scenarios and expand the AI ecosystem.

5.2 MindSpore Development and Application


5.2.1 Environment Setup
5.2.1.1 Overall Installation Requirements
Overall installation requirements: Ubuntu 16.04 (or later) and Python 3.7.5 (or later) are
required. MindSpore can be installed for the CPU, GPU, or Ascend environment. The
installation methods include direct installation using the installation package,
installation through source code compilation, and Docker installation.
The following example uses the CPU environment to describe the installation procedure.
Table 5-2 lists the system requirements and software dependencies of the MindSpore
CPU version.

Table 5-2 MindSpore requirements and software dependencies


Version                          MindSpore Master

Operating System                 Ubuntu 16.04 (or later) x86_64

Executable File Installation     - Python 3.7.5
Dependencies                     - For details about other dependency items, see the
                                   requirements.txt.

Source Code Compilation and      Compilation Dependencies:
Installation Dependencies        - Python 3.7.5
                                 - wheel >= 0.32.0
                                 - GCC 7.3.0
                                 - CMake >= 3.14.1
                                 - patch >= 2.5
                                 - Autoconf >= 2.64
                                 - Libtool >= 2.4.6
                                 - Automake >= 1.15.1
                                 Installation Dependencies:
                                 Same as the executable file installation dependencies.

5.2.1.2 Direct Installation Using the Pip Installation Package


pip install mindspore-cpu

5.2.1.3 Installation Using Source Code Compilation


1. Download the source code from the code repository.

git clone https://gitee.com/mindspore/mindspore.git

2. Run the following command in the root directory of the source code to compile
MindSpore.

bash build.sh -e cpu -z -j4

• Before running the preceding command, ensure that the paths where the executable
files cmake and patch are stored have been added to the environment variable PATH.
• The build.sh script executes the git clone command to obtain the code of
third-party dependencies. Ensure that the network settings of Git are correct.
• If the build machine has sufficient performance, use -j{Number of threads} to
increase the number of threads. For example, bash build.sh -e cpu -z -j12.
3. Run the following commands to install MindSpore:

chmod +x build/package/mindspore-{version}-cp37-cp37m-linux_{arch}.whl
pip install build/package/mindspore-{version}-cp37-cp37m-linux_{arch}.whl

4. Run the following command. If no loading error message such as "No module
named 'mindspore'" is displayed, the installation is successful.

python -c 'import mindspore'
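The same check can be written as a small script, which is convenient in setup or CI code. The sketch below uses only the Python standard library; the helper name is_installed is introduced here for illustration. It looks up the module without importing it, so it reports a missing package without raising an error.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("mindspore"):
    print("mindspore found")
else:
    print("No module named 'mindspore'")
```

This check only shows that the package can be found. Running the import itself, as in step 4, additionally confirms that its native libraries load.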


5.2.1.4 Docker Installation


docker pull mindspore/mindspore-cpu:0.1.0-alpha

5.2.2 MindSpore Components and Concepts


5.2.2.1 Components
In MindSpore, data is also stored in tensors. Common tensor operations:

asnumpy()
size()
dim()
dtype()
set_dtype()
tensor_add(other: Tensor)
tensor_mul(other: Tensor)
shape()
__str__()  # conversion into strings

These tensor operations are largely self-explanatory. For example, asnumpy() converts
the tensor into a NumPy array, and tensor_add() adds another tensor to the tensor.
Table 5-3 describes other components of MindSpore.

Table 5-3 MindSpore components and description


Component       Description

model_zoo       Definition of common network models

communication   Data loading module, which defines the dataloader and dataset
                and processes data such as images and texts

dataset         Dataset processing module, which can read and pre-process data

common          Defines tensor, parameter, dtype, and initializer.

context         Defines the context class and sets model running parameters, such
                as graph and PyNative switching modes.

akg             Automatic differentiation and custom operator library

nn              Defines MindSpore cells (neural network units), loss functions, and
                optimizers.

ops             Defines basic operators and registers reverse operators.

train           Training model and summary function modules

utils           Utilities, mainly for parameter verification. They are used inside the
                framework.

5.2.2.2 Programming Concept: Operation


Common operations in MindSpore:
• array: array-related operators

- ExpandDims      - Squeeze
- Concat          - OnesLike
- Select          - StridedSlice
- ScatterNd       …

• math: math-related operators

- AddN            - Cos
- Sub             - Sin
- Mul             - LogicalAnd
- MatMul          - LogicalNot
- RealDiv         - Less
- ReduceMean      - Greater

• nn: network operators

- Conv2D          - MaxPool
- Flatten         - AvgPool
- Softmax         - TopK
- ReLU            - SoftmaxCrossEntropy
- Sigmoid         - SmoothL1Loss
- Pooling         - SGD
- BatchNorm       - SigmoidCrossEntropy

• control: control operators

- ControlDepend

5.2.2.3 Programming Concept: Cell


1. A cell defines the basic module for calculation. Cell objects can be executed
directly.
① __init__ initializes and verifies components such as parameters, cells, and primitives.
② construct defines the execution process. In graph mode, it is compiled into a graph
for execution and is subject to specific syntax restrictions.
③ bprop (optional) defines the backward propagation of a custom module. If this
function is undefined, automatic differentiation is used to calculate the backward
propagation of the construct part.
2. The cells predefined in MindSpore mainly include common loss functions
(SoftmaxCrossEntropyWithLogits and MSELoss), common optimizers (Momentum,
SGD, and Adam), and common network wrappers, such as TrainOneStepCell for
network gradient calculation and update, and WithGradCell for gradient
calculation.
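The calling convention described in item 1 can be sketched in plain Python. The class ToyCell below is a hypothetical stand-in written for this explanation, not mindspore.nn.Cell: it performs no graph compilation, parameter management, or automatic differentiation. It only shows how __init__ assembles components, how construct defines the computation, and why a cell object can be called like a function.

```python
class ToyCell:
    """Hypothetical stand-in for a cell: calling the object runs construct()."""

    def construct(self, *inputs):
        raise NotImplementedError("subclasses define the computation here")

    def __call__(self, *inputs):
        # A real framework would compile or trace construct() at this point.
        return self.construct(*inputs)


class Scale(ToyCell):
    def __init__(self, factor):
        self.factor = factor          # components are set up in __init__

    def construct(self, x):
        return x * self.factor        # the execution process lives in construct


class AddBias(ToyCell):
    def __init__(self, bias):
        self.bias = bias

    def construct(self, x):
        return x + self.bias


class Net(ToyCell):
    """A cell composed of other cells, mirroring how networks are nested."""

    def __init__(self):
        self.scale = Scale(2.0)
        self.add_bias = AddBias(1.0)

    def construct(self, x):
        return self.add_bias(self.scale(x))


net = Net()
out = net(3.0)   # the cell object is executed directly
print(out)
```

Keeping all computation inside construct is what allows a framework to treat that one method as the unit it compiles in graph mode, which is also why the syntax restrictions in section 5.2.3 apply to it.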

5.2.2.4 Programming Concept: MindSpore IR


1. MindSpore IR (MindIR) is a compact, efficient, and flexible graph-based functional IR
that can represent functional semantics such as free variables, high-order functions,
and recursion. It carries the program through automatic differentiation (AD) and
compilation optimization.
2. Each graph represents a function definition and consists of ParameterNode,
ValueNode, and ComplexNode (CNode).
3. Edges represent def-use relationships.
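The three node kinds can be pictured with a small data-structure sketch. The classes below reuse the names from the text but are simplified illustrations, not MindSpore's actual IR classes; the fields and the evaluate function are assumptions made here. The graph represents f(x, y) = (x + 2) * y, and the inputs of each CNode are its def-use edges.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ParameterNode:
    """A formal parameter of the function the graph defines."""
    name: str


@dataclass(frozen=True)
class ValueNode:
    """A constant embedded in the graph."""
    value: float


@dataclass(frozen=True)
class CNode:
    """An operation applied to the results of other nodes."""
    op: Callable
    inputs: tuple  # each input is a def-use edge to the defining node


def evaluate(node, env):
    """Interpret the graph for concrete parameter values (illustration only)."""
    if isinstance(node, ParameterNode):
        return env[node.name]
    if isinstance(node, ValueNode):
        return node.value
    args = [evaluate(inp, env) for inp in node.inputs]
    return node.op(*args)


x = ParameterNode("x")
y = ParameterNode("y")
two = ValueNode(2.0)
add = CNode(operator.add, (x, two))   # uses the definitions of x and two
mul = CNode(operator.mul, (add, y))   # uses the definitions of add and y

result = evaluate(mul, {"x": 3.0, "y": 4.0})
print(result)
```

Because the whole function is available as data, a compiler can transform it before execution, for example to derive a gradient graph or to remove unused nodes.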

5.2.3 Constraints on Network Construction Using Python Source Code
MindSpore can compile user source code written in Python syntax into computational
graphs, and can convert common functions or instances of classes inherited from nn.Cell
into computational graphs. Currently, MindSpore cannot convert arbitrary Python
source code into computational graphs. Therefore, there are constraints on source code
compilation, including syntax constraints and network definition constraints. These
constraints may change as MindSpore evolves.

5.2.3.1 Syntax Constraints


1. Supported Python data types
① Number: The value can be int, float, or bool. Complex numbers are not supported.
② String
③ List: Currently, only the append method is supported. Updating a list will generate
a new list.
④ Tuple
⑤ Dictionary: Only keys of the String type are supported.
2. MindSpore extended data types
Tensor: The tensor variables must be defined instances.
3. Function parameters
① Default parameter value: Currently, the data types int, float, bool, None, str, tuple,
list, and dict are supported, whereas Tensor is not supported.
② Variable parameter: Currently, functions with variable parameters cannot be used
for backward propagation.
③ Key-value pair parameter: Currently, functions with key-value pair parameters
cannot be used for backward propagation.
④ Variable key-value pair parameter: Currently, functions with variable key-value
pair parameters cannot be used for backward propagation.
4. Statement types, as shown in Table 5-4.

Table 5-4 MindSpore and Python statement comparison


Statement              Compared with Python

for                    Nested for loops are partially supported. Iteration sequences
                       must be tuples or lists.

while                  Nested while loops are partially supported.

if                     Same as that in Python. The input of the if condition must be a
                       constant.

def                    Same as that in Python.

Assignment statement   Multi-level subscript accesses of lists and dictionaries cannot
                       be used as left values.

5. Operators, as shown in Table 5-5.

Table 5-5 Supported types of MindSpore operators


Operator   Supported Type

+          Scalar, Tensor, and tuple

-          Scalar and Tensor

*          Scalar and Tensor

/          Scalar and Tensor

[]         The operation object type can be list, tuple, or Tensor. Multi-level
           subscript accesses can be used as right values but not as left values.
           The index type cannot be Tensor. For details about the access constraints
           for the tuple and Tensor types, see the description in the slicing
           operations.

6. Unsupported syntax
Currently, the following syntax is not supported in network constructors: break, continue,
pass, raise, yield, async for, with, async with, assert, import and await.
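A list like this can be checked mechanically before code is handed to the graph compiler. The sketch below is a hypothetical pre-check built on Python's standard ast module, not a MindSpore tool. The function name find_unsupported is made up for illustration, and treating "from ... import" the same as "import" is an assumption.

```python
import ast

# Statement and expression node types for the syntax listed above.
UNSUPPORTED = {
    ast.Break: "break",
    ast.Continue: "continue",
    ast.Pass: "pass",
    ast.Raise: "raise",
    ast.Yield: "yield",
    ast.AsyncFor: "async for",
    ast.With: "with",
    ast.AsyncWith: "async with",
    ast.Assert: "assert",
    ast.Import: "import",
    ast.ImportFrom: "import",   # assumption: handled like a plain import
    ast.Await: "await",
}


def find_unsupported(source):
    """Return sorted (line number, keyword) pairs for unsupported syntax."""
    found = []
    for node in ast.walk(ast.parse(source)):
        keyword = UNSUPPORTED.get(type(node))
        if keyword is not None:
            found.append((node.lineno, keyword))
    return sorted(found)


example = """
def construct(self, x):
    for i in (1, 2):
        if i == 2:
            break
    assert x
    return x
"""

print(find_unsupported(example))
```

A check like this catches only the keyword-level restrictions. The type and operator constraints in the preceding tables depend on run-time values and have to be verified by the compiler itself.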
5.2.3.2 Network Definition Constraints


1. Instance types on the entire network
① Common Python function with the @ms_function decorator
② Cell subclass inherited from nn.Cell.
2. Network input types
① The training data input parameters of the entire network must be of the Tensor
type.
② The generated ANF graph cannot contain the following constant nodes: string
constants, constants with nested tuples, and constants with nested lists.
3. Network graph optimization
During graph optimization at the ME frontend, the dataclass, dictionary, list, and
key-value pair types are converted to tuple types, and the corresponding operations
are converted to tuple operations.
4. Network construction components, as shown in Table 5-6.

Table 5-6 Constraints on network construction components


Category                             Content

Cell instance                        mindspore/nn/* and customized Cell

Member function of a Cell instance   Member functions of other classes can be called in
                                     the construct function of Cell.

Function                             Custom Python functions and system functions listed
                                     in the preceding content.

Dataclass instance                   Class decorated with @dataclass

Primitive operator                   mindspore/ops/operations/*

Composite operator                   mindspore/ops/composite/*

Operator generated by constexpr      Use the value generated by @constexpr to calculate
                                     operators.

5.2.3.3 Other Constraints


The input parameters of the construct function of the entire network and the parameters
of functions decorated with ms_function are generalized during graph compilation, so
they cannot be passed to an operator as constant input. For example, the following usage
is incorrect:

import numpy as np
from mindspore import Tensor
from mindspore.nn import Cell
from mindspore.ops import operations as P

class ExpandDimsTest(Cell):
    def __init__(self):
        super(ExpandDimsTest, self).__init__()
        self.expandDims = P.ExpandDims()

    def construct(self, input_x, input_axis):
        # Incorrect: input_axis is a network input, so it is generalized into a variable.
        return self.expandDims(input_x, input_axis)

expand_dim = ExpandDimsTest()
input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32))
expand_dim(input_x, 0)

In the example, ExpandDimsTest is a single-operator network with two inputs: input_x
and input_axis. The second input of the ExpandDims operator must be a constant, because
input_axis is needed to deduce the output dimension of the operator during graph
compilation. However, input_axis, as a network parameter input, is generalized into a
variable whose value cannot be determined. As a result, the output dimension of the
operator cannot be deduced, and graph compilation fails. Therefore, any input required
for deduction in the graph compilation phase must be a constant. In the API description,
operator parameters that require constant input are marked "const input is needed".
The correct way is to pass the required value directly, or a member variable of the class,
as the constant input of the operator in the construct function, as shown in the following
example:

class ExpandDimsTest(Cell):
    def __init__(self, axis):
        super(ExpandDimsTest, self).__init__()
        self.expandDims = P.ExpandDims()
        self.axis = axis

    def construct(self, input_x):
        return self.expandDims(input_x, self.axis)

axis = 0
expand_dim = ExpandDimsTest(axis)
input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32))
expand_dim(input_x)

5.2.4 Implementing an Image Classification Application


5.2.4.1 Overview
This section uses a practical example to demonstrate the basic functions of MindSpore.
Most users can complete the exercise in 20 to 30 minutes. The process is simple and
basic; extend it as needed for more advanced and complex applications.
You can download the complete executable sample code from the following link:
https://gitee.com/mindspore/docs/blob/master/tutorials/tutorial_code/lenet.py
During the practice, a simple image classification function is implemented. The overall
process is as follows:
1. Load the required dataset. The MNIST dataset is used in this example.
2. Define a network. The LeNet network is used in this example.
3. Define the loss function and optimizer.
4. Load the dataset and perform training. After the training is complete, view the result
and save the model file.
5. Load the saved model for inference.
6. Validate the model: load the test dataset and the trained model, and verify the
accuracy of the results.

5.2.4.2 Preparation
Before you start, check whether MindSpore has been correctly installed. If MindSpore is
not installed, install it by referring to 5.2.1 Environment Setup. In addition, you should
have basic knowledge of Python programming and of mathematics such as probability and
matrices.
Now, let's start the MindSpore experience.

Step 1 Download a dataset.

The MNIST dataset used in this example consists of 28 x 28 pixel grayscale images in 10
classes. It has a training set of 60,000 examples and a test set of 10,000 examples.
Download the MNIST dataset at http://yann.lecun.com/exdb/mnist/. This page provides
four download links for dataset files. The first two files are used for training, and the
last two are used for testing.
Download the files, decompress them, and store them in the workspace
directories ./MNIST_Data/train and ./MNIST_Data/test.
The directory is as follows:

└─MNIST_Data
├─test
│ t10k-images.idx3-ubyte
│ t10k-labels.idx1-ubyte

└─train
train-images.idx3-ubyte
train-labels.idx1-ubyte

For convenience, the sample script includes a function that downloads the dataset
automatically.
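If you place the files manually, a quick check of the directory layout can save a failed run later. The following is a small sketch using only the standard library; the helper name missing_mnist_files is hypothetical, and the file names are taken from the directory tree shown above.

```python
import os

# File names taken from the directory tree above.
EXPECTED_FILES = {
    "train": ["train-images.idx3-ubyte", "train-labels.idx1-ubyte"],
    "test": ["t10k-images.idx3-ubyte", "t10k-labels.idx1-ubyte"],
}

def missing_mnist_files(root):
    """Return the relative paths of expected MNIST files that are not found under root."""
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            if not os.path.isfile(os.path.join(root, subdir, name)):
                missing.append(os.path.join(subdir, name))
    return missing

# An empty list means that all four files are in place.
print(missing_mnist_files("./MNIST_Data"))
```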

Step 2 Import Python libraries and modules.

Before you start, import the required Python libraries.
At this point only the os library is needed. For ease of understanding, other required
libraries are introduced when they are used.

import os

Step 3 Configure the running information.

Before writing code, you need to know basic information about the hardware and
backend on which MindSpore runs.
You can use context.set_context() to configure the information required for running, such
as the running mode, backend information, and hardware information.
Import the context module and configure the required information.

import argparse
from mindspore import context

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='MindSpore LeNet Example')
    parser.add_argument('--device_target', type=str, default="Ascend", choices=['Ascend', 'GPU', 'CPU'],
                        help='device where the code will be implemented (default: Ascend)')
    args = parser.parse_args()
    context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target,
                        enable_mem_reuse=False)
    ...
The sample runs in graph mode. Configure the hardware information based on your
environment. For example, if the code runs on the Ascend AI processor, set
--device_target to Ascend. If the code runs on the CPU or GPU, set --device_target
accordingly. For details about the parameters, see the API description of
context.set_context().
----End
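The command-line handling in Step 3 can be tried on its own, without MindSpore installed. The sketch below rebuilds only the argparse part of the sample (the build_parser wrapper is added here for illustration) and passes explicit argument lists instead of reading the real command line. It also shows that argparse accepts both the --device_target=CPU and the --device_target CPU spellings.

```python
import argparse

def build_parser():
    """Build the same --device_target option as the sample script."""
    parser = argparse.ArgumentParser(description="MindSpore LeNet Example")
    parser.add_argument("--device_target", type=str, default="Ascend",
                        choices=["Ascend", "GPU", "CPU"],
                        help="device where the code will be implemented (default: Ascend)")
    return parser

parser = build_parser()
print(parser.parse_args([]).device_target)                           # Ascend
print(parser.parse_args(["--device_target=CPU"]).device_target)      # CPU
print(parser.parse_args(["--device_target", "GPU"]).device_target)   # GPU
```

A value outside the choices list makes argparse print a usage message and exit, so a typo in the device name is caught before any MindSpore code runs.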

5.2.4.3 Data Preprocessing


Datasets are important for training. A good dataset can effectively improve training
accuracy and efficiency. Generally, before loading a dataset, you need to perform some
operations on the dataset.
Define the dataset and data operations.
Define the create_dataset() function to create a dataset. In this function, define the data
augmentation and processing operations to be performed:
1. Define the dataset.
2. Define parameters required for data augmentation and processing.
3. Generate corresponding data augmentation operations according to the parameters.
4. Use the map() mapping function to apply data operations to the dataset.
5. Process the generated dataset.

import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C
import mindspore.dataset.transforms.vision.c_transforms as CV
from mindspore.dataset.transforms.vision import Inter
from mindspore.common import dtype as mstype

def create_dataset(data_path, batch_size=32, repeat_size=1,
                   num_parallel_workers=1):
    """ create dataset for train or test
    Args:
        data_path: Data path
        batch_size: The number of data records in each group
        repeat_size: The number of times the dataset is repeated
        num_parallel_workers: The number of parallel workers
    """
    # define dataset
    mnist_ds = ds.MnistDataset(data_path)

    # define operation parameters
    resize_height, resize_width = 32, 32
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081

    # define map operations
    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # resize images to (32, 32)
    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)  # normalize images
    rescale_op = CV.Rescale(rescale, shift)  # rescale images
    hwc2chw_op = CV.HWC2CHW()  # change shape from (height, width, channel) to (channel, height, width) to fit network.
    type_cast_op = C.TypeCast(mstype.int32)  # change data type of label to int32 to fit network

    # apply map operations on the label and image columns
    mnist_ds = mnist_ds.map(input_columns="label", operations=type_cast_op,
                            num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(input_columns="image", operations=resize_op,
                            num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(input_columns="image", operations=rescale_op,
                            num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(input_columns="image", operations=rescale_nml_op,
                            num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(input_columns="image", operations=hwc2chw_op,
                            num_parallel_workers=num_parallel_workers)

    # apply DatasetOps
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)  # 10000 as in LeNet train script
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)

    return mnist_ds

where
batch_size: indicates the number of data records in each group. Currently, each group
contains 32 data records.
repeat_size: indicates the number of times the dataset is repeated.
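The two rescale steps in create_dataset() can be read as one normalization. The sketch below is plain Python, not the MindSpore operator; it assumes that a rescale operation computes output = input * factor + shift, and it takes 0.1307 and 0.3081 (the constants in the code above) to be the mean and standard deviation commonly quoted for MNIST. Under these assumptions, applying both steps equals (x / 255 - 0.1307) / 0.3081.

```python
import math

def rescale(x, factor, shift):
    """Assumed form of a rescale operation: output = x * factor + shift."""
    return x * factor + shift

MEAN, STD = 0.1307, 0.3081  # constants that appear in create_dataset()

def preprocess_pixel(x):
    """Apply the two rescale steps in the same order as the map operations."""
    y = rescale(x, 1.0 / 255.0, 0.0)             # map [0, 255] to [0, 1]
    return rescale(y, 1 / STD, -1 * MEAN / STD)  # subtract the mean, divide by the std

for pixel in (0, 128, 255):
    combined = preprocess_pixel(pixel)
    direct = (pixel / 255.0 - MEAN) / STD
    assert math.isclose(combined, direct, rel_tol=1e-9, abs_tol=1e-12)
    print(pixel, round(combined, 4))
```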
Generally, perform the shuffle and batch operations, and then perform the repeat
operation to ensure that data during an epoch is unique.
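The effect of the ordering can be illustrated with plain Python lists. This is a simplified model, not the MindSpore dataset API: it assumes that repeating after shuffling and batching means running the shuffle and batch steps once per pass over the data. The function names are hypothetical.

```python
import random

def batch(records, batch_size):
    """Group records into full batches, dropping an incomplete final batch."""
    full = len(records) - len(records) % batch_size
    return [records[i:i + batch_size] for i in range(0, full, batch_size)]

def shuffle_batch_repeat(records, batch_size, repeat_size, seed=0):
    """Shuffle and batch one pass over the data, then repeat that pipeline."""
    rng = random.Random(seed)
    batches = []
    for _ in range(repeat_size):
        epoch = list(records)
        rng.shuffle(epoch)
        batches.extend(batch(epoch, batch_size))
    return batches

def repeat_shuffle_batch(records, batch_size, repeat_size, seed=0):
    """Repeat first: copies of one record can be shuffled into the same epoch."""
    rng = random.Random(seed)
    pool = list(records) * repeat_size
    rng.shuffle(pool)
    return batch(pool, batch_size)

records = list(range(8))
good = shuffle_batch_repeat(records, batch_size=4, repeat_size=2)
print(sorted(good[0] + good[1]) == records)  # True: each record appears once per epoch
mixed = repeat_shuffle_batch(records, batch_size=4, repeat_size=2)
print(mixed[0] + mixed[1])                   # may contain duplicates and omissions
```

In the first ordering, the batches of one epoch always cover each record exactly once. In the second, duplicates of a record can fall into the same epoch while other records are missing from it.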
MindSpore supports multiple data processing and augmentation operations, which are
usually used together. For details, see section "Data Processing and Data Enhancement".

5.2.4.4 Defining the Network


The LeNet network is relatively simple. In addition to the input layer, the LeNet network
has seven layers, including two convolutional layers, two down-sampling layers (pooling
layers), and three fully connected layers. Each layer contains different numbers of
training parameters, as shown in Figure 5-14:

Figure 5-14 LeNet-5 structure

You need to initialize the fully connected layers and convolutional layers.
TruncatedNormal: a parameter initialization method. MindSpore supports multiple
parameter initialization methods, such as TruncatedNormal, Normal, and Uniform. For
details, see the description of the mindspore.common.initializer module in the MindSpore
API.
The following is the sample code for initialization:

import mindspore.nn as nn
from mindspore.common.initializer import TruncatedNormal

def weight_variable():
    """
    weight initial
    """
    return TruncatedNormal(0.02)

def conv(in_channels, out_channels, kernel_size, stride=1, padding=0):
    """
    conv layer weight initial
    """
    weight = weight_variable()
    return nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size,
                     stride=stride, padding=padding, weight_init=weight, has_bias=False,
                     pad_mode="valid")

def fc_with_initialize(input_channels, out_channels):
    """
    fc layer weight initial
    """
    weight = weight_variable()
    bias = weight_variable()
    return nn.Dense(input_channels, out_channels, weight, bias)
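To give an intuition for what a truncated normal initializer produces, the following standard-library sketch draws values from a normal distribution and rejects any that fall too far from zero. The cutoff of two standard deviations is an assumption made for illustration (it is a common convention); check the MindSpore API description for the bounds that TruncatedNormal actually uses. The function truncated_normal is hypothetical.

```python
import random

def truncated_normal(sigma, n, rng, bound=2.0):
    """Draw n samples from N(0, sigma^2), rejecting values beyond bound * sigma."""
    samples = []
    while len(samples) < n:
        value = rng.gauss(0.0, sigma)
        if abs(value) <= bound * sigma:  # assumed cutoff, see the note above
            samples.append(value)
    return samples

rng = random.Random(0)
weights = truncated_normal(0.02, 1000, rng)
print(min(weights), max(weights))  # both stay within [-0.04, 0.04]
```

Keeping initial weights in a narrow band around zero avoids the occasional large outlier that an untruncated normal distribution would produce.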

To define a neural network in MindSpore, inherit from mindspore.nn.cell.Cell. Cell is the
base class of all neural networks and of network layers such as Conv2d.
Define each layer of a neural network in the __init__() method in advance, and then
define the construct() method to complete the forward construction of the neural
network. According to the structure of the LeNet network, define the network layers as
follows:

class LeNet5(nn.Cell):
    """
    Lenet network structure
    """
    # define the operator required
    def __init__(self):
        super(LeNet5, self).__init__()
        self.batch_size = 32
        self.conv1 = conv(1, 6, 5)
        self.conv2 = conv(6, 16, 5)
        self.fc1 = fc_with_initialize(16 * 5 * 5, 120)
        self.fc2 = fc_with_initialize(120, 84)
        self.fc3 = fc_with_initialize(84, 10)
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()

    # use the preceding operators to construct networks
    def construct(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x
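The input size 16 * 5 * 5 of the first fully connected layer follows from the layer settings above. The sketch below traces the spatial size through the network with the standard output-size formula for convolution and pooling without implicit padding, then counts trainable parameters. It assumes 32 x 32 inputs (as produced by create_dataset()), convolutions without bias (has_bias=False in conv()), and fully connected layers with bias. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                     # images are resized to 32 x 32
size = conv_out(size, 5)      # conv1, 5 x 5 kernel      -> 28
size = conv_out(size, 2, 2)   # max pooling, 2 x 2, s=2  -> 14
size = conv_out(size, 5)      # conv2, 5 x 5 kernel      -> 10
size = conv_out(size, 2, 2)   # max pooling, 2 x 2, s=2  -> 5
flattened = 16 * size * size  # 16 channels leave conv2
print(size, flattened)        # 5 400

def conv_params(c_in, c_out, kernel):
    """Weights of a convolution without bias."""
    return c_in * c_out * kernel * kernel

def dense_params(n_in, n_out):
    """Weights plus bias of a fully connected layer."""
    return n_in * n_out + n_out

total = (conv_params(1, 6, 5) + conv_params(6, 16, 5)
         + dense_params(flattened, 120) + dense_params(120, 84) + dense_params(84, 10))
print(total)                  # 61684
```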

5.2.4.5 Defining the Loss Function and Optimizer


• Basic concepts
Before defining them, this section briefly describes the concepts of loss function and
optimizer.
Loss function: also called the objective function, it measures the difference between a
predicted value and an actual value. Deep learning reduces the value of the loss
function through continuous iteration. Defining a good loss function can effectively
improve model performance.
Optimizer: it is used to minimize the loss function, improving the model during
training.
After the loss function is defined, the gradient of the loss function with respect to the
weights can be obtained. The gradient indicates the direction in which the optimizer
should adjust the weights to improve model performance.
• Define the loss function.
Loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits,
L1Loss, MSELoss, and NLLLoss. The SoftmaxCrossEntropyWithLogits loss function is
used in this example.

from mindspore.nn.loss import SoftmaxCrossEntropyWithLogits

Call the defined loss function in the __main__ block:

if __name__ == "__main__":
    ...
    #define the loss function
    net_loss = SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True, reduction='mean')
    ...
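As an illustration of what a softmax cross-entropy loss with integer (sparse) labels and mean reduction computes, here is a plain-Python sketch. It is not the MindSpore implementation; the function names are hypothetical, and the formula is the textbook one: the loss for one sample is log(sum(exp(logits))) minus the logit of the true class.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy between softmax(logits) and an integer class label."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]

def mean_loss(batch_logits, labels):
    """Average the per-sample losses, as a 'mean' reduction would."""
    losses = [softmax_cross_entropy(row, y) for row, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

uniform = [0.0] * 10
print(softmax_cross_entropy(uniform, 3))       # ln(10), about 2.3026
print(mean_loss([uniform, uniform], [0, 9]))   # same value for any label
```

A network that has learned nothing yet assigns roughly equal scores to all 10 digit classes, so a loss near ln(10) is a typical starting point for this task.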
• Define the optimizer.
Optimizers supported by MindSpore include Adam, AdamWeightDecay, StepLRPolicy,
and Momentum. The popular Momentum optimizer is used in this example.

if __name__ == "__main__":
    ...
    #learning rate setting
    lr = 0.01
    momentum = 0.9
    #create the network
    network = LeNet5()
    #define the optimizer
    net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
    ...
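The following sketch shows the classical momentum update on a single weight, using the same learning rate and momentum values as above. It is an illustration under the assumption of the standard (non-Nesterov) form, velocity = momentum * velocity + gradient followed by weight = weight - lr * velocity; it is not the MindSpore implementation, and momentum_step is a hypothetical name.

```python
def momentum_step(weight, velocity, grad, lr=0.01, momentum=0.9):
    """One classical momentum update for a single scalar weight."""
    velocity = momentum * velocity + grad
    weight = weight - lr * velocity
    return weight, velocity

# Minimize f(w) = w ** 2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for step in range(3):
    grad = 2.0 * w
    w, v = momentum_step(w, v, grad)
    print(step + 1, w)
```

The velocity accumulates past gradients, so successive steps in the same direction grow larger, which is why the weight moves faster than it would under plain gradient descent with the same learning rate.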

5.2.4.6 Running Rules and Viewing Results


Run the following command to run the lenet.py script:

python lenet.py --device_target=CPU

where
lenet.py: indicates the script file that you write according to the tutorial.
--device_target=CPU: specifies the hardware platform. The value can be CPU, GPU, or
Ascend. Specify the platform that the code actually runs on.
Loss values are output during training, as shown in the following example. Although loss
values may fluctuate, in general they gradually decrease and the accuracy gradually
increases. Because training involves randomness, the loss values displayed in each run
may differ.
The following is an example of loss printing during training:

epoch: 1 step: 262, loss is 1.9212162
epoch: 1 step: 263, loss is 1.8498616
epoch: 1 step: 264, loss is 1.7990671
epoch: 1 step: 265, loss is 1.9492403
epoch: 1 step: 266, loss is 2.0305142
epoch: 1 step: 267, loss is 2.0657792
epoch: 1 step: 268, loss is 1.9582214
epoch: 1 step: 269, loss is 0.9459006
epoch: 1 step: 270, loss is 0.8167224
epoch: 1 step: 271, loss is 0.7432692
...
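If you redirect the training output to a file, the loss lines can be turned into numbers for plotting or comparison. The sketch below parses lines in the format shown above with a regular expression; the function parse_loss_lines is a hypothetical helper, and the sample lines are copied from the output above.

```python
import re

# Matches lines such as "epoch: 1 step: 262, loss is 1.9212162".
LOG_LINE = re.compile(r"epoch:\s*(\d+)\s+step:\s*(\d+),\s*loss is\s*([0-9.eE+-]+)")

def parse_loss_lines(text):
    """Return a list of (epoch, step, loss) tuples found in text."""
    records = []
    for match in LOG_LINE.finditer(text):
        epoch, step, loss = match.groups()
        records.append((int(epoch), int(step), float(loss)))
    return records

sample = """epoch: 1 step: 262, loss is 1.9212162
epoch: 1 step: 263, loss is 1.8498616
epoch: 1 step: 271, loss is 0.7432692"""
for record in parse_loss_lines(sample):
    print(record)
```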

The following is an example of model files saved after training:



checkpoint_lenet-1_1875.ckpt

where
checkpoint_lenet-1_1875.ckpt: indicates the saved model parameter file. The file name
format is checkpoint_{network name}-{epoch No.}_{step No.}.ckpt.
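The step number in the file name is consistent with the dataset settings used earlier: 60,000 training examples in batches of 32, with the incomplete batch dropped, give 1875 steps per epoch. The sketch below shows this arithmetic and splits a file name into its parts according to the format above. Both helper names are hypothetical.

```python
import re

def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches in one pass over the dataset."""
    full, rest = divmod(num_samples, batch_size)
    return full if drop_remainder or rest == 0 else full + 1

def parse_checkpoint_name(name):
    """Split checkpoint_{network name}-{epoch No.}_{step No.}.ckpt into its parts."""
    match = re.fullmatch(r"checkpoint_(.+)-(\d+)_(\d+)\.ckpt", name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    network, epoch, step = match.groups()
    return network, int(epoch), int(step)

print(steps_per_epoch(60000, 32))                             # 1875
print(parse_checkpoint_name("checkpoint_lenet-1_1875.ckpt"))  # ('lenet', 1, 1875)
```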

5.2.4.7 Model Verification


After the model file is obtained, run the model on the test dataset and use the result to
verify the generalization capability of the model.
Use the model.eval() interface to run the test dataset through the model.
Use the saved model parameters for inference.

from mindspore.train.serialization import load_checkpoint, load_param_into_net

...
def test_net(args, network, model, mnist_path):
    """define the evaluation method"""
    print("============== Starting Testing ==============")
    #load the saved model for evaluation
    param_dict = load_checkpoint("checkpoint_lenet-1_1875.ckpt")
    #load parameter to the network
    load_param_into_net(network, param_dict)
    #load testing dataset
    ds_eval = create_dataset(os.path.join(mnist_path, "test"))
    acc = model.eval(ds_eval, dataset_sink_mode=False)
    print("=========== Accuracy:{}=========".format(acc))

if __name__ == "__main__":
    ...
    test_net(args, network, model, mnist_path)

where
load_checkpoint(): This API is used to load the checkpoint model parameter file and
return a parameter dictionary.
checkpoint_lenet-1_1875.ckpt: indicates the name of the saved checkpoint model file.
load_param_into_net: This API is used to load parameters to the network.
Run your script with the following command:

python lenet.py --device_target=CPU

where
lenet.py: indicates the script file that you write according to the tutorial.
--device_target=CPU: specifies the hardware platform. The value can be CPU, GPU, or
Ascend. Specify the platform that the code actually runs on.
Command output similar to the following is displayed:

============== Starting Testing ==============
========== Accuracy:{'Accuracy':0.9742588141025641} ===========

The model accuracy is displayed in the output. In the example, the accuracy reaches
97.4%, indicating good model quality.
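The long decimal in the output can be related to the dataset settings. Because create_dataset() batches with drop_remainder=True, a test set of 10,000 examples in batches of 32 yields 312 full batches, or 9984 evaluated examples. The sketch below checks that 9727 correct predictions out of 9984 reproduce the reported value. The counts are an inference from the numbers, not something the output states, and the helper name evaluated_samples is hypothetical.

```python
import math

def evaluated_samples(num_samples, batch_size, drop_remainder=True):
    """Number of examples that remain after batching."""
    full_batches = num_samples // batch_size
    return full_batches * batch_size if drop_remainder else num_samples

total = evaluated_samples(10000, 32)
print(total)                                  # 9984
correct = round(0.9742588141025641 * total)
print(correct)                                # 9727
assert math.isclose(correct / total, 0.9742588141025641, rel_tol=1e-12)
```

Under this reading, 16 test examples are left out of the evaluation. Setting drop_remainder to False for the evaluation dataset would include them, at the cost of a smaller final batch.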

5.3 Summary
This chapter describes MindSpore, the deep learning framework developed by Huawei. It
first introduces the three technological innovations of the MindSpore design concept (new
programming paradigm, new execution mode, and new collaboration mode) and
advantages such as easy development, efficient execution, and flexible deployment. The
last section introduces the development and application of MindSpore and uses a
practical image classification example to illustrate the development procedure.

5.4 Quiz
1. MindSpore is a Huawei-developed AI computing framework that implements device-
edge-cloud on-demand collaboration across all scenarios. It provides unified APIs for
all scenarios and provides end-to-end capabilities for AI model development,
execution, and deployment in all scenarios. What are the main features of the
MindSpore architecture?
2. AI developers in the industry face challenges such as a high development threshold,
high operating costs, and difficult deployment. What are the three technological
innovations proposed by MindSpore to help developers develop and deploy AI
applications more easily and more efficiently?
3. Challenges to model execution under strong chip computing power include the
memory wall problem, high interaction overhead, and difficult data supply. Some
operations are performed on the host and some on the device, so the interaction
overhead is much greater than the execution overhead, leading to low accelerator
utilization. What is the solution of MindSpore?
4. Use MindSpore to recognize MNIST handwritten digits.
