
A Neuromorphic Processing System With Spike-Driven SNN Processor For Wearable ECG Classification




Article in IEEE Transactions on Biomedical Circuits and Systems · July 2022


DOI: 10.1109/TBCAS.2022.3189364


All content following this page was uploaded by Haoming Chu on 22 August 2023.



IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, VOL. 16, NO. 4, AUGUST 2022 511

A Neuromorphic Processing System With Spike-Driven SNN Processor for Wearable ECG Classification
Haoming Chu, Member, IEEE, Yulong Yan, Leijing Gan, Hao Jia, Liyu Qian, Yuxiang Huan, Member, IEEE, Lirong Zheng, Senior Member, IEEE, and Zhuo Zou, Senior Member, IEEE

Abstract—This paper presents a neuromorphic processing system with a spike-driven spiking neural network (SNN) processor design for always-on wearable electrocardiogram (ECG) classification. In the proposed system, the ECG signal is captured by level crossing (LC) sampling, achieving native temporal coding with single-bit data representation, which is directly fed into an SNN in an event-driven manner. A hardware-aware spatio-temporal backpropagation (STBP) is suggested as the training scheme to adapt to the LC-based data representation and to generate lightweight SNN models. Such a training scheme diminishes the firing rate of the network with little loss of classification accuracy, thus reducing the switching activity of the circuits for low-power operation. A specialized SNN processor is designed with the spike-driven processing flow and hierarchical memory access scheme. Validated with field programmable gate arrays (FPGA) and evaluated in 40 nm CMOS technology for application-specific integrated circuit (ASIC) design, the SNN processor can achieve 98.22% classification accuracy on the MIT-BIH database for 5-category classification, with an energy efficiency of 0.75 µJ/classification.

Fig. 1. Processing approach comparison. (a) Nyquist sampling with ANN processing. (b) Nyquist sampling data processed by an SNN after spike coding. (c) LC sampling with ANN processing. (d) LC sampling with 1-bit data representation followed by an SNN.

Index Terms—ECG classification, hardware-aware STBP, neuromorphic processing, SNN processor, spiking neural network (SNN).

Manuscript received 6 March 2022; revised 15 May 2022 and 19 June 2022; accepted 22 June 2022. Date of publication 8 July 2022; date of current version 10 October 2022. This work was supported in part by the National Natural Science Foundation of China under Grants 61876039, 9216430, and 62076066, in part by the Shanghai Municipal Science and Technology Major Project under Grants 2021SHZDZX0103 and 2018SHZDZX01, and in part by the NSFC-STINT Project. This paper was recommended by Associate Editor V. Chen. (Corresponding author: Zhuo Zou.)
Haoming Chu, Yulong Yan, Leijing Gan, Hao Jia, Liyu Qian, Lirong Zheng, and Zhuo Zou are with the State Key Laboratory of ASIC and System, Fudan University, Shanghai 200433, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; lyqian [email protected]; [email protected]; [email protected]).
Yuxiang Huan is with the Large-scale Neuromorphic Computing Group, Guangdong Institute of Intelligence Science and Technology, Guangdong 519031, China (e-mail: [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TBCAS.2022.3189364.
Digital Object Identifier 10.1109/TBCAS.2022.3189364
1932-4545 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: FUDAN UNIVERSITY. Downloaded on October 14,2022 at 06:11:40 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION
CARDIOVASCULAR diseases (CVDs) are considered by the World Health Organization (WHO) to be the leading cause of death globally and took around 17.9 million lives in 2019, corresponding to 32% of global deaths [1]. The electrocardiogram (ECG), a mature approach to monitoring the electrical activity of the human heart, has been proven to show high accuracy in the diagnosis of CVD patients [2]. In particular, artificial intelligence (AI) assisted methods are recommended for the effective detection and diagnosis of CVDs based on the ECG of patients [3], [4]. In recent research, wearable devices such as smart watches [5], armbands [6] and disposable ECG patches [7] have shown the capability to capture good-quality ECG in a convenient and effective way. However, wearable devices for always-on monitoring are usually battery-powered and thus sensitive to power consumption. While advanced AI algorithms such as convolutional neural networks (CNNs) achieve superior classification accuracy, they are too power-consuming and too computation- and memory-intensive for wearable devices. Therefore, there is an urgent need for low-power processing systems for AI-assisted wearable ECG applications.

Fig. 1 illustrates AI-assisted processing approaches for accurate ECG classification with neural networks: (a) Nyquist sampling, e.g., 360 Hz sampling rate and n-bit resolution, followed by an artificial neural network (ANN) [8]–[10]. This is the classic approach for ECG classification with high accuracy, but it requires a complex network architecture, leading to intensive calculation and high energy consumption. (b) Nyquist sampling with spiking neural networks (SNN) [11], [12]. The quantized multi-bit value is converted to either rate coding or temporal coding for SNN-compatible representation. Though SNN potentially facilitates energy-efficient inference, the conversion process is still required. (c) Level crossing (LC) sampling with ANN processing [13]–[16]. For example, LC-sampled ECG data with a 1D-CNN has demonstrated promising classification accuracy in [13] with 3× data compression. In [17], LC sampling is capable of achieving over 5× reduction in sampling points, thus reducing data volume and processing complexity. However, ANN processing is still resource-consuming and computationally intensive. (d) LC sampling with SNN processing. For example, Corradi et al. [18] adopted LC sampling and SNN processing. Yet the randomly connected SNN on the device side is used for dimensionality expansion, while classification is still carried out by a conventional support vector machine (SVM) on the PC side. Therefore, the event-driven nature and sparsity of neuromorphic systems are not fully exploited.

In this paper, we present an event-driven neuromorphic processing system that incorporates LC sampling and full SNN processing (the system concept was first briefly presented at the BioCAS2021 conference [19]). The main contributions of this work are listed as follows:
a) We propose a novel LC sampling and SNN processing approach. Each event of the level-crossing analog-to-digital converter (LC-ADC) is represented by a single bit, which simplifies the quantization process of the ADC, achieves native temporal coding, and allows the events to be fed into the SNN directly without traditional spike coding.
b) We improve the spatio-temporal backpropagation (STBP) training scheme for hardware-aware optimization to adapt to the LC-sampled data representation. It aims to minimize the firing rate without sacrificing the accuracy, thus improving the energy efficiency of the overall system. Trained with the MIT-BIH database for a 5-category classification task, the proposed system reaches 98.22% classification accuracy.
c) We design a specialized SNN processor that performs classification tasks on the LC samples directly. A spike-driven processing flow, in which the computation is triggered by sparse spikes, is adopted for low-power SNN processing. Hierarchical memory access is implemented for sparse weight decoding, diminishing the memory requirement and improving energy efficiency. The SNN processor is verified with field programmable gate array (FPGA) validation and implemented as an application-specific integrated circuit (ASIC) design in 40 nm CMOS technology, achieving an energy efficiency of 0.75 μJ per classification.

The rest of the paper is organized as follows. Section II introduces the neuromorphic processing system and its spike-driven processing flow. In Section III, the hardware-aware SNN optimization is presented. The hardware architecture and design of the proposed SNN processor are shown in Section IV. Section V presents the implementation and performance comparison, and Section VI concludes the paper.

Fig. 2. Processing stages of the proposed neuromorphic system for ECG classification. (a) ECG signal acquisition. (b) ECG signal converted to spike sequences by LC sampling. (c) Spike data processed by spiking neurons in the SNN processor. (d) Prediction of the ECG type.

II. SYSTEM DESCRIPTION
The proposed system is designed for neuromorphic processing with two stages: data acquisition and SNN processing.

A. System Architecture
Fig. 2 shows the proposed neuromorphic processing system for low-power ECG classification. In the data acquisition stage, the 2-lead analog ECG signal is sampled by an LC-ADC. LC is a sampling scheme based on predefined threshold levels [20], where a pulse is generated when the signal crosses a threshold. As illustrated in Fig. 3 (top part), the predefined thresholds are displayed by the horizontal red dashed lines. In this work, the full-scale voltage range of the ECG signal is considered to be 10 mVpp [13], which satisfies the industry standard for ambulatory equipment [21] and matches the dynamic range of the MIT-BIH Arrhythmia Database [22], while the level interval is set to 0.1 mV for LC sampling. A pulse known as the sampling spike is generated at the moment the ECG signal crosses a threshold. Two channels of spikes, i.e., the rising and falling spikes, are generated depending on whether the upper threshold or the lower threshold is crossed by the ECG signal. The sampling point is decided by the signal level instead of the clock frequency, and the intensity of LC sampling is determined by the variation rate of the original signal, achieving nonuniform sampling and reducing the data amount and power consumption. Compared to re-coding methods such as rate coding, which turns digital ADC values into spikes, LC natively contains temporal information and is better suited to neuromorphic processing. In this work, the 2-lead ECG signals (Fig. 2(a)) in the MIT-BIH Arrhythmia Database [22] are converted by the LC-ADC into nonuniform 4-channel spike sequences, i.e., the rising and falling spikes of each lead (Fig. 2(b)). Each spike is represented by a single bit, which simplifies the quantization process of traditional ADCs.
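The level-crossing scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' LC-ADC: the function name `lc_sample`, the choice of the starting reference level, and the limit of one spike per input point are all hypothetical choices made for the example. Only the 0.1 mV level interval and the split into rising and falling spike channels come from the text.

```python
def lc_sample(signal_mv, level_interval_mv=0.1):
    """Sketch of level-crossing sampling on an already-digitized trace.

    Returns (rising, falling): two lists of 0/1 flags, one per input point.
    Assumption: at most one spike is emitted per input point, so a jump of
    several levels is tracked over the following points.
    """
    rising = [0] * len(signal_mv)
    falling = [0] * len(signal_mv)
    if not signal_mv:
        return rising, falling

    # Assumption: start from the threshold level nearest to the first point.
    # The level is kept as an integer index to avoid floating-point drift.
    level = round(signal_mv[0] / level_interval_mv)

    for k in range(1, len(signal_mv)):
        v = signal_mv[k]
        if v >= (level + 1) * level_interval_mv:
            rising[k] = 1      # upper threshold crossed
            level += 1
        elif v <= (level - 1) * level_interval_mv:
            falling[k] = 1     # lower threshold crossed
            level -= 1
    return rising, falling


if __name__ == "__main__":
    # Hypothetical single-lead trace in millivolts.
    trace = [0.0, 0.05, 0.12, 0.25, 0.2, 0.05, -0.02]
    up, down = lc_sample(trace)
    print("rising :", up)
    print("falling:", down)
```

Calling such a function once per lead would give the four spike channels mentioned in the text (rising and falling for each of the two leads). A flat segment produces no spikes at all, which is the source of the data reduction that LC sampling is credited with.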

CHU et al.: NEUROMORPHIC PROCESSING SYSTEM WITH SPIKE-DRIVEN SNN PROCESSOR FOR WEARABLE ECG CLASSIFICATION 513

As shown in Fig. 3 (top part), the ECG signal in the MIT-BIH Arrhythmia Database is segmented into multiple samples according to the annotation of every beat. Since the beat type is annotated at each R peak, a common method to partition the ECG signal is to collect a fixed number of sampling points before and after each R peak as the sampling beat [12], [14]–[16], resulting in a fixed length for each sample. However, the length of heart beats fluctuates between patients and between different periods of the same patient, which may cause information to be missed in the collection of sampling beats. Therefore, we propose a method to ensure the consecutive collection of sampling beats with variable lengths. As shown in Fig. 3 (top part), the midpoint between the current R peak and the previous R peak is selected as the start point of the current sampling beat. Correspondingly, the midpoint between the current and the following R peak is set as the end point.

In the data acquisition and segmentation phase, the 2-lead ECG signal in Fig. 3(a) is first LC-sampled and transformed into 4-channel spike sequences as illustrated in Fig. 3(b). In each channel, the spike sequences are segmented by consecutive time slots. Each time slot corresponds to an input neuron and has a time duration of 10 data points of the original ECG signal in the MIT-BIH database. Each heart beat is fitted to a fixed length of 48 time slots by means of truncation or zero extension, i.e., heart beats longer than 48 time slots are truncated, while heart beats shorter than 48 time slots are padded with zeros. The 4 channels of 48-slot data are aggregated to 192 slots in Fig. 3(c) and further grouped into windows of 96 time slots, resulting in an overlap of 48 time slots between two adjacent windows. The spikes of the three windows are fed into the SNN sequentially as the inputs of three time steps. This maintains the integrity of the ECG data and ensures that the spikes of the same channel appear at the same place in the 192-slot ECG data for every sample of the MIT-BIH database. The three windows have the same length of 96 time slots, corresponding to the 96 input neurons in the SNN that will be presented in detail in Section III-A.

Fig. 3. Data acquisition and segmentation of the ECG data in MIT-BIH database. (a) The 2-lead ECG data are LC-sampled and transformed into 4-channel spike sequences. (b) In each channel, the spike sequences are segmented by consecutive time slots. The length of a heart beat contains a fixed number of 48 time slots by means of truncation or zero extension. (c) The 4 channels of 48-slot data are aggregated to 192 slots, which will be further grouped by three overlapped windows as input spikes in three time steps to the SNN.

In the SNN processing stage, the specialized SNN processor implements a spiking recurrent Multi-Layer Perceptron (rMLP) for accurate ECG classification. The spiking rMLP adopts the Leaky Integrate-and-Fire (LIF) neuron model to process LC-sampled 1-bit spike data via an interface, which uses a First-In-First-Out (FIFO) based buffer structure with a counter to transform the spike trains into SNN-compatible inputs. Parameters of the spiking rMLP are trained to match the feature distribution of different spike patterns (Fig. 2(c)). The output neurons, corresponding to 5 cardiac arrhythmia types, respond to the best-fitting feature distribution, thus indicating the classification result of the ECG sample (Fig. 2(d)). The classification labels in Fig. 2(d) represent 5 classes of heart beats in accordance with the Association for the Advancement of Medical Instrumentation (AAMI) standard [23], i.e., N for normal beats, S for supraventricular ectopic beats, V for ventricular ectopic beats, F for fusion beats, and Q for unclassified beats.

In summary, the proposed neuromorphic processing architecture improves energy efficiency in two ways. a) LC sampling implements an event-driven sampling scheme that reduces the number of sampling points and thus curtails data volume. b) The SNN processes data in a spike-driven manner that takes advantage of the sparsity of spike sequences and reduces computation intensity. SNN calculations also replace most of the multiplication operations in an ANN with accumulation, which lowers the computation complexity and resource requirements in hardware.

B. Neuron Model
In order to achieve accurate and low-power ECG classification with an SNN, the neuron model should be simple in calculation and compatible with LC-sampled data. Therefore, LIF neurons are adopted in this work for two reasons: their computation complexity is low compared to Hodgkin and Huxley (HH) neurons, which reduces processing cost, and they carry more temporal information than Integrate-and-Fire (IF) neurons, which makes them capable of processing temporal sequences effectively. The calculation of the proposed SNN with LIF neurons is illustrated in Algorithm 1. The SNN processes the input spike sequence X with synaptic weight W and bias b to generate the output spike train Y.
T is the number of time steps of the SNN processing, $S^t$ is the collection of spike events at moment $t$, $u_i^t$ represents the
membrane potential of the $i$-th neuron at moment $t$, $u_{rst}$ is the resting potential, $u_{th}$ is the threshold potential, $t_{last\_sp}$ is the last time when the current neuron spiked, $T_{ref}$ is the refractory period, $b_i$ is the bias of the $i$-th neuron layer, and $e^{-\Delta t/\tau_i}$ is the leaky factor. At the beginning of the calculation, the input spikes are merged into the spike collection of the previous time step, while the spike collection of the current time step is initialized as an empty set. In the weight accumulation stage, the original synaptic calculations are accomplished by multiplication and accumulation of the spiking matrix and the weight matrix, which are simplified to index and accumulation operations in this work. The neuromorphic processing unit (NPU) takes the spiking matrix as an index and accumulates the corresponding weights to the membrane potential, avoiding the resource-consuming multiplication operations and thus saving processing energy effectively.

C. Spike-Driven Processing Flow
A spike-driven processing flow is adopted in this work, which means that the post-neuron synaptic operations are only activated when the current neuron generates a spike. Fig. 4 shows that the main calculations in Algorithm 1 fall into three parts: the membrane potential update, the spike activation and the leakage calculation. On the one hand, the weight accumulation in the membrane potential update is spike-triggered and executed multiple times in each time step. On the other hand, the spike generation, which is formed by the bias calculation, the spike activation and the leakage calculation, needs to be continuously and periodically updated regardless of whether there is a spike or not, due to the leakage of the neuron model. Simulation results show that in the original model without hardware-aware optimization, the weight accumulation is executed over 50 times more often than the spike generation operations. Therefore, the spike-driven processing flow along with the hardware-aware training method (which is elaborated in Section III) is adopted to speed up the processing of SNNs and reduce power consumption. Compared with traversing all neurons to accomplish synaptic operations, the spike-driven approach is more energy-efficient because the computation intensity is proportional to the number of synaptic operations. Benefiting from the hardware-aware training method that provides SNNs with sparse synaptic connections and a low firing rate, the spike-driven processing flow achieves a reduction of power consumption and a 4× processing speed acceleration.

Fig. 4. Spike-driven processing flow of SNN with LIF neurons.

III. HARDWARE-AWARE SNN OPTIMIZATION
To achieve energy-efficient classification, the SNN should be simple but capable of processing the LC-sampled data directly. Hardware-aware training is adopted based on the STBP approach and improved with a 3-step optimization: firing rate mitigation, pruning and fixed-point quantization, aiming to minimize the processing power and the storage overhead while maintaining classification accuracy.

A. Network Topology
The SNN classifier is implemented with a spiking recurrent MLP (rMLP) model. As illustrated in Fig. 5, the model contains 271 LIF neurons organized as input96-r120-sc50-fc5, where 'r120' denotes the recurrent layer of 120 neurons, 'sc50' represents the sparse connection layer with 50 neurons, and 'fc5' is the fully connected layer of 5 neurons. The recurrent layer is named the liquid pool, and works [18], [24] have presented detailed descriptions of it. The neurons in the liquid pool can construct recurrent and sparse interconnections. The input spikes are connected to the neurons in the liquid pool,
achieving temporal feature processing and resulting in improved classification accuracy. The last 2 layers constitute the classifier, which takes the full-connection structure to recognize different types of LC-sampled ECG signals, while the input to liquid pool connection, the liquid pool internal recurrent connection, and the liquid pool to classifier connection are all sparse connections to reduce the processing time and cost. The whole model, including the liquid pool and the classifier, is trained with the hardware-aware STBP method to form an accurate and sparse SNN.

Fig. 5. Network topology of the proposed spiking rMLP model.

B. Hardware-Aware Training
The firing rate and synaptic density determine circuit activity and thus greatly affect the power consumption. Therefore, the optimization target is to minimize the firing rate and the number of synapses while maintaining an acceptable accuracy. Since study [25] has shown that synapses store only a few bits of information, parameter quantization is further introduced to enable fixed-point computations and reduce the hardware burden. STBP [24], [26] is adopted and generalized to utilize the temporal information of ECG signals sampled by the LC-ADC. The original STBP algorithm [26] models the LIF dynamic process in the discrete time domain and propagates spikes layer by layer, so that the backpropagation can be deduced in both the spatial domain and the time domain, which enables the extraction of spatio-temporal features of the SNN. Based on this, we developed the algorithm (G-STBP algorithm [24]) to support the recurrent structure (i.e., the liquid pool) and further improved this algorithm for low-power applications with hardware-aware optimization, specifically, firing rate mitigation, pruning and fixed-point quantization. The loss function consists of the mean squared error (MSE) term to measure the classification error, and the spike and weight regularization terms to optimize the firing rate and the number of synapses:

$$ L = \frac{1}{2}\left\| y - \frac{1}{T}\sum_{t=1}^{T} s_o^t \right\|_2^2 + \frac{\lambda_s}{2}\sum_{t=1}^{T}\sum_{i \notin \text{output}} \left\| s_i^t \right\|_2^2 + \lambda_w \left\| w \right\|_1 \tag{1} $$

where $T$ represents the number of time steps, $y$ is the one-hot ground truth label, and $s_i^t \in \{0, 1\}$ denotes a spike event of the $i$-th neuron at the moment $t$. $s_o^t$ is the output vector, the sum of which represents the SNN prediction result. $w$ is the synaptic weight. The first term in the loss function (MSE) measures the network error by the difference between the ground truth and the prediction, which declines in the training process with gradient descent and parameter optimization. The second term in the loss function penalizes all spikes except those of the output layer by L2 regularization, where $\lambda_s$ is the regularization coefficient of spiking sparsity. This term encourages the SNN to represent features with sparser spikes, and improves accuracy by suppressing overfitting. The third term in the loss function is the L1 regularization of the synaptic weight, with the weight decay $\lambda_w$. Through the L1 term, the weights decay in the optimization until a sparse solution is reached.

In order to support the recurrent liquid pool, an adjacency matrix is used to represent the entire network to enable a more flexible structure, as shown in Fig. 6(a). The number of columns and the number of rows of the adjacency matrix are both equal to the number of neurons in the network. The index of a non-zero element in the matrix indicates the connection from a pre-synaptic neuron to a post-synaptic neuron, and the value of the non-zero element represents the synaptic weight. A zero element indicates that there is no connection between the corresponding neurons. For instance, the '$l_1 \to l_2$' block in Fig. 6(a) represents the connections from the input layer to the liquid pool, and the '$l_2 \to l_2$' block indicates the recurrent connections inside the liquid pool. Such an adjacency matrix allows spiking neural networks with arbitrary connections, and thus improves the flexibility of the network structure.

Fig. 6. (a) Adjacency matrix used to represent network connections to enable more flexible structures. (b) The forward computation graph and backpropagation path of the SNN with the improved STBP algorithm.

In this context, the forward computation graph and backpropagation path of the SNN can be further described in Fig. 6(b). Taking the $i$-th neuron as an example, in the time domain, the membrane potential $u_i^t$ at moment $t$ affects the membrane potential $u_i^{t+1}$ at moment $t + 1$, while in the spatial domain, the spike generated by $u_i^t$ affects all the post-synaptic neurons with synaptic weights. Therefore, the error comes from two backpropagation paths: one is the backpropagation through time, since the membrane potential is calculated iteratively in the time domain, and the other is the error backpropagation through synaptic connections, which further includes the error of the
spiking regularization. It is worth mentioning that the output neurons have no post-neuron connections, and only receive the gradient of the MSE loss. Therefore, the gradient of the loss function with respect to the membrane potential is as follows:

$$ \nabla u_i^t = \begin{cases} -\dfrac{1}{T}\left( y_i - \dfrac{1}{T}\sum_{t'=1}^{T} s_i^{t'} \right) \cdot h(u_i^t), & i \in \text{output} \\ \nabla u_i^{t+1} \cdot \alpha (s_i^t - 1) + \sum_j \nabla u_j^{t+1} \cdot w_{ji} \cdot h(u_i^t) + \lambda_s s_i^t \cdot h(u_i^t), & \text{otherwise} \end{cases} \tag{2} $$

where $h$ is the surrogate gradient of the spiking function, specifically $h(u) = \frac{1}{\sqrt{2\pi}} e^{-(u - u_{th})^2}$, and $\alpha = e^{-\Delta t/\tau_i}$ is the leaky factor. $w_{ji}$ represents the synaptic weight of the connection from the $j$-th neuron to the $i$-th neuron. The gradients of synaptic weight and bias can be calculated from the membrane potential gradient:

$$ \nabla w_{ij} = \sum_t \nabla u_i^t \cdot s_j^{t-1} + \lambda_w \cdot \operatorname{sign}(w_{ij}), \qquad \nabla b_i = \sum_t \nabla u_i^t \tag{3} $$

Once the gradients are calculated, it is easy to perform parameter optimization according to parameter update algorithms such as adaptive momentum estimation (Adam), which is explained in detail in [27].

C. Evaluation
The LC-sampled ECG signals in the MIT-BIH database are used to train the SNN for 5-category classification. In the training process of the proposed rMLP, the dataset is divided with a ratio of 4:1 into the training set and the test set, respectively. For each cardiac arrhythmia type, we choose random samples to form the training and test sets with the same ratio, i.e., 4:1, to ensure that the dataset is allocated fairly for all classes.

Fig. 7. Joint effort with firing rate mitigation, pruning and fixed-point quantization on the reduction of processing energy while keeping acceptable classification accuracy. The left part visualizes the process of the 3-step training, and the right part reports the tradeoffs in each optimization step in detail with a table. The selected baseline is highlighted in the table.

The improved STBP training process not only reduces the classification error of the SNN by minimizing the MSE term, but also achieves the hardware-aware optimization (i.e., firing rate mitigation, pruning, and fixed-point quantization), which is demonstrated in Fig. 7. The left part visualizes the process of the 3-step training, in which the vertical axis is the final accuracy of each SNN model and the horizontal axis is the energy consumption normalized to floating-point addition. The energy cost of floating-point multiplication and fixed-point operation relative to floating-point addition is converted according to [28]–[30]. The area of each circle shows the parameter storage overhead of the SNN. The right part reports the tradeoffs in each optimization step in detail with a table, in which the selected parameters are highlighted. Hardware-aware optimization is mainly composed of the following three aspects:
1) Firing Rate Mitigation: Frequent spiking events in hardware lead to intensive circuit computation. Therefore, spiking regularization is implemented to penalize frequent spiking activities. By gradually increasing $\lambda_s$ in Equation (1), the firing rate of the SNN decreases due to regularization, and the normalized power consumption of the SNN in Fig. 7 is continuously reduced. Because the spiking regularization prevents overfitting, the accuracy also improves at the beginning. Finally, $\lambda_s = 10^{-5}$ is selected as the regularization coefficient because it combines a clear reduction in power consumption with high accuracy.
2) Pruning: The L1 regularization of synaptic weights makes the weights decay until a sparse solution is found. Synaptic pruning is further achieved by setting all weights with $w < \Theta_w$ to zero, where $\Theta_w$ is the pruning threshold. At this point, the number of synapses is optimized, which not only reduces the storage overhead of parameters, but also reduces computational complexity by reducing the number of synaptic operations triggered by the same spike activity. As shown in Fig. 7, further pruning is applied to the spiking-sparse SNN model. The storage overhead and power consumption of the network are continuously reduced, along with a decrease in accuracy. Here we choose $\lambda_w = 10^{-2}$ and $\Theta_w = 10^{-2}$ as the pruning parameters.
3) Fixed-Point Quantization: Fixed-point quantization not only reduces storage overhead, but also enables fixed-point
operations to reduce the energy burden in computation. All parameters are clamped to the range of (−1, 1) during training and then all parameters are quantized by n-bit fixed-point numbers after training, where the first bit is the sign bit and the other n − 1 bits are the fractional part. The leaky factor $e^{-\Delta t/\tau_i}$ is set to 0.25, so the leakage multiplication of the membrane potential can be replaced by a 2-bit right shift operation. The leaky factor, as a hyperparameter, is adjusted with the aim of improving classification accuracy, and in the training process in this work, the leaky factor of 0.25 is found to achieve the best accuracy. As shown in Fig. 7, fixed-point quantization incurs a loss of precision, but the power optimization is also significant. With a trade-off between classification accuracy and power consumption, 6-bit fixed-point quantization is selected as the parameter precision for the final model.

With the joint effort of firing rate mitigation, pruning, and fixed-point quantization, the synaptic connections are reduced from 32,170 to 12,600, a 2.6× reduction. The total model size is reduced by 13.6× and the firing rate is restrained to 15.40%, resulting in 6,527 synaptic operations per inference, an over 4× reduction. Meanwhile, the network suffers little accuracy loss, from the original 98.39% to 98.22% for the 6-bit model.

IV. SNN PROCESSOR DESIGN

A. SNN Processor Architecture

The SNN processor design for event-driven ECG classification is elaborated in Fig. 8 (right part). It is composed of five parts: spike merge, sparse weight decoding, spike processing, FSM, and UART. The spike merge block collects internal spikes generated by the NPU calculation of the previous time step and external spikes sampled by the LC-ADC in the current time step, providing data for the sparse weight decoding block, corresponding to the merging process in Algorithm 1. The two kinds of spikes are both the pre-neuron spikes of the current time step, meaning that after being merged in the spike merge block, the spikes are treated the same and processed together. The sparse weight decoding block consists of a data dispatcher and two memory blocks to access synaptic weight values. The data dispatcher achieves hierarchical memory access to exploit data sparsity and increase memory efficiency. The spike processing part contains a spike-driven NPU and a memory block named Neuron_State. The spike-driven NPU, triggered by sparse spike data, receives weight values from the data dispatcher via the AXI Stream protocol and accesses Neuron_State for neuron information, accomplishing the membrane potential calculation and spike processing in Algorithm 1. The finite state machine (FSM) is implemented to organize the computation process of the system according to the procedure in Algorithm 1 by sending control signals to the NPU. The universal asynchronous receiver-transmitter (UART) serial bus interface is implemented to receive instructions and exchange spike data with external devices; it is chosen for its simple structure, which saves hardware resources.

Fig. 8. System architecture. Data acquisition is realized by an LC-ADC and demonstrated with Simulink. SNN processor is verified with ModelSim simulation and FPGA validation.

B. NPU With FSMs

The NPU as shown in Fig. 9(a) is used to carry out the membrane potential calculation and spike processing of all neurons of the SNN iteratively in a spike-driven manner. It utilizes two FSMs to orchestrate the spike-driven processing flow, i.e., the Top_FSM and the Neuron_FSM. The Neuron_FSM is used to carry out the processing and control flow of the spike-driven NPU and organize the time-division multiplexing of the NPU to complete the computations of all the neurons available. The Top_FSM is implemented to organize the continuous processing of the SNN.

As illustrated in Fig. 9(a), the Top_FSM separates the calculations of each LIF neuron in every time step into four major states: IDLE, NEU_INIT, W_ACC, and V_CALCU. The state transition diagram of Top_FSM is exhibited in Fig. 9(b). IDLE is the idle state and NEU_INIT is the initialization state where the NPU accomplishes network configuration of the SNN processing via the UART interface. W_ACC is the spike-triggered weight accumulation state, in which a control signal is generated to drive the NPU into weight accumulation mode. After iteratively accumulating the synaptic weights of all connections with all input spikes to the membrane potential, the Top_FSM moves to the next state. In contrast, V_CALCU is a neuron state update process that is carried out periodically regardless of spikes, in which the control signal drives the NPU into spike generation mode to calculate the membrane potential with bias and determine whether to spike, finishing the computation of the current time step.

In the weight accumulation mode, when a spike arrives at the processor, the NPU obtains the synaptic connections from the sparse weight decoding block. The synaptic weight is loaded to w_reg and the membrane potential from Neuron_State is loaded to J_reg. In the J_ACC stage of the Neuron_FSM, the NPU configures the adder to accumulate the synaptic weight to the membrane potential J_reg before writing it back to Neuron_State. The NPU repeats the accumulation until all of the spikes within the current time step are processed and switches to spike generation mode. In this mode, the NPU loads the membrane potential and bias of each neuron from Neuron_State to J_reg and bias_reg correspondingly before configuring the adder to calculate the new membrane potential in the V_BIAS stage. Then the membrane potential is compared with threshold


voltage to decide the spike generation result. If the membrane potential exceeds the threshold, the neuron spikes and resets J_reg, otherwise it is multiplied with a leaky factor exp_reg. After all the neurons in the network are processed, the NPU is controlled to switch to weight accumulation mode again by the Neuron_FSM and start the computation of the next time step. The state transition diagram of the Neuron_FSM is exhibited in Fig. 9(c).

Fig. 9. (a) The processing and time-division multiplexing of the spike-driven NPU are orchestrated by two FSMs named the Top_FSM and the Neuron_FSM. Arrows with solid lines are data paths, and arrows with dashed lines are control signals. (b) State transition diagram of Top_FSM. (c) State transition diagram of Neuron_FSM.

With the two FSMs, the NPU implements only one adder and one multiplier for the calculation of all neurons in the SNN with time-division multiplexing. Fig. 9(a) shows that the latency of a weight accumulation process (W_ACC) is 7 cycles, and the latency of a spike generation process (V_CALCU) is 9 cycles. The proposed SNN processor is designed to support a maximum number of 1,024 neurons through time-division multiplexing and 32 K synaptic connections. The maximum neuron number of 1,024 is considered mainly for scalability and versatility so that the SNN processor can be applied to more kinds of tasks and various network structures. Assuming an SNN with a 25% firing rate and an average fan-out of 32, the total number of cycles to update the time-multiplexed 1,024 neurons is calculated to be 65,560, yielding 655.6 μs latency at 100 MHz clock frequency. In other words, the 1,024-neuron SNN processor is capable of achieving a complete update of one time step within 1 ms.

In the specific ECG classification application, mapping the trained model of 271 neurons to the processor and accomplishing a classification with 6,527 synaptic operations in three time steps, the total number of cycles required is 50,414, leading to 504.14 μs at 100 MHz, and 775.6 ms at 65 kHz, respectively. This ensures the real-time processing capability for ECG classification applications assuming a person with the heart rate of 75 bpm (0.8 s).

C. Hierarchical Memory Access

On-chip static random-access memory (SRAM), which occupies a large part of the silicon area, is used in this work to store synaptic weights and inherent parameters of neurons. Because the hardware-aware STBP training method leads to a sparsely connected SNN model, the memory organization for storing and accessing the synaptic weights needs to be specially considered to make good use of the sparsity of the network.

In conventional approaches, a memory organization of dense storage is commonly adopted for storing weight information which is accessed by the processor in computation. As illustrated in Fig. 10(a), this is a memory organization that contains all possible connections. The first N words in the Weight_Storage contain all synaptic connections of neuron #1, corresponding to the first word in Neuron_State, which is a memory block that stores the inherent parameters of each neuron. In this way, all synaptic weights are stored sequentially in the SRAM. With this storage organization, the synaptic weight can be read in one cycle, achieving quick access without a decoding process. However, this storage organization requires the processor to traverse every synaptic connection in the processing of every time step, which may be inefficient for sparse networks. The memory


requirement for this storage organization can be calculated by

$$\mathrm{RAM}_{\mathrm{dense}} = W_{\mathrm{weight}} \times N \times N + N \times (W_V + W_{\mathrm{bias}} + 1) \tag{4}$$

where $W_{\mathrm{weight}}$, $W_V$ and $W_{\mathrm{bias}}$ are the bit width of synaptic weight, membrane potential and calculation bias, respectively. $N$ represents the total number of neurons in the processor. Equation (4) indicates that with the network size increasing, the memory requirement grows quadratically. Hence, this memory organization is suitable for lightweight networks with dense connections.

Fig. 10. (a) Memory organization with dense storage. (b) Hierarchical memory access scheme with sparse storage.

Compared with the dense storage, this paper proposes a hierarchical sparse memory access scheme to achieve sparse weight decoding and irregular data access. As illustrated in Fig. 10(b), the hierarchical memory structure is composed of three blocks of memory: Weight_Entrance, Weight_Connection, and Neuron_State. Weight_Entrance contains the connection index of each spiking neuron, leading to the Weight_Connection block which stores the specific information of all connections, i.e., the synaptic weight (weight) and the destination neuron ID (dst_id). The Neuron_State block stores the inherent parameters of each neuron, i.e., the membrane potential V and the calculation bias (bias). The enabling flag en is used for indicating required neurons for the NPU to skip the calculation of disabled neurons. When a spike arrives at a neuron in the SNN, the NPU obtains the corresponding data word in Weight_Entrance, where w_addr is the starting address of the data block of the current neuron, and cnt is the connection fan-out, i.e., the number of consecutive words to be accessed in Weight_Connection. The data block is read word-by-word successively and each word in Weight_Connection indicates one synaptic connection from the current neuron to the neuron with ID dst_id, with the synaptic weight stored in the weight field. Then the dst_id-th word of Neuron_State is accessed for the membrane potential calculation of the dst_id-th neuron. With the sparse storage organization, the memory requirement can be calculated by

$$\mathrm{RAM}_{\mathrm{sparse}} = N \times (\log_2 (N \times FO) + \log_2 N) + N \times FO \times (W_{\mathrm{weight}} + \log_2 N) + N \times (W_V + W_{\mathrm{bias}} + 1) \tag{5}$$

where $FO$ represents the average connection fan-out.

Fig. 11. Memory requirements comparison for sparse and dense memory organizations with different neuron numbers.

Fig. 11 presents the memory requirements comparison of sparse and dense memory organizations with different neuron numbers. In this design, considering the versatility and scalability of the system, the SNN processor is designed to support SNNs containing 1,024 neurons with an average connection fan-out of 32 and a max fan-out of 512, for the capability of extending to the classification of ECG signals with more leads which requires more complex networks. From Fig. 11, it is obvious that the sparse memory organization has much lower SRAM requirement than the dense storage. Calculated according to Equation (4), the memory requirement of dense storage can be as high as 770 KB with 6-bit synaptic weights, which takes up too many resources for always-on low-power wearable devices with single-chip integration. From Equation (5) it is clear that the memory requirement of sparse storage is 69 KB, achieving over 11× reduction compared with the dense storage. For the ECG classification model proposed in Section III-A, the number of required neurons is trained to be 271 with 12,600 synaptic weights. It requires 56.9 KB of memory with the dense storage scheme according to Equation (4). When the sparse


memory organization is applied, the required memory is reduced to 24.3 KB, as expressed in Equation (5). The sparse storage organization requires hierarchical weight decoding, leading to 1-cycle latency in the access of synaptic weight, thus slowing down the processing of each synaptic operation by 50%. Nevertheless, this storage organization enables the spike-driven processing flow of the SNN processor, making computation intensity proportional to the firing rate of neurons. Moreover, the spike-driven processing flow avoids the traversal of synaptic weight memory, further accelerating the spike processing and reducing computation intensity.

The parameters are quantized with fixed-point calculation and synaptic weights are quantized to at most 6 bits. The membrane potential V has 2 extra bits compared with the weight to avoid overflow caused by weight accumulation in the NPU processing. With the hierarchical memory access design, the memory requirement declines by over 11× and the spike-driven processing method is achieved.

D. Behavior-Level Simulation

In order to validate the system concept, the hardware architecture is designed and implemented at the behavior level with a co-simulation of Simulink and ModelSim. As illustrated in Fig. 8, the system is composed of two parts: the LC-ADC and the SNN processor. The LC-ADC is modeled by Simulink and the SNN is implemented by Verilog HDL.

The LC-ADC model is built in Simulink with the structure shown in Fig. 8 (left part). The input ECG signal is converted to the increase pulses INC and the decrease pulses DEC via two comparators. The pulses control the feedback loop and update V_H and V_L to encase the input signal. The up/down counter (U/D Counter) implemented with Verilog HDL in ModelSim performs as the control unit triggered by the pulses to refresh the comparison window.

Fig. 12 exhibits the simulation results of LC-ADC in Simulink. The comparison window is updated constantly in the continuous time domain and generates spike sequences, of which the density is proportional to signal activity, achieving adaptive resolution alteration based on the ECG variation rate. Compared with the 360 Hz Nyquist sampling of a 2-lead ECG wave with 11-bit resolution, the LC sampling achieves a 55.9× reduction in data volume with only a 1-bit representation of spikes.

Fig. 12. Simulation result of LC-ADC in Simulink.

The SNN processor is implemented in Verilog HDL and simulated using the SNN model in Section III-A. Trained by the proposed hardware-aware STBP scheme and the LC-sampled data, the synaptic connections are reduced from 31.4 K to 12.3 K, reducing the computation intensity by 2.6×, and the network firing rate is lowered from 25.59% to 15.40%, achieving an extra 1.7× reduction of computation amount.

V. IMPLEMENTATION AND EVALUATION

The proposed SNN processor for ECG classification is simulated with ModelSim and verified with an FPGA. The ASIC design is synthesized by Synopsys Design Compiler and the placement and routing are accomplished by Cadence Innovus in 40 nm CMOS process.

A. FPGA Prototyping

As illustrated in Fig. 13, the testing setup for the SNN processor and system is composed of two Xilinx Artix-7 XC7A100T FPGAs and a PC. The PC generates ECG data and sends it to FPGA1 for LC sampling. FPGA1 emulates the functional behavior of the LC-ADC to generate the bitstreams and feed them into the SNN processor, i.e., to deal with the ECG data in the MIT-BIH Arrhythmia Database and transform them into input spike sequences. FPGA2 is the implementation of the SNN processor, processing the LC-sampled data and sending classification results to the receiver on the PC. The results on the PC show that the proposed SNN processor classifies the ECG samples correctly, and displays exactly the same results as the software simulation.

Fig. 13. FPGA prototyping for the validation of the proposed SNN processor. One PC and two FPGAs are employed to build the testing system. The PC generates ECG data and sends it to FPGA1, which emulates the functional behavior of the LC-ADC and generates LC-sampled single-bit input spike sequences. FPGA2 implements the SNN processor to process the LC-sampled data and send classification results back to the PC.

TABLE I
RESOURCE UTILIZATION OF FPGA

The resource utilizations are listed in Table I. The proposed SNN processor only takes 619 look-up tables (LUT), 40 LUT-based RAM (LUTRAM), and 783 flip-flops (FF) which are all less than 1% of the available logic resources. 17 block RAM (BRAM) out of 135 are utilized as the memory block, taking 12.59% of the total BRAM resources.
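The level-crossing behavior described in Section IV-D can be illustrated with a small software model. The sketch below is not the Simulink model used in this work; it is a minimal discrete-time approximation in which the comparison window [v_l, v_h] moves by one step each time the input leaves it, emitting an INC or DEC pulse. The function name lc_sample, the step parameter delta, and the list-of-samples input are assumptions made for illustration.

```python
def lc_sample(signal, delta, v0=0.0):
    """Level-crossing sampling of a sequence of amplitudes (illustrative).

    Returns a list of (index, pulse) events, where pulse is +1 (INC)
    or -1 (DEC). The comparison window is [v_l, v_h], of width 2 * delta.
    """
    v_l, v_h = v0 - delta, v0 + delta
    events = []
    for i, x in enumerate(signal):
        # Emit INC pulses and raise the window until it encloses x again.
        while x > v_h:
            events.append((i, +1))
            v_l += delta
            v_h += delta
        # Emit DEC pulses and lower the window until it encloses x again.
        while x < v_l:
            events.append((i, -1))
            v_l -= delta
            v_h -= delta
    return events


# A flat input produces no events; activity produces INC/DEC pulses.
print(lc_sample([0.0] * 10, delta=0.1))
print(lc_sample([0.0, 0.05, 0.25, 0.25, -0.15], delta=0.1))
```

In this model the number of events depends only on how much the signal moves, which is the property the text refers to when it says the spike density is proportional to signal activity.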
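Equations (4) and (5) from Section IV-C can be evaluated numerically to see where the quoted 770 KB and 69 KB figures come from. The sketch below is a minimal check, not part of the original design flow. It assumes 1 KB = 1024 bytes, rounds each log2 term up to a whole number of bits, takes W_V = 8 bits (the 6-bit weight plus the 2 extra bits mentioned in Section IV-C), and assumes W_bias = 6 bits, which the text does not state. All function names are hypothetical.

```python
import math


def ram_dense_bits(n, w_weight, w_v, w_bias):
    """Equation (4): every possible N x N connection is stored."""
    return w_weight * n * n + n * (w_v + w_bias + 1)


def ram_sparse_bits(n, fan_out, w_weight, w_v, w_bias):
    """Equation (5): only existing connections are stored."""
    addr_bits = math.ceil(math.log2(n * fan_out))  # first log2 term
    id_bits = math.ceil(math.log2(n))              # log2(N) terms
    entrance = n * (addr_bits + id_bits)            # Weight_Entrance
    connection = n * fan_out * (w_weight + id_bits)  # Weight_Connection
    state = n * (w_v + w_bias + 1)                  # Neuron_State (+1 for en)
    return entrance + connection + state


def to_kib(bits):
    """Convert a bit count to kibibytes (assumed meaning of 'KB')."""
    return bits / 8 / 1024


# 1,024 neurons, average fan-out 32, 6-bit weights; w_bias=6 is an assumption.
dense = ram_dense_bits(n=1024, w_weight=6, w_v=8, w_bias=6)
sparse = ram_sparse_bits(n=1024, fan_out=32, w_weight=6, w_v=8, w_bias=6)
print(f"dense:  {to_kib(dense):.1f} KiB")
print(f"sparse: {to_kib(sparse):.1f} KiB")
print(f"ratio:  {dense / sparse:.1f}x")
```

Under these assumptions the dense layout needs about 770 KiB and the sparse layout 69 KiB, a ratio of roughly 11.2×, which is consistent with the figures reported in Section IV-C.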


TABLE II
COMPARISONS WITH RELATED WORKS ON MIT-BIH DATASET

Table II notes: measured; # post-layout simulated; estimated by the results reported in the paper.

Fig. 14. Layout and ASIC specification of the proposed SNN processor.

Fig. 15. Power breakdown and equivalent power consumption for real-time processing at 75 bpm.

B. ASIC Implementation

The ASIC design is implemented with Verilog HDL and synthesized by Synopsys Design Compiler. The physical design is accomplished by Cadence Innovus with 40 nm CMOS process. As shown in Fig. 14, the core processing area of the ASIC excluding I/O pads is 649 μm × 500 μm, taking up 0.3246 mm². Post-layout simulation is performed in Synopsys VCS and the switching activities are recorded with a dump file, which is used in Synopsys PrimeTime for power analysis. The simulated core power consumption of the SNN processor is given in Fig. 15. The processor operates at clock frequencies from 65 kHz to 100 MHz, with latencies from 775.6 ms down to 504.14 μs. As shown by the power reported in Fig. 15, a lower frequency gives lower dynamic power, but the leakage takes a growing share of the power consumption, reaching 96.41% at 65 kHz. Therefore, using a higher clock rate with a duty-cycling scheme can achieve lower energy per classification. The energy per classification at different clock frequencies is displayed in Fig. 15, showing that the minimum energy is 0.75 μJ per classification at 100 MHz, equivalent to a 0.93 μW average power for real-time processing at 75 bpm. Table II provides the comparison of the proposed SNN processor for ECG classification with other state-of-the-art works using the MIT-BIH database. The classification accuracy of the proposed classifier is comparable with other works, yet with superior power and energy efficiency performance.

It is worth mentioning that the proposed LC-SNN architecture fits SNN processing natively. As an alternative to ANN, such an architecture exhibits great potential that can be further explored. Our SNN classifier uses a larger model size than some ANN classifiers, e.g., the ANNs in [10], [15], [16]. One of the major reasons is that the training algorithms and frameworks of SNNs are not as well-developed as those of ANNs, which makes it difficult for the SNN to compete with ANN in terms of model size, accuracy, and energy in most cases. For SNN comparison, our work is smaller in model size than [31], and larger than [33], yet with superior classification accuracy (98.22% vs. 90.5%). With rapidly evolving SNN research recently, we believe there is still room for improving accuracy and energy efficiency. Emerging devices and circuit advances, e.g., the work in [33], which adopts asynchronous design, CIM, and more in-depth customization, provide the possibility to exploit the event-driven characteristic of SNNs further and reach better power performance.
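The latency and average-power figures in Section V-B follow from simple arithmetic on the cycle count and the energy per classification. The sketch below reproduces that arithmetic; the function names are hypothetical, and the assumption that one classification is performed per heartbeat at 75 bpm is taken from the caption of Fig. 15.

```python
def latency_s(cycles, f_clk_hz):
    """Processing latency in seconds for a given cycle count and clock."""
    return cycles / f_clk_hz


def average_power_w(energy_per_classification_j, heart_rate_bpm):
    """Average power if one classification is run per heartbeat."""
    return energy_per_classification_j * heart_rate_bpm / 60.0


CYCLES_PER_CLASSIFICATION = 50_414  # reported for the 271-neuron ECG model

fast = latency_s(CYCLES_PER_CLASSIFICATION, 100e6)  # 100 MHz clock
slow = latency_s(CYCLES_PER_CLASSIFICATION, 65e3)   # 65 kHz clock
beat_period = 60.0 / 75                              # seconds per beat at 75 bpm
power = average_power_w(0.75e-6, 75)                 # 0.75 uJ per classification

print(f"latency at 100 MHz: {fast * 1e6:.2f} us")
print(f"latency at 65 kHz:  {slow * 1e3:.1f} ms")
print(f"beat period:        {beat_period:.1f} s")
print(f"average power:      {power * 1e6:.4f} uW")
```

At 75 bpm a beat arrives every 0.8 s, so even the 65 kHz operating point (about 0.776 s per classification) finishes before the next beat, and 0.75 μJ per beat averages to roughly 0.94 μW, in line with the 0.93 μW quoted in Section V-B.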


VI. CONCLUSION

In this paper, we present a neuromorphic processing system with a spike-driven SNN processor design for always-on wearable ECG classification. The system is composed of an LC-ADC for data acquisition and an SNN processor for spike processing. The LC-ADC samples the ECG signal in an event-driven manner with single-bit data representation, generating SNN-compatible spike sequences and compressing the data volume by 55.9×. The hardware-aware STBP is adopted to improve classification performance and reduce the computation cost, achieving 98.22% classification accuracy on the MIT-BIH database. The specialized hardware is validated with FPGA and the ASIC design is implemented in 40 nm CMOS process. The energy efficiency is 0.75 μJ per classification, which is promising for always-on wearable ECG classification.

REFERENCES

[1] “Cardiovascular diseases (CVDs),” World Health Organization, Geneva, Switzerland, 2019. [Online]. Available: https://www.who.int/news-room/factsheets/detail/cardiovascular-diseases-(cvds)
[2] A. Rosiek and K. Leksowski, “The risk factors and prevention of cardiovascular disease: The importance of electrocardiogram in the diagnosis and treatment of acute coronary syndrome,” Therapeutics Clin. Risk Manage., vol. 12, 2016, Art. no. 1223.
[3] A. Mincholé and B. Rodriguez, “Artificial intelligence for the electrocardiogram,” Nature Med., vol. 25, no. 1, pp. 22–23, 2019.
[4] Z. Jin, J. Oresko, S. Huang, and A. C. Cheng, “HeartToGo: A personalized medicine technology for cardiovascular disease prevention and detection,” in Proc. IEEE/NIH Life Sci. Syst. Appl. Workshop, 2009, pp. 80–83.
[5] A. Samol, K. Bischof, B. Luani, D. Pascut, M. Wiemer, and S. Kaese, “Single-lead ECG recordings including Einthoven and Wilson leads by a smartwatch: A new era of patient directed early ECG differential diagnosis of cardiac diseases?,” Sensors, vol. 19, no. 20, 2019, Art. no. 4377.
[6] O. J. Escalona, A. Villegas, S. Mukhtar, G. Perpiñan, and D. J. McEneaney, “Wireless arm wearable sensor band for long-term heart rhythms surveillance using a bipolar Arm-ECG lead,” in Proc. Comput. Cardiol., 2020, pp. 1–4.
[7] Y. A. Bhagat et al., “Like Kleenex for wearables: A soft, strong and disposable ECG monitoring system,” in Proc. IEEE Biomed. Circuits Syst. Conf., 2018, p. 1, doi: 10.1109/BIOCAS.2018.8584778.
[8] S. Kiranyaz, T. Ince, and M. Gabbouj, “Real-time patient-specific ECG classification by 1-D convolutional neural networks,” IEEE Trans. Biomed. Eng., vol. 63, no. 3, pp. 664–675, Mar. 2016.
[9] N. Wang, J. Zhou, G. Dai, J. Huang, and Y. Xie, “Energy-efficient intelligent ECG monitoring for wearable devices,” IEEE Trans. Biomed. Circuits Syst., vol. 13, no. 5, pp. 1112–1121, Oct. 2019.
[10] M. Janveja, R. Parmar, M. Tantuway, and G. Trivedi, “A DNN based low power ECG co-processor architecture to classify cardiac arrhythmia for wearable devices,” IEEE Trans. Circuits Syst. II: Exp. Briefs, vol. 69, no. 4, pp. 2281–2285, Apr. 2022.
[11] Z. Yan, J. Zhou, and W-F. Wong, “Energy efficient ECG classification with spiking neural network,” Biomed. Signal Process. Control, vol. 63, 2021, Art. no. 102170.
[12] A. Amirshahi and M. Hashemi, “ECG classification algorithm based on STDP and R-STDP neural networks for real-time monitoring on ultra low-power personal wearable devices,” IEEE Trans. Biomed. Circuits Syst., vol. 13, no. 6, pp. 1483–1493, Dec. 2019.
[13] M. Saeed et al., “Evaluation of level-crossing ADCs for event-driven ECG classification,” IEEE Trans. Biomed. Circuits Syst., vol. 15, no. 6, pp. 1129–1139, Dec. 2021.
[14] Y. Zhao, S. Lin, Z. Shang, and Y. Lian, “Classification of cardiac arrhythmias based on artificial neural networks and continuous-in-time discrete-in-amplitude signal flow,” in Proc. IEEE Int. Conf. Artif. Intell. Circuits Syst., 2019, pp. 175–178.
[15] Y. Zhao, Z. Shang, and Y. Lian, “A 13.34 µW event-driven patient-specific ANN cardiac arrhythmia classifier for wearable ECG sensors,” IEEE Trans. Biomed. Circuits Syst., vol. 14, no. 2, pp. 186–197, Apr. 2020.
[16] Q. Cai, X. Xu, Y. Zhao, L. Ying, Y. Li, and Y. Lian, “A 1.3 µW event-driven ANN core for cardiac arrhythmia classification in wearable sensors,” IEEE Trans. Circuits Syst. II: Exp. Briefs, vol. 68, no. 9, pp. 3123–3127, Sep. 2021.
[17] X. Zhang, Z. Zhang, Y. Li, C. Liu, Y. X. Guo, and Y. Lian, “A 2.89 µW dry-electrode enabled clockless wireless ECG SoC for wearable applications,” IEEE J. Solid-State Circuits, vol. 51, no. 10, pp. 2287–2298, Oct. 2016.
[18] F. Corradi et al., “ECG-based heartbeat classification in neuromorphic hardware,” in Proc. Int. Joint Conf. Neural Netw., 2019, pp. 1–8.
[19] H. Chu et al., “A neuromorphic processing system for low-power wearable ECG classification,” in Proc. IEEE Biomed. Circuits Syst. Conf., 2021, pp. 1–5.
[20] Z. Wang et al., “A 57nW software-defined always-on wake-up chip for IoT devices with asynchronous pipelined event-driven architecture and time-shielding level-crossing ADC,” in Proc. IEEE Int. Solid-State Circuits Conf., 2020, pp. 314–316.
[21] M. Tlili, M. Ben-Romdhane, A. Maalej, F. Rivet, D. Dallet, and C. Rebai, “Level-crossing ADC design and evaluation methodology for normal and pathological electrocardiogram signals measurement,” Measurement, vol. 124, pp. 413–425, 2018.
[22] G. B. Moody and R. G. Mark, “The impact of the MIT-BIH arrhythmia database,” IEEE Eng. Med. Biol. Mag., vol. 20, no. 3, pp. 45–50, May/Jun. 2001.
[23] Testing and Reporting Performance Results of Cardiac Rhythm and ST Segment Measurement Algorithms, ANSI/AAMI Standard EC57, 2012.
[24] Y. Yan et al., “Graph-based spatio-temporal backpropagation for training spiking neural networks,” in Proc. IEEE 3rd Int. Conf. Artif. Intell. Circuits Syst., 2021, pp. 1–4.
[25] T. M. Bartol Jr et al., “Nanoconnectomic upper bound on the variability of synaptic plasticity,” Elife, vol. 4, 2015, Art. no. e10778.
[26] Y. Wu, L. Deng, G. Li, J. Zhu, and L. Shi, “Spatio-temporal backpropagation for training high-performance spiking neural networks,” Front. Neurosci., vol. 12, 2018, Art. no. 331.
[27] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” Dec. 2014, arXiv:1412.6980.
[28] Z. Cheng, W. Wang, Y. Pan, and T. Lukasiewicz, “Distributed low precision training without mixed precision,” Nov. 2019, arXiv:1911.07384.
[29] A. Finnerty and H. Ratigner, “Reduce power and cost by converting from floating point to fixed point,” San Jose, CA, USA, Xilinx White Paper WP491 (v1.0), 2017. [Online]. Available: https://docs.xilinx.com/v/u/en-US/wp491-floating-to-fixed-point
[30] S. Tajasob, M. Rezaalipour, and M. Dehyadegari, “Designing energy-efficient imprecise adders with multi-bit approximation,” Microelectron. J., vol. 89, pp. 41–55, 2019.
[31] F. C. Bauer, D. R. Muir, and G. Indiveri, “Real-time ultra-low power ECG anomaly detection using an event-driven neuromorphic processor,” IEEE Trans. Biomed. Circuits Syst., vol. 13, no. 6, pp. 1575–1582, Dec. 2019.
[32] J. Liu et al., “4.5 BioAIP: A reconfigurable biomedical AI processor with adaptive learning for versatile intelligent health monitoring,” in Proc. IEEE Int. Solid-State Circuits Conf., 2021, pp. 62–64.
[33] Y. Liu et al., “An 82nW 0.53 pJ/SOP clock-free spiking neural network with 40 µs latency for AIoT wake-up functions using ultimate-event-driven bionic architecture and computing-in-memory technique,” in Proc. IEEE Int. Solid-State Circuits Conf., 2022, pp. 372–374.

Haoming Chu received the B.S. degree in electronic engineering in 2016 from Fudan University, Shanghai, China, where he is currently working toward the Ph.D. degree in microelectronics and solid state electronics. Since 2016, he has been working in the field of low-power SoC design. His research interests include low power architecture for microprocessor, event-based low-power design, and energy-efficient hardware design.


Yulong Yan received the B.E. degree in communication engineering from Shandong University, Jinan, China, in 2017. He is currently working toward the Ph.D. degree with the School of Information Science and Technology, Fudan University, Shanghai, China. Since 2016, he has been working in the field of intelligent electronics and systems, especially in spiking neural network algorithm, neuromorphic engineering, and machine learning algorithms for the Internet of Things systems and applications.

Leijing Gan received the B.S. degree in communication engineering from Shanghai University, Shanghai, China in 2018. He is currently working toward the M.S. degree in electronic and information engineering with Fudan University, Shanghai. From 2018 to 2020, he was with ZTE for two years. He is working on the design of FPGA low-latency accelerators.

Hao Jia received the B.E. degree in integrated circuit design and integrated systems from the University of Electronic Science and Technology of China, Chengdu, China. He is currently working toward the Eng.D. degree with the School of Information Science and Technology, Fudan University, Shanghai, China. His research interests include neuromorphic hardware for brain-like computing and domain-specific accelerator for financial market trading systems.

Liyu Qian received the B.S. degree in electrical engineering and automation from the Beijing Institute of Technology, Beijing, China, in 2016, and the M.S. degree in electrical and electronics engineering from Southern Methodist University, Dallas, TX USA, in 2019. She is currently working toward the Ph.D. degree in microelectronics and solid state electronics with Fudan University, Shanghai, China. Her research focuses on low-power biomedical processors design.

Yuxiang Huan (Member, IEEE) received the Ph.D. degree in micro electronics from Fudan University, Shanghai, China, in 2018. From 2018 to 2021, he was an Assistant Professor with Fudan University. He is currently the Principal Investigator of the Large-scale Neuromorphic Computing Group Guangdong Institute of Intelligence Science and Technology, Guangdong, China. His research interests include distributed computing architectures, energy-efficient architectures for deep learning accelerators, and domain specific hardware designs for large-scale neuromorphic computing.

Lirong Zheng (Senior Member, IEEE) received the Ph.D. degree in electronic system design from the KTH Royal Institute of Technology (KTH), Stockholm, Sweden, in 2001. He was with KTH as a Research Fellow, an Associate Professor, and a Full Professor. He is the Founding Director of iPack VINN Excellence Center of Sweden and has been the Chair Professor of media electronics with KTH since 2006. He has been also a Guest Professor since 2008 and a Distinguished Professor with Fudan University, Shanghai, China, since 2010. He is currently the Director of the Shanghai Institute of Intelligent Electronics and Systems, Fudan University. He has authored more than 500 publications. His current research interests include electronic circuits, wireless sensors and systems for ambient intelligence, and the Internet of Things.

Zhuo Zou (Senior Member, IEEE) received the Ph.D. degree in electronic and computer systems from the KTH Royal Institute of Technology, Stockholm, Sweden, in 2012. He is currently a Full Professor with Fudan University, Shanghai, China, where he is conducting research on intelligent chips and systems for AIoT. Prior to joining Fudan, he was the Assistant Director and a Project Leader with VINN iPack Excellence Center, KTH. He has also been an Adjunct Professor and Docent with the University of Turku, Turku, Finland. His current research interests include low-power circuits, energy-efficient SoC, neuromorphic computing, and their applications in AIoT and autonomous systems. He is the Vice Chairman of IFIP WG-8.12.
