Hybrid Data Decomposition-Based Deep Learning for Bitcoin Prediction and Algorithm Trading

This study introduces a novel hybrid bidirectional deep learning model, VMD-LMH-BiGRU, for forecasting Bitcoin price changes and conducting algorithmic trading. The model utilizes variational mode decomposition (VMD) for data decomposition and bidirectional gated recurrent units (BiGRU) for deep learning, outperforming traditional econometric and machine learning models in both prediction accuracy and investment returns. The research highlights the effectiveness of combining data decomposition and deep learning techniques to enhance forecasting in the volatile Bitcoin market.

2 Hybrid data decomposition-based deep learning for


3 Bitcoin prediction and algorithm trading
4

5 Yuze Li^{a,b}, Shangrong Jiang^{b}, Xuerong Li^{a,*}, Shouyang Wang^{a,b}

6 a: Academy of Mathematics and Systems Science, Chinese Academy of Sciences,


7 Beijing 100190, China
8 b: School of Economics and Management, University of Chinese Academy of
9 Sciences, Beijing 100190, China
10

11 Abstract
12 Although Bitcoin has attracted significant attention from investors and policy makers,
13 empirical work on Bitcoin forecasting and trading support systems is still at an early stage.
14 This study proposes a novel data-decomposition-based hybrid bidirectional deep learning model,
15 namely VMD-LMH-BiGRU, for forecasting the daily price change in the Bitcoin market and
16 conducting algorithmic trading on the market. Two main steps are involved in our methodology
17 framework, i.e., data decomposition for inner factor extraction, and bidirectional deep learning
18 for forecasting the Bitcoin price. The results demonstrate that the proposed model outperforms
19 four other benchmark models, including econometric models, machine learning models and deep
20 learning models. Furthermore, the proposed model achieved higher investment returns than all
21 benchmark models and the buy-and-hold strategy in the trading simulation. The robustness of the
22 model is verified through multiple forecasting periods and testing intervals.
23
24 Keywords: Bitcoin price, Variational mode decomposition, Deep learning, Price forecasting,
25 Algorithmic trading
26
27
28

Electronic copy available at: https://ssrn.com/abstract=3614428


29 1. Introduction
30
31 As the price of Bitcoin rose from almost zero in 2009 to nearly $20,000 in 2017, it has
32 attracted a considerable amount of attention from investors and policy makers. This significant
33 increase in its price has been accompanied by a steady growth of the Bitcoin market - as of
34 December of 2019, the average market volume is around $19.45 billion per day. Acting as a
35 currency, Bitcoin has several characteristics: decentralized transaction, auditability and
36 anonymity (Zheng et al., 2018; Li et al., 2020). Although it has been characterized as a bubble
37 and a threat to the stability of the financial system (Böhme, 2015), Bitcoin still presents itself as
38 an attractive and potentially high-earning investment option. However, there are also greater
39 risks - compared with other financial assets, the price of Bitcoin is much more volatile (Garcia et
40 al., 2014; Yu, Kang & Park, 2019). As a result, the potential losses are much greater. Therefore,
41 how to accurately predict and capture the changing trends in the Bitcoin market is of great
42 importance to investors, as well as policy makers.
43
44 The rise of Bitcoin, as well as its underlying blockchain technology, has also attracted
45 significant attention from scholars. In recent years, many studies have analyzed them from
46 different perspectives, such as the adoption of blockchain technology across different industry
47 sectors (Cohen & Nissim, 2018; Easley, O’Hara, & Basu, 2019; Janssen et al., 2020; Heaven,
48 2019; Mu, Bian, & Zhao, 2019), the impact of Bitcoin and blockchain on firms (Cheng et al.,
49 2019), the relationship between Bitcoin transactions and illegal activities (Gandal et al., 2018), as
50 well as the relationship between Bitcoin and energy emissions (De Vries, 2018; Jones, 2018;
51 Masanet et al., 2019). For example, Leng et al. (2019) propose a blockchain-driven model to
52 handle cyber-credit of social manufacturing among various makers. Cheng et al. (2019) link
53 evidence on public firms' initial 8-K disclosures that mention blockchain and investors’ response
54 to these disclosures. As an example of investigating the linkage between Bitcoin transactions and
55 criminal activities, Foley et al. (2018) analyze a large dataset of cryptocurrency transactions and
56 find that approximately one-quarter of all Bitcoin users are involved in illegal activity. In
57 terms of investigating the relationship between Bitcoin and energy emissions, Mora et al. (2018)
58 conclude that the emissions created from the mining of Bitcoin alone could push global warming
59 above 2 degrees Celsius. However, despite this scholarly attention, empirical work
60 on Bitcoin price forecasting models is still at an early stage.
61
Nomenclature
ARIMA            Autoregressive integrated moving average model
ANN              Artificial neural networks
BiGRU            Bidirectional gated recurrent unit
CNN              Convolutional neural networks
DA               Directional accuracy
EMD              Empirical mode decomposition
LR               Linear regression model
LSTM             Long short-term memory model
RNN              Recurrent neural network
SVR              Support vector regression model
VMD              Variational mode decomposition
VMD-LMH-BiGRU    VMD-BiGRU with all frequency modes
62
63 Several types of models have been adopted by previous studies to forecast the Bitcoin
64 market price. Econometric models such as Generalized Autoregressive Conditional
65 Heteroskedasticity (GARCH) models, Vector Autoregressive (VAR) models and Grey
66 Lotka-Volterra models (GLVM) are first introduced to investigate the determinants of Bitcoin
67 returns (Katsiampa, 2017). For example, Dyhrberg (2016) investigates asset capabilities of
68 Bitcoin with the GARCH model, which showed that Bitcoin is similar to some major
69 commodities, such as gold and stock. Katsiampa (2017) introduces the AR-CGARCH model to
70 describe volatility and price returns of Bitcoin. However, the studies discussed above are mostly
71 of explanatory nature as they do not focus on the predictive ability of the models. Further, the
72 econometric models have a major drawback in that the models all assume the time series is linear
73 and stationary, which is hardly satisfied by the volatile and non-stationary nature of the Bitcoin
74 market (Yu, Wang and Lai, 2008). As a result, they are less effective in predicting the Bitcoin
75 market price.
76

77 To overcome the limitations, some studies have adopted machine learning methods to
78 develop prediction models for the Bitcoin market. For example, Madan et al. (2015) utilized
79 binomial regression, support vector machines (SVM) and random forest to predict the change in
80 Bitcoin prices. Kristjanpoller and Minutolo (2018) propose a framework integrating
81 GARCH and ANN to forecast the price volatility of Bitcoin, and Peng et al. (2017) use
82 Support Vector Regression (SVR) to predict the volatility of cryptocurrencies. More recently, a
83 few studies have adopted deep learning models to forecast the Bitcoin market price, as such models have
84 shown superior performance over their shallow counterparts (LeCun, Bengio & Hinton, 2015). For
85 example, Altan et al. (2019) utilize the long short-term memory (LSTM) neural network to
86 identify the nonlinear properties of the Bitcoin price time series. Atsalakis et al. (2019) develop a
87 novel Neuro-fuzzy technique with artificial neural networks, which demonstrated improved
88 prediction accuracy and trading results compared to the traditional artificial neural networks. Ji et
89 al. (2019) compare prediction accuracy of several deep learning methods, including the deep
90 neural networks (DNN) model, convolutional neural networks (CNN) model and long short-term
91 memory (LSTM) model, and find varied prediction performance among the different models.
92
93 Since the Bitcoin price is an extremely volatile and non-stationary time series,
94 prediction accuracy may suffer as a result. In recent years, another type of
95 ensemble learning approach based on the concept of “divide and conquer” has been proposed to
96 improve the prediction accuracies on non-stationary time series. This type of approach
97 decomposes the original time series into different cycle factors. The decomposed factors are
98 estimated individually and then integrated together to generate the final prediction output.
99 Currently, Empirical mode decomposition (EMD) is the predominant method used to decompose
100 the non-stationary time series data into intrinsic mode functions (IMF). For example, Yu et al.
101 (2015) adopt a hybrid approach of complementary ensemble empirical mode decomposition
102 (CEEMD) and extended extreme learning machine (EELM) to forecast the crude oil price. Wen
103 et al. (2017) use complementary ensemble empirical mode decomposition (CEEMD) and
104 combined SVM and ANN to forecast the gold prices. Santhosh et al. (2019) combine EEMD and
105 Deep Boltzmann Machines (DBM) to forecast wind energy. However, the prediction errors from
106 the individual decomposed modes tend to accumulate, which could negatively affect the
107 forecasting results of the prediction model (Tang et al., 2015). Moreover, a mode-mixing
108 problem may occur in the process of empirical mode decomposition (EMD), which may produce
109 oscillations with similar scales in intrinsic mode functions (Colominas, Schlotthauer & Torres,
110 2014).
111
112 Based on previous studies discussed above, this paper proposes a novel data
113 decomposition-based hybrid bidirectional deep learning model to forecast the Bitcoin market
114 price. First, a non-recursive signal decomposition method, variational mode decomposition
115 (VMD), is introduced to decompose the historical Bitcoin price data into various intrinsic modes.
116 In comparison to the widely adopted EMD method, VMD can effectively avoid the mode-mixing
117 problem (Dragomiretskiy & Zosso, 2014). Second, a bidirectional gated recurrent unit neural
118 network (BiGRU) is employed as the deep learning prediction model. The proposed deep
119 learning model is able to extract a two-way sequential relationship in the time series (Ullah et
120 al., 2017). To assess the prediction performance of the proposed model, several types of
121 prediction models such as econometric models, machine learning models and deep learning
122 models are used as benchmarks. The results indicate that the proposed decomposition-based
123 bidirectional deep learning model effectively improves predictive performance. In addition, the results
124 reveal that although data decomposition improves the overall predictive ability of the model,
125 not all decomposed factors contribute to the improved predictive ability of the model equally. To
126 further test the practicality of the model, algorithmic trading is conducted based on the prediction
127 results and the performances are assessed against the buy and hold strategy. The results indicate
128 that the proposed VMD-LMH-BiGRU model generated higher returns in comparison to other
129 measured strategies.
130
131 Our intended contributions in this paper lie in the following aspects. First, by adopting a
132 bidirectional deep learning neural network structure, the model is able to capture the two-way
133 sequential relationship that exists in the time series. Since the current state is not only a reflection
134 of historical information, but also a basis for future states, the proposed bidirectional model is
135 more effective within the complex circumstance of the Bitcoin market. To the best of our
136 knowledge, bidirectional deep learning models have not been employed in the Bitcoin literature,
137 and we aim to close this gap. Second, rather than aggregating the forecasting results of all the
138 decomposed modes, the proposed model directly generates the prediction result by
139 simultaneously inputting the decomposed modes into the deep learning neural networks, which
140 avoids the accumulated estimation errors in the current “divide and conquer” ensemble
141 approaches. To the best of our knowledge, bidirectional deep learning models and data
142 decomposition techniques have not been employed in the Bitcoin literature.
143 Given the complex and volatile nature of the Bitcoin market,
144 the characteristics of bidirectional deep learning models and data decomposition techniques make
145 them particularly suitable for forecasting price changes in the Bitcoin market. We therefore
146 propose a novel hybrid
147 decomposition-based bidirectional deep learning prediction model to forecast the Bitcoin market
148 price. It serves as an initial attempt to develop a reliable Bitcoin forecasting and trading decision
149 support system using a hybrid deep learning method.
150
151 The remainder of this paper is organized as follows: Section 2 presents the
152 methodological framework of this paper, including variational mode decomposition and
153 bidirectional GRU neural networks. Section 3 presents the empirical study on the Bitcoin market
154 and the performance results, as well as the robustness tests of our proposed model. Section 4
155 concludes and provides plans for future works.
156
157 2. Methodological Framework
158
159 This section presents a data-driven decomposition-based methodology framework for
160 Bitcoin market price forecasting and algorithmic trading as shown in Figure 1.
161

162
163 Fig. 1. Methodology framework
164
165 In the proposed approach, two main steps are involved, i.e., data-decomposition and deep
166 learning forecasting.
167
168 Step 1: Data-decomposition
169 The VMD technique is used to decompose the original time
170 series of the Bitcoin market price $X_t$ into $K$ simple and stationary sub-series of different
171 frequencies, which correspond to the different inner factors of the data.
172
173 Step 2: Deep learning forecasting
174 A bidirectional GRU deep learning model is employed as the forecasting tool to generate
175 the prediction result for the bitcoin market price. The forecasting performance is evaluated by
176 comparing the proposed model with various benchmark models and robustness tests. Meanwhile, the
177 economic performance of the model is evaluated by algorithmic trading on the Bitcoin market.
178
179 Sections 2.1 – 2.2 provide a detailed description of the corresponding techniques of VMD
180 and Bidirectional GRU, respectively.

181
182 2.1. Variational mode decomposition
183
184 Variational mode decomposition (VMD) is an entirely non-recursive signal
185 decomposition technique proposed by Dragomiretskiy and Zosso (2014). Based on Wiener
186 filtering and the Hilbert transform (Wang & Markert, 2015), it decomposes the original input signal
187 $f(t)$ into a series of quasi-orthogonal band-limited discrete sub-signals $u_k$ that are mostly
188 centered tightly around their respective center frequencies $\omega_k$ (Liu, Cao & Chen, 2016). In
189 essence, VMD is a variational optimization problem that seeks to minimize the total bandwidth
190 of each mode. The optimization procedure is as follows (Zhang et al., 2017):
191
192 Step 1: Calculate the Hilbert transform of each mode $u_k$ to obtain its
193 one-sided frequency spectrum.
194 Step 2: Shift the frequency spectrum of each mode $u_k$ to baseband
195 by multiplying it by an exponential function tuned to the corresponding estimated center frequency.
196 Step 3: Estimate the bandwidth of each mode $u_k$ from the $H^1$ Gaussian
197 smoothness of the demodulated signal.
198
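199 The iterative minimization process can be expressed in the following form:
200

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}, \quad \text{s.t.} \quad \sum_{k} u_k = f(t) \qquad (4)$$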

202
203 where $\{u_k\}$ and $\{\omega_k\}$ are the modes and their respective center frequencies, $K$ denotes the
204 number of decomposed sub-signals, $\delta(t)$ denotes the Dirac delta function, $\otimes$ denotes the
205 convolution operator and $f(t)$ represents the original input signal.
206
207 To obtain the optimal solution of the constrained optimization problem in Equation (4), a
208 quadratic penalty term $\alpha$ and a Lagrangian multiplier $\lambda$ are introduced for finite
209 convergence and constraint enforcement purposes. Thus, the augmented Lagrangian
210 function $L$ can be obtained as follows:

$$L(\{u_k\}, \{\omega_k\}, \lambda) = \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k} u_k(t) \right\rangle \qquad (5)$$

215
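216 The Lagrangian function is transformed from the time domain to the frequency domain and the
217 corresponding extreme values are calculated. The modes $u_k$ and their respective center
218 frequencies $\omega_k$ are updated as follows:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k)^2} \qquad (6)$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \, |\hat{u}_k(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{u}_k(\omega)|^2 \, d\omega} \qquad (7)$$

Here a hat denotes the Fourier transform and $n$ is the iteration index.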
222
223 The optimal solution is then obtained using the alternating direction method of multipliers
224 (ADMM), and the original input signal 𝑓(𝑡) is decomposed into 𝐾 sub-signal modes.
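
The updates in Equations (6) and (7) can be illustrated with a short NumPy sketch. It is a simplified illustration and not the implementation used in this study: it works on the one-sided spectrum returned by `rfft`, omits the boundary mirroring that reference implementations usually apply, and adds a standard dual-ascent step for the multiplier that is not written out above. The function name `vmd_sketch`, the defaults `alpha=2000` and `tau=0`, the uniform initialisation of the center frequencies and the stopping rule are all assumptions made for illustration.

```python
import numpy as np

def vmd_sketch(f, K, alpha=2000.0, tau=0.0, n_iter=500, tol=1e-7):
    """Simplified VMD: returns (modes, center frequencies in cycles/sample)."""
    f = np.asarray(f, dtype=float)
    T = f.size
    freqs = np.fft.rfftfreq(T)            # non-negative frequencies, 0 .. 0.5
    f_hat = np.fft.rfft(f)                # one-sided spectrum of the input
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = 0.5 * np.arange(K) / K        # assumed uniform initialisation
    lam_hat = np.zeros(freqs.size, dtype=complex)

    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Equation (6): Wiener-filter the residual around omega_k.
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam_hat / 2) / (
                1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Equation (7): center of gravity of the mode's power spectrum.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = freqs @ power / (power.sum() + 1e-12)
        # Dual ascent on the multiplier (tau = 0 disables this update).
        lam_hat = lam_hat + tau * (f_hat - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (
            np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < tol:
            break

    modes = np.fft.irfft(u_hat, n=T, axis=1)  # back to the time domain
    return modes, omega


# Toy check on a synthetic two-tone signal (not Bitcoin prices).
t = np.arange(1000)
signal = np.cos(2 * np.pi * 0.05 * t) + np.cos(2 * np.pi * 0.2 * t)
modes, omega = vmd_sketch(signal, K=2)
```

On this toy signal the two estimated center frequencies should settle close to the two tone frequencies, which is the behaviour the bandwidth-minimization objective is designed to produce.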
225
226 2.2 Bidirectional GRU
227
228 Proposed by Schuster and Paliwal (1997), the bidirectional recurrent neural network
229 (BiRNN) is a recurrent neural network (RNN) that utilizes both forward and backward
230 information in the data. In this paper, the traditional RNN cells are replaced by gated recurrent
231 unit (GRU) cells. A GRU cell consists of two gates: an update gate $z_t$ and a reset gate $r_t$. The
232 update gate $z_t$ controls the amount of new input information that enters the current state; the
233 larger the value of $z_t$, the more the input information is updated in the cell. The reset gate $r_t$
234 controls the amount of information from the past state that is retained in the current state; the
235 smaller the value of $r_t$, the less historical information is kept. At time $t$, the cell calculations
236 are as follows:

$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \qquad (8)$$

$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \qquad (9)$$

$$\tilde{h}_t = \tanh\left(W_{\tilde{h}} \cdot [r_t * h_{t-1}, x_t]\right) \qquad (10)$$

$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t \qquad (11)$$

$$y_t = \sigma\left(W_o \cdot h_t\right) \qquad (12)$$
247
248 where $x_t$ denotes the input, $h_t$ denotes the hidden state, $y_t$ denotes the output, $\sigma$ represents the sigmoid function, $*$
249 is the element-wise multiplication, and $W_r$, $W_z$, $W_{\tilde{h}}$ and $W_o$ are the weight matrices.
250 $\tilde{h}_t$ represents the candidate vector that controls the degree to which new input information is
251 received in the current state.

252
253 Fig. 2. Structure of the bidirectional GRU neural network

254
255 As illustrated in Figure 2, the bidirectional gated recurrent unit (BiGRU) contains two
256 hidden layers: one layer processes information in the forward direction and the
257 other processes information in the backward direction. The two hidden layers are connected
258 to one output layer so that the BiGRU neural network can learn from both
259 directions of the data. Time series data contain a two-way sequential relationship, as the
260 current state is not only a reflection of historical information but also the basis of the future
261 state. BiGRU is therefore better suited to such data and can be expected to make more accurate
262 predictions.
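
As a minimal sketch of how Equations (8)-(11) and the two-direction structure fit together, the NumPy code below runs one GRU over a window from oldest to newest and a second GRU from newest to oldest, then concatenates the two final hidden states. It is illustrative only: the weights are random and untrained, bias terms are omitted as in the equations above, combining the directions by concatenation is an assumption, and the hidden size of 8 is arbitrary. The window of 25 steps and the 11 input features mirror the window length and number of VMD modes reported later in the paper.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update following Equations (8)-(11); bias terms omitted."""
    hx = np.concatenate([h_prev, x_t])
    r_t = sigmoid(W_r @ hx)                                       # reset gate, Eq. (8)
    z_t = sigmoid(W_z @ hx)                                       # update gate, Eq. (9)
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))   # candidate, Eq. (10)
    return (1.0 - z_t) * h_prev + z_t * h_cand                    # new state, Eq. (11)

def bigru_last_state(X, fwd, bwd, hidden):
    """Run one GRU forward and one backward over X (shape: steps x features)."""
    h_f = np.zeros(hidden)
    h_b = np.zeros(hidden)
    for x_t in X:             # oldest -> newest
        h_f = gru_step(x_t, h_f, *fwd)
    for x_t in X[::-1]:       # newest -> oldest
        h_b = gru_step(x_t, h_b, *bwd)
    return np.concatenate([h_f, h_b])   # joint representation for the output layer

rng = np.random.default_rng(0)
steps, features, hidden = 25, 11, 8     # hidden size is an arbitrary choice

def init():
    # Three untrained weight matrices (W_r, W_z, W_h) for one direction.
    return tuple(rng.normal(scale=0.1, size=(hidden, hidden + features))
                 for _ in range(3))

X = rng.normal(size=(steps, features))
state = bigru_last_state(X, init(), init(), hidden)
```

In a trained model the concatenated state would be passed to the output and fully connected layers; here it only demonstrates the data flow.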
263

264 3. Empirical Study


265
266 In this study, the proposed VMD-LMH-BiGRU model is used to predict the Bitcoin market
267 price and conduct algorithmic trading based on the predictions. In order to verify the
268 effectiveness of the proposed model, historical Bitcoin market price time series is used as the
269 sample data. In addition, several benchmark models are formulated for forecasting performance
270 comparison. Section 3.1 provides a detailed description about the experimental design. Section
271 3.2 presents the results and verifies that the proposed model is robust across different market
272 conditions and forecasting horizons.
273
274 3.1. Experiment Design
275
276 3.1.1. Data Description
277
278 This study aims to forecast the price fluctuations in the Bitcoin market and conduct
279 algorithmic trading based on the prediction results. The data used in this study consist of the
280 historical daily Bitcoin market closing price time series obtained from Quandl
281 (www.quandl.com). The raw data is from the period of January 1, 2013 to December 17, 2019,
282 with a total of 2542 observations. A graphical representation of the data is illustrated in Figure 3.
283 Table 1 presents the common descriptive statistics for the daily Bitcoin market price, together with the
284 augmented Dickey-Fuller (ADF) test statistic. The null hypothesis of a unit root cannot be rejected in the
285 ADF test, which indicates that the data is non-stationary.
286

287
288 Fig. 3. Daily Bitcoin closing price
289
290 Table 1. Common descriptive statistics of Bitcoin price

                Observations   Mean      Std. Dev.   Min     Max       Skewness   Kurtosis   ADF
Bitcoin price   2542           3056.12   3774.35     68.43   19497.4   1.3192     4.004      -1.05
291
292 The data are divided into two sets: a training set and a testing set. The first 90 percent
293 of the data is used to train the prediction model and the remaining 10 percent is used to evaluate
294 the model performance. Overall, the training set consists of 2290 observations from January 1,
295 2013 to April 9, 2019. The testing set contains 252 observations from April 10, 2019 to
296 December 17, 2019.
297
298 To eliminate the differences in variable dimensions, the data is normalized
299 using 0-1 (min-max) normalization as shown below:

$$\tilde{x}_t = \frac{x_t - \min x_t}{\max x_t - \min x_t} \qquad (13)$$

302 where $\tilde{x}_t$ denotes the normalized value, $x_t$ denotes the true value of the time series at time $t$, and $\max x_t$ and $\min x_t$ are the
303 maximum and the minimum true values of the time series, respectively.
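
Equation (13) can be written as a few lines of plain Python. The sketch below is illustrative; the helper names are hypothetical, and the three sample values are simply the minimum, mean and maximum from Table 1. Keeping the fitted minimum and maximum makes it possible to map predictions back to price units.

```python
def min_max_fit(values):
    """Return the (min, max) pair used by Equation (13)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return lo, hi

def min_max_scale(values, lo, hi):
    """Map values to [0, 1] using a previously fitted (lo, hi)."""
    return [(v - lo) / (hi - lo) for v in values]

def min_max_invert(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]

prices = [68.43, 3056.12, 19497.4]   # min, mean and max from Table 1
lo, hi = min_max_fit(prices)
scaled = min_max_scale(prices, lo, hi)
restored = min_max_invert(scaled, lo, hi)
```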

304
305 In this paper, a sliding-window prediction approach is adopted in the prediction process. The
306 window length, 𝑁, represents the data lag-order utilized for the prediction model. For example, a
307 window length of 𝑁 = 3 means that the model takes the input data from time 𝑡 − 2 to 𝑡 to
308 forecast the daily market price at time 𝑡 + 1. To determine the sliding-window length in this
309 study, a grid search is conducted with the search range of [1,100]. Figure 4 shows the prediction
310 results of the proposed VMD-LMH-BiGRU model with the window length of 𝑁 =
311 1, 5, 10, 25, 50, 100. When 𝑁 = 25, the model yielded the best prediction performance. As a
312 result, the sliding-window length for the model is set to 𝑁 = 25.
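
The sliding-window construction can be sketched as follows. The function name `make_windows` is hypothetical; with `n_lags=3` each input holds the values at times t-2 to t and the target is the value at t+1, as in the example above.

```python
def make_windows(series, n_lags):
    """Split a series into (input window, next value) pairs."""
    inputs, targets = [], []
    for end in range(n_lags, len(series)):
        inputs.append(series[end - n_lags:end])   # the n_lags most recent values
        targets.append(series[end])               # the value to be forecast
    return inputs, targets

X, y = make_windows([10, 11, 12, 13, 14], n_lags=3)
# X == [[10, 11, 12], [11, 12, 13]] and y == [13, 14]
```

A series of length T therefore yields T - N training pairs for a window length of N.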
313
314 The prediction model proposed in this study consists of five layers: an input layer, a forward
315 hidden layer, a backward hidden layer, an output layer and a fully connected layer. The
316 dimensions of the input layers, hidden layers and output layers are set to the same as that of the
317 input data. The fully connected layer consists of one node, which corresponds to the predicted
318 value. The model utilizes the Adam optimizer with a learning rate set to 0.01 with 𝑡𝑎𝑛ℎ
319 selected as the activation function. To ensure that the model does not overfit to the training
320 dataset, a rolling forecasting process with the rolling window set to 90 days is suggested and
321 shown in Figure 4 (Nowotarski et al., 2013). In addition, multi-step ahead predictions are also
322 generated to test and compare the robustness of the model.
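
The text does not spell out how the 90-day rolling window is applied. One plausible reading, shown here purely as an assumption, is a rolling-origin scheme in which the model is refit on all data up to a cut-off and then evaluated on the following 90 days before the cut-off moves forward. The helper name `rolling_origin_splits` and its arguments are hypothetical; the sample sizes are those reported in Section 3.1.1.

```python
def rolling_origin_splits(n_obs, n_train, horizon=90):
    """Yield (train indices, test indices) for successive forecast blocks."""
    start = n_train
    while start < n_obs:
        stop = min(start + horizon, n_obs)
        yield range(start), range(start, stop)   # train on everything before the block
        start = stop

# 2542 observations in total, the first 2290 used for initial training.
folds = list(rolling_origin_splits(2542, 2290, horizon=90))
```

Under these assumptions the 252 test observations are covered by three consecutive blocks, the last one shorter than 90 days.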
323

324
325 Fig. 4. Rolling Forecast Process
326
327
328 3.1.2. Forecasting performance measures
329

330 Four commonly used measures are chosen to assess the accuracy of the model, namely Mean
331 Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error
332 (MAPE), and Mean Absolute Error (MAE). The mathematical formulas are as follows:
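
$$MSE = \frac{1}{N} \sum_{t=1}^{N} (\hat{x}_t - x_t)^2 \qquad (14)$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (\hat{x}_t - x_t)^2} \qquad (15)$$

$$MAPE = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{\hat{x}_t - x_t}{x_t} \right| \qquad (16)$$

$$MAE = \frac{1}{N} \sum_{t=1}^{N} |\hat{x}_t - x_t| \qquad (17)$$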

341
342 where $\hat{x}_t$ and $x_t$ $(t = 1, 2, \ldots, N)$ are the predicted value and the actual value at time $t$, and $N$
343 represents the total number of data points in the testing set.
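
The four error measures in Equations (14)-(17) can be computed together, as in the sketch below. The function name and the three sample values are hypothetical. Note that MAPE is undefined when an actual value is zero, which is not handled here.

```python
import math

def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAPE and MAE as defined in Equations (14)-(17)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    diffs = [p - a for a, p in zip(actual, predicted)]
    mse = sum(d * d for d in diffs) / n
    mape = sum(abs(d / a) for d, a in zip(diffs, actual)) / n
    mae = sum(abs(d) for d in diffs) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape, "MAE": mae}

scores = forecast_errors([0.50, 0.40, 0.80], [0.45, 0.50, 0.80])
```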
344
345 Moreover, directional accuracy (DA) is introduced to assess the market trend predictive
346 ability of the model (Yu, Wang & Lai, 2008). The larger the DA, the better the model's ability
347 to predict the market trend:

$$DA = \frac{1}{N} \sum_{t=1}^{N} a_t \qquad (18)$$

where

$$a_t = \begin{cases} 1, & (x_t - x_{t-1})(\hat{x}_t - x_{t-1}) \geq 0 \\ 0, & (x_t - x_{t-1})(\hat{x}_t - x_{t-1}) < 0 \end{cases}$$
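
A direct translation of Equation (18) is sketched below. Because the first observation has no predecessor, this version averages over the comparisons that can actually be formed; that normalization, the function name and the sample values are assumptions for illustration.

```python
def directional_accuracy(actual, predicted):
    """Share of steps where the predicted move has the same sign as the true move."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least two points")
    hits = 0
    for t in range(1, len(actual)):
        true_move = actual[t] - actual[t - 1]
        pred_move = predicted[t] - actual[t - 1]
        if true_move * pred_move >= 0:      # a_t = 1 in Equation (18)
            hits += 1
    return hits / (len(actual) - 1)

# Moves: up (predicted up), down (predicted up), up (predicted up) -> 2 of 3 correct.
da = directional_accuracy([1.0, 2.0, 1.5, 1.8], [1.0, 1.6, 2.2, 2.5])
```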

353
354 3.1.3. Benchmark models
355
356 The benchmarking procedure consists of two steps. First, five single benchmark models,
357 including the autoregressive integrated moving average model (ARIMA), linear regression
358 model (LR), support vector regression model (SVR), GRU model and bidirectional GRU model
359 (BiGRU) are developed to compare the predictability of the proposed VMD-LMH-BiGRU
360 model. These models use the original Bitcoin market price and relevant market factors time
361 series as input features without data decomposition. By comparing the prediction performance of
362 the proposed VMD-LMH-BiGRU model with benchmark models formulated based on different
363 forecasting techniques utilized in previous literature, such as traditional econometric models,
364 machine learning models and deep learning models, we can comprehensively assess the
365 effectiveness of signal decomposition technique in improving the Bitcoin market forecasting
366 performance.
367 Second, in order to assess the effectiveness of different decomposed inner factors in
368 improving the forecasting performance of the model, hybrid one-characteristic models and
369 hybrid two-characteristic models are formulated by importing factors of different frequencies
370 into the proposed model.
371 The different inner factors extracted through data decomposition are classified as low
372 frequency, medium frequency and high frequency modes based on their respective periodicity.
373 They are selected respectively as input features to construct the corresponding hybrid
374 one-characteristic models – VMD-L-BiGRU, VMD-M-BiGRU and VMD-H-BiGRU. For hybrid
375 two-characteristic models, modes from two different frequency groups are imported as input features,
376 which results in three models – VMD-LM-BiGRU, VMD-LH-BiGRU and VMD-MH-BiGRU.
377 For each benchmark model, the parameters and lag order are kept consistent with those of the
378 VMD-LMH-BiGRU model.
379 Overall, eleven benchmark models, including five single models (GRU, BiGRU, ARIMA,
380 LR, SVR), three hybrid one-characteristic models (VMD-L-BiGRU, VMD-M-BiGRU,
381 VMD-H-BiGRU), and three hybrid two-characteristic models (VMD-LM-BiGRU,
382 VMD-LH-BiGRU and VMD-MH-BiGRU), are formulated to compare with the proposed
383 VMD-LMH-BiGRU model.
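
The construction of the hybrid benchmark inputs can be sketched as a simple column selection. The grouping used below (M1 as low, M2-M4 as medium, M5-M11 as high frequency) is the one reported in Section 3.2.1; the names `MODE_GROUPS` and `select_features` are hypothetical, and feeding the individual modes rather than their per-group sums is an assumption.

```python
# Zero-based indices of the 11 VMD modes M1..M11, grouped by frequency band.
MODE_GROUPS = {"L": [0], "M": [1, 2, 3], "H": list(range(4, 11))}

def select_features(modes, groups):
    """Build the input matrix for a hybrid model such as 'L', 'LH' or 'LMH'.

    modes  : list of 11 equal-length sequences, one per decomposed mode
    groups : string of group letters, e.g. 'LM' for VMD-LM-BiGRU
    """
    chosen = [i for g in groups for i in MODE_GROUPS[g]]
    # Transpose so that each row is one time step and each column one mode.
    return [list(row) for row in zip(*(modes[i] for i in chosen))]

# Toy modes: mode k takes the values 10*k, 10*k + 1, 10*k + 2.
toy_modes = [[10 * k + t for t in range(3)] for k in range(11)]
lm_inputs = select_features(toy_modes, "LM")
lmh_inputs = select_features(toy_modes, "LMH")
```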

384
385 3.2. Empirical Results
386
387 3.2.1. Data decomposition
388
389 After selecting relevant market features, the historical Bitcoin market price data is
390 decomposed via VMD. According to the decomposition results shown in Figure 5, the daily
391 market price is decomposed into 11 modes ranging from low frequency to high frequency. Each
392 decomposed mode is labeled from M1 to M11, respectively, with M1 having the lowest
393 frequency and M11 having the highest frequency. The decomposed modes contain different
394 inner factors hidden in the original signal that have various effects on the price movement in the
395 Bitcoin market.
396

397
398 Fig. 5. VMD decomposition results for the Bitcoin market price
399

400
401 Fig. 6. Fluctuation tendency for different frequency components of Bitcoin market price
402
403 To investigate the characteristics of the decomposed modes, the fast Fourier transform is
404 conducted to detect the main cyclic patterns within each mode by transforming the time series
405 into frequency domain and identifying the maximum spectral density (Welch, 1967; Wang et al.,
406 2014). Based on the detected cyclicity as shown in Figure 5, each mode is classified into one of
407 three groups: low frequency, medium frequency and high frequency. In terms of Bitcoin market
408 price, M1 has a cycle of approximately 3 years, which is significantly longer than that of other
409 decomposed modes. Thus, it is classified as the low frequency mode. Modes M2-M4 are
410 considered as medium frequency modes, and modes M5-M11 are regarded as high frequency
411 modes with relatively short cycles. Figure 6 illustrates the overall tendency of the three classified
412 frequency groups, where the low frequency is the M1 mode, medium frequency is the sum of
413 M2-M4, and the high frequency is the sum of M5-M11.
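
The cycle detection step can be illustrated with a small NumPy sketch that locates the peak of a mode's power spectrum and converts it to a period. The function name `dominant_period` is hypothetical, the synthetic sine wave stands in for a decomposed mode, and a plain periodogram is used here in place of whatever spectral estimator the study applied.

```python
import numpy as np

def dominant_period(mode, sample_spacing=1.0):
    """Period (in units of sample_spacing) at the peak of the power spectrum."""
    x = np.asarray(mode, dtype=float)
    x = x - x.mean()                              # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=sample_spacing)
    peak = np.argmax(power[1:]) + 1               # skip the zero-frequency bin
    return 1.0 / freqs[peak]

t = np.arange(1000)
period = dominant_period(np.sin(2 * np.pi * t / 50))   # synthetic 50-day cycle
```

With daily data the returned period is expressed in days, so a mode with a cycle of roughly three years would give a value on the order of a thousand.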
414
415 3.2.2. Forecasting performance evaluation with single benchmark models
416
417 Table 2. Performance comparison of single benchmark models in all intervals

Model        MSE      RMSE     MAPE     MAE      DA
One-step ahead
GRU          0.0003   0.0184   0.0272   0.0125   0.5238
BiGRU        0.0003   0.0165   0.0246   0.0114   0.5397
ARIMA        0.0003   0.0158   0.0101   0.0216   0.5437
LR           0.0005   0.0944   0.0336   0.0153   0.5278
SVR          0.0026   0.0506   0.2181   0.0407   0.5634
LMH-BiGRU    0.0001   0.0186   0.0233   0.0120   0.8175
Two-step ahead
GRU          0.0012   0.0341   0.0605   0.0269   0.4048
BiGRU        0.0009   0.0297   0.0533   0.0236   0.4683
ARIMA        0.0006   0.0249   0.0169   0.2586   0.5139
LR           0.0011   0.0334   0.0555   0.0253   0.5040
SVR          0.0034   0.0579   0.2206   0.0473   0.5800
LMH-BiGRU    0.0001   0.0079   0.0126   0.0058   0.7857
Three-step ahead
GRU          0.0047   0.0686   0.1292   0.0577   0.4087
BiGRU        0.0016   0.0404   0.0716   0.0319   0.4563
ARIMA        0.0006   0.0248   0.0172   0.2591   0.4840
LR           0.0018   0.0428   0.0729   0.0329   0.5480
SVR          0.0040   0.0644   0.2223   0.0522   0.5920
LMH-BiGRU    0.0001   0.0080   0.0127   0.0058   0.7659
418

419

420 Fig. 7. Prediction results comparison for proposed model and single benchmark models
421
422 As can be seen from Table 2, the five single benchmark models and the proposed
423 VMD-LMH-BiGRU model displayed markedly different performances. Looking at the
424 level of prediction accuracy among the single models, the BiGRU model performed better than
425 the traditional GRU model. Taking one-step ahead prediction as an example, the
426 𝑀𝑆𝐸, 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸 and 𝑀𝐴𝐸 criteria decrease by 18.18%, 10.21%, 9.45% and 8.94%,
427 respectively. This indicates that the bidirectional structure is superior to the traditional
428 unidirectional neural network structure. By adopting a bidirectional structure, the model is
429 able to extract more information from the time series, thus yielding better performance.
430 When comparing the level of prediction accuracy between the proposed model and the single
431 models, the proposed VMD-LMH-BiGRU model displayed far superior fitting performance by
432 performing better across all the criteria (𝑀𝑆𝐸, 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸 and 𝑀𝐴𝐸).
433
434 In terms of directional accuracy (𝐷𝐴), the one-step ahead 𝐷𝐴 values for the GRU, BiGRU,
435 ARIMA, LR and SVR models are 52.38%, 53.97%, 54.37%, 52.78% and 56.34%, respectively. The
436 directional accuracies achieved by these single models are all below 60%, which indicates that
437 despite their prediction accuracy, they are unable to effectively predict the Bitcoin market trend.
438 In comparison, the proposed VMD-LMH-BiGRU model displayed significantly better market
439 trend predictability by achieving a 𝐷𝐴 value of 81.75%, which indicates that the proposed model
440 is able to predict the market trend effectively.
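The five criteria reported in Table 2 can be computed along the following lines. This is a minimal sketch with made-up numbers, not the authors' code. In particular, the directional accuracy definition used here (share of steps where the predicted move from the previous actual value has the same sign as the realized move) is an assumption of the sketch, and the `evaluate` name and sample values are hypothetical.

```python
import math


def evaluate(actual, predicted):
    """Compute MSE, RMSE, MAPE, MAE and DA for two aligned series.

    DA is assumed to be the fraction of steps where the predicted change
    relative to the previous actual value has the same sign as the
    realized change.
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    hits = 0
    for t in range(1, n):
        real_move = actual[t] - actual[t - 1]
        pred_move = predicted[t] - actual[t - 1]
        if real_move * pred_move > 0:  # same direction, both non-zero
            hits += 1
    da = hits / (n - 1)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape, "MAE": mae, "DA": da}


# Hypothetical values for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [100.0, 101.0, 102.0, 104.0]
scores = evaluate(actual, predicted)
```

A useful sanity check on any such table is that MAE can never exceed RMSE for the same set of errors.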
441
442 As presented in Figure 7, the forecasting performance results show that compared to the
443 single benchmark models, the proposed decomposition-based hybrid model is able to
444 significantly improve the Bitcoin market prediction accuracy and directional accuracy. The
445 improvement can mainly be attributed to data decomposition, which effectively decomposes the
446 complex historical Bitcoin market price time series into inner factors of different frequencies.
447 These hidden inner factors can reveal the patterns and information that exist in the original time
448 series, which enhances the model prediction accuracy.
449
450 3.2.4. Forecasting performance evaluation with hybrid benchmark models

451
452 Although the decomposed inner factors can significantly enhance the market prediction
453 accuracy of the model, the decomposed inner factors have different cycles that range from
454 several days to several years. The differences in period lengths may indicate that for short-term
455 price prediction in the Bitcoin market, not all decomposed factors contribute to the improved
456 predictive ability of the model equally. Therefore, we further assess the effectiveness of different
457 decomposed inner factors in improving the forecasting performance of the model by comparing
458 the performances of the proposed VMD-LMH-BiGRU model with hybrid one-characteristic and
459 hybrid two-characteristic models.
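The set of models compared in this subsection follows from taking every non-empty combination of the three frequency groups. The short sketch below only enumerates the model names and the groups each one uses; the `model_variants` helper is hypothetical, and how the selected groups are fed into the BiGRU network is not shown.

```python
from itertools import combinations

# L, M and H stand for the low-, medium- and high-frequency groups.
GROUPS = ("L", "M", "H")


def model_variants(groups=GROUPS):
    """Map each model name to the frequency groups it takes as input."""
    variants = {}
    for size in range(1, len(groups) + 1):
        for subset in combinations(groups, size):
            variants["VMD-" + "".join(subset) + "-BiGRU"] = subset
    return variants


variants = model_variants()
```

This gives three one-characteristic models, three two-characteristic models and the full VMD-LMH-BiGRU model, seven in total.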
460
461 The proposed VMD-LMH-BiGRU model and the six hybrid benchmark models are
462 formulated and used to forecast the daily price of the Bitcoin market. The one-step prediction
463 result comparisons are shown in Figures 7-9 and the prediction accuracy results are reported in
464 Table 3. According to the comparison results, the proposed VMD-LMH-BiGRU model not only
465 achieved the highest prediction accuracy (measured by 𝑀𝑆𝐸, 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸 and 𝑀𝐴𝐸), but
466 also obtained the highest directional accuracy (measured by 𝐷𝐴) across different forecasting
467 horizons (1-step, 2-step and 3-step). In comparison, it performed significantly better than the
468 benchmark models, including the single models, the hybrid one-characteristic models and the
469 hybrid two-characteristic models. This superior performance shows that the proposed model
470 effectively captures the different inner factors that exist in the Bitcoin market and thus
471 significantly enhances the final prediction accuracy.
472

473
474 Fig. 8. Prediction results comparison for proposed model and hybrid one-characteristic
475 models
476

477
478 Fig. 9. Prediction results comparison for proposed model and hybrid two-characteristics
479 models

480 Examining the prediction accuracy results for the hybrid one-characteristic models and
481 the two-characteristic models, it is clear that the proposed VMD-LMH-BiGRU model
482 outperforms all the benchmark models across multiple forecasting horizons. In terms of the
483 hybrid one-characteristic models, the VMD-H-BiGRU model with high-frequency inner factors
484 achieves the best prediction accuracy amongst the three models. When medium-frequency modes
485 are used, the VMD-M-BiGRU experiences a considerable decline in its forecasting performance.
486 Finally, when the low-frequency factors are included instead, the VMD-L-BiGRU achieves the
487 worst performance, with a significant increase across all error criteria (𝑀𝑆𝐸, 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸
488 and 𝑀𝐴𝐸). These results indicate that the inner factors of different frequencies decomposed
489 from the original Bitcoin market time series have various effects on the prediction model.
490
491 A similar pattern can be observed in the hybrid two-characteristic models. When the
492 high-frequency modes are removed from the proposed VMD-LMH-BiGRU model, the
493 prediction performance of the resulting VMD-LM-BiGRU model suffers a significant decrease
494 with the 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸 and 𝑀𝐴𝐸 increasing by 4.93, 5.08 and 5.19 times, respectively. When
495 the medium frequency inner factors are removed, the 𝑅𝑀𝑆𝐸 of the VMD-LH-BiGRU model
496 doubled from 0.007 to 0.014. In comparison, with the low-frequency modes removed, the
497 VMD-MH-BiGRU model achieves better prediction performance than the other two hybrid
498 two-characteristics models and obtains the smallest 𝑀𝑆𝐸, 𝑅𝑀𝑆𝐸, 𝑀𝐴𝑃𝐸 and 𝑀𝐴𝐸 values.
499 These results further indicate that although the short-term Bitcoin price prediction performance is
500 affected by all decomposed inner factors, factors of different frequencies have various effects on
501 the Bitcoin market prediction performance. In particular, the high-frequency modes contribute
502 the most to improving the Bitcoin market price predictions in the proposed model, whereas the
503 low-frequency inner factors have the least effect in improving the prediction performance.
504
505 Examining the directional accuracies of the benchmark models, it is clear that the
506 VMD-L-BiGRU hybrid one-characteristic model obtained an accuracy below 60%, which indicates
507 that it is unable to effectively predict the Bitcoin market trend. In addition, the hybrid
508 two-characteristic VMD-MH-BiGRU model achieved the highest directional accuracy out of all
509 the benchmark models. This further shows that the low-frequency modes have trivial effects on
510 the short-term market movements, while the high-frequency modes have essential effects.

511
512 The inner factors of different frequencies may contain different hidden information with
513 various economic significance (Wang et al., 2014). Specifically, the low-frequency inner factor
514 approximately captures the long-term trend in the Bitcoin market prices from 2013 to 2019. Since
515 Bitcoin is recognized as an investment option and a trading commodity, its long-term trend may
516 have been largely influenced by the economic cycle and reflects changes in the global economy
517 (Dyhrberg, 2016; Ji et al., 2019).
518
519 The medium-frequency modes, which have periods ranging from one month to eight
520 months, represent economic and political events related to Bitcoin and other cryptocurrencies.
521 These events, such as international regulatory policies, are important drivers of
522 Bitcoin market price volatility over the medium term. As a result, the high-volatility points in the
523 market price reflect the influences of these events on the Bitcoin market, as
524 illustrated in Figure 6. For example, the Bitcoin market prices crashed in early January, which
525 corresponds to the time when multiple governments such as China, South Korea and the United
526 States announced tightened regulations on Bitcoin trading (Cumming, Johan & Pant, 2019). Over
527 time, the medium-frequency component reverts to the mean, which indicates that the Bitcoin
528 market has absorbed the influences of these events. As a result, the Bitcoin market price
529 eventually returns to its long-term trend.
530
531 The high-frequency inner factors have the shortest periods out of all three components,
532 which range from 4 to 21 days. These factors may reflect short-term fluctuations
533 such as investor speculation in the market. These random disturbances exhibit
534 mean-reversion characteristics with very short cycles, which indicates that their influences are
535 quickly dissipated in the market and rarely sustained over time. However, these high-frequency
536 inner factors may have relatively greater effects on the short-term fluctuations in the Bitcoin
537 market, which makes them more meaningful for short-term price forecasting.
538
539 Table 3. Performance comparison of hybrid benchmark models in all intervals
Model        MSE      RMSE     MAPE     MAE      DA
One-step ahead
L-BiGRU      0.0021   0.0455   0.0824   0.0367   0.5159
M-BiGRU      0.0004   0.0210   0.0351   0.0162   0.6151
H-BiGRU      0.0002   0.0128   0.0217   0.0099   0.6627
LM-BiGRU     0.0014   0.0368   0.0595   0.0280   0.6548
LH-BiGRU     0.0003   0.0171   0.0314   0.0141   0.6429
MH-BiGRU     0.0002   0.0148   0.0269   0.0120   0.7619
LMH-BiGRU    0.0001   0.0186   0.0233   0.0120   0.8175
Two-step ahead
L-BiGRU      0.0033   0.0573   0.0976   0.0423   0.4722
M-BiGRU      0.0007   0.0258   0.0446   0.0203   0.6190
H-BiGRU      0.0002   0.0126   0.0210   0.0096   0.6865
LM-BiGRU     0.0012   0.0343   0.0559   0.0258   0.6508
LH-BiGRU     0.0002   0.0146   0.0272   0.0120   0.6349
MH-BiGRU     0.0001   0.0101   0.0174   0.0079   0.7500
LMH-BiGRU    0.0001   0.0079   0.0126   0.0058   0.7857
Three-step ahead
L-BiGRU      0.0050   0.0707   0.1340   0.0569   0.4524
M-BiGRU      0.0011   0.0329   0.0566   0.0262   0.5675
H-BiGRU      0.0002   0.0147   0.0259   0.0116   0.6270
LM-BiGRU     0.0012   0.0348   0.0657   0.0294   0.6468
LH-BiGRU     0.0004   0.0196   0.0367   0.0161   0.5992
MH-BiGRU     0.0001   0.0100   0.0178   0.0081   0.7183
LMH-BiGRU    0.0001   0.0080   0.0127   0.0058   0.7659
540
541 Generally speaking, the proposed VMD-LMH-BiGRU model displays higher prediction
542 accuracy in comparison to all the benchmark models, including the single models, the hybrid
543 one-characteristic models and the hybrid two-characteristic models. This indicates that by
544 decomposing the original Bitcoin market price time series into different inner factors, the model
545 can effectively extract the hidden information that exists in the data and significantly improve the
546 prediction performance. The proposed model also obtained the highest directional accuracy out
547 of all the constructed models. This shows that the proposed model can effectively capture the
548 Bitcoin market movement trend, making it a practical and promising technique for predicting the
549 Bitcoin market price. Although the prediction performance of the proposed VMD-LMH-BiGRU
550 model is greatly enhanced due to the comprehensive effect of all the decomposed inner factors,
551 the inner factors of different frequencies have various effects on model prediction results. In
552 particular, the high-frequency modes contain mostly the short-term random fluctuations that exist
553 in the market. As a result, they contribute most to the improvement in the short-term Bitcoin price
554 prediction accuracy.
555
556 3.2.5. Trading results comparisons
557
558 To determine the practicality of using the proposed model as a decision support tool in
559 real-world Bitcoin trading, algorithmic trading is conducted based on the predicted Bitcoin market
560 price. The forecasting model produces a buy signal if the predicted Bitcoin market price for the next day
561 is higher than the current market price. If the predicted price for the next day is lower
562 than the current price, the forecasting model generates a sell signal. Otherwise, the model
563 produces a hold signal. The generated signals are then used to conduct algorithmic
564 trading. Specifically, if a buy signal is generated, all the available capital is used to
565 purchase Bitcoin at that time. If a sell signal is generated, all the
566 Bitcoin held is sold at that time. In this study, the initial investment capital is
567 set to $100,000.
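The signal rule and the all-in, all-out position sizing described above can be sketched as a small simulation. The sketch assumes no transaction fees or slippage and marks any remaining Bitcoin at the final price; the `simulate_trading` function, its parameters and the sample prices are hypothetical and are not taken from the paper.

```python
def simulate_trading(prices, forecasts, initial_capital=100_000.0):
    """Trade all-in or all-out on next-day forecasts.

    prices[t] is the price on day t and forecasts[t] is the predicted
    price for day t + 1. Fees and slippage are ignored (an assumption).
    Returns the final portfolio value marked at the last price.
    """
    cash, coins = initial_capital, 0.0
    for price, forecast in zip(prices[:-1], forecasts):
        if forecast > price and cash > 0:      # buy signal: invest all cash
            coins, cash = cash / price, 0.0
        elif forecast < price and coins > 0:   # sell signal: liquidate
            cash, coins = coins * price, 0.0
        # otherwise hold: the position is left unchanged
    return cash + coins * prices[-1]


# Hypothetical prices and forecasts for illustration only.
prices = [100.0, 110.0, 105.0, 120.0]
forecasts = [108.0, 104.0, 118.0]  # predictions for days 1, 2 and 3
final_value = simulate_trading(prices, forecasts)
```

In this toy run the strategy buys at 100, sells at 110, buys again at 105 and ends at 120, so the portfolio finishes at roughly $125,714.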
568
569 To be potentially useful as a trading decision support system, the forecasting model must
570 have a directional accuracy higher than the 50% that could be obtained by chance. In this paper,
571 we set the directional accuracy threshold to 60%, which means the prediction model must have a
572 directional accuracy above 60% to be considered able to capture the market trends effectively. As shown
573 in Tables 2 and 3, the proposed VMD-LMH-BiGRU model achieved a directional accuracy of 81.75%,
574 which is higher than that of all the benchmark models. In addition, since the directional accuracies
575 of the single models (GRU, BiGRU, ARIMA, LR, SVR) and the VMD-L-BiGRU model are all
576 below 60%, they are considered incapable of predicting the market trends and thus are not
577 used in the trading comparisons.
578

579 Assuming that the Efficient Market Hypothesis (EMH) holds, trading strategies cannot
580 consistently outperform the market (Fama, 1970). Thus, the
581 buy-and-hold strategy is included as a benchmark for the trading performance of the proposed prediction
582 model. Under the buy-and-hold strategy, Bitcoin is purchased on the first day of the trading
583 interval and sold on the last day. In this paper, the annualized return (𝐴𝑅) is used as the
584 performance measure to compare the trading strategies, which is calculated as follows:
585
586 $$AR = \left[\left(\frac{\text{Total Capital}}{\text{Initial Capital}}\right)^{(\ldots)} - 1\right] \times 100\% \qquad (19)$$

587
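A compound annualization of this kind could be computed as in the sketch below. The exponent used here, 252 divided by the number of trading days held, is an assumption of the sketch and is not confirmed by the source. The `annualized_return` function and the sample figures are hypothetical.

```python
def annualized_return(total_capital, initial_capital, trading_days,
                      days_per_year=252):
    """Return the compound annualized return in percent.

    days_per_year=252 is an assumed trading-year length; the growth
    factor is raised to days_per_year / trading_days to annualize it.
    """
    growth = total_capital / initial_capital
    return (growth ** (days_per_year / trading_days) - 1.0) * 100.0


# Hypothetical example: 21% growth over two assumed trading years.
two_year_example = annualized_return(121_000.0, 100_000.0, 504)
```

Under a different annualization convention (for example, scaling log returns linearly) the same capital path gives a different figure, so reported values are only comparable when the convention is the same.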
588 The out-of-sample testing interval consists of 252 trading days from April 10, 2019 to
589 December 17, 2019. As illustrated in Figure 3, the Bitcoin market price experienced significant
590 fluctuations during the testing interval. From April 10, 2019 to July 11, 2019, the Bitcoin price
591 rose from $5197.75 to its highest point of $12668.62. From July 12, 2019 to December 17, 2019,
592 the price fell back down from the peak to $7046.69. Taking this into consideration, the
593 out-of-sample testing period is split into two intervals: the “Up” interval consists of 93
594 days from April 10, 2019 to July 11, 2019, and the “Down” interval consists of 159 days from July
595 12, 2019 to December 17, 2019.
596

597
598 Fig. 10. Trading performance of all trading strategies
599
600 Table 4. Annualized returns of all trading strategies
Investment Strategy   Annualized Return by interval
                      Overall    Up         Down
VMD-LMH-BiGRU         235.23%    383.06%    148.76%
Buy and Hold          30.43%     241.41%    -78.46%
VMD-M-BiGRU           84.13%     223.00%    2.98%
VMD-H-BiGRU           173.88%    333.23%    80.67%
VMD-LM-BiGRU          132.91%    247.17%    66.08%
VMD-LH-BiGRU          168.52%    323.48%    77.87%
VMD-MH-BiGRU          219.37%    356.36%    139.24%
601
602 Figure 10 and Table 4 present the annualized returns obtained by all trading strategies in
603 the out-of-sample “Up” interval, the “Down” interval, and the overall interval. The results
604 clearly demonstrate that the proposed VMD-LMH-BiGRU prediction model outperforms the
605 naïve buy-and-hold strategy as well as the other benchmark models across all testing periods.
606 Specifically, during the “Up” interval where the Bitcoin market price soared quickly, an investor
607 trading on the Bitcoin market using the signals generated by the proposed
608 VMD-LMH-BiGRU model achieves an annualized return of 383.06% over 93 trading days. This
609 is a 58.67% relative increase compared to the buy-and-hold strategy (241.41%). During the “Down” interval where
610 the Bitcoin market experienced significant losses, the proposed model is able to withstand the
611 negative market impacts and still generate profits. The trading results above clearly show
612 that the proposed VMD-LMH-BiGRU model is able to generate accurate buy and sell signals
613 based on the predicted Bitcoin price. More importantly, the proposed model is able to reduce the
614 negative impacts of bearish market conditions and generate profits steadily.
615
616 Overall, the superior trading performance displayed by the proposed VMD-LMH-BiGRU
617 model indicates it is not only an effective prediction model, but also a potentially useful trading
618 support system.
619

620 4. Conclusion
621
622 Although Bitcoin has attracted significant attention from investors and policy makers,
623 empirical work on Bitcoin price forecasting models is at an early stage. This paper fills
624 the gap by proposing the VMD-LMH-BiGRU model, a novel bidirectional deep learning model
625 combined with data-decomposition techniques, to forecast the Bitcoin market price. The
626 prediction performance of the proposed model is assessed against several benchmark models,
627 including the single models, hybrid one-characteristic models and the hybrid two-characteristics
628 models.
629
630 In our study, by decomposing the original Bitcoin price time series into different inner
631 factors, the proposed model is able to effectively capture the hidden patterns of different
632 frequencies that exist in the time series. In addition, by adopting a bidirectional neural network
633 structure, the proposed model is able to effectively capture the two-way sequential relationship
634 within the time series.
635
636 According to our empirical results, the proposed VMD-LMH-BiGRU model outperformed
637 all the benchmark models in terms of prediction accuracy across multiple forecasting periods.
638 Moreover, the proposed model also displays superior trading performance in comparison to
639 other benchmarks such as the buy-and-hold strategy. In particular, our model shows strong
640 consistency in generating profits under volatile market conditions. It effectively reduces the
641 negative impacts of bearish market conditions, which is especially important for avoiding losses in
642 the highly volatile Bitcoin market. Overall, the superior performance demonstrated by the
643 proposed VMD-LMH-BiGRU model in terms of prediction accuracy and trading results
644 indicates that it is not only an effective prediction model, but also a potentially useful trading
645 support system.
646
647 In addition, this study investigates the effects of different decomposed frequency modes
648 on the prediction performance of the model. The results show that the inner factors of different
649 frequencies have various effects on model prediction results. In particular, the high-frequency
650 modes contain mostly the short-term random fluctuations that exist in the market. As a result, they
651 contribute most to the improvement in the short-term Bitcoin price prediction accuracy. By
652 investigating the effects of different decomposed frequency modes on the prediction result, this study
653 further reveals the potential factors that affect the Bitcoin market movement.
654
655 To conclude, this paper extends the Bitcoin literature by serving as a first attempt towards
656 developing a reliable Bitcoin forecasting and trading decision support system using a novel data
657 decomposition-based hybrid bidirectional deep learning method. In future work, features from
658 external financial environments should be exploited to investigate their effects on the Bitcoin price
659 prediction performance of the proposed model. Moreover, future attempts should be made to
660 develop a more user-friendly decision support system for investors.
661
662

663 References:
664 [1] Altan, A., Karasu, S., & Bekiros, S. (2019). Digital currency forecasting with chaotic
665 meta-heuristic bio-inspired signal processing techniques. Chaos, Solitons & Fractals, 126,
666 325-336.
667 [2] Atsalakis, G. S., Atsalaki, I. G., Pasiouras, F., & Zopounidis, C. (2019). Bitcoin price
668 forecasting with neuro-fuzzy techniques. European Journal of Operational Research, 276(2),
669 770-780.
670 [3] Böhme, R., Christin, N., Edelman, B., & Moore, T. (2015). Bitcoin: Economics, technology,
671 and governance. Journal of Economic Perspectives, 29(2), 213-38.
672 [4] Cheng, S. F., De Franco, G., Jiang, H., & Lin, P. (2019). Riding the Blockchain Mania:
673 Public Firms’ Speculative 8-K Disclosures. Management Science, 65(12), 5901-5913.
674 [5] Cohen, A., & Nissim, N. (2018). Trusted detection of ransomware in a private cloud using
675 machine learning methods leveraging meta-features from volatile memory. Expert Systems
676 with Applications, 102, 158-178.
677 [6] Colominas, M. A., Schlotthauer, G., & Torres, M. E. (2014). Improved complete ensemble
678 EMD: A suitable tool for biomedical signal processing. Biomedical Signal Processing and
679 Control, 14, 19-29.
680 [7] Cumming, D. J., Johan, S., & Pant, A. (2019). Regulation of the Crypto-Economy:
681 Managing Risks, Challenges, and Regulatory Uncertainty. Journal of Risk and Financial
682 Management, 12(3), 126.
683 [8] De Vries, A. (2018). Bitcoin's growing energy problem. Joule, 2(5), 801-805.
684 [9] Dragomiretskiy, K., & Zosso, D. (2013). Variational mode decomposition. IEEE
685 Transactions on Signal Processing, 62(3), 531-544.
686 [10] Dyhrberg, A. H. (2016). Bitcoin, gold and the dollar–A GARCH volatility analysis. Finance
687 Research Letters, 16, 85-92.
688 [11] Easley, D., O'Hara, M., & Basu, S. (2019). From mining to markets: The evolution of
689 bitcoin transaction fees. Journal of Financial Economics, 134(1), 91-109.
690 [12] Foley, S., Karlsen, J. R., & Putniņš, T. J. (2019). Sex, drugs, and bitcoin: How much illegal
691 activity is financed through cryptocurrencies?. The Review of Financial Studies, 32(5),
692 1798-1853.
693 [13] Gandal, N., Hamrick, J. T., Moore, T., & Oberman, T. (2018). Price manipulation in the
694 Bitcoin ecosystem. Journal of Monetary Economics, 95, 86-96.
695 [14] Garcia, D., Tessone, C. J., Mavrodiev, P., & Perony, N. (2014). The digital traces of bubbles:
696 feedback cycles between socio-economic signals in the Bitcoin economy. Journal of the
697 Royal Society Interface, 11(99), 1-8.
698 [15] Gatabazi, P., Mba, J. C., Pindza, E., & Labuschagne, C. (2019). Grey Lotka–Volterra
699 models with application to cryptocurrencies adoption. Chaos, Solitons & Fractals, 122,
700 47-57.
701 [16] Heaven, D. (2019). Bitcoin for the biological literature. Nature, 566(7742), 141.
702 [17] Janssen, M., Weerakkody, V., Ismagilova, E., Sivarajah, U., & Irani, Z. (2020). A
703 framework for analysing blockchain technology adoption: Integrating institutional, market
704 and technical factors. International Journal of Information Management, 50, 302-309.
705 [18] Ji, S., Kim, J., & Im, H. (2019). A Comparative Study of Bitcoin Price Prediction Using
706 Deep Learning. Mathematics, 7(10), 898.
707 [19] Jones, N. (2018). How to stop data centres from gobbling up the world's electricity. Nature,
708 561(7722), 163-167.
709 [20] Katsiampa, P. (2017). Volatility estimation for Bitcoin: A comparison of GARCH models.
710 Economics Letters, 158, 3-6.
711 [21] Kristjanpoller, W., & Minutolo, M. C. (2018). A hybrid volatility forecasting framework
712 integrating GARCH, artificial neural network, technical analysis and principal components
713 analysis. Expert Systems with Applications, 109, 1-11.
714 [22] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
715 [23] Leng, J., Jiang, P., Xu, K., Liu, Q., Zhao, J. L., Bian, Y., & Shi, R. (2019). Makerchain: A
716 blockchain with chemical signature for self-organizing process in social manufacturing.
717 Journal of Cleaner Production, 234, 767-778.
718 [24] Li, L., Liu, J., Chang, X., Liu, T., & Liu, J. (2020). Toward conditionally anonymous
719 Bitcoin transactions: A lightweight-script approach. Information Sciences, 509, 290-303.
720 [25] Liu, W., Cao, S., & Chen, Y. (2016). Applications of variational mode decomposition in
721 seismic time-frequency analysis. Geophysics, 81(5), 365-378.
722 [26] Madan, I., Saluja, S., & Zhao, A. (2015). Automated bitcoin trading via machine learning
723 algorithms. http://cs229.stanford.edu/proj2014/Isaac%20Madan,20.
724 [27] Malkiel, B. G., & Fama, E. F. (1970). Efficient capital markets: A review of theory and
725 empirical work. The Journal of Finance, 25(2), 383-417.
726 [28] Masanet, E., Shehabi, A., Lei, N., Vranken, H., Koomey, J., & Malmodin, J. (2019).
727 Implausible projections overestimate near-term Bitcoin CO2 emissions. Nature Climate
728 Change, 9(9), 653-654.
729 [29] Mora, C., Rollins, R. L., Taladay, K., Kantar, M. B., Chock, M. K., Shimada, M., & Franklin,
730 E. C. (2018). Bitcoin emissions alone could push global warming above 2 °C. Nature Climate
732 [30] Moro, S., Cortez, P., & Rita, P. (2014). A data-driven approach to predict the success of
733 bank telemarketing. Decision Support Systems, 62, 22-31.
734 [31] Mu, W., Bian, Y., & Zhao, J. L. (2019). The role of online leadership in open collaborative
735 innovation. Industrial Management & Data Systems, 119(9), 1969-1987.
736 [32] Nowotarski, J., Tomczyk, J., & Weron, R. (2013). Robust estimation and forecasting of the
737 long-term seasonal component of electricity spot prices. Energy Economics, 39, 13-27.
738 [33] Peng, Y., Albuquerque, P. H. M., de Sá, J. M. C., Padula, A. J. A., & Montenegro, M. R.
739 (2018). The best of two worlds: Forecasting high frequency volatility for cryptocurrencies
740 and traditional currencies with Support Vector Regression. Expert Systems with
741 Applications, 97, 177-192.
742 [34] Santhosh, M., Venkaiah, C., & Kumar, D. V. (2019). Short-term wind speed forecasting
743 approach using ensemble empirical mode decomposition and deep Boltzmann machine.
744 Sustainable Energy, Grids and Networks, 19, 100242.
745 [35] Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE
746 Transactions on Signal Processing, 45(11), 2673-2681.
747 [36] Tang, L., Dai, W., Yu, L., & Wang, S. (2015). A novel CEEMD-based EELM ensemble
748 learning paradigm for crude oil price forecasting. International Journal of Information
749 Technology & Decision Making, 14(01), 141-169.
750 [37] Ullah, A., Ahmad, J., Muhammad, K., Sajjad, M., & Baik, S. W. (2017). Action recognition
751 in video sequences using deep bi-directional LSTM with CNN features. IEEE Access, 6,
752 1155-1166.
753 [38] Wang, S., Hu, A., Wu, Z., Liu, Y., & Bai, X. (2014). Multiscale combined model based on
754 run-length-judgment method and its application in oil price forecasting. Mathematical
755 Problems in Engineering, 1-9.
756 [39] Wang, Y., Markert, R., Xiang, J., & Zheng, W. (2015). Research on variational mode
757 decomposition and its application in detecting rub-impact fault of the rotor system.
758 Mechanical Systems and Signal Processing, 60, 243-251.
759 [40] Welch, P. (1967). The use of fast Fourier transform for the estimation of power spectra: a
760 method based on time averaging over short, modified periodograms. IEEE Transactions on
761 Audio and Electroacoustics, 15(2), 70-73.
762 [41] Wen, F., Yang, X., Gong, X., & Lai, K. K. (2017). Multi-scale volatility feature analysis and
763 prediction of gold price. International Journal of Information Technology & Decision
764 Making, 16(01), 205-223.
765 [42] Yu, J. H., Kang, J., & Park, S. (2019). Information availability and return volatility in the
766 bitcoin Market: Analyzing differences of user opinion and interest. Information Processing
767 & Management, 56(3), 721-732.
768 [43] Yu, L., Wang, S., & Lai, K. K. (2008). Forecasting crude oil price with an EMD-based
769 neural network ensemble learning paradigm. Energy Economics, 30(5), 2623-2635.
770 [44] Yu, L., Wang, Z., & Tang, L. (2015). A decomposition–ensemble model with
771 data-characteristic-driven reconstruction for crude oil price forecasting. Applied Energy, 156,
772 251-267.
773 [45] Zhang, C., Zhou, J., Li, C., Fu, W., & Peng, T. (2017). A compound structure of ELM based
774 on feature selection and parameter optimization using hybrid backtracking search algorithm
775 for wind speed forecasting. Energy Conversion and Management, 143, 360-376.
776 [46] Zheng, Z., Xie, S., Dai, H.N., Chen, X., & Wang, H. (2018). Blockchain challenges and
777 opportunities: a survey. International Journal of Web and Grid Services, 14(4), 352-375.
778 [47] Zola, P., Cortez, P., & Carpita, M. (2019). Twitter user geolocation using web country noun
779 searches. Decision Support Systems, 120, 50-59.
