Predictive Modelling of Turbofan Engine Components Condition Using Machine and Deep Learning Methods
a Institute of Heat Engineering, Faculty of Power and Aeronautical Engineering, Warsaw University of Technology, Nowowiejska 21/25, 00-665 Warsaw, Poland
b General Electric Company Polska sp. z o.o., Al. Krakowska 110/114, 02-256 Warsaw, Poland
Highlights
• 0-10 condition rank of a turbofan life limiting component is predicted.
• Environmental and engine sensor data preceding the condition observation are used.
• Ensemble meta-model of neural networks showed the best performance.
• Support vector machines and gradient boosted models did not match neural nets.
• Linear model demonstrated the worst performance among the considered models.
Abstract
The article proposes an approach based on deep and machine learning models to predict a component failure as an enhancement of the condition-based maintenance scheme of a turbofan engine, and reviews the prognostics approaches currently used in the aviation industry. A component degradation scale representing its life consumption is proposed, and the condition data collected in this way are combined with engine sensor and environmental data. With the use of data manipulation techniques, a framework for model training is created and the models' hyperparameters are obtained through Bayesian optimization. The models predict the continuous variable representing condition based on the input. The best performing model is identified by determining its score on the holdout set. Deep learning models achieved a 0.71 MSE score (ensemble meta-model of neural networks) and significantly outperformed machine learning models, whose best score was 1.75. The deep learning models showed their feasibility to predict the component condition within less than 1 unit of error on the rank scale.
Keywords
reliability, prognostics, deep learning, machine learning, gas turbine, turbofan engine, neural network, condition-based maintenance.
This is an open access article under the CC BY license ([Link])
1. Introduction
A modern aircraft's turbofan engine is a complex mechanical system with numerous components that need to be properly maintained for the engine to continue its safe and profitable operation. As the components deteriorate, they need to be replaced or repaired, which drives the engine off wing for an often time-consuming overhaul [8] and creates a cost burden that requires proper engine fleet management to continue the aircraft operation [18].
Aircraft engine component condition is assessed at recurring inspections and compared to the limits provided by the engine manufacturer. These limits constitute the Instructions for Continued Airworthiness, approved and controlled by the regulatory agency in the form of an engine manual [16]. The engine manual limits proposed by the engine manufacturer are based upon an understanding of the physics behind the particular wear-out scheme and of the condition progression until the part cannot be operated any longer and has to be replaced.
The complexity of the loads that parts are exposed to results in a variety of competing failure modes that occur at different stages of the part's age and progress at different rates. This comes with a significant impact of environmental factors such as volcanic activity [12] and the presence of air contaminants such as dust aerosols, as seen in a test [6] and in operation [26].
Additionally, the ease of performing a visual on-wing inspection of the hardware depends on its location in the engine and on the capability of the inspecting crew and its equipment. Thus, with all the factors combined, the actual confirmation of the part condition is not always feasible.
It is common that engine components wear at different rates and that single components compete in being limiting for the engine useful life. Hence a prediction of the current state of wear of the components becomes a crucial task in fleet management. With the development of health monitoring systems and the deployment of on-board diagnostics technologies, a significant amount of data has become available for engineers to analyze, which enables enhancement of classical condition-based maintenance [29].
In the light of the latest research in the field of predicting component life, this paper proposes a data-driven approach for an aviation turbofan engine.
Eksploatacja i Niezawodnosc – Maintenance and Reliability Vol. 23, No. 2, 2021 359
2. Failure prediction methods overview
2.1. Prognostics approaches
There exist numerous examples of attempts to predict the failure of a part or of the entire system, depending on the problem at hand, the design phase and the data available.
In the concept design phase, where numerical models are available, Ning Baojun et al. proposed a method to incorporate boundary condition uncertainty into the FEA of a turbofan engine combustor to obtain a stochastic life prediction [4]. Another approach is presented by Echarda et al. [10], where a Safran aviation engine blade support is analyzed with variation of geometry, material properties and load to computationally capture the life prediction and its probability. These models can be very accurate and deliver useful information about the type design; however, a good understanding of the failure mode is necessary.
With available failure data one can apply different predictive methods. In their article Yang et al. explore the potential for matching the failure times of aeronautical equipment components to probability distributions, finding that the normal distribution best reflects the actual life distribution [38]. Whereas some cases show promise for the normal distribution, others, like the subject studied in the other paper by Yang et al., indicate that the 3-parameter Weibull distribution best represents the failure probability of airborne equipment [37]. This modelling approach enables the engineer to make predictions of part failure based on a sample of fielded hardware, employing statistical methods in place of finite element computations, with the challenge of collecting a sufficient amount of well-understood data.
Other research focuses on engine health monitoring and fault diagnostics, where engine sensors are used to look for a signal of deteriorating engine health or a faulty component. Turbofan engine health degradation and prognostics of the remaining useful life (RUL) were addressed by Zaidan et al. [41] with the use of a Bayesian Network Regression. Xiu et al. present an aviation turbofan engine fault diagnosis scheme based on a deep belief network (DBN) [36]. The neural network, composed of multiple layers forming restricted Boltzmann machines (RBM), successfully modeled engine systems: engine sensor data were fed into the model and the corresponding engine fault state was predicted.
Another deep learning model is researched in a paper by Sina Tayarani-Bathaie et al., which revealed that dynamic neural networks based on multi-layer perceptron (MLP) networks demonstrated promising performance in the prediction of a turbofan engine fault [31]. Also, Heimnes in [14] reports satisfactory results in RUL prediction with an MLP classifier.
In [19] the researchers introduce a useful classification of the AI-based methodologies used in the aerospace industry for systems health management: (1) knowledge-based, (2) probabilistic and (3) data-driven, with the authors pointing towards the growing interest paid by the scientific community to deep learning methods. Sikorska et al. [30] report successes in the field of prognostics and prediction of RUL by artificial neural networks (ANN), making them a separate category of RUL prediction models and noting their ability to handle noisy data. Pawełczyk et al. [25] have recently reported a successful use of machine learning methods to predict the condition of a high-pressure compressor in a stationary gas turbine.
A different take on asset failure prediction is presented in the works of Yoon et al., where deep generative models in a semi-supervised learning scheme have been implemented to predict the estimated time to failure and show that data-driven approaches are alternatives to physics-driven modelling [40]. In the presented study, for sparsely labelled turbofan data the variational autoencoders delivered better results than the gated recurrent unit (GRU) and long short-term memory (LSTM) network architectures.
Among other network architectures, deep convolutional neural networks (CNN) have been demonstrated by Babu et al. [3] to be feasible in capturing a non-linear relationship between RUL and sensor data.
Having in mind the mentioned research in the field of prognostics, deep learning methods deliver promising results in replacing physics-based models, provided sufficient understanding of the matter is reached, as authors have demonstrated in a number of publications [23].
2.2. Target variable in research
An important role in prognostics and health management is played by a system's health index (HI), as it reflects the system condition and its potential to perform its function throughout the system useful life. The index is a widely used concept in research across various industries, ranging from electronics equipment through heavy machinery to the aviation industry.
A paper published by Amir et al. researched a condition-based health index concept where the overall health index was calculated based on individual indicators [1]; it used a 10-grade scale differentiating a system condition from good to bad and enabling categorization of the particular system units. In a power transformer application, a health index ranging from 0 to 1 was presented by Lata et al. [2], incorporating various inputs relevant to that particular system to establish the resulting index value.
In the case of a turbofan engine, a health index based on flight-by-flight engine sensor data was used to establish and predict high-pressure compressor deterioration [33,34].
Another interesting way to develop a health index out of turbofan engine sensor readings has been proposed in [5], based on a step-by-step aggregation of the normalized feature values. In such an arrangement a growing health index would accumulate over the time of operation and judgements about RUL can be made.
Based on the solid foundations established by the research community, this paper uses a condition-based health index with a 10-grade scale.
3. Problem description
Turbofan engine components are inspected recurrently, at least as often as recommended by the engine manufacturer, thus providing valuable condition data. The considered component operates on the condition-based maintenance scheme. The participating engines were monitored from the third quarter of 2014 to the first quarter of 2020 to obtain data for one of the hot gas path components.
The obvious challenge is in the formulation of the life prediction problem. The intent is to determine, based on available information, at what stage of degradation the given component is. A very efficient technique to determine the moment when a given system would fail is RUL estimation. As the authors of [11] presented, RUL can be determined by using a degradation characteristic of an aviation engine as an input variable to obtain a survival function that can later be used to predict the moment of a probable failure. A degradation characteristic is specific to the system and may depend on the physics of the considered wear-out mechanism. For a gas turbine it could be an exhaust gas temperature [14] or a compressor recoup pressure [25]; both are related to the system wear-out, and a continuous trend of either could be a signature used to judge the approaching expiration of useful life.
However, in the researched system, the component wear-out, despite progressing with time, is not picked up by engine sensors, and thus such a trend cannot serve as the degradation characteristic. Also, there is no spike in any of the sensor readings when the component reaches the condition at which it should be removed to avoid further costly engine damage and potential impact on the customer's operation schedule. Therefore anomaly detection methods are not available in this case.
Regardless of its lack of visibility in the engine system sensors, the component life is limiting to the entire system. To address this problem, the authors propose to use component condition data and the engine operation data preceding the inspection at which the condition rank
was collected. Then, by means of data science (data cleaning, feature engineering and feature selection), models are trained to predict the condition. The expectation behind such an approach is that there might be non-obvious or hard-to-quantify differences between the engines, so that the component in one engine fails at a different time than in another. The differences could be operational (frequently fully loaded aircraft, high altitude of the airports used, short climb path), environmental (air aerosols and dust present, high temperatures at the airport) or manufacturing related (tolerance stack-up resulting in different loads that the component is exposed to). It is expected that, since a turbofan engine is a closed system, these differences can be detected by sensors that are not directly related to the considered component and that cannot otherwise be used as a degradation characteristic. Such differences, accumulated over the operation time, could be responsible for the condition rank progressing at different rates, and modern models are anticipated to fit them.
Due to the amount of data, its complexity and high non-linearity, neural networks are the main focus of the research; machine learning models are used as a basis for comparison. Once models are developed, it would be possible to use them to monitor the remaining fleet and plan maintenance, provided the sensor data are supplied as an input to the models.
Fig. 1. The number of engines per rank collected during the monitoring program and used as the dataset for this research
Over 150 engines have participated in the monitoring program, running at five different thrust ratings, belonging to 40 different airlines and, more importantly, operating on different routes across the globe. The engines have been exposed to take-offs and landings in different environmental conditions, altitudes, aircraft loads and runway lengths, while sharing the same part design. The part condition at the exposure time, counted in flight cycles, has been recorded. Similarly to the authors of [1], a 10-grade scale has been selected to assign a meaningful health index, a condition rank, to the parts based on their actual condition, as shown in Table 1. The condition ranks are established based on the inspection limits provided by the engine manufacturer and supported by conclusions from a root cause analysis of this failure mode. In this specific problem, the inspection limits placed in the engine maintenance documentation have not been sufficient to capture the early progression of the wear, and a scale based purely on inspection findings would be highly non-linear. Between the point at which the part exhibits no wear and the point at which the first limits for recurring inspection apply, there is a relatively long period of preceding damage accumulation that gives away certain symptoms. Based on completed root cause analyses, metallurgical surveys of the components at different damage stages, expert knowledge and numerical simulations, the ranks 1-6 have been introduced, which improves the proportionality of the scale and makes it more linear. During this procedure, limits have been established that enable a rank to be assigned to inspected hardware. Although the maintenance documentation enables safe and profitable engine operation, it had to be expanded to create a proportional scale that can be used in this research to formulate a regression framework. The inspection data have been revisited to assign the proper rank value per the extended scale, as presented in Table 1. Introduction of new limits that would cause maintenance actions should be carefully considered, as more operation stoppages would be created, driving the aircraft maintenance cost up, and they are potentially unnecessary. At this stage, the authors of this research are trying to study whether a model built on such data can deliver results that could be a starting point to reduce the airline maintenance burden by making the findings at inspection predictable. Nevertheless, as Figure 1 summarizes, the majority of labeled engines are cases requiring replacement, and there is a potential class imbalance for a purely classification-oriented problem.

Table 1. 10-grade scale used to assign the health index to the part condition
| Rank | Part condition | Inspection limits | Action |
|------|----------------|-------------------|--------|
| 9 | Not acceptable for further operation | Exceeded | Engine removal & part replacement |
| 8 | Conditionally acceptable for a short duration | Allow for operation for short interval | Increased recurrent inspection frequency on wing |
| 7 | Conditionally acceptable for a long duration | Allow for operation for long interval | Recurrent inspection on wing |
| 2-6 | Wear progression – subsequent expansion of the affected area on the component | Observed condition is permitted or no specific limits applicable | Monitoring of the progression on scheduled overhauls when part is exposed |
| 1 | Visible wear initiation | Observed condition is permitted or no specific limits applicable | Monitoring of the progression on scheduled overhauls when part is exposed |
| 0 | No wear confirmed visually | | No action – no wear |

As the engine hardware inspection to establish its condition is a recurrent process that needs to be accommodated in the airline maintenance schedule, it creates time pressure with a potential consequence of unplanned delays. In that regard it would be beneficial to obtain a model that could rank the engines prior to obtaining inspection data.
From the perspective of fleet management, such prognostics would make it possible to plan ahead of time for the delivery of replacement hardware and to point out the engines in the fleet needing it first. These are the challenges that the authors of this article are trying to address.
4. Approach
4.1. Dataset creation
Engines are equipped with a number of sensors collecting flight data. Each engine module, from front to aft, monitors essential operation parameters: pressure, temperature, variable vane position setting, shaft rotational speeds and injected fuel flow, to name just a few. On top of that, thermodynamic models are deployed, validated through testing campaigns, that utilize these readings and deliver predictions of other useful parameters that are not acquired directly. Additionally, environmental data for arrival and departure airports are collected, with information about ambient temperature, pressure, elevation above sea level and air aerosols, and added to the database. The Python programming language with the Keras [17], TensorFlow [34], scikit-learn [28] and pandas [24] libraries is used for data handling and modelling.
Overall, the parameters relevant to the engines for which condition-based ranks were established are retrieved from the database and arranged in such a way that every rank at a given inspection is preceded by a number of timesteps, with the parameter set for each timestep. The strategy to create the dataset is depicted in Figure 2.
4.2. Aggregation and feature selection
To shape the dataset into a problem that can be tackled by machine learning methods, the time series data from the sensors are represented by their time-independent distributions, with the idea depicted in Figure 3. The values defining the distributions (median, maximum, 75th percentile and 95th percentile) are chosen as the new features for the modelling. The selected distribution characteristics come from experimentation with the dataset.
Fig. 3. Feature creation on the example of a single input parameter
The environmental aerosol data are instead represented by the sum of the departure and arrival values per flight, accumulated over the total number of flights that the engine has completed.
As the engine is a thermodynamic system, a high degree of collinearity is expected between some of the parameters collected during its operation. To address this issue, a collinearity check is performed within the groups of parameters, as shown in Figure 4. Redundant parameters identified in this manner are later excluded from the feature creation process.
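The aggregation and collinearity steps of section 4.2 can be illustrated with a minimal sketch. It uses only the Python standard library to stay self-contained, whereas the paper works with pandas. The function names, the sample values and the 0.95 correlation threshold are assumptions made for illustration, not values taken from the study.

```python
from statistics import median, quantiles


def aggregate_series(values):
    """Collapse one sensor time series into time-independent distribution features."""
    # quantiles(..., n=100) returns 99 cut points; index k-1 holds the k-th percentile.
    pct = quantiles(values, n=100, method="inclusive")
    return {"median": median(values), "max": max(values), "p75": pct[74], "p95": pct[94]}


def accumulated_aerosol(flights):
    """Sum departure and arrival aerosol values over all flights of one engine."""
    return sum(departure + arrival for departure, arrival in flights)


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def drop_collinear(columns, threshold=0.95):
    """Keep a parameter only if it is not strongly correlated with one already kept.

    The threshold is a hypothetical choice; the paper does not state one.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    trace = [float(v) for v in range(101)]  # hypothetical sensor trace
    print(aggregate_series(trace))
    print(accumulated_aerosol([(0.2, 0.3), (0.1, 0.4)]))
    print(drop_collinear({"t1": [1, 2, 3, 4], "t2": [2, 4, 6, 8.1], "p": [4, 1, 3, 2]}))
```

In this sketch the second parameter is dropped because it is almost a multiple of the first, which mirrors the removal of redundant thermodynamic parameters before feature creation.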
4.3. Data transformations
Upon completion of data cleaning and aggregation, the x set is in the form of a dataframe with 62 feature columns and one row per engine, and y holds the engine ranks. For the sake of simplicity, and having in mind the limited number of engines, the problem is transformed into a regression problem, where the rank is a continuous value from 0 to 9. Additionally, a continuous rank is expected to better align with business expectations regarding the continuity of the damage progression.
As a next step, the dataset is randomly split into train and validation datasets. The validation dataset is treated as a hold-out set and is eventually used to score the models' performance against each other. Then, the features are standardized and transformed with the StandardScaler and PowerTransformer classes of the Python scikit-learn package, with care taken to fit them on the train set, transform it and then transform the validation set, repeating the procedure feature by feature. The scaling follows equation (1), where x is the value to be scaled:

$$ z = \frac{x - \mu}{\sigma} \qquad (1) $$

with µ being the mean value, σ the standard deviation and z the scaled value.
Additionally, the power transform utilizes the Yeo-Johnson family of equations, which places no restriction on the values of the variable to be transformed, as shown in equation (2). The input data distributions vary, and a transformation to make the distributions more normal is performed. Due to negative values of certain parameters, a simple Box-Cox transformation, limited to strictly positive values, is not feasible. In the Yeo-Johnson transform, the transformation parameter λ is determined individually for each input feature. In equation (2), the formulas for λ values at 0 and 2 ensure continuity of the transformation function ψ(λ, y) over the entire range of y values. The equations for y ≥ 0 are in fact equivalent to the generalized Box-Cox transformation, whereas the formulas for y < 0 enable the transformation of negative y values [39]:

$$ \psi(\lambda, y) = \begin{cases} \dfrac{(y + 1)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0,\ y \geq 0 \\ \log(y + 1) & \text{if } \lambda = 0,\ y \geq 0 \\ -\dfrac{(-y + 1)^{2 - \lambda} - 1}{2 - \lambda} & \text{if } \lambda \neq 2,\ y < 0 \\ -\log(-y + 1) & \text{if } \lambda = 2,\ y < 0 \end{cases} \qquad (2) $$

4.4. Validation strategy
With the dataset split into train and validation sets and the data cleaning and transformations completed, a validation strategy for model training, optimization and selection is required.
Hence the train dataset is further used to develop the model, that is, to tweak the model and find the best performing hyperparameters on the set. The train set is then often further split into train and test subsets, both complementary subsets of the train set, depending on the need of the specific model. A 7-fold cross-validation (CV) process is used, as graphically depicted in Figure 4.1.
Fig. 4.1. 7 fold cross-validation procedure used in the test
As the data are randomly split into k subsets, training is repeated over the folds. The model is trained on the CV train subset for a given set of hyperparameters and scored on the CV test subset. In effect, an average test score from k folds is obtained as shown in formula (3):

$$ \text{CV Test score} = \sum_{i=1}^{k} \frac{\text{test score}_i}{k} \qquad (3) $$

This strategy enables selection of the model that performs best on the train set and has the best average performance while being exposed to the variation present in the train set due to the shuffles made by CV.
The aforementioned validation set is intended to be a hold-out set and is not used in the model tweaks, so that a data leak is avoided and a fair and competent comparison between the different models is possible, allowing selection of the one performing best over the specific data. Thus all the comparison scores in this paper are calculated over the validation set via means of multiple further splits into train and test sets with each of the 7 folds of the cross-validation (CV) procedure.
4.4. Hyperparameter optimization strategy
The hyperparameter search is conducted by means of Bayesian optimization (BO) [32], where the parameters resulting in the maximum average test score from CV are found. In Bayesian optimization the objective function f(x) over a dataset is optimized using the benefits of Bayes' Theorem.
This allows the selection of the most plausible objective function given the prior assumptions regarding the function, and hence improves the performance of the optimization procedure in terms of computational time [7]. In other words, simplifying and applying to the problem at hand, the posterior probability of a model M given the evidence (data) E is proportional to the likelihood of E given M multiplied by the prior probability of M (4):

$$ P(M \mid E) \propto P(E \mid M)\, P(M) \qquad (4) $$

Instead of Python scikit-learn and its RandomGridSearch, which provides a grid search through the hyperparameters, the bayesopt package is employed and its implementation of Bayesian optimization is used for every model's parameter selection.
4.5. Cost function
As an evaluation score, the mean squared error (MSE) is calculated as in equation (5); it is used for the parameter search in BO and as a means to compare the models. What is more, for ease of interpretation an R² score is calculated; however, it is not used in computations apart from the model comparison:

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (5) $$

In equations (5) and (6), y_i is the ground truth value, also called the target, and ŷ_i is the model prediction.
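Equations (1), (2), (3) and (5) can be checked numerically with a short sketch. It is written in plain Python rather than with the scikit-learn classes named in section 4.3, and all function names are assumptions made for illustration. Note that here λ is passed in as a given value, whereas PowerTransformer estimates it from the data for each feature.

```python
import math


def fit_standardizer(train_values):
    """Estimate mean and standard deviation on the train values only (equation (1))."""
    mu = sum(train_values) / len(train_values)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in train_values) / len(train_values))
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply z = (x - mu) / sigma with statistics fitted beforehand."""
    return [(x - mu) / sigma for x in values]


def yeo_johnson(y, lam):
    """Yeo-Johnson transform of a single value y for a given lambda (equation (2))."""
    if y >= 0:
        return math.log(y + 1) if lam == 0 else ((y + 1) ** lam - 1) / lam
    return -math.log(-y + 1) if lam == 2 else -((-y + 1) ** (2 - lam) - 1) / (2 - lam)


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions (equation (5))."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cv_test_score(fold_scores):
    """Average of the per-fold test scores (equation (3))."""
    return sum(fold_scores) / len(fold_scores)


if __name__ == "__main__":
    train, validation = [1.0, 2.0, 3.0], [2.0, 4.0]
    mu, sigma = fit_standardizer(train)           # fitted on the train set only
    print(standardize(validation, mu, sigma))     # validation reuses train statistics
    print(yeo_johnson(-3.0, 1.0), yeo_johnson(3.0, 1.0))
    print(cv_test_score([mse([1, 2, 3], [1, 2, 5]), 0.5]))
```

Fitting the statistics on the train values and reusing them for the validation values is the point of the sketch: it is what keeps information from the hold-out set out of the preprocessing.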
The R² score is defined as in equation (6), where µ is the mean of the ground truth values:

$$ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \mu)^2} \qquad (6) $$

5. Models overview
This section describes the models that have been considered for this dataset.
5.1. Linear regression
For the sake of establishing a baseline for the rank prediction capabilities, a linear model is used. The Ridge model from the Python scikit-learn package is used, as it incorporates L2 regularization, called Ridge regression, that helps the model avoid overfitting. With the considerable number of features compared to the number of datapoints, the ridge regularization introduces a penalty to the minimization objective by adding the sum of squares of the regression coefficients multiplied by the α factor, as in formula (7), where the loss function is the error to be minimized:

$$ \text{Loss function} = \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 + \alpha \sum_{j=1}^{p} \beta_j^2 \qquad (7) $$

5.2. Random Forest and Extremely Randomized Trees
A regressor based on an ensemble of tree predictors is selected for evaluation in the presented problem. The tree predictors are grown over randomly selected inputs and their combinations; they offer robustness to outliers and data noise while being fast and, due to the Law of Large Numbers, they are less prone to overfitting. A random subset of candidate features is used to look for discriminative thresholds by splitting into internal nodes and leaves (external nodes). As the subset is random, the tree shape and the thresholds determining the splits differ between the estimators, whose predictions are then averaged. This becomes a strength of the model, as some prediction errors can cancel out. The idea is represented in equation (8):

$$ \hat{f}_{rf}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x) \qquad (8) $$

$$ (x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l) \qquad (9) $$

the linear predictor function is shown as in (10).

$$ g(x) = \langle w, x \rangle + b_0 \qquad (10) $$

where ⟨w, x⟩ is the dot product of the model weight vector and the input variable vector, and b_0 is an intercept. Transformed into an optimization problem, it takes the form:

$$ \text{minimize} \quad \frac{1}{2} \langle w, w \rangle \qquad (11.1) $$

$$ \text{subject to} \quad \begin{cases} y_j - \langle w, x_j \rangle - b_0 \leq \varepsilon \\ \langle w, x_j \rangle + b_0 - y_j \leq \varepsilon \end{cases} \qquad (11.2) $$

In equations (11), ε represents an error, meaning that a weight vector w is sought for which the deviations stay below this error. As stated in [35], it is often desirable to allow some errors greater than ε, and hence the formula is rewritten with the introduction of slack variables δ_j and δ*_j, taking the form of equations (12):

$$ \text{minimize} \quad \frac{1}{2} \langle w, w \rangle + C \sum_{j=1}^{l} \left( \delta_j + \delta_j^* \right) \qquad (12.1) $$

$$ \text{subject to} \quad \begin{cases} y_j - \langle w, x_j \rangle - b_0 \leq \varepsilon + \delta_j \\ \langle w, x_j \rangle + b_0 - y_j \leq \varepsilon + \delta_j^* \\ \delta_j, \delta_j^* \geq 0 \end{cases} \qquad (12.2) $$

Upon optimization, the first term of (12.1) is handled just as in (11.1), ensuring the weights take low values, whereas the second term C Σ_{j=1}^{l} (δ_j + δ*_j), where l represents the number of observations in the dataset, is known as the regularization term and ensures that the optimization problem is feasible. Thus, parameter C offers a trade-off between the model complexity and the error values. Both parameters ε and C are hyperparameters subject to optimization.
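To make the roles of ε and C in (12.1) and (12.2) concrete, the sketch below evaluates the objective for a fixed weight vector, choosing for every observation the smallest slack values that satisfy the constraints. It is only an illustration of the formula, not a solver and not the scikit-learn SVR implementation; the function names and the sample numbers are assumptions.

```python
def dot(w, x):
    """Dot product <w, x> of two equally long vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def svr_objective(w, b0, X, y, epsilon, C):
    """Value of objective (12.1) with the smallest slacks allowed by (12.2)."""
    total_slack = 0.0
    for x_j, y_j in zip(X, y):
        residual = y_j - dot(w, x_j) - b0
        delta = max(0.0, residual - epsilon)        # target above the epsilon tube
        delta_star = max(0.0, -residual - epsilon)  # target below the epsilon tube
        total_slack += delta + delta_star
    return 0.5 * dot(w, w) + C * total_slack


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0]]
    y = [1.0, 2.0, 4.0]  # the last point lies 1.0 above the line g(x) = x
    print(svr_objective([1.0], 0.0, X, y, epsilon=0.5, C=10.0))
    print(svr_objective([1.0], 0.0, X, y, epsilon=2.0, C=10.0))
```

With a narrow tube the third observation needs a slack of 0.5, which C scales into the objective; with a wide tube no slack is needed and only the weight term remains.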
Table 2. Models' hyperparameters overview
| Model name | Python package | Model | Hyperparameters optimized |
|------------|----------------|-------|---------------------------|
| Linear model | Sklearn.linear_model | Ridge | alpha |
| Random Forest | [Link] | Random Forest Regressor | max_features, max_depth, min_samples_split, min_samples_leaf, n_estimators |
| Extremely Randomized Trees | [Link] | Extra Trees Regressor | same as in Random Forest |
| Support Vector Machines | [Link] | SVR | epsilon, C |
| XGboost | Xgboost | XGB Regressor | max_depth, learning rate, colsample_bylevel, subsample, n_estimators |
| ANN MLP | Keras/Tensorflow | Multilayer Perceptron | n_layers, n_units per layer, dropout rate, learning rate, test set size, regularization |
| MLP ensemble | Keras/Tensorflow | MLP ensemble | same as in ANN MLP |

Loss is a differentiable function that measures the difference between the prediction and the target.
Parameters selected for hyperparameter optimization are as in Table 2.
5.5. Neural networks – multilayer perceptrons (MLP)
A deep neural network is selected as the last type of model. A network with multiple hidden layers, where the input layer takes inputs from the dataset features and feeds them forward to a single output neuron predicting the target, is built in Python with TensorFlow using the Keras framework.
Let m be the number of neurons in the layer, n the number of samples and k the index of the layer. On the very basic level, in the fully connected network each neuron in the hidden layer obtains a signal vector x_k of m values that represents the input; it is adjusted by the weights w_k assigned to every connection and a bias b_k and then summed as in equation (15) to create a single output value v_k. Then an activation function ϕ is applied to v_k to obtain the layer output y_k:

$$ v_k = w_k \cdot x_k + b_k \qquad (15) $$

propagated via the implemented algorithm to adjust all the network weights based on their contribution to the output error.
In the training process the samples are propagated multiple times until the weights are adjusted so that the loss is minimized. The input data are organized in samples and then into smaller batches, which are passed through the model multiple times. In one epoch the model has been exposed to all samples in the training set, and during one iteration the model has adjusted the weights to minimize the error on one batch. In the approach of this research the batch size is set to 1, meaning the model trains on a single randomly selected sample to adjust the weights.
Dropout layers are employed to help prevent the model from overfitting. The dropout value is the percentage of neurons in the layer that are randomly excluded from the weight adjustment process and do not partake in the output calculation; it is known to contribute to the model robustness. The dropout undergoes hyperparameter optimization. Moreover, L2 regularization (Ridge) in the first dense layer is turned on, contributing to the objective function with its α value also determined via the optimization process.
As the problem is presented as a regression, the activation function is selected to be the Parametric Rectified Linear Unit (PReLU). In one of the landmark papers, Kaiming He and others recognized the downsides of the typically used ReLU activation function and proposed an alternative that improves over Leaky ReLU, demonstrating a reduction of the image classification error of a neural network [13]. Thus, the activation function used is as in formula (18):

$$ \phi(v_i) = \begin{cases} v_i, & \text{if } v_i > 0 \\ \alpha_i v_i, & \text{otherwise} \end{cases} \qquad (18) $$

It is worth noting that PReLU behaves like ReLU for positive input values and returns a parametric linear output for negative values.
As explained in chapter 4.3, the train set is used for model training and optimization, leaving the validation set to act as a holdout set. The train set is split in advance into train and test subsets at random using the StratifiedShuffleSplit function.
The process repeats k times as the folds of cross-validation force the model to train and test on a different batch while the test set size is maintained. To achieve the best balance for this particular dataset, the train to test split ratio is kept as one of the hyperparameters.
Lastly, the learning rate is selected as a hyperparameter, meaning the rate at which the weights are adjusted. The importance of this parameter is undoubted, as too low values cause inefficient training and too high values may cause the model not to converge at all.
Model training, being in the essence finding such model weights,
w1 biases and activations, also called parameters that yield the least er-
w ror, is possible thanks to a gradient descent algorithm [27]. Let J (θ )
v k = [x1 x2 xm ]⋅ 2 + bk (16 ) (16) be an objective function to be minimized and θθ ∈ R be the model
parameters, by performing the gradient descent, that is updating the
wm parameters in the opposite direction of the gradient of the objective
function ∇θ J (θ ) thus following the slope of towards a local min-
yk = ϕ ( vk ) (17)
imum. A learning rate η, selected as model hyperparameter in this
study, determines the size of the step towards the expected minimum.
Then the output becomes input for the next layer neurons and the A popular implementation of this idea, shown in (19), is a stochas-
process repeats until eventually output of the model for a single sam- tic gradient descent (SGD), which enables to calculate the objective
ple is obtained yˆ k . Eventually, the error in the prediction is calcu- function on one sample, instead of all in the batch, that significantly
lated via loss function by comparison of yk to the target. Upon the expedites the walk towards the minimum:
error calculation the back propagation occurs and the error is back
Eksploatacja i Niezawodnosc – Maintenance and Reliability Vol. 23, No. 2, 2021 365
$$\theta_t = \theta_{t-1} - \eta \cdot \nabla_\theta J(\theta) \qquad (19)$$

Too high a value can make the optimization process unstable and prevent the model from converging; too low a value can make the training process ineffective. Numerous optimizers attempt to improve on SGD, introducing the concept of momentum to pass over local minima and preventing overshoot due to overpowering momentum (Nesterov Accelerated Gradient). To deal better with data sparsity, an adaptive learning rate algorithm, Adagrad, was introduced, which adjusts the learning rate for each parameter individually; to counteract its drawback, a monotonically decreasing learning rate, Adadelta was proposed. The neural networks trained in this research use Adaptive Moment Estimation (ADAM) [20], which computes adaptive learning rates for each parameter like the aforementioned adaptive algorithms, but adds features similar to the concept of momentum. Let $g_t = \nabla_\theta J(\theta_t)$ be the gradient and $\varepsilon$ be a small term preventing division by zero in formula (20):

$$\theta_t = \theta_{t-1} - \frac{\eta}{\sqrt{\hat{v}_t} + \varepsilon}\,\hat{m}_t \qquad (20)$$

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t} \qquad (21)$$

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \qquad (22)$$

$$m_t = (1 - \beta_1) \cdot g_t + \beta_1 m_{t-1} \qquad (23)$$

$$v_t = (1 - \beta_2) \cdot g_t^2 + \beta_2 v_{t-1} \qquad (24)$$

where $m_t$ is the first moment (23), $v_t$ is the second moment (24) and $\beta_1$, $\beta_2$ are decay terms.

5.6. Ensemble

Models collected in an ensemble composed of a few best-scoring neural networks have been explored. In the process of hyperparameter optimization of the neural networks, three models with various scores have been obtained. Similar to the concept of the random forest, an ensemble of neural nets can offer an improvement in the overall score, as some of the individual model errors can potentially cancel out.

In the study preceding this paper, an ensemble has been created by training a meta-model with an architecture similar to that of a single neural network. The meta-model undergoes exactly the same procedure of cross-validated Bayesian hyperparameter optimization, except that it uses the stacked output of the single models as its input, and its prediction is scored by means of the loss function.

6. Results

The presented results represent the models that have been subjected to the hyperparameter optimization described in the previous chapters. Both the $R^2$ and MSE scores are shown for ease of interpretation; however, the MSE is selected for this regression problem and is used to draw conclusions.

The mean squared error score penalizes large errors: as a prediction departs from the true value, the penalty grows quadratically. Thus, if used as a loss function in an optimization problem, penalizing large errors helps to find model parameters that minimize them.

The validation score is calculated over the validation holdout set and the train score represents how well the model fitted the train set.

As shown in Figure 5, the best performing model for the specified problem and the available data has been the neural network meta-model ensemble, achieving an MSE score of 0.71, a 17.4% decrease in error compared with the best single neural network model, which scored 0.86.

Fig. 5. Results comparison – models' scores. Train and validation series represent model performance on the train and validation sets respectively

The support vector machine regressor obtained 1.76, outperforming the extremely randomized trees model (1.88) by 6.4%. The gradient boosted tree regressor obtained a score of 2.07, the random forest model scored 2.71 and ridge regression 2.84.

The difference in error between the simple linear model (ridge regression) and the neural net ensemble corresponds to 75% of the linear model score, which justifies the effort invested in exploring deep learning models.

As shown in Figure 10, even for the best model there is an outlying residual value in the validation set, which the model does not predict well (the model underpredicts a distress rank of 5 as little over 3) and which increases the MSE score. Furthermore, an RMSE score is also calculated to assess the applicability of the model to the problem at hand.

In addition to the overall model performance, it has been observed that all researched models score inconsistently across the ranks, as depicted in the RMSE score plot in Figure 6. Due to the scarcity of rank 3 data points, none of them has been selected for the validation set by the random selection of the scikit-learn train_test_split function. Hence the error values for rank 3 are not available and the ability of the models to predict ranks in this range remains not explicitly quantified.

The highest RMSE values have been produced by SVR (4.01) and the linear model (3.66) for rank 1. The lowest RMSE values have been achieved by SVR (0.07) and the MLP ensemble model (0.10) when predicting rank 0. As demonstrated in the RMSE distribution plotted in Figure 6, the most common value is between 1.0 and 1.5.

All studied models have obtained the lowest error when making predictions for rank 7, with similar values, as quantified by an RMSE standard deviation of 0.19. Conversely, the greatest inconsistency has been noted for rank 1: MLP-based models scored a low error, yet the other models produced a high error, which contributed to an RMSE standard deviation of 1.10 for this rank.

The ensemble and single neural network models perform better for the target variable in the ranges 0-4 (RMSE from 0.10 to 1.10) and 7-9 (RMSE from 0.29 to 0.78) than in predicting ranks 5-6 (RMSE from 1.47 to 2.92). Errors achieved by the MLP based models
in this range are the greatest among the considered models, followed by XGBoost, which obtained an RMSE of 2.14 for rank 5, and SVR with an RMSE of 1.61 for rank 6.

As a general trend, and omitting the exceptionally low errors described earlier, the machine learning models have had higher RMSE values for ranks 0-4 (1.01 to 4.01); the error then decreases for ranks 5-6 (0.07 to 1.61), becomes low for all models for rank 7 (0.56 to 0.94) and then slightly increases for ranks 8-9 (0.90 to 1.88). This general error trend differs from that of the MLP-based models discussed earlier.

Some exceptions to this trend have occurred: XGBoost demonstrated greater RMSE values for ranks 4-5 (2.14 – 2.42) than for ranks 1-2 (0.94 – 1.62), whereas the RMSE of the other machine learning models was in the range 0.07 – 1.61.

Fig. 6. RMSE score per rank (lower value = less error)

In the group of tree ensemble based models, random forest and extremely randomized trees, the latter has, in general, predicted with lower RMSE values and offered an improvement in the minimum and maximum values, which improved from 0.64 and 2.85 to 0.07 and 2.38, respectively.

Fig. 7. Distribution of RMSE errors calculated per rank for every model using validation set

1-2 (XGBoost: 0.94 – 1.62, tree ensemble models: 1.94 – 2.75), however predicted with greater error for ranks 4-5 (XGBoost: 2.14 – 2.42, tree ensemble models: 0.07 – 1.78) and offered some improvement for ranks 8-9 (XGBoost: 1.05 – 1.20, tree ensemble models: 1.21 – 1.59).

SVR RMSE values have been low for rank 0 (0.07) and rank 9 (0.9), comparable to the errors of the MLP ensemble model (rank 0: 0.1, rank 9: 0.57). Unfortunately, its prediction error across the other ranks has been inconsistent and relatively high (RMSE 0.6 – 4.01).

Based on the plot in Figure 6, the MLP-based models can predict low and high ranks with the least error.

The described trends do not correlate with the distribution of the ranks in the training set, which is similar to that of the entire dataset shown in Figure 7. Data points with rank 9 are the most frequent, ranks 8 and 7 occur more rarely and the data for the other ranks are rather limited. Neither of the earlier described trends can be explicitly explained by the distribution of the target variable in the training set.

As the MLP ensemble model predicts with the least error, it is selected as a reference point, and the differences in RMSE between the other models and the ensemble are calculated and summarized in the plot in Figure 8. Negative difference values, coloured in shades of red, are cases where a model has a performance debit relative to the MLP ensemble; conversely, positive values, in shades of green, show where other models predicted with lower error.

The single MLP model has had its greatest RMSE differences for rank 0 (-0.98) and rank 4 (-0.22). The MLP ensemble greatly improved the error in predicting rank 0. Otherwise, the differences for the majority of ranks are between -0.22 and 0.22 and the two models can be considered similar. An exception to this observation is rank 2, where the single MLP predicted with lower error and the difference was 0.45. Although there have been ranks where the single MLP outperformed the meta-model, the opposite situation has been as frequent and, owing to its lower overall prediction error, the ensemble model has shown better performance.

The ensemble meta-model has brought an improvement in prediction error in the lower and upper ranges of the target variable. The other models have had, in general, up to -3.10 difference for ranks 0-4 and up to -1.40 difference for ranks 8-9. The models have been within -0.42 to 0.22 of the ensemble for rank 7, with SVR having the least difference (-0.05) and random forest having the greatest
difference (-0.42). As can be observed, these models outperformed the ensemble in predicting ranks 5-6, with a difference of up to 2.72 (ETR).

Residual values, calculated as the difference between the true and predicted values, have been computed for each model over the train and validation sets and are shown for selected models in Figure 9. Non-linear models representing different algorithm families have been chosen: ETR, SVR, XGBoost and MLP.

The SVR and XGBoost models have overfitted to the train set, as all prediction values line up closely with their corresponding true values with little residual error, while the validation set residuals are significantly greater. In this particular application, MLP and ETR seem to be less prone to this behaviour and greater train set residuals are visible.

The studied models have also been predicting different outlying values; however, due to the noise in the residual values these have been hard to interpret. The following observations regarding outlying values have been noted:
• ETR predictions have the most consistent absolute residual values in the group considered and there are no clear outlying values in the prediction.
• The SVR model predicted two outlying values (overpredicted ranks 1 and 2).
• XGBoost model residuals are noisy with perhaps one outlying value (overpredicted rank 0).
• MLP predicted one outlying value (underpredicted rank 5).
• SVR, XGBoost and MLP do not predict the same outlying values.

Fig. 9. Residuals plots for selected models

Furthermore, the tendency to over- and underpredict has been analysed: XGBoost tends to overpredict the lower ranks and underpredict the higher ranks. A similar, although less pronounced, trend is exhibited by ETR. The bull's eye prediction of SVR for rank 0 seems to be an exception; if it is treated as an outlier, the SVR residual error trend becomes similar.

The MLP model is the least noisy in the considered group and does not show the residual error trend exhibited by the other models. What is more, the residuals of the meta-model ensemble depicted in Figure 10 are similar to those of the single MLP in lacking a residual trend, and the ensemble also mispredicts the same outlying value. This explains why the ensemble model shares similar performance for rank 5 and demonstrates that the ensemble model has not improved the capability to predict this value.

7. Conclusions

Based on the results, one can observe that certain models have performed better than others on the given dataset. The promising results presented in the paper align with the recent conclusions of the research community regarding applications of deep learning models.

The specifics of the problem have shown that a simple linear model, although useful to a certain degree, can be surpassed in performance by more complex architectures. What is more, the superiority of the ensemble model over a single neural net model is further confirmed, in line with the insights of the researchers in the referenced literature. Additionally, the neural nets outperformed tree based models and support vector machines. As illustrated in the results, all models have a tendency to overfit to the train set despite the countermeasures taken; however, boosted trees, extremely randomized trees and support vector machines have gravitated towards overfitting more than the others. It may be noted that the models with the smallest difference between train score and validation score are the deep learning models. In effect, their best validation scores on this dataset could be attributed to their ability to generalize best and learn without overfitting to the training set.

The residuals of the best model demonstrate a fairly consistent error in continuously predicting condition ranks across the scale, and hence it is concluded that the model could be satisfactorily used for the problem at hand. Translating the MSE of 0.71 to RMSE gives a value of 0.84, which, from the forecast perspective, makes it possible to predict ranks with an error lower than one condition rank in the scale. This places the deep learning models considered in this paper as adequate candidates for business use, while leaving room for improvement in future studies by the research community.

The obtained results demonstrate that a neural network model built on the gathered data can predict the rank with an average error of less than one unit of the rank scale. Although the error of certain models has not been consistent over the entire rank scale, a potential business application could benefit from predictions by several models, keeping in mind their different performance in different ranges of the rank scale. In conclusion, it may be underlined that proper data collection and ranking of the collected inspection data is a relatively long process, which is greatly expedited by using established inspection procedures and their findings.

An important challenge has been the selection of a proper rank scale, which should ensure proportionality in order to formulate a valid regression framework. In this specific example, the existing data based on the engine service limits had to be expanded by introducing ranks that represent early wear stages and would normally be omitted per the existing inspection requirements as being acceptable to operate with. The additional ranks required revisiting the collected inspection data and proper re-assignment based on the established scale. The development of the scale required a study of the failure mode, destructive tests, the application of material knowledge and the involvement of industry experts; without this preceding step further research would not have been possible.

In the data collection process, a strong bias has been observed towards the majority of data points being worn-out parts or parts near the end of their useful life. This is because, in the aviation industry, airlines tend to maximize the time that an aircraft is in operation, and stoppages due to inspections and repairs are an additional financial burden. Therefore components near their service limits, or requiring recurrent inspections of increased frequency, are removed earlier. These data are the most widely accessible and shared with the engine manufacturer, which explains the bias in the dataset. On the other hand, due to unexpected events, e.g. foreign object damage to the engine, a component may become exposed before the wear process is initiated, and the dataset has more data points for this stage than for a few of the subsequent ranks. The least available data are from the early progression stage of the wear, from the initiation point to the moment the first service limits apply. This is explained by the fact that such a condition is considered acceptable by the inspectors and is typically not captured in the inspection process, as it represents hardware that will continue to operate for a significant time before it wears out. This mindset is a challenge for implementing a data collection process that enables building a high fidelity prediction model, as a model should be trained on a balanced dataset to predict over the entire range of the target variable with an acceptably low error. With such a limitation, the selection of the ranking scale may become a trade-off between having sufficiently many grades to capture the physics and having enough data points per rank for the model to be able to fit to it.

As a conclusion from this research, implementing a data collection scheme that expands the scope of the current inspection data would enable further development of such models. However, it should be noted that the data collection needed to keep the models up to date can be done without modifying the inspection limits, and can be performed after the inspection by the engine manufacturer. This approach would help to reduce the maintenance cost by providing a way to monitor the fleet's health and manage the maintenance without creating additional operation stoppages.

Using the model, a prediction for the condition of every turbofan engine in the fleet can be obtained easily, and updating the prediction regularly with new input data can provide useful information about the progression of the wear and the change in the fleet's health. Information about the rank could make it possible to schedule maintenance and set expectations regarding the condition once the engine is visually inspected on-wing. Information available ahead of time can enable prioritization of engine repairs and ordering of replacement hardware. The presented study demonstrates that the use of such data can deliver a valuable solution to the industry with a relatively low investment of time and resources, using the latest developments in deep and machine learning. In the near term, models might not be able to replace on-wing inspection, but they can reduce the inspection burden by making its outcomes more manageable and predictable. Safety has always been the number one factor in the aviation industry, and the most likely application of such models is expected in fleet health monitoring and maintenance management rather than in direct replacement of well-established inspection processes.

Fig. 10. Ensemble meta-model residuals

8. Next steps

The authors of the article recognize the promising results obtained by the scientific community using recurrent neural network architectures in similarly stated problems, the demonstrated performance of deep Bayesian networks, and the advantages of combining the efficiency of semi-supervised learning variational autoencoders with deep Bayesian network models on the sparsely labelled data typically encountered in the aviation industry, and thus wish to try these methods to further research this particular problem.
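As a supplementary illustration of the optimizer described in Section 5.5, the ADAM update of equations (20)-(24) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions and not the implementation used in the study, which relied on Keras/TensorFlow. The function names adam_step and minimize_quadratic, the toy objective J(θ) = (θ − 3)², and the values chosen for η, β1, β2 and ε are hypothetical and serve illustration only; the square root of the second moment estimate in the update follows the ADAM paper [20].

```python
import math


def adam_step(theta, grad, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update for a list of parameters.

    theta, grad, m, v are lists of floats of equal length; t is the 1-based
    step counter. Returns the updated (theta, m, v). Hyperparameter defaults
    are illustrative assumptions, not values reported in the paper.
    """
    new_theta, new_m, new_v = [], [], []
    for th, g, m_prev, v_prev in zip(theta, grad, m, v):
        m_t = (1 - beta1) * g + beta1 * m_prev        # first moment, eq. (23)
        v_t = (1 - beta2) * g * g + beta2 * v_prev    # second moment, eq. (24)
        m_hat = m_t / (1 - beta1 ** t)                # bias correction, eq. (21)
        v_hat = v_t / (1 - beta2 ** t)                # bias correction, eq. (22)
        # parameter update, eq. (20)
        new_theta.append(th - eta / (math.sqrt(v_hat) + eps) * m_hat)
        new_m.append(m_t)
        new_v.append(v_t)
    return new_theta, new_m, new_v


def minimize_quadratic(steps=500, eta=0.1):
    """Minimize the toy objective J(theta) = (theta - 3)^2 with ADAM."""
    theta, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (theta[0] - 3.0)]  # analytical gradient of the toy objective
        theta, m, v = adam_step(theta, grad, m, v, t, eta=eta)
    return theta[0]
```

On the first step the bias-corrected ratio of the moments has magnitude close to one, so the parameter moves by roughly η regardless of the gradient scale; this is the per-parameter adaptive step size that the text attributes to ADAM.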
References
1. Amir M D M, Muttalib E S A. Health index assessment of aged oil-filled ring main units. IEEE 8th International Power Engineering and
Optimization Conference, 24-25 March 2014, [Link]
2. Azipurua J I, Stewart B G, McArthur S D J, Lambert B, Cross J G, Catterson V M. Improved power transformer condition monitoring
under uncertainty through soft computing and probabilistic health index. Applied Soft Computing 2019; 85, [Link]
asoc.2019.105530.
3. Babu G S, Zhao P, Li X L. Deep Convolutional Neural Network Based Regression Approach for Estimation of Remaining Useful Life.
Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science 9642, [Link]
32025-0_14.
4. Baojun N, Mei Y, Xingjian S, Peng W. Random FEA and reliability analysis for combustor case. 2017 Prognostics and System Health
Management Conference (PHM-Harbin), Harbin 2017: 1-5, [Link]
5. Bektas O, Jones J A, Sankararaman S, Roychoudhury I, Goebel K. A neural network filtering approach for similarity-based remaining useful
life estimation. The International Journal of Advanced Manufacturing Technology 2019; 101: 87-103, [Link]
2874-0.
6. Bojdo N, Ellis M, Filippone A, Jones M, Pawley A. Particle-Vane Interaction Probability in Gas Turbine Engines. Journal of Turbomachinery
2019; 141(9), [Link]
7. Brochu E, Cora V M, de Freitas N. A Tutorial on Bayesian Optimization of Expensive Cost Functions with Application to Active User
Modeling and Hierarchical Reinforcement Learning 2010, [Link]
8. Cerdeiera J O, Lopes I C, Silva E C. Scheduling the Repairment of Aircrafts' Engines. 2017 International Conference on Control, Artificial
Intelligence, Robotics & Optimization (ICCAIRO), Prague, 2017: 259-267, [Link]
9. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, August 2016: 785-794, [Link]
10. Echard B, Gayton N, Bignonnet A. A reliability analysis method for fatigue design. International Journal of Fatigue 2014; 59: 292-300,
[Link]
11. Gao Z, Li Jiwu, Wang R. Prognostics uncertainty reduction by right-time prediction of remaining useful life based on hidden Markov model
and proportional hazard model. Eksploatacja i Niezawodnosc - Maintenance and Reliability 2021; 23(1): 154-164, [Link]
ein.2021.1.16.
12. Guffanti M, Tupper A. Chapter 4 - Volcanic Ash Hazards and Aviation Risk. Volcanic Hazards Risks and Disasters 2015: 87-108, [Link]
org/10.1016/B978-0-12-396453-3.00004-6.
13. He K, Zhang X, Ren S, Sun J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. 2015 IEEE
International Conference on Computer Vision (ICCV), Santiago 2015: 1026-1034, [Link]
14. Heimes F. Recurrent neural networks for remaining useful life estimation. 2008 International Conference on Prognostics and Health
Management 2008: 1-6, [Link]
15. Illustration, [Link] (accessed
10 October 2020).
16. Instructions for Continued Airworthiness, Code of Federal Regulations 14 CFR § 33.4 (1980).
17. Keras documentation, [Link] (accessed 10 April 2020).
18. Khan N, Manarvi I A. Identification of delay factors in C-130 aircraft overhaul and finding solutions through data analysis. 2011 Aerospace
Conference, Big Sky, MT, 2011: 1-8, [Link]
19. Khan S, Yairi T. A review on the application of deep learning in system health management. Mechanical Systems and Signal Processing
2018; 107: 241-265, [Link]
20. Kingma D P, Ba J. ADAM: A Method for Stochastic Optimization. 3rd International Conference for Learning Representations 2015: 1-13,
[Link]
21. Kursa M B, Rudnicki W R. Feature Selection with the Boruta Package. Journal of Statistical Software 2010; 36(11), [Link]
jss.v036.i11.
22. Li Y X, Shi J, Gong W, Zhang M. An ensemble model for engineered systems prognostics combining health index synthesis approach and
particle filtering. Quality and Reliability Engineering International 2017; 33(8): 2711-2725, [Link]
23. Malik K, Zbikowski M, Teodorczyk A. Detonation cell size model based on deep neural network for hydrogen, methane and propane
mixtures with air and oxygen. Nuclear Engineering and Technology 2019; 51(2): 424-431, [Link]
24. Pandas documentation, [Link] (accessed 20 September 2020).
25. Pawełczyk M, Fulara S, Sepe M, De Luca A, Badora M. Industrial gas turbine operating parameters monitoring and data-driven prediction.
Eksploatacja i Niezawodnosc - Maintenance and Reliability 2020; 22(3): 391-399, [Link]
26. Przysowa R, Gawron B, Kulaszka A, Placha-Hetman K. Polish experience from the operation of helicopters under harsh conditions. Journal
of Konbin 2018; 48(1): 263-299, [Link]
27. Ruder S. An overview of gradient descent optimization algorithms, 2017, [Link]
28. Scikit-learn documentation, [Link] (accessed 20 September 2020).
29. Shin J H, Jun H B. On condition based maintenance policy. Journal of Computational Design and Engineering 2015; 2(2): 119-127, https://
[Link]/10.1016/[Link].2014.12.006.
30. Sikorska J Z, Hodkiewicz M, Ma L. Prognostic modelling options for remaining useful life estimation by industry. Mechanical Systems and
Signal Processing 2011; 25(5): 1803-1836, [Link]
31. Sina Tayarani-Bathaie S, Sadough Vanini Z N, Khorasani K. Dynamic neural network-based fault diagnosis of gas turbine engines.
Neurocomputing 2014; 125: 153-165, [Link]
32. Snoek J, Larochelle H, Adams R P. Practical Bayesian Optimization of Machine Learning Algorithms. Advances in Neural Information
Processing Systems 2012; 25, [Link]
33. Sun J, Zuo H, Wang W, Pecht M G. Application of a state space modeling technique to system prognostics based on a health index for
condition based maintenance. Mechanical Systems and Signal Processing 2012; 28: 585-596, [Link]
34. Tensorflow documentation, [Link] (accessed 9 May 2020).
35. Wu S, Akbarov A. Support Vector Regression for Warranty Claim Forecasting. European Journal of Operational Research 2011; 213(1):
196-204, [Link]
36. Xu J, Liu X, Wang B, Lin J. Deep Belief Network-Based Gas Path Fault Diagnosis for Turbofan Engines. IEEE Access 2017; 7: 170333-
170342, [Link]
37. Yang Y, Ding Y, Zhao Z. Fault distribution analysis of airborne equipment based on probability plot. 3rd IEEE International Conference
on Control Science and Systems Engineering 2017: 239-242, [Link]
38. Yang Y, Guo F. Reliability Analysis of Aero-Equipment Components Life Based on Normal Distribution Model. IEEE 4th Information
Technology and Mechatronics Engineering Conference (ITOEC) 2018, Chongqing, China, 2018: 1070-1074, [Link]
ITOEC.2018.8740448.
39. Yeo I K, Johnson R A. A new family of power transformations to improve normality or symmetry. Biometrika 2000; 87(4): 954-959, https://
[Link]/10.1093/biomet/87.4.954.
40. Yoon A S, Lee T, Lim Y, Jung D, Kang P, Kim D, Park K, Choi Y. Semi-supervised Learning with Deep Generative Models for Asset
Failure Prediction. KDD17 Workshop on Machine Learning for Prognostics and Health Management 2017, Canada, [Link]
abs/1709.00845.
41. Zaidan M A, Harrison R F, Mills A R, Fleming P J. Bayesian hierarchical models for aerospace gas turbine engine prognostics. Expert
Systems with Applications 2015; 42(1): 539-553, [Link]