
Cancer Detection Using Machine Learning

This document discusses machine learning techniques for early detection of breast cancer. It proposes a hybrid model combining Support Vector Machine (SVM), Artificial Neural Network (ANN), K-Nearest Neighbor (KNN), and Decision Tree (DT) algorithms. The model can be used with different data types like images and blood samples to more accurately detect breast cancer at earlier stages. Early detection is important as it increases survival rates by allowing treatment to begin sooner before cancer spreads. Machine learning can help reduce unnecessary biopsies and surgeries by improving the accuracy of diagnosis.

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/327974742

Early Detection of Breast Cancer Using Machine Learning Techniques

Article · September 2018


Early Detection of Breast Cancer Using Machine Learning Techniques
M. Tahmooresi1, A. Afshar2, B. Bashari Rad1, K. B. Nowshath1 and M. A. Bamiah2
1 Asia Pacific University of Technology and Innovation (APU), Malaysia.
2 University of Malaya, Malaysia.
maryam.tahmooresi@yahoo.com

Abstract—Cancer is the second leading cause of death in the world; 8.8 million patients died of cancer in 2015. Breast cancer is the leading cause of cancer death among women. Many studies have addressed early detection of breast cancer so that treatment can start sooner and the chance of survival increases. Most of these studies concentrated on mammogram images. However, mammogram images sometimes carry a risk of false detection that may endanger the patient's health. It is vital to find alternative methods that are easier to implement, work with different data sets, are cheaper and safer, and produce a more reliable prediction. This paper proposes a hybrid model combining several Machine Learning (ML) algorithms, including Support Vector Machine (SVM), Artificial Neural Network (ANN), K-Nearest Neighbor (KNN), and Decision Tree (DT), for effective breast cancer detection. This study also discusses the datasets used for breast cancer detection and diagnosis. The proposed model can be used with different data types such as image, blood, etc.

Index Terms—Breast Cancer; Breast Cancer Detection; Medical Images; Machine Learning.

I. INTRODUCTION

The World Health Organization (WHO) reported that breast cancer is the most common cancer amongst women globally [1]. It is also the highest-ranked cause of cancer death among women in the world [2, 3]. In Malaysia, breast cancer has the highest rate of cancer deaths, around 25%, and it is the commonest cancer among women [4]. Around 5% of Malaysian women are at risk of breast cancer, while in Europe and the United States the figure is around 12.5% [3]. Women with breast cancer in Malaysia are also reported to present at a later stage of the disease compared to women from other countries [4]. Usually, breast cancer can be easily detected if specific symptoms appear. However, many women who are suffering from breast cancer have no symptoms. Hence, regular breast cancer screening is very important for early detection [3].

Early detection of breast cancer aids early diagnosis and treatment, and prognosis is very important for long-term survival [5]. Since early detection, diagnosis, and treatment of cancer can reduce the risk of death, they play a significant role in saving the life of the patient. Any delay in detecting cancer in its early stages leads to disease progression and complicates treatment [5]; therefore, a long waiting time before diagnosis of breast cancer and the start of treatment is of prognostic concern.

Previous studies on the consequences of a late diagnosis of cancer confirm that it is strongly associated with progression of the disease to more advanced stages, and consequently with a lower chance of saving the patient's life. In a systematic review conducted by Prof. M. A. Richards et al. [6], an analysis of 87 studies strongly concluded that female patients with breast cancer who start their therapy less than 3 months after the appearance of symptoms have a significantly higher chance of survival compared to those who wait for more than 3 months.

Many previous studies confirm that detection of breast cancer in its early stages significantly increases the chance of survival because it prevents the spreading of malignant cells throughout the entire body [6].

The main contribution of this paper is to review the role of machine learning techniques in early detection of breast cancer.

Artificial Intelligence (AI) can be applied to improve breast cancer detection and diagnosis, as well as to prevent overtreatment. Moreover, combining AI and Machine Learning (ML) methods enables prediction and empowers accurate decision making, for example deciding from biopsy results whether or not a patient needs surgery.

Currently, mammograms are the most widely used test available. However, they still produce false positive (high-risk) results that show abnormal cells, which can lead to unnecessary biopsies and surgeries. Sometimes surgery performed to remove a lesion reveals that the lesion is benign and therefore not harmful. This means that the patient goes through unnecessary, painful, and expensive surgery.

ML algorithms offer many features, such as effective performance on healthcare-related datasets that involve images, x-rays, blood samples, etc. Some methods are appropriate for small datasets, whereas others are suitable for huge datasets. However, noise can be a problematic concern in some methods.

This paper is organized as follows: Section II briefly introduces breast cancer, and Section III explains the ML algorithms used for detecting breast cancer. A summary of previous related works is given in Section IV. Finally, Section V concludes the paper.

II. BREAST CANCER

Breast cancer is the most frequently found disease in women worldwide. It arises when abnormal growth of a mass of tissue causes the expansion of malignant cells, leading to acute breast cancer. These malignant cells originate in the milk glands of the breast. The malignant cells, which are the main cause of breast cancer, can be classified into different groups according to their unusual progress and their capability of affecting other normal cells [7]. The capability of affecting means whether these malignant cells affect only the local cells or can spread throughout the full body. The effect of spreading these

e-ISSN: 2289-8131 Vol. 10 No. 3-2 21


Journal of Telecommunication, Electronic and Computer Engineering

malignant cells throughout the whole body of the patient is called metastasis [7]. It is very important to prevent this spreading effect by diagnosing cancer in its early stages using advanced techniques and equipment. In recent decades, there have been many efforts to employ artificial intelligence and other related methods to assist in the detection of cancer at earlier stages.

Early detection of cancer raises the chance of survival to 98% [8]. Figure 1 shows different types of cancer, among which breast cancer is leading with 24%.

[Figure 1: pie chart. Legend: Breast; Trachea, Bronchus, Lung; Colorectum; Ovary; Cervix Uteri; Other. Slice labels: 24%, 41%, 13%, 6%, 10%, 6%.]

Figure 1: Types of cancer

III. MACHINE LEARNING METHODS

Machine Learning is a process in which machines (computers) are trained with data to make decisions for similar cases [9]. ML is employed in various applications, such as object recognition, network, security, and healthcare. There are two types of ML methods, i.e. single and hybrid methods, such as ANN, SVM, Gaussian Mixture Model (GMM), K-Nearest Neighbor (KNN), Linear Regressive Classification (LRC), Weighted Hierarchical Adaptive Voting Ensemble (WHAVE), etc. The ML algorithms used are as follows:

A. Artificial Neural Network (ANN)
ANN is a model resembling the human brain's nervous system, with a large number of nodes connected to each other. Each node has two states: 0 means inactive and 1 means active. Also, each node has a positive or negative weight that adjusts the strength of the node and can activate or deactivate it. Samples of data are provided to the ANN to train the machine. The trained machine is used to detect patterns hidden in the data. It can search for patterns among patients' healthcare and personal records to identify high-risk lesions [10].

B. Support Vector Machine (SVM)
SVM is a supervised pattern classification model which is used as a training algorithm for learning classification and regression rules from gathered data [11]. The purpose of this method is to find a hyperplane that separates the data with the largest minimum distance (margin) to the data points. SVM is used to classify two or more data types. SVM includes single or hybrid models such as Standard SVM (St-SVM), Proximal Support Vector Machine (PSVM), Newton Support Vector Machine (NSVM), Lagrangian Support Vector Machines (LSVM), Linear Programming Support Vector Machines (LPSVM), and Smooth Support Vector Machine (SSVM).

C. K-Nearest Neighbors (KNN)
KNN is a supervised learning method which is used for diagnosing and classifying cancer [12]. In this method, the computer is trained on data from a specific field and is then given new data. The machine finds the K training samples most similar to the unknown data (its K nearest neighbors) and uses them to classify it. It is recommended to choose a large dataset for training, and the K value should be an odd number, which avoids tied votes between two classes.

D. Decision Tree (DT)
DT is a data mining technique used for early detection of breast cancer. It is a model that presents classifications or regressions as a tree. In this model, the data set is broken into small subsets, and then into smaller ones. As a result, the tree is developed, and the result is revealed at the last level. In a tree structure, the leaves represent the class labels, whereas the branches represent conjunctions of features leading to the class labels. According to [13], DT is not sensitive to noise.

E. Random Forest (RF) Algorithm
The RF algorithm is used at the regularization point where model quality is highest and a compromise between variance and bias is reached [14]. RF builds a large number of DTs using random samples drawn with replacement to overcome the problems of single DTs. Each tree classifies its observations, and the majority-vote decision is chosen. RF can also be used in unsupervised mode for assessing proximities among data points.

F. AdaBoost Classifier
This algorithm is used for classification and regression to predict the existence of breast cancer. It converts weak learners into strong ones by combining all weak learners to form a single strong rule. It takes the weight of each node and changes it repeatedly until an accurate result is found. However, it is sensitive to noise and to the quality of features [15].

G. Naïve Bayes (NB) Classifier
Naïve Bayes refers to a probabilistic classifier that applies Bayes' theorem with strong independence assumptions [16]. In this model, all properties are considered separately: it assumes that predictive attributes are conditionally independent given the class. Moreover, the values of the numeric attributes are assumed to follow a distribution (typically normal) within each class. NB is fast and performs well even with a small dataset. However, it is difficult to find independent properties in real life. Hazra et al. [16] deployed an NB classifier for breast cancer detection, and it gave the maximum accuracy with only five dominant features.

IV. PREVIOUS RELATED WORKS

Several studies have been conducted on the implementation of ML for breast cancer detection and diagnosis, using different methods or a combination of several algorithms to increase accuracy. S. Gc et al. [17] worked on extracting features including variance, range, and compactness. They used SVM classification to evaluate the performance. Their findings showed sensitivities of 95% for variance, 94% for range, and 86% for compactness (see Table 1). According to their results, SVM can be considered an appropriate method for breast cancer detection.
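Since KNN, as described in Section III, recurs in several of the studies reviewed in this section, a minimal sketch may help make the procedure concrete. It assumes Euclidean distance and a simple majority vote; the function name knn_predict and the toy feature vectors are hypothetical and are not taken from any of the reviewed papers or datasets.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    if k % 2 == 0:
        # An odd k avoids tied votes when there are only two classes.
        raise ValueError("k should be odd to avoid tied votes between two classes")
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest samples and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up two-dimensional feature vectors (illustration only).
toy_points = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
toy_labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

prediction = knn_predict(toy_points, toy_labels, query=(2.9, 3.1), k=3)
print(prediction)
```

In practice the distance measure and the value of K would be tuned on the dataset at hand; the sketch only illustrates the neighbor search and the vote.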


Chunqiu Wang et al. [18] chose Microwave Tomography Imaging (MTI) to extract features and classify the images using ANN. Two different techniques were compared in this study, GMM and KNN. Their results showed that the sensitivity obtained by KNN was 87%, while for GMM it was 67%. The accuracy was 85% for KNN and 75% for GMM. The Matthews Correlation Coefficient (MCC) was 67% and 48% for KNN and GMM, respectively. Finally, the specificity was 84% for KNN and 86% for GMM. According to their findings, sensitivity, accuracy, and MCC were better for KNN than for GMM, but GMM was better in specificity and precision.

Chowdhary and Acharjya [19] focused on mammogram images as they are cheaper and more efficient in detection. However, since selecting and extracting features are important for improving performance, Fuzzy Histogram Hyperbolization (FHH) was chosen to increase the quality of images, Fuzzy C-mean for segmenting, and a Gray level dependence model for extracting the features. Their method showed 94% accuracy for detecting malignant breast lesions.

In a study conducted by Aminikhanghahi et al. [20], wireless cyber mammography images were explored. After selecting and extracting features, the researchers chose two different ML techniques, SVM and GMM, to check their accuracy. Their findings showed that SVM is more accurate if there is no noise or error; otherwise GMM is better and safer.

Durai et al. [21] selected a Data Mining technique for detecting diseases including breast cancer. They used LRC and compared it with four other techniques: BFI, ID3, J48, and SVM. The results show that LRC is the most accurate one with 99.25% accuracy.

Wang and Yoon [22] chose four Data Mining methods to measure their effectiveness in detection. These models were SVM, ANN, Naïve Bayes Classification, and AdaBoost tree. In addition, PCs and PCi were used for making hybrid models. After checking the accuracy, they found that Principal Component Analysis (PCA) can be a critical factor in improving performance.

Hafizah et al. [23] compared SVM and ANN using four different datasets of breast and liver cancer including WBCD, BUPA JNC, Data, Ovarian. The researchers demonstrated that both methods have high performance, but SVM was still better than ANN.

Azar and El-Said [24] worked on six different SVM methods. They compared St-SVM with LPSVM, LSVM, SSVM, PSVM, and NSVM to find out which method performs best in accuracy, sensitivity, specificity, and ROC. LPSVM proved to be the best, with accuracy 97.1429%, sensitivity 98.2456%, specificity 95.082%, and ROC 99.38%. Therefore, LPSVM has the highest performance.

Deng and Perkowski [25] used a new method called Weighted Hierarchical Adaptive Voting Ensemble (WHAVE). They compared the accuracy of WHAVE with seven other methods that had the highest accuracies in previous research. WHAVE achieved the highest performance value of 99.8%.

Rehman et al. [26] extracted different features including Phylogenetic trees, Statistical Features, and Local Binary Patterns (LBP) from mammography images. They used a hybrid model combining SVM and RBF for classification. They checked the accuracy of each feature set separately. In this step, the best accuracy value was 76% for 90 features chosen based on the Taxonomic Indices based Feature (TIF) Vector and 68% for the Statistical and LBP based Feature Vector. The features were then combined (Taxonomic Indices, Statistical, and LBP based Feature Vector) and checked for accuracy again. The evaluation results were best after testing 4 times. The researchers claimed that using different features increases the performance and efficiency of breast cancer detection.

Mejia et al. [27] chose Thermogram images for detecting breast cancer as this approach is cheaper and safer than other methods. It can detect cancer at an earlier stage compared to other images or tests, and it has no limitations such as pregnancy or the size or density of the breast. Also, it does not need any complex features to be extracted. They selected 18 cases, with 9 abnormal and 9 normal cases. A KNN classifier was used to improve the accuracy. The results were 88.88% for abnormal and 94.44% for normal cases.

Ayeldeen et al. [28] used AI and its techniques for breast cancer detection. They used 5 different methods for performance comparison. The RF algorithm showed the highest result with 99% performance.

Avramov and Si [29] worked on feature extraction and the impact of feature selection on performance. They applied 4 ways of feature selection (correlation selection, PCA, T-Test significance, and random feature selection) and 5 classification models (LR, DT, KNN, LSVM, and CSVM). The best result was achieved by stacking the logistic, SVM, and CSVM models, which improved accuracy to 98.56%.

Ngadi et al. [30] used the NSVC algorithm to test different classification methods including RBF, Poly, and Linear. They then compared the results with other classification methods such as Naïve Bayes, DT, K-NN, SVM, RF, and AdaBoost. Among those other methods, RF had the best result with 93% accuracy, whereas NSVC reached 99% accuracy (see Table 1), which indicates that NSVC was better than the other methods.

Jiang and Xu [31] used Diffusion-Weighted Magnetic Resonance Images (DWI) for breast cancer detection. They used two types of features, one based on ROI and another based on ADC, on data from 61 patients. Moreover, they implemented the RF-RFE method together with the RF algorithm. The study findings show that the accuracy of RF-RFE with RF on Histogram + GLCM features is 77.05%, which indicates that texture-based features have a critical role in improving performance and detection.

Salma [32] selected two different data sets, WBCD and KDD, and used FM-ANN for both of them. The results were compared with other techniques (RBF, FNN, and MNN). After training and testing, KDD achieved a better accuracy of 99.96% because it had more features. Comparing the results, FM-ANN proved to be more accurate.

Bevilacqua et al. [33] selected MR images for training and testing. After extracting and processing the data, they used ANN for classification and detecting breast cancer. When a Genetic Algorithm was used to optimize the ANN, the observed specificity was 90.46%, the sensitivity was 89.08%, the average accuracy improved to 89.77%, and the highest accuracy reached 100%.

Table 1 summarizes the related work and the ML methods reviewed in this study [17-33]. It contains the references, the types of extracted features, the data sets, and the measured performances. Performance is the most significant feature in choosing the proper method.
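The studies above report sensitivity, specificity, precision, accuracy, and MCC. As a minimal sketch of how these measures follow from the four confusion-matrix counts, the function below uses the standard definitions; the function name and the example counts are hypothetical and do not reproduce any result in Table 1.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common diagnostic metrics from confusion-matrix counts."""
    # Denominator of the Matthews Correlation Coefficient.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for 100 cases (illustration only).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting several of these measures together matters because, as the GMM and KNN comparison in [18] shows, a method can lead on specificity while trailing on sensitivity and MCC.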


Table 1
Related work on different types of methodology, features, dataset, and references for breast cancer detection

Each entry lists: R (reference), Methodology, Features, Data Base, Dataset, Performance.

[17]
Methodology: SVM
Features: Variance, Range, Compactness
Data Base: Mammogram
Dataset: Digital Database for Screening Mammography (DDSM)
Performance:
  Feature       MCC     Sensitivity  Specificity  Accuracy
  Variance      83.2%   95%          88%          91.5%
  Range         82.1%   94%          88%          90.5%
  Compactness   70%     86%          84%          85%

[18]
Methodology: GMM, KNN
Features: Tissue
Data Base: Microwave Tomography Image
Dataset: ETRI
Performance:
  Method  MCC   Sensitivity  Specificity  Precision  Accuracy
  KNN     67%   87%          84%          70%        80-90%
  GMM     48%   67%          86%          70.8%      70-80%

[19]
Methodology: SVM, KNN, RSDA
Features: Fuzzy Histogram Hyperbolization, Fuzzy C-mean, and Gray level dependence model
Data Base: Mammogram
Dataset: Mammographic Image Analysis Society (MIAS)
Performance:
  Class      Training set  Accuracy %
  Normal     70            100
  Benign     60            96.67
  Malignant  50            94

[20]
Methodology: SVM, GMM
Features: Contrast, Homogeneity, Mean, Correlation, Energy, Maximum
Data Base: Mammography
Dataset: DDSM, University of South Florida
Performance:
  Method  MCC     Sensitivity  Specificity
  SVM     78.78%  82%          96%
  GMM     72.06%  84%          86%

[21]
Methodology: LRC
Features: Mitoses, Marginal-Adhesion, Normal Nucleoli, Clump Thickness, Bland Chromatin, Uniformity of cell shape, Single Epithelial cell size, Uniformity of cell size, Bare Nuclei
Data Base: Standard Data
Dataset: UCI
Performance:
  Method  Accuracy percentage
  LRC     99.25
  BFI     95.46
  ID3     92.99
  J48     98.14
  SVM     96.40

[22]
Methodology: SVM, ANN, NB, Adaboost tree, PCA
Features:
  WBC: Mitoses, Marginal-Adhesion, Normal Nucleoli, Clump Thickness, Bland Chromatin, Uniformity of cell shape, Single Epithelial cell size, Uniformity of cell size, Bare Nuclei
  WDBC: Radius, Texture, Perimeter, Area, Smoothness, Compactness, Concavity, Concave Points, Symmetry, Fractal Dimension
Data Base: Standard Data
Dataset: Wisconsin Breast Cancer Database Original (WBC); Wisconsin Diagnostic Breast Cancer Database (WDBC)
Performance (accuracy percentage):
  Method         WBC    WDBC
  SVM            97.10  97.99
  PCs-SVM        97.47  98.12
  PCi-SVM        96.73  97.90
  ANN            89.88  99.60
  PCs-ANN        95.52  99.61
  PCi-ANN        94.33  99.63
  Naïve          96.21  93.32
  PCs-Naïve      96.50  91.79
  PCi-Naïve      96.16  91.72
  Adaboost       95.84  97.19
  PCs-Adaboost   96.24  96.73
  PCi-AdaBoost   96.32  96.83

[23]
Methodology: ANN, SVM
Features: Mitoses, Marginal-Adhesion, Normal Nucleoli, Clump Thickness, Bland Chromatin, Uniformity of cell shape and size, Single Epithelial cell size, Bare Nuclei
Data Base: Standard Data
Dataset: Wisconsin Breast Cancer Database (WBCD)
Performance:
  Method  Accuracy  Sensitivity  Specificity  AUC
  SVM     99.51%    99.25%       100%         99.63%
  ANN     98.54%    99.25%       97.22%       98.24%

[24]
Methodology: St-SVM, PSVM, LSVM, NSVM, LPSVM, SSVM
Features: Mitoses, Marginal-Adhesion, Normal Nucleoli, Clump Thickness, Bland Chromatin, Uniformity of cell shape and size, Single Epithelial cell size, Bare Nuclei
Data Base: Mammography
Dataset: WBCD
Performance:
  Method  Accuracy  Sensitivity  Specificity  ROC
  LPSVM   97.1429   98.2456      95.082       99.38
  LSVM    95.4286   96.5217      93.3333      97.18
  SSVM    96.5714   96.5812      96.5517      98.35
  PSVM    96        97.3684      93.4426      97.75
  NSVM    96.5714   96.5812      96.5517      98.35
  St-SVM  94.86     95.65        93.33        96.61

[25]
Methodology: Weighted Hierarchical Adaptive Voting Ensemble (WHAVE), Disjunctive Normal Form (DNF) rule-based method, DT, NB, SVM
Features: Mitoses, Marginal-Adhesion, Normal Nucleoli, Clump Thickness, Bland Chromatin, Uniformity of cell shape and size, Single Epithelial cell size, Bare Nuclei
Dataset: WBCD
Performance:
  Method                Accuracy percentage
  DNF                   65.72
  DT                    94.74
  NB                    84.5
  SVM                   99.54
  Hybrid                99.54
  KNN                   97.14
  Quadratic Classifier  97.14
  WHAVE                 99.8

[26]
Methodology: SVM, RBF kernel
Features: Phylogenetic trees, Statistical Features, and Local Binary Patterns
Data Base: DDSM
Dataset: MIAS
Performance (Model I: TIF %; Model II: LBP %; Model III: TIF and LBP %):
  Training  Testing  Model I               Model II              Model III
  (%)       (%)      Accuracy Specificity  Accuracy Specificity  Accuracy Specificity
  80        20       64       58           54       51           66       60
  70        30       71       66           52       49           65       61
  60        40       76       73           68       64           80       76
  50        50       70       76           64       60           72       67

[27]
Methodology: KNN
Features: Mean, Standard Deviation
Data Base: Thermogram
Dataset: Federal Fluminense University Hospital
Performance (accuracy, KNN):
  Normal   Abnormal
  94.44%   88.88%

[28]
Methodology: Bayes Net (BN), Multi-Class Classifier, DT, Radial Basis Function, RF
Features: TP Rate, FP Rate, Precision, Recall, F-measure, ROC area
Data Base: Blood Serum
Dataset: Department of Biochemistry and Molecular Biology of Kasr Alainy
Performance:
  Method    TP rate  FP rate  Precision  Recall  F      ROC
  BN        0.947    0.035    0.949      0.947   0.945  0.995
  Multi CC  0.933    0.043    0.933      0.933   0.93   0.987
  DT        0.87     0.084    0.878      0.87    0.868  0.966
  RBF       0.774    0.128    0.722      0.774   0.739  0.908
  RF        0.99     0.007    0.99       0.99    0.99   1

[29]
Methodology: Logistic Regression (LR), DT, KNN, Cubic SVM (CSVM)
Features: Radius, Texture, Perimeter, Area, Smoothness, Compactness, Concavity, Concave Points, Symmetry, Fractal Dimension
Data Base: Microscope Digital Image
Dataset: UCI
Performance:
  Model                                    Accuracy percentage
  DT with 30 features                      92.51
  KNN with 30 features                     91.56
  LR with 3 features                       96.27
  LR with 6 features                       97.77
  LR with 30 features                      95.65
  LSVM with 3 features                     97.47
  LSVM with 10 features                    97.87
  LSVM with 30 features                    97.30
  CSVM with 11 features                    97.98
  SVM and CSVM                             98.56
  CSVM with 30 features                    98
  Stacking the Logistic, LSVM, and CSVM    98.56

[30]
Methodology: NSVC
Features: BI-RADS, Age, Shape, Margin, Density, Severity
Data Base: Mammography
Dataset: UCI
Performance: Accuracy: 99%

[31]
Methodology: RF-Recursive Feature Elimination (RF-RFE) method
Features:
  ROI: Mean, Variance, Skewness, Kurtosis, Energy, Entropy
  ADC: Contrast, Entropy, ASM, Correlation
Data Base: Diffusion-Weighted Magnetic Resonance Image (DW (Convert to ADC)-MRI)
Dataset: Zhejiang Cancer Hospital
Performance:
  Features / method   Accuracy  Sensitivity  Specificity  AUC
  RF-RFE and RF       77.05%    84.21%       65.21%       0.76
  Histogram           68.85%    76.32%       56.52%       0.73
  GLCM                65.57%    71.05%       56.52%       0.63
  Histogram + GLCM    77.05%    84.21%       65.21%       0.76

[32]
Methodology: Fast Modular Artificial Neural Network (FM-ANN)
Features:
  WBCD: f4, f8, f12, f14, f24, f27, f28
  KDD: f22, f29, f47, f50, f60, f61, f62, f63, f64, f65, f71, f97f80, f98, f108
Data Base: X-Ray
Dataset: WBCD, KDD Cup 2008
Performance:
  Data / split  Feedforward %  MLP %  RBF %  MNN %  FM-ANN
  WBCD 70:30    98.45          91.50  93.75  99.22  99.80
  WBCD 50:50    94.91          89.5   90.65  93.57  95.71
  KDD 70:30     94.91          93.95  98.45  99.22  99.96
  KDD 50:50     93.21          92.95  97.98  98.22  98.96
  WBCD after training: Accuracy 99.8
  KDD Cup 2008 after training: Accuracy 99.96

[33]
Methodology: Optimized ANN
Features: Size, Convexity, Solidity, Eccentricity, Aspect ratio, Circularity, the standard deviation value of the gray levels of images with and without MC in ROIs
Data Base: MRI
Dataset: Radiologists of the University of Bari Aldo Moro
Performance:
  Method         High Accuracy  Average Accuracy  Sensitivity  Specificity
  Optimized ANN  100%           89.77%            89.08%       90.46%

According to Figure 2, most researchers have worked on mammogram images, as this approach is quicker than other types of breast cancer detection, and it is safe and more effective [34].

Figure 3 compares the ML methods and algorithms employed for breast cancer detection in the reviewed literature listed in Table 1. It is observed that SVM is the most frequently used method. Figure 4, in turn, presents the results of breast cancer detection using ML methods.

V. CONCLUSION

In the present paper, breast cancer and ML were introduced, and an in-depth literature review was performed on existing ML methods used for breast cancer detection. The findings of the reviewed studies suggest that SVM is the most popular method used for cancer detection applications. SVM was used either alone or combined with another method to improve the performance. The maximum accuracy achieved with SVM (single or hybrid) was 99.8%, which may be improvable to 100%: the work of [33], which used an optimized ANN on MRI, reported a highest accuracy of 100% in detecting breast cancer. This method can be applied and tested on other datasets, such as mammogram and ultrasound, to check its performance on different data types. The mammogram was the most frequently used data set compared to other types of data such as ultrasound images, thermal images, or blood features.

[Figure 2: pie chart. Legend: Mammogram, Standard, MRI, MTI, Thermogram, Blood Test, MDI, DW, X-Ray. Slice labels: 35%, 17%, 12%, and six slices of 6%.]

Figure 2: Different breast cancer detection methods
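The paper does not specify how the algorithms in a hybrid model are combined. As one simple possibility, stated here purely as an assumption, the outputs of separately trained classifiers can be merged by hard majority voting. The sketch below uses made-up prediction lists, and the names majority_vote, svm_out, knn_out, and dt_out are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample by hard voting."""
    combined = []
    # zip(*predictions) groups the votes that all classifiers cast for one sample.
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        combined.append(counts.most_common(1)[0][0])
    return combined


# Hypothetical outputs of three already-trained classifiers on four samples.
svm_out = ["malignant", "benign", "benign", "malignant"]
knn_out = ["malignant", "benign", "malignant", "benign"]
dt_out = ["benign", "benign", "malignant", "malignant"]

hybrid = majority_vote([svm_out, knn_out, dt_out])
print(hybrid)
```

Using an odd number of voters keeps two-class votes free of ties. Weighted schemes such as the WHAVE method of [25] or the stacking used in [29] are more elaborate alternatives that this sketch does not attempt to reproduce.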


[Figure 3: bar chart titled "Popularity of Machine Learning Methods". Methods shown: NSVC, MCC, BNN, DNF, PCA, ABT, RSDA, RBF, NB, LRC, GMM, RF, DT, ANN, K-NN, SVM.]

Figure 3: Using machine learning methods in cancer detection

[Figure 4: bar chart. Vertical axis: Accuracy (%), 0 to 100. Horizontal axis: references [17] to [33].]

Figure 4: Accuracy percentages in different literatures

REFERENCES

[1] World Health Organization, “Cancer country profiles 2014,” WHO, http://www.who.int/cancer/country-profiles/en/
[2] M. Stalin, and R. Kalaimagal, “Breast cancer diagnosis from low-intensity asymmetry thermogram breast images using fast support vector machine,” i-manager's Journal on Image Processing, vol. 3, no. 3, pp. 17–26, 2016.
[3] R. Kirubakaran, T. C. Jia, and N. M. Aris, “Awareness of Breast Cancer among Surgical Patients in a Tertiary Hospital in Malaysia,” Asian Pacific Journal of Cancer Prevention, 2017, vol. 18, no. 1, pp. 115–120.
[4] T. M. Khan, and S. A. Jacob, “Brief review of complementary and alternative medicine use among Malaysian women with breast cancer,” Journal of Pharmacy Practice and Research, 2017, vol. 47, no. 2, pp. 147–152.
[5] L. Caplan, “Delay in breast cancer: implications for the stage at diagnosis and survival,” Frontiers in Public Health, 2014, vol. 2, Article 87, pp. 1–6.
[6] M. A. Richards, A. M. Westcombe, S. B. Love, P. Littlejohns, and A. J. Ramirez, “Influence of delay on survival in patients with breast cancer: a systematic review,” The Lancet, 1999, vol. 353, no. 9159, pp. 1119–1126.
[7] B. Stewart and C. P. Wild, World Cancer Report 2014, International Agency for Research on Cancer, WHO, 2014.
[8] S. A. Korkmaz, and M. Poyraz, “A New Method Based for Diagnosis of Breast Cancer Cells from Microscopic Images: DWEE—JHT,” J. Med. Syst., vol. 38, no. 9, p. 92, 2014.
[9] P. Louridas, and C. Ebert, “Machine Learning,” IEEE Softw., vol. 33, no. 5, pp. 110–115, 2016.
[10] A. Simons, “Using artificial intelligence to improve early breast cancer detection,” 2017. Retrieved on April 10, 2018, from https://www.csail.mit.edu/news/using-artificial-intelligence-improve-early-breast-cancer-detection
[11] E. Ali, and W. Feng, “Breast Cancer classification using Support Vector Machine and Neural Network,” International Journal of Science and Research, pp. 2013, 2319-7064.
[12] S. Medjahed, T. Saadi, and A. Benyettou, “Breast Cancer Diagnosis by using k-Nearest Neighbor with Different Distances and Classification Rules,” International Journal of Computer Applications, 2013, vol. 62, no. 1, pp. 0975 – 8887.
[13] R. Sumbaly, N. Vishnusri, and S. Jeyalatha, “Diagnosis of Breast Cancer using Decision Tree Data Mining Technique,” International Journal of Computer Applications, 2014, vol. 98, no. 10, pp. 0975 – 8887.
[14] M. Elgedawy, “Prediction of Breast Cancer using Random Forest, Support Vector Machines and Naïve Bayes,” International Journal of Engineering and Computer Science, 2017, vol. 6, no. 1, pp. 19884–19889.
[15] R. Senkamalavalli, and T. Bhuvaneswari, “Improved classification of breast cancer data using hybrid techniques,” International Journal of Advanced Research in Computer Science, 2017, vol. 8, no. 8, pp. 454–457.
[16] A. Hazra, S. Mandal, and A. Gupta, “Study and Analysis of Breast Cancer Cell Detection using Naïve Bayes, SVM and Ensemble Algorithms,” International Journal of Computer Applications, 2016, vol. 145, no. 2, pp. 0975 – 8887.
[17] S. Gc, R. Kasaudhan, T. K. Heo, and H. D. Choi, “Variability Measurement for Breast Cancer Classification of Mammographic Masses,” in Proceedings of the 2015 Conference on research in adaptive and convergent systems (RACS), Prague, Czech Republic, 2015, pp. 177–182.
[18] C. Wang, W. Wang, S. Shin, and S. I. Jeon, “Comparative Study of Microwave Tomography Segmentation Techniques Based on GMM and KNN in Breast Cancer Detection,” in Proceedings of the 2014 Conference on Research in Adaptive and Convergent Systems (RACS '14), Towson, Maryland, 2014, pp. 303–308.
[19] C. L. Chowdhary, and D. P. Acharjya, “Breast Cancer Detection using Intuitionistic Fuzzy Histogram Hyperbolization and Possibilitic Fuzzy c-mean Clustering algorithms with texture feature-based Classification on Mammography Images,” in Proceedings of the International Conference on Advances in Information Communication Technology & Computing, Bikaner, India, 2016, pp. 1–6.
[20] S. Aminikhanghahi, S. Shin, W. Wang, S. I. Jeon, S. H. Son, and C. Pack, “Study of wireless mammography image transmission impacts on robust cyber-aided diagnosis systems,” Proc. 30th Annu. ACM Symp. Appl. Comput. - SAC ’15, pp. 2252–2256, 2015.
[21] S. G. Durai, S. H. Ganesh, and A. J. Christy, “Novel Linear Regressive Classifier for the Diagnosis of Breast Cancer,” in Computing and Communication Technologies (WCCCT), 2017 World Congress on, 2017.
[22] H. Wang, and S. W. Yoon, “Breast cancer prediction using data mining method,” IIE Annu. Conf. Expo 2015, pp. 818–828, 2015.
[23] S. Hafizah, S. Ahmad, R. Sallehuddin, and N. Azizah, “Cancer Detection Using Artificial Neural Network and Support Vector Machine: A Comparative Study,” J. Teknol, vol. 65, pp. 73–81, 2013.
[24] A. T. Azar, and S. A. El-Said, “Performance analysis of support vector machines classifiers in breast cancer mammography recognition,” Neural Comput. Appl., vol. 24, no. 5, pp. 1163–1177, 2014.
[25] C. Deng, and M. Perkowski, “A Novel Weighted Hierarchical Adaptive Voting Ensemble Machine Learning Method for Breast Cancer Detection,” Proc. Int. Symp. Mult. Log., vol. 2015–Septe, pp. 115–120, 2015.
[26] A. U. Rehman, N. Chouhan, and A. Khan, “Diverse and Discriminative Features Based Breast Cancer Detection Using Digital Mammography,” 2015 13th Int. Conf. Front. Inf. Technol., pp. 234–239, 2015.
[27] T. M. Mejia, M. G. Perez, V. H. Andaluz, and A. Conci, “Automatic Segmentation and Analysis of Thermograms Using Texture Descriptors for Breast Cancer Detection,” 2015 Asia-Pacific Conf. Comput. Aided Syst. Eng., pp. 24–29, 2015.
[28] H. Ayeldeen, M. A. Elfattah, O. Shaker, A. E. Hassanien, and T.-H. Kim, “Case-Based Retrieval Approach of Clinical Breast Cancer Patients,” 2015 3rd Int. Conf. Comput. Inf. Appl., pp. 38–41, 2015.
[29] T. K. Avramov and D. Si, “Comparison of Feature Reduction Methods and Machine Learning Models for Breast Cancer Diagnosis,” Proc. Int. Conf. Comput. Data Anal. - ICCDA ’17, pp. 69–74, 2017.
[30] M. Ngadi, A. Amine, and B. Nassih, “A Robust Approach for Mammographic Image Classification Using NSVC Algorithm,” Proc. Mediterr. Conf. Pattern Recognit. Artif. Intell. - MedPRAI-2016, pp. 44–49, 2016.
[31] Z. Jiang, and W. Xu, “Classification of benign and malignant breast cancer based on DWI texture features,” ICBCI 2017 Proceedings of the International Conference on Bioinformatics and Computational Intelligence 2017.
[32] M. U. Salma, “Fast Modular Artificial Neural Network for the Classification of Breast Cancer Data,” Proc. Third Int. Symp. Women Comput. Informatics - WCI ’15, pp. 66–72, 2015.
[33] V. Bevilacqua, A. Brunetti, M. Triggiani, D. Magaletti, M. Telegrafo, and M. Moschetta, “An Optimized Feed-forward Artificial Neural Network Topology to Support Radiologists in Breast Lesions Classification,” Proc. 2016 Genet. Evol. Comput. Conf. Companion - GECCO ’16 Companion, pp. 1385–1392, 2016.
[34] M. Rmili, and A. El, “A Combined Approach for Breast Cancer Detection in Mammogram,” 2016 13th International Conference on Computer Graphics, Imaging and Visualization, pp. 350–353, 2016.
