Skin Cancer Detection Using Machine Learning
Skin Cancer Detection Using Machine Learning
INTRODUCTION
In computer science, “intelligent” refers to take some action and to maximize the change of its
success. Therefore, the term artificial intelligent means that, to implement human associated
actions. By using some agents and to maximize the successive and accurate completion of their
goals “artificial intelligent deals with giving machines ability to seem like they have human
intelligent”. That means, “it’s the development of intelligent system and application that can
reasoning, learning, gather knowledge, converse, operate and observe the objects “. Similarly,
(CI) computational intelligence is also the branch of artificial intelligence and which the
vehemence is placed on experimental algorithm such as neural networks, fuzzy system and
evolutionary computation. AI makes machine smarter and easy for use. Through AI
implementation machine can perform the assigned task much faster and with higher accuracy
than a human. An intelligent system can translate, understand and answer/respond to the natural
language.
Artificial intelligence (AI) is a creative tool that simulates the human intelligence and ability
processes by machines, principally computer systems, robotics, and digital equipment. In this
chapter we had discus the role of Artificial intelligence in agriculture that how we can classify
diseases of plants. Problem description, Goals and objective of our study, significance of the
study and some useful contributions.
1.1 Introduction
In the past two decade, from 2008 to 2022, the annual number of melanoma cases has increased
by 53%, partly due to increased UV exposure [1, 2]. Although melanoma is one of the most
lethal types of skin cancer, a fast diagnosis can lead to a very high chance of survival. Skin
cancer one of the most common cancers in humans and its incidence is increasing dramatically
[3]. New incidences of the lethal skin cancer malignant melanoma (MM) in Denmark has
increased fivefold to six fold from 1942 to 1982, and the mortality rate has been doubled from
1955 to 1982 [4]. Currently, approximately 800 cases of MM are reported in Denmark every year
(approximately 15/100 000). In Germany, 9000–10 000 new cases are expected every year
1
(approximately 13/100
2
000) with an annual increase of 5%–10% [5]. Basal cell carcinoma (BCC) is the most common
of skin tumors and is mainly considered to be provoked by ultraviolet radiation and does not
metastasize. In contrast, MM can metastasize rapidly. This cancer is also considered to be
provoked by ultraviolet radiation, most probably by repeated high doses resulting in heavily
burned skin.
The first step in the diagnosis of a malignant lesion by a dermatologist is visual examination of
the suspicious skin area. A correct diagnosis is important because of the similarities of some
lesion types; moreover, the diagnostic accuracy correlates strongly with the professional
experience of the physician [6]. Without additional technical support, dermatologists have a
65%-80% accuracy rate in melanoma diagnosis [7]. In suspicious cases, the visual inspection is
supplemented with dermoscopic images taken with a special high-resolution and magnifying
camera. During the recording, the lighting is controlled, and a filter is used to reduce reflections
on the skin, thereby making deeper skin layers visible. With this technical support, the accuracy
of skin lesion diagnosis can be increased by a further 49% [8]. The combination of visual
inspection and dermoscopic images ultimately results in an absolute melanoma detection
accuracy of 75%-84% by dermatologists [9, 10].
There are different types of melanoma skin cancers such as nodular melanoma, superficial
spreading melanoma, and accrual trigonous melanoma and lentigo maligna. The majority of
cancer cases fall under the umbrella of non-melanoma categories, such as basal cell carcinoma
(BCC), and sebaceous gland carcinoma (SGC) and squamous cell carcinoma (SCC). BCC, SCC,
and SGC are formed in the middle and upper layers of the epidermis, respectively. These cancer
cells have little tendency to spread to other parts of the body. Non-melanoma cancers can be
easily treated as compared to melanoma cancers. Therefore, the key factor in treating skin cancer
is early detection. Doctors usually use the biopsy method to detect skin cancer. In this procedure,
a sample is taken from a suspected skin lesion for medical examination to determine whether it is
cancerous. This process is very painful, slow, and also time consuming. Computer-based
technology enables convenient, cheaper, and quicker diagnosis of skin cancer symptoms. To
study the skin cancer symptoms, whether they represent melanoma or non-melanoma, several
techniques are proposed that are non-invasive in nature. The general procedure in skin cancer
detection is to acquire the
3
image, pre-process it, segment the acquired pre-processed image, extract the desired feature, and
classify it.
Machine learning and deep learning has revolutionized the entire diagnosis methods landscape
over the past few decades. It is the most challenging subfield of machine learning dealing with
artificial neural network algorithms. These algorithms are inspired by the function and structure
of the human brain. Deep learning techniques are used in a variety of fields such as speech
recognition, pattern recognition, and bioinformatics. Compared to other classic machine learning
approaches, deep learning systems have achieved impressive results in these applications.
Various deep learning approaches have been used for computer-assisted skin cancer detection in
recent years. In this article, we discuss and analyze skin cancer detection techniques based on
deep learning.
In this study we used different supervised machine learning algorithms to done comparative
analysis of these models on skin cancer classification images dataset to classify this skin cancer
disease into benign and malignant,
1.2 Motivation
In recognition of continuing increases in the incidence of skin cancer, including malignant
melanoma, the American Academy of Dermatology has encouraged dermatologic communities
nationwide to offer free skin cancer screening to the public. Memorial Sloan-Kettering Cancer
Center took part in one such effort last spring. This article summarizes the results of a survey of
that center's participants. The data revealed that more than 90% of their attendees learned of the
screening through the mass media. In rural areas most of the people when face such kind of skin
injury so they ignore them by saying that it will recovered soon or they simple not go to the
doctor for a little injury and cannot pay the expensive fee of doctor for that. Basically they have
skin cancer issue and they unaware about it so we develop skin cancer detection application
using Machine learning for examining skin cancer in initial stage.
This application will help to the needy people or poor people which can’t afford treatments of
such a skin cancer disease. This application also provides the initial detection of skin diseases
either it has effected or not. The user will simply install this application on their mobiles
through which
4
they can scan the effected area and then this will show the result to the user that the selected area
is effected or not.
5
• Upload picture or use camera to capture picture for diagnosis.
• The app would analyze that uploaded image through use of different ML models such as
SVM, Neural Network, Bagged Tree Ensemble, K-Nearest Neighbor (KNN) to compare
their accuracy and implement the best performing model in the App. and show the result.
• There are two classes of skin cancer that are skin -cancerous or non-cancerous
• The app would display previous result or record of user.
6
CHAPTER 02
REQUIREMENT ANALYSIS
2.1 Overview
In this chapter, we will discuss some of the existing research on skin cancer classification which
helps dermatologist and doctors in the classification of skin diseases. In this section we further
discuss the current research on machine learning and deep learning-based detection and
classification of types of skin cancer.
This is noticeable that ABCD rubrics are subjective which results in a high inter as well as
intraobserver bias [20]. Therefore, a high-quality initiative feature technique was utilized to
represent the asymmetry characteristic of ABCD rubrics skin cancer images [21]. The proposed
system achieved 86% performance accuracy. Amelard et. al. [22] extended his findings by
adding new high-quality initiative features for distinct color channels and achieved 94%
accuracy. The color features can be extracted using statistical values of color channels. Few other
colors extracted
7
approaches include color’s irregularity, centroid distance, and LUV histogram distance [19,
2325]. In another group of researchers classifies melanoma skin cancer based on local and global
features. They achieved 93% accuracy by combining textual features with color features.
Some other researchers utilized the same approach but in addition to textual and color features,
they also utilized shape-based features. Ganster et. al. [26]used the KNN classifier along with
colored and shaped features from lesion skin. For training, 5300 images were taken and obtained
87% sensitivity and 92% specificity. Rubegni et. al. [27] used textual and geometrical features as
well and achieved the same sensitivity and specificity of 96%. Used a feature vector of shape,
texture, and color features with SVM and achieved 93% sensitivity and 92% specificity.
Almansour et. al. [28] utilized the SVM classifier with color and textual features and attained
90% performance accuracy. The textual feature of the image represents the spatial distribution of
pixel intensity levels. Textural features show intensity levels’ underlying pattern and layout and
serve as one of the most discriminative features for object or ROI detection. The textural features
are generally used for skin cancer analysis as it calculates structural irregularity between nevus
and melanoma cancer classification [29]. It has been observed that the computerized examination
is getting popular. Epilu-computerized microscopy (ELM) based method was proposed to
enhance the initial diagnosis of a melanoma classification scheme for the computerized analysis.
ROI was extracted by utilizing the segmentation algorithms and the merged featured approach
was utilized based on the shape and radiometric features. K Nearest Neighbor was utilized and
attained 87% and 92% sensitivity and specificity respectively. The automated analysis of data for
the melanoma initial recognition scheme [30] achieved 80% for both specificity and sensitivity
utilizing asymmetry and boundary descriptor with Support Vector Machine as a classifier.
Iyatomi et. al. [31] utilized a similar technique and achieved 100% specificity and 95%
sensitivity. In [32], researchers achieved 91% and 88.2% specificity and sensitivity respectively
trained over 120 cancer images. In the researcher utilized active contour and watershed methods
for the segmentation. They extracted shape, color, and texture related features. This proposed
architecture was trained on 50 DermIS datasets and attained 80% performance accuracy. A new
CAD technique of early identification of melanoma was proposed for web and android phone
applications. All the images were taken from high definition cameras rather than utilizing any
repository [33]. The proposed system utilized digital cameras with context information like kind
of skin, span, and all
8
the affected parts of the body. The proposed architecture also tried to remove the Dermoscopic
ADCD compatible features. This attribute was then classified in numerous phases including a
preprocessing phase for selection of association-based attributes. This scheme attained 68%
specificity and 94% sensitivity on 45 nevus and 107 melanoma skin cancer datasets.
The degree of something that performs a required operation without failure and to be accurate is the
reliability. This application is developed to be run on any android devices.
2.4.2 Maintainability:
The change in software after development has been completed is known as maintenance. As the times
passed, the needs are changed and the application will be updated and modified time by time and as
look after the user’s feed backs.
2.4.3 Portability:
Portability is way in which software can be able to be transferred from one place to another or from a
machine to another. As the android phones are portable, this application is also having the ability to be
transferred from one android to another.
9
CHAPTER 03
METHODOLOGY
3.1 Overview
In this chapter we will first present the overall description of the framework and
subsequently describe the individual components and then it will describe in detail
the concept, and the implementation of the framework.
3.3 Preprocessing
10
3.3.2 Image Preprocessing
Due to the presence of dewdrops, dust, and insect excrement on the plants, image
preprocessing is required to provide superior outcomes in subsequent steps. These
effects are referred to as the corn image's noise. To get over these issues the source
to deliver, an RGB photo is converted to a grayscale image. Precise outcomes the
size of the images in this situation is extremely huge, which necessitates the use of
large fonts. It is required to reduce the image size. This image reduction can be
used for a variety of purposes and reduce the size of the memory.
3.3.3 Feature Extraction
Feature extraction extracts the characteristics of the objects in the photos. These
collected characteristics are utilized to depict an entity. These characteristics were
retrieved and grouped into three groups: shape, color, and texture. It's possible that
the infections will spread. Diseases cause them to change their shapes into a variety
of various image shapes. The disease can be clearly identified by the shape of the
features in the model. These are the shapes axes, areas, and angles of the features
differ. Color is the second parameter is a key component of these three
characteristics. It distinguishes the diseases from one another to each other the third
option, texture, determines how the color patterns are created strewn over the
images. The color information is extracted using RGB feature extraction. Among
the most commonly used photos for pattern recognition and processing RGB stands
for red, green, and blue. Highly recommended for image object detection. It has
undergone a dramatic transformation colored.
Following feature descriptor are used to extract features from images dataset:
11
3.4 Machine Learning Models Architecture
Skin cancer is the most common human malignancy, is primarily diagnosed visually, beginning
with an initial clinical screening and followed potentially by dermoscopic analysis, a biopsy and
histopathological examination. Automated classification of skin lesions using images is a
challenging task owing to the fine-grained variability in the appearance of skin lesions.
The dataset is taken from the ISIC (International Skin Image Collaboration) Archive. It consists of
1800 pictures of benign moles and 1497 pictures of malignant classified moles. The pictures have
all been resized to low resolution (224x224x3) RGB. The task of this kernel is to create a model,
which can classify a mole visually into benign and malignant.
As the dataset is pretty balanced, the model will be tested on the accuracy score, thus (TP +
TN)/(ALL).
1. Benign
2. Malignant
3. Normal
In this kernel I will try to detect 3 different classes of moles using Convolution Neural Network
with keras tensorflow in backend and then analyse the result to see how the model can be useful
in practical scenario.
In this kernel I have followed following 14 steps for model building and evaluation which are as
follows :
Step 4: Normalization
12
Step 7: Cross-validating model
Step 9: ResNet50
3.4.2 CNN
I used the Keras Sequential API, where you have just to add one layer at a time, starting from the
input.
The first is the convolutional (Conv2D) layer. It is like a set of learnable filters. I choosed to set 64
filters for the two firsts conv2D layers. Each filter transforms a part of the image (defined by the
kernel size) using the kernel filter. The kernel filter matrix is applied on the whole image. Filters
can be seen as a transformation of the image.
The CNN can isolate features that are useful everywhere from these transformed images (feature
maps).
The second important layer in CNN is the pooling (MaxPool2D) layer. This layer simply acts as a
downsampling filter. It looks at the 2 neighboring pixels and picks the maximal value. These are
used to reduce computational cost, and to some extent also reduce overfitting. We have to choose
the pooling size (i.e the area size pooled each time) more the pooling dimension is high, more the
downsampling is important.
Combining convolutional and pooling layers, CNN are able to combine local features and learn
more global features of the image.
Dropout is a regularization method, where a proportion of nodes in the layer are randomly ignored
(setting their wieghts to zero) for each training sample. This drops randomly a propotion of the
network and forces the network to learn features in a distributed way. This technique also improves
generalization and reduces the overfitting.
'relu' is the rectifier (activation function max(0,x). The rectifier activation function is used to add
non linearity to the network.
The Flatten layer is use to convert the final feature maps into a one single 1D vector. This flattening
step is needed so that you can make use of fully connected layers after some convolutional/maxpool
layers. It combines all the found local features of the previous convolutional layers.
In the end i used the features in one fully-connected (Dense) layer which is just artificial an neural
networks (ANN) classifier.
13
3.4.3 Model implementation
Step 1 : importing Essential Libraries
In [1]:
import os
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import os
from glob import glob
import seaborn as sns
from PIL import Image
np.random.seed(11) # It's my lucky number
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, KFold, cross_val_score,
GridSearchCV
from sklearn.metrics import accuracy_score
import itertools
import keras
from keras.utils.np_utils import to_categorical # used for converting labels to one-
hot-encoding
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras import backend as K
from keras.layers.normalization import BatchNormalization
from keras.utils.np_utils import to_categorical # convert to one-hot-encoding
from keras.optimizers import Adam, RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from keras.wrappers.scikit_learn import KerasClassifier
from keras.applications.resnet50 import ResNet50
from keras import backend as K
Using TensorFlow backend.
In [2]:
folder_benign_train = '../input/data/train/benign'
folder_malignant_train = '../input/data/train/malignant'
folder_benign_test = '../input/data/test/benign'
folder_malignant_test = '../input/data/test/malignant'
14
read = lambda imname: np.asarray(Image.open(imname).convert("RGB"))
# Create labels
y_benign = np.zeros(X_benign.shape[0])
y_malignant = np.ones(X_malignant.shape[0])
y_benign_test = np.zeros(X_benign_test.shape[0])
y_malignant_test = np.ones(X_malignant_test.shape[0])
# Merge data
X_train = np.concatenate((X_benign, X_malignant), axis = 0)
y_train = np.concatenate((y_benign, y_malignant), axis = 0)
# Shuffle data
s = np.arange(X_train.shape[0])
np.random.shuffle(s)
X_train = X_train[s]
y_train = y_train[s]
s = np.arange(X_test.shape[0])
np.random.shuffle(s)
X_test = X_test[s]
y_test = y_test[s]
In [3]:
# Display first 15 images of moles, and how they are classified
w=40
h=30
fig=plt.figure(figsize=(12, 8))
columns = 5
rows = 3
In [4]:
y_train = to_categorical(y_train, num_classes= 2)
y_test = to_categorical(y_test, num_classes= 2)
Step 4 : Normalization
Normalize all Values of the pictures by dividing all the RGB values by 255
In [5]:
# With data augmentation to prevent overfitting
X_train = X_train/255.
X_test = X_test/255.
In [6]:
# See learning curve and validation curve
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer=init))
model.add(Dense(num_classes, activation='softmax'))
model.summary()
if optim == 'rmsprop':
16
optimizer = RMSprop(lr=lr)
else:
optimizer = Adam(lr=lr)
Output:
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263:
colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a
future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout
(from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future
version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from
tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
18
Epoch 00007: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-06.
In [8]:
K.clear_session()
del model
del history
In [9]:
# define 3-fold cross validation test harness
kfold = KFold(n_splits=3, shuffle=True, random_state=11)
cvscores = []
for train, test in kfold.split(X_train, y_train):
# create model
model = build(lr=lr,
init= init,
activ= activ,
optim=optim,
input_shape= input_shape)
In [10]:
# Fitting model to all data
model = build(lr=lr,
init= init,
21
activ= activ,
optim=optim,
input_shape= input_shape)
model.fit(X_train, y_train,
epochs=epochs, batch_size= batch_size, verbose=0,
callbacks=[learning_rate_reduction]
)
print(accuracy_score(np.argmax(y_test, axis=1),y_pred))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
/opt/conda/lib/python3.6/site-packages/keras/callbacks.py:1109: RuntimeWarning: Reduce LR on
plateau conditioned on metric `val_acc` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
0.6818181818181818
In [11]:
# save model
# serialize model to JSON
model_json = model.to_json()
22
with open("model.json", "w") as json_file:
json_file.write(model_json)
Step 8: ResNet50
The CNN above is not a very sophisticated model, thus the resnet50, is also tried
In [12]:
input_shape = (224,224,3)
lr = 1e-5
epochs = 50
batch_size = 64
model = ResNet50(include_top=True,
weights= None,
input_tensor=None,
input_shape=input_shape,
pooling='avg',
classes=2)
model.compile(optimizer = Adam(lr) ,
loss = "binary_crossentropy",
metrics=["accuracy"])
23
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Output
Train on 2109 samples, validate on 528 samples
Epoch 1/50
- 33s - loss: 0.9927 - acc: 0.5974 - val_loss: 0.5673 - val_acc: 0.7614
Epoch 2/50
- 18s - loss: 0.5286 - acc: 0.7634 - val_loss: 0.5200 - val_acc: 0.6932
Epoch 3/50
- 18s - loss: 0.4536 - acc: 0.7757 - val_loss: 0.6281 - val_acc: 0.7254
Epoch 4/50
- 18s - loss: 0.4181 - acc: 0.7933 - val_loss: 0.5645 - val_acc: 0.7386
Epoch 5/50
- 18s - loss: 0.4011 - acc: 0.8075 - val_loss: 0.4142 - val_acc: 0.7898
Epoch 6/50
- 18s - loss: 0.3887 - acc: 0.8075 - val_loss: 0.4471 - val_acc: 0.7917
Epoch 7/50
- 18s - loss: 0.3729 - acc: 0.8132 - val_loss: 0.4708 - val_acc: 0.7822
Epoch 8/50
- 18s - loss: 0.3552 - acc: 0.8288 - val_loss: 0.4532 - val_acc: 0.7879
Epoch 9/50
- 18s - loss: 0.3419 - acc: 0.8369 - val_loss: 0.4875 - val_acc:
0.7955
Epoch 10/50
- 18s - loss: 0.3273 - acc: 0.8530 - val_loss: 0.4950 - val_acc:
0.8106
Epoch 11/50
- 18s - loss: 0.3069 - acc: 0.8563 - val_loss: 0.4702 - val_acc:
0.7822
Epoch 12/50
- 18s - loss: 0.3037 - acc: 0.8549 - val_loss: 0.4657 - val_acc: 0.7955
Epoch 13/50
- 18s - loss: 0.2920 - acc: 0.8729 - val_loss: 0.5437 - val_acc: 0.7803
Epoch 14/50
- 18s - loss: 0.2764 - acc: 0.8777 - val_loss: 0.4889 - val_acc: 0.7841
24
Epoch 15/50
- 18s - loss: 0.2717 - acc: 0.8758 - val_loss: 0.4500 - val_acc: 0.7936
In [13]:
# Train ResNet50 on all the data
model.fit(X_train, y_train,
26
epochs=epochs, batch_size= epochs, verbose=0,
callbacks=[learning_rate_reduction]
)
/opt/conda/lib/python3.6/site-packages/keras/callbacks.py:1109: RuntimeWarning: Reduce LR on
plateau conditioned on metric `val_acc` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Out[13]:
<keras.callbacks.History at 0x7fef804cb160>
In [14]:
linkcode
# Testing model on test data to evaluate
y_pred = model.predict(X_test)
print(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_pred, axis=1)))
# save model
# serialize model to JSON
resnet50_json = model.to_json()
27
3.5 SDLC Model
28
3.6 Flow Chart:
Description:
In the flow chart we describe the flow of the system as in the above diagram we defined the flow
of our system. Defining the user side of the flow chart first the user will enter user name and
password in the text field after entering the password the system will check either the password
is wrong or correct if the password is wrong the system will pop up an error message. If the user
entered the correct login credentials the system will let the user login to the system after entering
the correct password and user name the user will see a main page of the system from there the
user can select many options provided by the system the user can select from the many option
like import, take photo and after Analysis he can check the result either it have cancer or normal.
29
3.7 Use case Diagram:
Description
User can login into the Skin cancer detection app and then this information gets
stored in the database system. Users need to make profile first. User takes photo or
import image from mobile local location. App shows skin color and disease to user.
30
Figure 3.13 Registration Page
Description:
A signup page (also known as a registration page) enables users and organizations to independently
register and gain access to your system.
After clicking on the signup button, after selecting that he can enter valid data required for sign up to
create his account in our portal.
31
Figure 3.14 Login Page
Description:
A login page or an entry page that requires user identification and authentication, regularly performed
by entering a username and password combination. Logins may provide access to an entire site.
After creating account in the portal, the user will be shown a sign in page where he can login into the
portal with correct password and email.
32
Figure 3.15 Update Profile
Description:
In this section the system provide one of the option to update easily his profile he can take a photo or import
photo from gallery.
33
Figure 3.16 Profile Page
Description:
In this Profile page the user can can add his personal information for example his name, email, mobile
number.
34
Figure 3.17 Take Photo or Import Image
Description:
In this page the user have two options either import photo from gallery or take a photo on camera.
35
Figure 3.18 About Page
Description:
In this page the user can see or get info about cancer and its initial stages and also its signs and symptoms.
36
Figure 3.19 History of Skins Page
Description:
In History page user can see their previous results of their skin which add or store to this History page.
37
Figure 3.20 Storage in Database
Description:
The Firebase Real-time Database is a cloud-hosted NoSQL database that lets you store and sync data
between your users in real-time. NEW: Cloud Fire store enables you to store, sync and query app data at
global scale.
38
Figure 3.21 History in Database
Description:
In this database section we will see all user email profile image and other details.
39
Figure 3.22 Users in Database
Description:
In the real time database section Where we will see all the users email password name for which they signup.
40
Figure 3.23 Authentication of User in Database
Description:
Firebase authentication provides backend services, easy to use software development kit (SDK’s), and ready-
made UI libraries to authenticate users to our system.
41
CHAPTER 04
4.1 Overview
In this chapter we will talking about system configuration and experimental result, we present
the experimental evaluations of our proposed methods. Evaluating the performance of machine
learning models on corn leaf images.
42
Name Configuration
OS Window
Coding Python
This dataset was uploaded on website (Kaggle) which is hub of data for machine learning models.
This data set have two categories which are arrange as:
0) Benign
1) Malignant
It contains 2468 images which were categories accordingly Benign (1306) and Malignant (1162).
43
4.4Results and Model Evolution Learning
4.4.1 Curves
The main objective of our proposed architecture is to show that the pipeline we assembled, will
maximize classification accuracy, and minimize any loss of corn leaf disease classification. To
assess the performance of the models and as a design guide for opting a backbone, we compared
the model accuracy of the KNN, SVM, Naïve Bayes and Random Forest model backbones on
corn leaf images dataset and overall accuracy in this work.
Learning curves of KNN, SVM, Naïve Bayes and Random Forest for our proposed work are
given below.
Figure 4.2 Learning curves of KNN Figure 4.3 Learning curves of SVM
44
Figure 4.4 Learning curves of Naïve Bayes Figure 4.5 Learning curves of Random Forest
A confusion matrix is a table that is often used to describe the performance of a classification
model (or "classifier") on a set of test data for which the true values are known.
• True positives (TP): These are cases in which we predicted yes (they have the
disease), and they do have the disease.
• True negatives (TN): We predicted no, and they don't have the disease.
• False positives (FP): We predicted yes, but they don't have the disease.
• False negatives (FN): We predicted no, but they do have the disease.
45
Figure 4.6 Naïve Bayes Figure 4.7 KNN Confusion matrix
Figure 4.8 SVM Confusion matrix Figure 4.9 Random Forest Confusion matrix
46
In table 4.3 below shows the training and validation accuracy results of proposed framework
overall accuracy on 20 epochs. The table below shows performance of different machine
learning model on ultrasound images dataset in our work.
Model Accuracy
K-Nearest Neighbor 73
Support vector machine 79%
Naïve Bayes 71%
Random Forest 81%
47
CHAPTER 05
This chapter concludes the thesis and summarizes the findings of the presented work. It will first
give a summary of what has been done in this thesis. After that, and outlook will be given, in
which further developments or improvements are possible is in the present work.
5.2 Conclusions
In this work, we have applied supervised machine learning classification model with changing
the parameter and testing it on skin cancer images dataset using machine learning framework
Sckit learn (Simple and efficient tools for predictive data analysis). With the help of Machine
learning supervised models, we have extracted the features of an image using HOG feature
descriptor and have classified the image into benign and malignant. It is observed that the
classification accuracy mainly depends on how HOG extract features.
Machine learning represents powerful, efficient, highly accurate tool that reduces pathologist
and Farmers workload. It is expected to use machine learning approaches in various medical
situations such as image classification, object detection, and segmentation etc. We must have
basic knowledge of machine learning, anticipate the problems that will occur when it is
introduced, and prepare to address these various problems.
48
Our next challenge would be to improve the accuracy as close to 100 %. Currently in our work,
we only classify type of disease into benign and malignant. In the future, we will monitor
progression of these diseases. Hence, another future direction from this study will be extending
the model implementation on conventional smartphone processor to do fast and cheap on-device
inference
49
5.4 References
50
"Morphologic changes of pigmented skin lesions: a useful extension of the ABCD rule for
dermatoscopy," Journal of the American Academy of Dermatology, vol. 40, no. 4, pp. 558562,
1999.
[13] M. Keefe, D. Dick, and R. Wakeel, "A study of the value of the seven‐point checklist in
distinguishing benign pigmented lesions from melanoma," Clinical and experimental
dermatology, vol. 15, no. 3, pp. 167-171, 1990.
[14] M. Healsmith, J. Bourke, J. Osborne, and R. Graham‐Brown, "An evaluation of the
revised seven‐point checklist for the early diagnosis of cutaneous malignant melanoma,"
British Journal of Dermatology, vol. 130, no. 1, pp. 48-50, 1994.
[15] H. P. Soyer et al., "Three-point checklist of dermoscopy," Dermatology, vol. 208, no. 1,
pp. 27-31, 2004.
[16] A. Masood and A. Ali Al-Jumaily, "Computer aided diagnostic support system for skin
cancer: a review of techniques and algorithms," International journal of biomedical
imaging, vol. 2013, 2013.
[17] Z. Shi et al., "A deep CNN based transfer learning method for false positive reduction,"
Multimedia Tools and Applications, vol. 78, no. 1, pp. 1017-1033, 2019.
[18] S. Demyanov, R. Chakravorty, M. Abedini, A. Halpern, and R. Garnavi, "Classification
of dermoscopy patterns using deep convolutional neural networks," in 2016 IEEE 13th
international symposium on biomedical imaging (ISBI), 2016: IEEE, pp. 364-368.
[19] R. Moussa, F. Gerges, C. Salem, R. Akiki, O. Falou, and D. Azar, "Computer-aided
detection of Melanoma using geometric features," in 2016 3rd Middle East Conference
on Biomedical Engineering (MECBME), 2016: IEEE, pp. 125-128.
[20] G. Argenziano et al., "Dermoscopy of pigmented skin lesions: results of a consensus
meeting via the Internet," Journal of the American Academy of Dermatology, vol. 48, no.
5, pp. 679-693, 2003.
[21] R. Amelard, A. Wong, and D. A. Clausi, "Extracting high-level intuitive features (HLIF)
for classifying skin lesions using standard camera images," in 2012 Ninth Conference on
Computer and Robot Vision, 2012: IEEE, pp. 396-403.
[22] R. Amelard, J. Glaister, A. Wong, and D. A. Clausi, "High-level intuitive features
(HLIFs) for intuitive skin lesion description," IEEE Transactions on Biomedical
Engineering, vol.
51
62, no. 3, pp. 820-831, 2014.
[23] M. E. Celebi et al., "A methodological approach to the classification of dermoscopy
images," Computerized Medical imaging and graphics, vol. 31, no. 6, pp. 362-373, 2007.
[24] R. J. Stanley, W. V. Stoecker, and R. H. Moss, "A relative color approach to color
discrimination for malignant melanoma detection in dermoscopy images," Skin Research
and Technology, vol. 13, no. 1, pp. 62-72, 2007.
[25] C. Barata, M. Ruela, M. Francisco, T. Mendonça, and J. S. Marques, "Two systems for
the detection of melanomas in dermoscopy images using texture and color features,"
IEEE systems Journal, vol. 8, no. 3, pp. 965-979, 2013.
[26] H. Ganster, P. Pinz, R. Rohrer, E. Wildling, M. Binder, and H. Kittler, "Automated
melanoma recognition," IEEE transactions on medical imaging, vol. 20, no. 3, pp.
233239, 2001.
[27] P. Rubegni et al., "Automated diagnosis of pigmented skin lesions," International
Journal of Cancer, vol. 101, no. 6, pp. 576-580, 2002.
[28] E. Almansour and M. A. Jaffar, "Classification of Dermoscopic skin cancer images using
color and hybrid texture features," IJCSNS Int J Comput Sci Netw Secur, vol. 16, no. 4,
pp. 135-9, 2016.
[29] R. B. Oliveira, J. P. Papa, A. S. Pereira, and J. M. R. Tavares, "Computational methods
for pigmented skin lesion classification in images: review and future trends," Neural
Computing and Applications, vol. 29, no. 3, pp. 613-636, 2018.
[30] I. Stanganelli et al., "Computer-aided diagnosis of melanocytic lesions," Anticancer
Research, vol. 25, no. 6C, pp. 4577-4582, 2005.
[31] H. Iyatomi et al., "Computer-based classification of dermoscopy images of melanocytic
lesions on acral volar skin," Journal of Investigative Dermatology, vol. 128, no. 8, pp.
2049-2054, 2008.
[32] M. A. Farooq, M. A. M. Azhar, and R. H. Raza, "Automatic lesion detection system
(ALDS) for skin cancer classification using SVM and neural classifiers," in 2016 IEEE
16th International Conference on Bioinformatics and Bioengineering (BIBE), 2016:
IEEE,
pp. 301-308.
52
[33] J. F. Alcón et al., "Automatic imaging system with decision support for inspection of
pigmented skin lesions and melanoma diagnosis," IEEE journal of selected topics in
signal processing, vol. 3, no. 1, pp. 14-25, 2009.
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78