Skin Cancer Detection Using Machine Learning

This document provides an introduction to skin cancer detection using machine learning. It discusses the increasing rate of skin cancer cases and the importance of early detection. Current diagnosis relies on visual examination by dermatologists, which has an accuracy of 65%-80%. The goal of this study is to develop a mobile application that uses machine learning algorithms to detect skin cancer from images at an early stage. Several machine learning models are compared in order to select the best-performing one for the app. This could help provide affordable early detection to those who cannot access clinical tests.


CHAPTER 01

INTRODUCTION

In computer science, "intelligent" refers to taking actions that maximize the chance of success. Artificial intelligence (AI) therefore means implementing human-like behavior in machines: by using agents that maximize the successful and accurate completion of their goals, AI gives machines the ability to appear as if they possess human intelligence. In other words, it is the development of intelligent systems and applications that can reason, learn, gather knowledge, converse, operate, and observe objects. Similarly, computational intelligence (CI) is a branch of artificial intelligence in which the emphasis is placed on experimental algorithms such as neural networks, fuzzy systems, and evolutionary computation. AI makes machines smarter and easier to use; through AI, a machine can perform an assigned task much faster and with higher accuracy than a human. An intelligent system can translate, understand, and respond to natural language.

Artificial intelligence (AI) simulates human intelligence and cognitive processes in machines, principally computer systems, robotics, and digital equipment. In this chapter we discuss the role of artificial intelligence in classifying skin diseases, the problem description, the goals and objectives of our study, the significance of the study, and some useful contributions.

1.1 Introduction
Between 2008 and 2022, the annual number of melanoma cases increased by 53%, partly due to increased UV exposure [1, 2]. Although melanoma is one of the most lethal types of skin cancer, a fast diagnosis leads to a very high chance of survival. Skin cancer is one of the most common cancers in humans, and its incidence is increasing dramatically [3]. The incidence of the lethal skin cancer malignant melanoma (MM) in Denmark increased five- to six-fold from 1942 to 1982, and the mortality rate doubled from 1955 to 1982 [4]. Currently, approximately 800 cases of MM are reported in Denmark every year (approximately 15/100 000). In Germany, 9000–10 000 new cases are expected every year (approximately 13/100 000), with an annual increase of 5%–10% [5]. Basal cell carcinoma (BCC) is the most common of skin tumors; it is mainly considered to be provoked by ultraviolet radiation and does not metastasize. In contrast, MM can metastasize rapidly. This cancer is also considered to be provoked by ultraviolet radiation, most probably by repeated high doses resulting in heavily burned skin.

The first step in the diagnosis of a malignant lesion by a dermatologist is visual examination of
the suspicious skin area. A correct diagnosis is important because of the similarities of some
lesion types; moreover, the diagnostic accuracy correlates strongly with the professional
experience of the physician [6]. Without additional technical support, dermatologists have a
65%-80% accuracy rate in melanoma diagnosis [7]. In suspicious cases, the visual inspection is
supplemented with dermoscopic images taken with a special high-resolution and magnifying
camera. During the recording, the lighting is controlled, and a filter is used to reduce reflections
on the skin, thereby making deeper skin layers visible. With this technical support, the accuracy
of skin lesion diagnosis can be increased by a further 49% [8]. The combination of visual
inspection and dermoscopic images ultimately results in an absolute melanoma detection
accuracy of 75%-84% by dermatologists [9, 10].

There are different types of melanoma skin cancer, such as nodular melanoma, superficial spreading melanoma, acral lentiginous melanoma, and lentigo maligna. The majority of cancer cases fall under the umbrella of non-melanoma categories such as basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and sebaceous gland carcinoma (SGC). BCC and SCC are formed in the lower and upper layers of the epidermis, respectively, while SGC arises in the sebaceous glands. These cancer cells have little tendency to spread to other parts of the body, and non-melanoma cancers can be treated more easily than melanoma cancers. The key factor in treating skin cancer is therefore early detection. Doctors usually use a biopsy to detect skin cancer: a sample is taken from a suspected skin lesion and examined to determine whether it is cancerous. This process is painful, slow, and time consuming. Computer-based technology enables a more convenient, cheaper, and quicker diagnosis of skin cancer symptoms, and several non-invasive techniques have been proposed to study whether skin lesions represent melanoma or non-melanoma. The general procedure in skin cancer detection is to acquire the image, pre-process it, segment the pre-processed image, extract the desired features, and classify the lesion.
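To make this general procedure concrete, the sketch below strings the five steps together in Python using scikit-image and scikit-learn (both listed among the dependencies in Chapter 4). It is only an illustration under stated assumptions: the Otsu-threshold segmentation, the HOG parameters, and the classify_lesion helper are placeholders, not the exact method used in this work.

import numpy as np
from skimage import io, color, filters, transform
from skimage.feature import hog

def classify_lesion(image_path, classifier):
    # 1. Image acquisition: read the photo of the suspicious skin area
    rgb = io.imread(image_path)

    # 2. Pre-processing: convert to grayscale and resize to a fixed size
    gray = transform.resize(color.rgb2gray(rgb), (224, 224))

    # 3. Segmentation: a simple Otsu threshold to isolate the (darker) lesion
    mask = gray < filters.threshold_otsu(gray)
    lesion = gray * mask

    # 4. Feature extraction: HOG descriptor of the segmented region
    features = hog(lesion, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

    # 5. Classification: any fitted scikit-learn classifier (SVM, KNN, ...)
    return classifier.predict(features.reshape(1, -1))[0]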

Machine learning and deep learning have revolutionized the diagnostic landscape over the past few decades. Deep learning is the most challenging subfield of machine learning and deals with artificial neural network algorithms, which are inspired by the function and structure of the human brain. Deep learning techniques are used in a variety of fields such as speech recognition, pattern recognition, and bioinformatics, and compared to classic machine learning approaches they have achieved impressive results in these applications. Various deep learning approaches have been used for computer-assisted skin cancer detection in recent years, and in this work we discuss and analyse skin cancer detection techniques based on deep learning.
In this study we use different supervised machine learning algorithms and perform a comparative analysis of these models on a skin cancer image dataset, classifying the lesions into benign and malignant.

1.2 Motivation
In recognition of the continuing increase in the incidence of skin cancer, including malignant melanoma, the American Academy of Dermatology has encouraged dermatologic communities nationwide to offer free skin cancer screening to the public. Memorial Sloan-Kettering Cancer Center took part in one such effort, and a survey of that center's participants revealed that more than 90% of attendees learned of the screening through the mass media. In rural areas, when most people face this kind of skin injury they ignore it, assuming it will recover soon, or they simply do not go to a doctor for a small injury because they cannot afford the expensive fee. In many cases they have a skin cancer issue and are unaware of it, so we developed a skin cancer detection application using machine learning for examining skin cancer at an initial stage.
This application will help people who cannot afford the treatment of such a skin cancer disease. It also provides an initial indication of whether the skin is affected or not. The user simply installs the application on their mobile phone, scans the affected area, and the application shows whether the selected area is affected or not.

1.3 Problem Description


Skin cancer has become a big problem, and new variants are on the rise. It is almost impossible, and very time consuming, to identify skin cancer without access to a doctor.
• A clinical test (such as a biopsy) is needed to identify whether a lesion is cancerous or not.
• The patient needs to go to a large hospital, which is hard and time consuming.
• An appointment with a specialist is needed, and the fee is expensive.
• When the skin is injured, people often ignore it, assuming it will recover soon, when it may actually be a cancer they are unaware of.
1.3.1 Problem Statement
There is no application that detects skin cancer and reports whether the skin is affected or not at an early stage, and there is no system that detects skin cancer without a clinical test. A clinical test does not give a direct result without access to a doctor, and not everyone can afford the clinical test and the doctor's fees. This project aims to overcome this problem.
1.3.2 Problem Solution
The aim of this project is to build an application that analyses images of injured skin with machine learning and AI techniques in order to identify skin cancer, and to use different ML models, compare their accuracy, and implement the best-performing model in the app.

1.4 Objective and Goals


Our aim is to develop an Android application with machine learning algorithms to detect skin cancer at an initial stage from images of skin injuries. The application analyses images of injured skin with machine learning and AI techniques to identify skin cancer, using different ML models whose accuracy is compared so that the best-performing model can be implemented in the app.
• Sign in and create a profile.
• Update the profile.
• Upload a picture, or use the camera to capture a picture for diagnosis.
• The app analyses the uploaded image with different ML models, such as SVM, Neural Network, Bagged Tree Ensemble, and K-Nearest Neighbor (KNN), compares their accuracy, implements the best-performing model, and shows the result.
• There are two classes of skin lesion: cancerous and non-cancerous.
• The app displays the user's previous results and records.

1.5 Significance of the Study


This study has both theoretical and practical significance. Theoretically, it contributes to the role of media campaigns in addressing the problem of skin cancer using machine learning or deep learning, and it can serve as a reference for computer science researchers interested in skin cancer diagnosis using these techniques. Practically, it aims to serve as a document for governmental and non-governmental organizations.

CHAPTER 02

REQUIREMENT ANALYSIS

2.1 Overview
In this chapter, we will discuss existing research on skin cancer classification that helps dermatologists and doctors in the classification of skin diseases, with a focus on current machine learning and deep learning based detection and classification of types of skin cancer.

2.2 Literature Review


For the classification of melanoma disease, various approaches have been used in different systems, including the ABCD-E scheme [11, 12], a 7-point checklist [13, 14], and a 3-point checklist [15]. Researchers have reported that the computational cost of features extracted with the ABCD scheme is lower than that of the 7-point checklist, and that ABCD achieves a high consistency rate with clinical diagnosis; therefore, most CAD systems for the diagnosis of melanoma use the ABCD scheme for feature extraction. However, this approach is more prone to misclassifying nevi as melanomas [16]. Mokrani et al. combined color asymmetry and dermoscopic structure with features extracted from ABCD and attained 91.25% sensitivity. In [17], a transfer learning approach was used on medical images to extract features from MRI scans, and a support vector machine was then used for classification; this technique was useful for reducing false positives. In [18], a deep learning-based framework was used to automatically detect dermoscopic patterns and achieved 88% accuracy. Moussa et al. [19] also utilized this scheme without considering color features, as they require more computational resources, and achieved 89% accuracy using KNN.

It is noticeable that the ABCD rubrics are subjective, which results in high inter- as well as intra-observer bias [20]. Therefore, a high-level intuitive feature technique was utilized to represent the asymmetry characteristic of the ABCD rubrics in skin cancer images [21]; the proposed system achieved 86% accuracy. Amelard et al. [22] extended these findings by adding new high-level intuitive features for distinct color channels and achieved 94% accuracy. Color features can be extracted using statistical values of the color channels; a few other color-based approaches include color irregularity, centroid distance, and LUV histogram distance [19, 23-25]. Another group of researchers classified melanoma skin cancer based on local and global features and achieved 93% accuracy by combining textural features with color features.

Some other researchers utilized the same approach but, in addition to textural and color features, also utilized shape-based features. Ganster et al. [26] used the KNN classifier along with color and shape features from lesion skin; 5300 images were used for training, obtaining 87% sensitivity and 92% specificity. Rubegni et al. [27] used textural and geometrical features as well and achieved a sensitivity and specificity of 96%. Another study used a feature vector of shape, texture, and color features with SVM and achieved 93% sensitivity and 92% specificity. Almansour et al. [28] utilized the SVM classifier with color and textural features and attained 90% accuracy. The textural features of an image represent the spatial distribution of pixel intensity levels; they show the underlying pattern and layout of the intensity levels and serve as one of the most discriminative features for object or ROI detection. Textural features are generally used for skin cancer analysis because they capture the structural irregularity between nevus and melanoma classes [29]. It has been observed that computerized examination is becoming popular. An epiluminescence microscopy (ELM) based method was proposed to enhance the initial diagnosis in a melanoma classification scheme for computerized analysis: the ROI was extracted using segmentation algorithms, a merged feature approach based on shape and radiometric features was utilized, and a K-Nearest Neighbor classifier attained 87% sensitivity and 92% specificity. The automated analysis of data for an initial melanoma recognition scheme [30] achieved 80% for both specificity and sensitivity, utilizing asymmetry and boundary descriptors with a Support Vector Machine as classifier.

Iyatomi et al. [31] utilized a similar technique and achieved 100% specificity and 95% sensitivity. In [32], researchers achieved 91% specificity and 88.2% sensitivity, trained over 120 cancer images. In another study, researchers utilized active contour and watershed methods for segmentation and extracted shape, color, and texture related features; the proposed architecture was trained on 50 images from the DermIS dataset and attained 80% accuracy. A new CAD technique for early identification of melanoma was proposed for web and Android phone applications, in which all the images were taken with high-definition cameras rather than from an existing repository [33]. The proposed system utilized digital cameras together with context information such as the kind of skin, the span, and the affected parts of the body, and also attempted to extract dermoscopic ABCD-compatible features. These attributes were then classified in several phases, including a preprocessing phase for the selection of association-based attributes. The scheme attained 68% specificity and 94% sensitivity on a dataset of 45 nevus and 107 melanoma skin cancer images.

2.3 Functional requirements:


In the functional requirements we define each module used in the application.

2.3.1 Module 1: User

AIM: To know the result of the skin analysis.

INPUT: The user can take a photo or import a photo.

OUTPUT: Display the result for the skin.

PROCESS: Analyzing the input data.

2.4 Non-functional requirements:


2.4.1 Reliability:

Reliability is the degree to which the system performs its required operations without failure and with accuracy. This application is developed to run on any Android device.

2.4.2 Maintainability:

Changing software after development has been completed is known as maintenance. As time passes and needs change, the application will be updated and modified from time to time according to users' feedback.

2.4.3 Portability:

Portability is the ease with which software can be transferred from one environment or machine to another. As Android phones are portable, this application can also be transferred from one Android device to another.

CHAPTER 03

METHODOLOGY

3.1 Overview
In this chapter we first present an overall description of the framework, then describe its individual components, and finally describe the concept and implementation of the framework in detail.

3.2 Proposed Framework

Figure 3.1 Proposed framework

3.3 Preprocessing

3.3.1 Image Acquisition


The image dataset is available on the PlantVillage website, notably for images of maize disease. The maize subset contains a total of 4128 photos across four classes: common rust, grey leaf spot, northern leaf blight, and healthy, with 1306, 574, 1146, and 1162 photos respectively. These labeled photographs are used to train and test the disease classification.

10
3.3.2 Image Preprocessing
Due to the presence of dewdrops, dust, and insect excrement, image preprocessing is required to obtain better outcomes in the subsequent steps; these effects are referred to as noise in the source image. To overcome these issues, the source RGB photo is converted to a grayscale image. The size of the source images is also very large, which requires a large amount of memory, so it is necessary to reduce the image size; this reduction serves several purposes and lowers memory consumption.
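As a rough illustration of this step, the sketch below converts an RGB image to grayscale and downscales it with scikit-image; the target size of 224x224 is an assumption taken from the dataset description later in this chapter, not a prescribed value.

from skimage import io, color, transform

def preprocess(image_path, size=(224, 224)):
    rgb = io.imread(image_path)           # noisy RGB source image
    gray = color.rgb2gray(rgb)            # drop the color channels
    small = transform.resize(gray, size)  # shrink to reduce memory use
    return small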
3.3.3 Feature Extraction
Feature extraction extracts the characteristics of the objects in the images; these collected characteristics are utilized to depict an entity and are grouped into three categories: shape, color, and texture. Infections may spread, and diseases change the affected regions into a variety of different shapes, so a disease can often be identified by the shape of the features in the image: the axes, areas, and angles of the shapes differ. Color, the second of these three characteristics, is a key component that distinguishes the diseases from one another. The third, texture, determines how the color patterns are distributed over the image. The color information is extracted using RGB feature extraction; RGB (red, green, and blue) is among the most commonly used representations for pattern recognition and image processing and is well suited to object detection in images.

The following feature descriptor is used to extract features from the image dataset; as reported in the conclusions of this work, a histogram of oriented gradients (HOG) descriptor is used.
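A hedged sketch of HOG extraction with scikit-image is given below; the orientation and cell-size parameters here are assumptions for illustration, not necessarily the exact settings used in this work.

import numpy as np
from skimage.feature import hog

def extract_hog(gray_image):
    # Compute a HOG descriptor for one pre-processed grayscale image
    return hog(gray_image,
               orientations=9,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               block_norm='L2-Hys')

# Stacking descriptors over the dataset gives the feature matrix that the
# classifiers compared in Chapter 4 could be trained on, e.g.:
# X = np.vstack([extract_hog(img) for img in images])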

3.4 Machine Learning Models Architecture

3.4.1 Model Classifier Kernel

Skin cancer, the most common human malignancy, is primarily diagnosed visually, beginning with an initial clinical screening followed potentially by dermoscopic analysis, a biopsy, and histopathological examination. Automated classification of skin lesions from images is a challenging task owing to the fine-grained variability in the appearance of skin lesions.

The dataset is taken from the ISIC (International Skin Imaging Collaboration) Archive. It consists of 1800 pictures of benign moles and 1497 pictures of moles classified as malignant. The pictures have all been resized to a low resolution of 224x224x3 RGB. The task of this kernel is to create a model that can classify a mole visually as benign or malignant.

As the dataset is fairly balanced, the model is evaluated with the accuracy score, i.e. (TP + TN)/(ALL).

The dataset has two classes of skin lesion, which are listed below:

1. Benign

2. Malignant

In this kernel I detect these classes of moles using a convolutional neural network with Keras (TensorFlow backend) and then analyse the result to see how useful the model can be in a practical scenario.

In this kernel I followed the steps below for model building and evaluation:

Step 1: Importing essential libraries

Step 2: Loading pictures and making a dictionary of images and labels

Step 3: Categorical labels

Step 4: Normalization

Step 5: Model building

Step 6: Cross-validating the model

Step 7: Testing the model

Step 8: ResNet50

3.4.2 CNN
I used the Keras Sequential API, where you add one layer at a time, starting from the input.

The first is the convolutional (Conv2D) layer. It is like a set of learnable filters; I chose 64 filters for the first two Conv2D layers. Each filter transforms a part of the image (defined by the kernel size) using the kernel filter, and the kernel filter matrix is applied over the whole image. Filters can be seen as transformations of the image.

The CNN can isolate features that are useful everywhere from these transformed images (feature maps).

The second important layer in a CNN is the pooling (MaxPool2D) layer. This layer simply acts as a downsampling filter: it looks at neighboring pixels and picks the maximal value. Pooling is used to reduce computational cost and, to some extent, to reduce overfitting. We have to choose the pooling size (i.e. the area pooled each time); the larger the pooling dimension, the stronger the downsampling.

By combining convolutional and pooling layers, a CNN is able to combine local features and learn more global features of the image.

Dropout is a regularization method in which a proportion of nodes in a layer are randomly ignored (their weights set to zero) for each training sample. This randomly drops a proportion of the network and forces it to learn features in a distributed way, which improves generalization and reduces overfitting.

'relu' is the rectifier activation function max(0, x). It is used to add non-linearity to the network.

The Flatten layer is used to convert the final feature maps into a single 1D vector. This flattening step is needed so that fully connected layers can be used after the convolutional/max-pooling layers; it combines all the local features found by the previous convolutional layers.

At the end I used the features in one fully connected (Dense) layer, which is just an artificial neural network (ANN) classifier.

3.4.3 Model implementation
Step 1 : importing Essential Libraries

In [1]:
import os
import itertools
from glob import glob

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from PIL import Image

np.random.seed(11) # It's my lucky number

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, KFold, cross_val_score, GridSearchCV
from sklearn.metrics import accuracy_score

import keras
from keras.utils.np_utils import to_categorical # used for converting labels to one-hot encoding
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.layers.normalization import BatchNormalization
from keras.optimizers import Adam, RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from keras.wrappers.scikit_learn import KerasClassifier
from keras.applications.resnet50 import ResNet50
from keras import backend as K

Using TensorFlow backend.

Step 2 : Loading pictures and making Dictionary of images and labels


In this step I load the pictures and turn them into numpy arrays using their RGB values. As the pictures have already been resized to 224x224, there is no need to resize them. As the pictures do not have any labels, these need to be created. Finally, the pictures are merged into one training set and shuffled.

In [2]:
folder_benign_train = '../input/data/train/benign'
folder_malignant_train = '../input/data/train/malignant'
folder_benign_test = '../input/data/test/benign'
folder_malignant_test = '../input/data/test/malignant'

read = lambda imname: np.asarray(Image.open(imname).convert("RGB"))

# Load in training pictures


ims_benign = [read(os.path.join(folder_benign_train, filename)) for filename in
os.listdir(folder_benign_train)]
X_benign = np.array(ims_benign, dtype='uint8')
ims_malignant = [read(os.path.join(folder_malignant_train, filename)) for filename in
os.listdir(folder_malignant_train)]
X_malignant = np.array(ims_malignant, dtype='uint8')

# Load in testing pictures


ims_benign = [read(os.path.join(folder_benign_test, filename)) for filename in
os.listdir(folder_benign_test)]
X_benign_test = np.array(ims_benign, dtype='uint8')
ims_malignant = [read(os.path.join(folder_malignant_test, filename)) for filename in
os.listdir(folder_malignant_test)]
X_malignant_test = np.array(ims_malignant, dtype='uint8')

# Create labels
y_benign = np.zeros(X_benign.shape[0])
y_malignant = np.ones(X_malignant.shape[0])

y_benign_test = np.zeros(X_benign_test.shape[0])
y_malignant_test = np.ones(X_malignant_test.shape[0])

# Merge data
X_train = np.concatenate((X_benign, X_malignant), axis = 0)
y_train = np.concatenate((y_benign, y_malignant), axis = 0)

X_test = np.concatenate((X_benign_test, X_malignant_test), axis = 0)


y_test = np.concatenate((y_benign_test, y_malignant_test), axis = 0)

# Shuffle data
s = np.arange(X_train.shape[0])
np.random.shuffle(s)
X_train = X_train[s]
y_train = y_train[s]

s = np.arange(X_test.shape[0])
np.random.shuffle(s)
X_test = X_test[s]
y_test = y_test[s]
In [3]:
# Display first 15 images of moles, and how they are classified
w = 40
h = 30
fig = plt.figure(figsize=(12, 8))
columns = 5
rows = 3

for i in range(1, columns*rows + 1):
    ax = fig.add_subplot(rows, columns, i)
    if y_train[i] == 0:
        ax.title.set_text('Benign')
    else:
        ax.title.set_text('Malignant')
    plt.imshow(X_train[i], interpolation='nearest')
plt.show()

Step 3: Categorical Labels


Turn labels into one hot encoding

In [4]:
y_train = to_categorical(y_train, num_classes= 2)
y_test = to_categorical(y_test, num_classes= 2)

Step 4 : Normalization
Normalize all Values of the pictures by dividing all the RGB values by 255

In [5]:
# Normalize pixel values to the range [0, 1]
X_train = X_train/255.
X_test = X_test/255.

Step 5: Model Building

In [6]:
# See learning curve and validation curve

def build(input_shape=(224, 224, 3), lr=1e-3, num_classes=2,
          init='normal', activ='relu', optim='adam'):
    model = Sequential()
    model.add(Conv2D(64, kernel_size=(3, 3), padding='Same', input_shape=input_shape,
                     activation=activ, kernel_initializer='glorot_uniform'))
    model.add(MaxPool2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64, kernel_size=(3, 3), padding='Same',
                     activation=activ, kernel_initializer='glorot_uniform'))
    model.add(MaxPool2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(128, activation='relu', kernel_initializer=init))
    model.add(Dense(num_classes, activation='softmax'))
    model.summary()

    if optim == 'rmsprop':
        optimizer = RMSprop(lr=lr)
    else:
        optimizer = Adam(lr=lr)

    model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])

    return model

# Set a learning rate annealer
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                            patience=5,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=1e-7)

In [7]:
input_shape = (224, 224, 3)
lr = 1e-5
init = 'normal'
activ = 'relu'
optim = 'adam'
epochs = 50
batch_size = 64

model = build(lr=lr, init=init, activ=activ, optim=optim, input_shape=input_shape)

history = model.fit(X_train, y_train, validation_split=0.2,
                    epochs=epochs, batch_size=batch_size, verbose=0,
                    callbacks=[learning_rate_reduction])

# list all data in history
print(history.history.keys())

# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

Output:
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263:
colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a
future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout
(from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future
version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:From
/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from
tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 00007: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-06.

Epoch 00012: ReduceLROnPlateau reducing learning rate to 2.499999936844688e-06.

Epoch 00017: ReduceLROnPlateau reducing learning rate to 1.249999968422344e-06.

Epoch 00022: ReduceLROnPlateau reducing learning rate to 6.24999984211172e-07.

Epoch 00027: ReduceLROnPlateau reducing learning rate to 3.12499992105586e-07.

Epoch 00032: ReduceLROnPlateau reducing learning rate to 1.56249996052793e-07.

Epoch 00037: ReduceLROnPlateau reducing learning rate to 1e-07.

Epoch 00042: ReduceLROnPlateau reducing learning rate to 1e-07.

Epoch 00047: ReduceLROnPlateau reducing learning rate to 1e-07.


dict_keys(['val_loss', 'val_acc', 'loss', 'acc', 'lr'])

In [8]:
K.clear_session()
del model
del history

Step 6: Cross-Validating Model

In [9]:
# define 3-fold cross validation test harness
kfold = KFold(n_splits=3, shuffle=True, random_state=11)

cvscores = []
for train, test in kfold.split(X_train, y_train):
    # create model
    model = build(lr=lr,
                  init=init,
                  activ=activ,
                  optim=optim,
                  input_shape=input_shape)

    # Fit the model
    model.fit(X_train[train], y_train[train], epochs=epochs, batch_size=batch_size, verbose=0)

    # evaluate the model
    scores = model.evaluate(X_train[test], y_train[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
    K.clear_session()
    del model

print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))


_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
acc: 63.94%
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
acc: 73.38%
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
acc: 69.85%
69.06% (+/- 3.90%)

Step 7: Testing the model


First the model has to be fitted with all the data, such that no data is left out.

In [10]:
# Fitting model to all data
model = build(lr=lr,
              init=init,
              activ=activ,
              optim=optim,
              input_shape=input_shape)

model.fit(X_train, y_train,
          epochs=epochs, batch_size=batch_size, verbose=0,
          callbacks=[learning_rate_reduction])

# Testing model on test data to evaluate
y_pred = model.predict_classes(X_test)

print(accuracy_score(np.argmax(y_test, axis=1), y_pred))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 112, 112, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 56, 56, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 200704) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 25690240
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 25,729,218
Trainable params: 25,729,218
Non-trainable params: 0
_________________________________________________________________
/opt/conda/lib/python3.6/site-packages/keras/callbacks.py:1109: RuntimeWarning: Reduce LR on
plateau conditioned on metric `val_acc` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
0.6818181818181818
In [11]:
# save model
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)

# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")

# Clear memory, because of memory overload
del model
K.clear_session()

Saved model to disk

Step 8: ResNet50
The CNN above is not a very sophisticated model, so ResNet50 is also tried.

In [12]:
input_shape = (224, 224, 3)
lr = 1e-5
epochs = 50
batch_size = 64

model = ResNet50(include_top=True,
                 weights=None,
                 input_tensor=None,
                 input_shape=input_shape,
                 pooling='avg',
                 classes=2)

model.compile(optimizer=Adam(lr),
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(X_train, y_train, validation_split=0.2,
                    epochs=epochs, batch_size=batch_size, verbose=2,
                    callbacks=[learning_rate_reduction])

# list all data in history
print(history.history.keys())

# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

Output
Train on 2109 samples, validate on 528 samples
Epoch 1/50
- 33s - loss: 0.9927 - acc: 0.5974 - val_loss: 0.5673 - val_acc: 0.7614
Epoch 2/50
- 18s - loss: 0.5286 - acc: 0.7634 - val_loss: 0.5200 - val_acc: 0.6932
Epoch 3/50
- 18s - loss: 0.4536 - acc: 0.7757 - val_loss: 0.6281 - val_acc: 0.7254
Epoch 4/50
- 18s - loss: 0.4181 - acc: 0.7933 - val_loss: 0.5645 - val_acc: 0.7386
Epoch 5/50
- 18s - loss: 0.4011 - acc: 0.8075 - val_loss: 0.4142 - val_acc: 0.7898
Epoch 6/50
- 18s - loss: 0.3887 - acc: 0.8075 - val_loss: 0.4471 - val_acc: 0.7917
Epoch 7/50
- 18s - loss: 0.3729 - acc: 0.8132 - val_loss: 0.4708 - val_acc: 0.7822
Epoch 8/50
- 18s - loss: 0.3552 - acc: 0.8288 - val_loss: 0.4532 - val_acc: 0.7879
Epoch 9/50
- 18s - loss: 0.3419 - acc: 0.8369 - val_loss: 0.4875 - val_acc: 0.7955
Epoch 10/50
- 18s - loss: 0.3273 - acc: 0.8530 - val_loss: 0.4950 - val_acc: 0.8106
Epoch 11/50
- 18s - loss: 0.3069 - acc: 0.8563 - val_loss: 0.4702 - val_acc: 0.7822
Epoch 12/50
- 18s - loss: 0.3037 - acc: 0.8549 - val_loss: 0.4657 - val_acc: 0.7955
Epoch 13/50
- 18s - loss: 0.2920 - acc: 0.8729 - val_loss: 0.5437 - val_acc: 0.7803
Epoch 14/50
- 18s - loss: 0.2764 - acc: 0.8777 - val_loss: 0.4889 - val_acc: 0.7841
Epoch 15/50
- 18s - loss: 0.2717 - acc: 0.8758 - val_loss: 0.4500 - val_acc: 0.7936

Epoch 00015: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-06.


Epoch 16/50
- 18s - loss: 0.2646 - acc: 0.8867 - val_loss: 0.4820 - val_acc: 0.8163
Epoch 17/50
- 18s - loss: 0.2512 - acc: 0.8938 - val_loss: 0.4397 - val_acc: 0.7879
Epoch 18/50
- 18s - loss: 0.2429 - acc: 0.8957 - val_loss: 0.4963 - val_acc: 0.7973
Epoch 19/50
- 18s - loss: 0.2387 - acc: 0.9009 - val_loss: 0.4425 - val_acc: 0.8030
Epoch 20/50
- 18s - loss: 0.2358 - acc: 0.8943 - val_loss: 0.4292 - val_acc: 0.8182
Epoch 21/50
- 18s - loss: 0.2309 - acc: 0.9047 - val_loss: 0.4578 - val_acc: 0.8125
Epoch 22/50
- 18s - loss: 0.2211 - acc: 0.9094 - val_loss: 0.4671 - val_acc: 0.8068
Epoch 23/50
- 18s - loss: 0.2144 - acc: 0.9137 - val_loss: 0.4563 - val_acc: 0.8182
Epoch 24/50
- 18s - loss: 0.2052 - acc: 0.9142 - val_loss: 0.4523 - val_acc: 0.8125
Epoch 25/50
- 18s - loss: 0.2120 - acc: 0.9090 - val_loss: 0.4321 - val_acc: 0.8106

Epoch 00025: ReduceLROnPlateau reducing learning rate to 2.499999936844688e-06.


Epoch 26/50
- 18s - loss: 0.1866 - acc: 0.9256 - val_loss: 0.4659 - val_acc: 0.8068
Epoch 27/50
- 18s - loss: 0.1911 - acc: 0.9251 - val_loss: 0.4541 - val_acc: 0.8163
Epoch 28/50
- 18s - loss: 0.1944 - acc: 0.9227 - val_loss: 0.4816 - val_acc: 0.8049
Epoch 29/50
- 18s - loss: 0.1703 - acc: 0.9322 - val_loss: 0.4862 - val_acc: 0.8030
Epoch 30/50
- 18s - loss: 0.1785 - acc: 0.9317 - val_loss: 0.4903 - val_acc: 0.8106

Epoch 00030: ReduceLROnPlateau reducing learning rate to 1.249999968422344e-06.


Epoch 31/50
- 18s - loss: 0.1576 - acc: 0.9393 - val_loss: 0.4615 - val_acc: 0.8087
Epoch 32/50
- 18s - loss: 0.1865 - acc: 0.9251 - val_loss: 0.4525 - val_acc: 0.8125
Epoch 33/50
- 18s - loss: 0.1647 - acc: 0.9360 - val_loss: 0.4391 - val_acc: 0.8220
Epoch 34/50
- 18s - loss: 0.1766 - acc: 0.9317 - val_loss: 0.4113 - val_acc: 0.8144
Epoch 35/50
- 18s - loss: 0.1698 - acc: 0.9312 - val_loss: 0.4421 - val_acc: 0.8125
Epoch 36/50
- 18s - loss: 0.1507 - acc: 0.9436 - val_loss: 0.4149 - val_acc: 0.8106
Epoch 37/50
- 18s - loss: 0.1828 - acc: 0.9279 - val_loss: 0.4332 - val_acc: 0.8125
Epoch 38/50
- 18s - loss: 0.1601 - acc: 0.9431 - val_loss: 0.4404 - val_acc: 0.8144

Epoch 00038: ReduceLROnPlateau reducing learning rate to 6.24999984211172e-07.


Epoch 39/50
- 18s - loss: 0.1459 - acc: 0.9426 - val_loss: 0.4470 - val_acc: 0.8144
Epoch 40/50
- 18s - loss: 0.1629 - acc: 0.9374 - val_loss: 0.4423 - val_acc: 0.8125
Epoch 41/50
- 18s - loss: 0.1641 - acc: 0.9360 - val_loss: 0.4253 - val_acc: 0.8182
Epoch 42/50
- 18s - loss: 0.1639 - acc: 0.9374 - val_loss: 0.4707 - val_acc: 0.8144
Epoch 43/50
- 18s - loss: 0.1559 - acc: 0.9445 - val_loss: 0.4466 - val_acc: 0.8239
Epoch 44/50
- 18s - loss: 0.1425 - acc: 0.9436 - val_loss: 0.4566 - val_acc: 0.8144
Epoch 45/50
- 18s - loss: 0.1553 - acc: 0.9440 - val_loss: 0.4442 - val_acc: 0.8163
Epoch 46/50
- 18s - loss: 0.1551 - acc: 0.9426 - val_loss: 0.4533 - val_acc: 0.8201
Epoch 47/50
- 18s - loss: 0.1555 - acc: 0.9388 - val_loss: 0.4338 - val_acc: 0.8239
Epoch 48/50
- 18s - loss: 0.1574 - acc: 0.9407 - val_loss: 0.4263 - val_acc: 0.8220

Epoch 00048: ReduceLROnPlateau reducing learning rate to 3.12499992105586e-07.


Epoch 49/50
- 18s - loss: 0.1576 - acc: 0.9403 - val_loss: 0.4332 - val_acc: 0.8220
Epoch 50/50
- 18s - loss: 0.1494 - acc: 0.9483 - val_loss: 0.4231 - val_acc: 0.8201

dict_keys(['val_loss', 'val_acc', 'loss', 'acc', 'lr'])

3.4.4 Model Accuracy

The ResNet50 model reaches a final validation accuracy of approximately 82%.

In [13]:
# Train ResNet50 on all the data
model.fit(X_train, y_train,
          epochs=epochs, batch_size=batch_size, verbose=0,
          callbacks=[learning_rate_reduction])

/opt/conda/lib/python3.6/site-packages/keras/callbacks.py:1109: RuntimeWarning: Reduce LR on
plateau conditioned on metric `val_acc` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Out[13]:
<keras.callbacks.History at 0x7fef804cb160>
In [14]:
# Testing model on test data to evaluate
y_pred = model.predict(X_test)
print(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_pred, axis=1)))

# save model
# serialize model to JSON
resnet50_json = model.to_json()
with open("resnet50.json", "w") as json_file:
    json_file.write(resnet50_json)

# serialize weights to HDF5
model.save_weights("resnet50.h5")
print("Saved model to disk")

0.8287878787878787
Saved model to disk

3.5 SDLC Model

We will use the following SDLC model for our project:

Incremental model:
We are using the incremental model for our project because we want to divide the project into phases, develop each phase, get feedback, and then develop the next phase of the project.

Figure 3.10 Incremental Model

3.6 Flow Chart:

Figure 3.11 Flow Chart

Description:

The flow chart describes the flow of the system as defined in the diagram above. On the user side, the user first enters a username and password in the text fields. The system then checks whether the password is correct; if it is wrong, the system pops up an error message. If the user enters the correct login credentials, the system lets the user log in and shows the main page, from which the user can select from several options such as importing an image or taking a photo; after the analysis, the user can check whether the result indicates cancer or normal skin.

3.7 Use case Diagram:

Figure 3.12 Use Case Diagram

Description

The user logs in to the skin cancer detection app and this information is stored in the database. Users first need to create a profile. The user then takes a photo or imports an image from local storage on the phone, and the app shows the skin colour and disease to the user.

Figure 3.13 Registration Page

Description:
A signup page (also known as a registration page) enables users and organizations to independently register and gain access to the system. After clicking the signup button, the user enters the valid data required to sign up and create an account in the portal.

Figure 3.14 Login Page

Description:
A login page is an entry page that requires user identification and authentication, regularly performed by entering a username and password combination; a login may provide access to the entire application. After creating an account in the portal, the user is shown a sign-in page where he can log in with the correct email and password.

Figure 3.15 Update Profile

Description:
In this section the system provides an option for the user to easily update his profile; he can take a photo or import one from the gallery.

Figure 3.16 Profile Page

Description:
On this profile page the user can add his personal information, for example his name, email, and mobile number.

Figure 3.17 Take Photo or Import Image

Description:
On this page the user has two options: import a photo from the gallery or take a photo with the camera.

Figure 3.18 About Page

Description:
On this page the user can get information about skin cancer, its initial stages, and its signs and symptoms.

Figure 3.19 History of Skins Page

Description:
On the history page the user can see the previous results for their skin, which are stored on this page.

Figure 3.20 Storage in Database

Description:

The Firebase Realtime Database is a cloud-hosted NoSQL database that lets you store and sync data between users in real time; Cloud Firestore additionally enables storing, syncing, and querying app data at global scale.

Figure 3.21 History in Database

Description:

In this database section we can see each user's email, profile image, and other details.

Figure 3.22 Users in Database

Description:

In the real-time database section we can see the email, password, and name of every user who has signed up.

Figure 3.23 Authentication of User in Database

Description:

Firebase Authentication provides backend services, easy-to-use software development kits (SDKs), and ready-made UI libraries to authenticate users to our system.

CHAPTER 04

RESULTS AND DISCUSSIONS

4.1 Overview
In this chapter we discuss the system configuration and the experimental results, presenting the experimental evaluation of our proposed methods and the performance of the machine learning models on the skin cancer images.

4.2 System Configuration


The training in this study was done offline on a Windows machine with an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz, 24 GB of memory, and a GeForce GTX 1070 Ti GPU.
Package Name: Use

OS: Provides functionality related to the operating system, e.g. copying and deleting files.
tflite: A Flutter plugin for accessing the TensorFlow Lite API. It supports image classification, object detection (SSD and YOLO), Pix2Pix, DeepLab, and PoseNet on both iOS and Android.
NumPy (np): A Python library for working with arrays, with functions for linear algebra, Fourier transforms, and matrices.
sklearn: Scikit-learn is a free machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms.
skimage: Scikit-image, or skimage, is an open-source Python package designed for image preprocessing.
Matplotlib: A comprehensive library for creating static, animated, and interactive visualizations in Python.

Table 4.1 Dependencies used in this work

Name: Configuration

OS: Windows
Coding language: Python
Implementation environment: scikit-learn

Table 4.2 Parameters for implementation

4.3 Dataset Description

Figure 4.1 Samples of Skin cancer

This dataset was uploaded to Kaggle, a hub of data for machine learning models. The dataset has two categories, arranged as:

0) Benign
1) Malignant

It contains 2468 images, categorized as benign (1306) and malignant (1162).

4.4 Results and Model Evaluation

4.4.1 Learning Curves

The main objective of our proposed architecture is to show that the assembled pipeline maximizes the classification accuracy and minimizes the loss of the skin cancer classification. To assess the performance of the models, and as a design guide for choosing a backbone, we compared the accuracy of the KNN, SVM, Naïve Bayes, and Random Forest models on the skin cancer image dataset, along with the overall accuracy of this work; a hedged sketch of such a comparison is given below.

The learning curves of KNN, SVM, Naïve Bayes, and Random Forest for our proposed work follow the sketch.
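The sketch below shows one way such a comparison could be run with scikit-learn; the hyper-parameters and the feature matrices X_train_feat and X_test_feat (e.g. the HOG descriptors from Chapter 3) with their label vectors are assumptions, not the exact configuration used in this work.

from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# The four models compared in this chapter (hyper-parameters are assumptions)
models = {
    "K-Nearest Neighbor": KNeighborsClassifier(n_neighbors=5),
    "Support Vector Machine": SVC(kernel="rbf"),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

for name, model in models.items():
    model.fit(X_train_feat, y_train)                  # assumed feature matrix and labels
    acc = accuracy_score(y_test, model.predict(X_test_feat))
    print(f"{name}: {acc:.2%}")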

Figure 4.2 Learning curves of KNN Figure 4.3 Learning curves of SVM

Figure 4.4 Learning curves of Naïve Bayes Figure 4.5 Learning curves of Random Forest

4.4.2 Confusion Matrix

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.

The basic terms of the confusion matrix are given below (a short scikit-learn sketch follows the list):

• True positives (TP): These are cases in which we predicted yes (they have the
disease), and they do have the disease.
• True negatives (TN): We predicted no, and they don't have the disease.
• False positives (FP): We predicted yes, but they don't have the disease.
• False negatives (FN): We predicted no, but they do have the disease.
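As a short illustration, these four terms can be obtained with scikit-learn as sketched below; y_test and y_pred are the true and predicted labels from any of the classifiers above and are assumed to exist.

from sklearn.metrics import confusion_matrix

# For binary labels (0 = benign, 1 = malignant) ravel() returns TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
print(f"accuracy = {(tp + tn) / (tp + tn + fp + fn):.2%}")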

Figure 4.6 Naïve Bayes Confusion matrix Figure 4.7 KNN Confusion matrix

Figure 4.8 SVM Confusion matrix Figure 4.9 Random Forest Confusion matrix

Table 4.3 below shows the overall accuracy of the proposed framework: the performance of the different machine learning models on the skin cancer image dataset used in this work.

Model: Accuracy

K-Nearest Neighbor: 73%
Support Vector Machine: 79%
Naïve Bayes: 71%
Random Forest: 81%

Table 4.3 Model Accuracy Table

CHAPTER 05

CONCLUSIONS AND FUTURE WORK


5.1 Overview

This chapter concludes the thesis and summarizes the findings of the presented work. It first gives a summary of what has been done in this thesis and then provides an outlook on further developments and improvements that are possible on the present work.

5.2 Conclusions

In this work, we applied supervised machine learning classification models, varying their parameters, and tested them on a skin cancer image dataset using the machine learning framework scikit-learn (simple and efficient tools for predictive data analysis). With the help of supervised machine learning models, we extracted image features using the HOG feature descriptor and classified the images into benign and malignant. It was observed that the classification accuracy depends mainly on how well HOG extracts the features.
Machine learning is a powerful, efficient, and highly accurate tool that reduces the workload of pathologists and clinicians. Machine learning approaches are expected to be used in various medical tasks such as image classification, object detection, and segmentation. We must have basic knowledge of machine learning, anticipate the problems that will occur when it is introduced, and be prepared to address them.

5.3 Future Work


Machine learning methods have achieved state-of-the-art results in different medical applications, such as medical image analysis and monitoring disease progression in human beings. However, there are gaps that still need to be addressed in skin cancer image analysis, such as building a large dataset of skin screening images and making it available to researchers so that different models trained on skin cancer images become available. In this work we used machine learning models with varying parameters to classify the images; although the efficiency is fairly good, there is still room for improvement.

Our next challenge is to push the accuracy as close to 100% as possible. Currently, we only classify lesions into benign and malignant; in the future, we will monitor the progression of these diseases. Another future direction from this study is extending the model implementation to a conventional smartphone processor to perform fast and cheap on-device inference.
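As a hedged sketch of this direction, the Keras model saved in Chapter 3 (model.json / model.h5) could be converted to TensorFlow Lite for on-device inference roughly as follows, assuming TensorFlow 2.x is available; the output filename is illustrative.

import tensorflow as tf
from tensorflow import keras

# Rebuild the model saved in Chapter 3 and load its weights
model = keras.models.model_from_json(open("model.json").read())
model.load_weights("model.h5")

# Convert to a TensorFlow Lite flat buffer for the mobile app
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization
tflite_model = converter.convert()

with open("skin_cancer.tflite", "wb") as f:  # illustrative output name
    f.write(tflite_model)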

5.4 References

[1] N. Nami, E. Giannini, M. Burroni, M. Fimiani, and P. Rubegni, "Teledermatology: state-of-the-art and future perspectives," Expert Review of Dermatology, vol. 7, no. 1, pp. 1-3, 2012.
[2] G. Fabbrocini et al., "Epidemiology of skin cancer: role of some environmental factors,"
Cancers, vol. 2, no. 4, pp. 1980-1989, 2010.
[3] R. Marks, "An overview of skin cancers," Cancer, vol. 75, no. S2, pp. 607-612, 1995.
[4] A. Houghton, J. Flannery, and M. V. Viola, "Malignant melanoma in Connecticut and
Denmark," International Journal of Cancer, vol. 25, no. 1, pp. 95-104, 1980.
[5] G. Rassner and A. Blum, "Früherkennung des malignen Melanoms der Haut," in
Fortschritte der Dermatologie: Springer, 2002, pp. 185-192.
[6] H. A. Haenssle et al., "Man against machine: diagnostic performance of a deep learning
convolutional neural network for dermoscopic melanoma recognition in comparison to
58 dermatologists," Annals of oncology, vol. 29, no. 8, pp. 1836-1842, 2018.
[7] G. Argenziano and H. P. Soyer, "Dermoscopy of pigmented skin lesions–a valuable tool
for early," The lancet oncology, vol. 2, no. 7, pp. 443-449, 2001.
[8] H. Kittler, H. Pehamberger, K. Wolff, and M. Binder, "Diagnostic accuracy of
dermoscopy," The lancet oncology, vol. 3, no. 3, pp. 159-165, 2002.
[9] A.-R. A. Ali and T. M. Deserno, "A systematic review of automated melanoma detection
in dermatoscopic images and its ground truth data," in Medical Imaging 2012: Image
Perception, Observer Performance, and Technology Assessment, 2012, vol. 8318:
International Society for Optics and Photonics, p. 83181I.
[10] G. Fabbrocini et al., "Teledermatology: from prevention to diagnosis of nonmelanoma
and melanoma skin cancer," International journal of telemedicine and applications, vol.
2011, 2011.
[11] N. R. Abbasi et al., "Early diagnosis of cutaneous melanoma: revisiting the ABCD
criteria," Jama, vol. 292, no. 22, pp. 2771-2776, 2004.
[12] H. Kittler, M. Seltenheim, M. Dawid, H. Pehamberger, K. Wolff, and M. Binder, "Morphologic changes of pigmented skin lesions: a useful extension of the ABCD rule for dermatoscopy," Journal of the American Academy of Dermatology, vol. 40, no. 4, pp. 558-562, 1999.
[13] M. Keefe, D. Dick, and R. Wakeel, "A study of the value of the seven‐point checklist in
distinguishing benign pigmented lesions from melanoma," Clinical and experimental
dermatology, vol. 15, no. 3, pp. 167-171, 1990.
[14] M. Healsmith, J. Bourke, J. Osborne, and R. Graham‐Brown, "An evaluation of the
revised seven‐point checklist for the early diagnosis of cutaneous malignant melanoma,"
British Journal of Dermatology, vol. 130, no. 1, pp. 48-50, 1994.
[15] H. P. Soyer et al., "Three-point checklist of dermoscopy," Dermatology, vol. 208, no. 1,
pp. 27-31, 2004.
[16] A. Masood and A. Ali Al-Jumaily, "Computer aided diagnostic support system for skin
cancer: a review of techniques and algorithms," International journal of biomedical
imaging, vol. 2013, 2013.
[17] Z. Shi et al., "A deep CNN based transfer learning method for false positive reduction,"
Multimedia Tools and Applications, vol. 78, no. 1, pp. 1017-1033, 2019.
[18] S. Demyanov, R. Chakravorty, M. Abedini, A. Halpern, and R. Garnavi, "Classification
of dermoscopy patterns using deep convolutional neural networks," in 2016 IEEE 13th
international symposium on biomedical imaging (ISBI), 2016: IEEE, pp. 364-368.
[19] R. Moussa, F. Gerges, C. Salem, R. Akiki, O. Falou, and D. Azar, "Computer-aided
detection of Melanoma using geometric features," in 2016 3rd Middle East Conference
on Biomedical Engineering (MECBME), 2016: IEEE, pp. 125-128.
[20] G. Argenziano et al., "Dermoscopy of pigmented skin lesions: results of a consensus
meeting via the Internet," Journal of the American Academy of Dermatology, vol. 48, no.
5, pp. 679-693, 2003.
[21] R. Amelard, A. Wong, and D. A. Clausi, "Extracting high-level intuitive features (HLIF)
for classifying skin lesions using standard camera images," in 2012 Ninth Conference on
Computer and Robot Vision, 2012: IEEE, pp. 396-403.
[22] R. Amelard, J. Glaister, A. Wong, and D. A. Clausi, "High-level intuitive features (HLIFs) for intuitive skin lesion description," IEEE Transactions on Biomedical Engineering, vol. 62, no. 3, pp. 820-831, 2014.
[23] M. E. Celebi et al., "A methodological approach to the classification of dermoscopy
images," Computerized Medical imaging and graphics, vol. 31, no. 6, pp. 362-373, 2007.
[24] R. J. Stanley, W. V. Stoecker, and R. H. Moss, "A relative color approach to color
discrimination for malignant melanoma detection in dermoscopy images," Skin Research
and Technology, vol. 13, no. 1, pp. 62-72, 2007.
[25] C. Barata, M. Ruela, M. Francisco, T. Mendonça, and J. S. Marques, "Two systems for
the detection of melanomas in dermoscopy images using texture and color features,"
IEEE systems Journal, vol. 8, no. 3, pp. 965-979, 2013.
[26] H. Ganster, P. Pinz, R. Rohrer, E. Wildling, M. Binder, and H. Kittler, "Automated melanoma recognition," IEEE Transactions on Medical Imaging, vol. 20, no. 3, pp. 233-239, 2001.
[27] P. Rubegni et al., "Automated diagnosis of pigmented skin lesions," International
Journal of Cancer, vol. 101, no. 6, pp. 576-580, 2002.
[28] E. Almansour and M. A. Jaffar, "Classification of Dermoscopic skin cancer images using
color and hybrid texture features," IJCSNS Int J Comput Sci Netw Secur, vol. 16, no. 4,
pp. 135-9, 2016.
[29] R. B. Oliveira, J. P. Papa, A. S. Pereira, and J. M. R. Tavares, "Computational methods
for pigmented skin lesion classification in images: review and future trends," Neural
Computing and Applications, vol. 29, no. 3, pp. 613-636, 2018.
[30] I. Stanganelli et al., "Computer-aided diagnosis of melanocytic lesions," Anticancer
Research, vol. 25, no. 6C, pp. 4577-4582, 2005.
[31] H. Iyatomi et al., "Computer-based classification of dermoscopy images of melanocytic
lesions on acral volar skin," Journal of Investigative Dermatology, vol. 128, no. 8, pp.
2049-2054, 2008.
[32] M. A. Farooq, M. A. M. Azhar, and R. H. Raza, "Automatic lesion detection system
(ALDS) for skin cancer classification using SVM and neural classifiers," in 2016 IEEE
16th International Conference on Bioinformatics and Bioengineering (BIBE), 2016:
IEEE,
pp. 301-308.

[33] J. F. Alcón et al., "Automatic imaging system with decision support for inspection of
pigmented skin lesions and melanoma diagnosis," IEEE journal of selected topics in
signal processing, vol. 3, no. 1, pp. 14-25, 2009.
