Plant Disease Detection with CNN
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
Submitted by
B G RAHUL KARRTHIK-(ENG17CS0044)
B M MADHURYA-(ENG17CS0046)
BHAVYA M-(ENG17CS0052)
DARSHAN A-(ENG17CS0060)
ABSTRACT
Agriculture and modern farming are among the fields where AI and deep learning can have a
great impact. Maintaining healthy plants and monitoring their environment in order to
detect diseases is essential for maximizing crop yield. Rapidly advancing technologies such as
artificial intelligence (AI), machine learning, and deep learning have proved extremely
important in modern agriculture, particularly for advanced image analysis. Artificial
intelligence saves time and makes it possible to identify plant diseases, in addition to
monitoring and controlling environmental conditions on farms. Several studies have shown that
machine learning and deep learning can detect plant diseases from images of plant leaves with
high accuracy and sensitivity. In this study, we present a convolutional neural network based
on the VGG-16 model to detect plant diseases, so that farmers can act on treatment without
delay. For this purpose, 39 classes were chosen and six augmentation techniques were applied:
image flipping, gamma correction, noise injection, PCA color augmentation, rotation, and
scaling. A total of 61,486 plant leaf images (both diseased and healthy leaves) were acquired
from the Plant Village dataset for training and testing. We set aside 15% of the data for
testing and divided the remaining 85% into training and validation sets in a 70:30 ratio,
giving 36,584 images for training, 15,679 for validation, and the remaining images for
testing. Based on the experimental results, the proposed model achieves an accuracy of about
98.7% with a testing loss of 0.4418. The proposed model provides a clear direction toward
deep learning-based plant disease detection that can be applied on a large scale in the
future.
PLANT DISEASE DETECTION USING ML
CHAPTER 1
1.1. Introduction
Agriculture has been a basic human need throughout human existence, since plants have always
been a primary source of food. Even today, agriculture remains an essential food source and is
central to many aspects of human life. Indeed, agriculture serves as a pillar of the economy
in many countries, regardless of their stage of development. Its importance is evident in
several domains; in particular, agriculture is a main source of livelihood, with approximately
70% of the population depending on plants and their cultivation. This large share makes
agriculture the most important resource for meeting the needs of a rapidly growing
population. One of the most critical challenges that faces agriculture and affects its trade
is plant disease: how to detect diseases in time and treat them to improve crop health.
A common approach to this problem is the use of remote sensing techniques that analyze
multispectral and hyperspectral images. Methods that adopt this approach often employ digital
image processing tools to achieve their goals. Image processing technology in agricultural
research has developed significantly.
Deep learning techniques, and in particular Convolutional Neural Networks (CNNs), have led
to significant progress in image processing. Since 2016, many applications for the automatic
identification of crop diseases have been developed. These applications could serve as a basis
for the development of expert assistance or automatic screening tools. Such tools could
contribute to more sustainable agricultural practices and greater food production security.
1.2. Objective
Detecting plant leaf diseases and pests requires experience and expert knowledge. We therefore
aim to equip the young generation of inexperienced farmers with a Flask web application that
helps them detect diseases in their plants and provides appropriate remedies and a set of
guidelines for dealing with the affected parts according to the disease detected.
CHAPTER 2
2. Literature Survey
Introduction
Plant diseases pose a significant threat to agriculture, impacting crop yield and quality. Early
detection of plant diseases is crucial for effective management and prevention of widespread
outbreaks. Recent advancements in deep learning, particularly Convolutional Neural
Networks (CNNs), have shown promise in accurately detecting and classifying plant diseases
from images, contributing to more sustainable agriculture. This literature review examines
current research on CNNs and related models in plant disease detection, focusing on
methodologies, datasets, performance, and limitations.
Several studies have highlighted CNNs as a powerful tool for image-based plant disease
detection. CNNs are well-suited for this task because of their ability to learn complex spatial
hierarchies in images, making them effective in identifying subtle visual patterns associated
with specific diseases.
For instance, Ferentinos (2018) developed a deep learning model based on CNNs for
detecting diseases in 25 different plant species. The model achieved over 99% accuracy on the
Plant Village dataset, demonstrating CNNs' effectiveness in recognizing visual features
indicative of disease. This study used a VGG16-based model, emphasizing the impact of
transfer learning in achieving high accuracy with limited labeled data.
In another study, Mohanty, Hughes, and Salathé (2016) trained deep CNN models on the
Plant Village dataset, comprising more than 50,000 labeled images of healthy and diseased
plant leaves. Their model achieved an accuracy of 99.35%, underscoring CNNs’ potential for
high-performance disease detection. Their work further explored generalization to new
datasets, highlighting that training on diverse datasets improves the model's robustness in real-
world applications.
Transfer learning has become an essential approach in plant disease detection due to limited
labeled data availability. By leveraging pre-trained CNN models, researchers have achieved
high accuracy with relatively small datasets. Too et al. (2019) investigated the performance of
pre-trained architectures such as ResNet, VGG, and Inception on plant disease detection.
They found that ResNet-50 outperformed other models, achieving an accuracy of 97.35% on a
custom dataset, demonstrating the power of deeper networks in capturing disease-specific
features.
Additionally, Picon et al. (2019) employed MobileNet, a lightweight CNN model, to detect
grapevine diseases. The model achieved accuracy levels comparable to more complex
architectures while significantly reducing computational requirements, making it suitable for
mobile and edge devices in real-time applications. This research illustrates the adaptability of
transfer learning for practical applications in agriculture.
Moreover, Fuentes et al. (2018) proposed a region-based CNN (R-CNN) for detecting
multiple diseases in tomato plants. This model used selective search and data augmentation to
improve accuracy and robustness, demonstrating that careful data preparation is crucial for
disease detection across varying conditions and plant species.
Despite the promising results, several challenges remain in CNN-based plant disease
detection. The dependency on large labeled datasets, limited generalization to different
conditions (e.g., varying lighting, angles, or environmental conditions), and high
Future research may focus on expanding datasets with diverse environmental conditions,
developing lightweight models for deployment on mobile and IoT devices, and improving
robustness through advanced data augmentation or semi-supervised learning techniques.
Additionally, integrating CNNs with Internet of Things (IoT) technology for real-time disease
monitoring could greatly enhance agricultural practices.
3. An Artificial Intelligence and Cloud Based Collaborative Platform for Plant Disease
Identification, Tracking and Forecasting for Farmers, 2018
This paper presents an automated, low cost and easy to use end-to-end solution to one of the
biggest challenges in the agricultural domain for farmers – precise, instant and early diagnosis
of crop diseases and knowledge of disease outbreaks - which would be helpful in quick decision
making for measures to be adopted for disease control. This proposal innovates on known prior
art with the application of deep Convolutional Neural Networks (CNNs) for disease
classification, introduction of social collaborative platform for progressively improved
accuracy, usage of geocoded images for disease density maps and expert interface for analytics.
High performing deep CNN model “Inception” enables real time classification of diseases in
the Cloud platform via a user facing mobile app. [5]
4. CNN based Leaf Disease Identification and Remedy Recommendation System, 2019
This paper focuses on plant disease detection using an image processing approach. The work
uses an open dataset of 5,000 images of unhealthy and healthy plants, in which a convolutional
network and semi-supervised techniques are used to characterize crop species and detect the
disease status of 4 distinct classes. A convolutional neural network is used to detect and
classify plant diseases. The network was trained using images taken in the natural environment
and achieved a classification accuracy of 99.32%. This shows the ability of CNNs to extract
important features in the natural environment, which is required for plant disease
classification. [6]
5. Plant Leaf Diseases Detection and Classification Using Image Processing and Deep
Learning Techniques, 2020
This paper presents a system that classifies and detects plant leaf diseases using deep
learning techniques. The images were obtained from the Plant Village dataset website. The
authors selected specific types of plants, namely tomatoes, peppers, and potatoes, as they are
the most common types of plants in the world and in Iraq in particular. The dataset contains
20,636 images of plants and their diseases. The proposed system uses a convolutional neural
network (CNN) to classify plant leaf diseases into 15 classes: 12 classes for diseases of the
different plants, caused by bacteria, fungi, and so on, and 3 classes for healthy leaves. The
authors report an accuracy of 98.29% for training and 98.029% for testing over the whole
dataset. [7]
CHAPTER 3
3.2. Methodology
The main aim is to implement a web application, backed by a trained model, that provides
plant disease detection and classification. It predicts the disease class and returns a
description of the disease and the suggested remedies. For that purpose, we used a dataset
collected from Plant Village. The dataset passes through two phases: a training phase and a
testing phase. The first phase consists of image acquisition (fetching images from the
dataset); image pre-processing, which includes six techniques for increasing the dataset size
(image flipping, gamma correction, noise injection, PCA color augmentation, rotation, and
scaling); splitting the dataset into training, validation, and testing sets; and
convolutional neural network (CNN) based training.
The second phase consists of image acquisition, image pre-processing, classification and
disease identification, and the suggestion of appropriate remedies, as shown in Figure 3.
A. Image Acquisition:
For training, images are taken from the Plant Village dataset. The dataset provides 39 classes
of plant leaf and background images, as shown in Table 1. It contains 61,486 images, which are
saved on the system. Since its introduction, the Plant Village dataset has become the most
commonly used dataset for training and developing deep learning-based plant disease
identification and severity estimation models. The images are divided across 39 classes
covering 14 crops, as shown in Table 1. Most of the images were acquired under controlled lab
conditions with uniform backgrounds. For testing, images of plant leaves are taken from the
dataset or captured as required and then transferred to a folder on the system for analysis.
B. Image Pre-Processing:
Images should be pre-processed before being passed to the algorithm for training and testing.
For that purpose, we used six different augmentation techniques to increase the dataset size:
image flipping, gamma correction, noise injection, PCA color augmentation, rotation, and
scaling.
Image flipping:
Description: This involves creating a mirrored version of the image either
horizontally (left to right) or vertically (top to bottom).
Purpose: Image flipping increases the dataset's variety by adding new orientations,
allowing the model to recognize plant diseases from different perspectives. This helps
the model become more robust in detecting features irrespective of how the image is
oriented.
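The flipping step can be illustrated with a minimal NumPy sketch. It assumes the image is held as a height x width (x channels) array; the function name `flip_image` and the tiny sample array are illustrative only and are not taken from the project code.

```python
import numpy as np

def flip_image(image: np.ndarray, horizontal: bool = True) -> np.ndarray:
    """Return a mirrored copy of an H x W (x C) image array."""
    # Axis 1 is the width (left-right mirror); axis 0 is the height (top-bottom mirror).
    return np.flip(image, axis=1 if horizontal else 0).copy()

# Tiny 2x3 single-channel "image" used only to make the effect visible.
sample = np.array([[1, 2, 3],
                   [4, 5, 6]], dtype=np.uint8)
flipped_lr = flip_image(sample, horizontal=True)   # [[3, 2, 1], [6, 5, 4]]
flipped_ud = flip_image(sample, horizontal=False)  # [[4, 5, 6], [1, 2, 3]]
```

In an augmentation pipeline, each flipped copy would be stored with the same class label as its source image.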
Gamma Correction
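Description: Gamma correction adjusts an image's brightness by applying a non-linear
(power-law) transformation to pixel values. With pixel values normalized to [0, 1] and the
mapping output = input^(1/gamma), gamma < 1 darkens the image and gamma > 1 brightens it;
under the alternative convention output = input^gamma the effect is reversed.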
Purpose: This technique simulates images taken under varying lighting conditions. It
helps the model learn features in both bright and dark environments, reducing the
impact of lighting variations on model accuracy.
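A minimal sketch of gamma correction is given below. It assumes 8-bit images and the convention output = input^(1/gamma) on values normalized to [0, 1], under which gamma < 1 darkens and gamma > 1 brightens; the function name `gamma_correct` and the test values are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def gamma_correct(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply out = 255 * (in / 255) ** (1 / gamma) to a uint8 image."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    normalized = image.astype(np.float64) / 255.0   # scale to [0, 1]
    corrected = np.power(normalized, 1.0 / gamma)   # power-law mapping
    return np.clip(np.rint(corrected * 255.0), 0, 255).astype(np.uint8)

# A uniform mid-gray patch makes the direction of the change easy to see.
mid_gray = np.full((2, 2), 128, dtype=np.uint8)
darker = gamma_correct(mid_gray, 0.5)    # gamma < 1: values fall below 128
brighter = gamma_correct(mid_gray, 2.0)  # gamma > 1: values rise above 128
```

Sampling gamma from a small range around 1 for each image is one way to simulate moderate lighting changes without destroying leaf detail.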
Noise Injection
Description: Noise injection adds random pixel variations (like Gaussian noise) to the
image, simulating random imperfections.
Purpose: By adding noise, the model learns to focus on key features instead of noise
patterns. This makes the model more robust to image imperfections, like sensor noise
or environmental artifacts.
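Noise injection can be sketched as follows, assuming additive zero-mean Gaussian noise on an 8-bit image. The function name `add_gaussian_noise`, the standard deviation of 10, and the fixed seed are illustrative choices rather than values reported in this work.

```python
import numpy as np

def add_gaussian_noise(image: np.ndarray, sigma: float = 10.0, seed=None) -> np.ndarray:
    """Add zero-mean Gaussian noise with standard deviation `sigma` to a uint8 image."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Clip back to the valid 8-bit range before converting.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

# Uniform 32x32 RGB patch used as a stand-in for a leaf image.
clean = np.full((32, 32, 3), 120, dtype=np.uint8)
noisy = add_gaussian_noise(clean, sigma=10.0, seed=0)
```

Because the noise has zero mean, the overall brightness of the image stays close to the original while individual pixels vary.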
Rotation
Description: This involves rotating the image by a specified angle, such as 90°, 180°,
or random angles within a certain range.
Purpose: Plants can appear at any angle in real-world images. Rotation enables the
model to recognize diseases irrespective of the plant’s orientation, making it more
robust to variations in angle.
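The sketch below illustrates rotation for the simplest case, multiples of 90°, using NumPy only. Rotation by arbitrary angles requires interpolation and is normally done with an image library, which this sketch does not cover; the function name `rotate_right_angle` is illustrative.

```python
import numpy as np

def rotate_right_angle(image: np.ndarray, quarter_turns: int) -> np.ndarray:
    """Rotate an H x W (x C) image counter-clockwise by quarter_turns * 90 degrees."""
    return np.rot90(image, k=quarter_turns, axes=(0, 1)).copy()

sample = np.array([[1, 2],
                   [3, 4]], dtype=np.uint8)
rot90 = rotate_right_angle(sample, 1)   # [[2, 4], [1, 3]]
rot180 = rotate_right_angle(sample, 2)  # [[4, 3], [2, 1]]
```

Right-angle rotations are lossless, since no pixel values are interpolated and no corners are cropped.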
Scaling
Description: Scaling involves resizing the image, either zooming in (making features
larger) or zooming out (making features smaller).
Purpose: This technique accounts for variations in distance between the camera and
the plant. Scaling allows the model to recognize diseases at different scales, improving
its adaptability to images with varied levels of zoom.
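Scaling can be sketched with nearest-neighbour sampling, as below. This is an illustration under simplifying assumptions: a production pipeline would typically use an image library with smoother interpolation and then crop or pad the result back to the network's input size. The function name `scale_nearest` is hypothetical.

```python
import numpy as np

def scale_nearest(image: np.ndarray, factor: float) -> np.ndarray:
    """Resize an H x W (x C) image by `factor` using nearest-neighbour sampling."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    height, width = image.shape[:2]
    new_h = max(1, int(round(height * factor)))
    new_w = max(1, int(round(width * factor)))
    # Map each output row/column back to the source row/column it samples from.
    rows = np.minimum((np.arange(new_h) / factor).astype(int), height - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), width - 1)
    return image[rows][:, cols]

sample = np.arange(16, dtype=np.uint8).reshape(4, 4)
zoomed_in = scale_nearest(sample, 2.0)   # 8x8: every pixel repeated 2x2
zoomed_out = scale_nearest(sample, 0.5)  # 2x2: every second pixel kept
```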
1. Input Layer
The input layer contains the input images and their pixel values.
2. Convolution Layer
The Convolution Layer is a layer in a neural network where a small matrix, called a filter or
kernel, slides over the input image to perform a mathematical operation called convolution.
This operation creates a new matrix, known as a feature map or activation map, which
highlights specific features of the input image, such as edges, textures, or shapes.
How the Convolution Operation Works
1. Kernel/Filter: A kernel is a small, usually square matrix (for example 3x3) whose values
define a specific pattern. This kernel is applied over the entire input image.
2. Sliding the Kernel: The kernel slides over the image, moving a fixed number of pixels at
each step (the stride). At each position, it performs an element-wise multiplication with
the overlapping section of the input image.
3. Summing the Values: The products from the element-wise multiplication are summed
up, resulting in a single value that is placed in the output feature map.
4. Resulting Feature Map: The final output of a convolution operation is a feature map,
which represents the presence and intensity of specific features in the input.
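The four steps above can be sketched for a single-channel image as follows. This is a teaching sketch, not the layer used in the project: it applies no padding and does not flip the kernel (the usual convention in deep learning libraries), and the function name `conv2d_single` and the example kernel are illustrative.

```python
import numpy as np

def conv2d_single(image: np.ndarray, kernel: np.ndarray, stride: int = 1) -> np.ndarray:
    """Slide `kernel` over a 2-D `image` (no padding) and return the feature map."""
    k_h, k_w = kernel.shape
    out_h = (image.shape[0] - k_h) // stride + 1
    out_w = (image.shape[1] - k_w) // stride + 1
    feature_map = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            # Section of the image currently covered by the kernel.
            patch = image[i * stride:i * stride + k_h, j * stride:j * stride + k_w]
            # Element-wise multiplication followed by summation gives one output value.
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# 4x5 image with a dark left half and a bright right part (a vertical edge).
image = np.array([[0, 0, 0, 1, 1]] * 4, dtype=np.float64)
# 3x3 kernel that responds to left-to-right increases in intensity.
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float64)
feature_map = conv2d_single(image, kernel)
# feature_map == [[0, 3, 3], [0, 3, 3]]: zero in the flat region, high at the edge.
```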
3. Pooling Layer
The pooling layer is used after convolution layers to downsample the feature maps. It works
by summarizing regions of the feature map, effectively reducing its size while retaining the
most important information. This is achieved by applying a specific operation, typically max
pooling or average pooling, to small patches within each feature map.
How Pooling Works
1. Pooling Window: The pooling layer uses a small sliding window (e.g., 2x2 or 3x3),
similar to the filter used in a convolution layer, to move across each feature map.
2. Stride: The stride in pooling defines how far the window moves at each step.
Commonly, a stride of 2 is used to down sample by half.
3. Pooling Operation: For each position of the pooling window, a single value is
computed based on the chosen pooling method. The most common pooling methods
are:
o Max Pooling: Takes the maximum value within the pooling window. This
operation captures the most prominent feature within each region, helping the
network focus on key activations.
o Average Pooling: Calculates the average of all values within the pooling
window. This approach smooths out features, providing an average
representation of each region.
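Both pooling methods can be sketched on a single feature map as shown below, assuming a 2x2 window with a stride of 2. The function name `pool2d` and the sample values are illustrative only.

```python
import numpy as np

def pool2d(feature_map: np.ndarray, window: int = 2, stride: int = 2,
           mode: str = "max") -> np.ndarray:
    """Downsample a 2-D feature map with max or average pooling."""
    reducer = {"max": np.max, "average": np.mean}[mode]
    out_h = (feature_map.shape[0] - window) // stride + 1
    out_w = (feature_map.shape[1] - window) // stride + 1
    pooled = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            region = feature_map[i * stride:i * stride + window,
                                 j * stride:j * stride + window]
            pooled[i, j] = reducer(region)  # one summary value per window position
    return pooled

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 9, 5],
                 [1, 1, 3, 7]], dtype=np.float64)
max_pooled = pool2d(fmap, mode="max")      # [[6, 2], [2, 9]]
avg_pooled = pool2d(fmap, mode="average")  # [[3.5, 1.0], [1.0, 6.0]]
```

With a 2x2 window and a stride of 2, the 4x4 map is reduced to 2x2, halving each spatial dimension.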
6. Normalize Layer
In our proposed system we use a batch normalization layer, which normalizes each channel
across a mini-batch. This helps to decrease sensitivity to variations in the data. How it
works: batch normalization normalizes the output of a layer by adjusting the mean and
variance within a mini-batch. Specifically, for each feature in the mini-batch, the
activations are normalized to have a mean of 0 and a variance of 1. Then two learnable
parameters, gamma (scale) and beta (shift), are applied to maintain the flexibility of the
model. Formula: given an input x, batch normalization transforms it as:
\[ \hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y = \gamma \hat{x} + \beta \]
where \mu_B and \sigma_B^2 are the mean and variance of x over the mini-batch and \epsilon
is a small constant added for numerical stability.
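A minimal sketch of the training-time computation is shown below for a mini-batch of feature vectors. It assumes the standard formulation with a small constant eps for numerical stability and omits the running statistics that frameworks keep for inference; the function name `batch_norm` and the sample batch are illustrative.

```python
import numpy as np

def batch_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    """Normalize a (batch, features) array per feature, then scale and shift."""
    mean = x.mean(axis=0)                     # per-feature mean over the mini-batch
    var = x.var(axis=0)                       # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, approximately unit variance
    return gamma * x_hat + beta               # learnable scale (gamma) and shift (beta)

# Two features on very different scales, four samples in the mini-batch.
batch = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0]])
normalized = batch_norm(batch, gamma=np.ones(2), beta=np.zeros(2))
```

With gamma = 1 and beta = 0, both columns end up on the same scale even though the raw features differ by a factor of ten. For convolutional feature maps, the same statistics are computed per channel over the batch and spatial dimensions.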
SoftMax Layer
The raw outputs of the network can be difficult to interpret, so in classification problems
it is common to end the CNN with a SoftMax function. After the fully connected step produces
one value for each of the 39 classes, SoftMax converts these values into class probabilities,
based on the features extracted by the previous layers that the plant disease images passed
through. The class with the highest probability is selected as the predicted disease.
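The SoftMax step can be sketched as follows. The 39 scores here are random placeholders standing in for the output of the fully connected layer, and the function name `softmax` is illustrative; subtracting the maximum score is a standard trick for numerical stability and does not change the result.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert a vector of raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # avoids overflow in exp()
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

NUM_CLASSES = 39                          # number of classes used in this work
rng = np.random.default_rng(0)
logits = rng.normal(size=NUM_CLASSES)     # placeholder scores, not real model output
probabilities = softmax(logits)
predicted_class = int(np.argmax(probabilities))  # index of the most probable class
```

Since SoftMax preserves the ordering of the scores, the predicted class is always the one with the largest raw score; the probabilities additionally indicate how confident the prediction is.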
D. Training
Training a network is the process of finding the kernels in the convolution layers and the
weights in the fully connected layers that minimize the difference between the output
predictions and the given ground-truth labels on a training dataset. In our work, 15% of the
data was set aside for testing and the remaining 85% was divided into training and validation
sets in a 70:30 ratio. During this stage the network learns by extracting features from the
plant leaf disease images, and these features are then used to distinguish each image.
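The split sizes reported in the abstract (36,584 training and 15,679 validation images out of 61,486) can be reproduced with the short calculation below. It assumes that 85% of the images are first set aside for training and validation, that 70% of that portion is used for training, and that fractional counts are rounded down; the rounding rule and the function name `split_sizes` are assumptions.

```python
def split_sizes(total: int, trainval_percent: int = 85, train_percent: int = 70):
    """Return (train, validation, test) image counts for a two-stage split."""
    trainval = total * trainval_percent // 100   # images kept for training + validation
    train = trainval * train_percent // 100      # share of that portion used for training
    validation = trainval - train                # the rest of that portion
    test = total - trainval                      # images never seen during training
    return train, validation, test

train_n, val_n, test_n = split_sizes(61486)
print(train_n, val_n, test_n)  # 36584 15679 9223
```

Under these assumptions, the remaining images used for testing number 9,223.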
E. Testing
The test set is used to provide an unbiased final evaluation of the model fitted on the
training data. In this stage, the plant leaf disease images in the test set are passed
through the CNN trained in the previous step, which classifies them using the features it has
learned. We used 15% of the data for testing.
[Figure: flowchart of the process: Start, Input image, CNN process, End]
CHAPTER 4
4. Implementation
4.1. Requirement Analysis
4.1.1. Functional Requirement
• The system should provide a GUI that is easy and efficient to use.
• The system should give a clear picture of the status of the leaf.
• If the leaf is found to be unhealthy, i.e., diseased, the system should report the
type of disease with which the leaf is infected.
• The system should also provide remedies for the disease with which the plant
is infected.
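import os
from random import shuffle
import cv2
import numpy as np
# TRAIN_DIR and IMG_SIZE are assumed to be defined elsewhere; they are not shown in the source.
training_data = []
for img in os.listdir(TRAIN_DIR):
    label = label_img(img)  # hypothetical helper: maps a file name to its class label
    path = os.path.join(TRAIN_DIR, img)
    img = cv2.resize(cv2.imread(path, cv2.IMREAD_COLOR), (IMG_SIZE, IMG_SIZE))  # load, then resize
    training_data.append([np.array(img), np.array(label)])
shuffle(training_data)  # randomize the order before splitting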
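# Imports assumed from the TFLearn calls used below. The input layer and the first
# convolution are missing from the source and are not reconstructed here.
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import fully_connected

convnet = max_pool_2d(convnet, 3)                     # 3x3 max pooling
convnet = conv_2d(convnet, 64, 3, activation='relu')  # 64 filters, 3x3 kernel
convnet = max_pool_2d(convnet, 3)
convnet = conv_2d(convnet, 128, 3, activation='relu')
convnet = max_pool_2d(convnet, 3)
convnet = conv_2d(convnet, 32, 3, activation='relu')
convnet = max_pool_2d(convnet, 3)
convnet = conv_2d(convnet, 64, 3, activation='relu')
convnet = max_pool_2d(convnet, 3)
convnet = fully_connected(convnet, 1024, activation='relu')  # dense layer with 1024 units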
= train_data[-500:]
CHAPTER 5
5.1. Results
5.2. Conclusion
To prevent losses, small farmers depend on timely and accurate crop disease diagnosis.
In this study, a Convolutional Neural Network was used to develop the detection model.
The final result is a plant disease detection web application. The service is free and easy
to use. Thus, the user's needs as defined in this report have been fulfilled.
A thorough investigation exposes the capabilities and limitations of the model. The achieved
accuracy depends on a number of factors, including the stage of the disease, the disease
type, the background, and the object composition. For this reason, a set of user guidelines
would be required for commercial use to ensure that the stated accuracy is delivered. As the
model was trained on images of a single leaf against a plain background, input images that
imitate these conditions work best.
References
[1] S. Arivazhagan, R. Newlin Shebiah, S. Ananthi, S. Vishnu Varthini, “Detection of
unhealthy region of plant leaves and classification of plant leaf diseases using texture
feature”, CIGR, 2013, vol. 15, no. 1, pp. 211-217.
[2] Haiguang Wang, Guanlin Li, Zhanhong Ma, Xiaolong Li, “Image Recognition of Plant
Diseases Based on Backpropagation Networks”, 5th International Congress on Image and
Signal Processing (CISP 2012).
[3] Bhange, M., Hingoliwala, H.A., “Smart Farming: Pomegranate Disease Detection Using
Image Processing”, Second International Symposium on Computer Vision and the Internet,
Volume 58, pp. 280-288, 2015.
[4] Pranali K. Kosamkar, [Link], Krushna Mantri, Shubham Rudrawar, Shubhan
Salmpuria, Nishant Gadekar, “Leaf Disease Detection and Recommendation of Pesticides
using Convolution Neural Network”, 2018 Fourth International Conference on Computing
Communication Control and Automation (ICCUBEA).
[5] Kaushik Kunal Singh, “An Artificial Intelligence and Cloud Based Collaborative Platform
for Plant Disease Identification, Tracking and Forecasting for Farmers”, 2018 IEEE
International Conference on Cloud Computing in Emerging Markets (CCEM).
[6] Suma V, R Amog Shetty, Rishab F Tated, Sunku Rohan, Triveni S Pujar, “CNN based Leaf
Disease Identification and Remedy Recommendation System”, Third International
Conference on Electronics Communication and Aerospace Technology (ICECA 2019).
[7] Marwan Adnan Jasim, Jamal Mustafa AL-Tuwaijari, “Plant Leaf Diseases Detection and
Classification Using Image Processing and Deep Learning Techniques”, 2020 International
Conference on Computer Science and Software Engineering (CSASE), Duhok, Kurdistan
Region - Iraq.