Design of A Smart Milk Adulteration Detection System Using Ai and Sensors Document
BACHELOR OF TECHNOLOGY
in
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
CERTIFICATE
This is to certify that the project report entitled “DESIGN OF A SMART MILK
ADULTERATION DETECTION SYSTEM USING AI AND SENSORS” is the
bonafide work of PATHIVADA ANUSHA (21KD1A04D5), ROUTHU DILEEP
KUMAR (21KD1A04F4), SAILADA NIHARIKA (21KD1A04F6) and
SIDDANADHAM SS VIJAY KUMAR (21KD1A04G3), the students of this college,
submitted in partial fulfillment of the requirements for the award of the degree of Bachelor of
Technology in Electronics and Communication Engineering during the academic year
2024-2025.
Dr. M. Rajan Babu [Link]., Ph.D., MISTE., FIE
Professor
Department of ECE
Mr. [Link] [Link](Ph.D).,
Assistant Professor
Department of ECE
EXTERNAL EXAMINER
ACKNOWLEDGEMENTS
We consider it a privilege to thank all the people who helped us in the
successful completion of the project work entitled ‘DESIGN OF A SMART MILK
ADULTERATION DETECTION SYSTEM USING AI AND SENSORS’.
We would like to express our heartfelt gratitude to our parents for their
encouragement and support to achieve and fulfill our dreams.
We take this opportunity to express our deep sense of gratitude to the
Management of LENDI Institute of Engineering and Technology for providing
congenial atmosphere and encouragement.
We sincerely express our wholehearted thanks to Dr. V. V. Rama Reddy,
Principal, Lendi Institute of Engineering and Technology who has given a lot of
support and freedom during our academics.
We profoundly thank Dr. M. Rajan Babu, Head of the Department, Electronics
and Communication Engineering, for his collaboration, constant support, and positive
belief in his students for the successful completion of this project work, despite his
hectic schedule of administration and teaching.
We would like to thank our Final year Coordinator Dr. Srikant Kumar Beura
PhD, NIT Meghalaya Associate Professor for his technical guidance, constant
encouragement and support in carrying out the project work.
We would like to thank our guide Mr. K. Gurucharan, Assistant Professor, for
his technical guidance, continuous encouragement, and support in carrying out the project
work.
Finally, we would like to thank all the Teaching and Non-Teaching Staff who
helped us in the successful completion of this project work. We would also like to thank all
of our friends who helped us directly and indirectly for the successful completion of our
project work.
LENDI INSTITUTE OF ENGINEERING AND TECHNOLOGY
An Autonomous Institution
(Permanently Affiliated to JNTUGV, Approved by AICTE Accredited by
NBA, Accredited with NAAC with ‘A’ Grade)
Jonnada (Village), Denkada (Mandal), Vizianagaram District,
Andhra Pradesh, India-535005
INSTITUTE
VISION
Producing globally competent and quality technocrats with human values for the holistic needs
of industry and society.
MISSION
● Creating an outstanding infrastructure and platform for enhancement of skills, knowledge
and behavior of students towards employment and higher studies.
● Providing a healthy environment for research, development and entrepreneurship, to meet
the expectations of industry and society.
● Transforming the graduates to contribute to the socio-economic development and welfare
of the society through value-based education.
Department of Electronics and Communication Engineering
VISION
MISSION
● Offering an inspiring and conducive learning environment to prepare skilled and
competent engineers by providing infrastructure, laboratory facilities, and effective
teaching-learning process.
● Fostering culture to face complex technological challenges through Internships, Projects
and Industry-Institute Interactions in order to enhance employability skills.
● Creating an environment for higher studies and entrepreneurship by way of imparting
quality education and promoting research activities.
● Imparting professional behavior and strong ethical values towards societal issues by
encouraging socially relevant activities.
PEO1: Graduates shall have strong knowledge and technical skills in core and associated
fields of Electronics and Communication Engineering to become globally competent
engineers and emerging researchers.
PEO2: Graduates shall comprehend latest tools and techniques in the field of Electronics
and Communication Engineering to analyse, design and develop novel systems and
products for addressing the real time issues.
PEO3: Graduates shall have professional attitude, ethical values, teamwork and good
communication skills, to adapt the rapidly changing technologies in Electronics and
Communication Engineering through life-long learning.
PROGRAM SPECIFIC OUTCOMES (PSOs)
PSO1: Capable of designing, developing, testing, verifying and implementing analog and digital
electronics and communication engineering systems and products.
PSO2: Qualify in national and international competitive examinations for successful higher
studies and employment.
COURSE OUTCOMES:
S. No Description
C411.1 Acquire technical knowledge on fundamental aspects in electronics and
communication engineering to solve complex engineering problems for real
time applications.
C411.2 Identify the work based on past experiences and from literature survey for
specific problems in the field of Engineering.
Abstract
Milk adulteration is a critical public health concern, especially in developing
countries where regulatory enforcement is limited. The addition of harmful substances such
as water, starch, detergents, and urea to milk compromises its nutritional value and poses
serious health risks. Traditional laboratory methods for detecting adulterants are accurate
but impractical for everyday use due to their high cost, complexity, and time requirements.
To address these challenges, this project presents a smart, real-time, and cost-effective milk
adulteration detection system based on artificial intelligence (AI) and embedded sensors.
The system integrates various sensors including DS18B20 (for temperature),
TCS34725 (for color detection), a turbidity sensor, LED-LDR setup (for fat estimation),
and pH measurement strips to capture critical milk quality parameters. Sensor data is
collected through a Raspberry Pi and processed using multiple machine learning algorithms
—Random Forest, SVM, AdaBoost, and XGBoost. These models are combined using a
Stacking Classifier to enhance prediction accuracy and reliability.
A user-friendly web interface built using Flask enables real-time result visualization and
logging, making the system accessible for farmers, consumers, and dairy processors. This
solution not only ensures immediate adulteration detection but also supports data
calibration, model retraining, and continuous monitoring, making it a scalable tool for
ensuring milk quality across supply chains.
The proposed system achieved a prediction accuracy of 95.2% using a Stacking Classifier
combining Random Forest, SVM, AdaBoost, and XGBoost.
Keywords: Milk Quality Detection, Machine Learning, Raspberry Pi, DS18B20 Sensor,
Turbidity Sensor, TCS34725 Sensor, pH Measurement, Lactometer, Random Forest, SVM,
AdaBoost, XGBoost, Stacking Classifier, Flask Web App, Real-time Monitoring, Food
Safety, Sensor-based System
POs Attained: PO1, PO2, PO3, PO4, PO5, PO6, PO7, PO8, PO9, PO10, PO11, and PO12.
PSOs Attained: PSO1, PSO2
COURSE OUTCOMES VS POs MAPPING:
(DETAILED: HIGH:3; MEDIUM:2; LOW:1)
CO      PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10 PO11 PO12 PSO1 PSO2
C411.1  3    2    2    2    1    3    2    2    1    2    3    2    2    1
C411.2  3    2    1    2    3    2    3    2    3    2    1    2    2    1
C411.3  3    2    3    1    2    2    2    2    2    2    3    2    2    1
C411.4  2    1    3    2    3    2    2    3    1    3    2    2    2    1
C411.5  3    2    2    3    1    2    1    2    3    1    3    3    2    1
C411*   3    2    2    2    2    2    2    2    3    3    3    2    2    1
TABLE OF CONTENTS
SNO. CONTENT
Introduction
1.1 Problem Statement
1.2 Background on Milk Adulteration
Literature Review
Raspberry and Machine Learning
3.1 Methodology
3.2 Raspberry pi Architecture
3.3 Sensors Configuration
3.4 Machine Learning Techniques
Software used
4.1 Machine Training
4.2 Modules and libraries used
4.3 Layouts and tools used
Results and discussions
5.1 Data set collection
5.2 Data pre-processing
5.3 Feature Engineering and Machine Learning
5.3.1 Random Forest, SVM
5.3.2 Ada Boost and XG Boost
5.3.3 Stacking Classifier
5.4 Comparative analysis of various parameters across different methodologies
5.5 Results of application
5.6 Comparison with previous years
Conclusion
References
LIST OF FIGURES
Fig no Fig Name
1.4.1 Common Milk Adulteration
3.1 Milk Testing Cycle
3.1.1 Milk Calibration Kit
[Link] Temperature Sensor DS18B20
[Link] Color Sensor TCS 34725
[Link] Calibration of Fat
[Link] pH Strips
[Link] Turbidity Sensor
3.2.1 Raspberry pi and Pin Diagram
3.2.2 Pin Configuration
4.1.1 Google Collab
5.1.1 Feature Importance of RF
[Link] RF Shap Analysis
[Link] Comparison of Different Algorithms
[Link] Confusion Matrix of Random Forest
[Link] Confusion Matrix of SVM
[Link] Confusion Matrix of Ada Boost
[Link] Confusion Matrix of XG Boost
[Link] Web Application output1
[Link] Web Application output2
LIST OF TABLES
CHAPTER-1
1. INTRODUCTION
Milk is considered one of the most essential dietary staples consumed globally due to its rich
nutritional profile and versatility. For better health, it is crucial to consume high-quality milk.
However, milk is a perishable product and even a minor contamination can lead to significant
spoilage and economic loss. Spoiled milk can become a breeding ground for millions of bacteria
within a short time, posing serious health risks. In developing countries like India, the situation is
exacerbated by inconsistent quality control and inadequate regulation. According to an FSSAI
survey, 68.4% of the milk samples tested in India were found to be adulterated, raising grave
concerns regarding consumer safety and public health.
Adulterated milk can be a source of various diseases including brucellosis, tuberculosis, and
listeriosis. Traditional laboratory-based milk quality tests are accurate but time-consuming,
expensive, and require technical expertise, making them unsuitable for routine testing in rural or
small-scale settings. This necessitates the development of intelligent, real-time, and cost-effective
methods for detecting milk adulteration.
Recent advances in machine learning (ML) and sensor technologies offer promising
alternatives. These tools can analyze complex physical and chemical data to detect
adulterants instantly. The objective of this study is to build a smart adulteration detection
system that uses AI models trained on sensor data to evaluate milk quality in real-time. The
system uses sensors to measure parameters like pH, turbidity, temperature, color, and fat
content, and applies ML models such as SVM, Random Forest, AdaBoost, XGBoost, and
Stacking Classifier for prediction.
In this work, a Flask-based user interface was developed to deliver results immediately, empowering
users like farmers, consumers, and dairy processors to assess milk quality on-site. This system also
supports calibration, which helps the AI distinguish between pure and adulterated milk, regardless of
the source or conditions. It not only provides real-time feedback but also logs data for further
analysis, making it valuable for continuous monitoring and policy-making.
1.1 Problem Statement
Milk adulteration remains a major public health and ethical challenge, particularly in regions where
regulatory oversight is limited. Substances such as water, starch, detergents, urea, and synthetic
chemicals are commonly used adulterants that compromise milk's safety and nutritional value.
Current detection methods are unsuitable for daily field use as they are slow, expensive, and require
laboratory setups. Moreover, natural variability in milk due to region, breed, or feeding conditions
complicates the detection process, making calibration a critical issue. This project addresses these
challenges by developing a smart, AI-driven detection system that offers real-time, affordable, and
user-friendly milk quality assessment.
1.2 Objectives
The objectives of this project are:
● To identify common adulterants in milk and understand their effect on milk’s physical and
chemical properties.
● To integrate multiple sensors to measure pH, turbidity, temperature, conductivity, and color
of milk.
● To develop a robust calibration model that differentiates between pure and adulterated milk.
● To apply ML algorithms (e.g., Random Forest, SVM, AdaBoost, XGBoost) for accurate
classification.
● To implement a user-friendly web interface for real-time result display and data logging.
● To ensure the system is scalable and affordable, targeting use in households and dairy
industries.
● To validate system performance through experimental comparison with conventional testing
methods.
1.3 Background on Milk Adulteration
Milk, often referred to as a complete food, is indispensable to human nutrition.
However, rising demand and limited supply have encouraged unethical practices of
adulteration to increase volume and profit. Adulteration involves the addition of non-
nutritive or harmful substances that alter milk’s quality. In many developing nations, this
issue is compounded by lack of monitoring, weak enforcement of food safety standards,
and limited public awareness. Many adulterants are difficult to detect due to their colorless,
tasteless, and soluble nature.
Traditional lab-based testing methods, though effective, are time-consuming and resource-intensive.
There is a growing demand for smart technologies that allow real-time testing at collection points,
homes, and retail outlets. Understanding how adulterants affect milk’s properties like pH, density,
and viscosity is essential for building an AI-powered detection model.
into supervised machine learning models. Feature engineering and model training are performed
using algorithms like Support Vector Machine (SVM), Random Forest, AdaBoost, XGBoost, and a
custom Stacking Classifier for improved accuracy. The trained model is then deployed through a
Flask-based web interface, where users can input real-time sensor data and instantly receive
predictions on milk purity. This streamlined methodology ensures a balance between accuracy,
speed, and user accessibility, enabling practical deployment in rural and urban settings alike.
CHAPTER-2
2. LITERATURE REVIEW
Milk plays a crucial role in the diet of millions around the world and is a key source of
nutrients such as calcium, protein, and vitamins. With increasing dairy production and
consumption, ensuring the quality and accurate measurement of milk has become critical.
Manual methods of milk collection and quality testing are often prone to human error,
inefficiency, and inconsistencies. In response, several technological innovations have
emerged aiming to automate and enhance the accuracy of milk collection systems. This
literature review explores past research, technologies, and developments related to milk
calibration systems, quality sensing mechanisms, and automation in dairy technology.
The traditional methods of milk collection involve manual weighing and fat content
estimation using lactometers and other simple tools. These techniques, although widely used,
often lack precision and are not scalable for large-scale operations. Studies such as those by
Patil et al. (2017) have emphasized the need for digital systems capable of capturing
accurate measurements of milk quantity and quality to ensure transparency and trust between
farmers and dairy processing units.[1]
Modern milk collection systems now employ digital weighing machines, RFID-based farmer
identification, and microcontrollers like Arduino or Raspberry Pi for automating the
collection process. These innovations have drastically reduced manual intervention and
enhanced the speed of operations. For instance, Kumar and Desai (2020) proposed an IoT-
enabled milk collection unit that records milk weight, temperature, and fat percentage in real-
time, storing data on cloud platforms for traceability and analytics.[2]
Two of the most critical quality parameters in milk are fat percentage and Solids-Not-Fat
(SNF) content. Traditional fat testing methods, such as the Gerber method, although reliable,
are time-consuming and labor-intensive. Research by Goyal and Bansal (2018) introduced
the use of ultrasonic sensors for non-invasive fat and SNF detection, demonstrating
significant potential in real-time milk quality analysis.
Recent developments have integrated optical sensors, electrical conductivity sensors, and
refractometers to improve the accuracy of fat content analysis. Machine learning algorithms
are also being used to calibrate sensors based on historical milk quality data, thus improving
prediction accuracy. These systems can identify anomalies or adulterations in milk by
With the rise of Industry 4.0, the dairy industry is increasingly embracing smart technologies.
IoT (Internet of Things), AI (Artificial Intelligence), and data analytics are now being
applied to develop smart dairy systems. Automated milk analysis machines can now detect
over 20 parameters, including fat, protein, SNF, density, temperature, and presence of
contaminants.
For instance, Nestlé and Amul have invested in automated milk collection centers that
leverage RFID, cloud databases, and real-time monitoring to manage large-scale milk
procurement from thousands of farmers. These systems not only ensure fair payment but also
empower farmers with insights about milk quality trends over time.
The work by Sinha et al. (2019) on "Smart Dairy Farming" shows how predictive analytics,
using quality data collected over time, can be used to improve animal health, optimize feed
quality, and increase milk yield, thereby integrating milk calibration with overall dairy farm
management.[4]
While a lot of work has been done in the area of milk collection automation and fat content
analysis, several challenges still persist. The calibration of sensors under variable
environmental conditions remains a concern. Furthermore, many rural and semi-urban dairy
cooperatives lack the infrastructure and training needed to adopt such technologies. There's
also limited research on the long-term stability and calibration drift of low-cost sensors used
in these systems.
Moreover, affordability and maintenance remain a bottleneck, especially in developing
countries. Studies suggest the need for low-cost, modular, and easily maintainable systems
that can be adopted at the grassroots level. Additionally, data security and the privacy of milk
transaction data are emerging concerns in digitally connected systems.
Sensors require routine calibration to maintain accuracy. Factors like temperature, humidity,
dust, and electrical noise can affect readings over time.
In a paper by Jadhav et al. (2020), a smart calibration algorithm was proposed for fat sensors,
which recalibrated based on known reference values collected periodically. This approach
reduced error by 15% over 3 months.[5]
Future work may involve self-calibrating systems that learn from historical data, powered by AI
and machine learning.
Arduino microcontrollers have transformed prototyping and low-cost embedded design. They
support analog/digital inputs, serial communication, and can interface with GSM, RFID, and
cloud platforms.
Chavan et al. (2019) demonstrated a full dairy automation system using Arduino Uno, capable
of weight logging, fat detection, and sending SMS receipts to farmers. The cost-effectiveness
and simplicity of the Arduino ecosystem make it ideal for rural deployment.[6]
Modern milk calibration systems store historical data on cloud platforms or SD cards. This not
only helps maintain records but also ensures traceability—a critical factor in modern food safety
norms.
IoT-enabled systems can track milk from collection point to processing center, with real-time
alerts on quality deterioration. Projects like NDDB’s e-Milk initiatives in India have shown how
digital traceability can reduce spoilage and enhance farmer accountability.
Recent research focuses on machine learning to analyze large datasets from milk calibration
systems. Features like temperature, fat percentage, and SNF content can predict milk spoilage or
contamination.[7]
A study by Bhatt and Narayan (2022) used Random Forest algorithms to predict milk
adulteration with over 90% accuracy. These tools help processing units reject poor-quality milk
before it enters the supply chain.[8]
Fat percentage and Solids-Not-Fat (SNF) content are central to determining milk value. While
Gerber and Babcock tests remain common, modern systems aim to provide real-time digital
alternatives.
Studies by Singh et al. (2018) introduced infrared spectroscopy as a reliable method for fat analysis.
Others explored ultrasonic methods that measure the velocity of sound through milk to infer its
composition.[9]
A challenge remains in creating low-cost, rugged sensors for fat/SNF detection that can withstand
field usage without frequent recalibration.
The literature clearly demonstrates a global trend toward digitizing and automating milk calibration
processes. Systems that integrate sensors, microcontrollers, and wireless communication not only
improve accuracy but also increase transparency and trust in the dairy value chain.[10]
However, successful adoption hinges on affordability, ruggedness, and ease of use. Future research
should focus on self-calibrating, AI-enhanced sensors, offline-compatible GSM systems, and
modular designs that can be easily scaled in low-resource environments. By aligning technological
advancements with local needs, the dairy industry can significantly enhance productivity and fairness
for both producers and processors.
CHAPTER-3
Methodology
[Link]
Introduction
The system detects milk adulteration using sensor data and machine learning. Sensors
measure key milk properties (turbidity, temperature, color, and pH) using devices like the
DS18B20, TCS34725, and pH strips. These readings are collected via a Raspberry Pi and
stored for processing.
Labeled data from pure and adulterated samples is used to train various ML models
including Random Forest, SVM, AdaBoost, and XGBoost. A Stacking Classifier, which
combines multiple models, achieved the best accuracy and was chosen for deployment.
Fig 3.1 Milk Testing Cycle
Pure milk generally appears white or slightly creamy. If it is adulterated with substances like
synthetic milk, detergents, or starch, the color may subtly shift. These changes might not be visible to
the human eye but can be quantified by the TCS34725, making it highly useful in quality testing
applications.
Implementation in the Project:
In this milk adulteration detection system, the TCS34725 color sensor is interfaced with the
Raspberry Pi using I2C communication. The sensor captures the RGB values of each milk sample
and sends the digital readings to the controller. These values are then included in the feature set used
for training the machine learning model. During live tests, these readings help classify the milk as
good or bad based on previously learned patterns.
3.1.3 Fat Calculation Using LED and LDR
Milk fat content is a critical parameter for determining milk quality and commercial value.
Traditional methods for measuring fat, such as the Gerber or Babcock methods, are manual, time-
consuming, and require chemical reagents. To overcome these limitations, a non-invasive, optical
method using LED and LDR (Light Dependent Resistor) is employed in this smart system for
estimating fat concentration in milk. Fat is calculated using the LDR and LED as shown in figure [Link].
Working Principle:
This method is based on the light scattering phenomenon. Milk is an emulsion where fat globules
scatter incident light. The amount of light that passes through milk and reaches the LDR varies with
fat content:
Higher fat content → More scattering → Less light reaches LDR
Lower fat content → Less scattering → More light reaches LDR
In this setup:
An LED emits light through a test tube containing milk.
The LDR is placed directly opposite the LED.
As milk scatters the light, the intensity of light falling on the LDR changes based
on the fat concentration.
The LDR’s resistance increases with lower light intensity (higher fat) and
decreases with higher light intensity (lower fat).
The resistance (R) of an LDR is inversely proportional to the light intensity (L) falling on it:
$R \propto \frac{1}{L}$
Alternatively, fat can be inversely estimated using voltage across the LDR (since it forms a voltage
divider with a fixed resistor):
$V_{LDR} = V_{in} \cdot \frac{R_{LDR}}{R_{LDR} + R_{fixed}}$ ----------[2]
From calibration:
$F \approx k \cdot \frac{1}{V_{LDR}} + c$
Where k and c are constants derived experimentally.
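The two relations above can be sketched in a few lines of Python. This is only an illustration, assuming the calibration takes the form F ≈ k / V_LDR + c; the supply voltage, resistor values, and the constants k and c below are hypothetical placeholders, not the values measured in this project.

```python
def ldr_voltage(v_in: float, r_ldr: float, r_fixed: float) -> float:
    """Voltage across the LDR in a divider with a fixed resistor."""
    return v_in * r_ldr / (r_ldr + r_fixed)


def estimate_fat(v_ldr: float, k: float, c: float = 0.0) -> float:
    """Calibration relation F ~ k * (1 / V_LDR) + c."""
    if v_ldr <= 0:
        raise ValueError("V_LDR must be positive")
    return k / v_ldr + c


# Hypothetical example: 3.3 V supply, 10 kOhm fixed resistor, 10 kOhm LDR reading.
v = ldr_voltage(v_in=3.3, r_ldr=10_000.0, r_fixed=10_000.0)
# k and c are placeholder constants; real values come from calibration samples.
fat = estimate_fat(v, k=6.6, c=0.0)
print(f"V_LDR = {v:.2f} V, estimated fat = {fat:.1f} %")
```

In practice k and c would be fitted against milk samples whose fat content is known from a reference method.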
The pH value of milk is a critical indicator of its freshness and overall quality. Fresh milk typically
has a pH between 6.5 and 6.8, which indicates a slightly acidic nature. Any deviation from this
range can signal spoilage, bacterial growth, or adulteration. In the proposed smart milk adulteration
detection system, pH measurement is performed manually using pH paper strips. This method is
chosen due to its simplicity, cost-effectiveness, and quick indication of acidity or alkalinity.
One of the primary advantages of using Raspberry Pi is its support for Python and other
programming languages, which makes it easy to integrate AI-based algorithms for detecting
patterns and anomalies in milk characteristics. These algorithms can be trained to distinguish
between pure and adulterated milk using sensor data. Once trained, they can be deployed on
the Raspberry Pi to provide real-time detection and alerts.
Features:
1. Processor & Performance
Recent models such as the Raspberry Pi 4 include a quad-core ARM Cortex-A72
processor clocked at up to 1.8 GHz, with 1GB to 8GB RAM options; the Raspberry Pi 5
raises the clock speed to 2.4 GHz. This enables the board to handle moderate data
processing, edge computing, and even AI workloads without needing a full computer.
2. GPIO Pins (General Purpose Input/Output)
Raspberry Pi boards come with a 40-pin GPIO header for connecting sensors and modules.
These pins read digital inputs directly (for example from the temperature sensor), while
analog sensors such as the turbidity sensor need an external ADC. The pins can also
control actuators, displays, etc.
3. Connectivity Options
o Wi-Fi and Bluetooth for wireless communication
o USB ports for peripherals like keyboards, cameras, and storage
o Ethernet for stable, high-speed networking
o HDMI for connecting to displays
4. Operating System Support
Raspberry Pi runs Raspberry Pi OS (formerly Raspbian), a Linux-based system. It also
supports Ubuntu, Windows IoT Core, and others. This flexibility allows you to program
in Python, C, Node.js, or any supported language.
Pin Classification
The 40 header pins are divided into:
Pin Type      Description
Power Pins    Provide 3.3V or 5V output for powering sensors and modules
Ground        Common ground connection for all electronic components
1. DS18B20 Temperature Sensor
o Requires a 4.7kΩ pull-up resistor between the data and VCC lines.
o Measures milk temperature and sends data to the Pi via GPIO.
2. Turbidity Sensor
o Connected through analog-to-digital converter (ADC) like MCP3008, since the
Raspberry Pi does not have an analog input.
o Turbidity values help detect dilution or foreign particles.
3. TCS34725 Color Sensor
o Communicates with Raspberry Pi via I2C interface.
o Helps determine the color intensity of milk to assess quality and adulteration.
4. LED-LDR Fat Measurement Setup
o The LED is powered by a GPIO pin.
o LDR is connected to the ADC to capture analog values of light intensity, which
indirectly indicate fat content.
5. Power Supply
o The system is powered using a 5V DC adapter connected to the Raspberry Pi.
o Additional power management modules like buck converters or power hats may
be used for voltage regulation.
6. Breadboard & Jumper Wires
o Used for prototyping and establishing temporary connections between sensors
and GPIOs.
7. pH Measurement
o Although done manually using strips, the value is entered manually through the
Flask web interface, allowing seamless integration with sensor-based values.
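As an illustration of the temperature path in the list above, the sketch below parses the text that the Linux 1-wire driver exposes for a DS18B20 (a `w1_slave` file under `/sys/bus/w1/devices/28-*/`). The function name and the sample reading are made up for this example; on the Raspberry Pi the same text would be read from the device file rather than from a string.

```python
def parse_w1_slave(text: str) -> float:
    """Return the temperature in degrees Celsius from DS18B20 w1_slave contents."""
    lines = text.strip().splitlines()
    # The first line ends with YES only when the CRC check of the reading passed.
    if len(lines) < 2 or not lines[0].strip().endswith("YES"):
        raise ValueError("CRC check failed or incomplete reading")
    marker = lines[1].find("t=")
    if marker == -1:
        raise ValueError("temperature field not found")
    # The driver reports the temperature in thousandths of a degree Celsius.
    return int(lines[1][marker + 2:]) / 1000.0


# Made-up sample reading in the format produced by the 1-wire driver.
sample = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
print(parse_w1_slave(sample))
```

Checking the CRC line before trusting the value avoids logging a corrupted reading into the training data.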
System Flow:
Upon powering up, the Raspberry Pi initializes all sensors.
Data is collected in real-time from each connected sensor.
Values are stored locally in CSV format for training and testing purposes.
During live testing, real-time values are passed to the Flask backend to run
predictions using the trained machine learning model.
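The local CSV logging step in this flow might look like the following minimal sketch. The column names and the sample values are assumptions chosen for illustration, and an in-memory buffer stands in for the actual CSV file so the example runs anywhere.

```python
import csv
import io

# Hypothetical column layout; the project's real CSV may use different names.
FIELDS = ["temperature", "turbidity", "red", "green", "blue", "fat", "ph", "label"]


def append_reading(handle, reading: dict, write_header: bool = False) -> None:
    """Write one sensor reading as a CSV row, optionally preceded by the header."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(reading)


# io.StringIO stands in for open("readings.csv", "a", newline="") on the Pi.
buffer = io.StringIO()
append_reading(
    buffer,
    {"temperature": 23.1, "turbidity": 512, "red": 240, "green": 238,
     "blue": 230, "fat": 4.0, "ph": 6.6, "label": "pure"},
    write_header=True,
)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(rows[0]["ph"], rows[0]["label"])
```

Keeping the manually entered pH value in the same row as the sensor values means the training script can load one table without any joins.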
Advantages of Using Raspberry Pi:
Portable and lightweight.
Supports Python-based sensor interfacing and machine learning libraries.
Easily integrated with web technologies like Flask.
1. Support Vector Machine (SVM): A classifier that finds the best boundary between classes
by maximizing the margin.
SVM tries to find the optimal hyperplane that:
1. Separates the classes.
2. Maximizes the margin between the closest points of each class (called support vectors) and
the hyperplane.
SVM Equations (Linear Case)
1. Equation of a Hyperplane
In an n-dimensional space:
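$w^{T}x + b = 0$ -----[1]
This ensures that data points are on the correct side of the margin.
3. Objective (Hard Margin SVM)
Find w and b that minimize:
$\frac{1}{2}\lVert w \rVert^{2}$ ---[1]
subject to:
$y_i (w^{T}x_i + b) \geq 1 \quad \forall i$ -----[2]
Objective becomes:
$\min_{w,\, b,\, \xi} \left( \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \right)$ -----[4]
Subject to (in the dual formulation, where $\alpha_i$ are the Lagrange multipliers):
$0 \leq \alpha_i \leq C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$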
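To make the soft-margin objective concrete, the sketch below evaluates it for a fixed linear classifier. It assumes the standard hinge form of the slack, ξ_i = max(0, 1 − y_i(wᵀx_i + b)); the data points, w, b, and C are hypothetical values picked so the arithmetic is easy to follow, and no training is performed.

```python
def decision(w, b, x):
    """Signed score w^T x + b of a point x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slacks) for 1/2 ||w||^2 + C * sum(slack_i)."""
    slacks = [max(0.0, 1.0 - yi * decision(w, b, xi)) for xi, yi in zip(X, y)]
    objective = 0.5 * sum(wj * wj for wj in w) + C * sum(slacks)
    return objective, slacks


# Hypothetical toy data with labels in {-1, +1} and a hand-picked hyperplane.
X = [(2.0, 2.0), (0.0, 0.0), (1.0, 1.0)]
y = [1, -1, 1]
w, b = (1.0, 1.0), -2.0

objective, slacks = soft_margin_objective(w, b, X, y, C=1.0)
print(objective, slacks)
```

The third point lies exactly on the hyperplane, so it violates the margin and is the only one that contributes slack.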
2. AdaBoost: An adaptive boosting algorithm that combines weak learners to form a strong
classifier.
AdaBoost, short for Adaptive Boosting, is a powerful ensemble learning algorithm that combines
multiple weak learners (often decision stumps) into a strong classifier by focusing on the errors
of previous learners.
Algorithm and Equations
1. Initialize weights
For N training examples $(x_1, y_1), \ldots, (x_N, y_N)$, where $y_i \in \{-1, +1\}$:
$w_i^{(1)} = \frac{1}{N}$ for all i ---[1]
$\epsilon_t = \frac{\sum_{i=1}^{N} w_i^{(t)} \cdot \mathbb{1}(h_t(x_i) \neq y_i)}{\sum_{i=1}^{N} w_i^{(t)}}$ ---[2]
Where:
$\mathbb{1}(\cdot)$ is the indicator function (1 if incorrect, 0 if correct).
Normalize:
$w_i^{(t+1)} = \frac{w_i^{(t+1)}}{\sum_{j=1}^{N} w_j^{(t+1)}}$ ---[5]
Summary of Components
Term              Description
$h_t(x)$          Weak classifier at round t
$\epsilon_t$      Weighted error of $h_t$
$\alpha_t$        Weight (confidence) of $h_t$
$w_i^{(t)}$       Weight of training point i at round t
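One boosting round can be traced numerically with the sketch below. The weighted error and the normalization follow the equations in this section; the learner weight α_t = ½ ln((1 − ε_t)/ε_t) and the update w_i ← w_i · exp(−α_t y_i h_t(x_i)) are the standard AdaBoost expressions and are assumed here, since the surviving text does not show them. The labels and predictions are made-up toy values.

```python
import math


def adaboost_round(weights, predictions, labels):
    """One AdaBoost round: weighted error, learner weight, normalized new weights.

    Assumes labels and predictions are in {-1, +1} and that 0 < error < 1.
    """
    total = sum(weights)
    error = sum(w for w, p, t in zip(weights, predictions, labels) if p != t) / total
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified points (t * p == -1) are scaled up, correct ones scaled down.
    updated = [w * math.exp(-alpha * t * p)
               for w, p, t in zip(weights, predictions, labels)]
    norm = sum(updated)
    return error, alpha, [w / norm for w in updated]


labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]  # the weak learner misclassifies the second point
weights = [0.25] * 4           # uniform initial weights, 1/N

error, alpha, new_weights = adaboost_round(weights, predictions, labels)
print(round(error, 3), round(alpha, 3), [round(w, 3) for w in new_weights])
```

After normalization the single misclassified point carries half of the total weight, which is what forces the next weak learner to focus on it.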
Where:
$f_k$ is a regression tree (also called CART).
1. Objective Function
$L(\phi) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$ ----[2]
Where:
$l$ is a loss function (e.g., MSE for regression or log-loss for classification).
$\Omega(f)$ is the regularization term, $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}$:
o T: number of leaves in the tree
o w: leaf weights
o γ, λ: regularization parameters
2. Additive Training (Boosting)
Model is built additively:
$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$ ----[3]
3. Second-order Taylor Approximation
To make optimization efficient, XGBoost uses a second-order approximation of the loss:
$L^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t(x_i)^{2} \right] + \Omega(f_t)$ --[4]
Where:
$g_i$ = first derivative of the loss with respect to the previous prediction $\hat{y}_i^{(t-1)}$
$h_i$ = second derivative of the loss with respect to the previous prediction $\hat{y}_i^{(t-1)}$
Where:
$G_j = \sum_{i \in I_j} g_i, \quad H_j = \sum_{i \in I_j} h_i$ --[5]
This score is used to decide where to split the tree by maximizing gain.
5. Final Prediction
After K trees:
$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$ --[6]
Summary of Components
Component         Description
$f_k$             Regression tree (weak learner)
$g_i$, $h_i$      Gradient and Hessian of the loss
$\Omega(f)$       Regularization to penalize complexity
$\hat{y}_i$       Final prediction for input $x_i$
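The role of the per-leaf sums G_j and H_j can be shown with a small numeric sketch. The optimal leaf weight −G/(H + λ) and the split gain used below are the standard XGBoost expressions and are assumed here, because the surviving text does not show them. The example uses squared-error loss l = ½(y − ŷ)², for which g_i = ŷ_i − y_i and h_i = 1; the targets, predictions, λ, and γ are hypothetical.

```python
def leaf_weight(G, H, lam):
    """Optimal weight of a leaf with gradient sum G and Hessian sum H."""
    return -G / (H + lam)


def split_gain(GL, HL, GR, HR, lam, gamma):
    """Reduction in the approximated objective from splitting a leaf in two."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma


# Hypothetical targets and current predictions for four samples.
y = [1.0, 1.0, 0.0, 0.0]
pred = [0.5, 0.5, 0.5, 0.5]
g = [p - t for p, t in zip(pred, y)]  # gradients of squared-error loss
h = [1.0] * len(y)                    # Hessians of squared-error loss

# Candidate split: first two samples go left, last two go right.
GL, HL = sum(g[:2]), sum(h[:2])
GR, HR = sum(g[2:]), sum(h[2:])
lam, gamma = 1.0, 0.0

gain = split_gain(GL, HL, GR, HR, lam, gamma)
print(leaf_weight(GL, HL, lam), leaf_weight(GR, HR, lam), gain)
```

A positive gain means the split lowers the regularized objective; a larger γ makes the tree more reluctant to split.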
Stacking Classifier: A hybrid model that combines multiple base classifiers (like those above) and
uses a meta-classifier to make final predictions.
Stacking (Stacked Generalization) is an ensemble learning technique that combines multiple base
models (often of different types) and uses a meta-model to make the final prediction.
Unlike bagging or boosting, stacking focuses on learning how to best combine the predictions of
several base learners.
Mathematical Formulation
Let’s assume:
D = {(x_i, y_i)}: training data
M base classifiers: h_1, h_2, …, h_M
A meta-classifier H
1. Base Learners Training
Each base learner is trained on the original input data:
h_m(x), for m = 1, …, M ----[1]
2. Meta-Feature Generation
For each training instance x_i, create a meta-feature vector:
$$z_i = [h_1(x_i), h_2(x_i), \ldots, h_M(x_i)]$$ ---[2]
This becomes the input to the meta-classifier.
So, new dataset for the meta-learner:
D′ = {(z_i, y_i)}
To avoid overfitting, these predictions are usually generated via out-of-fold predictions (cross-
validation).
3. Meta-Model Training
Train a meta-classifier H on D′:
H(z) = H(h_1(x), h_2(x), …, h_M(x)) ---[3]
4. Final Prediction
For a new test instance x:
ŷ = H(h_1(x), h_2(x), …, h_M(x))
Each h_m(x) gives a probability or class prediction, and H learns how to best combine them.
Summary of Components
Element Description
h_m(x) Base learners
z_i Meta-feature vector (predictions from base models)
H(z) Meta-model making final prediction
ŷ Final prediction from the stacking model
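The four steps map onto scikit-learn's StackingClassifier, which builds the meta-features from out-of-fold predictions through its cv parameter. The sketch below is a minimal illustration, not the project's training script: the data is synthetic, only two base learners are used to keep it short, and logistic regression as the meta-classifier is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for five sensor features (not the project's dataset).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 1: base learners h_1 ... h_M (here M = 2).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=0))),
]

# Steps 2-3: cv=5 produces out-of-fold meta-features z_i, then H is fit on them.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),  # assumed meta-classifier H
    cv=5,
)
stack.fit(X_train, y_train)

# Step 4: meta-features and final prediction for unseen samples.
Z_test = stack.transform(X_test)   # one column per base learner (binary case)
y_pred = stack.predict(X_test)
print(Z_test.shape, stack.score(X_test, y_test))
```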
CHAPTER-4
SOFTWARE USED
4. SOFTWARE USED
import time
Usage in code:
o time.sleep(3): Pauses the script for 3 seconds between readings so the sensors are not polled too rapidly.
import board
Purpose: Part of the Adafruit_Blinka library, used for handling I2C, SPI, UART pins on
Raspberry Pi.
Usage in code: Required for initializing the I2C interface to communicate with the
TCS34725 color sensor.
import os
Usage in code:
o os.system(...): Loads kernel modules to enable temperature sensor reading via 1-wire.
import glob
Usage in code:
o Locates the device folder where the DS18B20 temperature sensor data is stored.
import csv
Usage in code:
import adafruit_tcs34725
Usage in code:
Usage in code:
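python
import board
import adafruit_tcs34725

i2c = board.I2C()  # default I2C bus (SCL/SDA pins) of the Raspberry Pi
sensor = adafruit_tcs34725.TCS34725(i2c)
sensor.gain = 4  # analog gain; the driver accepts 1, 4, 16, or 60
sensor.integration_time = 154  # integration (exposure) time in milliseconds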
Initializes I2C and sets the sensor’s sensitivity and exposure time to balance brightness
detection.
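python
import glob
import os

os.system('modprobe w1-gpio')   # load the 1-Wire GPIO bus driver
os.system('modprobe w1-therm')  # load the 1-Wire temperature sensor driver
base_dir = '/sys/bus/w1/devices/'
device_folder = glob.glob(base_dir + '28*')[0]  # DS18B20 IDs start with family code 28
device_file = device_folder + '/w1_slave'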
Enables 1-Wire interface and locates the temperature sensor path in the device tree.
Turbidity Sensor
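python
import RPi.GPIO as GPIO

TURBIDITY_PIN = 18
GPIO.setmode(GPIO.BCM)  # pin numbering scheme assumed to be BCM
GPIO.setup(TURBIDITY_PIN, GPIO.IN)  # read the sensor's digital output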
Reads raw sensor output from /w1_slave file and parses it for temperature in °C.
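The parsing step described above can be sketched as follows. This is a minimal illustration based on the two-line text format that the Linux w1-therm driver writes to w1_slave; the function names and the sample bytes are hypothetical, and the parser is kept separate from file access so it can be checked without hardware.

```python
def parse_w1_slave(raw_text):
    """Return the temperature in degrees C, or None if the reading is invalid.

    Line 1 ends with "YES" when the CRC check passed.
    Line 2 ends with "t=<temperature in thousandths of a degree C>".
    """
    lines = raw_text.strip().splitlines()
    if len(lines) < 2 or not lines[0].strip().endswith("YES"):
        return None
    marker = lines[1].find("t=")
    if marker == -1:
        return None
    return int(lines[1][marker + 2:]) / 1000.0


def read_temp(device_file):
    """Read and parse the sensor file located in the snippet above."""
    with open(device_file) as f:
        return parse_w1_slave(f.read())


# Illustrative file contents (made up, not captured from the project's sensor).
sample = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
temperature_c = parse_w1_slave(sample)
print(temperature_c)
```

Returning None on a failed CRC lets the caller retry after a short pause instead of logging a corrupt value.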
is_color_white(r, g, b)
python
avg_rgb = round((r + g + b) / 3)
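A complete version of this check might look like the sketch below. Only the averaging line comes from the report; the threshold value and the comparison are assumptions for illustration and would need calibration against real milk samples under the sensor's lighting.

```python
WHITE_THRESHOLD = 200  # hypothetical cut-off on a 0-255 scale, not from the report


def is_color_white(r, g, b, threshold=WHITE_THRESHOLD):
    """Return True when the average of the RGB channels is at least `threshold`."""
    avg_rgb = round((r + g + b) / 3)
    return avg_rgb >= threshold


print(is_color_white(220, 230, 250))  # bright, near-white reading
print(is_color_white(120, 100, 90))   # darker, tinted reading
```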
CSV Logging
python
filename = "[Link]"
if not file_exists:
    writer.writerow(['Timestamp', 'Temperature (C)', 'Is White', 'Milk Purity'])
o Timestamp
o Temperature
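The logging logic can be sketched end to end as below. The column names are taken from the snippet above; the helper name log_reading, the file name readings.csv, and the sample values are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import csv
import os
import tempfile
from datetime import datetime

HEADER = ['Timestamp', 'Temperature (C)', 'Is White', 'Milk Purity']


def log_reading(filename, temperature_c, is_white, purity):
    """Append one reading, writing the header only when the file is new."""
    file_exists = os.path.isfile(filename)
    with open(filename, 'a', newline='') as f:
        writer = csv.writer(f)
        if not file_exists:
            writer.writerow(HEADER)
        timestamp = datetime.now().isoformat(timespec='seconds')
        writer.writerow([timestamp, temperature_c, is_white, purity])


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "readings.csv")  # hypothetical file name
    log_reading(path, 29.5, True, "Good")
    log_reading(path, 30.2, False, "Bad")
    with open(path, newline='') as f:
        rows = list(csv.reader(f))

print(rows)
```

Opening the file in append mode keeps earlier readings, so the same file can accumulate the training dataset across sessions.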
The first step involved data preprocessing and exploration. Sensor readings such as
temperature, turbidity, RGB color values, fat content, and pH were collected using a Raspberry Pi
connected with sensors. These data points were recorded and exported as CSV files, which served as
the foundation for model training. This dataset was uploaded to Google Colab where data cleaning,
normalization, and feature selection were performed using Python libraries such as Pandas, NumPy,
and Scikit-learn.
Once the dataset was clean and structured, various machine learning classification models
were trained. The models used include Random Forest, Support Vector Machine (SVM), AdaBoost,
XGBoost, and a Stacking Classifier — an ensemble technique that combines the strengths of
multiple models. These models were chosen for their robustness, accuracy, and ability to handle non-
linear and imbalanced data distributions, which are common in real-world sensor datasets. Google
Colab facilitated this process by allowing smooth integration with key machine learning libraries like
Scikit-learn, XGBoost, and Matplotlib for visualizing performance.
The dataset was split into training and testing sets, usually at a 70:30 ratio. Feature scaling
techniques such as Standard Scaler were applied to ensure uniformity across different units and
ranges. Model training was conducted with cross-validation to minimize overfitting, and the
evaluation metrics included Accuracy, Precision, Recall, and F1-score to ensure comprehensive
performance measurement.
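The split, scaling, cross-validation, and metric steps described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the project's notebook: the data is synthetic, Random Forest stands in for any of the models, and the fold count and random seeds are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the sensor dataset (five features, two classes).
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, random_state=42
)

# 70:30 split, stratified so both classes keep their proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

# Putting the scaler inside the pipeline refits it on each CV training fold,
# so no statistics leak from the validation fold.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=42),
)
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted"
)
print(cv_scores.mean(), accuracy, precision, recall, f1)
```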
Among all models, the Stacking Classifier emerged as the most effective, achieving an
accuracy of 91.6%, slightly outperforming the standalone models. The stacking approach leverages
predictions from base learners (e.g., Random Forest and SVM) as inputs to a meta-learner, which
learns how to best combine them. This technique is particularly beneficial in complex classification
problems like milk adulteration detection, where patterns can vary based on subtle sensor
fluctuations.
In conclusion, Google Colab served as a powerful platform for implementing, testing, and
validating the machine learning models. Its resource-rich environment enabled rapid prototyping and
thorough experimentation, leading to a reliable AI-driven solution for real-time milk adulteration
detection.
Fig 4.1.1 shows the Google Colab interface. Colab is used to train the models and generate the
.pkl file that stores the trained model.
Library: pandas
Usage:
2. import numpy as np
Library: numpy
Usage:
3. import pickle
Usage:
Usage:
o train_test_split(...): Divides data into training (80%) and testing (20%) sets.
Library: scikit-learn
Purpose: Importing the Random Forest algorithm, which is an ensemble of decision trees.
Usage:
Library: scikit-learn
Usage:
Library: scikit-learn
Usage:
Library: matplotlib
Usage:
o matplotlib.pyplot functions: Used to plot and customize the feature importance bar chart.
Library: seaborn
Usage:
o seaborn.barplot(...): Creates a bar plot of the feature importances using color palettes
like "viridis".
1. Import Libraries – Tools required for data handling, model training, and visualization.
4. Split Data – Separate features (X) and target (y), then split into training and test sets.
9. Feature Importance Plot – Show which features matter most using matplotlib and seaborn.
SVM Algorithm:
1. import pandas as pd
Library: pandas
Used for:
2. import numpy as np
Library: numpy
Used for:
3. import pickle
Purpose: Serialization.
Used for:
Used for:
Library: scikit-learn
Used for:
o SVC(kernel='rbf', ...): Creates an SVM classifier using the Radial Basis Function
kernel.
Library: scikit-learn
Purpose:
Used for:
o Scaling features before SVM training, which is crucial for optimal performance of
SVM.
Library: scikit-learn
Used for:
Library: matplotlib
Used for:
o You can use it to create graphs or plots (e.g., learning curves, confusion matrices) in
future expansions of your project
Library: seaborn
Note: Not used in the current script, but included for possible future plotting (like heatmaps
or pair plots).
Library: pandas
Used for:
2. import numpy as np
Library: numpy
Used for:
o Efficient manipulation of arrays, although not used directly here—it often supports
backend operations in machine learning.
Library: xgboost
Used for:
o It is one of the most powerful ensemble learning models using decision trees under
the hood.
Library: scikit-learn
Used for:
o train_test_split(...): Splits the dataset into 80% training and 20% testing.
Library: scikit-learn
Used for:
Library: scikit-learn
Used for:
7. import joblib
Used for:
XG Boost Algorithm:
import pandas as pd
Library: pandas
Use in script:
import numpy as np
Library: numpy
Use in script:
import pickle
Use in script:
Library: xgboost
Use in script:
Purpose:
Purpose:
o LabelEncoder: Converts categorical labels (like "Good", "Bad") into numeric values.
o StandardScaler: Standardizes features to have zero mean and unit variance, which
helps improve model performance especially for algorithms sensitive to feature
scaling.
Use in script:
o classification_report: Provides precision, recall, F1-score, and support for each class.
Library: matplotlib
Note: Although imported, it's not used in this script, but can be used to plot feature
importance or confusion matrices.
Library: seaborn
Note: Also imported but not used in this script; typically used for heatmaps, boxplots, and
feature importance graphs.
The entire application is powered by a Flask-based backend, which serves as the intermediary
between the Raspberry Pi hardware, the machine learning model, and the frontend interface. The
choice of Flask is justified by its lightweight nature, ease of integration, and rapid deployment
capabilities.
Backend Workflow:
1. Sensor Input Handling: Raspberry Pi sends real-time sensor values to the Flask server via
HTTP requests or direct GPIO data streams.
2. Model Prediction: Once all parameters (temperature, turbidity, color, pH, and fat) are
received, Flask loads the trained Stacking Classifier model and performs predictions.
3. Routing: Flask defines endpoints (routes) such as /, /predict, /get_data, etc., to handle
frontend requests.
4. Template Rendering: Flask uses the Jinja2 template engine to render HTML pages
dynamically based on data from sensors and prediction results.
Frontend Technologies:
JavaScript for dynamic updates (using fetch() or AJAX for real-time data).
The Flask app also logs each prediction and sensor reading to a CSV or database, which can be
viewed later through the UI.
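A /predict route of the kind described in the backend workflow could be sketched as below. This is a hedged illustration, not the project's server code: the JSON field names, the PlaceholderModel class, and its pH rule are hypothetical stand-ins for the trained Stacking Classifier that the real application would load from its .pkl file.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical field names for the five inputs named in the workflow.
FEATURES = ["temperature", "turbidity", "color", "ph", "fat"]


class PlaceholderModel:
    """Stand-in for the trained classifier; the rule below is arbitrary."""

    def predict(self, rows):
        ph_index = FEATURES.index("ph")
        return ["Good" if 6.4 <= row[ph_index] <= 6.8 else "Bad" for row in rows]


model = PlaceholderModel()


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": "missing fields", "fields": missing}), 400
    try:
        row = [float(payload[name]) for name in FEATURES]
    except (TypeError, ValueError):
        return jsonify({"error": "non-numeric value"}), 400
    return jsonify({"quality": model.predict([row])[0]})


# Exercise the route with Flask's built-in test client (no server needed).
client = app.test_client()
response = client.post(
    "/predict",
    json={"temperature": 29.5, "turbidity": 3.6, "color": 233, "ph": 6.6, "fat": 0},
)
result = response.get_json()
print(response.status_code, result)
```

Validating the payload before calling the model keeps malformed sensor packets from reaching the classifier and gives the frontend a clear error to display.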
Real-time data monitoring is a crucial component of this system, providing users with live feedback
on milk quality. This capability allows users to take immediate action if adulteration is detected.
Sensor Data Update: Raspberry Pi reads sensor values (temperature, turbidity, color)
continuously and sends them to the Flask server.
AJAX/JavaScript Polling: The frontend uses JavaScript to periodically (e.g., every 2–3
seconds) send fetch requests to a Flask endpoint like /get_data.
Dynamic DOM Update: Once new sensor data is received, JavaScript updates the UI
without reloading the page.
Real-time Prediction: As soon as all necessary features are available, the system calls the
/predict route and displays the updated milk quality result.
Visual Cues: Sensor values and the quality label are updated instantly, with animated
transitions and colour changes to draw user attention.
Additional Features:
Alerts (Optional): Pop-up notifications or buzzer activation if bad quality milk is detected.
This real-time component ensures that the milk quality is continuously monitored without requiring
manual intervention, which is essential in environments like dairy farms and milk collection centers.
1. Flask (Backend)
Flask is a lightweight Python-based micro web framework ideal for creating machine learning-based
APIs and dynamic web applications. It handles server-side logic, routing, and communication
between the frontend and the trained model.
Key Flask Features Used:
Routing: Handles different endpoints such as the home page (/), prediction endpoint
(/predict), and data upload.
Form Handling: Receives user input from frontend HTML forms (sensor values or CSV
upload).
Model Integration: Loads the trained stacking classifier model using joblib or pickle and
uses it to predict the quality of milk based on user inputs.
JSON Support: Sends back prediction results to be displayed on the web page.
Template Rendering: Uses render_template() to link Python logic to HTML pages using
Jinja2.
2. HTML (Frontend Structure)
HTML (HyperText Markup Language) structures the webpage by creating forms, buttons, labels,
and display sections. It forms the backbone of the user interface.
Key Elements Used:
<form>: For collecting user inputs (sensor values like temperature, turbidity, etc.)
<input type="text">: For user data entry
<button>: For submitting prediction request
<div> and <span>: For layout and results display
3. CSS (Styling)
CSS (Cascading Style Sheets) enhances the visual appearance of the web interface, ensuring a clean,
modern, and responsive design that is user-friendly.
Features Used:
Colour Themes: Blue/green tones for healthy feedback, red tones for adulteration warnings.
Responsive Design: Ensures compatibility with mobile and desktop devices.
Button Styling: For better UI/UX interactions.
Container Formatting: To center content and make it visually appealing.
4. JavaScript (Dynamic Interactivity)
JavaScript is used to enhance interactivity by handling form validation, adding alerts, and updating
the DOM without refreshing the page.
Applications in the Web App:
Form Validation: Prevents empty or invalid entries before sending them to the backend.
Loading Animation: Indicates prediction is being processed.
Result Display: Displays the prediction result dynamically using innerHTML.
The web interface successfully bridges the hardware-sensor system with the machine learning model,
offering a seamless and intuitive platform for real-time milk quality detection. With Flask managing
the logic and machine learning predictions, and HTML/CSS/JS creating a user-centric front end, the
application delivers functionality, reliability, and a professional look suitable for field use and
consumer awareness.
User Interface Design:
The user interface (UI) of the Smart Milk Adulteration Detection System is designed with simplicity,
accessibility, and real-time interaction in mind. Since the end-users may include milk producers,
quality inspectors, and even consumers with non-technical backgrounds, the UI follows a clean and
intuitive layout that makes the experience seamless.
The interface is built using HTML, CSS, and JavaScript, ensuring responsiveness across various
devices including desktops, tablets, and smartphones. The design includes a dashboard-like layout,
clearly displaying sensor inputs and prediction results.
Key Features of the UI:
Header Navigation: Simple navbar to navigate between the dashboard, prediction logs, and
help/documentation.
Sensor Data Display: Dedicated sections that show real-time values of temperature,
turbidity, colour, pH, and fat.
Prediction Panel: A prominent section where the milk quality is shown as either “Good” or
“Adulterated”, based on model predictions.
Colour Indicators: Green for good quality and red for adulterated milk, enhancing visual
interpretation.
Minimal Input Fields: Since most data is fetched directly from the Raspberry Pi and
sensors, the manual inputs are minimal (only fat and pH values if not sensor-integrated).
Responsive Layout: The design adapts based on screen size, using CSS Flexbox/Grid and
media queries.
The primary design goal was to keep the interface uncluttered, with essential information easily
accessible, ensuring usability for all categories of users.
CHAPTER - 5
RESULTS
5.1 Results of the Hardware Implementation:
The hardware prototype for the smart milk adulteration detection system was
successfully developed and tested using a Raspberry Pi single-board computer integrated
with various sensors. The aim was to detect adulterants in milk samples by measuring key
physical and chemical parameters. The system was evaluated on its data accuracy,
response time, and usability in real-time conditions.
The diagram and results illustrate the performance and feature importance of a Random Forest
model. The feature importance plot highlights pH as the most significant contributor (importance
score ~0.4), followed by Temperature, Color, Fat, and Turbidity, indicating their relative influence
on the model's predictions. The model evaluation reveals strong performance, with a mean
cross-validation accuracy of 0.86 and consistent metrics across classes: precision, recall, and
F1-scores for "Good" (0.88) and "Bad" (0.84) labels demonstrate balanced classification.
Fig [Link] (RF SHAP Analysis) adds an interpretability analysis using SHAP values, which show
how each feature impacts individual predictions. Together, these elements underscore the model's
reliability and the key factors driving its decisions.
The SVM model achieved a mean CV accuracy of 0.8708 and final accuracy of 0.8768, with strong
performance for the "Good" class (recall=1.00) but lower recall for "Bad" (0.72). These results,
detailed in Table [Link], demonstrate the model's effectiveness but highlight class-specific trade-offs
in precision and recall.
[Link] Results of the Ada Boost Algorithm:
Model Accuracy: 0.8768
Precision: 0.8770
Recall: 0.8768
F1-Score: 0.8764
Classification Report:
Classes precision recall f1-score support
Table [Link] summarizes the performance of the Ada Boost algorithm, which achieved a high model
accuracy of 87.68% along with balanced precision, recall, and F1-scores. The model performed
slightly better in identifying "Good" samples, demonstrating strong overall effectiveness in
classifying the dataset.
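The reported Recall (0.8768) equals the reported Accuracy (0.8768), which is expected when recall is averaged over classes weighted by support. The sketch below demonstrates that identity with a made-up confusion matrix; the counts are hypothetical and are not the project's results.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical confusion matrix (rows = actual, columns = predicted):
#              pred Good   pred Bad
# actual Good      90         10
# actual Bad       20         80
support = {"Good": 100, "Bad": 100}
metrics = {
    "Good": per_class_metrics(tp=90, fp=20, fn=10),
    "Bad": per_class_metrics(tp=80, fp=10, fn=20),
}

total = sum(support.values())
accuracy = (90 + 80) / total

# Support-weighted averages, as printed at the top of the report.
weighted_precision = sum(metrics[c][0] * support[c] for c in support) / total
weighted_recall = sum(metrics[c][1] * support[c] for c in support) / total
weighted_f1 = sum(metrics[c][2] * support[c] for c in support) / total

print(accuracy, weighted_precision, weighted_recall, weighted_f1)
```

Because each class's recall times its support is just the number of correctly classified samples in that class, the weighted recall always collapses to overall accuracy; precision and F1 do not.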
SHAP Analysis of Ada Boost Algorithm:
Table [Link] summarizes the performance of the XGBoost classifier, which achieved the
highest accuracy among the standalone models at 89.66%, with a mean cross-validation accuracy of
86.59%. With strong precision, recall, and F1-scores for both classes, XGBoost generalizes well
on this dataset.
The SHAP summary plot illustrates how each feature influences the model's predictions, with
pH having the most significant impact, followed by color and temperature. High and low feature
values are color-coded, revealing that both the magnitude and direction of feature contributions vary
across predictions.
The SHAP dependence plot illustrates how temperature influences the model's output, with
varying SHAP values across its range. The color gradient representing turbidity indicates interaction effects,
suggesting that the impact of temperature on predictions is modulated by turbidity levels.
By combining the predictive power of Random Forest, SVM, AdaBoost, and XGBoost, and using a
meta-learner to synthesize their results, the stacking approach achieves the strongest
classification performance of the models tested. In milk adulteration detection, where the safety
and health of consumers are at stake, this level of reliability is crucial.
Stacking Classifier Accuracy: 0.956256157635468
Classification Report:
precision recall f1-score support
The stacking classifier achieved the highest accuracy of 95.63%, demonstrating exceptional
performance in distinguishing between "Good" and "Bad" classes. With perfect precision for the
"Bad" class and high F1-scores overall, the model shows strong predictive power and balanced
classification.
Stacking Classifier 91.62% 0.92 0.91 0.92
Algorithms were trained and tested. These include Random Forest, Support Vector Machine
(SVM), AdaBoost, XG-Boost, and a Stacking Classifier that combines all the above models. Each
model has unique advantages and limitations, and their effectiveness can vary depending on the
nature of the dataset and classification task.
Data collected from sensors and manual inputs (pH and lactometer) were saved in a structured CSV
format.
1. Preprocessing included normalization and label encoding.
2. 70% of the data was used for training, and 30% was used for testing.
3. The trained model (Stacking Classifier) was deployed in the backend using Python and Flask.
4. Real-time predictions were made within 2 seconds, offering immediate classification results
on the web interface.
powered by a machine learning model or statistical algorithm, to determine the quality of the milk.
This interface appears to be part of a predictive tool aimed at automating quality assessment, though
the exact output or scoring system is not visible in the provided image. The inclusion of parameters
like pH and turbidity suggests a focus on both chemical and physical properties of the milk for
comprehensive evaluation.
The image shows a milk quality prediction interface where the user has entered the following values:
Temperature = 45°C, Fat = 1, Turbidity = 0, Color (RGB) = 256, and pH = 6.5. After submitting
these inputs by clicking the "Predict" button, the system outputs the result: "Milk Quality is Good",
indicating that the analyzed parameters meet the criteria for satisfactory milk quality according to the
underlying evaluation model.
Sample Output of Prediction
Sample ID Temp (°C) Turbidity RGB pH Fat Predicted Class
101 29.5 3.6 (220, 230, 250) 6.6 0 Good
102 30.2 2.4 (210, 215, 240) 5.9 1 Bad
The sample output displays two milk quality predictions: Sample ID 101 with a temperature of
29.5°C, turbidity of 3.6, RGB values (220, 230, 250), pH 6.6, and Fat 0 is classified as "Good", while
Sample ID 102 with a temperature of 30.2°C, turbidity of 2.4, RGB values (210, 215, 240), pH 5.9,
and Fat 1 is labeled "Bad", reflecting the model's assessment based on the input parameters.
Comparison Table:
Study | Methodology | Key Features Used | Result | Dataset | Analysis
Image-Based Detection of Adulterants in Milk Using CNN, ACS Omega, 2024 [3] | CNN | Evaporation patterns of adulterated milk droplets | 84.94% | Custom image dataset | Focused on visual changes in milk drops using CNN image classification.
On the Utilization of Deep and Ensemble Learning to Detect Milk Adulteration, BioData Mining, 2019 [4] | CNN, RF, GBM | FTIR spectral data | RF: 93.23%, GBM: 92.25% | Public FTIR spectral data | Compared traditional and deep models on chemical spectra.
Milk Source Identification and Milk Quality Estimation Using E-Nose, Sensors, 2020 [7] | SVM, RF, Logistic Regression | E-nose sensor data, DHI test data | RF: 94%, LR: 92.5% | Sensor + DHI dataset | Multimodal approach using gas sensor array and lab results.
Prediction of Fresh Milk Quality Using ANN and MNLR, LACCEI, 2023 [8] | ANN, MNLR | pH, SNF, protein, density, freezing point | ANN: R² > 0.90 | Manually collected farm data | Used regression for quality score estimation.
Proposed Project: Smart Milk Adulteration Detection Using AI and Sensors [10] | Stacking Classifier (Random Forest, SVM, AdaBoost, XGBoost) | Temperature, pH, Turbidity, Color, Fat | 95% | Sensor-based custom dataset (Raspberry Pi + DS18B20, TCS34725, LDR, pH paper) | Real-time field system with web interface; integrates sensor data and AI for on-site analysis.
Table 5.3.1 Comparison of Previous papers with Proposed Project
The table compares five studies on milk quality and adulteration detection, highlighting diverse
methodologies and results. The first study (ACS Omega, 2024) used CNN to analyze evaporation
patterns in milk droplets, achieving 84.94% accuracy, while the second (BioData Mining, 2019)
employed deep and ensemble learning on FTIR spectral data, with Random Forest (RF) yielding the
highest accuracy (93.23%). The third study (Sensors, 2020) combined E-nose sensor data with DHI
test results, where RF again outperformed with 94% accuracy. The proposed project stands out by
integrating AI (Stacking Classifier) with real-time sensor data (temperature, pH, etc.), achieving 95%
accuracy and offering a practical, field-deployable solution for on-site milk quality analysis.
CHAPTER – 6
Conclusion
The project “Design of a Smart Milk Adulteration Detection System Using AI and Sensors” has
successfully demonstrated an innovative, cost-effective, and scalable solution to a pressing issue in
public health — the detection of adulteration in milk. Through the intelligent integration of
embedded sensor technology, machine learning algorithms, and a user-friendly web interface, the
system provides a practical approach for real-time milk quality assessment.
By utilizing key sensors such as the DS18B20 temperature sensor, turbidity sensor, and TCS34725
color sensor, the system captures critical physical properties of milk that serve as early indicators of
adulteration. Additional parameters, including pH level (measured via strips) and fat content
(measured using an LED-LDR based optical method), enhance the feature set and allow the model to
make more accurate decisions. Data collected from these sensors is stored in CSV format and used to
train and evaluate several machine learning models including Random Forest, Support Vector
Machine (SVM), AdaBoost, XGBoost, and an ensemble Stacking Classifier.
Among the tested models, the Stacking Classifier showed the highest predictive accuracy, achieving
over 91% accuracy, and demonstrated robustness in handling real-world data. This confirms the
advantage of combining multiple models to improve overall prediction performance. The models
were thoroughly validated using cross-validation, classification metrics (precision, recall, F1-score),
and comparative analysis, all of which affirmed the system's efficiency and reliability.
The final deployment of the system involved implementing the machine learning pipeline on a
Raspberry Pi, making it a compact and portable edge-computing solution. The Flask-powered
backend served as the API for prediction and sensor data handling, while the frontend — built using
HTML, CSS, and JavaScript — enabled intuitive user interaction and real-time result display.
This system not only streamlines the process of milk testing but also minimizes human errors and
dependency on sophisticated laboratory setups. It can be used by dairy farmers, quality control units,
and even consumers for quick milk quality verification. Moreover, its low-cost hardware and
scalable software architecture make it suitable for deployment in rural as well as urban
environments.
In conclusion, the project successfully bridges the gap between conventional milk quality testing and
modern-day technology. It paves the way for future enhancements such as cloud integration, mobile
app interfaces, and AI-powered anomaly detection. By ensuring milk safety at the grassroots level,
this solution contributes meaningfully to public health, consumer protection, and food industry
innovation.
REFERENCES
[1] Walter Francisco Salas-Valerio, Didem [Link], Beatriz [Link] Sakoda, Fanny [Link]˜na-Urquizo,
Christopher Ball, Marcal Plans, Luis Rodriguez-Saona, In-field screening of trans-fat levels
using mid- and near-infrared spectrometers for butters and margarines commercialized in the
Peruvian market, LWT, Volume 157, 1 March 2022, 113074,
[Link]
[2] Agnet, Y. 1998. Fourier transform infrared spectrometry. A new concept for milk and milk
product analysis. Bull. Int. Dairy Fed. 332:58–68.
[3] Sharma, R., Saini, A., Giri, A., & Puri, S. (2017). Development of a portable
milk fat testing device using NIR spectroscopy. International Journal of Current Engineering and
Technology, 7(6), 2144-2149.
[4] AOAC. 2000. Official Methods for Analysis. 17th ed. AOAC International, Gaithersburg, MD.
Barbano, D. M., and J. L. Clark. 1989. Infrared milk analysis: Challenges for the future. J. Dairy Sci.
72:1627–1636.
[5] Rupak Chakravarty, “IT at Milk Collection Centers in Cooperative Dairies: The
National Dairy Development Board Experience”, pp. 37-47.
[6] Vasudha V Ayyannawar and Soumya R Metri, “Detection of Fat in Milk Using Photoconductivity
and Color Detection Technique”, ICT Analysis and Applications, pp. 399–410, Springer, 2020.
[7] ADR (Arbeitsgemeinschaft Deutscher Rinderzüchter e.V.)
(2013) [Link], Bonn, Germany.
Ali AKA, Shook GE (1980) An optimum transformation for somatic cell concentration in milk.
J Dairy Sci 63:487–490.
[8] Albanese D., Visintainer R., Merler S., Riccadonna S., Jurman G. & Furlanello C. 2012. mlpy:
Machine Learning Python. arXiv:1202-6548v2. Balaban M.E. & Kartal E. 2018. Veri madenciliği ve
makine öğrenmesi temel algoritmaları ve R Dili ile Uygulamalar, 2. Basım, Çağlayan Kitap &
Yayıncılık & Eğitim, İstanbul, Türkiye, pp. 48-72.
[9] Galloway, J.A., 2000. Great fare of London. The Lancet, 355, pp. 323–324.
[10] Prasanth, P.; Viswan, G.; Bennaceur, K. Development of a low-cost portable spectrophotometer
for milk quality analysis. Mater. Today 2021, 46, 4863–4868.