Volume 6 , Issue 6 , June – 2021 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Smart Disease Prediction Using Machine Learning


Nivethitha. A
Department of Computer Science Engineering
Sri Ramakrishna Engineering College
Coimbatore, India

Pramoth Krishnan. T
Department of Computer Science Engineering
Sri Ramakrishna Engineering College
Coimbatore, India

Narendran. G
Assistant Professor
Department of Computer Science Engineering
Sri Ramakrishna Engineering College
Coimbatore, India

Abstract:- Disease Prediction Using Machine Learning is a system that predicts a disease based on the information or symptoms the user enters into the system, provides a prescription for that disease, and returns results based on that information. The system also gives the user tips for maintaining their health and a way to identify a disease from this prediction. The user enters their symptoms into the system, and within a few seconds it reports the disease that, to a reasonable extent, matches them. The system is built entirely with machine learning and the Python programming language, with a Tkinter interface, and it predicts the disease using a dataset previously made available by hospitals. The data entered by the user is stored in the database.

Keywords:- Machine Learning, Symptoms based disease prediction, Python, Decision Tree, Random Forest, KNN, Naïve Bayes.

I. INTRODUCTION

With the number of patients and diseases increasing every year, the medical system is overloaded and has, over time, become overpriced in many countries. Most diseases require a consultation with a doctor to obtain treatment. With sufficient data, predicting a disease with an algorithm can be easy and cheap. Predicting a disease from its symptoms is an integral part of treatment. In our project, we have tried to accurately predict a disease from the symptoms of the patient. We used four different algorithms for this purpose and obtained an accuracy of 92-95%. Such a system can have large potential in the medical treatment of the future. We have also built an interactive interface to encourage interaction with the system, and we have tried to show and visualize the results of our study and this project.

Nowadays doctors are adopting many scientific technologies and methods for both the identification and the diagnosis of not only common illnesses but also many fatal diseases. Successful treatment is usually attributed to a right and accurate diagnosis. This project is developed to catch general diseases in their earlier stages. In the competitive environment of economic development, people have become so involved in their work that they are not concerned about their health. According to research, 40% of people ignore a general disease, which leads to a harmful disease later. The main reasons for this neglect are reluctance to consult a doctor and lack of time: people are so busy that they have no time to make an appointment and consult a doctor, which later results in a fatal disease. According to research, 70% of people in India suffer from general disease and 25% of people face death due to early neglect. The main motive for developing this project is that users can sit wherever is convenient and have a check-up of their health. The UI is designed to be simple enough that everybody can easily operate it and have a check-up.

II. RELATED WORK

Many researchers have used machine learning techniques like KNN, Naïve Bayes and Decision Trees to develop disease prediction strategies. Sathyabama Balasubramanian and Balaji Subramani discussed a system that reduces the problem of multiple diseases showing similar symptoms and increases the accuracy of such diagnoses. It achieved 71.53% accuracy. Aditya Arya, Sudhanshu and Rohan Agarwal attempted to show and visualize the results of their study and project. Compared with other techniques, it scores an accuracy of 68.5%. Iqra Anjum, Mohammed Afreed and Mohammed Kalam developed a system which predicts the disease based on the information or symptoms the user enters into the system and provides accurate results based on that information.

Raj H. Chauhan, Daksh N. Naik, Rinal A. Halpati, Sagarkumar J. Patel and Mr. A.D. Prajapati developed a system that analyzes the symptoms provided by the user as input and gives the probability of the disease as output. Disease prediction is done by implementing the Decision Tree classifier, which calculates the probability of the disease. With big data growth in biomedical and health care communities,

IJISRT21JUN313 www.ijisrt.com 330


accurate analysis of medical data benefits early disease detection and patient care.

III. EXISTING SYSTEM

Since the arrival of advanced computing, doctors have relied on technology in many ways, such as surgical imaging and X-ray photography, yet the technology has lagged behind. Diagnosis still depends on the doctor's knowledge and experience because of factors ranging from medical records to weather, atmosphere, blood pressure and many others. A very large number of variables is needed to understand the whole process, and no model has yet analyzed them successfully. To tackle this problem, medical decision support systems should be used; such a system can assist doctors in making the right decision. We apply machine learning to complete, maintained hospital data. Machine learning makes it possible to build models that analyze data quickly and deliver results faster. With it, doctors can make better decisions on patient diagnosis and treatment options, which improves patient care services.

After analyzing and comparing machine learning algorithms, we concluded that Decision Tree, KNN, Naïve Bayes, Regression and Random Forest are all important in building a disease prediction system that predicts the disease a patient is suffering from. For this comparison we used performance measures such as ROC, Kappa statistics, RMSE and MAE, along with various tools. After also using techniques such as neural networks to predict diseases, and after experimenting and verifying the results, we concluded that such a system can reach an accuracy of up to 90%. The existing system can predict the disease but not its subtype, and it fails to predict the condition of the patient; its predictions are indefinite and non-specific.

IV. PROPOSED SYSTEM

In the proposed system we use several techniques, algorithms and tools to build a system that predicts the disease of a patient from their symptoms. The symptoms are compared with the system's previously available dataset, and the system gives a prescription for the disease predicted by the algorithms. By comparing the datasets with the patient's symptoms, we predict the disease of the patient together with an accuracy percentage. The dataset and symptoms go to the prediction model of the system, where the data is pre-processed for future reference. Feature selection is done by the user, who enters the various symptoms. The data is then classified with algorithms such as Decision Tree, KNN, Naïve Bayes and Random Forest. Next the data goes to the recommendation model, which shows the risk analysis involved in the system and provides a probability estimate, that is, how the system behaves when n predictions are made. The recommendation model also makes recommendations for the patient from the final result and from the symptoms; for example, it can show what to use and what not to use, based on the given datasets and the final results.

We combine structured and unstructured data for the overall risk analysis required to predict the disease. Using structured analysis, we can identify the chronic types of disease in a particular region and a particular community. In unstructured analysis, we select the features automatically with the help of algorithms. The system takes symptoms from the user and predicts the disease based on those symptoms and on the previous datasets. It also helps in the continuous evaluation of viral diseases, heart rate, blood pressure, sugar level and more. Combining these with other external symptoms, it predicts the appropriate disease and gives the prescription details for that disease. The data entered by the user is stored in the created database.

V. MODULE DESCRIPTION

The overall proposed system is classified into five modules.

● Collection of Clinical Data
● Data Pre-processing
● Model Building
● Model Building using Prescription
● Database Creation

A. Collection of Clinical Data
This dataset is a knowledge database of disease-symptom associations generated by an automated method based on information in textual discharge summaries of patients at New York Presbyterian Hospital, shown in Fig.1. The first column shows the disease, the second the number of discharge summaries containing a positive and current mention of the disease, and the third the associated symptom. Associations for the 150 most frequent diseases based on these notes were computed, and the symptoms are shown ranked by strength of association.

Fig.1 Clinical Data
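As an illustration of how disease-symptom associations like those in Fig.1 could be turned into classifier inputs, the following is a minimal sketch. It is not the authors' code: the association rows and the helper names `build_vocabulary`, `encode` and `rows_by_disease` are hypothetical, and the one-binary-column-per-symptom encoding is an assumption about how the symptoms are represented.

```python
from collections import defaultdict

# Hypothetical (disease, symptom) association rows; the real dataset is not reproduced here.
ASSOCIATIONS = [
    ("flu", "fever"),
    ("flu", "cough"),
    ("migraine", "headache"),
    ("migraine", "nausea"),
]


def build_vocabulary(associations):
    """Return the sorted list of distinct symptoms, one per feature column."""
    return sorted({symptom for _, symptom in associations})


def encode(symptoms, vocabulary):
    """Encode a collection of symptoms as a 0/1 vector aligned with the vocabulary."""
    present = set(symptoms)
    return [1 if symptom in present else 0 for symptom in vocabulary]


def rows_by_disease(associations, vocabulary):
    """Build one binary feature row per disease from its associated symptoms."""
    grouped = defaultdict(list)
    for disease, symptom in associations:
        grouped[disease].append(symptom)
    return {disease: encode(symptoms, vocabulary) for disease, symptoms in grouped.items()}


vocabulary = build_vocabulary(ASSOCIATIONS)
training_rows = rows_by_disease(ASSOCIATIONS, vocabulary)
# Symptoms that are not in the vocabulary are silently ignored in this sketch.
user_vector = encode(["fever", "sore throat"], vocabulary)
```

Symptoms entered by a user can be encoded with the same `encode` call, so that training rows and user input share one column order.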

B. Data Pre-processing
Because data pre-processing is a vital step in machine learning, we removed all variables that contained more than 50% missing values. The method used the MedLEE natural language processing system to obtain UMLS codes for diseases and symptoms from the notes (Fig.2). Statistical methods based on frequencies and co-occurrences were then used to obtain the associations.

Fig.2 UMLS Codes

C. Model Building
The predictive classifier models were developed to accurately identify the disease from the symptoms given by the user. The classification models used to predict the disease are Random Forest (RF), Decision Tree, K-Nearest Neighbour and Naïve Bayes.

1) Decision Tree
Decision tree induction is the learning of decision trees from class-labelled training tuples. A decision tree is a flowchart-like tree structure. Decision tree algorithms are quite robust to the presence of noise, particularly when methods for avoiding overfitting are used.

Fig.3 Decision Tree Function

DecisionTreeClassifier() is used to train the model and predict the disease on the testing dataset according to the symptoms entered by the user. The final disease for the decision tree is stored in a variable named "pred1", shown in Fig.3. The accuracy of the prediction is printed using accuracy_score, and the confusion matrix is created using confusion_matrix, both of which are imported from sklearn.metrics.

2) Random Forest
Random forest (RF) is an ensemble classification algorithm composed of a large number of decision trees. The algorithm can handle thousands of input attributes without variable deletion. Accuracy and variable importance data are supplied with the results. A random forest is a classifier consisting of a collection of k tree-structured classifiers, where the k trees are independent, identically distributed random trees and each tree casts a unit vote for the classification of the input.

Fig.4 Random Forest Function

Definition of randomforest() function. "pred2" is used to store the predicted disease using the random forest algorithm, shown in Fig.4. RandomForestClassifier() is used to train the model and predict the disease on the testing dataset according to the symptoms entered by the user. The final disease for random forest is stored in a variable named "pred2". The accuracy of the prediction is calculated using accuracy_score, and the confusion matrix is created using confusion_matrix, both of which are imported from sklearn.metrics.

3) K-Nearest Neighbour
K-Nearest Neighbour is one of the simplest machine learning algorithms based on the supervised learning technique. The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm. K-NN can be used for regression as well as classification, but it is mostly used for classification problems. It works by assigning a new data point the majority class among its k most similar stored examples.

Fig.5 KNN Function

Definition of KNN() function. "pred4" is used to store the predicted disease using the K-Nearest Neighbour algorithm, shown in Fig.5.

4) Naïve Bayes Algorithm
Naïve Bayes is used to predict categorical class labels. It classifies data based on the training set and the values of a classifying attribute, and uses the result to classify new data. It is a two-step process: model construction and model usage. Bayes' theorem is named after Thomas Bayes, and Naïve Bayes is a statistical, supervised learning method for classification. It can handle both categorical and continuous-valued attributes.

Fig.6 Naïve Bayes Function
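The four classifiers described in this section could be trained and scored with scikit-learn along the following lines. This is a hedged sketch rather than the code shown in Figs. 3-6: the toy symptom patterns, the `evaluate` helper and the hyperparameters are hypothetical, and `BernoulliNB` is an assumed choice of Naïve Bayes variant for binary symptom features, since the paper does not state which variant it uses.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one binary column per symptom
# (fever, cough, headache, nausea, rash).
PATTERNS = {
    "flu": [1, 1, 1, 0, 0],
    "migraine": [0, 0, 1, 1, 0],
    "allergy": [0, 1, 0, 0, 1],
}

X_train, y_train = [], []
for disease, pattern in PATTERNS.items():
    for _ in range(4):  # repeat each pattern so every class has several rows
        X_train.append(pattern)
        y_train.append(disease)

# For brevity the test set reuses the training patterns; a real evaluation
# would hold out separate records.
X_test = list(PATTERNS.values())
y_test = list(PATTERNS.keys())

MODELS = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "naive_bayes": BernoulliNB(),  # assumed variant for 0/1 features
    "knn": KNeighborsClassifier(n_neighbors=3),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model and report its accuracy and confusion matrix."""
    labels = sorted(set(y_train))
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "confusion": confusion_matrix(y_test, predicted, labels=labels),
        }
    return results


results = evaluate(MODELS, X_train, y_train, X_test, y_test)
```

Keeping the models in one dictionary makes it straightforward to compare their accuracies side by side, which is how the paper reports its results.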

Definition of NaiveBayes() function. "pred3" is used to store the predicted disease using the Naïve Bayes algorithm, shown in Fig.6.

D. Model Building Using Medicine and Prescription
Medicine prescription is one of the most important things in everyone's life. The medicine and precaution dataset is collected from textual discharge summaries of patients at New York Presbyterian Hospital, shown in Fig.7. When a disease is predicted, the medicine and precaution for that disease are displayed in the GUI page.

Fig.7 Clinical Dataset for Medicine and Prescription

E. Database Creation
The database is created using SQLite to store the details entered by the user in the GUI page, shown in Fig.9.

Fig.8 Database Code

Fig.9 Database Storage

VI. RESULTS

The feature-extracted data is further evaluated to predict the disease using the Random Forest, Naïve Bayes, Decision Tree and K-Nearest Neighbour classifiers. Comparing the four machine learning algorithms, K-Nearest Neighbour reaches 95.6%, Naïve Bayes 94.5% and Decision Tree 92.4%, while the Random Forest model has the highest accuracy at 95.7%. The results were evaluated with accuracy, sensitivity, specificity, positive predictive value and negative predictive value. The final accuracy of each model is shown in Fig.10.

Fig.10 Final Accuracy of each model

VII. CONCLUSION

To conclude, this project, disease prediction using machine learning, is very useful in everyone's day-to-day life, and it is especially important for the health care sector, because health care workers are the ones who use such systems daily to predict the diseases of patients based on their general information and their symptoms, and to provide the medicine and precautions for the disease they are suffering from. Our system is useful to people who are continuously worried about their health and need to understand what is happening with their body. Our main motto in developing this system is to make people aware of their health. In particular, people suffering from mental health conditions like depression and anxiety can come out of these problems and live their daily lives more easily. On average we achieved an accuracy of ~95%. Such a system can be largely reliable for this task. While building the system we also added a way to store the information entered by the user in the database, which may be used in future to help create a better version of the system. Our system also has an easy-to-use interface and offers various visual representations of the information collected and the results achieved.
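The SQLite storage step described above can be sketched with Python's standard sqlite3 module. The code in Fig.8 is not reproduced in the text, so the table name `predictions`, its columns and the `save_record` helper below are hypothetical, and an in-memory database is used only so that the example leaves no file behind.

```python
import sqlite3


def save_record(conn, name, symptoms, disease):
    """Store one user submission and its predicted disease (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS predictions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL,
            symptoms TEXT NOT NULL,
            disease TEXT NOT NULL
        )
        """
    )
    # Parameter placeholders keep user input out of the SQL string itself.
    conn.execute(
        "INSERT INTO predictions (name, symptoms, disease) VALUES (?, ?, ?)",
        (name, ",".join(symptoms), disease),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # a file path would be used for persistent storage
save_record(conn, "patient-1", ["fever", "cough"], "flu")
rows = conn.execute("SELECT name, symptoms, disease FROM predictions").fetchall()
```

Storing the symptoms as one comma-separated string keeps the sketch short; a separate symptoms table would be a more normalized design.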

REFERENCES

[1]. Pingale, Kedar, et al. "Disease Prediction using
Machine Learning." (2019). Mr. Chala Beyene, Prof.
Pooja Kamat, “Survey on Prediction and Analysis the
Occurrence of Heart Disease Using Data Mining
Techniques”, International Journal of Pure and Applied
Mathematics, 2018.
[2]. Pingale, K., Surwase, S., Kulkarni, V., Sarage, S., &
Karve, A. (2019). Disease Prediction using Machine
Learning.
[3]. Aiyesha Sadiya, Differential Diagnosis of Tuberculosis
and Pneumonia using Machine Learning (2019).
[4]. S. Patel and H. Patel, “Survey of data mining techniques
used in healthcare domain,” Int. J. of Inform. Sci. and
Tech., Vol. 6, pp. 53-60, March, 2016.
[5]. Balasubramanian, Satyabhama, and Balaji
Subramanian. "Symptom based disease prediction in
medical system by using Kmeans algorithm."
International Journal of Advances in Computer Science
and Technology 3.
[6]. Dhenakaran, K. Rajalakshmi Dr SS. "Analysis of Data
mining Prediction Techniques in Healthcare
Management System." International Journal of
Advanced Research in Computer Science and Software
Engineering 5.4 (2015)
[7]. Jin Ma, Sung Chan Park, Jung Hun Shin, Nam Gyu
Kim, Jerry H. Seo, Jong Suk Ruth Lee, Jeong Hwan Sa.
"AI based intelligent system on the EDISON platform",
Proceedings of the 2018 Artificial Intelligence and
Cloud Computing Conference on ZZZ - AICCC '18,
2018
[8]. W. Yin and H. Schutze, "Convolutional neural network
for paraphrase identification", Proc. HLT-NAACL, pp.
901-911, 2015.
[9]. Shadab Adam et al., "Prediction system for Heart
Disease using Naïve Bayes", International Journal of
advanced Computer and Mathematical Sciences, vol. 3,
no. 3, pp. 290-294, 2012, ISSN 2230-9624.
[10]. J.R. Quinlan, "Induction of Decision Trees", Mach. Learn.,
vol. 1, no. 1, pp. 81-106, Mar. 1986.
[11]. Sayantan Saha, Argha Roy Chowdhuri et al., "Web
Based Disease Detection System", IJERT, vol. 2, no. 4,
April 2013, ISSN 2278-0181.
[12]. Palli Suryachandra and Venkata Subba Reddy,
"Comparison of Machine Learning algorithms For
Breast Cancer".
[13]. Andrew Alikberov, Stephan Broadly et al., "The
Learning Machine", [online] Available:
https://www.thelearningmachine.ai.
[14]. M. Chen, Y. Hao, K. Hwang, L. Wang and L. Wang,
"Disease prediction by machine learning over big data
from healthcare communities", IEEE Access, vol. 5, no.
1, pp. 8869-8879, 2017.
[15]. Disease and symptoms Dataset, [online] Available:
www.github.com.
