PROJECT REPORT
Clinical Test Analysis for Alzheimer's
Disease Using Machine Learning Models
BT21BTECH11008
SRI CHARVI SALAPU
ABSTRACT
Alzheimer’s disease (AD) is a progressive neurological disorder that causes
cognitive decline, impairment in daily functioning, and memory loss. It is
characterized by neurofibrillary tangles in the brain and the accumulation of
amyloid plaques, which lead to the death of neurons and hence the breakdown of
neural connections. This project analyzes the clinical test data of patients
over a three-year period to identify, using machine learning, the significant
clinical features that contribute to the progression of Alzheimer’s disease,
and to build classifiers from those features.
➢ ABOUT AD
Alzheimer’s disease is a progressive neurodegenerative disorder that
affects behavior, memory, and the ability to think and perform everyday activities.
It is caused by the accumulation of amyloid plaques and neurofibrillary tangles in
the brain, leading to a breakdown of neural connections. Its main symptoms are
memory loss, cognitive decline, behavioral changes, language problems (trouble
finding the right words to form sentences and hold conversations),
disorientation, and impaired judgment.
The main risk factors for AD are age, genetics, lifestyle, and heart health.
➢ Elaboration on AD, CN, MCI.
In the context of this project, the patients have been classified into
three classes: AD, CN, and MCI.
1. AD (Alzheimer’s Disease):
Individuals with AD typically experience difficulty with language,
gradual memory loss, and behavioral changes.
2. CN (Cognitively Normal):
CN individuals exhibit normal cognitive functionality
without any symptoms or significant impairment; this class often
serves as the baseline comparison class.
3. MCI (Mild Cognitive Impairment):
MCI individuals experience mild but noticeable cognitive decline,
more than expected for the individual’s age.
➢ Data Analysis: Box Plots and Statistical Tests
For Visualization and Statistical Comparison of Clinical Features
➢ Box-Plots validated with t-test for Group AD.
➢ Box-Plots validated with t-test for Group MCI.
➢ Box-Plots validated with t-test for Group CN.
➢ Interpretation of the boxplots:
The boxplots give a visual representation of how the clinical
features change over time and how they differ among the classes
(AD, CN, MCI).
From the plots we can observe that in the AD group there are noticeable
deviations in several clinical features from the initial screening (sc) to the
36-month (m36) visit, which indicates progression of Alzheimer’s disease.
In the CN and MCI groups the changes in clinical features are less
pronounced than in the AD group, but some slight variations can still
be seen, particularly in the MCI group, indicating mild or early-stage
cognitive impairment, while the AD group exhibits more significant
changes, suggesting a more advanced stage of cognitive decline.
The t-test provides the statistical analysis: a quantitative measure of the
differences for each clinical feature, focusing specifically on the AD group.
In statistical analysis, the p-value determines the significance of
the results, where the null hypothesis states that there is no effect
or difference. A low p-value (p < 0.05) is strong evidence against the
null hypothesis, meaning the observed effect is statistically significant;
a high p-value (p > 0.05) fails to reject the null hypothesis, meaning
the observed difference is not statistically significant.
The low p-values obtained for all the clinical features in the AD group
between the sc and m36 visits indicate that the changes observed in the
boxplots are not due to random variation but reflect true changes in the
clinical features over time. These significant variations confirm the
progression of AD and the importance of these features in monitoring it.
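The kind of paired comparison described above can be sketched as follows. This is a minimal example with synthetic data (the patient counts, score values, and variable names sc/m36 are hypothetical, not taken from the project’s dataset), assuming SciPy is available:

```python
from scipy import stats
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: the same clinical score measured for 30 AD
# patients at screening (sc) and at the 36-month visit (m36).
sc = rng.normal(loc=20.0, scale=4.0, size=30)
m36 = sc + rng.normal(loc=5.0, scale=2.0, size=30)  # scores worsen over time

# Paired t-test: the null hypothesis is that the mean sc-to-m36 change is zero.
t_stat, p_value = stats.ttest_rel(sc, m36)

print(f"t = {t_stat:.3f}, p = {p_value:.3g}")
if p_value < 0.05:
    print("Reject the null hypothesis: the change is statistically significant.")
```

A paired test (`ttest_rel`) is used here because the same patients are measured at both visits; an independent-samples test (`ttest_ind`) would apply when comparing two different groups.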
➢ Developing classifiers for classes using ML models.
After the visualization and statistical analysis using box plots
and t-tests, we observed significant changes in the clinical features that
contribute to Alzheimer’s disease progression. Using this information, we can
now apply different machine learning models to the data, perform feature
selection, and train the models to develop robust classifiers that
distinguish between the AD, CN, and MCI classes.
We used supervised machine learning models because we have
labeled data (AD, CN, and MCI) and our objective is to classify the
progression of Alzheimer’s disease into specific categories.
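The supervised setup described above starts by holding out unseen data for testing. A minimal sketch, with a random stand-in for the clinical dataset (the array shapes and label strings are illustrative assumptions), assuming scikit-learn:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the clinical dataset: rows are patient records,
# columns are clinical features, and y holds the diagnosis labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = rng.choice(["AD", "CN", "MCI"], size=300)

# Hold out unseen data for testing; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)  # (225, 5) (75, 5)
```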
➢ MODELS USED:
● Classification models:
1. K-Nearest Neighbors (KNN):
KNN is a classification algorithm that classifies new data
points based on the majority class of their K nearest neighbors.
The value of K is a hyperparameter that determines the number of
neighbors to consider.
2. Logistic Regression:
Logistic regression is a linear classification algorithm widely used for
binary classification tasks. It models the relationship between the
dependent variable and one or more independent variables
using the sigmoid function, and estimates the probability that a given
input belongs to a particular class.
3. Support Vector Machine (SVM):
SVM is a powerful classification algorithm that finds the hyperplane that
best separates the data into classes. Using the kernel trick, it
performs both linear and non-linear classification by transforming the
input space into higher dimensions.
● Ensemble Models:
Ensemble learning involves combining multiple models to improve
performance. Tree-based ensembles use decision trees as
base learners.
1. Random Forest:
A random forest classifier combines multiple decision trees to create
a robust classification model. It works by constructing a set of
decision trees, each trained on a random subset of the data and
features; at prediction time, each tree in the forest independently
classifies the input, and the final output is determined by majority
voting (or by averaging the predictions of the individual trees).
2. XGBoost:
XGBoost is a scalable and efficient implementation of the gradient
boosting framework, designed to push the limits of computing resources.
It builds the individual trees sequentially, where each new tree
corrects the errors made by the previous trees, and it uses
regularization to avoid overfitting.
3. Gradient Boosting:
The gradient boosting classifier is an ensemble model that sequentially
builds models (usually decision trees) and combines them to create
a strong classifier. It trains each new model to focus on reducing
the errors of the combined model, gradually improving the overall
prediction.
● Tree based models:
1. Decision Tree Classifier:
It is a non-linear classifier that splits the data into subsets based on
feature values using a tree structure, making a decision at
each node. It splits the dataset into subsets based on the values of
different features and assigns class labels to the leaf nodes. Decision
trees are interpretable, easy to understand, and can handle multi-class
classification tasks.
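The models listed above can all be trained through the same fit/score interface. A minimal sketch with synthetic data, assuming scikit-learn (XGBoost is omitted here only because it requires the separate xgboost package; the other six models mirror the list above, and all hyperparameters shown are illustrative defaults, not the project’s tuned settings):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data; the real project uses clinical features
# labeled AD/CN/MCI.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = rng.integers(0, 3, size=300)  # 0=AD, 1=CN, 2=MCI
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model and record its test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.2f}")
```

On the random labels used here the accuracies are near chance; the point is only the uniform training loop that lets all the classifiers be compared on the same split.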
➢ Feature Importance analysis:
We performed feature importance analysis using the various supervised
machine learning models to identify the most significant clinical
features that contribute to the progression of AD.
In our feature importance analysis, the CDRSB and MMSE features
are purposefully excluded because of their fundamental role in defining
the classification of the AD, CN, and MCI groups; they are recognised
as primary diagnostic assessments. Excluding them ensures
that the classifiers are robust and provide information about
disease progression beyond the primary diagnostic criteria.
Below are the bar graphs visualizing the feature importance
analysis.
While each model may rank different features as the most important,
we can identify some common features that are consistently highly
ranked across multiple models.
The consistently important features across the various models are:
1. FAQ: Frequently among the highest-ranked features in all models,
which indicates that it is a crucial feature.
2. ADAS13: An important feature in Random Forest, XGBoost, and
Gradient Boosting.
3. LDELTOTAL: A highly ranked feature in XGBoost, Gradient
Boosting, and Decision Tree.
4. ADAS11: Frequently near the top for Logistic Regression, SVM,
and Gradient Boosting.
5. ADASQ4: Important in Logistic Regression, XGBoost, and SVM.
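For the tree-based models, a ranking like the one above can be read directly from the fitted model. A minimal sketch with synthetic data, assuming scikit-learn (the data values are random; only the feature names echo the report):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Feature names echoing the report; CDRSB and MMSE are deliberately
# absent from this list, as described above.
features = ["FAQ", "ADAS13", "LDELTOTAL", "ADAS11", "ADASQ4"]

rng = np.random.default_rng(1)
X = rng.normal(size=(200, len(features)))
y = rng.integers(0, 3, size=200)

forest = RandomForestClassifier(random_state=1).fit(X, y)

# Rank features by impurity-based importance. Linear models (Logistic
# Regression, linear SVM) would use |coef_| instead of feature_importances_.
ranked = sorted(zip(features, forest.feature_importances_),
                key=lambda p: p[1], reverse=True)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

The importances sum to 1, so each value can be read as that feature’s share of the model’s total split quality.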
➢ Visualization of feature importance analysis:
➢ METRICS USED FOR MODEL COMPARISON:
1. Train accuracy score: It measures the accuracy of the model by
comparing its predicted labels with the actual labels of the
training data, telling us how well the model fits the training data.
A very high training accuracy paired with a much lower test accuracy
is a sign of overfitting.
2. Test accuracy score: It measures the accuracy of the model on
unseen data, using a separate dataset that was not used during
training.
3. Precision: Precision is the ratio of true positives to the
sum of true positives and false positives. It measures how well the
model avoids false positives; high precision means the model makes
few mistakes when predicting the positive class.
4. Recall: Also called sensitivity, it is the ratio of true
positives to the sum of true positives and false negatives. High
recall indicates that the model captures most of the positive
instances, with few false negatives.
5. F1 Score: It is the harmonic mean of precision and recall, which
provides a single metric that balances both concerns. A high
F1 score means the model has both high precision and high recall.
6. ROC-AUC: It measures the model’s ability to distinguish between
classes by evaluating the trade-off between the true positive rate
and the false positive rate. A high ROC-AUC score indicates better
performance in distinguishing between positive and negative classes.
7. Confusion matrix: It is a matrix showing the number of true
positives, true negatives, false positives, and false negatives. It
provides detailed insight into how the model’s predictions compare to
the actual values and helps identify the types of errors the model makes.
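All of these metrics are available from scikit-learn. A minimal sketch on hand-made predictions for a 3-class problem (the label vectors and the stand-in probability matrix are illustrative assumptions, not the project’s results):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

# Hypothetical predictions for a 3-class problem (0=AD, 1=CN, 2=MCI).
y_true = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 1, 2, 0])

acc = accuracy_score(y_true, y_pred)
# Macro averaging treats the three classes equally in a multi-class report.
prec = precision_score(y_true, y_pred, average="macro")
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
cm = confusion_matrix(y_true, y_pred)

print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
print(cm)

# ROC-AUC needs class probabilities; one-vs-rest extends it to multi-class.
# Rows sum to 1 here (0.8 for the predicted class, 0.1 for the others),
# as a crude stand-in for a real predict_proba output.
y_score = np.eye(3)[y_pred] * 0.7 + 0.1
auc = roc_auc_score(y_true, y_score, multi_class="ovr")
print(f"ROC-AUC (OvR) = {auc:.3f}")
```

The diagonal of the confusion matrix counts the correct predictions per class, so its trace divided by the total equals the accuracy.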
➢ Model comparison results:
Model                 Train accuracy   Test accuracy   Precision   Recall   F1 score
Logistic Regression   76.63%           78.57%          0.80        0.79     0.79
Decision Tree         99.30%           66.33%          0.67        0.66     0.67
Random Forest         99.30%           75.26%          0.75        0.73     0.74
SVM                   76.63%           79.34%          0.80        0.77     0.78
XGBoost               97.83%           72.96%          0.73        0.71     0.72
Gradient Boosting     86.72%           76.28%          0.76        0.74     0.75
K-Neighbors           81.55%           75.00%          0.74        0.73     0.73
➢ ROC-AUC Curves:
➢ CONFUSION MATRIX:
➢ CONCLUSION:
The table above presents the results of the various classification models
tested on the dataset. The following points can be deduced from their
ROC curves, accuracy, precision, recall, and F1 scores.
● SVM and Logistic Regression show the highest overall performance,
demonstrating the best balance between training accuracy (76.63% for both)
and test accuracy (79.34% for SVM and 78.57% for Logistic Regression), with
high ROC-AUC values indicating robust generalization ability.
● Gradient Boosting is also a strong performer, with a test accuracy of
76.28% and balanced precision and recall scores across classes; its
ROC-AUC of 0.911 suggests effective class separation.
● Decision Tree and Random Forest exhibit signs of overfitting, with high
training accuracies (99.30%) compared to their test accuracies (66.33% and
75.26%). The Decision Tree’s ROC-AUC of 0.738 indicates poor discrimination
ability, while Random Forest’s 0.903 is considerably better despite the
overfitting.
● XGBoost and K-Neighbors have reasonable test accuracy and ROC-AUC
scores, but they do not outperform SVM.
In summary, the results show that SVM, Logistic Regression, and Gradient
Boosting are the top-performing models for our dataset in terms of accuracy,
precision, recall, and F1 scores across the classes, but further analysis is
needed to determine the optimal model based on the preferred balance between
precision and recall specific to this dataset.