Implementing Logistic Regression For Iris Using Sklearn and Checking The Accuracy Using Confusion Matrix
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
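The cell that builds the data DataFrame is not shown above; a minimal sketch, assuming the dataset comes from sklearn's built-in load_iris and that the features and target are stacked into one all-float frame (which would explain the float64 target in the preview below). The species mapping used later is likewise an assumption:

import numpy as np
from sklearn.datasets import load_iris

# Load the built-in Iris dataset (150 samples, 4 numeric features, 3 classes)
iris = load_iris()

# Stack features and target into a single all-float DataFrame,
# matching the float64 dtypes shown in the summary below
data = pd.DataFrame(np.c_[iris.data, iris.target],
                    columns=iris.feature_names + ['target'])

print(data.head())
print(data.info())  # info() prints its summary and returns None

# Map the numeric target to class names for the species preview further down
data['species'] = iris.target_names[iris.target]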
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0                5.1               3.5                1.4               0.2     0.0
1                4.9               3.0                1.4               0.2     0.0
2                4.7               3.2                1.3               0.2     0.0
3                4.6               3.1                1.5               0.2     0.0
4                5.0               3.6                1.4               0.2     0.0
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   sepal length (cm)  150 non-null    float64
 1   sepal width (cm)   150 non-null    float64
 2   petal length (cm)  150 non-null    float64
 3   petal width (cm)   150 non-null    float64
 4   target             150 non-null    float64
dtypes: float64(5)
memory usage: 6.0 KB
None
In [5]: # Summary statistics
print(data.describe())
print(data[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)', 'species']].head())
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm) species
0                5.1               3.5                1.4               0.2  setosa
1                4.9               3.0                1.4               0.2  setosa
2                4.7               3.2                1.3               0.2  setosa
3                4.6               3.1                1.5               0.2  setosa
4                5.0               3.6                1.4               0.2  setosa
Data Preprocessing
Feature Selection
Splitting Features and Target
In [10]: # Features
X = data.iloc[:, 0:4].values  # or data[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']].values
# Target
y = data['target'].values
Feature Scaling
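The split and scaling cells themselves are not shown; a minimal sketch, assuming an 80/20 train/test split (20% of 150 samples gives the 30 test rows seen later) and a StandardScaler fitted on the training set only. The random_state value is an assumption:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hold out 20% of the data for testing: 150 * 0.2 = 30 test samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # random_state is an assumption

# Standardize features to zero mean and unit variance,
# fitting the scaler on the training data only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)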
Parameters Explained:
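The two non-default parameters visible in the fitted-model output below are max_iter=200, which raises the solver's iteration cap from the default of 100, and multi_class='multinomial', which fits a single softmax model over all three classes rather than a one-vs-rest scheme. A sketch of the corresponding training cell:

from sklearn.linear_model import LogisticRegression

# Multinomial (softmax) logistic regression; the higher iteration cap
# gives the default lbfgs solver room to converge
model = LogisticRegression(max_iter=200, multi_class='multinomial')
model.fit(X_train, y_train)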
Out[14]: LogisticRegression(max_iter=200, multi_class='multinomial')
Making Predictions
In [15]: # Predict the classes for the testing set
y_pred = model.predict(X_test)
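Beyond hard class labels, the fitted model also exposes per-class probabilities; this quick check is not part of the original cells:

# Per-class probabilities for the first five test samples (each row sums to 1)
print(model.predict_proba(X_test[:5]).round(3))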
In [16]: # Create a DataFrame to compare actual and predicted values
comparison = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(comparison)
Actual Predicted
0 1.0 1.0
1 0.0 0.0
2 2.0 2.0
3 1.0 1.0
4 1.0 1.0
5 0.0 0.0
6 1.0 1.0
7 2.0 2.0
8 1.0 1.0
9 1.0 1.0
10 2.0 2.0
11 0.0 0.0
12 0.0 0.0
13 0.0 0.0
14 0.0 0.0
15 1.0 1.0
16 2.0 2.0
17 1.0 1.0
18 1.0 1.0
19 2.0 2.0
20 0.0 0.0
21 2.0 2.0
22 0.0 0.0
23 2.0 2.0
24 2.0 2.0
25 2.0 2.0
26 2.0 2.0
27 2.0 2.0
28 0.0 0.0
29 0.0 0.0
# Compute the confusion matrix comparing actual and predicted labels
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
[[10 0 0]
[ 0 9 0]
[ 0 0 11]]
In this output:
Row 0 (Actual class 0 - setosa): 10 correctly predicted as setosa.
Row 1 (Actual class 1 - versicolor): 9 correctly predicted as versicolor.
Row 2 (Actual class 2 - virginica): 11 correctly predicted as virginica.
Every off-diagonal entry is 0, so no test sample was misclassified.
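matplotlib and seaborn are imported at the top but unused in the cells shown; a heatmap is a common way to visualize this matrix (a sketch, not necessarily the original plot):

# Plot the confusion matrix as an annotated heatmap
class_names = ['setosa', 'versicolor', 'virginica']
plt.figure(figsize=(5, 4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names)
plt.xlabel('Predicted class')
plt.ylabel('Actual class')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()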
Interpreting Results
Accuracy Score
In [19]: # Computing the Accuracy
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')
Accuracy: 100.00%
Classification Report
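The cell that produces the report below is not shown; a minimal sketch of the standard call:

from sklearn.metrics import classification_report

# Per-class precision, recall, and F1, plus overall averages
print(classification_report(y_test, y_pred))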
              precision    recall  f1-score   support

         0.0       1.00      1.00      1.00        10
         1.0       1.00      1.00      1.00         9
         2.0       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30