Capstone Notes-Model
Windows User
5/15/2022
Table of Contents
List of Figures
1. Model building and interpretation
   a) Build various models (You can choose to build models for either or all of descriptive, predictive or prescriptive purposes)
      a) Logistic Regression Model using Sklearn
   b) Test your predictive model against the test set using various appropriate performance metrics
   c) Interpretation of the model(s)
2. Model Tuning and business implication
   a) and b) Ensemble modeling, wherever applicable
      i) Decision Tree Model
      ii) Random Forest Model
      iii) ANN Model
   c) Interpretation of the most optimum model and its implication on the business
Strategy for the upcoming matches
   1 Test match with England in England. All the matches are day matches. In England, it will be rainy season at the time of the match.
      Output of the Model
   2 T20 matches with Australia in India. All the matches are Day and Night matches. In India, it will be winter season at the time of the matches.
      Output of the Model
   2 ODI matches with Sri Lanka in India. All the matches are Day and Night matches. In India, it will be winter season at the time of the matches.
      Output of the Model
List of Figures:
1. Model building and interpretation
a) Build various models (You can choose to build models for either or all of descriptive,
predictive or prescriptive purposes):
PRECAP:
a) In continuation of Notes-1, in these notes we create a model that predicts the performance of Team India against its opponents.
b) Based on the inputs from the EDA performed, it is decided to remove the unwanted variables “Game Number”, “Wicket_Keeper” and “Audience_Number”. (Based on the boxplot and EDA analysis, we found that audience number has no considerable impact on the result.)
c) In this section we build four models (Decision Tree, Random Forest, ANN and Logistic Regression, the last with both sklearn and stats) and select the best model based on the model metrics.
d) All the ‘object’ variables are encoded using one-hot encoding, and the target variable is encoded using Label Encoder.
e) For model building, the data is split into train and test sets in the ratio 70:30.
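The preprocessing steps in d) and e) can be sketched as follows. This is a minimal illustration on a made-up ten-row frame, not the project data: the rows, the `random_state`, and the use of `drop_first=True` are my assumptions (the last one is inferred from the encoded column names that appear later in these notes, which have no column for the first category of each variable).

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample rows; the real data set has many more variables.
df = pd.DataFrame({
    "Opponent": ["Australia", "England", "Srilanka", "England", "Australia",
                 "Srilanka", "England", "Australia", "Srilanka", "England"],
    "Season": ["Rainy", "Winter", "Summer", "Rainy", "Winter",
               "Summer", "Rainy", "Winter", "Summer", "Winter"],
    "Avg_team_Age": [30, 29, 31, 30, 28, 30, 29, 31, 30, 30],
    "Result": ["Win", "Loss", "Win", "Win", "Win",
               "Loss", "Win", "Win", "Loss", "Win"],
})

# One-hot encode the object variables; drop_first=True is an assumption.
X = pd.get_dummies(df.drop(columns="Result"),
                   columns=["Opponent", "Season"], drop_first=True)

# Label-encode the target: classes are sorted, so Loss -> 0 and Win -> 1.
encoder = LabelEncoder()
y = encoder.fit_transform(df["Result"])

# 70:30 train test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)

print(list(X.columns))
print(len(X_train), len(X_test))
```

With ten rows the split gives seven training rows and three test rows.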
Imp Note: I have built the model on the data without splitting the dataset by match format.
This is because one of the problem statements asks for the winning strategy of Team India
against Australia in T20, but the source data set has no records of India playing
Australia in T20, so splitting the data set by format would not give accurate results.
So, the model is built without splitting the data.
After tuning, the best model was identified. Its best parameters are an L2 penalty, the saga solver and a
tolerance of 1e-05, and the predictions are made using this model.
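A parameter search of this kind might look like the sketch below. The notes do not say which search method was used, so `GridSearchCV`, the candidate values, the `f1` scoring, the number of folds and the synthetic data from `make_classification` are all assumptions for illustration. The penalty is left at the scikit-learn default, which is L2.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the encoded match data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)

# Hypothetical candidate grid; the penalty stays at the default (L2).
param_grid = {
    "solver": ["saga", "lbfgs"],
    "tol": [1e-4, 1e-5],
}

search = GridSearchCV(LogisticRegression(max_iter=5000),
                      param_grid, cv=3, scoring="f1")
search.fit(X_train, y_train)

# Predict with the refitted best model.
best_model = search.best_estimator_
test_pred = best_model.predict(X_test)
print(search.best_params_)
```

On the synthetic data the winning combination will not necessarily match the one reported above; the point is only the shape of the search.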
Figure 2 – Logistic Model – Best model Parameters
Classification Report:
The model has an accuracy of 87% on the train data, with precision = 0.88, recall = 0.98 and f1 = 0.9.
The AUC of the model on the train data is 84.36%. Below is the ROC curve of the train data.
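As a reminder of what the AUC figure measures, the sketch below computes it directly as the fraction of (win, loss) pairs in which the win receives the higher predicted score. The labels and scores are made-up values, not model output, and `pairwise_auc` is a helper name I introduce here for illustration.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

# Hypothetical labels (1 = win, 0 = loss) and predicted win probabilities.
labels = [1, 0, 1, 1, 0]
scores = [0.90, 0.40, 0.35, 0.80, 0.10]

auc = pairwise_auc(labels, scores)
print(f"AUC = {auc:.4f}")  # AUC = 0.8333
```

Five of the six pairs are ordered correctly here, which gives 0.8333.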
b) Test your predictive model against the test set using various appropriate
performance metrics
Classification Report:
The model has an accuracy of 87% on the test data, with precision = 0.89, recall = 0.97 and f1 = 0.93.
The AUC of the model on the test data is 84.32%. Below is the ROC curve of the test data.
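The f1 value follows from precision and recall as their harmonic mean, which gives a quick consistency check on the reported figures. In the sketch below the confusion-matrix counts are hypothetical, chosen only because they reproduce the reported precision and recall; they are not the actual counts from the classification report.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Check the reported test metrics: precision 0.89, recall 0.97.
reported_f1 = round(f1_score_from(0.89, 0.97), 2)
print(reported_f1)  # 0.93

# Hypothetical counts (not from the report) that give the same rates.
tp, fp, fn = 97, 12, 3
precision = tp / (tp + fp)   # share of predicted wins that are wins
recall = tp / (tp + fn)      # share of actual wins that are found
print(round(precision, 2), round(recall, 2),
      round(f1_score_from(precision, recall), 2))  # 0.89 0.97 0.93
```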
c) Interpretation of the model(s):
Based on the metrics from the train and test data, the model looks stable, with an accuracy of 87% on both. So the model
does not appear to be overfit or underfit. The precision is also high, at 88%. Hence the model seems good. But we can
cross-validate these metrics by building some more models.
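The stability argument can be made explicit by comparing each train metric with its test counterpart. The sketch below uses the Logistic Regression figures reported above; the 0.05 threshold is an arbitrary assumption of mine, not a rule from the project.

```python
# Logistic Regression metrics as reported in these notes.
train = {"accuracy": 0.87, "precision": 0.88, "recall": 0.98, "auc": 0.8436}
test = {"accuracy": 0.87, "precision": 0.89, "recall": 0.97, "auc": 0.8432}

MAX_GAP = 0.05  # hypothetical tolerance for a train/test difference

# Absolute train/test gap per metric, rounded for readability.
gaps = {name: round(abs(train[name] - test[name]), 4) for name in train}
is_stable = all(gap <= MAX_GAP for gap in gaps.values())

print(gaps)
print("stable" if is_stable else "check for over/underfitting")
```

A large drop from train to test would point to overfitting; here every gap is at most 0.01.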
'min_samples_split': [150,300,450],
After multiple iterations, the best parameters for building the model were identified; they are shown below.
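A search over `min_samples_split` with the candidate values shown above could be set up as in this sketch. Only the `min_samples_split` list comes from these notes; the `max_depth` candidates, the synthetic data, the fold count and the use of `GridSearchCV` itself are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded training data (assumption).
X, y = make_classification(n_samples=2000, n_features=10, random_state=1)

param_grid = {
    "min_samples_split": [150, 300, 450],  # values from the notes
    "max_depth": [4, 6],                   # hypothetical extra parameter
}

search = GridSearchCV(DecisionTreeClassifier(random_state=1),
                      param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```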
The model has an accuracy of 85% on the train data, with precision = 0.87, recall = 0.97 and f1 = 0.91.
The AUC of the model on the train data is 78.70%. Below is the ROC curve of the train data.
Classification Report:
The model has an accuracy of 84% on the test data, with precision = 0.86, recall = 0.97 and f1 = 0.91.
Figure 15 – Decision Tree Model – Classification Report of Test Data
The AUC of the model on the test data is 75.40%. Below is the ROC curve of the test data.
Performance of the Random Forest Model on the Train Data:
Confusion Matrix:
Classification Report:
The model has an accuracy of 84% on the train data, with precision = 0.84, recall = 1 and f1 = 0.91.
The AUC of the model on the train data is 84.98%. Below is the ROC curve of the train data.
Classification Report:
The model has an accuracy of 83% on the test data, with precision = 0.83, recall = 1 and f1 = 0.91.
The AUC of the model on the test data is 83.11%. Below is the ROC curve of the test data.
Classification Report:
The model has an accuracy of 87% on the train data, with precision = 0.88, recall = 0.98 and f1 = 0.93.
The AUC of the model on the train data is 84.30%. Below is the ROC curve of the train data.
Performance of the ANN Model on the Test Data:
Confusion Matrix:
Classification Report:
The model has an accuracy of 84% on the test data, with precision = 0.89, recall = 0.97 and f1 = 0.93.
The AUC of the model on the test data is 84.07%. Below is the ROC curve of the test data.
c) Interpretation of the most optimum model and its implication on the business
As described above, all the models are built on the train data and evaluated on the test data. The models are
compared on their accuracy, precision and recall values. The figure below compares these metrics
across the models.
On comparison, accuracy is highest in the ANN and Logistic Regression models. Precision is also
higher in these two models than in the Decision Tree and Random Forest models. Since the target variable is
binomial, I have opted for Logistic Regression to build the strategy.
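The comparison can be reproduced from the train-data figures quoted earlier in these notes. The sketch below ranks the models by accuracy and then precision; ranking on that pair is my simplification, and recall is left out because it is near 1 for every model.

```python
# Train-data accuracy and precision as reported for each model above.
train_metrics = {
    "Logistic Regression": {"accuracy": 0.87, "precision": 0.88},
    "Decision Tree": {"accuracy": 0.85, "precision": 0.87},
    "Random Forest": {"accuracy": 0.84, "precision": 0.84},
    "ANN": {"accuracy": 0.87, "precision": 0.88},
}

def score(metrics):
    """Rank by accuracy first, then precision (a simplifying assumption)."""
    return (metrics["accuracy"], metrics["precision"])

best = max(score(m) for m in train_metrics.values())
leaders = sorted(name for name, m in train_metrics.items()
                 if score(m) == best)
print(leaders)  # ['ANN', 'Logistic Regression']
```

The two leaders tie on these figures, so the choice between them rests on the reasoning given above rather than on the metrics alone.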
In the Excel sheet, since the object variables are one-hot encoded, the parameters fixed by the
problem statement are marked as ‘1’. The rest are set to 0 or 1 as per the strategy plan.
Variable                Value in problem statement   Encoded value
Opponent_England        England                      1
Match_Format_Test       Test                         1
Match_light_Type_Day    Day                          1
Offshore_Yes            England                      1
Season_Rainy            Rainy                        1
The variables used for model building are shown in the figure below.
With the problem variables fixed, the rest of the variables are varied to build enough strategies, and a csv test file is
built to predict the output using the Logistic Regression model.
Columns, in order: Avg_team_Age, Max_run_scored_1over, Extra_bowls_bowled, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter, Offshore_Yes, Max_wicket_taken_1over_2, Max_wicket_taken_1over_3, Max_wicket_taken_1over_4, Players_scored_zero_2, Players_scored_zero_3, Players_scored_zero_4, player_highest_wicket_2, player_highest_wicket_3, player_highest_wicket_4, player_highest_wicket_5
30 11 24 3 2 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 1 0 0 1 0
50 18 22 2 3 12 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1 0
50 20 27 2 2 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0
50 15 10 2 2 10 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 1 0 0 0
50 19 10 6 4 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 0 0
50 13 8 1 3 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 0
50 20 6 0 3 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 0 0 0
50 15 10 3 2 10 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1 0 1 0 0 0
50 20 17 2 3 17 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0
50 20 27 2 2 6 15 10 1 0 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0
Figure 33 – Actual Test Data to predict the Winning Strategy against England
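Building such a scenario file amounts to holding the problem-statement columns fixed and varying the rest. A minimal sketch of that idea follows, using only a small hypothetical subset of the columns and made-up scenario values; `make_row` is a helper name I introduce here, and the csv text is kept in memory rather than written to disk.

```python
import csv
import io

# Hypothetical subset of the encoded columns, for illustration only.
COLUMNS = ["Avg_team_Age", "player_highest_run", "Match_format_Test",
           "Opponent_England", "Offshore_Yes", "First_selection_Bowling"]

# Variables fixed by the problem statement (Test match, England, away).
FIXED = {"Match_format_Test": 1, "Opponent_England": 1, "Offshore_Yes": 1}

def make_row(varied):
    """One scenario row: zeros by default, then fixed, then varied values."""
    row = dict.fromkeys(COLUMNS, 0)
    row.update(FIXED)
    row.update(varied)
    return row

# Each scenario changes only the strategy variables.
scenarios = [
    make_row({"Avg_team_Age": 30, "player_highest_run": 10,
              "First_selection_Bowling": 1}),
    make_row({"Avg_team_Age": 50, "player_highest_run": 10}),
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(scenarios)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting rows would then be passed to the fitted model's prediction step, with the columns in the same order as the training data.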
Output of the Model:
Given the test data, the model predicts an array of results. The value ‘1’ represents a win for Team India and ‘0’
represents a loss.
Columns, in order: Results_Pred, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter, Offshore_Yes, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Extra_bowls_bowled, Avg_team_Age, Max_run_scored_1over, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Max_wicket_taken_1over_2, Max_wicket_taken_1over_3, Max_wicket_taken_1over_4, Players_scored_zero_2, Players_scored_zero_3, Players_scored_zero_4, player_highest_wicket_2, player_highest_wicket_3, player_highest_wicket_4, player_highest_wicket_5
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 24 30 11 3 2 6 15 10 0 0 1 0 0 1 0 0 1 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 22 50 18 2 3 12 15 10 0 0 1 0 1 0 0 0 1 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 27 50 20 2 2 6 15 10 0 0 1 1 0 0 0 0 1 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 10 50 15 2 2 10 15 10 0 1 0 1 0 0 1 0 0 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 10 50 19 6 4 6 15 10 0 0 1 1 0 0 0 1 0 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 8 50 13 1 3 6 15 10 0 0 0 1 0 0 1 0 0 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 6 50 20 0 3 6 15 10 0 1 0 0 0 1 1 0 0 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 10 50 15 3 2 10 15 10 0 0 1 0 1 0 1 0 0 0
0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 17 50 20 2 3 17 15 10 0 0 0 1 0 0 0 1 0 0
1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 27 50 20 2 2 6 15 10 0 0 1 1 0 0 0 0 1 0
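To read the prediction column, it helps to count the wins and note which scenario rows are predicted as losses. The sketch below does this for the ten predicted values in the first column of the output above; the variable names are mine.

```python
from collections import Counter

# First column (predicted result) of the ten scenario rows above.
results_pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]

counts = Counter(results_pred)
wins, losses = counts[1], counts[0]

# 1-based row numbers of the scenarios predicted as a loss.
losing_rows = [i for i, p in enumerate(results_pred, start=1) if p == 0]

print(f"{wins} predicted wins, {losses} predicted loss(es); "
      f"losing scenario rows: {losing_rows}")
```

Nine of the ten scenarios are predicted as wins; row 9 is the one to avoid when choosing a strategy.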
2 T20 matches with Australia in India. All the matches are Day and Night matches. In
India, it will be winter season at the time of the matches.
To build the strategy against Australia, an Excel sheet is developed as the actual test data.
In the Excel sheet, since the object variables are one-hot encoded, the parameters fixed by the
problem statement are marked as ‘1’. The rest are set to 0 or 1 as per the strategy plan.
Variable                      Value in problem statement   Encoded value
Opponent_Australia            Australia                    1
Match_Format_T20              T20                          1
Match_light_Type_Day_Night    Day and Night                1
Offshore_Yes                  India                        0
Season_Winter                 Winter                       1
With the problem variables fixed, the rest of the variables are varied to build enough strategies, and a csv test file is
built to predict the output using the Logistic Regression model.
Columns, in order: Avg_team_Age, Max_run_scored_1over, Extra_bowls_bowled, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter
30 24 31 0 2 29 10 83 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
30 12 6 3 2 6 4 48 1 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1
30 17 20 6 3 6 0 60 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
30 16 5 1 3 6 3 62 1 0 1 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
30 13 6 2 3 6 2 93 1 0 1 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
30 22 21 3 3 6 16 55 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1
30 13 10 0 1 6 3 80 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
30 12 6 2 3 6 3 42 1 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1
30 12 12 2 3 6 0 66 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1
30 11 1 5 3 6 0 32 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
30 14 9 2 2 7 7 87 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
30 12 35 2 2 9 8 39 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
30 23 28 0 3 26 15 69 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
30 11 5 3 2 6 2 95 1 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1
30 12 4 2 2 6 2 83 1 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
Figure 35 – Actual Test Data to predict the Winning Strategy against Australia
Columns, in order: Result, Avg_team_Age, Max_run_scored_1over, Extra_bowls_bowled, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter, Offshore_Yes, Max_wicket_taken_1over_2, Max_wicket_taken_1over_3, Max_wicket_taken_1over_4, Players_scored_zero_2, Players_scored_zero_3, Players_scored_zero_4, player_highest_wicket_2, player_highest_wicket_3, player_highest_wicket_4, player_highest_wicket_5
1 30 24 31 0 2 29 10 83 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1
1 30 12 6 3 2 6 4 48 1 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0
1 30 17 20 6 3 6 0 60 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0
1 30 16 5 1 3 6 3 62 1 0 1 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0
1 30 13 6 2 3 6 2 93 1 0 1 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
1 30 22 21 3 3 6 16 55 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1
1 30 13 10 0 1 6 3 80 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0
1 30 12 6 2 3 6 3 42 1 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
1 30 12 12 2 3 6 0 66 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0
1 30 11 1 5 3 6 0 32 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0
1 30 14 9 2 2 7 7 87 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0
1 30 12 35 2 2 9 8 39 1 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 1 0
1 30 23 28 0 3 26 15 69 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0
1 30 11 5 3 2 6 2 95 1 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0
Avg_team_Age:30 – Team average age should be 30
player_highest_run:48 – Player Highest run should be above 40
Players_scored_zero_2: 1 – At most 2 players should be out for a duck
2 ODI matches with Sri Lanka in India. All the matches are Day and Night matches. In
India, it will be winter season at the time of the matches.
To build the strategy against Sri Lanka, an Excel sheet is developed as the actual test data.
In the Excel sheet, since the object variables are one-hot encoded, the parameters fixed by the
problem statement are marked as ‘1’. The rest are set to 0 or 1 as per the strategy plan.
Variable                      Value in problem statement   Encoded value
Opponent_Srilanka             Srilanka                     1
Match_Format_ODI              ODI                          1
Match_light_Type_Day_Night    Day and Night                1
Offshore_Yes                  India                        0
Season_Winter                 Winter                       1
With the problem variables fixed, the rest of the variables are varied to build enough strategies, and a csv test file is
built to predict the output using the Logistic Regression model.
Columns, in order: Avg_team_Age, Max_run_scored_1over, Extra_bowls_bowled, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter, Offshore_Yes, Max_wicket_taken_1over_2, Max_wicket_taken_1over_3, Max_wicket_taken_1over_4, Players_scored_zero_2, Players_scored_zero_3, Players_scored_zero_4, player_highest_wicket_2, player_highest_wicket_3, player_highest_wicket_4, player_highest_wicket_5
30 11 9 3 3 9 7 82 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 1 0 0 0
30 20 3 4 2 6 2 45 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
30 14 7 4 3 7 7 60 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
30 12 1 3 3 6 0 75 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
30 15 11 3 3 10 8 76 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0
30 13 6 2 4 6 2 91 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
30 14 8 5 3 6 3 62 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0
30 14 2 3 3 6 1 57 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0
30 11 11 2 4 8 7 81 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0
30 14 5 2 3 6 2 71 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
30 15 1 2 3 6 0 45 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0
30 12 1 3 3 6 0 75 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
30 12 1 2 2 6 1 59 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0
30 14 7 4 3 7 7 60 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
30 14 7 1 3 6 2 96 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
30 12 10 2 3 10 0 79 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0
30 12 4 2 3 6 0 61 1 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
30 20 5 2 3 6 3 94 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
30 11 10 2 3 10 8 89 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
30 11 11 2 4 8 7 81 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0
30 13 6 2 4 6 2 91 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
Figure 37 – Actual Test Data to predict the Winning Strategy against Srilanka
Output of the Model:
Given the test data, the model predicts an array of results. The value ‘1’ represents a win for Team India and ‘0’
represents a loss. The model predictions contain a combination of 1’s and 0’s, as highlighted in the figure.
Columns, in order: Results_Pred, Avg_team_Age, Max_run_scored_1over, Extra_bowls_bowled, Min_run_given_1over, Min_run_scored_1over, Max_run_given_1over, extra_bowls_opponent, player_highest_run, Match_light_type_Day and Night, Match_light_type_Night, Match_format_T20, Match_format_Test, Bowlers_in_team_2.0, Bowlers_in_team_3.0, Bowlers_in_team_4.0, Bowlers_in_team_5.0, All_rounder_in_team_2.0, All_rounder_in_team_3.0, All_rounder_in_team_4.0, First_selection_Bowling, Opponent_Bangladesh, Opponent_England, Opponent_Kenya, Opponent_Pakistan, Opponent_South Africa, Opponent_Srilanka, Opponent_West Indies, Opponent_Zimbabwe, Season_Summer, Season_Winter, Offshore_Yes, Max_wicket_taken_1over_2, Max_wicket_taken_1over_3, Max_wicket_taken_1over_4, Players_scored_zero_2, Players_scored_zero_3, Players_scored_zero_4, player_highest_wicket_2, player_highest_wicket_3, player_highest_wicket_4, player_highest_wicket_5
1 30 11 9 3 3 9 7 82 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 1 0 0 0
1 30 20 3 4 2 6 2 45 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
1 30 14 7 4 3 7 7 60 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
1 30 12 1 3 3 6 0 75 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
1 30 15 11 3 3 10 8 76 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0
1 30 13 6 2 4 6 2 91 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
1 30 14 8 5 3 6 3 62 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0
1 30 14 2 3 3 6 1 57 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0
1 30 11 11 2 4 8 7 81 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0
0 30 14 5 2 3 6 2 71 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 30 15 1 2 3 6 0 45 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0
1 30 12 1 3 3 6 0 75 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
1 30 12 1 2 2 6 1 59 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0
1 30 14 7 4 3 7 7 60 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
1 30 14 7 1 3 6 2 96 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0
1 30 12 10 2 3 10 0 79 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0
1 30 12 4 2 3 6 0 61 1 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
1 30 20 5 2 3 6 3 94 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0
1 30 11 10 2 3 10 8 89 1 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
1 30 11 11 2 4 8 7 81 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0
1 30 13 6 2 4 6 2 91 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0