Why Do We Use Cross Validation Set in Our Models?
When we build a model, we train it on known (past) data, and afterwards we need to know how
well it performs. If we evaluate the model on the same data it was trained on, it will appear to
work well, but in practice it will encounter data that differs from the training data, and it may
perform worse there. To get a realistic estimate and build a more reliable model, we use cross
validation, which tests the model on data that it has never seen during training.
Cross validation also helps detect over-fitting, and because each part of the data takes a turn
as the held-out set, we are able to use the entire dataset for both training and testing.
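As a rough illustration of the idea above, here is a minimal sketch of k-fold cross validation in plain Python. The helper names (k_fold_indices, mean_squared_error) and the toy "model" that simply predicts the mean of its training targets are assumptions made for this example, not part of any library.

```python
# Minimal k-fold cross-validation sketch in plain Python (no libraries).
# The "model" is a hypothetical stand-in: it predicts the mean of the training targets.

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is in exactly one test fold."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # toy targets

fold_scores = []
tested = []
for train_idx, test_idx in k_fold_indices(len(y), k=3):
    # "Fit" on the training folds only.
    prediction = sum(y[i] for i in train_idx) / len(train_idx)
    # Score on the held-out fold, which the model has not seen.
    y_test = [y[i] for i in test_idx]
    fold_scores.append(mean_squared_error(y_test, [prediction] * len(y_test)))
    tested.extend(test_idx)

cv_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, cv_score)  # [37.0, 1.0, 37.0] 25.0
```

Every sample is used for testing exactly once and for training k - 1 times. This sketch keeps the samples in their original order; for independent observations the data is usually shuffled before the folds are formed.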
Answer - a, b, c
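4. Which train-test splitting strategy should be used for time series data, and why?
TimeSeriesSplit splits time series data at fixed intervals into successive train-test sets, so
that each test set comes after the data used to train on. In time series we cannot split our data
randomly, since the observations are not independent: a random split would let the model train
on observations that come later than the ones it is tested on. When dealing with time-related
data we therefore need to use time-based splitting.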
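To make the time-based split concrete, here is a small plain-Python sketch of an expanding-window split, which is the shape of split that scikit-learn's TimeSeriesSplit produces. The function name expanding_window_splits is hypothetical and the code is an illustration, not the library's implementation.

```python
# Expanding-window split for time-ordered data (illustrative sketch).
# Samples are assumed to be sorted by time, oldest first.

def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs where the test block always follows the training block."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train_idx = list(range(test_start))  # everything before the test block
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


splits = list(expanding_window_splits(n_samples=8, n_splits=3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
# train: [0, 1] test: [2, 3]
# train: [0, 1, 2, 3] test: [4, 5]
# train: [0, 1, 2, 3, 4, 5] test: [6, 7]
```

The training window grows with each split, and no test index ever precedes a training index, so the model is never evaluated on data older than what it was trained on.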
5. What are the different cross validation techniques used for regression problems?
Leave-p-out cross-validation
Leave-one-out cross-validation (LOOCV)
K-fold cross-validation
Stratified k-fold cross-validation (designed for class labels; for regression the target must first be binned)
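The first two techniques in the list differ mainly in how many samples are held out at a time. The sketch below, in plain Python, enumerates the splits for a tiny dataset to show how quickly leave-p-out grows; the function names are hypothetical and chosen only for this example.

```python
# Enumerating leave-one-out and leave-p-out splits (illustrative sketch).
from itertools import combinations


def leave_p_out(n_samples, p):
    """Yield (train_idx, test_idx) for every possible held-out set of size p."""
    all_idx = range(n_samples)
    for test_idx in combinations(all_idx, p):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        yield train_idx, list(test_idx)


def leave_one_out(n_samples):
    """Leave-one-out is leave-p-out with p = 1."""
    return leave_p_out(n_samples, 1)


n = 5
loo_splits = list(leave_one_out(n))
lpo_splits = list(leave_p_out(n, 2))
print(len(loo_splits))  # 5 splits: one per sample
print(len(lpo_splits))  # 10 splits: "5 choose 2"
```

Leave-one-out needs one model fit per sample, and leave-p-out needs one fit per combination of p samples, which is why k-fold is usually preferred on anything but small datasets.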
6. How do training and CV scores help you find an optimal hyperparameter for your
model?
The optimization procedure follows these steps:
Select a new set of model hyperparameters, then train the model on the training subset using the
selected hyperparameters.
Apply the model to the held-out (test) data and generate the corresponding predictions, then
evaluate the predictions using a score metric.
Compare the scores across all candidates and choose the hyperparameters that yield the best CV
score. A training score that is much better than the CV score indicates over-fitting.
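The steps above can be sketched as a small search loop. This is a minimal illustration in plain Python, assuming a toy one-dimensional k-nearest-neighbours regressor whose hyperparameter is the number of neighbours k; the data, the helper names (knn_predict, mse) and the candidate values are all made up for the example.

```python
# Hyperparameter selection by comparing training and CV error (illustrative sketch).

def knn_predict(x_train, y_train, x_query, k):
    """Average the targets of the k training points closest to x_query."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_query))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / k


def mse(x_fit, y_fit, x_eval, y_eval, k):
    """Mean squared error of a k-NN model fitted on (x_fit, y_fit), scored on (x_eval, y_eval)."""
    errors = [(knn_predict(x_fit, y_fit, xq, k) - yq) ** 2 for xq, yq in zip(x_eval, y_eval)]
    return sum(errors) / len(errors)


# Toy data: y follows x with alternating +1 / -1 noise.
x = [float(i) for i in range(12)]
y = [xi + (1.0 if i % 2 == 0 else -1.0) for i, xi in enumerate(x)]

n_folds = 4
folds = [list(range(f, len(x), n_folds)) for f in range(n_folds)]  # interleaved folds

results = {}  # k -> (training error, mean CV error)
for k in (1, 2, 4):
    cv_errors = []
    for val_idx in folds:
        held_out = set(val_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        cv_errors.append(
            mse([x[i] for i in train_idx], [y[i] for i in train_idx],
                [x[i] for i in val_idx], [y[i] for i in val_idx], k)
        )
    train_error = mse(x, y, x, y, k)  # scored on the same data it was fitted on
    results[k] = (train_error, sum(cv_errors) / n_folds)

# Choose the hyperparameter with the lowest CV error, not the lowest training error.
best_k = min(results, key=lambda k: results[k][1])
for k, (train_error, cv_error) in results.items():
    print(f"k={k}: train MSE={train_error:.2f}, CV MSE={cv_error:.2f}")
print("selected k:", best_k)
```

With k = 1 the model memorizes the training data, so its training error is 0 while its CV error is not; that gap is the over-fitting signal, and it is why the selection is based on the CV score.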