06 Regression With Simple Data Preparation

This document outlines an experiment for a course on applied machine learning. The experiment involves predicting a response variable using regression analysis. Students load a dataset, prepare the data by dropping features that are not useful and scaling the variables, and use scikit-learn to build pipelines with two regression models: Linear Regression and SVR. Students evaluate the models using 3-fold cross-validation and report the mean RMSE for each model to determine the best regressor for the problem.

Developing Curricula for Artificial Intelligence and Robotics (DeCAIR)

618535-EPP-1-2020-1-JO-EPPKA2-CBHE-JP

Course Title Applied Machine Learning

Experiment Number 6

Experiment Name Regression with Simple Data Preparation

Objectives Students learn basic data preparation and machine learning skills by evaluating two
regression models using Python and Scikit-Learn.

Introduction This is another introductory experiment in machine learning. The student learns to inspect
the dataset, drop features that are not useful, separate the features from the response, and
build pipelines that combine the standard scaler with a regressor. These techniques are used
to evaluate two models on a regression problem.
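
The inspection and separation steps described above can be sketched as follows. This is only an illustration on a synthetic stand-in: the data values, the constant column x3, and the rule used to flag a column as not useful are all assumptions, not properties of the course dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (hypothetical values and structure).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x1": rng.normal(size=50),
    "x2": rng.normal(loc=100, scale=20, size=50),
    "x3": np.ones(50),  # assumed: a constant column that carries no information
})
df["y"] = 2.0 * df["x1"] + 0.05 * df["x2"] + rng.normal(scale=0.1, size=50)

# Inspect column types, missing values, and value ranges.
df.info()
print(df.describe())

# One possible criterion (an assumption): drop columns with a single unique value.
constant_cols = [c for c in df.columns if df[c].nunique() <= 1]
prepared = df.drop(columns=constant_cols)

# Separate the features from the response.
X = prepared.drop(columns="y")
y = prepared["y"]
print("Dropped:", constant_cols, "| Features kept:", list(X.columns))
```

On a real dataset, other reasons for dropping a feature (for example an identifier column or a column that is mostly missing) may apply, so the inspection output should guide the decision.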

Materials Computer with Python integrated development environment (IDE) software installed
(PyCharm is recommended).
Dataset file: E6_Regression.csv

Procedure The following Python code loads a dataset that has three features (x1, x2, and x3) and a
response (y). Evaluate the two regressors defined below for predicting this response from
the features. Note that you need to inspect this dataset and prepare it appropriately
before performing machine learning. Using Scikit-Learn pipelines and 3-fold
cross-validation, determine which regressor is best.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

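# Load the dataset; inspect and prepare it, then define X (features) and y (response)
f = pd.read_csv('E6_Regression.csv')
lr = LinearRegression()
svr = SVR(kernel="poly", C=100, gamma="auto", degree=2,
          epsilon=0.1, coef0=1)

# Replace … with the pipeline for the regressor being evaluated
scores = cross_val_score(…, X, y,
                         scoring="neg_mean_squared_error", cv=3)
rmse = np.sqrt(-scores).mean()  # mean of the per-fold RMSE values
print('RMSE', rmse)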
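
One way the evaluation could be organized is sketched below. It is not the reference solution: the data are synthetic (hypothetical), the step names "scaler" and "regressor" are arbitrary labels, and the choice to wrap each regressor in its own pipeline is an assumption about how the blank in the handout code might be filled.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the prepared features and response (hypothetical data).
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "x1": rng.normal(size=90),
    "x2": rng.normal(loc=50, scale=10, size=90),
})
y = 3.0 * X["x1"] - 0.2 * X["x2"] + rng.normal(scale=0.5, size=90)

# The two regressors, configured as in the handout code.
regressors = {
    "LinearRegression": LinearRegression(),
    "SVR": SVR(kernel="poly", C=100, gamma="auto", degree=2,
               epsilon=0.1, coef0=1),
}

mean_rmse = {}
for name, reg in regressors.items():
    # Scaling inside the pipeline means the scaler is fit on the training
    # folds only, so no information leaks from the validation fold.
    pipe = Pipeline([("scaler", StandardScaler()), ("regressor", reg)])
    scores = cross_val_score(pipe, X, y,
                             scoring="neg_mean_squared_error", cv=3)
    mean_rmse[name] = np.sqrt(-scores).mean()
    print(name, "RMSE", mean_rmse[name])
```

The regressor with the lower mean RMSE would be preferred. Which one that is depends on the data, so results on this synthetic example say nothing about the course dataset.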

Data Collection Capture the output of your code for the two regressors.

Data Analysis None

The European Commission's support for the production of this publication does not constitute an endorsement of the contents, which reflect
the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained
therein.

Required Reporting Submit the code you used to prepare the dataset and to evaluate the two regressors, report
the mean RMSE for each regressor, and state which regressor is best for this problem.
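
For reference, the quantity computed by the handout code (the square root of each negated fold score, then the mean) can be written out as below. The symbols K, n_k, and the fold index sets V_k are notation introduced here for illustration, with K = 3 in this experiment.

```latex
% Per-fold mean squared error on validation fold V_k with n_k samples
\mathrm{MSE}_k = \frac{1}{n_k} \sum_{i \in V_k} \left( y_i - \hat{y}_i \right)^2,
\qquad \text{score}_k = -\mathrm{MSE}_k

% Reported value: the mean of the per-fold RMSE values
\overline{\mathrm{RMSE}} = \frac{1}{K} \sum_{k=1}^{K} \sqrt{\mathrm{MSE}_k}
\;\le\; \sqrt{\frac{1}{K} \sum_{k=1}^{K} \mathrm{MSE}_k}
```

The inequality follows from the concavity of the square root, so the mean of the per-fold RMSE values is generally slightly smaller than the RMSE obtained by averaging the fold MSE values first. Using the same convention for both regressors keeps the comparison fair.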

Safety Considerations Standard safety precautions for using a computer apply.

References 1. Applied Machine Learning presentation titled “End-to-End Machine Learning Project.”
2. Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow,
O’Reilly, 3rd Edition, 2022.
