Unit 3 5

The document outlines a project to predict house prices using Simple Linear Regression based on house size. It includes steps for data loading, visualization, model implementation, and performance evaluation using Mean Squared Error (MSE) and R². The results indicate a moderate fit with an R² score of 0.596 and an MSE of 36.36, suggesting potential for improvement by including additional features.

Predicting House Prices

Objective: Use Simple Linear Regression to predict house prices based on their size.
Dataset: https://www.kaggle.com/c/boston-housing
Tasks:
1. Load and explore the dataset.
2. Create a scatter plot to visualize the relationship between house size and price.
3. Implement Simple Linear Regression to predict prices.
4. Evaluate the model's performance using R² and Mean Squared Error (MSE).
import os

import pandas as pd

# Assumes Google Drive is already mounted at /content/drive
data_path = '/content/drive/MyDrive/nkphd/bostan/'
# Load the datasets
submission_example = pd.read_csv(os.path.join(data_path,
                                              'submission_example.csv'))
train = pd.read_csv(os.path.join(data_path, 'train.csv'))
test = pd.read_csv(os.path.join(data_path, 'test.csv'))

# Display first few rows to confirm

print("Submission Example:")
print(submission_example.head())

print("\nTrain Dataset:")
print(train.head())

print("\nTest Dataset:")
print(test.head())
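
Beyond printing the first rows, the "explore" step usually includes a quick look at shape, missing values, and summary statistics. The sketch below is a minimal illustration on a hypothetical stand-in frame (the `sample` DataFrame and its hand-typed values are assumptions for demonstration, not the contents of train.csv); the same calls could be applied to `train` once it is loaded.

```python
import pandas as pd

# Hypothetical stand-in for train.csv: a few hand-typed rows with the two
# columns used later ('rm' and 'medv'). This is not the real dataset.
sample = pd.DataFrame({
    "rm": [6.575, 6.421, 7.185, 6.998, 7.147],
    "medv": [24.0, 21.6, 34.7, 33.4, 36.2],
})

# Basic exploration: size, missing values, summary statistics, correlation
n_rows, n_cols = sample.shape
missing_per_column = sample.isna().sum()
summary = sample[["rm", "medv"]].describe()
rm_medv_corr = sample["rm"].corr(sample["medv"])

print(f"Rows: {n_rows}, columns: {n_cols}")
print(missing_per_column)
print(summary)
print(f"Pearson correlation between rm and medv: {rm_medv_corr:.3f}")
```

A clearly positive correlation between `rm` and `medv` is what makes a single-feature linear model a reasonable first attempt.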
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd

# Define dataset path

data_path = '/content/drive/MyDrive/nkphd/bostan/'

# Load train dataset

train = pd.read_csv(data_path + 'train.csv')

# Scatter plot for relationship between house size ('rm') and price ('medv')
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(10, 6))
sns.scatterplot(x=train['rm'], y=train['medv'])
plt.title('Relationship Between House Size (RM) and Price (MEDV)',
fontsize=14)
plt.xlabel('Average Number of Rooms per Dwelling (RM)', fontsize=12)
plt.ylabel('Median Value of Owner-Occupied Homes (MEDV)', fontsize=12)
plt.grid(True)
plt.show()
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd

# Load the train dataset

data_path = '/content/drive/MyDrive/nkphd/bostan/'
train = pd.read_csv(data_path + 'train.csv')

# Implement Simple Linear Regression

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Prepare the data

X = train[['rm']] # Average number of rooms per dwelling
y = train['medv'] # Median value of owner-occupied homes

# Split data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)

# Initialize and fit the linear regression model

linear_regressor = LinearRegression()
linear_regressor.fit(X_train, y_train)

# Predict on the test set

y_pred = linear_regressor.predict(X_test)

# Model evaluation
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Display results
print("Mean Squared Error (MSE):", mse)
print("R-squared (R^2):", r2)
print("Coefficient (Slope):", linear_regressor.coef_[0])
print("Intercept:", linear_regressor.intercept_)
Drive already mounted at /content/drive; to attempt to forcibly remount,
call drive.mount("/content/drive", force_remount=True).
Mean Squared Error (MSE): 36.361622515889756
R-squared (R^2): 0.5959747117709422
Coefficient (Slope): 8.584424490365215
Intercept: -30.96185860010203
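
For a single feature, the slope and intercept that LinearRegression reports are the ordinary least squares estimates, which have a simple closed form: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The sketch below shows that calculation in plain Python; the helper name `fit_simple_ols` and the toy points are made up for illustration and are not taken from the dataset.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Toy points lying exactly on the line y = 2x + 1 (illustrative only)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_ols(xs, ys)
print("Slope:", slope)
print("Intercept:", intercept)
```

Because the toy points sit exactly on a line, the recovered slope and intercept match that line; on real data the same formulas give the best-fitting line in the squared-error sense.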
from sklearn.metrics import mean_squared_error, r2_score
# Calculate Mean Squared Error
mse = mean_squared_error(y_test, y_pred)

# Calculate R²
r2 = r2_score(y_test, y_pred)

# Print the results

print("Mean Squared Error (MSE):", mse)
print("R-squared (R²):", r2)

Mean Squared Error (MSE): 36.361622515889756

R-squared (R²): 0.5959747117709422
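
To make the two metrics concrete, the following sketch computes them directly from their definitions: MSE is the mean of the squared residuals, and R² is 1 minus the ratio of the residual sum of squares to the total sum of squares. The function names and the toy values are illustrative assumptions; on the same inputs these formulas should agree with `mean_squared_error` and `r2_score`.

```python
def mse_manual(y_true, y_pred):
    """Mean of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_manual(y_true, y_pred):
    """1 - SS_res / SS_tot, the share of variance explained by the model."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy values (illustrative only, not model output)
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.0, 5.0, 8.0, 9.0]

mse_demo = mse_manual(y_true_demo, y_pred_demo)
r2_demo = r2_manual(y_true_demo, y_pred_demo)
print("MSE:", mse_demo)
print("R²:", r2_demo)
```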

Evaluation of performance using R² and Mean Squared Error (MSE).

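The Simple Linear Regression model for predicting house prices was evaluated on the held-out 20% test split using MSE and R². The MSE was 36.36, which is the mean of the squared prediction errors; its square root (RMSE) is about 6.03, so a typical prediction is off by roughly 6 units of MEDV. The R² score of 0.596 shows that 59.6% of the variance in test-set house prices is explained by the number of rooms per dwelling alone. The slope of 8.58 means the predicted price rises by 8.58 units for each additional room; since MEDV in the Boston housing data is recorded in thousands of dollars, that is about $8,580 per room. While the model shows a moderate fit, including more features could improve accuracy.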
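
As a rough aid to reading these numbers, the sketch below plugs the reported slope and intercept into the fitted line and takes the square root of the reported MSE. The constants are copied from the printed output above; the helper `predict_medv` and the example room counts are assumptions added for illustration.

```python
import math

# Values copied from the model output reported in this document
MSE = 36.361622515889756
SLOPE = 8.584424490365215
INTERCEPT = -30.96185860010203


def predict_medv(rooms):
    """Predicted MEDV for a given average number of rooms (rm)."""
    return SLOPE * rooms + INTERCEPT


# RMSE is in the same units as MEDV, which makes it easier to interpret
rmse = math.sqrt(MSE)
pred_6 = predict_medv(6)
pred_7 = predict_medv(7)

print(f"RMSE: {rmse:.2f}")
print(f"Predicted MEDV at rm=6: {pred_6:.2f}")
print(f"Predicted MEDV at rm=7: {pred_7:.2f}")
print(f"Difference for one extra room: {pred_7 - pred_6:.2f}")
```

The negative intercept has no direct interpretation (a dwelling with zero rooms is outside the data); the line is only meaningful over the range of room counts actually observed.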
