
Median house prices for California districts derived from the 1990 census.


yamil-abraham/california-housing-prices-analyzing-dataset


california-housing-prices-analyzing-dataset


About Dataset

Context

This is the dataset used in the second chapter of Aurélien Géron's recent book 'Hands-On Machine Learning with Scikit-Learn and TensorFlow'. It serves as an excellent introduction to implementing machine learning algorithms because it requires only rudimentary data cleaning, has an easily understandable list of variables, and sits at a convenient size: neither too toyish nor too cumbersome.

The data contains information from the 1990 California census. So although it may not help you with predicting current housing prices like the Zillow Zestimate dataset, it does provide an accessible introductory dataset for teaching people about the basics of machine learning.

Content

The data pertains to the houses found in a given California district, along with some summary statistics about them based on the 1990 census data. Be warned: the data aren't cleaned, so some preprocessing steps are required! The columns are as follows; their names are fairly self-explanatory:

longitude

latitude

housing_median_age

total_rooms

total_bedrooms

population

households

median_income

median_house_value

ocean_proximity
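As a rough illustration of the preprocessing mentioned above, the following sketch parses rows with these ten columns, counts blank entries per column, and fills gaps in one numeric column with that column's median. It uses only the Python standard library. The sample rows are made up for illustration and are not taken from the real file, and the helper names (`load_rows`, `count_missing`, `fill_with_median`) are hypothetical. Median imputation is just one common choice, not necessarily the approach used in this repository.

```python
import csv
import io
from statistics import median

# Column names as listed above.
COLUMNS = [
    "longitude", "latitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
    "median_house_value", "ocean_proximity",
]
# Assumption: every column except ocean_proximity holds numeric values.
NUMERIC_COLUMNS = COLUMNS[:-1]

# Made-up sample rows for illustration only; the second row has a blank
# total_bedrooms field to stand in for an uncleaned entry.
SAMPLE_CSV = """\
longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value,ocean_proximity
-122.0,37.5,30.0,1000.0,200.0,500.0,180.0,4.0,250000.0,NEAR BAY
-118.0,34.0,20.0,2000.0,,900.0,350.0,3.0,180000.0,INLAND
-117.0,33.0,10.0,3000.0,600.0,1500.0,550.0,5.0,300000.0,NEAR OCEAN
"""


def load_rows(text):
    """Parse CSV text into dicts; numeric fields become float, blanks become None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for name in COLUMNS:
            value = raw[name].strip()
            if name in NUMERIC_COLUMNS:
                row[name] = float(value) if value else None
            else:
                row[name] = value
        rows.append(row)
    return rows


def count_missing(rows):
    """Return a dict mapping each column name to its number of missing entries."""
    return {
        name: sum(1 for row in rows if row[name] in (None, ""))
        for name in COLUMNS
    }


def fill_with_median(rows, column):
    """Replace missing values in a numeric column with the column median, in place."""
    present = [row[column] for row in rows if row[column] is not None]
    fill_value = median(present)
    for row in rows:
        if row[column] is None:
            row[column] = fill_value
    return fill_value


rows = load_rows(SAMPLE_CSV)
print(count_missing(rows))                      # missing entries per column
print(fill_with_median(rows, "total_bedrooms"))  # value used to fill the gaps
```

To run this against the real data, replace `SAMPLE_CSV` with the contents of the downloaded CSV file, assuming its header row matches the column names above.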

Based on: https://www.kaggle.com/datasets/camnugent/california-housing-prices
