Day 6 Introduction To Machine Learning

Uploaded by Deep gaichor

Day 6: Introduction to Machine Learning

What is Machine Learning?

Definition:

- Machine Learning (ML): A subset of artificial intelligence that enables systems to
  automatically learn and improve from experience without being explicitly programmed.

Types of ML:

1. Supervised Learning: Learn from labeled data.
2. Unsupervised Learning: Learn from unlabeled data.
3. Reinforcement Learning: Learn through trial and error by interacting with an environment.
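
To make the difference between the first two types concrete, here is a minimal sketch, assuming Scikit-Learn is installed and using its bundled Iris dataset. The choice of LogisticRegression and KMeans is only illustrative; the notes do not prescribe these algorithms for this comparison.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Features X and labels y from the bundled Iris dataset
X, y = load_iris(return_X_y=True)

# Supervised: the model is given both the features X and the labels y.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)
predicted_labels = clf.predict(X[:5])

# Unsupervised: the model is given only X and groups similar rows itself.
# The cluster ids it returns are arbitrary and need not match the labels in y.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print("Predicted labels for the first five rows:", predicted_labels)
print("Cluster ids for the first five rows:", cluster_ids[:5])
```

Note that only the supervised call passes y to fit; the clustering model never sees the labels.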

The ML Pipeline: Data Collection, Preprocessing, Model Building, Evaluation

ML Pipeline Stages:

1. Data Collection: Gather relevant data from various sources.
2. Data Preprocessing: Clean, transform, and prepare data for modeling.
3. Model Building: Select and train appropriate machine learning models.
4. Model Evaluation: Assess model performance using appropriate metrics.

Example ML Pipeline:
1. Data Collection:
- Gather customer demographic data from a database.
- Collect transaction history from online sales.

2. Data Preprocessing:
- Handle missing values.
- Normalize numerical features.
- Encode categorical variables.

3. Model Building:
- Choose a classification algorithm like Logistic Regression.
- Train the model on the preprocessed data.

4. Model Evaluation:
- Evaluate the model's accuracy, precision, recall, and F1 score.
- Validate the model's performance using cross-validation.
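
The four stages above can be sketched end to end with Scikit-Learn. This is a rough illustration under stated assumptions, not a reference implementation: the customer table is synthetic, the column names (age, total_spent, region) and the "high spender" label are hypothetical, and the label is derived from a feature only so that the example has something to learn.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# 1. Data collection: a synthetic stand-in for customer records (hypothetical columns)
rng = np.random.default_rng(0)
n = 200
customers = pd.DataFrame({
    "age": rng.integers(18, 70, size=n).astype(float),
    "total_spent": rng.gamma(2.0, 100.0, size=n),
    "region": rng.choice(["north", "south", "west"], size=n),
})
customers.loc[::25, "age"] = np.nan  # simulate missing values
# Hypothetical label: 1 if the customer spent more than the median
y = (customers["total_spent"] > customers["total_spent"].median()).astype(int)

# 2. Data preprocessing: impute and scale numeric columns, one-hot encode categories
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["age", "total_spent"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# 3. Model building: preprocessing and classifier chained into one estimator
model = Pipeline([("preprocess", preprocess),
                  ("classify", LogisticRegression(max_iter=1000))])

# 4. Model evaluation: 5-fold cross-validated F1 score
scores = cross_val_score(model, customers, y, cv=5, scoring="f1")
print("F1 per fold:", scores.round(3))
```

Wrapping the preprocessing steps in the same Pipeline as the classifier means they are refit on each training fold, so the held-out fold does not influence imputation or scaling.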

Introduction to Scikit-Learn

Scikit-Learn:

- Open-source machine learning library for Python
- Provides simple and efficient tools for data mining and data analysis
- Supports various machine learning algorithms and tools for model evaluation and selection

Installing Scikit-Learn:
pip install scikit-learn

Example Usage:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# X (feature matrix) and y (labels) are assumed to be loaded already

# Split data into training and testing sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Initialize and train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate the model: fraction of test labels predicted correctly
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

This concludes the note for Day 6: Introduction to Machine Learning.


Day 6: Introduction to Machine Learning

What is Machine Learning? Types of ML

Machine Learning (ML) is a subset of artificial intelligence (AI) that focuses on the
development of algorithms and statistical models that enable computers to perform tasks
without explicit instructions. There are three main types of ML:

1. Supervised Learning: In supervised learning, the model is trained on a labeled dataset,
meaning that each input data point is associated with a corresponding target variable.
The goal is to learn a mapping from input to output.
2. Unsupervised Learning: Unsupervised learning involves training the model on an
unlabeled dataset, where the algorithm tries to find patterns or intrinsic structures in
the data. It's often used for clustering and dimensionality reduction tasks.
3. Reinforcement Learning: Reinforcement learning is a type of ML where an agent
learns to make decisions by interacting with an environment. It receives feedback in
the form of rewards or penalties, allowing it to learn the optimal behavior through
trial and error.
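
Trial-and-error learning from rewards can be illustrated with a multi-armed bandit, one of the simplest reinforcement learning settings (it has actions and rewards but no changing state). The sketch below uses an epsilon-greedy strategy. The function name run_bandit, the reward means, and the parameter values are assumptions made up for this illustration.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with Gaussian rewards (illustrative)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each action was tried
    estimates = [0.0] * n_arms   # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random action
        else:
            # exploit: pick the action with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns a noisy reward (feedback) for the action
        reward = rng.gauss(true_means[arm], 1.0)

        # Update the running average for the chosen action
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


counts, estimates = run_bandit([0.2, 0.5, 0.9])
print("Pulls per action:", counts)
print("Estimated rewards:", [round(e, 2) for e in estimates])
```

The agent is never told which action is best; it can only estimate that from the rewards it receives, which is the feedback loop described in item 3.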

The ML Pipeline: Data Collection, Preprocessing, Model Building, Evaluation

The ML pipeline outlines the typical workflow of a machine learning project:

1. Data Collection: Gathering relevant data from various sources, ensuring data quality,
and understanding the problem domain.
2. Data Preprocessing: Cleaning the data by handling missing values, encoding
categorical variables, scaling features, and splitting the data into training and testing
sets.
3. Model Building: Selecting an appropriate machine learning algorithm based on the
problem type and dataset, training the model on the training data, and tuning
hyperparameters to optimize performance.
4. Evaluation: Assessing the model's performance on unseen data using evaluation
metrics such as accuracy, precision, recall, F1-score, etc. It involves comparing the
model's predictions with the actual labels to measure its effectiveness.
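
The metrics named in the evaluation step follow directly from counts of correct and incorrect predictions. The following is a small plain-Python sketch for the binary case; the helper name classification_metrics and the example labels are made up for illustration, and Scikit-Learn's sklearn.metrics module provides equivalent functions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 1 false positive, 1 false negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision asks how many predicted positives were right, recall asks how many actual positives were found, and F1 is the harmonic mean of the two.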

Introduction to Scikit-Learn

Scikit-Learn is a popular machine learning library in Python that provides simple and
efficient tools for data mining and data analysis. It offers various algorithms for
classification, regression, clustering, dimensionality reduction, and model selection.

# Example of using Scikit-Learn to build and evaluate a machine learning model
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into training and testing sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Initialize the Logistic Regression model
# (max_iter is raised above the default of 100 to help the solver converge
# on unscaled features)
model = LogisticRegression(max_iter=200)

# Train the model on the training data
model.fit(X_train, y_train)

# Make predictions on the testing data
predictions = model.predict(X_test)

# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, predictions)
print("Accuracy:", accuracy)

Scikit-Learn provides a user-friendly interface for implementing machine learning
algorithms, making it accessible to both beginners and experts in the field.
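
One reason the interface is approachable is that Scikit-Learn estimators share the same fit/predict methods, so trying a different algorithm usually means changing a single line. A brief sketch of that idea, assuming the bundled Iris dataset; the three classifiers and their settings are arbitrary picks for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three different algorithms behind the same estimator interface
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, estimator in candidates.items():
    # 5-fold cross-validated accuracy; the call is identical for every estimator
    scores = cross_val_score(estimator, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```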
