
Introduction to Machine Learning

Machine Learning vs Normal Programming


Machine Learning Model
How does Machine Learning Learn?
Supervised and Unsupervised Learning
Supervised Learning (Examples)
Unsupervised Learning (Examples)
Reinforcement Learning
Semi-Supervised Learning
Online Learning
Active Learning
Transfer Learning
Mandatory Machine Learning Steps
Data Collection
Data Science
Algorithm selection
Model Construction and Validation
Artificial intelligence - Concerns all of the techniques used to make machines capable of performing complex tasks (usually reserved for humans) and thereby simulating a sort of intelligence.
Machine intelligence - Computational elements collaborate in order to control and manage a physical entity
- An integration of computation, networking, and physical processes
- Enables a machine to interact with an environment in an intelligent way
- Applications: transportation, medical health monitoring, robotics, aeronautics, manufacturing
Machine Learning - A technique for using computers to predict things based on past observations.

Machine Learning vs Normal Programming


In a traditional piece of software, a programmer designs an algorithm that takes an input, applies various rules, and returns an output.
The algorithm’s internal operations are planned out by the programmer and implemented explicitly through lines of code.
To predict breakdowns in a factory machine, the programmer would need to understand which measurements in the data indicate a problem and write code that checks for them.
To create a machine learning program:
● A programmer feeds data into a special kind of algorithm and lets the algorithm discover the rules. This means that as programmers, we can create programs that make predictions based on complex data.
● The machine learning algorithm builds a model of the system based on the data we provide, through a process we call training.
● The model is a type of computer program. We run data through this model to make predictions, in a process called inference.
There are many different approaches to machine learning. One of the most popular is deep learning, which is based on a simplified idea of how the human brain might work.
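
As a minimal sketch of this contrast (not part of the original notes), assuming hypothetical temperature and vibration readings and scikit-learn: the first function encodes the breakdown rule by hand, while the second lets a learning algorithm discover the rule from labeled data during training and then uses the model for inference.

# Minimal sketch contrasting the two approaches (hypothetical sensor data).

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Traditional programming: the rule is written explicitly by the programmer.
def rule_based_breakdown(temperature, vibration):
    # Thresholds chosen by hand from domain knowledge (assumed values).
    return temperature > 90.0 or vibration > 0.8

# Machine learning: the algorithm discovers the rules from labeled data.
X_train = np.array([[70, 0.2], [95, 0.9], [80, 0.3], [92, 0.7]])  # [temperature, vibration]
y_train = np.array([0, 1, 0, 1])                                  # 0 = ok, 1 = breakdown

model = DecisionTreeClassifier().fit(X_train, y_train)   # training
print(model.predict([[93, 0.85]]))                        # inference on new data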

Machine Learning Model


A model h can learn a task T based on the experience E.
Thus we introduce the input matrix X ∈ R^(N×L) and the scalar function h(X).

Parameter Based Models - Use the training dataset to create a model based on a parameterization.

Instance Based Models - Use the complete dataset to create a model without parameterization.
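
A small illustrative sketch of the two model families, assuming scikit-learn and made-up data: linear regression compresses the training set into a few parameters, while k-nearest neighbours keeps the training instances themselves.

# Sketch: parameter-based vs. instance-based models (illustrative data).

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])

# Parameter-based: training compresses the data into a few parameters (slope, intercept).
param_model = LinearRegression().fit(X, y)
print(param_model.coef_, param_model.intercept_)

# Instance-based: the "model" keeps the training instances and compares new points to them.
instance_model = KNeighborsRegressor(n_neighbors=2).fit(X, y)
print(instance_model.predict([[2.5]]))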

How does Machine Learning Learn?

Classifying iris subtypes (supervised)
- Uses input data
- Class labels known
- Task: Classification

Learning time patterns (unsupervised)
- Uses input data
- Without class labels
- Pattern unknown

Games (reinforcement)
- Without input data
- Interacts with a player
- Learns by playing
In other words, there are 3 types of learning:
● Supervised Learning
● Unsupervised Learning
● Reinforcement Learning

Supervised and Unsupervised Learning

Supervised Learning
- Definition: Adjusts a model by minimizing the error between the model output and the labels
- Example classification: fault detection, image classification
- Example regression: forecasting, prediction, estimation, remaining useful lifetime predictions

Unsupervised Learning
- Definition: Finds relationships (similarities, statistical relationships, ...) between the inputs without labeled input
- Example: clustering, anomaly detection, recommender systems
- Example dimensionality reduction: visualization, data comprehension, feature engineering
There are 4 Learning Categories:
● Classification
● Regression
● Clustering
● Density estimation

Supervised Learning (Examples)

Both examples have labels and therefore fall under supervised learning.
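
A hedged sketch of both supervised settings, assuming scikit-learn; the classification part uses the iris dataset, while the regression data below is invented for illustration.

# Sketch: two supervised problems, both learned from labeled data.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: inputs X with known class labels y (iris species).
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))

# Regression: labels are continuous values, e.g. a quantity we want to forecast.
hours = np.array([[1], [2], [3], [4], [5]])
output = np.array([1.9, 4.1, 6.0, 8.2, 9.9])
reg = LinearRegression().fit(hours, output)
print(reg.predict([[6]]))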

Unsupervised Learning (Examples)

More examples:
● Clustering
○ Automatically separate customers for better marketing campaigns
○ Clustering as an exploration tool to understand the data and make informed decisions
● Dimensionality Reduction
○ Compress data
○ Visualize a dataset in a reduced space
○ Learn with missing labels; used for search engines and recommender systems
● Anomaly Detection
○ Concerns learning which data points are outliers
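
As an illustrative sketch (assuming scikit-learn and randomly generated data), clustering and dimensionality reduction both operate on inputs without any labels.

# Sketch: unsupervised learning on unlabeled data (illustrative, not from the notes).

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # unlabeled samples with 5 features

# Clustering: group similar samples (e.g. customers) without any labels.
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)

# Dimensionality reduction: compress 5 features to 2 for visualization.
X_2d = PCA(n_components=2).fit_transform(X)
print(clusters[:10], X_2d.shape)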
Reinforcement Learning

Agent (software)
- Learns a policy, taking actions to maximize a cumulative reward
- Has no teacher showing it which action is best
- Learns by interacting with the environment

Environment
- Changes with each agent action to a new state
- Updates the cumulative reward after each action
- Can be modeled as a Markov Decision Process

Reward and State
- The reward is a problem-dependent function that penalizes undesirable actions and rewards desirable actions
- The states are problem specific
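
A minimal sketch of this setting, assuming a made-up five-state corridor environment and tabular Q-learning: the agent learns by interacting with the environment and maximizing the cumulative reward.

# Sketch: tabular Q-learning on a toy 5-state corridor (illustrative MDP, not from the notes).

import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

def step(state, action):
    # Environment: moving right out of the second-to-last state yields a reward of +1.
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if (state == n_states - 2 and action == 1) else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection: explore sometimes, otherwise exploit Q.
        action = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q toward reward + discounted best future value.
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)   # learned action values; "right" should dominate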

Semi-Supervised Learning
Definition:
● Has both labeled and unlabeled samples
● Learn a model for the complete data set
Approaches:
● Mixing supervised and unsupervised learning
● Train the model, predict the missing labels and retrain
Example:
● In object detection, manually labeling class labels is time consuming
● Typical scenario: labeled and unlabeled data available
● Some images come with class labels, others do not
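
A possible sketch of the "train, predict missing labels, retrain" approach, assuming scikit-learn's SelfTrainingClassifier and iris data with most labels artificially removed.

# Sketch: self-training on partially labeled data (assumes scikit-learn >= 0.24).

import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Pretend most labels are missing: unlabeled samples are marked with -1.
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.7] = -1

base = SVC(probability=True)                       # base supervised learner
model = SelfTrainingClassifier(base).fit(X, y_partial)
print(model.predict(X[:5]))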
Online Learning
Definition:
● Model updates after each sample/batch
● For large data sets (e.g. fraud detection)
● Only a few algorithms can learn online
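
A brief sketch, assuming scikit-learn's SGDClassifier (one of the estimators that supports incremental updates) and a simulated data stream.

# Sketch: online learning with incremental updates on a simulated stream.

import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()
classes = np.array([0, 1])
rng = np.random.default_rng(0)

for batch in range(10):
    # In a real system each batch would arrive from a stream (e.g. transactions).
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)   # update, do not retrain from scratch

print(model.predict(rng.normal(size=(3, 4))))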

Active Learning
Definition:
● Learn a model via interaction with a teacher
● The teacher can be a human or a program
● A subset of reinforcement learning
● The agent cannot alter the environment
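
A sketch of one common active-learning strategy, uncertainty sampling, where the labels returned by the "teacher" are simulated from the iris dataset (these assumptions are not in the notes).

# Sketch: pool-based active learning with uncertainty sampling (simulated teacher).

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
labeled = list(range(0, 150, 30))            # start with a handful of labeled samples
unlabeled = [i for i in range(len(X)) if i not in labeled]

for round_ in range(5):
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    # Query the sample the model is least certain about.
    probs = model.predict_proba(X[unlabeled])
    query = unlabeled[int(np.argmin(probs.max(axis=1)))]
    labeled.append(query)                    # the "teacher" provides the label y[query]
    unlabeled.remove(query)

print(len(labeled), "labeled samples after querying the teacher")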

Transfer Learning
Definition:
● Transfers knowledge from one learning task to another
● Typical application: learning more efficiently from a small dataset when a large, different dataset is available
○ Model A is trained on the large dataset
○ The pre-trained model A initializes model B
○ Model B is trained on the small dataset
Why use transfer learning:
● Better overall performance in fewer training iterations
● For small training datasets
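
A minimal sketch of the "model A initializes model B" idea, assuming PyTorch/torchvision (not mentioned in the notes) and a hypothetical 5-class target task.

# Sketch: transfer learning with a pre-trained model A initializing model B.

import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Model A: trained on the large ImageNet dataset, weights downloaded here.
model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)

# Freeze the transferred layers so only the new head is trained at first.
for param in model.parameters():
    param.requires_grad = False

# Model B: replace the final layer to match the small target dataset (e.g. 5 classes).
model.fc = nn.Linear(model.fc.in_features, 5)

# Model B would now be trained on the small dataset with a standard training loop.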

Mandatory Machine Learning Steps

Steps:
● Collect data X
● Analyse, clean and process the data; carry out feature engineering
● Choose an algorithm suitable to solve the task
● Construct and validate the model; adapt the model if required
● Implement the model and predict ŷ = h(x)
Data Collection
Rich datasets can be crucial for a machine learning workflow and impact the final success.
Data collection requires a lot of time, resources and preparation:
● Data collection design
● Which features?
● How many samples?
● How can we measure (sensors = automatic collection)?
● How can we record the data?
● Where do we save the data? On chip?

Data Science
Steps that transform the data towards being usable for machine learning:
● Data cleaning
○ Missing values: delete or fill missing values (e.g. using interpolation)
○ Up- and downsampling
○ Filtering the data
● Data integration
○ Merging data from different sources
● Data transformation
○ Normalization → rescaling attributes to a joint scale
● Feature extraction
○ Mathematical part
○ Selecting structured data from unstructured datasets; features are created from existing measurements (most common: statistical features like mean, max, min, and standard deviation)
○ We need it because:
■ Learning from data without extracting features may require a high number of variables
■ A high number of variables
● means a high amount of memory and computational power
● may cause overfitting: the model has so many degrees of freedom that it will perfectly match the training data, but perform poorly on the test data
● Feature selection
○ Keeping features that improve the trained model and discarding the rest
● Feature reduction
○ Transforming a high-dimensional into a lower-dimensional feature space
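
A small sketch of feature extraction and transformation, assuming a made-up sensor signal, statistical window features and scikit-learn's StandardScaler.

# Sketch: simple feature extraction and normalization (hypothetical sensor signal).

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
signal = rng.normal(size=1000)                 # raw measurements
windows = signal.reshape(50, 20)               # 50 windows of 20 samples each

# Feature extraction: statistical features per window (mean, max, min, std).
features = np.column_stack([
    windows.mean(axis=1),
    windows.max(axis=1),
    windows.min(axis=1),
    windows.std(axis=1),
])

# Data transformation: rescale all features to a joint scale.
features_scaled = StandardScaler().fit_transform(features)
print(features_scaled.shape)                   # (50, 4) -> far fewer variables than raw data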
Algorithm selection

To choose an algorithm we need to ask ourselves the following questions:
● Is the data labeled or not, or can we interact with an environment?
○ Labeled data - Supervised learning
○ Unlabeled data - Unsupervised learning
○ Interaction with the environment - Reinforcement learning
● Which task do we want to solve?
○ Classification, regression, clustering, prediction, anomaly detection, dimensionality reduction, real-time decisions
○ Depends on the output we want
● How many features and how many samples do we have?
○ Some algorithms can deal with large numbers of samples and features
■ Example: neural networks
○ Some algorithms cannot
■ Example: support vector machines
● Do we have numerical, categorical or mixed data sets?
○ Not all algorithms can deal with categorical data (some require one-hot encoding)
● What are other considerations?
○ Training and prediction speed, memory constraints, use of historical data, privacy preservation issues and explainability
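
A short sketch of the one-hot encoding mentioned above, assuming pandas and invented column names.

# Sketch: one-hot encoding categorical data so that any algorithm can use it.

import pandas as pd

data = pd.DataFrame({
    "temperature": [70.0, 95.0, 80.0],
    "machine_type": ["press", "drill", "press"],   # categorical feature
})

# Each category becomes its own 0/1 column.
encoded = pd.get_dummies(data, columns=["machine_type"])
print(encoded)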

Model Construction and Validation


● Divide the data into
○ Training part
○ Test part
● Machine learning constructs the model
● Test the model: Ŷ_test = h(X_test)
● Validation compares Y_test and Ŷ_test (calculates the error)
● Note: still neglected here
○ Hyper-parameter tuning
○ Model selection
■ using a validation dataset
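
A sketch of this construction-and-validation loop, assuming scikit-learn and synthetic data: the model h is built on the training part and the error is computed by comparing Y_test with Ŷ_test.

# Sketch: model construction and validation with a train/test split (illustrative data).

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Divide the data into a training part and a test part.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Machine learning constructs the model h from the training part.
h = LinearRegression().fit(X_train, y_train)

# Test the model: Y_hat_test = h(X_test), then compare against Y_test (the error).
y_hat_test = h.predict(X_test)
print("test error:", mean_squared_error(y_test, y_hat_test))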
