Data Management and Data Transformation, Introduction To Machine Learning
By
Dr. Amod Kumar Tiwari
Associate Professor, CSED
Rajkiya Engineering College, Sonbhadra
Outline
1. Preface
2. Definition
3. Introduction to Machine Learning (ML)
4. Need for ML
5. Types of Learning in ML
6. Applications of ML
7. Limitations of ML
8. ML and Data Management
9. Data Transformation in ML
Preface
DATA, DATA EVERYWHERE…
Widespread use of personal computers and wireless communication
leads to “big data”.
We are both producers and consumers of data.
Data is not random; it has structure, e.g., customer behavior. We
need “big theory” to extract that structure from data for:
Understanding the process
Making predictions for the future
Storing and processing such huge volumes of data is a major challenge.
Extracting meaningful insight from the data pile is even more challenging.
Extracted information is highly significant and aids decision
making.
But is the data always valuable?
Definition
DATA: What is it?
Data is a collection of raw facts and figures that have no meaning on
their own but, when processed, lead to meaningful information.
Data Can Toil/Spoil…
Introduction To Machine Learning (ML)
Need For ML
For tasks that are easily performed by humans but are complex for
computer systems to emulate (so that machines can take over such
tasks from humans), for example:
Vision: Identify faces in a photograph, objects in a video or still
image, etc.
Natural Language Processing: Translate a sentence from Hindi to
English, answer questions, identify the sentiment of a text, etc.
Speech Recognition: Recognize spoken words, speak sentences
naturally.
Game playing: Play games like chess, Go, Dota.
Robotics: Walking, jumping, displaying emotions, driverless cars, etc.
Cont.
Fields where there are very few (almost no) human experts:
Industrial/manufacturing control
Testing and Quality Assurance
Mass spectrometer analysis
Drug design
Astronomical discovery
Types of Learning in ML
Supervised learning: In this case, we teach or train the machine
using data that is correctly labeled.
Unsupervised learning: Unlike supervised learning, this ML type does
not need labeled inputs with corresponding outputs. Instead,
unsupervised learning uses unlabeled input data and determines the
structure of the set.
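The contrast between learning from labeled data and finding structure in unlabeled data can be sketched with plain Python. This is only an illustration under stated assumptions: the toy values, the nearest-neighbour rule, and the two-cluster k-means routine (`predict_nearest`, `two_means`) are hypothetical choices made for this example, not methods taken from the slides.

```python
# Supervised learning: labeled examples (feature, label) are given.
# A 1-nearest-neighbour rule predicts the label of the closest training point.
labeled_data = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]  # hypothetical toy data

def predict_nearest(x, training_data):
    """Return the label of the training example whose feature is closest to x."""
    _, label = min(training_data, key=lambda pair: abs(pair[0] - x))
    return label

prediction = predict_nearest(7.4, labeled_data)

# Unsupervised learning: only the features are given, no labels.
# A tiny 1-D k-means with k=2 discovers two groups from the values alone.
unlabeled_data = [1.0, 1.2, 8.0, 8.5]

def two_means(values, iterations=10):
    """Split values into two clusters by alternating assignment and mean update."""
    centers = [min(values), max(values)]  # simple deterministic initialisation
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # assign each value to the nearer centre
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[idx].append(v)
        # move each centre to the mean of its cluster (keep the old centre if empty)
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

clusters = two_means(unlabeled_data)
print(prediction)  # large
print(clusters)    # [[1.0, 1.2], [8.0, 8.5]]
```

The supervised part needs the labels "small" and "large" to make a prediction, while the unsupervised part recovers the same two groups without ever seeing a label.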
ii. Nominal: Any categorical data that has no order is called nominal
categorical data. Examples include gender and country.
There are some algorithms that can work well with categorical
data, such as decision trees.
But most machine learning algorithms cannot operate directly
on categorical data. These algorithms require both the input and
the output to be in numerical form.
If the output to be predicted is categorical, then after prediction
we convert it back from numerical form to categorical form.
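A minimal sketch of this round trip, using only the Python standard library, is given here. The column values and the "model output" are made up for illustration, and one-hot encoding is shown as one common choice for nominal inputs rather than a method prescribed by these slides. In practice, helpers such as `pandas.get_dummies` or scikit-learn's `LabelEncoder` are often used for the same purpose.

```python
# Hypothetical nominal feature and categorical target, for illustration only.
countries = ["India", "Japan", "India", "Brazil"]
targets = ["yes", "no", "yes", "yes"]

# One-hot encode the nominal feature: one 0/1 column per category, no implied order.
country_categories = sorted(set(countries))  # ['Brazil', 'India', 'Japan']
one_hot = [[1 if c == cat else 0 for cat in country_categories] for c in countries]

# Label-encode the target so a numeric-only algorithm can be trained on it.
target_classes = sorted(set(targets))  # ['no', 'yes']
class_to_code = {cls: code for code, cls in enumerate(target_classes)}
encoded_targets = [class_to_code[t] for t in targets]

# Suppose a model outputs these numeric predictions (made-up values).
numeric_predictions = [1, 0, 0]

# Convert the numeric predictions back to the original categorical labels.
decoded_predictions = [target_classes[code] for code in numeric_predictions]

print(one_hot)              # [[0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
print(encoded_targets)      # [1, 0, 1, 1]
print(decoded_predictions)  # ['yes', 'no', 'no']
```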
Let’s discuss some key challenges that we face while dealing with
categorical data:
Data Transformation in ML
ii. Rare occurrences: A categorical column might contain values that
occur very rarely and therefore are not significant enough to
have an impact on the model.
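One common way to handle rarely occurring values is to merge them into a single catch-all category before encoding. The sketch below assumes that approach; the helper `group_rare`, the threshold `min_count`, the label "Other", and the sample data are all hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical categorical column; "Pune" and "Agra" each appear only once.
cities = ["Delhi", "Delhi", "Mumbai", "Delhi", "Mumbai", "Pune", "Agra", "Mumbai"]

def group_rare(values, min_count=2, other_label="Other"):
    """Replace every value seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

grouped = group_rare(cities)
print(grouped)
# ['Delhi', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai', 'Other', 'Other', 'Mumbai']
```

The threshold is a judgment call: set too high, it discards categories that carry real signal; set too low, the rare values remain.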
Using inplace=True
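The code from the original slide is not preserved here, so the following is only a sketch of what `inplace=True` does, assuming the pandas library. The DataFrame contents and the column name `temp_id` are invented for illustration.

```python
import pandas as pd

# Hypothetical data: "temp_id" stands in for a column we want to discard.
df = pd.DataFrame({"city": ["Delhi", "Mumbai", "Pune"], "temp_id": [1, 2, 3]})

# Without inplace: drop() returns a new DataFrame and leaves df unchanged.
dropped_copy = df.drop(columns=["temp_id"])
columns_after_copy = list(df.columns)      # df still has both columns

# With inplace=True: df itself is modified and the call returns None.
result = df.drop(columns=["temp_id"], inplace=True)
columns_after_inplace = list(df.columns)   # df now has only "city"

print(columns_after_copy)     # ['city', 'temp_id']
print(result)                 # None
print(columns_after_inplace)  # ['city']
```

Because an in-place call returns `None`, writing `df = df.drop(..., inplace=True)` would overwrite `df` with `None`; either reassign the returned copy or use `inplace=True`, not both.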
Resources
Books
E. Alpaydin, Introduction to Machine Learning, 3rd Edition, MIT Press, 2014.
C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2016.
Lecture Notes
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT
Press (V1.1)
https://www.javatpoint.com/applications-of-machine-learning
Websites
Geeksforgeeks.org
Medium.com
Towardsdatascience.com
https://en.wikipedia.org/wiki/Data_transformation