Data Science - Lecture - 4

Introduction

Data Science

Swati Chopade

VJTI, Mumbai

September 14, 2022

Swati Chopade Data Science


Introduction Modelling Data

Outline

1 Introduction
Modelling Data

Modelling Data

Modelling data
Let us work with patient data that contains different readings
for each patient, such as blood sugar level, cholesterol level,
and so on.
Let's say the blood sugar level is stored in a column named
'x1' and we have sorted the data in ascending order.

This data is clustered around the region close to 117, with
some smaller values and some very large values, so it looks
like a normal distribution.
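
The observation above can be checked with a small sketch. The readings here are synthetic: the centre of 117 comes from the slide, while the sample size, the spread of 15, and the variable names are assumptions made only for illustration. If the data is roughly normal, about 68% of the values should lie within one standard deviation of the mean and about 95% within two.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical readings: 500 blood sugar values centred on 117.
# The spread of 15 is an assumption made only for this illustration.
x1 = sorted(random.gauss(117, 15) for _ in range(500))

mean = statistics.mean(x1)
std = statistics.stdev(x1)

# For normally distributed data, about 68% of values lie within one
# standard deviation of the mean and about 95% within two.
within_1 = sum(abs(v - mean) <= std for v in x1) / len(x1)
within_2 = sum(abs(v - mean) <= 2 * std for v in x1) / len(x1)

print(f"mean={mean:.1f} std={std:.1f}")
print(f"within 1 std: {within_1:.2f}, within 2 std: {within_2:.2f}")
```

This is only a rough check; a goodness-of-fit test would give a formal answer.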

Statistical Modelling: Underlying data distribution
For example, suppose we are dealing with patient data that
contains different readings for each patient, such as blood
sugar level, pulse rate, and cholesterol level, and a new drug
has come onto the market for a medical condition such as high
blood sugar or high cholesterol.
A doctor is interested in knowing how effective this new drug
is. To answer this question thoroughly, a randomized controlled
experiment is required.
You take a sample of patients (the subjects) and administer the
drug to them. In statistical modelling, we assume simple
statistical models, which allow robust statistical analysis and
give statistical guarantees (p-values, goodness-of-fit tests).

Data Modelling

Figure: Data Modelling

Data Modelling

Figure: Data Modelling

The new drug is then administered for a duration of, say, 3
months, and the readings are noted down again. Suppose they
again follow a normal distribution, as depicted in the image
below; the mean has now shifted towards the lower side.

Data Modelling

Figure: Data Modelling
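
A shift in the mean like the one described above can be tested with a paired comparison. The following is a minimal sketch, not the procedure used in the lecture: the data is synthetic, the sample size of 200, the spread, and the average drop of 8 points are assumptions, and the p-value uses a normal approximation to the t distribution, which is reasonable only because the sample is large.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(1)

# Hypothetical data: 200 patients, readings before and after 3 months.
# The baseline of 117 comes from the slides; the spread and the size of
# the drop (8 points on average) are assumptions for this sketch.
before = [random.gauss(117, 15) for _ in range(200)]
after = [b - random.gauss(8, 10) for b in before]

# Paired differences: a positive value means the reading went down.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(n)

# Test statistic for H0: mean difference = 0.
# With n = 200 the normal approximation to the t distribution is close.
z = mean_diff / std_err
p_value = 1 - NormalDist().cdf(z)  # one-sided: did the mean decrease?

print(f"mean drop = {mean_diff:.2f}, z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.01:
    print("Reject H0 at the 1% level: the mean reading has shifted lower.")
```

A p-value below 0.01 is what lies behind a statement such as "I am 99% sure that the drug is effective".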

The data scientist says that the data follows a normal
distribution, so the mean and variance are sufficient to
describe it. We are looking for a robust argument such as
"I am 99% sure that the drug is effective."
You also have to find the underlying relationships in the data,
such as the relationship between the blood sugar level and the
number of days of treatment. The data scientist can say that
there is a linear relationship between the blood sugar level
and the number of days: the level decreases as the number of
days increases.
The data scientist should be able to say, "I am 99% sure that
the sugar level drops by 3 points for each day of the
treatment." Such a robust statement is used to argue whether
or not the drug should be used.
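
The linear relationship and the "99% sure" statement can be illustrated with a least-squares fit and a confidence interval for the slope. This is a sketch under stated assumptions: the readings are synthetic, the starting level of 210, the 30-day window, and the noise level are invented, and the interval uses the normal quantile as an approximation to Student's t.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(2)

# Hypothetical data: one reading per day for 30 days. The true slope of
# -3 points per day echoes the slide's example; the starting level of
# 210 and the noise level are assumptions.
days = list(range(30))
sugar = [210 - 3 * d + random.gauss(0, 10) for d in days]

# Ordinary least squares for sugar = intercept + slope * day.
d_bar, s_bar = mean(days), mean(sugar)
sxx = sum((d - d_bar) ** 2 for d in days)
sxy = sum((d - d_bar) * (s - s_bar) for d, s in zip(days, sugar))
slope = sxy / sxx
intercept = s_bar - slope * d_bar

# Standard error of the slope from the residual variance.
residuals = [s - (intercept + slope * d) for d, s in zip(days, sugar)]
sigma2 = sum(r ** 2 for r in residuals) / (len(days) - 2)
se_slope = math.sqrt(sigma2 / sxx)

# 99% interval using the normal quantile (approximation to Student's t).
z = NormalDist().inv_cdf(0.995)
low, high = slope - z * se_slope, slope + z * se_slope
print(f"slope = {slope:.2f}, 99% CI = [{low:.2f}, {high:.2f}]")
```

If the whole interval lies below zero, the data supports the claim that the sugar level drops with each day of treatment.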

Data Modelling

Figure: Data Modelling

Data Modelling

Figure: Data Modelling

Data Modelling

Figure: Data Modelling

Data Modelling

Figure: Data Modelling

Algorithmic Modelling
An alternative approach to modelling is algorithmic modelling,
which is loosely machine learning modelling. In statistical
modelling, we made very simplified models of the data, and the
need for statistical guarantees limits the models you can use.
You cannot use a very complex model: for example, if the
relationship between input and output were the log of the cube
of the sine of e raised to x, y = log((sin(e^x))^3),
statistical analysis of it would not be feasible.
In the real world, however, the relationship is much more
complex and depends on many factors that we are not
considering. In such cases, we need the alternative approach:
build complex models.

Algorithmic Data Modelling

Figure: Algorithm Modelling

Algorithmic Modelling
Machine learning allows a large family of very complex
functions, so relationships can be modelled with very complex
functions.
What is the goal of machine learning? The goal is to estimate
the function f using data and optimization techniques. ML
allows very complex functions to be chosen to represent the
relationships between the variables of the data. For a new
patient, plug in the value of x (age, weight, blood pressure)
to get y. The focus is on prediction, not on the underlying
phenomena: we are not interested in knowing how much blood
sugar depends on age, weight, or blood pressure. What matters
is that the answer we get is very close to the true answer;
the prediction should be very accurate.
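
The idea of estimating f from data with an optimization technique can be sketched with gradient descent. To keep the example short, the function family here is deliberately the simplest one (a linear model); real ML models are far more complex, but the fitting loop has the same shape. The patient data, the feature scaling, the learning rate, and all names are assumptions made for illustration.

```python
import random

random.seed(3)

# Hypothetical patients: features (age, weight, blood pressure) scaled
# to roughly [0, 1], and a made-up target y. None of this is real data.
def make_patient():
    x = [random.random() for _ in range(3)]
    y = 90 + 40 * x[0] + 25 * x[1] + 30 * x[2] + random.gauss(0, 2)
    return x, y

def predict(w, b, x):
    # Model: f(x) = w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mse(w, b, data):
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

data = [make_patient() for _ in range(100)]
w, b = [0.0, 0.0, 0.0], 0.0
initial_loss = mse(w, b, data)

lr = 0.1  # learning rate, chosen by hand for this sketch
for _ in range(1000):
    # Gradient of the mean squared error with respect to w and b.
    grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
    for x, y in data:
        err = predict(w, b, x) - y
        for i in range(3):
            grad_w[i] += 2 * err * x[i] / len(data)
        grad_b += 2 * err / len(data)
    w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    b = b - lr * grad_b

final_loss = mse(w, b, data)

# For a new patient, plug in x to get y.
new_patient = [0.5, 0.4, 0.6]
prediction = predict(w, b, new_patient)
print(f"loss {initial_loss:.1f} -> {final_loss:.2f}, prediction {prediction:.1f}")
```

Note that the loop never asks how much y depends on each feature; it only drives the prediction error down, which matches the focus on prediction described above.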

Algorithmic Modelling

Figure: Algorithmic Modelling

Algorithmic Modelling

Figure: Algorithmic Modelling

Algorithmic Modelling

Figure: Algorithmic Modelling


Dierence between Statistical Modelling and Algorithmic


Modelling

Figure: Algorithmic Modelling


Statistical Modelling

Figure: Statistical Modelling

Algorithmic Modelling: DL
When you have large amounts of high-dimensional data and you
want to learn very complex relationships between the input and
the output, use a specific class of complex ML models and
algorithms collectively referred to as deep learning (DL).
For example, consider a retina image of 256*256 pixels from
which you want to predict whether the patient is suffering
from diabetic retinopathy: use deep learning.
Why is DL popular? A large amount of data is available in many
scenarios in the form of text, speech, and images, and the
relationships between the variables are very complex. Good
software frameworks such as PyTorch and much better computers
are also available.

Algorithmic Modelling

Figure: Algorithmic Modelling

Algorithmic Modelling

Figure: Algorithmic Modelling

Algorithmic Modelling

Figure: Algorithmic Modelling

Thank You
