Lecture 1 Introduction
Environmental Problems and Statistics
Structure of course teaching
• Introduction
– engineering problem and statistical method.
• Case study
– introduce a specific environmental example with
real world data
• Method
– give a brief explanation of statistical method that is
used to prepare the solution.
• Analysis
– show how the data suggest and influence the
method of analysis and give the solution.
Brief Review of Statistics
• DESCRIPTIVE Statistics
– For one variable ("univariate analysis"):
– Measures of "CENTRAL TENDENCY" (averages) and
of DISPERSION (spread) around that average.
– Examples: means, modes, medians, standard
deviation, quartiles, etc.
• INFERENTIAL Statistics
– Measures of the SIGNIFICANCE of the relationship
between two or more variables. Significance refers to
the probability that the findings could be attributed to
sampling error.
– Appropriate statistics depend on the LEVEL OF
MEASUREMENT OF THE DEPENDENT VARIABLE
(and of the independent variable).
– Examples: t-test, ANOVA (F-ratio)
Let's Get Familiar with Excel Advanced Tools
• Formulas in Excel
• Hidden Developer functions in Excel
• Practice calculations on an Excel data example
• Good practice in using Excel
Excel basics I
• Use of formulas
• Use of $ (absolute cell references)
• Use of shortcuts to go to cells
• Note the black and white cross
• Plot
• Use of Ctrl + Shift + Enter for array calculation
• Developer tool
• ActiveX
Statistical distribution measures
• Central values
– Arithmetic mean, Geometric mean
– Mode, Median
• Measures of spread
– The range
– The interquartile range (IQR)
– Standard deviation, variance
– Coefficient of variation (CoV)
• Quartiles, Quantiles and percentiles
Statistical distribution measures
• Central values
– Arithmetic mean: Average(a,b,c)
– Geometric mean: Geomean(a,b,c)
– Mode: value with highest probability of occurrence
– Median: central value of the ordered data
Median(a,b,c)
• Trimmed mean:
– e.g. the 5 percent trimmed mean is the average of the
data between the 5th and 95th percentiles
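The central values above can be sketched in Python with the standard `statistics` module. The data values and the helper name `trimmed_mean` are hypothetical, chosen only for illustration; the helper follows the slide's definition (average of the data between two percentiles), which is an assumption about how the trimming is meant to be done.

```python
import statistics

def trimmed_mean(values, percent):
    """Average of the values lying between the percent-th and
    (100 - percent)-th percentiles (hypothetical helper)."""
    # 99 cut points; cuts[k - 1] is the k-th percentile
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[percent - 1], cuts[100 - percent - 1]
    kept = [v for v in values if low <= v <= high]
    return statistics.mean(kept)

data = [2.0, 4.0, 4.0, 8.0, 16.0]  # hypothetical concentrations, mg/L

arithmetic = statistics.mean(data)            # like Average(...)
geometric = statistics.geometric_mean(data)   # like Geomean(...)
median = statistics.median(data)              # like Median(...)
mode = statistics.mode(data)                  # most frequent value
trimmed = trimmed_mean(data, 5)               # 5 percent trimmed mean

print(arithmetic, geometric, median, mode, trimmed)
```

For right-skewed data such as these, the geometric mean and the median fall below the arithmetic mean.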
Statistical distribution measures
• Right skewed
• Higher degrees of freedom (df) bring the
distribution closer to normal
• Bimodal distribution in nature
• The implications
Statistical distribution measures
• Measures of spread
– The range (MIN and MAX)
– The interquartile range (IQR)
Percentile (array, k)
Quartile (array, 0/1/2/3/4)
IQR = Q3 - Q1
Normalized IQR = 0.7413*(Q3-Q1)
(a robust estimate of the standard deviation)
– The standard deviation
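A minimal Python sketch of the spread measures above, on hypothetical data. It assumes the "inclusive" quantile method of the `statistics` module, which interpolates over positions 0 to n-1 and is intended here to mirror Quartile(array, 1) and Quartile(array, 3).

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations

data_range = max(data) - min(data)            # MAX - MIN

# Quartiles Q1, Q2 (median), Q3 with inclusive interpolation
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1                                 # interquartile range
normalized_iqr = 0.7413 * iqr                 # robust estimate of sigma
std_dev = statistics.stdev(data)              # sample standard deviation

print(data_range, q1, q3, iqr, normalized_iqr, std_dev)
```

The normalized IQR and the sample standard deviation are close for roughly normal data and diverge when outliers are present, since the IQR ignores the tails.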
Statistical distribution measures
• Measures of spread
– Variance
VAR(array)
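As an illustration, the sketch below computes the sample variance (n - 1 in the denominator, as VAR does), the population variance for comparison, and the coefficient of variation listed earlier among the measures of spread. The data are hypothetical.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations

sample_variance = statistics.variance(data)       # divides by n - 1
population_variance = statistics.pvariance(data)  # divides by n

# Coefficient of variation: standard deviation relative to the mean
cov = statistics.stdev(data) / statistics.mean(data)

print(sample_variance, population_variance, cov)
```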
Statistical distribution measures
• Measures of spread
– Quartiles, quantiles and percentiles
Quartile (array, 0/1/2/3/4)
Percentile (array, 0.05/0.10/0.95)
– Skewness:
• measure of symmetry of data distribution
Skew (array)
0 is symmetric; <0, left skewed; >0, right
skewed.
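The sign convention for skewness can be checked with a short sketch. The function name `sample_skewness` and the three data sets are hypothetical; the formula is the bias-adjusted sample skewness, n/((n-1)(n-2)) times the sum of cubed standardized deviations, which is the form Skew(array) is documented to use.

```python
import statistics

def sample_skewness(values):
    """Bias-adjusted sample skewness (hypothetical helper)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    cubed = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

symmetric = sample_skewness([1, 2, 3, 4, 5])       # zero
right_skewed = sample_skewness([1, 1, 1, 2, 10])   # positive
left_skewed = sample_skewness([1, 9, 10, 10, 10])  # negative

print(symmetric, right_skewed, left_skewed)
```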
Statistical distribution measures
• Frequency distributions
– Identify cut points to divide the data into
categories. The cut points should be chosen to
divide the data fairly evenly.
Frequency (data_array, bin_array)
Press Ctrl + Shift + Enter (array formula)
No.   Bin   Frequency
1     10    2
2     20    0
3     30    2
4     40    3
5     50    5
6     60    4
7     70    2
8     80    0
9     90    1
10    100   1
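The binning rule behind a frequency table like the one above can be sketched as follows. The function name `frequency` and the sample data are hypothetical, and the sketch assumes ascending bin values: each bin counts values greater than the previous bin value and less than or equal to its own, with one extra count at the end for values above the last bin.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Counts per bin: bin i holds values v with bins[i-1] < v <= bins[i];
    the final extra entry holds values above the last bin."""
    edges = sorted(bins)
    counts = [0] * (len(edges) + 1)
    for v in data:
        # index of the first bin edge that is >= v
        counts[bisect_left(edges, v)] += 1
    return counts

bins = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
data = [5, 10, 25, 30, 31, 95, 120]  # hypothetical observations

counts = frequency(data, bins)
for edge, count in zip(bins, counts):
    print(edge, count)
print("above", bins[-1], counts[-1])
```

A value equal to a bin edge (10 or 30 here) is counted in that bin, not the next one.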
Probability distributions
Read and type Greek letters correctly
• Alt 956 (μ)
• Alt 963 (σ)
• Alt 961 (ρ)
• Alt 960 (π)
https://www.thespruceeats.com/the-greek-alphabet-1705558
Probability distributions
• Cumulative or not?
• Left tailed or right tailed?
• How to generate a normal distribution in Excel?
Probability distributions
• Examples
– A normal distribution with η = 8 mg/L and σ = 1 mg/L
– Which value has 95% of the data below it?
– What is the probability that a reading falls
below 6.4 mg/L?
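The two questions above can be worked in Python with `statistics.NormalDist`, as a sketch alongside the Excel approach. The variable names are hypothetical; the first answer uses the inverse cumulative distribution and the second the cumulative (left-tailed) distribution.

```python
from statistics import NormalDist

# Normal distribution with mean 8 mg/L and standard deviation 1 mg/L
conc = NormalDist(mu=8.0, sigma=1.0)

# Value with 95% of the data below it (inverse cumulative distribution)
p95 = conc.inv_cdf(0.95)

# Probability that a reading falls below 6.4 mg/L (cumulative, left tail)
prob_below = conc.cdf(6.4)

print(round(p95, 2), round(prob_below, 3))  # about 9.64 and 0.055
```

The value 6.4 mg/L lies 1.6 standard deviations below the mean, so only about 5.5% of readings are expected below it.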
Probability distributions
• t distribution
– In normal distributions, both η and σ are known;
– In practice, σ is often not known and we use Se
in place of σ.
– Introduced in 1908 by Gosset, a Guinness brewer,
who published under the pen name "Student"
Probability distributions
• Example
– What is the 97.5th percentile of a t distribution with
24 degrees of freedom?
– T.INV.2T(0.05, 24) = 2.064
OR -T.INV(0.025, 24)
– Check: T.DIST.2T(2.064, 24) = 0.05
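The same percentile can be checked without Excel. The sketch below is not a library routine: it integrates the t density numerically (Simpson's rule) and inverts the cumulative distribution by bisection, using only the standard `math` module. The function names, step count and search bounds are assumptions made for illustration.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv(p, df, lo=-100.0, hi=100.0):
    """Left-tailed inverse by bisection (assumes the answer is in [lo, hi])."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t975 = t_inv(0.975, 24)                   # 97.5th percentile
two_tailed = 2 * (1 - t_cdf(2.064, 24))   # two-tailed probability

print(round(t975, 3), round(two_tailed, 3))  # about 2.064 and 0.05
```

The two-tailed probability of 0.05 splits into 0.025 in each tail, which is why the 97.5th percentile corresponds to T.INV.2T(0.05, 24).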
Distribution of average and variance
• Example:
Tutorial session