Understanding Statistical Models and Tests

The document discusses statistical models and how they represent data. A mean can act as a simple model but does not provide all the information we want. Parameters describe population values; statistics, computed from samples, estimate those parameters. To assess how well a model fits the data, we consider the error: the less error, the better the fit. A 95% confidence interval is constructed so that, across repeated samples, 95% of such intervals contain the true population mean. The p-value from a significance test indicates whether results are likely due to chance or reflect a real effect in the data. Assumptions such as normality and homogeneity of variance should be checked before interpreting statistical tests.


A statistical model is a simplified representation of our data.

The mean is the most basic way to summarize data; it is a building block for more complex models.

A model is a representation of the data.

The mean can be a model that represents the data, but it will not tell us exactly what we want to know.

An average can be a parameter: it states a fundamental truth about a population.

Parameter goes with population.

Statistic goes with sample. b0 is a parameter.

A hat over the b (b̂) signals that we are claiming an estimate.

With a parameter we only get a rough estimate of its true value, because our data contain error and we do not observe the whole population.

To see whether the model is a good fit, look at the margin of error: less error = better fit.

Less error = the more representative the model is. Deviance = outcome − model.
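The "deviance = outcome − model" idea above can be sketched in plain Python (not SPSS), using the mean as the model; the data values are made up for illustration:

```python
# Minimal sketch: the mean as a model, and deviance = outcome - model.
scores = [2, 3, 5, 6, 9]                     # outcomes (made-up data)
mean = sum(scores) / len(scores)             # the model: a single number
deviances = [x - mean for x in scores]       # outcome - model, per observation
ss = sum(d ** 2 for d in deviances)          # total error: sum of squared deviances

print(mean)       # 5.0
print(deviances)  # [-3.0, -2.0, 0.0, 1.0, 4.0]
print(ss)         # 30.0
```

The raw deviances always sum to zero around the mean, which is why we square them before adding: the sum of squared errors is the "amount of error" used to judge fit.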

SD describes the spread within one sample.

SE describes how much sample means vary across multiple samples.

Confidence intervals: we express them through upper and lower boundaries.

The boundaries mark the range where the item, mean, or score lies.

Across repeated samples, 95 out of 100 such intervals will contain the true mean.

SPSS computes the boundaries.

Fisher's p-value shows whether the result is significant or due to chance; results may not reflect the reality of the data.

The null hypothesis claims there is no relationship; rejecting it suggests there is one.

The p-value tells us how likely a result this extreme would be if only chance were at work.

If the p-value is above 5%, the result could be due to randomness or error.

If it is below 5%, we treat it as a real effect, not randomness.

p = 0.01 means that, if the null were true, there is only a 1% chance of a result like this.

The alternative hypothesis is the "real" hypothesis, H1.

The null hypothesis is H0.
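The decision rule above can be written out as a tiny sketch (the 0.05 cutoff is the conventional alpha; the function name is ours, not an SPSS term):

```python
# Sketch of the p-value decision rule with the conventional alpha = 0.05.
def interpret_p(p, alpha=0.05):
    # p below alpha: unlikely under H0, so we treat it as a real effect
    # p at or above alpha: the result could plausibly be chance or error
    if p < alpha:
        return "significant: reject H0 (likely a real effect)"
    return "not significant: fail to reject H0 (could be chance)"

print(interpret_p(0.01))   # significant: reject H0 (likely a real effect)
print(interpret_p(0.097))  # not significant: fail to reject H0 (could be chance)
```

Note that p exactly equal to 0.05 is conventionally treated as not significant, since the rule is p < alpha.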

Chapter 6 + SPSS
 How to run assumption checks: whether the data are normally distributed, whether homogeneity of variance holds.
 How to spot outliers:
1. By graph
2. Statistically
 Very extreme outliers are easy to spot on a plot.
 Use a histogram or boxplot to spot outliers.
 ±1.96 is the z-score cutoff.
 If the number of participants is 50, 5% of 50 is 2.5, so we expect only about 2.5 outliers out of 50.
 If we get more outliers than that, we report it.
 A z-score within ±1.96 is not an outlier.

We should not leave anything in the dataset missing.


1. To check for outliers: Analyze – Descriptive Statistics – Descriptives – Save standardized values (z-scores).
5% of 810 = 40.5, so up to about 40 z-scores above 1.96 is still acceptable.
To see them in a table: Descriptive Statistics – Frequencies – add the z-scores – OK.
The top of the table shows the lowest value, the bottom the highest.
The lowest value here is 1.88 < 1.96, so it is not an outlier.
At the bottom of the table, 2 > 1.96, so it is an outlier.
The second column is the number of times each value appears, so we must count that frequency when tallying the outliers.
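The z-score outlier check can be sketched outside SPSS too; the data below are made up, with one obviously extreme value, and the cutoff is the ±1.96 from above:

```python
import math

# Sketch of the z-score outlier check: standardize every value and
# flag any with |z| > 1.96. The 5%-of-n rule gives the allowed count.
scores = [50, 52, 49, 51, 50, 48, 53, 50, 51, 75]   # 75 is clearly extreme
n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
z = [(x - mean) / sd for x in scores]

outliers = [x for x, zi in zip(scores, z) if abs(zi) > 1.96]
limit = 0.05 * n    # number of outliers should not exceed 5% of the sample
print(outliers, limit)   # [75] 0.5
```

With n = 10 the 5% limit is 0.5, so even the single flagged value exceeds it and would be reported, matching the rule in the notes.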

Normality Checking:
To analyze normality we focus on histograms and the Kolmogorov–Smirnov (KS) test; the KS test is the clearest. We do not report the Shapiro–Wilk test.
The test takes a normally distributed reference and compares it to the sample we have: the normal distribution versus our own variable.
The KS test compares the normal curve to our distribution; based on that comparison we can tell whether ours is a normal distribution or not.
If our distribution is similar to the normal, there is no significant difference.
If ours is not normal, it is significantly different from the normal.
The KS test gives a p-value that tells whether the distribution is normal or not; the p-value tells whether the difference is significant.
Here the focus is on whether it is significantly different or not.
If the normality test is significant, the distribution is significantly different from normal (unlike before, where a significant result was the one we wanted).
So normality is not assumed: significantly different from the normal.
When the test is not significant, the two distributions are effectively equal.
Sig. = p-value.
p = 0.097 → not significantly different from the normal → normality is assumed.
Report as D(df) = statistic, p = xxx (or p < xxx), where df is the degrees of freedom (n).
Italicize D, p, and r.
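What the KS test measures can be sketched by hand: its D statistic is the largest gap between our sample's empirical distribution and the normal curve. This is only the statistic; SPSS (or `scipy.stats.kstest`, if SciPy is available) converts D into the p-value we report. The sample below is made up:

```python
import math

# Sketch of the KS D statistic: the biggest gap between the sample's
# empirical CDF and a normal CDF with the sample's own mean and SD.
def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def ks_statistic(data):
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))
    xs = sorted(data)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = normal_cdf(x, mu, sigma)
        # check the gap just above and just below each step of the empirical CDF
        d = max(d, abs((i + 1) / n - cdf), abs(i / n - cdf))
    return d

sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3, 5.1, 4.9]
print(ks_statistic(sample))   # small D -> close to normal
```

A small D means the two curves nearly coincide (no significant difference, normality assumed); a large D means our distribution departs from the normal.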

Homoscedasticity: not for groups? (Homogeneity of variance is the version for groups; homoscedasticity is the equivalent idea for continuous data.)


To check:
1. Graphs
2. A significance test called Levene's Test, which we analyze the same way we analyze normality.
Funnel-like graphs mean the assumption of homogeneity is not met; the data are heterogeneous.
Same logic as normality:
If significant, the variances are significantly different, hence homogeneity has not been met.
If not significant, the variances are not significantly different.
In Levene's test we only look at the first row ("Based on Mean"), since it is based only on the mean.
Look at the variances between the two groups; if they are significantly different, homogeneity is not met.
Normality: D
Levene/homogeneity test: F
Report as F(df1, df2) = statistic, p-value.
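What Levene's F actually computes can be sketched as follows (this is the mean-centred version, i.e. the "Based on Mean" row): take each score's absolute deviation from its group mean, then run a one-way ANOVA F on those deviations. SPSS turns this F into the p-value; the groups below are made up:

```python
# Sketch of Levene's statistic (mean-centred version, the SPSS first row).
def levene_W(groups):
    k = len(groups)
    N = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    # Z_ij = absolute deviation of each score from its own group mean
    Z = [[abs(x - m) for x in g] for g, m in zip(groups, means)]
    zbars = [sum(z) / len(z) for z in Z]
    grand = sum(sum(z) for z in Z) / N
    between = sum(len(z) * (zb - grand) ** 2 for z, zb in zip(Z, zbars))
    within = sum((zij - zb) ** 2 for z, zb in zip(Z, zbars) for zij in z)
    return ((N - k) / (k - 1)) * between / within   # an F(k-1, N-k) statistic

group_a = [5, 6, 5, 7, 6]     # small spread
group_b = [2, 9, 1, 10, 4]    # large spread
print(levene_W([group_a, group_b]))   # large value -> variances likely differ
```

A large statistic (significant p) means the groups' spreads differ, so homogeneity is not met; groups with identical spread give a value near zero.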

Steps:
1. Check for outliers (Analyze – Descriptives – select the variables we want z-scores for – Save standardized values – Options – remove everything except the mean, since we do not need the rest).
Z-scores are the strategy for counting how many outliers we have. The cutoff is ±1.96.
The number of outliers should not exceed 5% of n (the sample size).
2. Normality: Analyze – Descriptive Statistics – Explore – add the variables to the dependent list – Plots – normality plots – histogram, remove stem-and-leaf.
We look at the KS section and ignore Shapiro–Wilk. Report each variable separately.
3. Homogeneity of variance: Analyze – Descriptive Statistics – Explore – add the grouping variable to the factor list – Plots – remove normality and histogram – add untransformed; we then get the test of homogeneity of variance. Only look at the first row ("Based on Mean").
When reporting, if SPSS shows p = .000, write p < .001.
