Statistics DGGB 6820 - Excel Techniques
Office 2003:
How to check if you have it:
o If [Data Analysis] is in your [Tools] menu, then you have it.
Installing the Data Analysis ToolPak:
o On the [Tools] menu, click [Add-Ins].
o In the Add-Ins available box, select the check boxes next to "Analysis ToolPak" and "Analysis ToolPak VBA" (if it's there), and then click [OK].
o If you see a message that tells you the Analysis ToolPak is not currently installed on your computer, click [Yes] to install it.
o Click [Tools] on the menu bar. When you load the Analysis ToolPak, the [Data Analysis] command is added to the [Tools] menu.
o Note: You may need the internet or the Office Install CD to download or install this package.

Office 2007:
How to check if you have it:
o In the Ribbon (the tabs on the top), click [Data].
o On the Data Ribbon, check for a button called [Data Analysis] under a box with the label Analysis (usually the furthermost right of all the boxes).
o If you have it, ignore this section; if you do not have it, read on.
Installing the Data Analysis ToolPak:
o Click the top left [Office Button].
o Click the button on the bottom called [Excel Options].
o Click the tab on the left side called [Add-Ins].
o Make sure the selection next to the "Go" button on the bottom says "Excel Add-Ins".
o Click "Go" at the bottom.
o Check the boxes for "Analysis ToolPak" and "Analysis ToolPak VBA".
o Click [OK].
o Note: You may need the internet or the Office Install CD to download or install this package.

Office 2008 - Mac:
How to check if you have it:
o Um, you don't have to check because I know you don't have it!
o Aha! Macs aren't perfect lol (I'm a PC, can you tell?)
StatPlus LE:
o Instead you go here: http://www.analystsoft.com/en/products/statplusmacle/
o Download the free LE version of the software.
o The LE version should do everything we need for the purpose of this class.
Note: I'm not going to cover the use of the StatPlus LE program. It is 3rd party, but more importantly, I don't have a Mac to play around with the damn thing, so read the instructions and figure it out. It is supposed to be pretty much the same thing as the PC version but in a different ...

SPSS:
The following link is to the $35 6 month student license of the most basic version, which (as far as I can see) has all that you need for this class.
http://e5.onthehub.com/WebStore/OfferingDetails.aspx?ws=49c547ba-f56d-dd11-bb6c-0030485a6b08&vsro=8&o=e9fcd8b9-15c3-de11-886d-0030487d8897
[Figure: two Histogram charts of the dataset (y-axis: Frequency; x-axis label on one chart: Bin2)]
The Dataset is positively skewed.
Most of the data is between 0 and 26.
There is an outlier towards the right side of the distribution (56).
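The same kind of reading can be cross-checked outside Excel. The sketch below uses hypothetical values (the course dataset is not reproduced in this handout) and hypothetical bin boundaries. It counts values per bin the way Excel's Histogram tool does (a value lands in the first bin whose upper bound it does not exceed), uses "mean above median" as a rough sign of positive skew, and flags high outliers with the common 1.5 x IQR rule, which is an assumption and not something the class prescribes.

```python
from bisect import bisect_left
from statistics import mean, median, quantiles

# Hypothetical values for illustration only, NOT the course dataset.
data = [2, 3, 5, 5, 7, 8, 10, 12, 13, 15, 18, 21, 26, 56]
# Hypothetical bin upper bounds (a value x is counted in the first bin >= x).
bins = [10, 20, 30, 40, 50, 60]

# Frequency per bin, upper bound inclusive.
counts = [0] * len(bins)
for value in data:
    counts[bisect_left(bins, value)] += 1

# Rule of thumb: a mean noticeably above the median suggests positive skew.
right_skewed = mean(data) > median(data)

# Flag high outliers with the 1.5 x IQR rule (an assumed convention).
q1, _, q3 = quantiles(data, n=4)
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = [v for v in data if v > upper_fence]

for upper, count in zip(bins, counts):
    print(f"<= {upper:>2}: {'#' * count}")
print("positively skewed:", right_skewed, "| outliers:", outliers)
```

With real data, replace `data` and `bins` with the worksheet values; the printed bars are only a quick text stand-in for the chart.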
1 -0.83195 0.909262
1 -0.65672
... positive correlation
... negative correlation
... positive correlation
... negative correlation
... have a very weak, negative correlation
Mean Variance Observations Hypothesized Mean Difference df t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail
Because the absolute value of the t Stat is smaller than t Critical two-tail (or, equivalently, because the two-tail P-value is not smaller than Alpha), we fail to reject the Null Hypothesis that there is no statistical difference between Dataset A and the 2009 Avg.
The Sales Manager of Company X decides to try a new training program for the sales representatives.
Compare the average customer satisfaction score for each sales rep before and after the training program.
Is there a statistically significant difference? And if so, did the program improve the quality of service these sales reps provide?

Before   4.2   6.3   5.7   4.8   3.5   3.2   4.4   5.2   3.9   4.3
After    7.3   8.2   5.9   6.4   6.3   5.7   8.2   6.4   5.1   4.1
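                               Before         After
Mean                           4.55           6.36
Variance                       0.936111111    1.667111111
Observations                   10             10
Pearson Correlation            0.396685546
Hypothesized Mean Difference   0
df                             9
t Stat                         -4.507970748
P(T<=t) one-tail               0.000735997
t Critical one-tail            1.833112923
P(T<=t) two-tail               0.001471993
t Critical two-tail            2.262157158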
Because the absolute value of the t Stat is greater than t Critical two-tail (or, equivalently, because the two-tail P-value is smaller than Alpha), we can reject the Null Hypothesis that there is no statistical difference between the two datasets.
Yes, there is a significant difference between the before and after scores.
Yes, the training program improved the quality of the customer service.
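As a cross-check on the paired test, the t Stat can be recomputed by hand from the Before and After scores. This is a minimal sketch using only Python's standard library; the critical value is copied from the Excel output rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Customer satisfaction scores from the example above.
before = [4.2, 6.3, 5.7, 4.8, 3.5, 3.2, 4.4, 5.2, 3.9, 4.3]
after = [7.3, 8.2, 5.9, 6.4, 6.3, 5.7, 8.2, 6.4, 5.1, 4.1]

# A paired t-test works on the per-rep differences (Before minus After).
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
df = n - 1

# t Stat = mean difference / (sample std dev of differences / sqrt(n)).
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Copied from the Excel output (alpha = 0.05, two-tail, df = 9).
T_CRITICAL_TWO_TAIL = 2.262157158
reject_null = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(f"df = {df}, t Stat = {t_stat:.6f}, reject null: {reject_null}")
```

The negative sign only reflects the order of subtraction: the After scores are higher on average.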
Mean Variance Observations Pooled Variance Hypothesized Mean Difference df t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail
Based on the two-tail test, there is no significant difference. Since the comparison shows no significant difference, neither area is doing much better than the other.
However, notice that Area B has 3 outliers at the bottom (98k, 130k, and 150k).
So do consider the actual data and not only the analysis when you're making decisions.
Count   Sum   Average
6       42    7
6       18    3
6       10    1.666666667

Source of Variation   df   MS            F
Between Groups        2    46.22222222   23.63636364
Within Groups         15   1.955555556
Total                 17

The F value is well above the F Critical value. The P-value reflects this finding by being significantly smaller than 0.05.
Thus we reject the Null Hypothesis, and conclude that there is a significant difference.

Although both drugs performed better than the placebo, Drug X clearly performed better than Drug Y.
It is a good idea at this time to perform a t-Test: Two Sample between Drug X and Drug Y.
But based on the ANOVA test, I feel safe in saying that Drug X will outperform Drug Y and will be a big hit on the market.
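The Between Groups mean square and the F value can be reproduced from the summary rows alone. The sketch below assumes the three summary rows are the three groups in the ANOVA (the group labels are not preserved in this handout) and takes the Within Groups MS from the Excel output, since the raw observations are not listed.

```python
# Summary rows from the Excel output (group labels not preserved here).
counts = [6, 6, 6]
sums = [42, 18, 10]

# Within Groups mean square, copied from the Excel ANOVA table.
MS_WITHIN = 1.955555556

group_means = [s / n for s, n in zip(sums, counts)]
grand_mean = sum(sums) / sum(counts)

# Between-groups sum of squares: n_i * (group mean - grand mean)^2.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, group_means))
df_between = len(counts) - 1
df_within = sum(counts) - len(counts)

ms_between = ss_between / df_between
f_stat = ms_between / MS_WITHIN

print(f"df = ({df_between}, {df_within}), MS between = {ms_between:.6f}, F = {f_stat:.6f}")
```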
Source of Variation   df   MS         F             P-value       F crit
Rows                  5    36.36667   2.680589681   0.086617815   3.325834529
Columns               2    244.5      18.02211302   0.000483197   4.102821015
Error                 10   13.56667
Total                 17

Recruit 4 was the best performing recruit within the group.
However, there is no statistically significant difference between the recruits (as indicated by the data for Rows).
There is, however, a statistically significant difference between the scores of the three tests.
Looking at the test means, we can see that the Rifle Qualification's scores are much lower than the other two.
Thus, either the Rifle Qualification is too hard, or the recruits are better suited to be Life Guards rather than Marines.
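In the two-factor table without replication, each F value is the factor's MS divided by the Error MS, and a factor is significant when its F exceeds F crit. A small sketch with the values from the Excel output (the MS values are rounded in the output, so the recomputed F values match only to a few decimals):

```python
# Mean squares and critical values copied from the Excel ANOVA output.
MS_ROWS, MS_COLUMNS, MS_ERROR = 36.36667, 244.5, 13.56667
F_CRIT_ROWS, F_CRIT_COLUMNS = 3.325834529, 4.102821015

# F = factor mean square / error mean square.
f_rows = MS_ROWS / MS_ERROR        # recruits
f_columns = MS_COLUMNS / MS_ERROR  # tests

rows_significant = f_rows > F_CRIT_ROWS
columns_significant = f_columns > F_CRIT_COLUMNS

print(f"Rows:    F = {f_rows:.4f}, significant: {rows_significant}")
print(f"Columns: F = {f_columns:.4f}, significant: {columns_significant}")
```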
West
... training program for the Marines.
... Qualification (SQ), Rifle Qualification (RQ) and Team Qualification (TQ).
SQ 6 120 20 8
West
West
6 102 17 22
RQ 6 69 11.5 19.9
TQ

Source of Variation   df   MS            F             P-value       F crit
Sample                1    49            1.441647597   0.239267892   4.170876757
Columns               2    97.58333333   2.871036286   0.072297426   3.315829501
Interaction           2    218.0833333   6.41631252    0.004788489   3.315829501
Within                30   33.98888889
Total                 35

East or West by itself does not affect the test scores. In other words:
o There is no significant difference when only considering East or West by itself (as shown by the results from the Sample).
o There is no significant difference when only considering the different test types (as shown by the results from the Column).
o There is, however, a significant difference when considering East and West and its relationship to the test types (as shown by the results from the Interaction).

We can conclude that West Coast basic training will produce better Marines, also based on the last test (ANOVA w/o Rep).
We can now conclude that the Rifle Qualification test wasn't too hard, and that the East Coast recruits need to improve their skills or become Life Guards.
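The reading of the with-replication table comes down to comparing each P-value against Alpha. The sketch below assumes Alpha = 0.05 (the cutoff used elsewhere in these notes) and also re-derives the Interaction F from the two mean squares as a sanity check.

```python
ALPHA = 0.05  # assumed significance level, as used elsewhere in these notes

# P-values copied from the Excel ANOVA output.
p_values = {
    "Sample": 0.239267892,       # East vs. West by itself
    "Columns": 0.072297426,      # test type by itself
    "Interaction": 0.004788489,  # coast combined with test type
}

# An effect is significant when its P-value is below alpha.
significant = {source: p < ALPHA for source, p in p_values.items()}

# Sanity check: F = effect mean square / Within mean square.
f_interaction = 218.0833333 / 33.98888889

for source, flag in significant.items():
    print(f"{source:<12} p = {p_values[source]:.6f}  significant: {flag}")
print(f"Interaction F recomputed: {f_interaction:.6f}")
```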
m (happy face).
Intercept GMAT
RESIDUAL OUTPUT
Observation   Predicted GPA   Residuals
1             3.390400794     -0.140400794
2             3.080539683     0.819460317
3             3.833059524     0.166940476
4             3.567464286     -0.237464286
5             3.169071429     -0.299071429
6             3.833059524     0.166940476
7             3.080539683     -0.320539683
8             3.390400794     0.519599206
9             3.611730159     -0.061730159
10            3.124805556     0.275194444
11            3.257603175     -0.817603175
12            3.788793651     0.061206349
13 14 15
[Figure: scatter plot of GPA (y-axis, 0.00 to 4.50) against GMAT (x-axis, 500 to 700)]
            Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept   -1.504078366   3.150038684   -1.504078366   3.150038684
GMAT        0.000501319    0.008351856   0.000501319    0.008351856
Based on the regression, I can conclude that because I have a GMAT score of 510, I will get roughly a 3.1 GPA when I graduate.
However, because the R Square is low, and the Intercept P-value isn't statistically significant, I have serious doubts as to how well this regression actually predicts the GPA outcome based on GMAT scores.
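The 3.1 figure can be reproduced from the regression output. The coefficient column itself is not shown in these notes, so the sketch below recovers each coefficient as the midpoint of its 95% confidence interval (the interval is symmetric around the estimate). That recovery step is a workaround for this handout, not how Excel reports the coefficients.

```python
# 95% confidence intervals copied from the Excel regression output.
INTERCEPT_CI = (-1.504078366, 3.150038684)
GMAT_CI = (0.000501319, 0.008351856)


def midpoint(interval):
    """Return the center of a (lower, upper) confidence interval."""
    lower, upper = interval
    return (lower + upper) / 2


# Recovered estimates (assumption: CI midpoint equals the coefficient).
intercept = midpoint(INTERCEPT_CI)
slope = midpoint(GMAT_CI)


def predict_gpa(gmat):
    """Predicted GPA = intercept + slope * GMAT."""
    return intercept + slope * gmat


predicted = predict_gpa(510)
# The intercept interval spans zero, matching the doubt raised above.
intercept_ci_spans_zero = INTERCEPT_CI[0] < 0 < INTERCEPT_CI[1]

print(f"Predicted GPA at GMAT 510: {predicted:.4f}")
print("Intercept interval spans zero:", intercept_ci_spans_zero)
```

The result agrees with the 3.0805 entries in the Predicted GPA column of the residual output.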
RESIDUAL OUTPUT
Observation   Predicted GPA
1             3.356961597
2             3.832326948
3             4.075829686
4             3.318215853
5             2.966514553
6             4.002163563
7 8 9 10 11 12 13 14 15
Standard Error   t Stat         P-value       Lower 95%      Upper 95%      Lower 95.0%
0.178036846      20.37974676    4.36734E-10   3.236489382    4.020202295    3.236489382
0.01298053       -12.21709997   9.67282E-08   -0.187154385   -0.130014478   -0.187154385
0.021603502      3.930765923    0.002348982   0.037369321    0.132467294    0.037369321
0.018110098      -4.046260836   0.001927797   -0.113138238   -0.033418124   -0.113138238
All three independent variables significantly influence the GPA.
Because the R Square is high, and because all P-values are smaller than 0.05, I am very confident in the belief that if I want to do very well on the Final, I should have:
o Only 1 beer at the party
o No more than 8.5 hours of sleep
o No more than 2 cups of coffee
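The coefficient column is not shown for this regression, but it can be recovered from the columns that are: in Excel's regression output, t Stat is the coefficient divided by its Standard Error. The sketch below rebuilds the four coefficients that way and checks that none of the 95% intervals contains zero. The rows are left unnamed because the row labels are not preserved in these notes.

```python
# (Standard Error, t Stat, Lower 95%, Upper 95%) copied from the Excel output.
rows = [
    (0.178036846, 20.37974676, 3.236489382, 4.020202295),
    (0.01298053, -12.21709997, -0.187154385, -0.130014478),
    (0.021603502, 3.930765923, 0.037369321, 0.132467294),
    (0.018110098, -4.046260836, -0.113138238, -0.033418124),
]

# t Stat = coefficient / standard error, so coefficient = SE * t Stat.
coefficients = [se * t for se, t, _, _ in rows]

# A 95% interval that excludes zero goes with a P-value below 0.05.
excludes_zero = [lower > 0 or upper < 0 for _, _, lower, upper in rows]

for coef, flag in zip(coefficients, excludes_zero):
    print(f"coefficient = {coef: .6f}  interval excludes zero: {flag}")
```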
[Figure: scatter plot of GPA (y-axis) against Coffee (x-axis, 0 to 4.5) with a cubic trendline: y = 0.0604x^3 - 0.4957x^2 + 1.0448x + 3.0194, R^2 = 0.2502]
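To read values off the chart's cubic trendline without eyeballing it, the equation can be evaluated directly. This is a small sketch; the function name is illustrative, and with an R^2 of about 0.25 the curve explains only a quarter of the variation in GPA, so treat the numbers as rough.

```python
def trendline_gpa(cups):
    """Evaluate y = 0.0604x^3 - 0.4957x^2 + 1.0448x + 3.0194 (Horner form)."""
    return ((0.0604 * cups - 0.4957) * cups + 1.0448) * cups + 3.0194


# Tabulate the trendline for whole cups of coffee across the chart's range.
for cups in range(5):
    print(f"{cups} cups -> trendline GPA {trendline_gpa(cups):.4f}")
```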