Bda
Marathon Exam
6. Conduct the t-test for the given multiple regression model with 8 df at the α = 0.05 level of significance.
Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)
7. Draw the ANOVA table for the given data.
Sales Ads Expenditure
50 250
70 280
65 320
75 350
85 360
95 390
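A minimal sketch of how the ANOVA table for question 7 could be assembled. It assumes a simple linear regression of Sales on Ads Expenditure (the question does not say which variable is the response, so that choice is an assumption), and the function name `anova_simple_regression` is illustrative.

```python
def anova_simple_regression(x, y):
    """ANOVA quantities for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    ssr = sxy ** 2 / sxx                        # regression sum of squares
    sse = sst - ssr                             # error sum of squares
    msr, mse = ssr / 1, sse / (n - 2)
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "df_reg": 1, "df_err": n - 2, "df_tot": n - 1,
            "MSR": msr, "MSE": mse, "F": msr / mse}

ads = [250, 280, 320, 350, 360, 390]
sales = [50, 70, 65, 75, 85, 95]
t = anova_simple_regression(ads, sales)

print(f"{'Source':<12}{'SS':>10}{'df':>4}{'MS':>10}{'F':>8}")
print(f"{'Regression':<12}{t['SSR']:>10.2f}{t['df_reg']:>4}{t['MSR']:>10.2f}{t['F']:>8.2f}")
print(f"{'Error':<12}{t['SSE']:>10.2f}{t['df_err']:>4}{t['MSE']:>10.2f}")
print(f"{'Total':<12}{t['SST']:>10.2f}{t['df_tot']:>4}")
```

The F ratio would then be compared with the tabulated F value for 1 and 4 degrees of freedom at the chosen significance level.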
8. What is multicollinearity? How do you use ridge regression to remove the multicollinearity problem?
9. You come across a magazine article reporting the following relationship between annual expenditure on prepared
dinners (PD) and annual income (INC):
PD = 23.4 + 0.003 INC
The coefficient of the INC variable is reported as significant. 1) Does the relationship seem plausible? Is it possible to have a
coefficient that is small in magnitude and yet significant? 2) From the information given, can you tell how good the
estimated model is? 3) What is the expected expenditure on prepared dinners of a family earning $30,000? 4) If a
family earning $40,000 spent $130 annually on prepared dinners, what is the residual? 5) What is the meaning of a
negative residual?
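The arithmetic behind parts 3 and 4 can be sketched as below, taking the residual as actual minus fitted value. The function names are illustrative and not part of the question.

```python
def predicted_pd(income):
    """Fitted annual prepared-dinner expenditure from PD = 23.4 + 0.003 * INC."""
    return 23.4 + 0.003 * income

def residual(actual, income):
    """Residual = observed expenditure minus fitted expenditure."""
    return actual - predicted_pd(income)

expected_30k = predicted_pd(30_000)
resid_40k = residual(130, 40_000)

print(round(expected_30k, 2))  # 113.4
print(round(resid_40k, 2))     # -13.4
```

A negative residual here means the family spent less than the fitted line predicts for its income.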
10. 1. How do you arrive at the relative weightage for the attributes and the part worth values for the levels?
2. Explain the applications of conjoint analysis.
11. In the conjoint analysis output file, the relative weightage column is not given. Arrive at the relative weightage for
each attribute and also at the more favorable combinations of attributes and levels.
14. In the given table, find the missing value of factor analysis.
15. 1. What is meant by oblique rotation? List the algorithm related to oblique rotation.
2. Explain the “Thurstone principle”.
16. By using the data below, define a binary response variable Z that assumes the value 0 if a firm is bankrupt and 1 if
a firm is not bankrupt.
CF – Cash Flow, TD – Total Debt, NI – Net Income, TA – Total Assets, CA – Current Assets, CL – Current Liabilities
1) Develop the discriminant function for the firm
2) Develop confusion matrix
3) Find the Apparent error rate.
Row X1 = CF/TD X2 = NI/TA X3 = CA/CL X4 = CA/NS POPULATION
1 -.45 -.41 1.09 .45 0
2 -.56 -.31 1.50 .16 0
3 .06 .02 1.01 .40 0
4 -.07 -.09 1.45 .26 0
5 -.10 -.09 1.56 .67 0
6 -.14 -.07 .71 .28 0
7 .04 .01 1.50 .71 0
8 -.06 -.06 1.37 .40 0
9 .07 -.01 1.37 .34 0
10 -.13 -.14 1.42 .44 0
11 -.23 -.30 .33 .18 0
12 .07 .02 1.31 .25 0
13 .01 .00 2.15 .70 0
14 -.28 -.23 1.19 .66 0
15 .15 .05 1.88 .27 0
16 .37 .11 1.99 .38 0
17 -.08 -.18 1.51 .42 0
18 .05 .03 1.68 .95 0
19 .01 -.00 1.26 .60 0
20 .12 .11 1.14 .17 0
21 -.28 -.27 1.27 .51 0
1 .51 .10 2.49 .54 1
2 .08 .02 2.01 .53 1
3 .38 .11 3.27 .35 1
4 .19 .05 2.25 .33 1
5 .32 .07 4.24 .63 1
6 .31 .05 4.45 .69 1
7 .12 .05 2.52 .69 1
8 -.02 .02 2.05 .35 1
9 .22 .08 2.35 .40 1
10 .17 .07 1.80 .52 1
11 .15 .05 2.17 .55 1
12 -.10 -.01 2.50 .58 1
13 .14 -.03 .46 .26 1
14 .14 .07 2.61 .52 1
15 .15 .06 2.23 .56 1
16 .16 .05 2.31 .20 1
17 .29 .06 1.84 .38 1
18 .54 .11 2.33 .48 1
19 -.33 -.09 3.01 .47 1
20 .48 .09 1.24 .18 1
21 .56 .11 4.29 .45 1
22 .20 .08 1.99 .30 1
23 .47 .14 2.92 .45 1
24 .17 .04 2.45 .14 1
25 .58 .04 5.06 .13 1
17. How do you find the partial F statistic?
18. From the given variance-covariance matrix, deduce the correlation matrix.
X1 x2 x3
X2 4.8 2.8
X3 4.6
19. For the given data use a normalization process to find the normalized values
5
4
8
5
7
6
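The question does not say which normalization is intended, so the sketch below shows two common choices for the six values above: min-max scaling to [0, 1] and z-score standardization (using the sample standard deviation). Both function names are illustrative.

```python
from statistics import mean, stdev

def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

data = [5, 4, 8, 5, 7, 6]
scaled = min_max(data)
standardized = z_scores(data)

print(scaled)  # [0.25, 0.0, 1.0, 0.25, 0.75, 0.5]
print([round(z, 3) for z in standardized])
```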
Y Value Ŷ Value
5 7
6 8
4 7
7 4
Y Value Ŷ Value
50 48
60 57
63 59
68 59
72 62
73 70
75 73
78 82
85 78
92 85
23. From the given information, find the relative weightages for the attributes
PRODUCT
α11 α12 α13 α21 α22 α23 α31 α32 α33 α41 α42
Utility values
α11 -0.1
α12 -0.1
α13 0.2
α21 0.3
α22 0.1
α23 -0.4
α31 0.4
α32 -0.3
α33 -0.1
α41 0.01
α42 -0.01
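One common way to get relative weightages from part-worth utilities is to take each attribute's utility range and divide it by the sum of all ranges. The sketch below applies that convention to the utility values above, assuming that αij denotes level j of attribute i; the attribute labels and function name are illustrative.

```python
# Part-worth utilities grouped by attribute (alpha_ij = attribute i, level j).
utilities = {
    "attribute 1": [-0.1, -0.1, 0.2],
    "attribute 2": [0.3, 0.1, -0.4],
    "attribute 3": [0.4, -0.3, -0.1],
    "attribute 4": [0.01, -0.01],
}

def relative_weightages(part_worths):
    """Importance of each attribute as its utility range over the total range, in percent."""
    ranges = {name: max(levels) - min(levels) for name, levels in part_worths.items()}
    total = sum(ranges.values())
    return {name: 100 * r / total for name, r in ranges.items()}

weights = relative_weightages(utilities)
# Most preferred level (1-based) of each attribute: the one with the highest utility.
best_levels = {name: levels.index(max(levels)) + 1 for name, levels in utilities.items()}

for name, w in weights.items():
    print(f"{name}: {w:.2f}% (best level {best_levels[name]})")
```

Under this convention the most favorable product profile combines the highest-utility level of every attribute.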
25. Explain the Mahalanobis D² test, and when and how you apply it.
26. Explain the steps involved in stepwise discriminant analysis for four variables.
27. Use Fisher's linear discriminant function on the given data set and evaluate, by
resubstitution, the probabilities of misclassification.
WAIS subtests:
X1=information
X2=similarities
X3=arithmetic
X4=picture completion
Group II
SUBJECT INFORMATION SIMILARITIES ARITHMETIC PICTURE COMPLETION
1 9 5 10 8
2 10 0 6 2
3 8 9 11 1
4 13 7 14 9
5 4 0 4 0
6 4 0 6 0
7 11 9 9 8
8 5 3 3 6
9 9 7 8 6
10 7 2 6 4
11 12 10 14 3
12 13 12 11 10
MEAN 8.75 5.33 8.5 4.75
Group I
28. The annual financial data listed in the table have been analyzed by Johnson with a view toward detecting
influential observations in a discriminant analysis. Consider variables X1 = CF/TD and X2 = CA/CL.
Row X1=CF/TD X2=CA/CL
1 -0.45 1.09
2 -0.56 1.51
3 0.06 1.01
4 -0.07 1.45
5 -0.1 1.56
6 -0.14 0.71
7 0.04 1.5
8 -0.06 1.37
9 0.07 1.37
10 -0.13 1.42
11 -0.23 0.33
12 0.07 1.31
13 0.01 2.15
14 -0.28 1.19
15 0.15 1.88
16 0.37 1.99
17 -0.08 1.51
18 0.05 1.68
19 0.01 1.26
20 0.12 1.14
21 -0.28 1.27
1 0.51 2.49
2 0.08 2.01
3 0.38 3.27
4 0.19 2.25
5 0.32 4.24
6 0.31 4.45
7 0.12 2.52
8 -0.02 2.05
9 0.22 2.35
10 0.17 1.8
11 0.15 2.17
12 -0.1 2.5
13 0.14 0.46
14 0.14 2.61
15 0.15 2.23
16 0.16 2.31
17 0.29 1.84
18 0.54 2.33
19 -0.33 3.01
20 0.48 1.24
21 0.56 4.29
22 0.2 1.99
23 0.47 2.92
24 0.17 2.45
25 0.58 5.06
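Fisher's linear discriminant rule for the two variables listed above could be sketched as follows. The sketch assumes equal priors and misclassification costs, a pooled covariance matrix, and that the first block of 21 rows and the second block of 25 rows form the two groups; all function names are illustrative. It evaluates the rule by resubstitution only and does not address the influential-observation part of the question.

```python
group_a = [(-0.45, 1.09), (-0.56, 1.51), (0.06, 1.01), (-0.07, 1.45), (-0.10, 1.56),
           (-0.14, 0.71), (0.04, 1.50), (-0.06, 1.37), (0.07, 1.37), (-0.13, 1.42),
           (-0.23, 0.33), (0.07, 1.31), (0.01, 2.15), (-0.28, 1.19), (0.15, 1.88),
           (0.37, 1.99), (-0.08, 1.51), (0.05, 1.68), (0.01, 1.26), (0.12, 1.14),
           (-0.28, 1.27)]
group_b = [(0.51, 2.49), (0.08, 2.01), (0.38, 3.27), (0.19, 2.25), (0.32, 4.24),
           (0.31, 4.45), (0.12, 2.52), (-0.02, 2.05), (0.22, 2.35), (0.17, 1.80),
           (0.15, 2.17), (-0.10, 2.50), (0.14, 0.46), (0.14, 2.61), (0.15, 2.23),
           (0.16, 2.31), (0.29, 1.84), (0.54, 2.33), (-0.33, 3.01), (0.48, 1.24),
           (0.56, 4.29), (0.20, 1.99), (0.47, 2.92), (0.17, 2.45), (0.58, 5.06)]

def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    """2x2 sum-of-squares-and-cross-products matrix about the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_rule(g1, g2):
    """Return coefficients a = Sp^-1 (m1 - m2) and the midpoint cut-off."""
    m1, m2 = mean_vec(g1), mean_vec(g2)
    s1, s2 = scatter(g1, m1), scatter(g2, m2)
    dof = len(g1) + len(g2) - 2
    sp = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det], [-sp[1][0] / det, sp[0][0] / det]]
    diff = (m1[0] - m2[0], m1[1] - m2[1])
    a = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    midpoint = 0.5 * (a[0] * (m1[0] + m2[0]) + a[1] * (m1[1] + m2[1]))
    return a, midpoint

def classify(x, a, midpoint):
    """Return 0 for the first group, 1 for the second."""
    return 0 if a[0] * x[0] + a[1] * x[1] >= midpoint else 1

a, midpoint = fisher_rule(group_a, group_b)

# Confusion matrix by resubstitution: rows = actual group, columns = predicted group.
confusion = [[0, 0], [0, 0]]
for actual, rows in enumerate((group_a, group_b)):
    for x in rows:
        confusion[actual][classify(x, a, midpoint)] += 1

aper = (confusion[0][1] + confusion[1][0]) / (len(group_a) + len(group_b))
print(confusion, round(aper, 3))
```

The apparent error rate is the share of off-diagonal counts in the confusion matrix; because the same data are used to build and to test the rule, it tends to be optimistic.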
29. How do you find the error rate from the confusion matrix for three groups? Assume your own data and
show the results.
30. Explain the simple structure principle.
31. Explain the process of factor rotation with the transformation matrix.
32. Explain orthogonal and oblique rotation, and name the algorithms available under each rotation.
33. Find the residual matrix from the exploratory factor analysis model. The related information is given below.
Correlation coefficients for exploration and confirmation
1 2 3 4 5 6 7 8 9
1 0.411 0.479 0.401 0.37 0.393 0.078 0.389 0.411
0.245 1 0.463 0.223 0.198 0.244 -0.042 0.169 0.324
0.418 0.362 1 0.231 0.272 0.357 -0.126 0.153 0.307
0.282 0.217 0.425 1 0.659 0.688 0.215 0.221 0.256
0.257 0.125 0.304 0.784 1 0.649 0.293 0.279 0.324
0.239 0.131 0.33 0.743 0.73 1 0.226 0.298 0.294
0.122 0.149 0.265 0.185 0.221 0.118 1 0.602 0.446
0.253 0.183 0.329 0.021 0.139 -0.027 0.601 1 0.63
0.583 0.147 0.455 0.381 0.4 0.235 0.385 0.462 1
34. A regression model Y = 0.235 X1 + 0.468 X2 has been established for 10 observations. Conduct the t-test for X1
and X2 at the 5% level of significance.
35. There are two discriminant functions for the given problem. Draw the confusion matrix for the given
data.
Group 1
X1 X2 X3 X4
4 3 4 5
2 4 5 3
5 2 4 6
3 4 5 3
Group 2
5 6 7 5
7 6 7 5
7 6 8 5
5 4 7 6
5 4 3 2
Group 3
9 7 6 5
5 4 8 2
7 6 5 3
4 8 5 2
7 3 5 6
36. How do you arrive at the factor score for one observation?
F1 F2 F3 ψ
-0.95 -0.1 0.35 .867
-0.99 -0.1 0.39 .579
-0.96 0 0.3 .619
-1.07 0.1 0.3 .672
-1.24 0 0.06 .572
-1.18 0.1 -0.3 .479
-0.83 -0.1 -0.4 .796
-0.97 0.1 -0.5 .883
-1.05 0.1 -0.3 .816
-0.11 -0.6 0.06 .338
-0.03 -0.4 -0.1 .153
-0.5 -0.5 -0.1 .894
38. Find the two eigenvectors of the following matrix.
A = | 3 4 |
    | 2 5 |
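For a 2x2 matrix the eigenvalues follow from the characteristic equation λ² − (trace)λ + det = 0, and an eigenvector for each λ can be read off the first row. The sketch below is a hand-rolled illustration that assumes real eigenvalues and a non-zero upper-right entry; the function name is illustrative.

```python
import math

def eigen_2x2(a, b, c, d):
    """Eigenvalues and unnormalized eigenvectors of [[a, b], [c, d]] (b != 0, real roots)."""
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace - 4 * det)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        # (a - lam) * x + b * y = 0 is satisfied by (x, y) = (b, lam - a).
        pairs.append((lam, (b, lam - a)))
    return pairs

pairs = eigen_2x2(3, 4, 2, 5)
for lam, vec in pairs:
    print(lam, vec)
# 7.0 (4, 4.0)   i.e. proportional to (1, 1)
# 1.0 (4, -2.0)  i.e. proportional to (2, -1)
```

Any non-zero multiple of these vectors is an equally valid eigenvector, so answers are usually reported in normalized or simplest-integer form.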
39. From the given data find the Discriminant Loading
Z1 Z2
X1 0.41 0.32
X2 0.51 0.42
X3 0.61 0.35
X1 X2 X3
X1 3 4 2
X2 4 3
X3 2
Correlation matrix:
1 0.25 0.4
1 0.34
40. The data below are selected from a much larger body of data referring to candidates for the General Certificate
of Education who were being considered for a special award. Here, Y denotes the candidate’s total mark, out of
1000, in the G.C.E. examination. Of this mark the subjects selected by the candidate account for a maximum of
800; the remainder, with a maximum of 200, is the mark in the compulsory papers- “General” and “ Use of
English” – this mark is shown as X1. X2 denotes the candidate’s mark out of 100, in the compulsory School
Certificate English Language paper taken on a previous occasion.
Compute the multiple regression of Y on X1 and X2, and make the necessary tests to enable you to
comment intelligently on the extent to which current performance in the compulsory papers may be used to
predict aggregate performance in the G.C.E. examination, and on whether previous performance in School
Certificate English Language has any predictive value independently of what has already emerged from the
current performance in the compulsory papers.
Candidate Y X1 X2
1 476 111 68
2 457 92 46
3 540 90 50
4 551 107 59
5 575 98 50
6 698 150 66
7 545 118 54
8 574 110 51
9 645 117 59
10 556 94 97
11 634 130 57
12 637 118 51
13 390 91 44
14 562 118 61
15 560 109 66
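A sketch of how the regression and the two tests could be computed from the normal equations. It assumes an intercept is included, and it uses the overall F ratio for the full model plus a partial F ratio for adding X2 to a model that already contains X1; the helper names `solve` and `fit` are illustrative.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns."""
    design = [[1.0, *vals] for vals in zip(*columns)]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(design, y)]
    return coef, resid

y = [476, 457, 540, 551, 575, 698, 545, 574, 645, 556, 634, 637, 390, 562, 560]
x1 = [111, 92, 90, 107, 98, 150, 118, 110, 117, 94, 130, 118, 91, 118, 109]
x2 = [68, 46, 50, 59, 50, 66, 54, 51, 59, 97, 57, 51, 44, 61, 66]

n = len(y)
y_bar = sum(y) / n
sst = sum((yi - y_bar) ** 2 for yi in y)

coef_full, resid_full = fit([x1, x2], y)   # Y on X1 and X2
coef_red, resid_red = fit([x1], y)         # Y on X1 only
sse_full = sum(e * e for e in resid_full)
sse_red = sum(e * e for e in resid_red)

r_squared = 1 - sse_full / sst
f_overall = ((sst - sse_full) / 2) / (sse_full / (n - 3))
f_partial_x2 = ((sse_red - sse_full) / 1) / (sse_full / (n - 3))

print("coefficients:", [round(c, 4) for c in coef_full])
print("R^2:", round(r_squared, 4))
print("overall F (2, 12 df):", round(f_overall, 3))
print("partial F for X2 given X1 (1, 12 df):", round(f_partial_x2, 3))
```

The partial F ratio is the piece that speaks to whether X2 adds predictive value beyond X1; it would be compared with the tabulated F value for 1 and 12 degrees of freedom.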
42. Explain the difference between interval scale and ratio scale.
46. a. Find the sum of squared error for the following data
90 70 6
65 75 8
90 80 10
50. How do you establish a confusion matrix for two groups wherein the first group consists of 20
members and second group consists of 40 members? The apparent error rate is 20%
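The apparent error rate fixes only the total number of misclassified members, not how the errors are split between the groups. The sketch below works out that total and builds one possible confusion matrix; the 4/8 split of errors is a hypothetical choice made purely for illustration.

```python
def apparent_error_rate(confusion):
    """Share of off-diagonal (misclassified) counts in a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total

n1, n2, aper = 20, 40, 0.20
misclassified = round(aper * (n1 + n2))  # 12 members in total

# Hypothetical split: 4 errors from group 1 and 8 from group 2.
errors_1, errors_2 = 4, 8
confusion = [[n1 - errors_1, errors_1],
             [errors_2, n2 - errors_2]]

print(misclassified)                   # 12
print(confusion)                       # [[16, 4], [8, 32]]
print(apparent_error_rate(confusion))  # 0.2
```

Any other split whose two error counts add up to 12 would reproduce the same 20% apparent error rate.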
52. Explain the steps involved in performing maximum likelihood method for the factor extraction.
X Y
5 4
7 8
6 5
4 2
5 4
54. Find the variance-covariance matrix for the following data.
X Y Z
5 4 3
2 4 6
7 5 3
4 5 8
4 2 6
3 5 1
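A short sketch of the variance-covariance calculation for the six observations above. It assumes the sample version with divisor n − 1 (the question does not say whether the sample or population form is wanted); the function name is illustrative.

```python
def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor n - 1) of row-wise observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)]
            for i in range(p)]

data = [(5, 4, 3), (2, 4, 6), (7, 5, 3), (4, 5, 8), (4, 2, 6), (3, 5, 1)]
S = covariance_matrix(data)

for row in S:
    print([round(v, 4) for v in row])
```

The diagonal holds the variances of X, Y and Z, and the off-diagonal entries hold the pairwise covariances, so the matrix is symmetric.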
55. From the data given below, do all the working using MATLAB.
The two models are arrived at from the same data. Explain the reason for the different values. How do you interpret a
partial regression coefficient? Explain the test that is used for testing the significance of a partial regression
coefficient.
58. A process for making steel wire turns out wire with a mean tensile strength of 200 psi. The process standard
deviation is 20 psi. The quality control engineer wants to design a test that will indicate whether or not there has
been a shift in the process average, using a sample size of 25 and a level of significance of α = 0.05. State H0 and
H1 for this test.
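Because the engineer is looking for a shift in either direction, a two-sided z test is the natural reading: H0: μ = 200 against H1: μ ≠ 200. The sketch below, under that assumption and with the process standard deviation treated as known, computes the acceptance limits for the sample mean; the function name `reject_h0` is illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 200.0, 20.0, 25, 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
std_error = sigma / math.sqrt(n)              # standard error of the sample mean
lower = mu0 - z_crit * std_error
upper = mu0 + z_crit * std_error

def reject_h0(sample_mean):
    """Reject H0: mu = 200 when the sample mean falls outside the limits."""
    return sample_mean < lower or sample_mean > upper

print(round(z_crit, 2), round(lower, 2), round(upper, 2))
```

A sample mean outside the two limits would be taken as evidence that the process average has shifted.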
59. Consider a three-group discriminant analysis on two variables (X1 and X2). The number of observations in
each group (drawn at random from the population) is: n1 = 25, n2 = 35, n3 = 25. The within-group sum-of-squares
matrices and group centroids are given below:
60. Calculate W⁻¹ and B and the eigenvalues and eigenvectors of W⁻¹B. Construct the linear discriminant analysis. Find
the apparent error rate and the confusion matrix.
G x1 x2
1 150 15
1 147 13
1 145 14
1 144 16
1 153 13
1 140 15
1 151 14
1 143 14
1 144 14
1 142 15
1 141 13
1 150 15
1 148 13
1 154 15
1 147 14
1 137 14
1 134 15
1 157 14
1 149 13
1 147 13
1 148 14
2 120 14
2 123 16
2 130 14
2 131 16
2 116 16
2 122 15
2 127 15
2 132 16
2 125 14
2 119 13
2 122 13
2 120 15
2 119 14
2 123 15
2 125 15
2 125 14
2 129 14
2 130 13
2 129 13
2 122 12
2 129 15
2 124 15
2 120 13
2 119 16
2 119 14
2 133 13
2 121 15
2 128 14
2 129 14
2 124 13
2 129 14
3 145 8
3 140 11
3 140 11
3 131 10
3 139 11
3 139 10
3 136 12
3 129 11
3 140 10
3 137 9
3 141 11
3 138 9
3 143 9
3 142 11
3 144 10
3 138 10
3 140 10
3 130 9
3 137 11
3 137 10
3 136 9
3 140 10
2 4
1 3
69. An analysis attempts to identify the factors that determine utility values for computer professionals in a
large corporation. The variables included in the study were:
i) Education, defined by the dummy variables E1 and E2, where
(E1, E2) = (1,0) for a high school diploma
(E1, E2) = (0,1) for a B.S. degree
(E1, E2) = (0,0) for an advanced degree
ii) Whether the individual has management responsibility, defined by the dummy variable MGT, where
MGT = 1 if the individual has management responsibility
MGT = 0 if not
The model considered is
SALARY = 11,032.00 – 2,996.00 E1 + 147.98 E2 + 6,883.50 MGT
Find the part-worth values for each level of the corresponding factor.
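With dummy coding, each coefficient is the effect of a level relative to the omitted base level (advanced degree, no management responsibility), whose effect is zero. One common convention, assumed here, is to center the effects within each factor so that the part-worths sum to zero; the function name is illustrative, and an answer that reports the uncentered effects may be equally acceptable.

```python
# Effects relative to the base level, read from the fitted model.
education = {"high school diploma": -2996.00, "B.S. degree": 147.98, "advanced degree": 0.0}
management = {"has responsibility": 6883.50, "no responsibility": 0.0}

def centered_part_worths(effects):
    """Shift the level effects of one factor so that they sum to zero."""
    avg = sum(effects.values()) / len(effects)
    return {level: value - avg for level, value in effects.items()}

edu_pw = centered_part_worths(education)
mgt_pw = centered_part_worths(management)

for level, value in {**edu_pw, **mgt_pw}.items():
    print(f"{level}: {value:.2f}")
```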
70. A sample of n=10 observations gives the values in the following table.
Ordered observations (X j)
-1.00
-0.10
0.16
0.41
0.62
0.80
1.26
1.54
1.71
2.30
X 1 2 3 4 5 6 7 8 9 10
Y 2 1 2 3 4 5 5 6 7 8
Given (X′X)⁻¹ =
 0.4666 -0.0666
-0.0666  0.0120
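The given matrix agrees, up to rounding, with (X′X)⁻¹ for a design matrix made of a column of ones and the X values 1 to 10. The sketch below checks this and, assuming the intended task is a least-squares fit of Y on X with an intercept, computes b = (X′X)⁻¹X′y; the variable names are illustrative.

```python
x = list(range(1, 11))
y = [2, 1, 2, 3, 4, 5, 5, 6, 7, 8]
n = len(x)

# Elements of X'X and X'y for the design matrix [1, x].
sx, sxx = sum(x), sum(v * v for v in x)
sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))

# Closed-form inverse of the 2x2 matrix [[n, sx], [sx, sxx]].
det = n * sxx - sx * sx
inv = [[sxx / det, -sx / det],
       [-sx / det, n / det]]

# Least-squares coefficients b = (X'X)^-1 X'y.
b0 = inv[0][0] * sy + inv[0][1] * sxy
b1 = inv[1][0] * sy + inv[1][1] * sxy

print([[round(v, 4) for v in row] for row in inv])
print(round(b0, 4), round(b1, 4))
```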
72. Find Λ(X1/X2) and Λ(X3/X4) using MATLAB for the given data (Table 1).
x1 x2 x3 x4 x5
3.5 46 0.1 7.81 12.63
2.7 35 0 5.11 9
π2 3 30 0 5.12 10.77
7.2 22 1 4.7 3.49
73. Using MATLAB, find the confusion matrix for the data given for the three-group problem (data given
in problem 72; note: only for the first two groups).
74. Find the d² value for the given data (Table 2), where d² = (Xj – X̄)′ S⁻¹ (Xj – X̄) and S is the variance-
covariance matrix, and find the outliers by subjective observation.
X1 X2 X3 X4 X1 X2 X3 X4
1889 1651 1561 1778 1954 2149 1180 1281
2403 2048 2087 2197 1325 1170 1002 1176
2119 1700 1815 2222 1419 1371 1252 1308
1645 1627 1110 1533 1828 1634 1602 1755
1976 1916 1614 1883 1725 1594 1313 1646
1712 1712 1439 1546 2276 2189 1547 2111
1943 1685 1271 1671 1899 1614 1422 1477
2104 1820 1717 1874 1633 1513 1290 1516
2983 2794 2412 2581 2061 1867 1646 2037
1745 1600 1384 1508 1856 1493 1356 1533
1710 1591 1518 1667 1727 1412 1238 1469
2046 1907 1627 1898 2168 1896 1701 1834
1840 1841 1595 1714 1655 1675 1414 1597
1867 1685 1493 1678 2326 2301 2065 2234
1859 1649 1389 1714 1490 1382 1214 1284
76. Find the variance-covariance matrix using MATLAB for the factor loading values and specific
variance values given in the following table.
Variate λ1 λ2 λ3 ψ
1 0.664 0.321 0.074 0.450
2 0.689 0.247 -0.193 0.427
3 0.493 0.302 -0.222 0.617
4 0.837 -0.292 -0.035 0.212
5 0.705 -0.315 -0.153 0.381
6 0.819 -0.377 0.105 0.177
7 0.661 0.396 -0.078 0.4
8 0.458 0.296 0.491 0.462
9 0.766 0.427 -0.012 0.231
1 5 4 3
2 6 3 5
3 7 5 7
4 6 5 5
5 7 6 5
6 8 4 7
7 4 5 7
X1 .42 .51
X2 .32 .15
X3 .26 .31
X4 .36 .26
X5 .17 .28
87. From the given factor analysis output, fill in the blanks in the table.
Variables
Common Variance (%) - -
88. The marketing manager of a fast-moving consumer products company thinks there is a strong link between
advertising and promotional expenditure and the sales in the following week. He collects data from his company
records on sales, advertising expenditure, and promotional (non-advertising) expenditure for one of the large
territories of his company. The data are shown below.
The marketing manager would like you to perform a regression analysis on the data and advise him on how to use the
regression model to predict sales based on advertising and promotional expenditure. What would you tell him?
89. A total of 77 samples were collected and analysed for six physical characteristics, namely specific gravity, pH
value, osmolarity (MOSM), urea concentration (UREA), conductivity (MMHO) and calcium concentration (CALCIUM).
1. Using the LOGISTIC procedure, determine the possible presence of crystals in urine using these
characteristics.
5. Find the R2 value.
no 1.017 7.92 680 25.3 282 1.06
no 1.019 5.98 579 15.5 297 3.93
no 1.017 6.56 559 15.8 317 5.38
no 1.008 5.94 256 8.1 130 3.53
no 1.023 5.85 970 38 362 4.54
no 1.02 5.66 702 23.6 330 3.98
no 1.008 6.4 341 14.6 125 1.02
no 1.02 6.35 704 24.5 260 3.46
no 1.009 6.37 325 12.2 97 1.19
no 1.018 6.18 694 23.3 311 5.64
no 1.021 5.33 815 26 385 2.66
no 1.009 5.64 386 17.7 104 1.22
no 1.015 6.79 541 20.9 187 2.64
no 1.01 5.97 343 13.4 126 2.31
no 1.02 5.68 876 35.8 308 4.49
yes 1.021 5.94 774 27.9 325 6.96
yes 1.024 5.77 698 19.5 354 13
yes 1.024 5.6 866 29.5 360 5.54
yes 1.021 5.53 775 31.2 302 6.19
yes 1.024 5.36 853 27.6 364 7.31
yes 1.026 5.16 822 26 301 14.34
yes 1.013 5.86 531 21.4 197 4.74
yes 1.01 6.27 371 11.2 188 2.5
yes 1.011 7.01 443 21.4 124 1.27
yes 1.011 6.13 364 10.9 159 3.1
yes 1.031 5.73 874 17.4 516 3.01
yes 1.02 7.94 567 19.7 212 6.81
yes 1.04 6.28 838 14.3 486 8.28
yes 1.021 5.56 658 23.6 224 2.33
yes 1.025 5.71 854 27 385 7.18
yes 1.026 6.19 956 27.6 473 5.67
yes 1.034 5.24 1236 27.3 620 12.68
yes 1.033 5.58 1032 29.1 430 8.94
yes 1.015 5.98 487 14.8 198 3.16
yes 1.013 5.58 516 20.8 184 3.3
yes 1.014 5.9 456 17.8 164 6.99
yes 1.012 6.75 251 5.1 141 0.65
yes 1.025 6.9 945 33.6 396 4.18
yes 1.026 6.29 833 22.2 457 4.45
yes 1.028 4.76 312 12.4 10 0.27
yes 1.027 5.4 840 24.5 395 7.64
yes 1.018 5.14 703 29 272 6.63
yes 1.022 5.09 736 19.8 418 8.53
yes 1.025 7.9 721 23.6 301 9.04
yes 1.017 4.81 410 13.3 195 0.58
yes 1.024 5.4 803 21.8 394 7.82
yes 1.016 6.81 594 21.4 255 12.2
yes 1.015 6.03 416 12.8 178 9.39
90. Find the covariance matrix for the given data (without MATLAB).
1 5 4 3
2 6 3 5
3 7 5 7
4 6 5 5
5 7 6 5
6 8 4 7
7 4 5 7
91. How do you establish a confusion matrix for two groups wherein the first group consists of 60
members and second group consists of 50 members? The apparent error rate is 25%.