
Multivariate Data Analysis Techniques

The document summarizes different types of multivariate analysis techniques. It discusses dependence methods where one or more variables are predicted by others, such as multiple regression, ANOVA, discriminant analysis, and conjoint analysis. It also discusses interdependence methods where the relationships among all variables are examined together, including factor analysis and cluster analysis. Specific techniques like multiple regression, ANOVA, discriminant analysis, structural equation modeling, and conjoint analysis are described in more detail.

Uploaded by

Trinh Anh Phong
Copyright
© Attribution Non-Commercial (BY-NC)

Session 7

MULTIVARIATE DATA ANALYSIS

Contents…
1. Introduction to multivariate analysis
2. Dependence methods
3. Interdependence methods

12/08/21 1
I. INTRODUCTION
- Involves the simultaneous analysis of more than two variables
- Advanced statistical techniques
- Powerful in solving complex research problems
- “These techniques are extremely dangerous when being used by unskilled people” because “there are a number of problems and statistical assumptions related to each technique” (Kinnear & Taylor, 1987, p. 525).

- Dependence methods: one or more variables are designated as being predicted by a set of independent variables.
  Examples: multiple regression, ANOVA, conjoint analysis, discriminant analysis, structural equation modeling...
- Interdependence methods: no variable is designated as being predicted by others. It is the interrelationship among all the variables taken together that interests the researcher.
  Examples: factor analysis, cluster analysis, multidimensional scaling.
II. DEPENDENCE METHODS
Scale requirement

                                             Required scale of variable(s)
Method                                       Dependent    Independent
One dependent variable
  Multiple regression                        Interval     Interval
  ANOVA                                      Interval     Nominal
  Multiple regression with dummy variable    Interval     Nominal
  Discriminant analysis                      Nominal      Interval
  Conjoint analysis                          Ordinal      Nominal
Two or more dependent variables
  Canonical analysis                         Interval     Interval
  MANOVA                                     Interval     Nominal
Network structure including many dependent and independent variables
  SEM                                        Interval     Interval
II.1. Multiple Regression
Y = a1X1 + a2X2 + a3X3 + ... + anXn + b
- One dependent variable (DV), two or more independent variables (IDVs)
- All variables are interval-scaled (except dummy variables)
- Three key results to analyze:
  - Goodness of fit of the regression equation: represented by r² (coefficient of determination), which ranges from 0 to 1 and gives the % of variation of Y explained by the regression.
  - Significance of r²: use the F-test (sig.)
  - Significance of each regression coefficient (a1, a2, a3, ...): use the t-test (sig.)
(SPSS provides all sig. levels)
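A minimal sketch of how the coefficients, r² and the F statistic are obtained, assuming ordinary least squares on a small made-up data set. The helper names `solve` and `fit_ols` and all numbers are illustrative assumptions; t-tests for the individual coefficients and the sig. levels are left to the statistical package.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Return (coef, r2, f_stat); coef[0] is the constant b, coef[1:] are a1..an."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 carries the constant term
    p = len(rows[0])
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    k = p - 1  # number of IDVs
    f_stat = (r2 / k) / ((1 - r2) / (len(y) - k - 1))
    return coef, r2, f_stat


# Hypothetical data: two IDVs (X1, X2) and one DV (Y) for six observations.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [4.1, 5.4, 9.2, 10.3, 14.1, 15.6]
coef, r2, f_stat = fit_ols(xs, y)
print("b, a1, a2:", [round(c, 3) for c in coef])
print("r2:", round(r2, 3), "F:", round(f_stat, 2))
```

The F statistic is then compared with the F distribution on (k, n - k - 1) degrees of freedom to obtain its sig. level.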
Assumptions in multiple regression
a. Linearity: relationships between the DV and the IDVs are linear.
   Check by inspecting scatter diagrams or the correlation matrix.
b. No multicollinearity: no strong linear correlation among the IDVs.
   Check by examining Tolerance or VIF (variance inflation factor), where VIF = 1 / Tolerance.
c. Normality of all variables and of all residuals
d. Constant variance of the error term (homoscedasticity)
e. Independence of the error terms
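The multicollinearity check can be sketched as follows, assuming the special case of exactly two IDVs, where Tolerance equals 1 minus the squared correlation between them. With more IDVs, each one would instead be regressed on all the others. The function names and the ratings are hypothetical.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def tolerance_and_vif(x1, x2):
    """Two-IDV case only: Tolerance = 1 - r^2 and VIF = 1 / Tolerance."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2
    return tolerance, 1 / tolerance


# Hypothetical scores on two IDVs for eight respondents.
rewards = [3, 4, 2, 5, 4, 3, 5, 2]
recognition = [2, 4, 3, 4, 5, 3, 4, 2]
tol, vif = tolerance_and_vif(rewards, recognition)
print("Tolerance:", round(tol, 3), "VIF:", round(vif, 3))
```

A Tolerance close to 1 (VIF close to 1) indicates that the IDVs are nearly uncorrelated. Values of Tolerance near 0 signal a multicollinearity problem.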
Notes when using multiple regression:
- Applicable when there are linear correlations among the variables.
- Does not prove a causal relationship.
- Can be used for prediction or explanation.
- Required sample size: there should be more than 10 observations for each IDV.
- If an IDV is nominally scaled, dummy-variable regression can be employed.
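Dummy coding of a nominal IDV might look like the sketch below, assuming the usual convention of k - 1 dummy variables for k categories, with one category kept as the reference. The function name `dummy_code` and the category labels are illustrative.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into 0/1 dummy columns, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical nominal IDV with three categories.
industry = ["Garment", "Cosmetic", "Plastic", "Garment"]
levels, rows = dummy_code(industry, reference="Garment")
print(levels)
for original, coded in zip(industry, rows):
    print(original, coded)
```

The reference category is represented by zeros in every dummy column, so each dummy coefficient is read as a difference from that category.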
Example:
Identifying the determinants of employee satisfaction in XYZ Co.
DV: Employee satisfaction.
IDVs: Rewards, Working conditions, Recognition by managers, Peer relationships, Promotion opportunities, Development opportunities.

                Unstandardized        Standardized                     Collinearity
                coefficients          coefficients                     statistics
IDVs            B        Std. Error   Beta            t       Sig.     Tolerance   VIF
(Constant)      0.540    0.193                        2.793   .007
Rewards         0.526    0.081        0.596           6.491   .000     .793        1.262
Recognition     0.205    0.061        0.310           3.380   .001     .793        1.262

r² = 0.619, F sig. = 0.000
II.2. ANOVA – ANALYSIS OF VARIANCE
- Non-metric IDVs and a metric DV
- Used to compare the means of the DV across groups defined by one or more IDVs.
- Can be used with more than one IDV (factorial ANOVA).
- Principle: “between-group variance > within-group variance” → significant differences in the means of the groups
- Family: ANCOVA / MANOVA / MANCOVA
Example of ANOVA:
A survey of 200 companies in the garment, cosmetic and plastic industries about their average expenses for sales promotion during the last three years.
The researcher wants to explore whether there are significant differences in average sales promotion expenses among these three industries.
Company No.   Industry    SP expenses (1000 USD)
1             Garment     123
2             Garment     235
3             Cosmetic    1346
4             Plastic     876
..            ..
199           Plastic     68
200           Garment     12
- IDV: Industry (nominal, 3 treatments)
- DV: Sales promotion expenses (ratio)
Possible method: compare the mean values of the DV for each pair of industries (using the t-test).
However, as the number of treatments increases, the pairwise comparisons become arduous.
In such a situation, ANOVA is the better method:
H0: μ1 = μ2 = ... = μk = μ
Ha: at least one μi is significantly different from the others,
where μ = population mean.
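The between-group versus within-group comparison can be sketched as a one-way ANOVA F statistic. The expense figures below are invented for illustration (they are not the survey data), and the function name `one_way_anova_f` is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical sales promotion expenses (1000 USD) for three industries.
garment = [120, 235, 90, 150]
cosmetic = [900, 1346, 1100, 980]
plastic = [876, 68, 300, 410]
f_stat, df1, df2 = one_way_anova_f([garment, cosmetic, plastic])
print("F =", round(f_stat, 2), "with df =", (df1, df2))
```

A large F means the group means differ by more than the within-group variation would suggest. The sig. level comes from the F distribution with (k - 1, n - k) degrees of freedom, which a statistical package reports directly.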
II.3. DISCRIMINANT ANALYSIS
- Purpose: to identify the linear combination of IDVs that best discriminates among prespecified groups formed on the basis of a DV.
- Metric IDVs, nominal DV.
- Outcomes: a linear combination Y = v1X1 + v2X2 + v3X3 + … and a critical score Ycri
- For a particular subject:
  - Calculate its Y score,
  - Compare Y with Ycri
  → predict which group the subject belongs to.
Example
- An IT trading company wants to know whether family income (X1) and householder’s education (X2) are useful to discriminate between PC buyers and non-PC buyers.
- Conduct a survey of n households (with / without a PC).
  IDVs: X1 – income, X2 – education: metric variables
  DV: with a PC / without a PC: categorical variable
- Analysis results: discriminant function Y = v1X1 + v2X2
  v1, v2: discriminant coefficients
  Ycri: critical score
- Given a household i (X1i and X2i), we can predict whether it is a (potential) buyer.
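One way the coefficients v1, v2 and the critical score Ycri could be obtained is Fisher's two-group linear discriminant, sketched below. The slides do not say which estimation method is used, so this is an assumption, as are the household data, the function names, and the choice of Ycri as the midpoint between the two group mean scores (reasonable when the groups are of equal size).

```python
def mean_vec(rows):
    """Mean of X1 and X2 over a list of (X1, X2) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2 x 2 within-group scatter matrix around the group mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_discriminant(buyers, non_buyers):
    """Return (v1, v2, y_cri) with v = Sw^-1 (mean_buyers - mean_non_buyers)."""
    m1, m2 = mean_vec(buyers), mean_vec(non_buyers)
    s1, s2 = scatter(buyers, m1), scatter(non_buyers, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    # Explicit inverse of the 2 x 2 pooled scatter matrix applied to diff.
    v1 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    v2 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    y_buyers = v1 * m1[0] + v2 * m1[1]
    y_non_buyers = v1 * m2[0] + v2 * m2[1]
    return v1, v2, (y_buyers + y_non_buyers) / 2


def predict_buyer(v1, v2, y_cri, income, education):
    """Classify a household as a (potential) buyer when its Y score exceeds Ycri."""
    return v1 * income + v2 * education > y_cri


# Hypothetical households: (income, years of education).
buyers = [(30, 16), (35, 14), (40, 18), (28, 15)]
non_buyers = [(12, 9), (15, 12), (10, 10), (18, 11)]
v1, v2, y_cri = fit_discriminant(buyers, non_buyers)
print("v1, v2, Ycri:", round(v1, 4), round(v2, 4), round(y_cri, 4))
print("Household (25, 13) is a potential buyer:", predict_buyer(v1, v2, y_cri, 25, 13))
```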
II.4. CONJOINT ANALYSIS
- Deriving the utility values attached to various attributes of an object based on respondents’ overall preferences for different bundles of attributes / profiles of the object.
- Nonmetric IDVs, ordinal DV
- The researcher designs a number of test alternatives. Each alternative represents a combination of treatments.
- Respondents are asked to rank the alternatives according to their preference.
- Conjoint analysis derives the utility score for each attribute, representing its relative importance to the overall preference.
- Applications:
  - To test a new product with various attributes (quality, packaging, price...). Each has some treatments (high/medium/low).
  - To estimate the market shares of different brands.
  - To segment the market / categorize study subjects.
Example
Test a new product with 3 attributes:
  Price: (high, medium, low)
  Package size: (small, medium, large)
  Features: (simple, complex)
Form 8 test alternatives (instead of all 3 × 3 × 2 = 18 combinations).
Ask respondents to rank-order them.
Results:
- contribution of each attribute to overall preference
- preference for each treatment within an attribute
- identification of the most preferred combination
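The sketch below enumerates the 18 possible profiles and derives part-worth utilities and attribute importance from them. It is deliberately simplified: it assumes one respondent who scores all 18 profiles (higher means more preferred) instead of ranking 8 selected alternatives, and the scores are generated from made-up utilities. All names and numbers are hypothetical.

```python
from itertools import product

ATTRIBUTES = {
    "Price": ["high", "medium", "low"],
    "Package size": ["small", "medium", "large"],
    "Features": ["simple", "complex"],
}


def full_factorial(attributes):
    """Every combination of one treatment per attribute."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


def part_worths(profiles, scores, attributes):
    """Part-worth of a treatment = mean score of profiles containing it minus the grand mean."""
    grand_mean = sum(scores) / len(scores)
    result = {}
    for name, levels in attributes.items():
        result[name] = {}
        for level in levels:
            vals = [s for p, s in zip(profiles, scores) if p[name] == level]
            result[name][level] = sum(vals) / len(vals) - grand_mean
    return result


def importance(worths):
    """Relative importance = range of an attribute's part-worths over the sum of all ranges."""
    ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}


# Made-up utilities used only to generate preference scores for the illustration.
HYPOTHETICAL_UTILITY = {
    "Price": {"high": -2.0, "medium": 0.0, "low": 2.0},
    "Package size": {"small": -1.0, "medium": 0.0, "large": 1.0},
    "Features": {"simple": -0.5, "complex": 0.5},
}
profiles = full_factorial(ATTRIBUTES)
scores = [sum(HYPOTHETICAL_UTILITY[a][lvl] for a, lvl in p.items()) for p in profiles]
worths = part_worths(profiles, scores, ATTRIBUTES)
best = max(zip(scores, range(len(profiles))))[1]
print("Number of profiles:", len(profiles))
print("Importance:", importance(worths))
print("Most preferred combination:", profiles[best])
```

Averaging per treatment gives the main-effect estimates only because the full design is balanced. With a reduced set of alternatives, the part-worths would normally be estimated by regression on dummy-coded treatments.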
II.5. Structural Equation Modeling - SEM
[Figure: SEM path diagram. Constructs shown: Customer orientation, Competitor orientation, Functional coordination, Profit orientation, Responsiveness, Market orientation, Management competencies, Business performance. Path coefficients shown: .80, .73, .33, .62, .83, .34, .59, .83]
III. INTERDEPENDENCE METHODS
III.1. Factor Analysis
- For data / variable reduction by grouping variables into representative factors.
- Metric variables
- Application:
  - Developing multi-item scales
  - Exploring the pattern of a data set
  - Reducing the dimensions of a data set
Example:
Case   X1   X2   X3   …   …   Xm
1
2
3
…
n
Factor analysis: grouping m variables into k factors
  Factor 1 includes X1, X6, X9, Xm
  Factor 2 includes X2, X3, X10, Xm-1
  Factor 3 includes X4, X5, X7, X8, ...
Two types: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA).
III.2. Cluster analysis
Segmenting objects into homogeneous groups, given data for the objects on a variety of characteristics.
Examples: market segmentation, buying-behavior typology
Procedure:
- Identify variables / characteristics for grouping
- Segment based on similarities (distances)
- Label clusters based on their shared characteristics
- Validation and profiling
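The distance-based segmenting step can be sketched with k-means, one of several clustering algorithms. The slides do not name a specific algorithm, so this choice is an assumption, as are the ratings, the starting centroids and the function names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two rating profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, iterations=20):
    """Assign each point to its nearest centroid, then recompute centroids until stable."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical importance ratings on five characteristics (1 = very important, 5 = not important).
ratings = [
    [1, 4, 4, 1, 1],
    [1, 5, 4, 2, 1],
    [2, 4, 5, 1, 2],
    [1, 1, 2, 4, 5],
    [2, 1, 1, 5, 4],
    [1, 2, 1, 4, 4],
]
labels, centroids = k_means(ratings, centroids=[ratings[0], ratings[3]])
print("Cluster labels:", labels)
print("Cluster profiles:", [[round(v, 2) for v in c] for c in centroids])
```

The centroid of each cluster is its average rating profile, which is what the labeling and profiling steps then interpret.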
Example: Segmenting the detergent market
- Metric scales
- Based on consumer buying behaviors.
- “Please indicate the importance level (from 1 for very important to 5 for not important at all) of the following factors when you consider buying detergent powder”
  X1 – Product quality ____
  X2 – Price ____
  X3 – Convenience ____
  X4 – Known brand ____
  X5 – Sales promotion ____
300 consumers have been surveyed.
The data are cluster analyzed to identify different clusters.
Customers within each cluster have similar perceptions of the importance of X1 to X5 in their buying decision.
Results:
Cluster 1 – (Young, urban, medium/high income)
  → X1, X4, X5 are important;
Cluster 2 – (Industrial / business customers)
  → X1, X2, X3 are important.
→ From these findings, the company will develop its targeting strategy and business plan.
III.3. Multidimensional scaling (perceptual mapping)
- Inferring the number / nature of dimensions underlying respondents’ perceptions, based on their judgements about objects (brands, products, companies, localities, etc.)
- Metric / nonmetric scale
- Identifying the relative positions (on a map) of competitive brands based on several dimensions.
Example: MDS result for TV brands in HCMC

PRACTICE PROJECT
Work on your data set to answer the research objectives:
- Determinants of learning effectiveness?
- How to improve learning effectiveness?
Simplified procedure:
- For each concept in the model, pick one representative variable
- Run multiple regression with “learning effectiveness” as the dependent variable and the others as independent variables
- Interpret the results to answer the research objectives
PRACTICE PROJECT
A better procedure:
- Assess and refine the scales using factor analysis and reliability assessment
- Calculate factor scores using the qualified variables
- Run multiple regression
- Interpret the results
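For the reliability assessment step, a minimal sketch is given below. The slides do not name a reliability statistic; Cronbach's alpha is assumed here as a common choice for multi-item scales. The item scores and function names are hypothetical.

```python
def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(items):
    """items: one list of respondent scores per scale item, all of equal length."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's summed score
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point ratings from five respondents on three items of one scale.
item_1 = [4, 5, 3, 2, 4]
item_2 = [4, 4, 3, 2, 5]
item_3 = [5, 5, 2, 3, 4]
print("Cronbach's alpha:", round(cronbach_alpha([item_1, item_2, item_3]), 3))
```

Items that lower alpha noticeably are candidates for removal before factor scores are calculated.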
The SEM result
[Figure: SEM path diagram. Constructs shown: teaching method, assessment, instructor devotion, reference readability, learning motive, active participation, learning effectiveness. Coefficients shown: .68, .50, .29, .12, .14, .58, .42, .32, .34, .29, .24, .13, .16]
78% of the variation of LEARNING EFFECTIVENESS can be explained by this model
END SESSION 7
