CAMI16 - Data Analytics

The document provides an overview of Principal Component Analysis (PCA) and Factor Analysis, detailing their methodologies, applications, and algorithms. PCA is a statistical technique used for dimensionality reduction by transforming correlated variables into uncorrelated principal components, while Factor Analysis models observed variables based on underlying latent factors. The content includes practical examples, particularly using the iris dataset in R, to demonstrate the implementation of PCA.

Uploaded by

jaspreet19504

CAMI16 – Data Analytics
Module 4 - contents
• Principal Component Analysis: Extracting Principal Components,
Graphing of Principal Components, Some Sampling Distribution
Results, Component Scores, Large Sample Inferences, Monitoring
Quality with Principal Components.
• Factor Analysis: Orthogonal Factor Model, Communalities, Factor
Solutions and Rotation.
Principal Component Analysis
• Principal component analysis (PCA) is a statistical procedure used to
reduce the dimensionality of a dataset.
• It uses an orthogonal transformation to convert a set of observations
of possibly correlated variables into a set of values of linearly
uncorrelated variables called principal components.
PCA Algorithm
• Standardize the data: PCA is sensitive to the scale of the variables, so the
first step is usually to standardize the data so that every variable has a mean
of 0 and a standard deviation of 1.
• Calculate the covariance matrix: The next step is to calculate the covariance
matrix of the standardized data. This matrix shows how each variable is related
to every other variable in the dataset.
• Calculate the eigenvectors and eigenvalues: The eigenvectors and eigenvalues
of the covariance matrix are then calculated. The eigenvectors represent the
directions in which the data varies the most, while the eigenvalues represent
the amount of variation along each eigenvector.
• Choose the principal components: Sort the eigenvectors by decreasing
eigenvalue and keep the top k. These components represent the directions in
which the data varies the most and are used to transform the original data
into a lower-dimensional space.
• Transform the data: The final step is to transform the original data into the
lower-dimensional space defined by the principal components.
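The steps above can be sketched in a few lines of R. This is a minimal illustration on the built-in iris data, not a definitive implementation; the choice k = 2 and the variable names (Z, S, W, scores) are assumptions made for the example.

```r
# Manual PCA on the four numeric iris columns, following the steps above.
X <- as.matrix(iris[, 1:4])

# 1. Standardize: mean 0 and standard deviation 1 for every column.
Z <- scale(X, center = TRUE, scale = TRUE)

# 2. Covariance matrix of the standardized data
#    (this equals the correlation matrix of X).
S <- cov(Z)

# 3. Eigen-decomposition; eigen() returns eigenvalues in decreasing order.
eig <- eigen(S, symmetric = TRUE)

# 4. Keep the top k eigenvectors (k = 2 is an arbitrary illustrative choice).
k <- 2
W <- eig$vectors[, 1:k]

# 5. Project the standardized data onto the chosen components.
scores <- Z %*% W

# Cross-check against prcomp(). Eigenvector signs are arbitrary,
# so any comparison of scores should use absolute values.
ref <- prcomp(X, scale. = TRUE)
```

The eigenvalues in eig$values should match ref$sdev^2, and abs(scores) should match abs(ref$x[, 1:2]).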
PCA Example
• Standardize the data set using $z = (x - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of each variable (equivalently written $x_{new} = (x - \mu)/\sigma$).
• Calculate the covariance matrix.
• Calculate the eigenvalues.
• Calculate the eigenvectors.
• Sort the eigenvalues and their corresponding eigenvectors:
• λ = 2.51579324, 1.0652885, 0.39388704, 0.02503121
• Pick the k largest eigenvalues and form a matrix from their eigenvectors.
• If we choose the top 2 eigenvectors, the result is a 4 × 2 matrix whose columns are those eigenvectors.
• Transform the original data by multiplying the standardized data by this matrix.
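How many eigenvalues to keep is usually judged from the share of total variance each one carries. The short sketch below computes that share for the four eigenvalues listed in the example; the variable names are illustrative only.

```r
# Eigenvalues as listed in the example, already sorted in decreasing order.
lambda <- c(2.51579324, 1.0652885, 0.39388704, 0.02503121)

# Share of the total variance carried by each component.
prop_var <- lambda / sum(lambda)

# Running total, used to decide how many components to keep.
cum_var <- cumsum(prop_var)

print(round(prop_var, 4))
print(round(cum_var, 4))
```

With these values the top two components retain roughly 89.5% of the total variance, which is why k = 2 is a reasonable choice in the example.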
PCA Reconstruction
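• Transformed data × (top k eigenvector matrix)ᵀ = zero-mean data
• Zero-mean data + mean = original data
• When k is smaller than the number of variables, the result is an approximation of the original data; with all eigenvectors it is exact.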
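The reconstruction rule can be tried out on the iris data. The sketch below assumes the data were only centred (not scaled), so that adding the mean back is enough; reconstruct() is a hypothetical helper written for this illustration.

```r
X <- as.matrix(iris[, 1:4])
mu <- colMeans(X)

# Centre the data only (no scaling), matching
# "zero-mean data + mean = original data".
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Hypothetical helper: rebuild the data from the first k components.
reconstruct <- function(pca, k, mu) {
  W_k <- pca$rotation[, 1:k, drop = FALSE]  # top k eigenvectors as columns
  Z_k <- pca$x[, 1:k, drop = FALSE]         # transformed data (scores)
  centred <- Z_k %*% t(W_k)                 # back to zero-mean data
  sweep(centred, 2, mu, "+")                # add the column means back
}

X_hat_2 <- reconstruct(pca, 2, mu)  # approximation from 2 components
X_hat_4 <- reconstruct(pca, 4, mu)  # all components: recovers X up to rounding

# Mean squared reconstruction error when only 2 components are kept.
mse_2 <- mean((X - X_hat_2)^2)
```

If the data had also been scaled to unit variance, the reconstruction would additionally need to multiply each column by its standard deviation before adding the mean.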
Applications of PCA
• Data Visualization/Presentation
• Data Compression
• Noise Reduction
• Data Classification
• Trend Analysis
• Factor Analysis
PCA in R
# display the iris dataset
print(iris)

# use dim() to get the dimensions of the dataset
cat("Dimension:", dim(iris))
# Dimension: 150 5

# use nrow() to get the number of rows
cat("\nRow:", nrow(iris))
# Row: 150

# use ncol() to get the number of columns
cat("\nColumn:", ncol(iris))
# Column: 5

# use names() to get the variable names of the dataset
cat("\nName of Variables:", names(iris))
# Name of Variables: Sepal.Length Sepal.Width Petal.Length Petal.Width Species

# get a statistical summary of the Sepal.Length variable
summary(iris$Sepal.Length)
#  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
# 4.300   5.100   5.800   5.843   6.400   7.900

# perform PCA on the four numeric columns of iris, scaled to unit variance
PCA_iris <- prcomp(iris[, 1:4], scale. = TRUE)

# get a statistical summary of the PCA results
summary(PCA_iris)
# Importance of components:
#                           PC1    PC2     PC3     PC4
# Standard deviation     1.7084 0.9560 0.38309 0.14393
# Proportion of Variance 0.7296 0.2285 0.03669 0.00518
# Cumulative Proportion  0.7296 0.9581 0.99482 1.00000

# scree plot: variance of each component, drawn as a line
plot(PCA_iris, type = "l")

# scatter plot of the first two components, coloured by species
# (autoplot() for prcomp objects comes from the ggfortify package)
library(ggfortify)
autoplot(PCA_iris, data = iris, colour = "Species")

# biplot of scores and variable loadings
biplot(PCA_iris)
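The object returned by prcomp() also exposes the loadings and the component scores directly. The following sketch, which rebuilds PCA_iris so it runs on its own, shows how the scores relate to the standardized data and to the "Standard deviation" row of the summary; the intermediate names are illustrative.

```r
PCA_iris <- prcomp(iris[, 1:4], scale. = TRUE)

loadings <- PCA_iris$rotation  # 4 x 4 matrix, one column per component
scores   <- PCA_iris$x         # 150 x 4 matrix of component scores

# Scores are the standardized data projected onto the loadings.
Z <- scale(iris[, 1:4])
manual_scores <- Z %*% loadings

# The "Standard deviation" row of summary() is the standard deviation
# of each column of scores.
score_sd <- apply(scores, 2, sd)

# Principal components are uncorrelated: off-diagonal entries are ~0.
score_cor <- round(cor(scores), 10)
print(score_cor)
```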
Factor Analysis
• Factor Analysis is a method for modeling observed variables, and
their covariance structure, in terms of a smaller number of underlying
unobservable (latent) “factors.”
• Factor analysis is generally an exploratory/descriptive method that
requires many subjective judgments.
• In factor analysis, we model the observed variables as linear functions
of the “factors.”
• In principal components, we create new variables that are linear
combinations of the observed variables. In both PCA and FA, the
dimension of the data is reduced.
Orthogonal Factor Model
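For reference, the orthogonal factor model is commonly written as below. The symbols follow the usual textbook notation and are an assumption, not taken from the slides: X is the vector of p observed variables with mean μ, F the vector of m common factors (m < p), L the p × m matrix of factor loadings, and ε the vector of specific factors.

```latex
\[
  \underset{(p \times 1)}{X - \mu}
  \;=\;
  \underset{(p \times m)}{L}\;\underset{(m \times 1)}{F}
  \;+\;
  \underset{(p \times 1)}{\varepsilon}
\]
\[
  X_i - \mu_i = \ell_{i1} F_1 + \ell_{i2} F_2 + \cdots + \ell_{im} F_m + \varepsilon_i,
  \qquad i = 1, \dots, p
\]
```

Each observed variable is thus a linear function of the same m unobserved factors plus its own specific term.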
Model Assumptions
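In the usual textbook formulation (notation assumed here: F are the common factors, ε the specific factors, L the loading matrix in X − μ = LF + ε), the assumptions of the orthogonal factor model and their consequence for the covariance matrix Σ of X can be written as:

```latex
\[
  E(F) = 0, \quad \operatorname{Cov}(F) = I, \quad
  E(\varepsilon) = 0, \quad
  \operatorname{Cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \quad
  \operatorname{Cov}(\varepsilon, F) = 0
\]
\[
  \Sigma = \operatorname{Cov}(X) = L L^{\top} + \Psi,
  \qquad
  \operatorname{Var}(X_i)
  = \underbrace{\ell_{i1}^2 + \cdots + \ell_{im}^2}_{h_i^2} + \psi_i
\]
```

Here h_i² is the communality of variable i (the part of its variance explained by the common factors) and ψ_i is its specific variance.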
Estimating the Factors – Principal Component Method
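A minimal sketch of the principal component method, assuming the common textbook recipe: take the eigen-decomposition of the sample correlation matrix and scale each retained eigenvector by the square root of its eigenvalue to obtain the loadings. The iris data and the choice of m = 2 factors are illustrative assumptions.

```r
R <- cor(iris[, 1:4])                 # sample correlation matrix
eig <- eigen(R, symmetric = TRUE)

m <- 2                                # number of factors (illustrative choice)

# Estimated loadings: each retained eigenvector scaled by the
# square root of its eigenvalue.
L <- eig$vectors[, 1:m, drop = FALSE] %*%
  diag(sqrt(eig$values[1:m]), nrow = m)
rownames(L) <- colnames(R)

# Communalities: variance of each variable explained by the common factors.
communalities <- rowSums(L^2)

# Specific variances: whatever is left on the diagonal of R.
uniquenesses <- 1 - communalities

# Residual matrix R - (L L' + Psi); its diagonal is zero by construction.
residual <- R - (L %*% t(L) + diag(uniquenesses))
```

Small off-diagonal entries in residual suggest that m factors reproduce the correlations reasonably well.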
