Understanding Correlation in Statistics

statistics in nursing

Correlation Meaning and Need

A correlation is a statistical measure of the relationship between two variables. The measure is best suited to variables that demonstrate a linear relationship with each other. The fit of the data can be visually represented in a scatterplot. Using a scatterplot, we can generally assess the relationship between the variables and determine whether they are correlated.

The correlation coefficient is a value that indicates the strength of the relationship between variables. The coefficient can take any value from -1 to +1.

A scatter plot illustrates the correlation between two attributes or variables: it represents how closely the two variables are connected. Three situations can arise when examining the relation between two variables:

 Positive Correlation (+1) – when the values of the two variables move in the same direction, so that an increase/decrease in the value of one variable is followed by an increase/decrease in the value of the other variable.
 Negative Correlation (-1) – when the values of the two variables move in the opposite direction, so that an increase/decrease in the value of one variable is followed by a decrease/increase in the value of the other variable.
 No Correlation (0) – when there is no linear dependence or relation between the two variables.
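The three situations above can be illustrated numerically. The following sketch, using made-up data, computes the correlation coefficient with NumPy's `corrcoef` for a perfectly positive and a perfectly negative linear relationship:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y_pos = 2 * x + 1    # moves in the same direction as x
y_neg = 10 - 3 * x   # moves in the opposite direction

# corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]

print(round(r_pos, 2))  # 1.0  (perfect positive correlation)
print(round(r_neg, 2))  # -1.0 (perfect negative correlation)
```

A set of unrelated values (e.g., random noise against x) would give a coefficient near 0.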

Correlation Formula

Correlation shows the relation between two variables, and the correlation coefficient measures the strength of that relation. To compare two datasets, we use correlation formulas.

Pearson Correlation Coefficient Formula

The most common formula is the Pearson correlation coefficient, used for linear dependence between data sets. The value of the coefficient lies between -1 and +1. A coefficient of zero means the data are considered unrelated, while +1 indicates a perfectly positive correlation and -1 a perfectly negative correlation.

r = [nΣxy − (Σx)(Σy)] / √([nΣx² − (Σx)²][nΣy² − (Σy)²])

Where n = number of paired values

Σx = sum of the first variable's values

Σy = sum of the second variable's values

Σxy = sum of the products of the paired values

Σx² = sum of the squares of the first variable's values

Σy² = sum of the squares of the second variable's values
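As a check on the formula, here is a small sketch that computes r directly from the sums defined above (the data values are made up for illustration):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from the summation formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 (y is exactly 2x)
print(pearson_r([1, 2, 3], [3, 2, 1]))        # -1.0 (y decreases as x increases)
```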

Rank-order correlation

Definition

A rank-order correlation is a correlation between two variables whose values are ranks.

Description

When variables are measured at least on ordinal scales, units of observation (e.g., individuals, nations, organizations, values) can be ranked. A ranking is an ordering of units of observation with respect to an attribute of interest. For example, nations can be ranked with respect to their quality of life, their freedom, their tightness or looseness, etc. A rank is the position of a unit of observation (e.g., a nation) in the ranking. Units of observation with higher ranks show the attribute of interest to a higher degree. If one is interested in the association between two rankings (e.g., quality of life and freedom of nations), rank-order correlations can be calculated.
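As a sketch of this idea, SciPy's `spearmanr` computes a rank-order correlation; the two rankings below (five hypothetical nations ranked on two attributes) are made up for illustration:

```python
from scipy import stats

# Hypothetical ranks of five nations on two attributes.
quality_of_life_rank = [1, 2, 3, 4, 5]
freedom_rank = [2, 1, 4, 3, 5]

rho, p_value = stats.spearmanr(quality_of_life_rank, freedom_rank)
print(round(rho, 1))  # 0.8: the two rankings largely agree
```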

Scatter diagrams
Scatter diagrams are used when you want to demonstrate the relationship between two
variables or when you have to identify data patterns.
A simple scatter plot can be used to see how ice cream sales vary with outdoor temperature. The two variables would be outside temperature and ice cream sales. This data could be collected and organized into a table. Once the data is organized into a table, it can be turned into ordered pairs.
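A minimal sketch of that table-to-ordered-pairs step, using made-up temperature and sales figures:

```python
# Hypothetical data: outdoor temperature (°C) and ice cream sales.
temps = [18, 22, 26, 30, 34]
sales = [120, 180, 250, 330, 410]

# Turn the two table columns into ordered pairs for a scatter plot.
pairs = list(zip(temps, sales))
print(pairs)  # [(18, 120), (22, 180), (26, 250), (30, 330), (34, 410)]
```

Each ordered pair becomes one point on the scatter diagram.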

Product-moment correlation
The Pearson product-moment correlation coefficient (or Pearson correlation coefficient, for short) is a measure of the strength of a linear association between two variables and is denoted by r. Basically, a Pearson product-moment correlation attempts to draw a line of best fit through the data of two variables, and the Pearson correlation coefficient, r, indicates how far away all these data points are from this line of best fit (i.e., how well the data points fit this new model/line of best fit).

Values of the product-moment correlation

The Pearson correlation coefficient, r, can take a range of values from +1 to -1. A value of
0 indicates that there is no association between the two variables. A value greater than
0 indicates a positive association; that is, as the value of one variable increases, so does
the value of the other variable. A value less than 0 indicates a negative association; that
is, as the value of one variable increases, the value of the other variable decreases. This
is shown in the diagram below:

Simple linear regression analysis and prediction


When there is only one predictor variable, the prediction method is called simple regression. In simple linear regression, the topic of this section, the predictions of Y, when plotted as a function of X, form a straight line.
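A short sketch of fitting that straight line by least squares and using it for a prediction; the data points are hypothetical:

```python
# Least-squares fit of Y = a + b*X, then a prediction for a new X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.1, 5.9, 8.2, 9.9]  # hypothetical observations

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²; intercept a = ȳ - b*x̄
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

def predict(x):
    return a + b * x

print(round(predict(6.0), 2))  # predicted Y for X = 6, about 11.95
```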

Comparison in Pairs
A paired comparison scale presents the respondent with two choices and calls for a
preference. For example, the respondent is asked which color he or she likes better, red
or blue, and a similar process is repeated throughout the scale items.
The pairwise comparison method (sometimes called the ‘paired comparison method’) is
a process for ranking or choosing from a group of alternatives by comparing them
against each other in pairs, i.e. two alternatives at a time. Pairwise comparisons are
widely used for decision-making, voting and studying people’s preferences.
Pairwise Comparison Steps:
1. Compute a mean difference for each pair of variables.
2. Find the critical mean difference.
3. Compare each calculated mean difference to the critical mean difference.
4. Decide whether to retain or reject the null hypothesis for that pair of means.
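The four steps above can be sketched as follows. The group means and the critical mean difference here are hypothetical; in practice the critical value is derived from the test's sampling distribution:

```python
from itertools import combinations

# Hypothetical group means and an assumed critical mean difference.
group_means = {"A": 10.0, "B": 13.5, "C": 10.8}
critical_difference = 2.5

for g1, g2 in combinations(group_means, 2):
    # Step 1: mean difference for this pair.
    mean_diff = abs(group_means[g1] - group_means[g2])
    # Steps 3-4: compare to the critical value and decide.
    decision = "reject H0" if mean_diff > critical_difference else "retain H0"
    print(f"{g1} vs {g2}: diff = {mean_diff:.1f} -> {decision}")
```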

Randomized block design is a type of experiment where participants who share certain characteristics are grouped together to form blocks, and then the treatment (or intervention) is randomly assigned within each block.
The objective of the randomized block design is to form groups where participants are
similar, and therefore can be compared with each other.
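A small sketch of this design: participants are grouped into blocks by a shared characteristic (here, hypothetical age bands), and treatments are randomized within each block:

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical blocks of participants sharing a characteristic.
blocks = {
    "age_20_30": ["p1", "p2", "p3", "p4"],
    "age_31_40": ["p5", "p6", "p7", "p8"],
}
treatments = ["treatment", "control"]

assignment = {}
for block, members in blocks.items():
    # Each block gets an equal split of treatments, in random order.
    labels = treatments * (len(members) // len(treatments))
    random.shuffle(labels)
    assignment.update(zip(members, labels))

print(assignment)
```

Because the split is balanced within each block, treated and control participants can be compared against similar peers.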

Latin square design


Latin square design is a general version of the dye-swapping design for samples
from more than two biological conditions. The Latin square design requires that
the number of experimental conditions equals the number of different labels.
The same number of experimental runs as the number of treatment conditions
is also used. The treatment conditions are labelled once using each label and
sampled once under each experimental run.

A Latin Square design is a specific type of experimental design commonly


used in statistics and research studies. It is particularly useful in situations
where there are multiple factors or variables that need to be controlled or
balanced.
In a Latin Square design, the experimental units (e.g., participants, test
subjects, treatments) are arranged in a square grid-like pattern, where each
row and column contains a unique combination of treatments. The Latin
Square design ensures that each treatment appears exactly once in each row
and column, providing a balanced distribution of treatments across the
factors or variables being studied.

The Latin Square design is commonly used in situations where there are
constraints or limitations on the experimental conditions. For example:

1. Time-based Experiments: When conducting experiments over a


period of time, the Latin Square design can be used to ensure that
each treatment is equally distributed across different time intervals
or days.
2. Control of Confounding Variables: The Latin Square design
allows researchers to control for the influence of certain variables by
ensuring that each treatment appears once in each row and column.
This helps reduce the potential bias caused by uncontrolled factors.
3. Comparative Studies: In studies comparing different treatments or
interventions, the Latin Square design can be employed to ensure
that each treatment has an equal chance of being tested with
different participants or under different conditions.
4. Resource Optimization: The Latin Square design can be useful
when resources are limited, such as in clinical trials or laboratory
experiments, as it allows for a balanced distribution of treatments
within the available resources.
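One simple way to construct such a square is by cyclic shifts of the treatment labels, which guarantees each treatment appears exactly once in every row and column. A minimal sketch:

```python
def latin_square(treatments):
    """Build an n x n Latin square by cyclically shifting the labels."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

for row in latin_square(["A", "B", "C", "D"]):
    print(" ".join(row))
# A B C D
# B C D A
# C D A B
# D A B C
```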
Parametric Test
A parametric test in statistics is a sub-type of the hypothesis test. Parametric hypothesis testing is the most common type of testing done to understand the characteristics of the population from a sample.
While there are many types of parametric tests, and they have certain differences, a few properties are shared across all of them that make them part of the 'parametric' family. These properties include:
1. When using such tests, there needs to be a deep or proper understanding of the
population.
2. An extension of the above point is that to use such tests, several assumptions
regarding the population must be fulfilled (hence a proper understanding of the
population is required). A common assumption is that the population should be
normally distributed (at least approximately).
3. The outputs from such tests cannot be relied upon if the assumptions regarding
the population deviate significantly.
4. A large sample size is required to run such tests. Theoretically, the sample size
should be more than 30 so that the central limit theorem can come into effect,
making the sample normally distributed.
5. Such tests are more powerful, especially compared to their non-parametric
counterparts for the same sample size.
6. These tests are only helpful with continuous/quantitative variables.
7. Measurement of the central tendency (i.e., the central value of data) is typically
done using the mean.
8. The output from such tests is easy to interpret; however, it can be challenging to
understand their workings.

Non-Parametric Test


A non-parametric test places no requirement on the distribution of the population. The non-parametric test is a type of hypothesis test that does not depend on underlying distributional assumptions; instead, the test depends on the value of the median. This method of testing is also known as distribution-free testing. Test values are found based on the ordinal or the nominal level. A test performed when the independent variables are non-metric is known as a non-parametric test.
Differences Between The Parametric Test and The Non-Parametric Test

Properties                  Parametric Test               Non-Parametric Test

Assumptions                 Assumptions are made          No assumptions are made

Value for central tendency  Mean                          Median

Correlation                 Pearson correlation           Spearman correlation

Probabilistic distribution  Normal distribution           Arbitrary distribution

Population knowledge        Required                      Not required

Used for                    Interval data                 Nominal data

Application                 Variables                     Variables and attributes

Examples                    t-test, z-test                Mann-Whitney, Kruskal-Wallis

Other Differences
Advantages and Disadvantages of Parametric and Nonparametric Tests
Many people assume that the choice between parametric and nonparametric tests depends on whether the data are normally distributed. The distribution can act as a deciding factor when the data set is relatively small. In many cases, though, this is not a critical issue, for the following reasons:

 Parametric tests can handle non-normal distributions for many datasets.
 Nonparametric tests have firm assumptions of their own that can be harder to meet.

The appropriate response is usually dependent upon whether the mean or median is
chosen to be a better measure of central tendency for the distribution of the data.
 A parametric test is considered when you have the mean value as your central
value and the size of your data set is comparatively large. This test helps in
making powerful and effective decisions.
 A non-parametric test is considered regardless of the size of the data set if the
median value is better when compared to the mean value.

Ultimately, if your sample size is small, you may be compelled to use a nonparametric test. As the table shows, the sample size requirements aren't excessively large. If you have a small sample and need to use a less powerful nonparametric analysis, it doubly lowers your chances of detecting an effect.

Each non-parametric test acts as a counterpart of a parametric test. In the table given below, you will see the linked pairs of statistical hypothesis tests.

Brief Explanation of Parametric and Non Parametric Test

 Z-Test: When you need to compare the sample's mean with a hypothesized value (which often refers to the population mean), a one-sample z-test is used. The test has major requirements: the sample size should be more than 30, and the population's standard deviation should be known.

 One-Sample t-Test: If either of the requirements mentioned above cannot be met, you can use another type of parametric test known as the one-sample t-test. If the sample size is at least 15 and the standard deviation of the sample is known, you can use this test. The sample distribution should be approximately normal.

 Paired (dependent) t-Test: A paired t-test is used when data are collected from the same subjects, typically before and after an event; for example, the weight of a group of 10 sportsmen before and after a diet program. To compare the means of the before and after groups, you can use the paired t-test. The assumptions include that the pairs are independent of one another, that the before and after values belong to the same subjects, and that the differences between the groups are normally distributed.

 Two-Sample (Independent) t-Test: In situations where there are two separate samples, for example, house prices in Mumbai vs. house prices in Delhi, and you have to check whether the means of these samples are statistically significantly different, a two-sample t-test can be used. It assumes that each sample's data distribution is roughly normal, the values are continuous, the variance is equal in both samples, and the samples are independent of each other.

 One-Way Analysis of Variance (ANOVA): An extension of the two-sample t-test is one-way ANOVA, where we compare more than two groups. If someone asks whether ANOVA is a parametric test, the answer is a definitive yes. ANOVA analyses the variance of the groups and requires the population distribution to be normal, the variance to be homogeneous, and the groups to be independent.
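A one-way ANOVA can be sketched with SciPy's `f_oneway`; the three independent groups below use hypothetical data in which one group's mean clearly differs:

```python
from scipy import stats

# Hypothetical measurements for three independent groups.
group1 = [23, 25, 21, 24, 26]
group2 = [30, 31, 29, 32, 28]  # noticeably higher mean
group3 = [22, 24, 23, 25, 21]

# f_oneway returns the F statistic and the p-value for the null
# hypothesis that all group means are equal.
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A small p-value here leads to rejecting the null hypothesis that the group means are equal.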

 Pearson's Coefficient of Correlation: To understand the association between two continuous numeric variables, you can use Pearson's coefficient of correlation. It produces an 'r' value, where a value closer to -1 or +1 indicates a strong negative or positive correlation, respectively. A value close to 0 indicates no major correlation between the variables. Part of its assumptions is that both variables in question should be continuous.

Common types of non-parametric tests include:

 Wilcoxon signed-rank test: used as an alternative to the one-sample t-test.

 Mann-Whitney U-test / Wilcoxon rank-sum test: can be used as an alternative to the two-sample t-test.

 Kruskal-Wallis test: an alternative to the parametric one-way ANOVA.

 Spearman's rank correlation: an alternative to Pearson's correlation coefficient; it is important when the data are not continuous but in the form of ranks (ordinal data).

 Signed-rank test: an alternative to the parametric paired t-test.
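As a sketch of one of these alternatives, the Mann-Whitney U-test in SciPy compares two independent samples without assuming normality; the samples below are hypothetical:

```python
from scipy import stats

# Hypothetical scores for two independent groups.
sample_a = [3, 4, 2, 5, 4, 3]
sample_b = [6, 7, 5, 8, 7, 6]

# mannwhitneyu tests whether the two samples come from the same
# distribution, using ranks rather than raw values.
u_stat, p_value = stats.mannwhitneyu(sample_a, sample_b,
                                     alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```

Because the test works on ranks, it remains valid for ordinal data and skewed distributions where a t-test's assumptions fail.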
