The Stack

The stack() function is used to stack the prescribed level(s) from columns
to index.
In a model formula, the column before the ~ sign is the one you want to
predict, and the columns after it are the ones used to predict it.
The predict command generates predictions from the formula built in model1.
• The p-value is a statistical measurement used to test a hypothesis
against observed data.
• The p-value measures the probability of obtaining outcomes at least as
extreme as those observed, assuming that the null hypothesis is true.
• The lower the p-value, the greater the statistical significance of the
observed difference.
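
To make the definition concrete, here is a minimal sketch of an exact permutation test, which is one way (not the only one) to get a p-value. The function name and the two small groups are made up for illustration. The null hypothesis is that the group labels do not matter, so we count how many relabellings give a difference in means at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Try every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the observed one
        # (small tolerance guards against float rounding).
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up data: two clearly separated groups.
p = permutation_p_value([1, 2, 3, 4], [5, 6, 7, 8])
print(p)
```

Only 2 of the 70 possible splits are as extreme as the observed one, so the p-value is small and the difference would usually be called significant.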
Logistic regression and ANOVA.

Logistic regression: to investigate the relationship between the component
factors and age in days and the strength of concrete
ANOVA: to compare the efficiency of the linear regression models built
The lowess line is the local regression fitted through those dots.
It shows us the overall pattern of the dots.
WHY DO WE USE ONE-WAY ANOVA?

Answer:
+ The one-way analysis of variance (ANOVA) is used to determine whether there are
any statistically significant differences between the means of two or more
independent (unrelated) groups (although you tend to only see it used when there is
a minimum of three, rather than two groups).
+ For example, you could use a one-way ANOVA to understand whether exam
performance differed based on test anxiety levels amongst students, dividing
students into three independent groups (e.g., low, medium and high-stressed
students).
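
A minimal sketch of the one-way ANOVA calculation for an example like this one, with made-up exam scores and a hypothetical function name. It computes only the F statistic (between-group variance divided by within-group variance). Turning F into a p-value needs the F distribution, which is not in the Python standard library, so that step is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up exam scores for low-, medium- and high-stressed students.
low = [80, 85, 90]
medium = [70, 75, 80]
high = [60, 65, 70]

f_stat, df_between, df_within = one_way_anova_f([low, medium, high])
print(f_stat, df_between, df_within)
```

A large F means the group means differ by more than the spread inside the groups would explain.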
WHY DO WE USE TWO-WAY (OR HIGHER) ANOVA?

Answer:
+ A two-way ANOVA is used to estimate how the mean of a quantitative
variable changes according to the levels of two categorical variables.
+ Use a two-way ANOVA when you want to know how two independent variables, in
combination, affect a dependent variable.
WHAT IS THE DIFFERENCE BETWEEN ONE-WAY AND TWO-WAY ANOVA?

Answer:
The key differences between one-way and two-way ANOVA are summarized
clearly below.
+ A one-way ANOVA is primarily designed to test the equality of three or more
means. A two-way ANOVA is designed to assess the interrelationship of two
independent variables on a dependent variable.
+ A one-way ANOVA only involves one factor or independent variable, whereas
there are two independent variables in a two-way ANOVA.
+ In a one-way ANOVA, the one factor or independent variable analyzed has
three or more categorical groups. A two-way ANOVA instead compares
multiple groups of two factors.
+ One-way ANOVA needs to satisfy only two principles of design of experiments,
i.e. replication and randomization. Two-way ANOVA, by contrast, meets all
three principles of design of experiments: replication, randomization, and
local control.
WHY DO WE USE CHI-SQUARED?

Answer:
The Chi-Square statistic is commonly used for testing relationships between
categorical variables. The null hypothesis of the Chi-Square test is that no
relationship exists between the categorical variables in the population; they are
independent. An example research question that could be answered using a
Chi-Square analysis would be:
Is there a significant relationship between voter intent and political party
membership?
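
A minimal sketch of how the Chi-Square statistic for a question like this is computed. The function name and the 2x2 table of counts are made up for illustration. The expected counts are what we would see if the two variables were independent.

```python
def chi_squared_statistic(table):
    """Return (chi2, degrees_of_freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Made-up counts: rows are two parties, columns are "will vote" / "will not vote".
counts = [
    [10, 20],
    [20, 10],
]
chi2, dof = chi_squared_statistic(counts)
print(chi2, dof)
```

The statistic is then compared with the Chi-Square distribution for that number of degrees of freedom. With 1 degree of freedom the 5% critical value is about 3.84, so a larger statistic leads to rejecting independence.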
The difference between ANOVA and Chi-squared?
Answer:
The chi-square test is a nonparametric test for categorical data; with it you
compare each categorical characteristic separately. ANOVA works differently. In
factorial ANOVA, you investigate how a quantitative characteristic (the dependent
variable) depends on one or more qualitative characteristics (categorical
predictors). If the number of groups is large, a significant result does not show
which group caused the deviation of the mean. In that case a post-hoc comparison
of the means is needed, for example the Tukey HSD test. The Tukey test has the
same applicability conditions as the analysis of variance, i.e., normality of the
data distribution and homogeneity of the group variances. The homogeneity of the
group variances is checked with Levene's test.
Can we use chi-squared to replace ANOVA?
Stuck on this one.
HOW TO READ A BOX-PLOT?
Answer:
A boxplot is a standardized way of displaying the distribution of data based on a
five-number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and
“maximum”). It can tell you about your outliers and what their values are. It can also
tell you if your data is symmetrical, how tightly your data is grouped, and if and how
your data is skewed.
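
A minimal sketch of the numbers behind a boxplot, using made-up data and a hypothetical function name. Two assumptions that are not stated in the notes: quartiles use the "inclusive" method of Python's statistics.quantiles (other tools may give slightly different quartiles), and outliers are points more than 1.5 times the interquartile range beyond Q1 or Q3, which is a common convention.

```python
from statistics import quantiles


def five_number_summary(data):
    """Five-number summary plus outliers under the 1.5 * IQR convention."""
    values = sorted(data)
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # "minimum" and "maximum" are the most extreme points inside the fences.
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return {
        "minimum": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": inside[-1],
        "outliers": outliers,
    }


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```

Here the value 30 is reported as an outlier, so the upper whisker stops at 8 instead of stretching to 30.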
The mean, median, sd, maximum, minimum?

Answer:
+ The sample mean is the average and is computed as the sum of all the observed
outcomes from the sample divided by the total number of events. We use x̄ (x-bar)
as the symbol for the sample mean. In math terms,
x̄ = (x_1 + x_2 + ... + x_n) / n, where n is the number of events.
+ An alternative measure is the median. The median is the middle score. If we have
an even number of events we take the average of the two middle scores. The median
is better for describing the typical value when the data is skewed. It is often used
for income and home prices.
+ The mean, mode, median, and trimmed mean do a nice job in telling where the
center of the data set is, but often we are interested in more. For example, a
pharmaceutical engineer develops a new drug that regulates iron in the blood.
Suppose she finds out that the average iron content after taking the medication is
the optimal level. This does not mean that the drug is effective. There is a possibility
that half of the patients have dangerously low iron content while the other half
have dangerously high content. Instead of the drug being an effective regulator, it is
a deadly poison. What the engineer needs is a measure of how far the data is
spread apart. This is what the variance and standard deviation do. First we show
the formulas for these measurements.
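
As a sketch of those formulas, assuming the usual sample definitions: the sample variance is the sum of squared deviations from the mean divided by n - 1, and the standard deviation is its square root. The two data sets below are made up to mirror the drug example: same mean, very different spread.

```python
from math import sqrt
from statistics import mean, median, stdev


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


# Made-up measurements: every patient at the optimum vs. half low, half high.
steady = [50, 50, 50, 50]
split = [10, 90, 10, 90]

steady_sd = sqrt(sample_variance(steady))
split_sd = sqrt(sample_variance(split))

print(mean(steady), median(steady), min(steady), max(steady), steady_sd)
print(mean(split), median(split), min(split), max(split), split_sd)
```

Both samples have mean 50, but only the standard deviation shows that the second one is spread dangerously far from the centre. The hand-written formula agrees with statistics.stdev, which uses the same n - 1 definition.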
WHY DO WE USE A LINEAR MODEL?
Answer:
A linear regression is a statistical model that analyzes the relationship between a
response variable (often called y) and one or more variables and their interactions
(often called x or explanatory variables). You make this kind of relationship in your
head all the time. For example, when you estimate the age of a child based on her
height, you are assuming the older she is, the taller she will be.
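
A minimal sketch of a one-predictor linear model fitted by ordinary least squares, the plain-Python equivalent of fitting age ~ height and then predicting. The function names, the variable model1 and the data are made up for illustration, and the data lies exactly on a line so the numbers are easy to check by hand.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x_new):
    """Predict y for a new x from a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x_new


# Made-up data: heights in cm (x) and ages in years (y).
heights_cm = [80, 90, 100, 110]
ages_years = [1, 2, 3, 4]

model1 = fit_line(heights_cm, ages_years)
predicted_age = predict(model1, 120)
print(model1, predicted_age)
```

With this data the fitted slope is 0.1 years per cm, so a child 120 cm tall is predicted to be about 5 years old. Real data would scatter around the line rather than sit on it.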
