Explanation of BMI Data Using Linear Regression Model in R
https://doi.org/10.22214/ijraset.2022.40640
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue III Mar 2022- Available at www.ijraset.com
Abstract: This paper describes regression analysis between pairs of variables (Weight & BMI, Weight & Height, and
Height & BMI) using the Linear Regression Model and data visualization techniques in R, based on sample data from 68
BCA students. The collected data were classified into underweight, overweight and obese categories using conditional
statements. Each model reports the Residual Standard Error, Multiple R², Adjusted R², F-statistic and p-value. In the
final steps the data are visualized using ggplot() and its geom layers.
Keywords: BMI, Multiple R², Adjusted R², F-statistic, p-value, R, ggplot, geom.
II. BMI
BMI stands for Body Mass Index. It indicates a person's weight category, as given in Table 2.1. The mathematical
formula for the calculation of BMI is
BMI = Weight / (Height)^2
(Weight is in kg and Height is in m)
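As a small worked example of the formula, the sketch below wraps it in an R function. The function name bmi and the sample values are illustrative assumptions, not part of the collected data.

```r
# Hypothetical helper (not from the paper): BMI from weight in kg and height in m.
bmi <- function(weight_kg, height_m) {
  weight_kg / height_m^2
}

# Example: a 60 kg person who is 1.65 m tall.
bmi(60, 1.65)   # 60 / 2.7225, about 22.04
```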
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 |
read_csv()
read_tsv()
read_delim()
read_fwf()
read_table()
read_log()
The dimensions of the data are 68 and 4, that is, data of 68 persons with 4 parameters (Age, Gender, Height and Weight).
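A minimal sketch of this reading step is given here, assuming the data are stored in a CSV file with the four columns named above. The file contents, the path and the variable name d1 are assumptions made for illustration; three made-up rows stand in for the 68 records.

```r
library(readr)

# Hypothetical stand-in for the survey file: three made-up rows written to a
# temporary CSV so that the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Age,Gender,Height,Weight",
  "19,F,158,52",
  "20,M,172,68",
  "21,M,165,80"
), csv_path)

# read_csv() returns a tibble; dim() reports the number of rows and columns.
d1 <- read_csv(csv_path, show_col_types = FALSE)
dim(d1)   # 3 rows and 4 columns here; the paper's data give 68 and 4
```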
The next step is to calculate BMI and add a BMI column to the above data.
Using mutate, we can add a column to the existing data. Let the result be stored in the new variable “d2”.
BMI=Weight/(Height/100)^2
Height is divided by 100 because the formula requires meters, whereas the collected data record it in centimeters.
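The mutate step can be sketched as follows. The rows are made up for illustration, and d1 is an assumed name for the data before the BMI column is added; only d2 and the BMI expression come from the text.

```r
library(dplyr)

# Hypothetical rows; Height is in cm and Weight in kg, as in the collected data.
d1 <- data.frame(
  Age    = c(19, 20, 21),
  Gender = c("F", "M", "M"),
  Height = c(158, 172, 165),
  Weight = c(52, 68, 80)
)

# Add the BMI column; Height / 100 converts centimeters to meters.
d2 <- d1 %>% mutate(BMI = Weight / (Height / 100)^2)
d2
```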
Now, using Table 2.1, conditional statements can be applied to produce the result column.
The conditions are applied to the BMI column of the data “d2”, which is first saved in the variable T: T <- d2$BMI
The ifelse condition can then be applied to implement the conditions given in Table 2.1.
Let d3 be the new variable that holds the updated data; mutate can be used to add the new column to the existing data.
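A possible form of this step is sketched below. The cut-offs 18.5, 25 and 30 are the commonly used WHO adult thresholds and are assumed here to match Table 2.1; the column name Result and the BMI values are also assumptions. The sketch applies ifelse directly to the BMI column instead of the intermediate variable T, since T is also R's built-in shorthand for TRUE.

```r
library(dplyr)

# Made-up BMI values, one per category.
d2 <- data.frame(
  Gender = c("F", "M", "M", "F"),
  BMI    = c(17.9, 22.4, 27.3, 31.0)
)

# Nested ifelse: each test is reached only if the previous ones were FALSE.
# Thresholds are assumed (WHO adult cut-offs), not taken from Table 2.1.
d3 <- d2 %>%
  mutate(Result = ifelse(BMI < 18.5, "Underweight",
                  ifelse(BMI < 25,   "Normal",
                  ifelse(BMI < 30,   "Overweight", "Obese"))))

# Count of persons in each category.
table(d3$Result)
```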
In the above Linear Regression Model, Height is the explanatory variable (or the independent variable) and Weight is the response
variable (or the dependent variable).
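Such a model can be fitted with lm(), as in the minimal sketch below. The data values and the name model1 are made up for illustration, so the coefficients will differ from those reported for the 68 students.

```r
# Made-up heights (cm) and weights (kg); not the paper's data.
d2 <- data.frame(
  Height = c(150, 155, 160, 165, 170, 175, 180),
  Weight = c(45, 52, 54, 61, 63, 72, 74)
)

# The response goes on the left of ~ and the explanatory variable on the right.
model1 <- lm(Weight ~ Height, data = d2)

summary(model1)   # residuals, coefficients, RSE, R-squared, F-statistic
coef(model1)      # intercept and slope of the fitted line
```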
In this model, Height is the explanatory variable (or the independent variable) and BMI (Body Mass Index) is the response variable
(or the dependent variable).
The regression line shows how much, and in what direction, the dependent variable changes with respect to the independent
variable.
The line closely approximates all the points.
The purpose of the regression line is to make predictions.
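The sketch below draws a scatter plot with a fitted regression line and then uses the model for a prediction. The data, the name model2 and the height of 168 cm are illustrative assumptions; geom_point() and geom_smooth(method = "lm") are one common way to produce such a plot and may differ from the layers used in the original figures.

```r
library(ggplot2)

# Made-up heights (cm) and weights (kg); BMI is derived from them.
d2 <- data.frame(
  Height = c(150, 155, 160, 165, 170, 175, 180),
  Weight = c(45, 52, 54, 61, 63, 72, 74)
)
d2$BMI <- d2$Weight / (d2$Height / 100)^2

# Scatter plot of the points with the least-squares line drawn through them.
p <- ggplot(d2, aes(x = Height, y = BMI)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
print(p)

# The sign of the slope gives the direction of change; predict() applies the
# fitted line to a new height.
model2 <- lm(BMI ~ Height, data = d2)
predicted_bmi <- predict(model2, newdata = data.frame(Height = 168))
predicted_bmi
```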
Weight is the explanatory variable (or the independent variable) and BMI (Body Mass Index) is the response variable (or the
dependent variable).
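To compare the three models side by side, the quantities listed in the abstract can be collected from summary() into one table. This is only a sketch on made-up data; the name fit_stats and its column names are assumptions.

```r
# Made-up heights (cm) and weights (kg); BMI is derived from them.
d2 <- data.frame(
  Height = c(150, 155, 160, 165, 170, 175, 180),
  Weight = c(45, 52, 54, 61, 63, 72, 74)
)
d2$BMI <- d2$Weight / (d2$Height / 100)^2

# The three models discussed in the paper.
formulas <- list(Weight ~ Height, BMI ~ Height, BMI ~ Weight)

fit_stats <- do.call(rbind, lapply(formulas, function(f) {
  s  <- summary(lm(f, data = d2))
  fs <- s$fstatistic   # named vector: value, numdf, dendf
  data.frame(
    model   = deparse(f),
    rse     = s$sigma,           # Residual Standard Error
    r2      = s$r.squared,       # Multiple R-squared
    adj_r2  = s$adj.r.squared,   # Adjusted R-squared
    f_stat  = fs[["value"]],
    p_value = pf(fs[["value"]], fs[["numdf"]], fs[["dendf"]],
                 lower.tail = FALSE)
  )
}))
fit_stats
```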
V. RESULTS
The results of the above regression models are [1]:
1) Weight = -96.2453 + Height*0.9226
2) BMI = 8.32208 + Height*0.07426
3) BMI = 6.19345 + Weight*0.24734
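As a worked example, the three fitted equations can be evaluated for a given person. The sketch assumes Height is in centimeters and Weight in kilograms, as in the collected data; the function names and the inputs of 170 cm and 60 kg are illustrative only.

```r
# Coefficients as reported in the results above.
weight_from_height <- function(height_cm) -96.2453 + 0.9226 * height_cm
bmi_from_height    <- function(height_cm) 8.32208 + 0.07426 * height_cm
bmi_from_weight    <- function(weight_kg) 6.19345 + 0.24734 * weight_kg

weight_from_height(170)   # -96.2453 + 156.842 = 60.5967
bmi_from_height(170)      # 8.32208 + 12.6242  = 20.94628
bmi_from_weight(60)       # 6.19345 + 14.8404  = 21.03385
```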
A. Explanation of Summary
1) Call shows the function and parameters that were used to create the model [2]
2) Residuals are the differences between the observed values of the dependent variable (y) and the fitted values (ŷ), where
ŷ = a + bx, a is the y-intercept, b is the slope of the line and x is the independent variable [1]
3) Coefficients has four parts [2]
o Estimate : the intercept and slope of the regression line
o Std Error : for the slope, RSE divided by the square root of the sum of squared deviations of x from its mean
o t value : Estimate/Std Error
o Pr(>|t|) : probability of observing a t value at least this large in absolute value if the true coefficient were zero
4) The Residual Standard Error, Multiple R-Squared, Adjusted R-Squared & F-Statistic are calculated for each model.
5) Plot 4 shows that
o Fewer females than males have normal weight
o No female falls in the obese category
o There are more overweight males than overweight females
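The quantities described in items 2) to 4) can be reproduced by hand for a simple regression and checked against summary(). The sketch below does this for a Weight ~ Height model on made-up data; all variable names are illustrative.

```r
# Made-up heights (cm) and weights (kg); not the paper's data.
d2 <- data.frame(
  Height = c(150, 155, 160, 165, 170, 175, 180),
  Weight = c(45, 52, 54, 61, 63, 72, 74)
)
x <- d2$Height
y <- d2$Weight
n <- length(y)

fit <- lm(Weight ~ Height, data = d2)
res <- residuals(fit)              # observed y minus fitted y-hat

rss    <- sum(res^2)               # residual sum of squares
tss    <- sum((y - mean(y))^2)     # total sum of squares
df_res <- n - 2                    # two estimated coefficients
rse    <- sqrt(rss / df_res)       # Residual Standard Error

# Coefficient table entries for the slope.
slope    <- coef(fit)[["Height"]]
se_slope <- rse / sqrt(sum((x - mean(x))^2))
t_slope  <- slope / se_slope
p_slope  <- 2 * pt(abs(t_slope), df_res, lower.tail = FALSE)

# Goodness-of-fit statistics.
r2     <- 1 - rss / tss
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res
f_stat <- (tss - rss) / (rss / df_res)   # one numerator degree of freedom

# Each comparison should return TRUE.
s <- summary(fit)
all.equal(rse, s$sigma)
all.equal(se_slope, s$coefficients["Height", "Std. Error"])
all.equal(t_slope, s$coefficients["Height", "t value"])
all.equal(p_slope, s$coefficients["Height", "Pr(>|t|)"])
all.equal(r2, s$r.squared)
all.equal(adj_r2, s$adj.r.squared)
all.equal(f_stat, s$fstatistic[["value"]])
```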
REFERENCES
[1] https://www.learnbymarketing.com/tutorials/explaining-the-lm-summary-in-r/
[2] https://www.learnbymarketing.com/tutorials/explaining-the-lm-summary-in-r/
[3] Chan YH. Biostatistics 201: Linear regression analysis. Singapore Med J 2004;45:55-61.
[4] Gaddis ML, Gaddis GM. Introduction to biostatistics: Part 6, correlation and regression. Ann Emerg Med 1990;19:1462-8.
[5] Pedhazur EJ. Multiple Regression in Behavioral Research: Explanation and Prediction. 2nd ed. New York: Holt, Rinehart and Winston; 1982.
[6] Schneider A, Hommel G, Blettner M. Linear regression analysis: Part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2010;107:776-82.
[7] https://www.ncbi.nlm.nih.gov/books/NBK535456/figure/article-18425.image.f1/
[8] https://www.youtube.com/watch?v=XAnilMY-ILs&list=PLpApktzwiFX9UZk5ZijcDuTa9q9MLgWZD
[9] https://cran.r-project.org/web/packages/readr/readme/README.html