iNurture SAGE University, Indore
BCA V Sem
Advanced R Programming Lab
Course Code: CAPDSARP005P
L T P N: 0 0 4 0
Total Credits: 2
Data Science
Syllabus
Part
1. INTRODUCTION TO COMPUTING
a) Installation of R
b) The basics of R syntax, workspace
c) Matrices and lists
d) Subsetting
e) System-defined functions; the help system
f) Errors and warnings; coherence of the workspace
2. GETTING USED TO R: DESCRIBING DATA
a) Viewing and Manipulating Data
b) Plotting Data
c) Reading data from the console, from a .csv file on local disk, and from the Web
d) Working with larger datasets
3. VISUALIZING DATA
a) Tables, charts and plots.
b) Visualizing Measures of Central Tendency, Variation, and Shape.
c) Box plots, Pareto diagrams.
d) Find the mean, median, standard deviation and quantiles of a set of observations.
e) Note: Experiment with real as well as artificial data sets.
4. PROBABILITY DISTRIBUTIONS
a) Random number generation, distributions, and the practice of simulation.
b) Generate and visualize discrete and continuous distributions using the statistical environment.
c) Demonstration of the CDF and PDF of the uniform, normal, binomial and Poisson distributions.
d) Generate artificial data and use it to explore various distributions and their properties.
5. EXPLORATORY DATA ANALYSIS
Demonstrate Range, summary, mean, variance, median, sd, histogram, box plot, scatterplot
6. DENSITIES OF RANDOM VARIABLES
a) Distributions in R
b) Matching a Density to Data
c) Making Histograms
7. BINOMIAL DISTRIBUTION
a) Study of binomial distribution.
b) Plots of density and distribution functions.
c) Normal approximation to the Binomial distribution.
8. CORRELATION
a) How to calculate the correlation between two variables.
b) How to make scatter plots.
c) Use the scatter plot to investigate the relationship between two variables
9. TESTS OF HYPOTHESES
a) Perform tests of hypotheses about the mean when the variance is known.
b) Compute the p-value.
c) Explore the connection between the critical region, the test statistic, and the p-value
10. ESTIMATING A LINEAR RELATIONSHIP
Demonstration on a Statistical Model for a Linear Relationship
a) Least Squares Estimates
b) The R Function lm
c) Scrutinizing the Residuals
11. APPLY-TYPE FUNCTIONS
a) Defining user-defined classes and operations; models and methods in R
b) Customizing the user's environment
c) Conditional statements
d) Loops and iterations
12. STATISTICAL FUNCTIONS IN R
a) Demonstrate Statistical functions in R
b) Statistical inference, contingency tables, chi-square goodness of fit, regression, generalized linear
models, advanced modeling methods
List of Experiments
Introduction to Computing (Program 1)
Experiment 1: Create a matrix and perform various operations on it (addition,
subtraction, multiplication, transposition).
Experiment 2: Create a list and access its elements using different indexing methods, and
demonstrate how to extract specific elements from matrices and lists using indexing.
Experiment 3: Write a function to calculate the factorial of a number and test it with
different input values.
Experiment 4: Create a script to automate a repetitive task (e.g., generating random
numbers, performing calculations).
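A minimal sketch for Experiments 1 to 3, assuming small made-up inputs. The matrices A and B, the list student and the function name my_factorial are illustrative choices, not prescribed by the syllabus.

```r
# Experiment 1: basic operations on two 2x2 matrices (values are hypothetical)
A <- matrix(1:4, nrow = 2)            # filled column-wise: rows are (1, 3) and (2, 4)
B <- matrix(c(2, 0, 1, 3), nrow = 2)  # rows are (2, 1) and (0, 3)

mat_sum    <- A + B     # element-wise addition
mat_diff   <- A - B     # element-wise subtraction
mat_prod   <- A %*% B   # matrix multiplication (A * B would be element-wise)
mat_transp <- t(A)      # transposition
print(mat_prod)

# Experiment 2: indexing lists and matrices
student <- list(name = "Asha", marks = c(78, 85, 91))  # hypothetical record
print(student$name)            # access a component by name
print(student[["marks"]][2])   # second element of the marks vector
print(student[1])              # single brackets return a sub-list
print(A[2, 1])                 # matrix element in row 2, column 1

# Experiment 3: factorial of a non-negative integer
my_factorial <- function(n) {
  if (n < 0 || n != floor(n)) stop("n must be a non-negative integer")
  if (n == 0) return(1)
  prod(seq_len(n))             # 1 * 2 * ... * n
}
print(sapply(c(0, 1, 5), my_factorial))
```

Base R already provides factorial(); writing the function by hand is only for practice with function definitions and input checks.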
Getting Used to R: Describing Data (Program 2)
Experiment 5: Import a dataset from a CSV file and explore its structure using str().
Experiment 6: Create a scatter plot to visualize the relationship between two variables
and add a regression line.
Experiment 7: Read a large dataset from a text file and perform basic data cleaning (e.g.,
removing missing values, handling outliers).
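A possible outline for Experiments 5 to 7. To keep the sketch self-contained, it first writes a small hypothetical data frame (columns hours and score) to a temporary CSV file; in the lab, replace csv_path with the path of the dataset you were given.

```r
# Create a small hypothetical CSV file so the example can run anywhere
csv_path <- tempfile(fileext = ".csv")
demo <- data.frame(
  hours = c(1, 2, 3, 4, 5, NA, 7),
  score = c(52, 55, 61, 64, 70, 72, 80)
)
write.csv(demo, csv_path, row.names = FALSE)

# Experiment 5: import the file and inspect its structure
dat <- read.csv(csv_path)
str(dat)

# Experiment 7 (basic cleaning): drop rows that contain missing values
clean <- na.omit(dat)

# Experiment 6: scatter plot with a fitted least-squares line
plot(clean$hours, clean$score,
     xlab = "Hours", ylab = "Score", main = "Score vs hours")
abline(lm(score ~ hours, data = clean), col = "red")
```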
Visualizing Data (Program 3)
Experiment 8: Create a histogram and a box plot for a numerical variable and interpret
the results.
Experiment 9: Calculate the mean, median, standard deviation, and quantiles of a dataset
and compare them to the visualization.
Experiment 10: Create a Pareto chart to analyze the frequency of different categories in a
dataset.
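One way to approach Experiments 8 to 10 in base R. The vector x and the category counts in defects are made-up values, and the Pareto chart is built by hand from barplot() rather than from an add-on package.

```r
# Experiments 8 and 9: plots and summary measures for a small hypothetical sample
x <- c(12, 15, 11, 19, 15, 22, 15, 18)
hist(x, main = "Histogram of x")
boxplot(x, main = "Box plot of x")
print(c(mean = mean(x), median = median(x), sd = sd(x)))
print(quantile(x))                       # minimum, quartiles and maximum

# Experiment 10: Pareto chart (bars sorted by frequency plus cumulative line)
defects <- c(Scratch = 12, Dent = 30, Crack = 5, Stain = 8)  # hypothetical counts
sorted  <- sort(defects, decreasing = TRUE)
cum_pct <- cumsum(sorted) / sum(sorted) * 100                # cumulative percentage
bp <- barplot(sorted, ylim = c(0, sum(sorted)),
              ylab = "Frequency", main = "Pareto chart")
lines(bp, cumsum(sorted), type = "b", pch = 19)              # cumulative counts
print(round(cum_pct, 1))
```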
Probability Distributions (Program 4)
Experiment 11: Generate random numbers from a normal distribution and visualize the
distribution using a histogram.
Experiment 12: Calculate the probability of a value falling within a certain range for a
binomial distribution.
Experiment 13: Simulate the rolling of a die 1000 times and calculate the frequency of
each outcome.
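A short sketch for Experiments 11 to 13. The seed, the sample size of 1000 and the binomial parameters (10 trials, success probability 0.5, range 3 to 6) are arbitrary choices for illustration.

```r
set.seed(42)   # fixed seed so the simulation is reproducible

# Experiment 11: random numbers from a standard normal distribution
z <- rnorm(1000, mean = 0, sd = 1)
hist(z, breaks = 30, main = "1000 draws from N(0, 1)", xlab = "z")

# Experiment 12: P(3 <= X <= 6) for X ~ Binomial(10, 0.5)
p_range <- pbinom(6, size = 10, prob = 0.5) - pbinom(2, size = 10, prob = 0.5)
print(p_range)
print(sum(dbinom(3:6, size = 10, prob = 0.5)))   # same probability, summed directly

# Experiment 13: roll a fair die 1000 times and tabulate the outcomes
rolls <- sample(1:6, size = 1000, replace = TRUE)
freq  <- table(factor(rolls, levels = 1:6))
print(freq)
barplot(freq, main = "Frequency of each face", xlab = "Face")
```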
Exploratory Data Analysis (Program 5)
Experiment 14: Explore a dataset using summary statistics, histograms, and box plots to
identify patterns and outliers.
Experiment 15: Calculate the correlation between different variables in a dataset and
interpret the results.
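For Experiments 14 and 15, a sketch using mtcars, a dataset that ships with R. The choice of dataset and of the columns mpg, wt and hp is an assumption; any numeric dataset can be substituted.

```r
# Experiment 14: summary statistics and plots for one variable
hp <- mtcars$hp
print(summary(hp))
print(c(range = range(hp), variance = var(hp), sd = sd(hp)))
hist(hp, main = "Horsepower", xlab = "hp")
boxplot(hp, main = "Horsepower")

# Points beyond the box plot whiskers, flagged as potential outliers
outliers <- boxplot.stats(hp)$out
print(outliers)

# Experiment 15: pairwise correlations between several variables
cor_mat <- cor(mtcars[, c("mpg", "wt", "hp")])
print(round(cor_mat, 2))
```

A negative entry in cor_mat, such as the one between mpg and wt, indicates that one variable tends to fall as the other rises.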
Densities of Random Variables (Program 6)
Experiment 16: Fit a probability distribution (e.g., normal, exponential) to a dataset and
assess the goodness of fit.
Experiment 17: Create a kernel density estimate for a dataset and compare it to a
histogram.
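A possible sketch for Experiments 16 and 17 on simulated data. The sample obs is drawn from an exponential distribution with rate 2 purely for illustration. Because the rate is estimated from the same data, the Kolmogorov-Smirnov p-value should be read as approximate only.

```r
set.seed(1)
obs <- rexp(200, rate = 2)        # artificial data with a known true rate

# Experiment 16: fit an exponential distribution and check the fit
rate_hat <- 1 / mean(obs)         # maximum-likelihood estimate of the rate
ks <- ks.test(obs, "pexp", rate = rate_hat)
print(rate_hat)
print(ks$p.value)                 # a large value gives no evidence against the fit

# Experiment 17: histogram, kernel density estimate and fitted density together
hist(obs, freq = FALSE, breaks = 20,
     main = "Histogram, kernel density and fitted density", xlab = "obs")
lines(density(obs), lwd = 2)                          # kernel density estimate
curve(dexp(x, rate = rate_hat), add = TRUE, lty = 2)  # fitted exponential density
```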
Binomial Distribution (Program 7)
Experiment 18: Calculate the probability of a certain number of successes in a binomial
experiment.
Experiment 19: Explore the relationship between the sample size and the shape of the
binomial distribution.
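A sketch for Experiments 18 and 19, which also touches the normal approximation from Part 7(c). The parameters n = 20 and p = 0.3 and the sample sizes 5, 20 and 100 are assumed values.

```r
n <- 20
p <- 0.3

# Experiment 18: probability of exactly 5 successes in 20 trials
p_exact5 <- dbinom(5, size = n, prob = p)
print(p_exact5)

# Experiment 19: shape of the binomial distribution as the sample size grows
for (size in c(5, 20, 100)) {
  kk <- 0:size
  plot(kk, dbinom(kk, size = size, prob = p), type = "h",
       main = paste("Binomial, n =", size), xlab = "k", ylab = "P(X = k)")
}

# Distribution function for n = 20
k <- 0:n
plot(k, pbinom(k, size = n, prob = p), type = "s",
     main = "Binomial CDF", ylab = "P(X <= k)")

# Normal approximation to P(X <= 8), with continuity correction
exact_cdf   <- pbinom(8, size = n, prob = p)
approx_norm <- pnorm(8.5, mean = n * p, sd = sqrt(n * p * (1 - p)))
print(c(exact = exact_cdf, normal_approx = approx_norm))
```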
Correlation (Program 8)
Experiment 20: Calculate the correlation between two variables and test for significance.
Experiment 21: Create a scatter plot matrix to visualize the relationships between
multiple variables.
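Experiments 20 and 21 could be sketched as follows, again assuming the built-in mtcars dataset and an arbitrary selection of columns.

```r
# Experiment 20: Pearson correlation with a significance test
ct <- cor.test(mtcars$wt, mtcars$mpg)
print(ct$estimate)    # sample correlation coefficient
print(ct$p.value)     # p-value for the null hypothesis of zero correlation

# Experiment 21: scatter plot matrix for several variables
pairs(mtcars[, c("mpg", "wt", "hp", "disp")],
      main = "Scatter plot matrix")
```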
Tests of Hypotheses (Program 9)
Experiment 22: Perform a t-test to compare the means of two groups.
Experiment 23: Calculate the p-value for a hypothesis test and interpret the result.
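A minimal sketch for Experiments 22 and 23, together with the known-variance test from Part 9(a). All data values, the null mean mu0 = 5 and the known standard deviation sigma = 0.4 are hypothetical.

```r
# z-test for the mean with known standard deviation
x     <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.7)
mu0   <- 5
sigma <- 0.4
z       <- (mean(x) - mu0) / (sigma / sqrt(length(x)))  # test statistic
p_value <- 2 * pnorm(-abs(z))                           # two-sided p-value
z_crit  <- qnorm(0.975)                                 # critical value at the 5% level
reject  <- abs(z) > z_crit   # critical-region decision; agrees with p_value < 0.05
print(c(z = z, p_value = p_value, z_crit = z_crit))
print(reject)

# Experiment 22: two-sample t-test comparing the means of two groups
g1 <- c(23, 25, 28, 30, 26, 27)
g2 <- c(31, 29, 35, 33, 30, 34)
tt <- t.test(g1, g2)          # Welch test: equal variances are not assumed
print(tt$p.value)
```

The statistic falls in the critical region exactly when the p-value is below the chosen significance level, which is the connection Part 9(c) asks you to explore.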
Estimating a Linear Relationship (Program 10)
Experiment 24: Fit a linear regression model to a dataset and interpret the coefficients.
Experiment 25: Assess the model's assumptions using residual plots and diagnostic tests.
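For Experiments 24 and 25, a sketch that regresses mpg on wt in the built-in mtcars dataset. The choice of dataset and variables is an assumption, and the Shapiro-Wilk test is only one of several possible diagnostic tests.

```r
# Experiment 24: fit a simple linear regression and read off the coefficients
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))      # intercept and slope (change in mpg per unit of wt)
print(summary(fit))   # standard errors, t-values, R-squared

# Experiment 25: residual diagnostics
res <- residuals(fit)
par(mfrow = c(2, 2))
plot(fit)             # residuals vs fitted, Q-Q plot, scale-location, leverage
par(mfrow = c(1, 1))
sw <- shapiro.test(res)   # test of normality of the residuals
print(sw$p.value)
```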
Apply-Type Functions (Program 11)
Experiment 26: Create a custom function to calculate the mean and standard deviation of
a dataset.
Experiment 27: Write a function to perform a specific statistical analysis (e.g., ANOVA,
chi-square test).
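Experiments 26 and 27 might look like the sketch below. The function names mean_sd and run_anova are illustrative, and mtcars is used only as a convenient built-in dataset.

```r
# Experiment 26: custom function returning the mean and standard deviation
mean_sd <- function(x, na.rm = TRUE) {
  c(mean = mean(x, na.rm = na.rm), sd = sd(x, na.rm = na.rm))
}

# Apply the function to several columns at once with sapply()
stats <- sapply(mtcars[, c("mpg", "wt", "hp")], mean_sd)
print(round(stats, 2))

# Experiment 27: small wrapper that runs a one-way ANOVA
run_anova <- function(formula, data) {
  summary(aov(formula, data = data))
}
anova_out <- run_anova(mpg ~ factor(cyl), mtcars)
print(anova_out)
```

sapply() calls mean_sd once per column and binds the results into a matrix with one column per variable.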
Statistical Functions in R (Program 12)
Experiment 28: Perform a chi-square test for independence on a contingency table.
Experiment 29: Fit a generalized linear model (e.g., logistic regression) to a dataset.
Experiment 30: Explore advanced modeling techniques like time series analysis or
survival analysis.
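A sketch for Experiments 28 and 29. The 2x2 contingency table holds made-up counts, and the logistic regression of am on wt in mtcars is an assumed example.

```r
# Experiment 28: chi-square test of independence on a contingency table
tab <- matrix(c(30, 20, 15, 35), nrow = 2,
              dimnames = list(Group = c("A", "B"),
                              Outcome = c("Yes", "No")))  # hypothetical counts
chi <- chisq.test(tab)   # for 2x2 tables R applies Yates' continuity correction
print(chi$statistic)
print(chi$p.value)
print(chi$expected)      # expected counts under independence

# Experiment 29: logistic regression as a generalized linear model
logit <- glm(am ~ wt, data = mtcars, family = binomial)
print(coef(logit))       # coefficients on the log-odds scale
print(summary(logit))
```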
Instructions
Instructions for Computer Practical File (Handwritten)
1. Number of Experiments: There are 30 Experiments. Write a minimum of 20 experiments
in your practical file. There are 12 parts in your syllabus, so ensure at least one program
from each part is included.
2. Labelling: Clearly label each experiment with a heading, such as "Experiment 1: [Title of
Experiment]".
3. Outputs/Results: Note down the outputs or results of each code. Where possible, include
hand-drawn diagrams or sketches of expected output for visual reference.
4. Writing Style: Write clearly and legibly, using a pen with dark ink for readability.
5. Code: Include the R code for each experiment. Ensure the code is properly formatted.
6. Comments: Add comments within the code to explain its functionality.
Last date of practical file submission: 11 Nov. 2024