Vietnam National University of HCMC
International University
School of Computer Science and Engineering
Data Analysis
(IT137)
Nguyen Trung Ky, PhD
[email protected]
Basic information about the course
● Instructor: Dr. Nguyen Trung Ky.
▪ Ph.D. Grenoble Alpes University 2019; third year at IU.
▪ Research on Computational Linguistics (Natural Language Processing,
Natural Language Generation) and Machine Learning.
▪ Office: O1.610
▪ Questions: ask right after class or make an appointment by email:
[email protected]
● Every Saturday, 8:00 – 10:30, from 13/09/2025 to 21/12/2025, in room A1.206
● Previous course: Intro to data science
● Course credits: 4
○ Lecture: 3 (from 13/09/2025 to 21/12/2025)
○ Laboratory: 1
■ Group 1: from 06/10/2025 to 30/11/2025
Online DA Group – Microsoft Teams
Data Analysis
● Introduce yourself (name, hometown)
● What do you want to become in the next five years?
● What do you expect from this course?
WHAT IS DATA SCIENCE?
…solving problems with data…
1. scientific, social, or business problem
2. collect & understand data
3. clean & format data
4. use data to create solution
…which step is most challenging?
Use data to create solution: data analysis or machine learning (or both)
WHAT IS DATA ANALYSIS?
…using data to discover useful information…
• data: anything you can measure or record
• statistics: summarize (and visualize) main characteristics of the data
• algorithms: apply algorithms to find patterns in the data
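The three ingredients above can be sketched in a few lines. This is a minimal illustration written in Python with only the standard library; the data set (hours studied and exam scores) and the helper names `summarize` and `pearson_r` are hypothetical and chosen only for this example. Linear correlation stands in here for "an algorithm that finds a pattern".

```python
import statistics

# Hypothetical data: hours studied and exam scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 73]


def summarize(values):
    """Descriptive statistics: the main characteristics of one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson_r(x, y):
    """A very simple pattern finder: the linear correlation of x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


summary = summarize(scores)
r = pearson_r(hours, scores)
print(summary)
print(f"correlation between hours and scores: {r:.3f}")
```

A correlation close to 1 suggests that, in this made-up sample, scores rise roughly linearly with hours studied; it does not by itself show that studying causes the higher scores.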
What types of data-related jobs are on the job market today?
What skills and tools are needed?
More on the job of a data scientist
Learning outcomes
1. Understand fundamental concepts of data analysis.
2. Explain how to perform data analysis with descriptive statistics and inferential
statistics.
3. Apply data analysis techniques and tools to some practical cases in
business/engineering.
Topics to be covered in this course
Week 1 Course Overview
Week 2 Basics of R
Week 3 Data types & wrangling
Week 4 Data types & wrangling (continued)
Week 5 Summary statistics
Week 6 Summary statistics (continued)
Week 7 Data plotting
Week 8 Data plotting (continued)
Week 9 Probability Basics
Week 10 Models & parameter inference
Week 11 Hypothesis testing
Week 12 Hypothesis testing (continued)
Week 13 Model comparison
Week 14 Linear regression
Week 15 Linear regression (continued)
Materials/Books
[1]. Anil Maheshwari, Data Analytics, 2022.
[2]. Migrant & Seasonal Head Start Technical Assistance Center, Introduction to Data Analysis Handbook, non-commercial use only.
[3]. Hadley Wickham & Garrett Grolemund, R for Data Science, O’Reilly, 2023.
Some useful websites for R
[1]. https://www.w3schools.com/r/
[2]. https://www.tutorialspoint.com/r/index.htm
[3]. https://www.r-bloggers.com/2021/04/tidyverse-in-r-complete-tutorial/
[4]. https://www.datacamp.com/tutorial/tidyverse-tutorial-r
Why do we choose to learn R for data analysis?
Applications of R
1. Who Uses R? Companies That Use R and What R Is Used For
https://careerkarma.com/blog/who-uses-r/
2. Why Top Companies are using R Programming
https://data-flair.training/blogs/r-applications/
Please share your experience with the R language using the following QR code.
Blackboard
● Course information, announcements
○ (KyNguyen)
● Upload lectures, quizzes or homework
Grading policies
1. Quizzes + lab assignments or project: 35%
2. Midterm: 25%
3. Final: 40%
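The weights above combine into a course grade as a weighted average. The sketch below is a minimal Python illustration; the weights come from the grading policy, while the 0-100 scale, the component scores, and the names `WEIGHTS` and `final_grade` are assumptions made only for this example.

```python
# Weights taken from the grading policy above.
WEIGHTS = {"quizzes_labs_project": 0.35, "midterm": 0.25, "final": 0.40}


def final_grade(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    # The three weights must cover the whole grade.
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores for one student.
example = {"quizzes_labs_project": 80, "midterm": 70, "final": 90}
grade = final_grade(example)
print(round(grade, 2))
```

Because the final exam carries the largest weight, a change of one point on the final moves the course grade more than one point on any other component.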
Thank you for listening!