Syllabus BigData EN
Software Department
Basic references:
1. Big Data and Big Data Analytics: Concepts, Types and Technologies /
https://www.researchgate.net/publication/328783489_Big_Data_and_Big_Data_Analytics_Concepts_Types_and_Technologies/related
2. Understanding Big Data. Analytics for Enterprise Class. Hadoop and Streaming Data /
https://www.immagic.com/eLibrary/ARCHIVES/EBOOKS/I111025E.pdf?__cf_chl_jschl_tk__=pmd_224e99ec9955bf0d1a8e4ab864b26016568b9271-1628851175-0-gqNtZGzNAfijcnBszQdi
3. Programming with Databases – Python / https://swcarpentry.github.io/sql-novice-survey/10-prog/index.html
4. Pandas / https://www.w3schools.com/python/pandas/pandas_intro.asp
5. Plotly Python Open Source Graphing Library / https://plot.ly/python/
6. Getting Started with Plotly in Python / https://plot.ly/python/getting-started/
7. Microsoft R Application Network / https://mran.microsoft.com/documents/what-is-r
8. Decision Tree / https://www.geeksforgeeks.org/decision-tree/
9. Apache Hadoop / http://hadoop.apache.org/
10. HDFS / https://www.ibm.com/analytics/hadoop/hdfs
11. About the Cassandra File System (CFS) – deprecated / https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/analytics/cfsAbout.html
Additional references:
1. Extract, transform, and load (ETL) / https://docs.microsoft.com/en-us/azure/architecture/data-guide/relational-data/etl
Educational content
5. Methodology
№ | Type of training session | Description of the lesson
Topic 1. Sources and types of big data. Data preparation and analysis in Python.
1 | Lecture 1. Sources of big data. Internet of Things. Definition of Big Data. | History of machine learning and its development. Predictive analysis and tasks of machine learning. Stages of scientific research. Errors in predictive analysis. Evaluation of the results of machine learning models. Types of machine learning.
2 | Lecture 2. Development of software for the analysis of websites that provide open data using Python Pandas. Open data, their formats and processing tools. | Capabilities of data analysis tools. The role of Python in data analysis. Traditional big data analytics and next-generation analytics. Data analysis life cycle. Open data, their formats and processing tools. Web scraping. Extract, transform, and load (ETL) of data.
Score Grade
100-95 Excellent
94-85 Very good
84-75 Good
74-65 Satisfactory
64-60 Sufficient
Below 60 Fail
Course requirements are not met Not Graded