Python Programming and ML Syllabus

The document outlines a 240-hour Python training course offered by Shazana Infotech Education and Training Academy. The course covers Python programming, data analysis with Python using NumPy and Pandas, machine learning, and deep learning. Some of the key modules included are Python syntax, object-oriented programming, data visualization, linear regression, logistic regression, KNN, clustering, natural language processing, convolutional neural networks, and recurrent neural networks. The course also includes several hands-on projects.

  • Python Programming
  • Data Analysis with Python
  • Machine Learning
  • Deep Learning

SHAZANA INFOTECH EDUCATION AND TRAINING ACADEMY

ADDRESS: 35, KEEZHA THERU, MELACAVERY, KUMBAKONAM - 612002


Short-term vocational training course

Python Syllabus - Duration: 240 hours

Python Programming

Module 1 - Installation of Python on the Anaconda Platform
 • Introduction to installing Anaconda
 • Introduction to Python editors & IDEs (Anaconda, Jupyter, etc.)
 • Understanding the Jupyter notebook
 • Overview of Python - getting started with Python
 • Exploring different IDEs for Python
Module 2 - Data Types and Variables
 • Primitive and core data types
 • Mutable and immutable data types
 • Core built-in data structures - lists, tuples, dictionaries, sets
 • Working with lists, tuples, dictionaries and sets; exploring their functions and methods
Module 3 - Strings and Conditional Statements in Python
 • Strings
 • Built-in string methods
 • String formatting using the format function
 • If-else statements | single-line (ternary) if-else
Module 4 - Loops and Functions in Python
 • What are loops?
 • While loops
 • For loops
 • What are functions and what are they used for?
 • Types of functions
 • Examples
Module 5 - File Handling and Comprehensions
 • What is file handling?
 • Opening, reading, modifying and deleting files
 • List comprehensions, set comprehensions
 • Lambda functions
Module 6 - Object-Oriented Programming
 • What is OOP?
 • Concepts of classes, objects and methods
 • Inheritance and its types | MRO - Method Resolution Order
Data Analysis with Python

Module 1 - NumPy
 • Installing NumPy | Introduction to NumPy arrays
 • Different operations and functions on NumPy arrays
 • Creating different arrays using NumPy
 • Array indexing and slicing | Array functions and methods
 • Different mathematical functions | Different matrix operations
 • Random numbers | Generating numbers within a range
Module 2 - Introduction to Pandas for Data Analysis
 • Installing Pandas | Introduction to Pandas
 • Using Pandas for data analysis
 • Series vs DataFrames
 • Reading data from CSV files, TXT files and databases
Module 3 - Pandas (Series)
 • The concept of a Series in Pandas | Creating a Series using Pandas
 • Series vs list, Series operations
 • Different functions on a Series | Sorting a Series
 • Extracting values from a Series
 • The .value_counts() and .apply() methods
 • Data cleaning - dealing with duplicates and missing values
Module 4 - Pandas (DataFrame)
 • What a DataFrame is and its uses | Creating a DataFrame
 • Different DataFrame functions
 • Dropping columns/rows from a DataFrame
 • Displaying particular columns from a DataFrame (subset of a DataFrame)
 • Adding a new column to a DataFrame
 • Broadcasting operations on a DataFrame | Dropping and filling null values
Module 5 - DataFrame (Continued)
 • Different sorting algorithms in a DataFrame
 • Filtering data in a DataFrame
 • Filtering data based on a condition
 • Filtering data with AND & OR operations
 • .set_index() & .reset_index() in Pandas
 • Retrieving row values using loc and iloc in Pandas
 • Setting new/multiple values for a specific cell or row
 • Renaming index labels or columns in Pandas
 • Deleting rows or columns in Pandas
 • The .nsmallest(), .nlargest(), .where(), .query() and .apply() methods in Pandas
Module 6 - Data Visualization
 • Introduction to data visualization | Installing Matplotlib and Seaborn
 • Line plots, scatter plots, bar plots, histograms, pie charts
 • Box plots
 • Detecting outliers
 • Heatmaps
Machine Learning

Module 1 - Introduction
 • Introduction to machine learning
 • Introduction to data science
 • Data Science vs AI vs ML vs Deep Learning
 • The flow of machine learning and its applications
Module 2 - Machine Learning Types
 • Types of machine learning
 • Supervised learning with examples
 • Unsupervised learning with examples
 • Reinforcement learning with examples
 • Regression vs classification
 • Different ML algorithms
Module 3 - Simple Linear Regression
 • Regression problem analysis
 • Mathematical modelling of a regression model
 • Use cases, regression tables
 • R-squared | Mean squared error
 • Model specification, data sources for linear regression
 • Project - Employee Salary Prediction / House Price Prediction
Module 4 - Logistic Regression and KNN
 • How logistic regression works and its mathematical equation
 • Model specification, evaluating the significance of model parameters, confusion matrix
 • Different classification evaluation metrics (accuracy score, precision, recall, F1-score)
 • The concept of the KNN model
 • Project - Iris Flower Classification / Titanic Passenger Survival Prediction
Module 5 - Unsupervised Learning: Clustering
 • Unsupervised learning, introduction to clustering
 • K-Means clustering, working with K-Means clustering
 • The maths behind K-Means clustering - centroids
 • Hierarchical clustering, dendrograms
 • Project - Customer Clustering
Module 6 - Natural Language Processing
 • Introduction to NLP
 • The concepts of stemming and lemmatization
 • Bag of Words, TF-IDF
 • Text cleaning and implementation in Python
 • Sentiment analysis
 • Entity recognition
 • Examples
Module 7 - Azure Machine Learning
 • Creating a regression model in Azure ML Studio
 • Creating a classification model in Azure ML Studio
Deep Learning

Module 1 - Introduction
 • Introduction to deep learning
 • Deep learning vs machine learning
 • Different techniques
 • Installing TensorFlow | Keras
Module 2 - Introduction to Artificial Neural Networks (ANN)
 • Artificial neural networks (ANNs): the concept
 • Activation functions
 • Feed-forward neural networks
 • Backpropagation
 • Cost functions
Module 3 - Introduction to Convolutional Neural Networks (CNN)
 • Introduction to CNNs
 • How CNNs work
 • Convolutional layers | Pooling | Flattening
Module 4 - Introduction to Recurrent Neural Networks (RNN)
 • Introduction to RNNs
 • Introduction to LSTMs
 • RNN vs LSTM
Module 5 - Model Performance Metrics
 • Confusion matrix
 • Precision score | Recall score | F1-score
 • Overfitting and underfitting
 • Learning rate | Batch size
 • Feature scaling
 • Outliers

Projects
 • Random Password Generator
 • Guess the Number Game
 • Dice Roll Simulator
 • Weather App
 • Covid-19 Data Analysis
 • IPL Data Analysis
 • Titanic Passenger Data Analysis
 • Basic Website with Django
 • Amazon/Flipkart Review Web Scraping
 • Text Cleaning and Implementation Using NLP
 • Review Sentiment Analysis
 • House Price Prediction
 • Titanic Passenger Survival Prediction
 • Customer Clustering
 • Image Resizer
 • Facial Recognition Attendance System
 • MNIST Handwritten Digit Prediction
 • Google Stock Price Prediction
 • Spam Email Detection
 • Face / Smile / Eye Detection
 • Car Detection
 • Image Background Changer

Common questions

Normalization and managing missing values are critical steps in data cleaning that significantly affect the accuracy and reliability of data analysis. Normalization involves scaling numerical features so that they share a common scale without distorting differences in the ranges of values. This process is essential for algorithms that compute distances between data points, such as K-Means clustering, ensuring that no particular feature dominates simply because of its scale. Handling missing values is equally crucial, as they can skew data analysis, leading to biased or inaccurate results. Filling missing values with the mean, median, or mode (a process called imputation), or removing data points with missing values altogether, helps maintain the integrity of the dataset. For example, in Pandas, methods like .fillna() or .dropna() can be used to address missing data. Proper management of these aspects ensures that the dataset is as complete and representative as possible, thereby enhancing the quality of insights derived.
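As a minimal sketch of these two steps in Pandas (the toy DataFrame and its column names are illustrative, not from the course material):

```python
import pandas as pd

# Toy DataFrame with one missing value (illustrative, not course data)
df = pd.DataFrame({"age": [25, 30, None, 40],
                   "salary": [50000, 60000, 55000, 80000]})

# Imputation: fill the missing age with the column mean
df["age"] = df["age"].fillna(df["age"].mean())

# Alternatively, drop any rows that still contain missing values
df = df.dropna()

# Min-max normalization: scale every column into the [0, 1] range
normalized = (df - df.min()) / (df.max() - df.min())
print(normalized)
```

Scikit-learn's MinMaxScaler and StandardScaler provide the same transformations in a form that slots into larger preprocessing pipelines.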

Imbalanced datasets can severely skew the performance of classification models, leading them to favor the majority class and overlook the minority class, which is often the more important one in practical scenarios like fraud detection or rare disease diagnosis. Common evaluation metrics such as accuracy become less informative in these contexts, as models may achieve high accuracy simply by predicting the majority class. To address this issue, several techniques can be employed: resampling methods like oversampling the minority class (e.g., SMOTE - Synthetic Minority Over-sampling Technique) or undersampling the majority class can balance the class distribution. Additionally, algorithmic approaches such as cost-sensitive learning, which assigns a higher penalty to misclassifications of the minority class, can be used. Moreover, evaluation metrics like precision, recall, and F1-score provide a more comprehensive view of the model's performance on imbalanced data. Implementing these strategies helps in developing robust models that generalize well to unseen data across all classes.
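The cost-sensitive approach can be sketched with scikit-learn's class_weight option (the synthetic dataset below is illustrative; SMOTE itself lives in the separate imbalanced-learn package):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced dataset: roughly 90% majority, 10% minority class
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# class_weight="balanced" penalizes minority-class mistakes more heavily
model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_train, y_train)
pred = model.predict(X_test)

# On imbalanced data these metrics are more informative than accuracy
print("precision:", precision_score(y_test, pred))
print("recall:", recall_score(y_test, pred))
print("f1:", f1_score(y_test, pred))
```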

Hierarchical clustering and K-Means clustering are two widely used methods in unsupervised learning, but they differ significantly in methodology. Hierarchical clustering builds a multilevel hierarchy of clusters that can be visualized as a dendrogram. It doesn't require specifying the number of clusters beforehand and offers flexibility, as it can be agglomerative (bottom-up) or divisive (top-down). It is computationally more intensive due to its complexity and is less scalable to larger datasets. K-Means clustering, conversely, requires the user to specify the number of clusters (K) before running the algorithm. It is iterative, assigning data points to the closest cluster centroid and then updating the centroids until convergence. K-Means is generally faster and simpler to implement than hierarchical clustering, making it more suitable for larger datasets, though its flat clustering reveals less about cluster structure. Choosing between these methods depends on data size, the desired granularity of clusters, and the computational resources available.
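A small side-by-side sketch, assuming scikit-learn's KMeans and AgglomerativeClustering on illustrative synthetic data:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

# Two well-separated blobs of 2-D points (illustrative data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])

# K-Means: the number of clusters must be chosen up front
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Agglomerative (bottom-up hierarchical) clustering on the same data
hc_labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)

print(km_labels)
print(hc_labels)
```

On data this cleanly separated both methods recover the same two groups (up to label order); their differences show up on larger, messier datasets.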

Lists and tuples in Python are both used to store collections of items, but they have different performance and usage implications. Lists are mutable, which means they can be modified after their creation, allowing for operations like appending and removing items. This flexibility comes at a cost, as lists generally consume more memory and have slower performance in certain operations compared to tuples, which are immutable. Tuples are fixed in size and faster due to their immutability, making them suitable for read-only collections. Dictionaries provide key-value pairing, allowing for fast lookups based on keys, which is beneficial in situations where data needs to be queried using identifiers. Sets, on the other hand, are ideal when operations involve membership testing, uniqueness, and mathematical set operations. They are unordered and mutable, but unlike lists, do not allow duplicate elements. Choosing the right data structure depends on the operations that need to be performed and the characteristics of the data.
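A short sketch of these trade-offs (the sample values are illustrative):

```python
# Lists are mutable; tuples are not
nums = [1, 2, 3]
nums.append(4)                 # in-place modification is allowed
point = (1, 2)
try:
    point[0] = 9               # tuples reject item assignment
except TypeError as err:
    print("tuple error:", err)

# Dictionaries: fast lookups keyed by identifiers
prices = {"apple": 30, "banana": 10}
print(prices["apple"])

# Sets: unordered, unique elements, fast membership tests
tags = {"python", "ml", "python"}   # the duplicate collapses
print("ml" in tags, len(tags))
```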

Supervised learning involves learning a function from labeled training data, which can then make predictions or classifications on new, unseen data. It is commonly used where historical data with correct outcomes is available, such as in linear regression and classification scenarios like spam detection. Examples include using past housing prices to predict future prices, or classifying emails into spam and non-spam categories. Unsupervised learning, in contrast, does not use labeled data, aiming instead to infer the natural structure present within a set of data points. It is often employed in clustering and association problems, such as customer segmentation or market basket analysis, where the goal is to discover patterns that were not previously identified. The main algorithms used include clustering techniques like K-Means and hierarchical clustering. While both approaches leverage algorithms to detect patterns, supervised learning uses both input and output data for training, whereas unsupervised learning only uses input data and any output is based on the hidden patterns identified.
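The contrast can be sketched with scikit-learn (the toy data below is chosen purely for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Supervised: learn from labeled pairs (x, y) where y = 2x
X = np.array([[1], [2], [3], [4]])
y = np.array([2, 4, 6, 8])
reg = LinearRegression().fit(X, y)
print(reg.predict([[5]]))      # predicts close to 10

# Unsupervised: group unlabeled points into two clusters
pts = np.array([[0], [1], [10], [11]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pts)
print(labels)
```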

Jupyter Notebook is highly favored in data analysis and scientific research environments for its ability to integrate code, comments, multimedia, and visualizations into a single, interactive document. This environment supports quick iteration and prototyping, which is ideal for data exploration and visualization tasks. It also allows sharing of reproducible research via notebooks that include code execution outputs. In contrast, traditional Integrated Development Environments (IDEs) like PyCharm or VSCode are tailored for software development, offering features like integrated debugging, version control, and project management tools. These environments support longer software development life cycles and complex project layouts where modular code development is crucial. While Jupyter is excellent for prototyping, traditional IDEs provide a robust framework for developing production-level applications.

Activation functions are vital components in the architecture of neural networks, as they introduce non-linearity into the model, enabling it to learn complex patterns from data. Without activation functions, a neural network would behave like a linear regression model, regardless of the number of layers present, and would be unable to handle non-linear decision boundaries. Different activation functions, such as ReLU, Sigmoid, and Tanh, are chosen based on the specific requirements of a problem. The ReLU function, for example, is currently popular for hidden layers because it helps mitigate the issue of vanishing gradients, which can impede model training. By allowing networks to learn multiple layers of representation, activation functions extend the capacity of neural networks to model intricate relationships within data, leading to better generalization and performance on unseen data.

Feature scaling is a preprocessing step used in regression models to standardize the range of independent variables or features. It is crucial because many machine learning algorithms, such as distance-based methods like KNN and gradient-descent-optimized models like linear regression, are sensitive to the scale of the input data. Model performance can degrade significantly if features vary greatly in scale, as those with larger scales can disproportionately dominate the cost function optimization. Scaling methods like standardization (z-score normalization) or min-max scaling bring features into the same range or distribution, which stabilizes the model's convergence behavior. For instance, during gradient descent optimization, properly scaled features help achieve faster and more robust convergence to a minimum, since each feature contributes comparably to the cost function gradients. This process enhances the efficiency and reliability of the model, leading to more accurate predictions.
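Both scaling methods can be sketched in a few lines of NumPy (the feature values are illustrative):

```python
import numpy as np

# One feature in the thousands, another in single digits (illustrative)
X = np.array([[1000.0, 1.0], [2000.0, 2.0], [3000.0, 3.0]])

# Standardization (z-score): zero mean, unit variance per column
z = (X - X.mean(axis=0)) / X.std(axis=0)

# Min-max scaling: each column mapped into [0, 1]
mm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(z)
print(mm)
```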

Lambda functions, also known as anonymous functions, and list comprehensions simplify Python code by allowing concise expression of logic and iteration, respectively. A lambda function is a small, unnamed function defined with the `lambda` keyword, typically used for short operations, e.g. sorting a list of tuples by the second element: `sorted(list_of_tuples, key=lambda x: x[1])`. List comprehensions provide a more readable and succinct syntax for creating lists than traditional loops, allowing operations in a single line (e.g., `[x**2 for x in range(10)]` generates a list of the squares from 0 to 9). Utilizing these tools enhances code efficiency and readability, significantly reducing the amount of verbose code needed to accomplish tasks and making the code more Pythonic by leveraging built-in language capabilities to express powerful concepts succinctly.
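Both idioms in a short, self-contained sketch (the sample data is illustrative):

```python
# Lambda as a sort key: order (name, score) tuples by score
scores = [("ana", 75), ("raj", 92), ("mei", 88)]
print(sorted(scores, key=lambda pair: pair[1]))

# List comprehension: squares of 0..9 in one line
squares = [x**2 for x in range(10)]
print(squares)

# Set comprehension: unique word lengths
lengths = {len(w) for w in ["python", "ml", "data", "ml"]}
print(lengths)
```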

Inheritance in Python's object-oriented programming allows a class, known as a child class, to inherit the attributes and methods of another class, referred to as the parent class. This facilitates code reusability by allowing new classes to use existing behaviors without rewriting code, leading to a cleaner organization of code and the ability to extend functionality. For instance, in a simple Python program, a parent class 'Vehicle' can outline general methods like 'start' and attributes such as 'fuel_type'. A child class 'Car' can inherit these attributes and methods while also adding specific methods like 'honk' or additional attributes like 'num_doors'. This reduces code duplication and improves maintainability, as changes made in the parent class automatically propagate to child classes.
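The Vehicle/Car example described above, as a minimal sketch (the method bodies are illustrative assumptions):

```python
class Vehicle:
    def __init__(self, fuel_type):
        self.fuel_type = fuel_type

    def start(self):
        return f"Starting a {self.fuel_type} vehicle"


class Car(Vehicle):                  # Car inherits from Vehicle
    def __init__(self, fuel_type, num_doors):
        super().__init__(fuel_type)  # reuse the parent initializer
        self.num_doors = num_doors

    def honk(self):                  # behavior specific to Car
        return "Beep!"


car = Car("petrol", 4)
print(car.start())    # inherited from Vehicle
print(car.honk())
print(Car.__mro__)    # Method Resolution Order: Car -> Vehicle -> object
```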
