Machine Learning Algorithms Cheat Sheet

Table of Contents

1. Supervised Learning
Linear Regression
Logistic Regression
Decision Trees
Random Forests
Support Vector Machines (SVM)
k-Nearest Neighbors (k-NN)
Naive Bayes
Gradient Boosting Machines (GBM)
Neural Networks
2. Unsupervised Learning
k-Means Clustering
Hierarchical Clustering
Principal Component Analysis (PCA)
Independent Component Analysis (ICA)
Association Rules
Autoencoders
3. Reinforcement Learning
Q-Learning
Deep Q-Networks (DQN)
Policy Gradients
Actor-Critic Methods
4. Semi-Supervised and Self-Supervised Learning
Self-Training
Co-Training
5. Ensemble Methods
Bagging
Boosting
Stacking

Supervised Learning

1. Linear Regression

Purpose: Predict continuous target variables.


Key Concept: Models the relationship between input features and output as a linear combination.
Equation: ( y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \dots + \beta_nx_n + \epsilon )
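
A minimal fit/predict sketch with scikit-learn (listed under Resources below); the toy data and values are illustrative:

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])  # single feature x1
y = np.array([2.1, 4.0, 6.2, 7.9])          # roughly y = 2x

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)        # estimated beta_0 and beta_1
print(model.predict([[5.0]]))               # prediction for x1 = 5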

2. Logistic Regression

Purpose: Binary classification.


Key Concept: Uses the logistic function to model the probability of a class.
Equation: ( P(Y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x_1 + \dots + \beta_nx_n)}} )
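
A minimal sketch with scikit-learn; the toy data is illustrative:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[2.0]]))  # [P(Y=0), P(Y=1)] for x1 = 2.0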

3. Decision Trees

Purpose: Classification and regression.


Key Concept: Splits data into subsets based on feature values.
Advantages: Easy to interpret, handles both numerical and categorical data.
Disadvantages: Prone to overfitting.
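
A short scikit-learn sketch; max_depth is an illustrative guard against the overfitting noted above:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)  # shallow tree resists overfitting
print(clf.predict(X[:3]))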

4. Random Forests

Purpose: Classification and regression.


Key Concept: Ensemble of decision trees, each trained on a bootstrap sample with a random subset of features (bagging plus feature randomness).
Advantages: Reduces overfitting, handles large datasets well.
Key Parameters: Number of trees, max depth.
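
A minimal sketch using the two key parameters above (values are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)  # number of trees, max depth
clf.fit(X, y)
print(clf.predict(X[:3]))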

5. Support Vector Machines (SVM)

Purpose: Classification and regression.


Key Concept: Finds the hyperplane that best separates classes with maximum margin.
Kernel Trick: Enables handling non-linear relationships.
Common Kernels: Linear, Polynomial, RBF.
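
A minimal sketch with scikit-learn's SVC using the RBF kernel (parameter values are illustrative):

from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)  # kernel trick handles non-linear boundaries
print(clf.predict(X[:3]))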

6. k-Nearest Neighbors (k-NN)

Purpose: Classification and regression.


Key Concept: Predicts from the k closest training examples: the majority label for classification, the average value for regression.
Advantages: Simple; no explicit training phase (lazy learning).
Disadvantages: Computationally intensive during prediction, sensitive to irrelevant features.
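
A minimal sketch; note that fit only stores the training data, matching the lazy-learning point above:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)  # k = 5; fit just memorizes the data
print(clf.predict(X[:3]))  # distance computations happen here, at prediction time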

7. Naive Bayes

Purpose: Classification.
Key Concept: Based on Bayes' Theorem with the assumption of feature independence.
Variants: Gaussian, Multinomial, Bernoulli.
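
A minimal sketch using the Gaussian variant, which suits continuous features:

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB().fit(X, y)
print(clf.predict(X[:3]))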

8. Gradient Boosting Machines (GBM)

Purpose: Classification and regression.


Key Concept: Builds models sequentially, each new model correcting errors of the previous ones.
Popular Implementations: XGBoost, LightGBM, CatBoost.
Advantages: High predictive performance, handles missing data.
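
A minimal sketch with scikit-learn's built-in GBM; XGBoost, LightGBM, and CatBoost expose a similar fit/predict interface (parameter values are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)
# learning_rate shrinks each tree's correction; n_estimators = boosting rounds
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
clf.fit(X, y)
print(clf.predict(X[:3]))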

9. Neural Networks

Purpose: Classification, regression, and many other tasks.


Key Concept: Composed of layers of interconnected nodes (neurons) that can capture complex patterns.
Types: Feedforward Neural Networks, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN).
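
A minimal feedforward-network sketch via scikit-learn's MLPClassifier (layer sizes are illustrative):

from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)  # two hidden layers
clf.fit(X, y)
print(clf.predict(X[:3]))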

Unsupervised Learning

1. k-Means Clustering

Purpose: Partition data into k distinct clusters.


Key Concept: Minimizes within-cluster variance.
Parameters: Number of clusters (k), distance metric.
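
A minimal sketch; k and the random data are illustrative:

import numpy as np
from sklearn.cluster import KMeans

X = np.random.RandomState(0).rand(100, 2)  # toy 2-D points
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])      # cluster assignment per point
print(km.cluster_centers_)  # the 3 learned centroids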

2. Hierarchical Clustering

Purpose: Create a hierarchy of clusters.


Key Concept: Either agglomerative (bottom-up) or divisive (top-down).
Linkage Criteria: Single, complete, average, Ward.
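
A minimal agglomerative sketch; the linkage string selects one of the criteria listed above:

import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.random.RandomState(0).rand(50, 2)
hc = AgglomerativeClustering(n_clusters=3, linkage="ward").fit(X)  # or "single", "complete", "average"
print(hc.labels_[:10])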

3. Principal Component Analysis (PCA)

Purpose: Dimensionality reduction.


Key Concept: Transforms data to a new coordinate system with orthogonal principal components.
Uses: Feature reduction, visualization.
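
A minimal sketch projecting 4-D data to 2-D for visualization:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)             # keep the two strongest components
X2 = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # variance captured per component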

4. Independent Component Analysis (ICA)

Purpose: Separate a multivariate signal into additive, independent components.


Key Concept: Maximizes statistical independence.
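
A minimal blind-source-separation sketch with scikit-learn's FastICA; the mixed signals are synthetic:

import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]  # two independent sources
X = S @ rng.rand(2, 2)                            # observed mixtures
S_est = FastICA(n_components=2, random_state=0).fit_transform(X)  # recovered up to order/scale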

5. Association Rules

Purpose: Discover interesting relations between variables in large databases.


Key Concepts: Support, Confidence, Lift.
Algorithms: Apriori, Eclat.
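
A minimal Apriori sketch, assuming the third-party mlxtend library is installed; the basket data and thresholds are illustrative:

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# one-hot basket data: rows = transactions, columns = items
df = pd.DataFrame({"bread":  [1, 1, 0, 1],
                   "butter": [1, 1, 0, 0],
                   "milk":   [0, 1, 1, 1]}).astype(bool)
frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="lift", min_threshold=1.0)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])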

6. Autoencoders

Purpose: Learn efficient codings of input data.


Key Concept: Neural network architecture with encoder and decoder parts.
Uses: Dimensionality reduction, anomaly detection.
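
A minimal Keras sketch (TensorFlow assumed installed); layer sizes and epochs are illustrative:

import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 20).astype("float32")  # toy data

inputs = keras.Input(shape=(20,))
code = keras.layers.Dense(4, activation="relu")(inputs)       # encoder: 20-D -> 4-D bottleneck
outputs = keras.layers.Dense(20, activation="sigmoid")(code)  # decoder: reconstruct the input
autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)     # note: target = input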

Reinforcement Learning

1. Q-Learning

Purpose: Learn the value of actions in states to derive an optimal policy.


Key Concept: Off-policy temporal difference learning.
Equation: ( Q(s, a) \leftarrow Q(s, a) + \alpha [r + \gamma \max_{a'} Q(s', a') - Q(s, a)] )
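
A minimal tabular sketch of the update rule above; state/action counts and hyperparameters are illustrative:

import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9  # learning rate, discount factor

def q_update(s, a, r, s_next):
    """One temporal-difference update of the Q-table."""
    td_target = r + gamma * Q[s_next].max()  # r + gamma * max_a' Q(s', a')
    Q[s, a] += alpha * (td_target - Q[s, a])

q_update(s=0, a=1, r=1.0, s_next=2)
print(Q[0])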

2. Deep Q-Networks (DQN)

Purpose: Combine Q-Learning with deep neural networks.


Key Concept: Uses neural networks to approximate Q-values.
Features: Experience replay, target networks.

3. Policy Gradients

Purpose: Optimize the policy directly.


Key Concept: Uses gradient ascent on expected rewards.
Algorithms: REINFORCE, Proximal Policy Optimization (PPO).

4. Actor-Critic Methods

Purpose: Combine value-based and policy-based methods.


Key Concept: The actor updates the policy; the critic evaluates it with a learned value function.
Examples: A3C, DDPG.

Semi-Supervised and Self-Supervised Learning

1. Self-Training

Purpose: Utilize unlabeled data to improve model performance.


Key Concept: Iteratively pseudo-label unlabeled data with the model's most confident predictions, then retrain on the expanded labeled set.
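
A minimal sketch with scikit-learn's SelfTrainingClassifier; masking half the iris labels stands in for real unlabeled data:

from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
y_partial = y.copy()
y_partial[::2] = -1  # -1 marks a sample as unlabeled

clf = SelfTrainingClassifier(SVC(probability=True)).fit(X, y_partial)  # base model must expose predict_proba
print(clf.predict(X[:3]))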

2. Co-Training

Purpose: Use multiple views of data to train models.


Key Concept: Two models train on different feature sets (views) of the same data and label unlabeled examples for each other.

Ensemble Methods

1. Bagging (Bootstrap Aggregating)

Purpose: Reduce variance and prevent overfitting.


Key Concept: Train multiple models on different bootstrap samples and aggregate predictions.
Example: Random Forest.
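
A minimal sketch: 50 trees on bootstrap samples, aggregated by majority vote (counts are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:3]))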

2. Boosting

Purpose: Reduce bias and build strong predictive models.


Key Concept: Sequentially train models, each focusing on errors of the previous ones.
Examples: AdaBoost, Gradient Boosting, XGBoost.
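
A minimal AdaBoost sketch; each round upweights the samples the previous learners got wrong:

from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict(X[:3]))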

3. Stacking

Purpose: Combine multiple models to improve performance.


Key Concept: Use a meta-model to aggregate predictions from base models.
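
A minimal sketch: two base models feed a logistic-regression meta-model (the choice of models is illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
base = [("svm", SVC()), ("tree", DecisionTreeClassifier())]
clf = StackingClassifier(estimators=base, final_estimator=LogisticRegression())
clf.fit(X, y)  # the meta-model learns from the base models' cross-validated predictions
print(clf.predict(X[:3]))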

Additional Algorithms and Techniques

Support Vector Regression (SVR)

Purpose: Regression using SVM principles.


Key Concept: Fits a function while ignoring errors that fall inside a predefined epsilon-margin around it.
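
A minimal sketch; epsilon sets the width of the error-free tube (values are illustrative):

import numpy as np
from sklearn.svm import SVR

X = np.linspace(0, 5, 50).reshape(-1, 1)
y = np.sin(X).ravel()
reg = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)  # errors inside the epsilon-tube are ignored
print(reg.predict([[2.5]]))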

Elastic Net

Purpose: Regularized regression combining L1 and L2 penalties.


Key Concept: Balances L1-driven feature selection against L2 coefficient shrinkage.
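
A minimal sketch; l1_ratio mixes the two penalties (1.0 = pure lasso, 0.0 = pure ridge):

import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.RandomState(0)
X = rng.rand(100, 10)
y = 3.0 * X[:, 0] + 0.1 * rng.rand(100)  # only the first feature matters

reg = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(reg.coef_)  # irrelevant coefficients are shrunk toward zero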

Gaussian Mixture Models (GMM)

Purpose: Probabilistic clustering.


Key Concept: Assumes data is generated from a mixture of several Gaussian distributions.
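
A minimal sketch; unlike k-means, GMM also returns soft (probabilistic) assignments:

import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.RandomState(0).rand(200, 2)
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
print(gmm.predict(X[:5]))        # hard cluster labels
print(gmm.predict_proba(X[:5]))  # per-component membership probabilities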

t-Distributed Stochastic Neighbor Embedding (t-SNE)


Purpose: Data visualization.
Key Concept: Reduces dimensions while preserving local structure.
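
A minimal sketch reducing 4-D data to 2-D for plotting; the perplexity value is illustrative:

from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X, _ = load_iris(return_X_y=True)
X2 = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)  # perplexity ~ neighborhood size
print(X2.shape)  # (150, 2), ready for a scatter plot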

Hidden Markov Models (HMM)

Purpose: Model sequential data.


Key Concept: States are hidden and emit observable events.
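
A minimal sketch, assuming the third-party hmmlearn library is installed; the observations are synthetic:

import numpy as np
from hmmlearn.hmm import GaussianHMM

X = np.random.RandomState(0).rand(100, 1)  # toy 1-D observation sequence
hmm = GaussianHMM(n_components=2, n_iter=50, random_state=0).fit(X)
states = hmm.predict(X)  # most likely hidden-state sequence (Viterbi)
print(states[:10])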

Key Concepts and Terms

Overfitting: Model performs well on training data but poorly on unseen data.
Underfitting: Model is too simple to capture underlying patterns.
Bias-Variance Tradeoff: Balance between model complexity and generalization.
Cross-Validation: Technique to assess model performance by partitioning data (see the sketch after this list).
Regularization: Techniques to prevent overfitting (e.g., L1, L2).
Feature Scaling: Standardizing features to improve model performance.
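
A minimal sketch tying cross-validation and feature scaling together; putting the scaler inside a pipeline ensures each fold is scaled on its own training split:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), SVC())
scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
print(scores.mean())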

Resources and Libraries

Python Libraries:

Scikit-learn: Comprehensive ML algorithms.


TensorFlow & Keras: Deep learning frameworks.
PyTorch: Flexible deep learning library.
XGBoost, LightGBM, CatBoost: Gradient boosting implementations.

Books:

"Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron
"Pattern Recognition and Machine Learning" by Christopher M. Bishop
"The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

Online Courses:

Coursera's Machine Learning by Andrew Ng


edX's MicroMasters in Statistics and Data Science
Udacity's Machine Learning Nanodegree
