05 Preprocessing and Sklearn - Slides

This document provides an overview of Lecture 5 topics which include reading datasets from text files, basic data handling in Python, object oriented programming concepts, machine learning with Scikit-learn, preparing training data using the Transformer API, and Scikit-learn pipelines. The lecture covers loading and exploring the Iris dataset, data preprocessing tools in Pandas and MLxtend, and introduces key concepts of Scikit-learn like estimators, the estimator API, and using Scikit-learn for classification/regression tasks in a Pythonic way.

Lecture 05

Data Preprocessing and

Machine Learning with Scikit-Learn
(Computational Foundations Part 3/3)

STAT 451: Intro to Machine Learning, Fall 2020

Sebastian Raschka

http://stat.wisc.edu/~sraschka/teaching/stat451-fs2020/

Sebastian Raschka STAT 451: Intro to ML Lecture 5: Scikit-learn 1

Lecture 5 (Data Preprocessing and ML with Scikit-Learn)
Topics
1. Reading a Dataset from a Tabular Text File

2. Basic Data Handling

3. Object Oriented Programming (OOP) & Python Classes

4. Machine Learning with Scikit-learn

5. Preparing Training Data & Transformer API

6. Scikit-learn Pipelines

Code notebook: https://github.com/rasbt/stat451-machine-learning-fs20/blob/master/L05/code/05-preprocessing-and-sklearn__notes.ipynb
Where we currently are in this course ...


Machine Learning Workflow

[Figure: workflow in four stages: Preprocessing, Learning, Evaluation, Prediction. Raw data and labels are split into a training dataset and a test dataset. Preprocessing: feature extraction and scaling, feature selection, dimensionality reduction, sampling. Learning: model selection, cross-validation, performance metrics, hyperparameter optimization. The learning algorithm produces a final model, which is evaluated on the test dataset and then used to predict labels for new data.]

Reading a Dataset from a
Tabular Text File


The Iris Dataset

Iris-Setosa Iris-Versicolor Iris-Virginica

Dataset paper: Fisher, R.A. "The use of multiple measurements in taxonomic problems." Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).

Sometimes Useful: Executing "Bash" Terminal
Commands Via "!"


A DataFrame Library for Data Wrangling

https://pandas.pydata.org

pandas is short for "PANel DAta S"

Pandas Paper: McKinney, Wes. "Data structures for statistical computing in Python." Proceedings of the 9th Python in Science Conference. Vol. 445. 2010.
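
A minimal sketch of reading a tabular text file into a DataFrame. The inline text stands in for a file such as a hypothetical "iris.csv", and the column names are assumptions chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for a comma-separated file on disk
csv_text = """SepalLength[cm],SepalWidth[cm],PetalLength[cm],PetalWidth[cm],Species
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # number of rows and columns
print(df.dtypes)  # numeric columns are parsed as floats, Species as strings
```

With a real file, the same call would take the path instead of the StringIO object.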


Many additional options exist ...


E.g., processing a large file iteratively ...

Source: https://github.com/rasbt/python_reference/blob/master/useful_scripts/large_csv_to_sqlite.py
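
One way to process a large file iteratively is the chunksize argument of read_csv. The sketch below is not the linked script; it uses a small in-memory table (an assumption for illustration) and only accumulates a sum per chunk.

```python
import io

import pandas as pd

# Small in-memory stand-in for a file too large to load at once
csv_text = "x,y\n" + "\n".join(f"{i},{i % 3}" for i in range(10))

total = 0
n_rows = 0
# With chunksize set, read_csv yields DataFrames of at most 4 rows each
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    total += int(chunk["x"].sum())
    n_rows += len(chunk)

print(n_rows, total)
```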


For scaling Pandas, also check out

Modin: https://github.com/modin-project/modin

and Dask: https://github.com/dask/dask

Python Function


Regular Function vs Lambda Function
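
A short sketch of the comparison; the function names are hypothetical. Both callables do the same thing, and the lambda form is mainly convenient for short, throwaway functions.

```python
def some_func(x):
    # Regular function: has a name and can contain several statements
    return "Hello World " + str(x)


# Lambda function: a single expression, here bound to a name for comparison
some_lambda = lambda x: "Hello World " + str(x)

print(some_func(123))
print(some_lambda(123))
```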

Column-based Data Processing via Lambda
Functions and ".apply"
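
A minimal sketch of this idea, assuming a hypothetical "Species" column with the three Iris class names; ".apply" calls the lambda once per entry of the column.

```python
import pandas as pd

df = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]}
)

# Convert the string labels to integers, one entry at a time
df["ClassLabel"] = df["Species"].apply(
    lambda x: 0 if x == "Iris-setosa" else 1 if x == "Iris-versicolor" else 2
)

print(df)
```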


Column-based Data Processing via
Dictionaries and ".map"
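
The same conversion can be sketched with a dictionary and ".map"; the column name and the dictionary are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]}
)

# Keys are the existing values, values are their replacements
label_dict = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}
df["ClassLabel"] = df["Species"].map(label_dict)

print(df)
```

Entries that have no matching key in the dictionary become NaN, which is worth checking for after the mapping.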


Quick Inspections via "head" and "tail"


Accessing the Underlying NumPy Array(s) via
the ".values" Attribute


"Creating*" the Label Vector "y" and Design
Matrix "X"

* why did I put "Creating"

in quotation marks?
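
A small sketch of obtaining X and y from a DataFrame through ".values"; the column names are hypothetical. One possible reading of the quotation marks is that ".values" exposes arrays derived from the data already held by the DataFrame rather than building something conceptually new, but that reading is an assumption here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "PetalLength[cm]": [1.4, 4.7, 6.0, 1.3],
        "PetalWidth[cm]": [0.2, 1.4, 2.5, 0.2],
        "ClassLabel": [0, 1, 2, 0],
    }
)

print(df.head(2))  # first two rows
print(df.tail(2))  # last two rows

# Design matrix X: one row per example, one column per feature
X = df[["PetalLength[cm]", "PetalWidth[cm]"]].values
# Label vector y: one entry per example
y = df["ClassLabel"].values

print(X.shape, y.shape)
```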


A Library with Additional Data Science  
& Machine Learning-related Functions

MLxtend: http://rasbt.github.io/mlxtend/

Raschka, Sebastian. "MLxtend: Providing machine learning and data science utilities and extensions to Python's scientific computing stack." The Journal of Open Source Software 3.24 (2018).


Exploratory Data Analysis (EDA)

(Later, we will see how to do this more conveniently)

Python Classes

To get a better understanding of the scikit-learn API, we need to understand the main concepts behind object-oriented programming (OOP) and classes in Python.
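
As an illustration, here is a toy class written in the fit/predict style. The class name and its behavior are hypothetical and not part of scikit-learn; the point is only that an object stores state (attributes) set by one method and used by another.

```python
class MajorityClassifier:
    """Toy estimator that always predicts the most frequent training label."""

    def __init__(self):
        # Attribute that is set during fit; the trailing underscore
        # follows the scikit-learn naming habit for fitted attributes
        self.majority_ = None

    def fit(self, X, y):
        counts = {}
        for label in y:
            counts[label] = counts.get(label, 0) + 1
        self.majority_ = max(counts, key=counts.get)
        return self  # returning self allows chaining: Model().fit(...)

    def predict(self, X):
        return [self.majority_ for _ in X]


clf = MajorityClassifier().fit([[1], [2], [3]], [0, 1, 1])
preds = clf.predict([[4], [5]])
print(preds)
```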

The "Main" Machine Learning Library for
Python

http://scikit-learn.org

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V. and Vanderplas, J., 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, pp. 2825-2830.

The Scikit-learn Estimator API (an OOP Paradigm)


The Scikit-learn Estimator API

① est.fit(X_train, y_train): fit the model on the training data and training labels

② est.predict(X_test): use the fitted model to predict labels for the test data

A 3-Nearest Neighbor Classifier & 2 Iris Features
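
A minimal sketch of the fit/predict pattern with a 3-nearest neighbor classifier. The choice of the two petal features, the split ratio, and the random seed are assumptions, and the bundled copy of Iris in scikit-learn is used instead of a text file.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X = X[:, 2:4]  # assumed feature choice: petal length and petal width

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y
)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)     # step 1: fit on training data and labels
y_pred = knn.predict(X_test)  # step 2: predict labels for the test data

acc = (y_pred == y_test).mean()
print(f"Test accuracy: {acc:.2f}")
```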

Issues with Random Subsampling ...

[Excerpt shown on the slide:] "... a sample is inversely proportional to the size of the sample. Let us have a look at an example using the Iris dataset [1], which we randomly divide into 2/3 training data and 1/3 test data as illustrated in Figure 1. (The source code for generating this graphic is available on GitHub [2].)"

All samples (n = 150)

Training samples (n = 100), Test samples (n = 50)

Figure 1: Distribution of Iris flower classes upon random subsampling into training and test sets.

[1] https://archive.ics.uci.edu/ml/datasets/iris
Stratified Splits
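
A sketch of a stratified split using the stratify argument of train_test_split; the split ratio and seed are assumptions. Stratification keeps the class proportions of the full dataset in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 50 examples per class

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Class counts in each subset
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print(train_counts, test_counts)
```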


Normalization: Min-Max Scaling

$$x_{norm}^{[i]} = \frac{x^{[i]} - x_{min}}{x_{max} - x_{min}}$$
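
A short NumPy sketch of min-max scaling on a made-up feature vector; the values are chosen only for illustration.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0])

# Rescale so that the smallest value maps to 0 and the largest to 1
x_norm = (x - x.min()) / (x.max() - x.min())
print(x_norm)
```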

Normalization: Standardization

$$x_{std}^{[i]} = \frac{x^{[i]} - \mu_x}{\sigma_x}$$
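
A short NumPy sketch of standardization on a made-up feature vector; after the transformation the feature has zero mean and unit standard deviation.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0])

# np.std uses the population formula (division by n) by default
x_std = (x - x.mean()) / x.std()
print(x_std)
```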

Sample vs Population Standard Deviation

$$s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(x^{[i]} - \bar{x}\right)^2}$$

$$\sigma_x = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x^{[i]} - \mu_x\right)^2}$$
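
The two definitions differ only in the divisor. In NumPy this is controlled by the ddof argument, while pandas uses the sample version by default; the example values are made up.

```python
import numpy as np
import pandas as pd

x = np.array([10.0, 20.0, 30.0])

sigma = np.std(x)           # population standard deviation (ddof=0, divide by n)
s = np.std(x, ddof=1)       # sample standard deviation (divide by n - 1)
s_pd = pd.Series(x).std()   # pandas defaults to ddof=1

print(sigma, s, s_pd)
```

For large n the difference is negligible, but it explains small mismatches between results computed with different libraries.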

Scaling Validation and Test Sets

Given 3 training examples:

- example1: 10 cm -> class 2

- example2: 20 cm -> class 2

- example3: 30 cm -> class 1

Estimate (from the training set):

mean: 20 cm

standard deviation: 8.2 cm

Standardize (z scores):

- example1: -1.22 -> class 2

- example2: 0.00 -> class 2

- example3: 1.22 -> class 1

Assume you have the classification rule:

h(z) = class 2 if z ≤ 0.6, class 1 otherwise

Given 3 NEW examples:

- example4: 5 cm -> class ?

- example5: 6 cm -> class ?

- example6: 7 cm -> class ?

Estimate "new" mean and std. from the new examples and standardize (this is the mistake to avoid):

- example4: -1.22 -> class 2

- example5: 0.00 -> class 2

- example6: 1.22 -> class 1

Standardize with the mean and standard deviation estimated from the training set instead:

- example4: -1.84 -> class 2

- example5: -1.71 -> class 2

- example6: -1.59 -> class 2
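
The arithmetic of this example can be checked with a few lines of NumPy. The sketch assumes the rule is "class 2 if z ≤ 0.6, class 1 otherwise" and uses the population standard deviation; the variable names are hypothetical.

```python
import numpy as np

train = np.array([10.0, 20.0, 30.0])  # training examples in cm
new = np.array([5.0, 6.0, 7.0])       # new examples in cm

mu, sigma = train.mean(), train.std()  # 20.0 and about 8.165


def h(z):
    # Assumed classification rule: class 2 if z <= 0.6, class 1 otherwise
    return np.where(z <= 0.6, 2, 1)


# Re-estimating the parameters on the new data hides that all three
# new examples are much shorter than the training examples
z_wrong = (new - new.mean()) / new.std()

# Reusing the training parameters keeps the new examples on the same scale
z_right = (new - mu) / sigma

print(np.round(z_wrong, 2), h(z_wrong))
print(np.round(z_right, 2), h(z_right))
```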
The Scikit-Learn Transformer API

① est.fit(X_train): estimate the parameters from the training data

② est.transform(X_train): transform the training data

③ est.transform(X_test): transform the test data using the parameters estimated from the training data
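
A minimal sketch of the transformer pattern with StandardScaler, using made-up one-feature data. The scaler is fit on the training data only and then applied to both sets.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[10.0], [20.0], [30.0]])
X_test = np.array([[5.0], [6.0], [7.0]])

scaler = StandardScaler()
scaler.fit(X_train)                        # step 1: learn mean and std from training data
X_train_std = scaler.transform(X_train)    # step 2: transform training data
X_test_std = scaler.transform(X_test)      # step 3: transform test data with the same parameters

print(scaler.mean_, scaler.scale_)
print(X_test_std.ravel())
```

Calling fit on the test data as well would re-estimate the parameters, which is exactly what the three-step order is meant to prevent.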

Working with Categorical Data


Categorical Data -> Ordinal Data
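
For an ordinal feature, the order of the categories carries meaning, so the integer codes should respect it. A sketch with a hypothetical "size" column and a hand-written mapping (the numeric codes are an assumption):

```python
import pandas as pd

df = pd.DataFrame({"size": ["M", "L", "XXL", "M"]})

# The codes preserve the order M < L < XXL; the spacing is a modeling choice
size_mapping = {"M": 2, "L": 3, "XXL": 5}
df["size_ordinal"] = df["size"].map(size_mapping)

print(df)
```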


Categorical Data -> Nominal Data
(Class Labels)
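
Class labels have no order, so any consistent integer coding works. One option is scikit-learn's LabelEncoder, sketched here on a few made-up labels.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"]

le = LabelEncoder()
y = le.fit_transform(labels)  # classes are numbered in sorted order

print(le.classes_)
print(y)
print(le.inverse_transform(y))  # map the integers back to the strings
```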


One-hot Encoding for Categorical
(Nominal) Features
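
For nominal features, integer codes would suggest an order that does not exist, so each category gets its own 0/1 column instead. A sketch using pandas' get_dummies on a hypothetical "color" column:

```python
import pandas as pd

df = pd.DataFrame({"color": ["green", "red", "blue"], "price": [10.1, 13.5, 15.3]})

# One indicator column per category; non-categorical columns are kept as-is
onehot = pd.get_dummies(df, columns=["color"])

print(onehot)
```

Passing drop_first=True drops one indicator column per feature, which removes the redundancy among the columns.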


Dealing with Missing Data
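
Two common options are sketched below on a made-up table: dropping rows that contain missing values, or imputing them, here with the column mean via SimpleImputer. Which option is appropriate depends on the dataset.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame(
    {"A": [1.0, 5.0, 10.0], "B": [2.0, 6.0, 11.0], "C": [3.0, np.nan, 12.0]}
)

print(df.isnull().sum())  # number of missing values per column

# Option 1: drop every row that has at least one missing value
dropped = df.dropna(axis=0)

# Option 2: replace missing values by the mean of their column
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
imputed = imputer.fit_transform(df.values)

print(dropped.shape)
print(imputed)
```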

Scikit-Learn Pipelines

[Figure: a pipeline that chains Scaling, Dimensionality Reduction, and a Learning Algorithm.
(Step 1) pipeline.fit(…) receives the training set and its class labels: Scaling and Dimensionality Reduction each call .fit(…) & .transform(…), and the Learning Algorithm calls .fit(…), which yields the Predictive Model.
(Step 2) pipeline.predict(…) receives the test set: Scaling and Dimensionality Reduction each call .transform(…) only, and the Predictive Model calls .predict(…), which returns class labels.]
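
A sketch of such a pipeline with make_pipeline. The specific steps (StandardScaler, PCA with two components, a 3-nearest neighbor classifier), the split, and the seed are assumptions chosen to mirror the scaling, dimensionality reduction, and learning algorithm stages.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y
)

pipe = make_pipeline(
    StandardScaler(),                     # scaling
    PCA(n_components=2),                  # dimensionality reduction
    KNeighborsClassifier(n_neighbors=3),  # learning algorithm
)

# Step 1: fit and transform each preprocessing step, then fit the classifier
pipe.fit(X_train, y_train)

# Step 2: only transform with the preprocessing steps, then predict
y_pred = pipe.predict(X_test)
acc = pipe.score(X_test, y_test)
print(f"Test accuracy: {acc:.2f}")
```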

Model Selection: Simple Holdout Method

[Figure: the original dataset is split into a training set and a test set; the training set is split further into a training set and a validation set. The machine learning algorithm is fit on the training set, the resulting predictive model is evaluated on the validation set, and the hyperparameters are changed and the procedure repeated. The test set is used for the final performance estimate.]
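
The holdout procedure can be sketched as a loop over candidate hyperparameter values. The candidate values of k, the split sizes, the seeds, and the refit on training plus validation data before the final test evaluation are all assumptions of this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Original dataset -> (training + validation) and test
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123, stratify=y
)
# (training + validation) -> training and validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=123, stratify=y_temp
)

best_k, best_acc = None, -1.0
for k in (1, 3, 5, 7):  # change the hyperparameter and repeat
    pipe = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    pipe.fit(X_train, y_train)
    acc = pipe.score(X_valid, y_valid)  # evaluate on the validation set
    if acc > best_acc:
        best_k, best_acc = k, acc

# Refit with the selected setting and estimate performance once on the test set
final = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=best_k))
final.fit(X_temp, y_temp)
test_acc = final.score(X_test, y_test)
print(best_k, round(best_acc, 2), round(test_acc, 2))
```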

Randomized Search

https://github.com/scikit-learn/scikit-learn/pull/13900
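
A sketch of randomized search with RandomizedSearchCV, which samples hyperparameter settings from given distributions instead of trying every combination. The pipeline, the distributions, the number of iterations, and the seeds are assumptions for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y
)

pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())

# make_pipeline names each step after its lowercased class name
param_distributions = {
    "kneighborsclassifier__n_neighbors": randint(1, 16),  # integers 1..15
    "kneighborsclassifier__p": [1, 2],                    # Manhattan or Euclidean
}

search = RandomizedSearchCV(
    pipe, param_distributions, n_iter=10, cv=5, random_state=123
)
search.fit(X_train, y_train)

print(search.best_params_)
print(round(search.best_score_, 2))
```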


Successive Halving


Lecture Notes

This time in interactive Jupyter Notebook form:

https://github.com/rasbt/stat451-machine-learning-fs20/blob/master/L05/code/05-preprocessing-and-sklearn__notes.ipynb

Bonus: Column Transformers for Heterogeneous Data
https://github.com/rasbt/stat451-machine-learning-fs20/blob/master/L05/code/05-bonus-column-transformer.ipynb
