Unit2 ML Programs
3) Implement the Euclidean distance and cosine similarity metrics from scratch in Python and
apply them to compare two vectors or data points.
[ ]: import math
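The cell above survives only as its import line. A minimal sketch of what task 3 asks for, using only the standard library, could look like the following. The function names `euclidean_distance` and `cosine_similarity` and the sample vectors `v1` and `v2` are assumptions made for illustration, not taken from the original notebook.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical sample vectors, chosen only for illustration
v1 = [1.0, 2.0, 3.0]
v2 = [4.0, 5.0, 6.0]
print("Euclidean distance:", euclidean_distance(v1, v2))
print("Cosine similarity:", cosine_similarity(v1, v2))
```

Euclidean distance depends on magnitude, while cosine similarity depends only on direction: scaling `v1` by 2 changes the former but leaves the latter unchanged.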
7) Implement the KNN algorithm in Python and apply it to a dataset to make predictions for a
new data point.
[ ]: import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
# Generate a synthetic dataset
np.random.seed(42)
X = np.sort(5 * np.random.rand(100, 1), axis=0)
y = np.sin(X).ravel()
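# NOTE: the steps between dataset generation and plotting are missing from the source.
# The split, n_neighbors, and scatter plots below are assumed values.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
knn_regressor = KNeighborsRegressor(n_neighbors=5)
knn_regressor.fit(X_train, y_train)
y_pred = knn_regressor.predict(X_test)
print('Mean Squared Error:', mean_squared_error(y_test, y_pred))
plt.scatter(X_test, y_test, label='Actual')
plt.scatter(X_test, y_pred, label='Predicted')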
plt.title('KNN Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
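The cell above imports scikit-learn's `KNeighborsRegressor`. Since the task asks to implement KNN, a from-scratch regression variant may also be useful. The sketch below is one possible reading of the task, not the notebook's own code: the helper `knn_predict`, the value `k=5`, and the query point `x_new` are assumptions, and the data mirrors the sine dataset above using the standard library's `random` module instead of NumPy.

```python
import math
import random


def knn_predict(X_train, y_train, x_new, k=5):
    """Predict a target for x_new as the mean target of its k nearest training points."""
    if not 1 <= k <= len(X_train):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair every training point's Euclidean distance to x_new with its target
    distances = [(math.dist(x, x_new), target) for x, target in zip(X_train, y_train)]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(target for _, target in nearest) / k


# Synthetic dataset: 100 points on [0, 5) with y = sin(x)
random.seed(42)
X_train = [[5 * random.random()] for _ in range(100)]
y_train = [math.sin(x[0]) for x in X_train]

# Hypothetical new data point
x_new = [2.5]
prediction = knn_predict(X_train, y_train, x_new, k=5)
print("Prediction for x = 2.5:", prediction)
print("True value sin(2.5):   ", math.sin(2.5))
```

For classification, the same neighbor search applies, with the mean replaced by a majority vote over the neighbors' labels.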
9) Train a logistic regression model on a binary classification dataset and analyze the importance
of each feature using the corresponding coefficients.
[ ]: import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
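Only the imports of this cell survive, so the following is a sketch of one way to complete task 9, not a reconstruction of the original. It assumes the binary problem is obtained by keeping two of the three iris classes (versicolor and virginica), and it standardizes the features so that coefficient magnitudes are comparable across features. Both choices, along with `test_size=0.3` and `random_state=42`, are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Assumption: drop class 0 (setosa) to get a binary problem (classes 1 and 2)
mask = iris.target != 0
X, y = iris.data[mask], iris.target[mask]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Standardize so that coefficient magnitudes can be compared across features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression()
model.fit(X_train_scaled, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
print(f'Accuracy: {accuracy:.3f}')

# For a binary problem coef_ has shape (1, n_features); rank features by |coefficient|
importance = sorted(zip(iris.feature_names, model.coef_[0]),
                    key=lambda pair: abs(pair[1]), reverse=True)
for name, coef in importance:
    print(f'{name}: {coef:+.3f}')
```

A positive coefficient raises the predicted log-odds of the second class (`model.classes_[1]`) as the feature increases; a negative one lowers it. Without standardization, the magnitudes would also reflect each feature's units.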
11) Implement a linear SVM classifier using Python’s scikit-learn library for a binary classification
problem. Visualize the decision boundary and support vectors.
[ ]: import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
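The body of this cell is missing apart from the imports and axis labels. A minimal sketch of task 11 is given below. It assumes a synthetic two-cluster dataset from `datasets.make_blobs` (the original dataset is unknown) and `C=1.0`. The boundary and margins are drawn as the contours of `decision_function` at 0 and ±1.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC

# Assumption: a synthetic two-feature, two-class dataset stands in for the original
X, y = datasets.make_blobs(n_samples=100, centers=2, random_state=42)

clf = SVC(kernel='linear', C=1.0)
clf.fit(X, y)

# Evaluate the decision function on a grid covering the data
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
    np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200),
)
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm', edgecolors='k')
# Solid line: decision boundary (0); dashed lines: margins (-1 and +1)
ax.contour(xx, yy, Z, levels=[-1, 0, 1], colors='k', linestyles=['--', '-', '--'])
# Circle the support vectors
ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
           s=150, facecolors='none', edgecolors='k', label='Support vectors')
ax.set_title('Linear SVM decision boundary')
ax.set_xlabel('Feature 1')
ax.set_ylabel('Feature 2')
ax.legend()
# In a notebook the figure renders when the cell finishes; in a script, call plt.show()
```

With a linear kernel, the same boundary is also available in closed form from `clf.coef_` and `clf.intercept_`; the contour approach is used here because it carries over unchanged to non-linear kernels.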
13) Build a Naive Bayes classifier to classify text documents into different categories. Preprocess
the text data and use the Laplace smoothing technique to handle unseen words.
[ ]: import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
# Sample text data for illustration
data = {'text': ["This is a positive document.",
"Negative sentiment detected in this text.",
"The sentiment in this document is positive.",
"This is another positive example."],
'label': ['Positive', 'Negative', 'Positive', 'Positive']}
df = pd.DataFrame(data)
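# NOTE: the steps between building df and fitting are missing from the source;
# the split parameters below are assumed values.
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['label'], test_size=0.25, random_state=42)
vectorizer = CountVectorizer()  # lowercases and tokenizes the text
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
naive_bayes_classifier = MultinomialNB(alpha=1.0)  # alpha=1.0 is Laplace (add-one) smoothing
naive_bayes_classifier.fit(X_train_vectorized, y_train)
y_pred = naive_bayes_classifier.predict(X_test_vectorized)
print('Accuracy:', accuracy_score(y_test, y_pred))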
print('Classification Report:')
print(classification_report(y_test, y_pred))
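To make the Laplace smoothing step concrete, the sketch below reimplements a small multinomial Naive Bayes classifier with the standard library, reusing the four sample documents from the cell above. The helper names (`tokenize`, `train_nb`, `word_prob`, `predict`) are hypothetical. The smoothed estimate used is P(w | c) = (count(w, c) + alpha) / (total words in c + alpha * |V|), so a word that never occurs in a class still gets a small non-zero probability.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train_nb(docs, labels, alpha=1.0):
    """Collect per-class word counts; alpha=1.0 is Laplace (add-one) smoothing."""
    vocab = set()
    word_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        vocab.update(tokens)
        word_counts[label].update(tokens)
    return {"vocab": vocab, "word_counts": word_counts,
            "class_counts": Counter(labels), "alpha": alpha}


def word_prob(model, word, label):
    """Smoothed P(word | label): never zero, even if word was not seen in label."""
    counts = model["word_counts"][label]
    alpha = model["alpha"]
    return (counts[word] + alpha) / (sum(counts.values()) + alpha * len(model["vocab"]))


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    n_docs = sum(model["class_counts"].values())
    scores = {}
    for label, n in model["class_counts"].items():
        score = math.log(n / n_docs)  # log prior
        for token in tokenize(text):
            # Tokens outside the training vocabulary are skipped,
            # which matches how CountVectorizer.transform treats them
            if token in model["vocab"]:
                score += math.log(word_prob(model, token, label))
        scores[label] = score
    return max(scores, key=scores.get)


docs = ["This is a positive document.",
        "Negative sentiment detected in this text.",
        "The sentiment in this document is positive.",
        "This is another positive example."]
labels = ['Positive', 'Negative', 'Positive', 'Positive']

model = train_nb(docs, labels, alpha=1.0)
print(word_prob(model, "positive", "Negative"))  # smoothed, although count is 0
print(predict(model, "A positive example"))
```

Without smoothing (`alpha=0`), the word "positive" would have probability zero under the Negative class and `math.log` would fail; add-one smoothing avoids that case.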
14) Compare the performance of the Naive Bayes algorithm with other classification algorithms on
a given dataset.
[ ]: import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report
# Define classifiers
classifiers = {
'Naive Bayes': GaussianNB(),
'Random Forest': RandomForestClassifier(random_state=42),
'Support Vector Machine': SVC(kernel='linear', random_state=42),
'K-Nearest Neighbors': KNeighborsClassifier(),
}
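The cell stops after the `classifiers` dictionary. One way the comparison might proceed is sketched below: load iris (matching the import), standardize the features, then fit and score each classifier on the same split. The split parameters and the `results` dictionary are assumptions, and the dictionary is repeated so the block runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = load_iris()
# Assumed split parameters; the original values are not in the source
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target)

# Fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

classifiers = {
    'Naive Bayes': GaussianNB(),
    'Random Forest': RandomForestClassifier(random_state=42),
    'Support Vector Machine': SVC(kernel='linear', random_state=42),
    'K-Nearest Neighbors': KNeighborsClassifier(),
}

# Train every classifier on the same data and record its test accuracy
results = {}
for name, clf in classifiers.items():
    clf.fit(X_train_scaled, y_train)
    results[name] = accuracy_score(y_test, clf.predict(X_test_scaled))

for name, acc in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f'{name}: {acc:.3f}')
```

A single split on a dataset this small gives noisy rankings; `cross_val_score` from `sklearn.model_selection` would give a steadier comparison.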
15) Build a Random Forest classifier using scikit-learn and apply it to a dataset.
[ ]: import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
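# NOTE: the steps between the imports and fit() are missing from the source;
# the split and n_estimators values below are assumed.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
random_forest_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
random_forest_classifier.fit(X_train, y_train)
y_pred = random_forest_classifier.predict(X_test)
print('Accuracy:', accuracy_score(y_test, y_pred))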
print('Classification Report:')
print(classification_report(y_test, y_pred, target_names=iris.target_names))