
Psyliq Internship Completion - Internship Python or R

The document outlines three machine learning projects: Stock Prediction using LSTM, Titanic Classification, and Number Recognition using Neural Networks with the MNIST dataset. Each project includes steps for data collection, preprocessing, model building, training, evaluation, and prediction, along with sample code implementations. The projects utilize libraries such as yfinance, pandas, TensorFlow, and scikit-learn for various tasks.

Uploaded by

Prathik Kumar
Copyright
© All Rights Reserved

1) Stock Prediction using LSTM:

i. Data Collection:
• Use the yfinance library in Python to fetch historical stock price data. Example: import yfinance as yf, then data = yf.download('AAPL', start='2022-01-01', end='2023-01-01').

ii. Data Preprocessing:


• Handle missing values by either filling them (for example, with the last known price) or removing the corresponding rows.
• Check for outliers, such as data-entry errors, that may distort the model's training, and remove or correct them.
• Convert the data into a format suitable for analysis, ensuring that the date is set as the index.
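
The preprocessing steps above can be sketched on a small made-up price table. The dates and values are illustrative assumptions, not real market data.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: dates stored as strings and one missing price
raw = pd.DataFrame({
    "Date": ["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06"],
    "Close": [182.0, np.nan, 174.9, 172.0],
})

# Set the date as a sorted DatetimeIndex
raw["Date"] = pd.to_datetime(raw["Date"])
prices = raw.set_index("Date").sort_index()

# Option 1: fill a gap with the last known price (forward fill)
filled = prices.ffill()

# Option 2: drop rows that have no price
dropped = prices.dropna()

print(filled)
print(len(dropped))
```

Forward filling keeps the row count unchanged, while dropping rows shortens the series; which one suits the data is a judgment call.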

iii. Feature Engineering:


• Create moving averages using pandas' rolling function.
• Compute technical indicators like Relative Strength Index (RSI) or Moving Average Convergence Divergence (MACD) using appropriate formulas.
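
A minimal sketch of these features with pandas, using a synthetic random-walk price series as a stand-in for real data. The RSI here uses simple rolling averages of gains and losses, which is one common variant (Wilder's original smoothing differs slightly); the 14, 12, 26 and 9 period lengths are conventional defaults, assumed here.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (a random walk), used only for illustration
rng = np.random.default_rng(0)
close = pd.Series(100 + np.cumsum(rng.normal(0, 1, 60)))

# 7-day simple moving average
ma_7 = close.rolling(window=7).mean()

# RSI over 14 periods, using simple rolling means of gains and losses
delta = close.diff()
gain = delta.clip(lower=0)
loss = -delta.clip(upper=0)
rs = gain.rolling(14).mean() / loss.rolling(14).mean()
rsi = 100 - 100 / (1 + rs)

# MACD: 12-period EMA minus 26-period EMA, with a 9-period EMA signal line
ema_12 = close.ewm(span=12, adjust=False).mean()
ema_26 = close.ewm(span=26, adjust=False).mean()
macd = ema_12 - ema_26
signal = macd.ewm(span=9, adjust=False).mean()

print(pd.DataFrame({"MA_7": ma_7, "RSI": rsi, "MACD": macd, "Signal": signal}).tail())
```

The first rows of the rolling features are NaN because a full window is not yet available, so they are usually dropped before training.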

iv. Normalization:
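• Normalize numerical features to a scale between 0 and 1. You can use the Min-Max scaling method for this. Fit the scaler on the training data only, so that no information from the test period leaks into training.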
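
Min-Max scaling maps each value x to (x - min) / (max - min). A small worked example with made-up prices, written with NumPy so that the arithmetic is visible:

```python
import numpy as np

# Hypothetical prices
prices = np.array([150.0, 160.0, 155.0, 170.0])

lo, hi = prices.min(), prices.max()

# Scale to the range [0, 1]
scaled = (prices - lo) / (hi - lo)

# Invert the scaling to get back to price units
restored = scaled * (hi - lo) + lo

print(scaled)    # [0.   0.5  0.25 1.  ]
print(restored)  # [150. 160. 155. 170.]
```

scikit-learn's MinMaxScaler applies the same formula to each column and stores the fitted minimum and range, which makes the inverse step convenient after prediction.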

v. Data Splitting:
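• Split the dataset into training and testing sets in chronological order, so that the model is tested on dates later than those it was trained on. With train_test_split from sklearn.model_selection, pass shuffle=False; slicing the data at a cut-off row works equally well.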
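
A rough sketch of a chronological split followed by sliding-window construction, which is one common way to shape a price series for an LSTM. The helper name make_windows and the window length of 3 are assumptions made for illustration.

```python
import numpy as np

def make_windows(series, window):
    """Return (X, y): each X row holds `window` consecutive values,
    and y is the value that immediately follows that row."""
    X = np.array([series[i - window:i] for i in range(window, len(series))])
    y = series[window:]
    return X, y

series = np.arange(10, dtype=float)   # stand-in for scaled prices
split = int(len(series) * 0.8)        # first 80% of rows for training
train, test = series[:split], series[split:]

X_train, y_train = make_windows(train, window=3)

# LSTM layers expect input shaped (samples, timesteps, features)
X_train = X_train.reshape(-1, 3, 1)

print(X_train.shape)   # (5, 3, 1)
print(y_train)         # [3. 4. 5. 6. 7.]
```

Every training value precedes every test value, which mirrors how the model would be used on unseen future dates.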

vi. Model Building:


• Build an LSTM model using a deep learning library like TensorFlow or PyTorch. For TensorFlow, you can use the Sequential model and add LSTM layers. LSTM layers expect three-dimensional input of shape (samples, timesteps, features).

vii. Model Training:


• Train the LSTM model using the training dataset. Specify the number of epochs and other hyperparameters. Example: model.fit(X_train, y_train, epochs=50, batch_size=32).

viii. Model Evaluation:


• Evaluate the model on the testing dataset. Calculate metrics like Mean Squared Error (MSE) to assess the model's performance.
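
MSE is the mean of the squared differences between actual and predicted values. A small worked example with made-up numbers:

```python
import numpy as np

# Hypothetical actual and predicted prices
y_true = np.array([100.0, 102.0, 101.0])
y_pred = np.array([101.0, 100.0, 101.0])

errors = y_true - y_pred          # [-1.  2.  0.]
mse = np.mean(errors ** 2)        # (1 + 4 + 0) / 3
rmse = np.sqrt(mse)               # same units as the prices

print(round(mse, 4))   # 1.6667
print(round(rmse, 4))  # 1.291
```

If the model was trained on scaled prices, the MSE it reports is in scaled units; converting predictions back to prices first gives an error that is easier to interpret.
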
ix. Prediction:
• Use the trained model to make predictions on future stock prices by providing new input data.

Sample implementation for the above task:

Code:-
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Data Collection
data = yf.download('AAPL', start='2022-01-01', end='2023-01-01')

# Data Preprocessing
# Keep the closing price as a Series and drop any missing rows
close = data['Close'].squeeze().dropna()
features = pd.DataFrame({'Close': close})

# Feature Engineering
# Example: 7-day moving average (the first 6 rows have no value and are dropped)
features['MA_7'] = features['Close'].rolling(window=7).mean()
features = features.dropna()

# Data Splitting
# Chronological 80/20 split; time-series rows must not be shuffled
split = int(len(features) * 0.8)
train, test = features.iloc[:split], features.iloc[split:]

# Normalization
# Fit the scaler on the training rows only, then apply it to both sets
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train)
test_scaled = scaler.transform(test)

# Build (samples, timesteps, features) windows; the target is the next day's scaled Close
def make_windows(values, window=10):
    X, y = [], []
    for i in range(window, len(values)):
        X.append(values[i - window:i])
        y.append(values[i, 0])
    return np.array(X), np.array(y)

X_train, y_train = make_windows(train_scaled)
X_test, y_test = make_windows(test_scaled)

# Model Building
model = Sequential()
model.add(LSTM(50, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

# Model Training
model.fit(X_train, y_train, epochs=50, batch_size=32)

# Model Evaluation (MSE on the scaled test targets)
loss = model.evaluate(X_test, y_test)

# Prediction
# Convert the scaled predictions back to prices using the Close column's range
predicted_scaled = model.predict(X_test)
predicted_prices = predicted_scaled * scaler.data_range_[0] + scaler.data_min_[0]
2) Titanic Classification:
Data Collection:
• Download the Titanic dataset from a source like Kaggle.

Data Exploration:
• Use pandas for exploratory data analysis (EDA). Check for missing values, data types, and summary statistics.
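
A brief sketch of these checks on a few hypothetical rows that use Titanic-style column names. The values are made up; with the real file, the DataFrame would come from pd.read_csv instead.

```python
import numpy as np
import pandas as pd

# Hypothetical rows using Titanic-style column names
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, np.nan, 35.0],
})

missing = df.isna().sum()   # missing values per column
dtypes = df.dtypes          # data type of each column
summary = df.describe()     # count, mean, std, min, quartiles, max (numeric columns)

print(missing)
print(dtypes)
print(summary)
```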

Data Preprocessing:
• Handle missing values by imputing or removing them.
• Encode categorical variables using one-hot encoding for algorithms that require numerical input.
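
One possible way to carry out both steps with pandas, shown on hypothetical rows. Filling Age with the median and Embarked with the most frequent value are assumptions chosen for illustration; other imputation strategies are equally valid.

```python
import numpy as np
import pandas as pd

# Hypothetical rows with one missing value in each column
df = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, 40.0],
    "Embarked": ["S", "C", None, "S"],
})

# Impute: median for the numeric column, most frequent value for the categorical one
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# One-hot encode the categorical column
encoded = pd.get_dummies(df, columns=["Embarked"])

print(encoded)
```

Each category becomes its own indicator column (here Embarked_C and Embarked_S), which is the numerical form most scikit-learn estimators expect.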

Data Splitting:
• Split the dataset into training and testing sets using train_test_split.

Model Selection:
• Choose a classification model such as logistic regression, decision trees, or random forests.

Model Training:
• Train the selected model on the training dataset. Example: from sklearn.linear_model import LogisticRegression, then model = LogisticRegression().fit(X_train, y_train).

Model Evaluation:
• Evaluate the model using metrics like accuracy, precision, recall, and F1 score on the testing dataset.
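
To show what these metrics measure, the sketch below computes them by hand from made-up labels, where 1 means survived. In practice, the functions in sklearn.metrics return the same quantities.

```python
# Hypothetical true labels and model predictions (1 = survived, 0 = did not)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives

accuracy = (tp + tn) / len(pairs)          # share of all predictions that are correct
precision = tp / (tp + fp)                 # share of predicted survivors who survived
recall = tp / (tp + fn)                    # share of actual survivors that were found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)     # 0.75 0.75 0.75 0.75
```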

Feature Importance:
• If using a tree-based model, analyze feature importance using the feature_importances_ attribute.
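
A small sketch of reading feature_importances_ from a random forest. The data is synthetic and the column names are hypothetical: the label depends only on the "signal" column, so that column should receive most of the importance.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the label depends only on "signal"; the other columns are noise
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "signal": rng.random(200),
    "noise_a": rng.random(200),
    "noise_b": rng.random(200),
})
y = (X["signal"] > 0.5).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Importances are normalized so that they sum to 1 across features
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```
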
Sample implementation for the above task:

Code:-
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Data Collection
titanic_data = pd.read_csv('titanic.csv')

# Data Preprocessing
# Drop rows with a missing Age or Embarked value, then encode Sex as 0/1
titanic_data = titanic_data.dropna(subset=['Age', 'Embarked'])
titanic_data['Sex'] = titanic_data['Sex'].map({'male': 0, 'female': 1})
X = titanic_data[['Pclass', 'Sex', 'Age']]
y = titanic_data['Survived']

# Data Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model Selection (fixed random_state for reproducible results)
model = RandomForestClassifier(random_state=42)

# Model Training
model.fit(X_train, y_train)

# Model Evaluation
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
report = classification_report(y_test, predictions)
print(f'Accuracy: {accuracy:.3f}')
print(report)

# Feature Importance
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
3) Number Recognition using Neural Network and MNIST dataset:
Data Collection:
• Download the MNIST dataset, which is often available through TensorFlow or PyTorch datasets.

Data Preprocessing:
• Normalize pixel values to a scale between 0 and 1. Reshape the images to a format suitable for neural networks.
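
A minimal sketch of both steps with NumPy, using random 28x28 arrays as stand-ins for MNIST images. Which reshape is needed depends on the model: dense layers take flat vectors, while convolutional layers usually expect a channel axis.

```python
import numpy as np

# Random 8-bit "images" standing in for MNIST digits (5 images of 28x28 pixels)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Normalize pixel values from [0, 255] to [0, 1]
scaled = images.astype("float32") / 255.0

# Flatten each image into a 784-element vector (for dense layers)
flat = scaled.reshape(len(scaled), -1)

# Or add a channel axis (for convolutional layers)
with_channel = scaled[..., np.newaxis]

print(flat.shape)          # (5, 784)
print(with_channel.shape)  # (5, 28, 28, 1)
```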

Data Splitting:
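• Split the dataset into training and testing sets. When loaded through tensorflow.keras.datasets, MNIST already comes split into 60,000 training and 10,000 test images.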

Model Building:
• Build a neural network using TensorFlow or PyTorch. For TensorFlow, use the Sequential model and add dense layers.

Model Training:
• Train the neural network on the training dataset, specifying the number of epochs and batch size.

Model Evaluation:
• Evaluate the model on the testing dataset using accuracy or other relevant metrics.

Prediction:
• Use the trained model to predict the digit in new handwritten images.
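
A softmax output layer returns one probability per digit for each image, so the predicted digit is the index of the largest probability. The sketch below uses made-up probabilities in place of real model output.

```python
import numpy as np

# Hypothetical softmax outputs for two images (10 probabilities each, summing to 1)
probs = np.full((2, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 3] = 0.82

# The predicted digit is the class with the highest probability
predicted_digits = probs.argmax(axis=1)

# The highest probability itself is a rough confidence score
confidence = probs.max(axis=1)

print(predicted_digits)  # [7 3]
print(confidence)        # [0.82 0.82]
```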

Sample implementation for the above task:

Code:-
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

# Data Collection (already split into training and test sets)
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Data Preprocessing
# Scale pixels to [0, 1] and one-hot encode the digit labels
X_train, X_test = X_train / 255.0, X_test / 255.0
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

# Model Building
model = Sequential()
model.add(Flatten(input_shape=(28, 28)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Model Training
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32)

# Model Evaluation
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test accuracy: {accuracy:.3f}')

# Prediction
# Each row holds 10 class probabilities; argmax gives the predicted digit
predictions = model.predict(X_test)
predicted_digits = predictions.argmax(axis=1)
