Psyliq Internship Completion - Internship Python or R
1) Stock Price Prediction using LSTM:
i. Data Collection:
Use the yfinance library in Python to fetch historical stock price data. Example: import yfinance as yf and data = yf.download('AAPL', start='2022-01-01', end='2023-01-01').
ii. Data Preprocessing:
Extract the closing prices and reshape them into a single-column array suitable for scaling.
iii. Feature Engineering:
Derive extra features from the price series, for example a 7-day moving average of the closing price.
iv. Normalization:
Normalize numerical features to a scale between 0 and 1. You can use the Min-Max scaling method for this; a short sketch of the mapping follows this list.
v. Data Splitting:
Split the dataset into training and testing sets using the train_test_split function from
sklearn.model_selection.
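As referenced in step iv, here is a minimal sketch of what Min-Max scaling computes (the prices are made-up illustration values; the full code below uses sklearn's MinMaxScaler, which applies the same mapping):

import numpy as np

# Hypothetical closing prices, purely for illustration.
prices = np.array([148.0, 150.0, 155.0, 160.0])
# Min-Max scaling maps each value x to (x - min) / (max - min),
# so every result lies between 0 and 1.
scaled = (prices - prices.min()) / (prices.max() - prices.min())
print(scaled)  # [0.         0.16666667 0.58333333 1.        ]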
Code:-
import yfinance as yf
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
# Data Collection
data = yf.download('AAPL', start='2022-01-01', end='2023-01-01')
# Data Preprocessing
data = data['Close'].values.reshape(-1, 1)
scaler = MinMaxScaler()
data_scaled = scaler.fit_transform(data)
# Feature Engineering
# Example: Creating a 7-day moving average
data_scaled = pd.DataFrame(data_scaled, columns=['Close'])
data_scaled['MA_7'] = data_scaled['Close'].rolling(window=7).mean()
# Normalization
# The 7-day average of already-scaled prices also lies in [0, 1], so a second
# scaling pass is unnecessary (the scaler above was fitted on a single column
# and cannot transform two); just drop the NaN rows left by the rolling window.
data_normalized = data_scaled.dropna().values
# Data Splitting
# Features are today's (Close, MA_7); the target is the next day's scaled Close.
X_train, X_test, y_train, y_test = train_test_split(
    data_normalized[:-1], data_normalized[1:, 0], test_size=0.2, shuffle=False)
# Model Building
model = Sequential()
# Treat the two features (Close, MA_7) as a length-2 input sequence.
model.add(LSTM(50, input_shape=(X_train.shape[1], 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
# Model Training
# Reshape to (samples, timesteps, features), the 3-D input the LSTM expects.
model.fit(X_train.reshape(X_train.shape[0], X_train.shape[1], 1), y_train,
          epochs=50, batch_size=32)
# Model Evaluation
loss = model.evaluate(X_test.reshape(X_test.shape[0], X_test.shape[1], 1), y_test)
# Prediction
predicted_prices = model.predict(X_test.reshape(X_test.shape[0], X_test.shape[1], 1))
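Since the model works on scaled values, a natural follow-up (a sketch continuing from the variables above) is to map the predictions back to dollar prices with the fitted scaler:

# The scaler was fitted on the single Close column, so the (n, 1) prediction
# array matches the shape its inverse_transform expects.
predicted_dollars = scaler.inverse_transform(predicted_prices)
actual_dollars = scaler.inverse_transform(y_test.reshape(-1, 1))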
2) Titanic Classification:
Data Collection:
Download the Titanic dataset from a source such as Kaggle.
Data Exploration:
Use pandas for exploratory data analysis (EDA). Check for missing values, data types, and
summary statistics.
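A minimal EDA sketch (assuming the same titanic.csv file used in the sample implementation below):

import pandas as pd

titanic_data = pd.read_csv('titanic.csv')
titanic_data.info()                 # column dtypes and non-null counts
print(titanic_data.isnull().sum())  # missing values per column
print(titanic_data.describe())      # summary statistics for numeric columns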
Data Preprocessing:
Handle missing values by imputing or removing them.
Encode categorical variables using one-hot encoding for algorithms that require numerical
input.
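One way to realize both steps (a sketch; the sample implementation below takes the simpler route of dropping incomplete rows and label-mapping Sex):

import pandas as pd

titanic_data = pd.read_csv('titanic.csv')
# Impute missing ages with the median instead of dropping those rows.
titanic_data['Age'] = titanic_data['Age'].fillna(titanic_data['Age'].median())
# One-hot encode a categorical column (Embarked takes the values 'S', 'C', 'Q').
titanic_data = pd.get_dummies(titanic_data, columns=['Embarked'])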
Data Splitting:
Split the dataset into training and testing sets using train_test_split.
Model Selection:
Choose a classification model such as logistic regression, decision trees, or random forests.
Model Training:
Train the selected model on the training dataset. Example: from sklearn.linear_model
import LogisticRegression and model = LogisticRegression().fit(X_train, y_train).
Model Evaluation:
Evaluate the model using metrics like accuracy, precision, recall, and F1 score on the testing
dataset.
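Beyond plain accuracy, each metric can be computed directly (a sketch using the y_test and predictions variables from the sample code below; classification_report prints the same numbers in tabular form):

from sklearn.metrics import precision_score, recall_score, f1_score

precision = precision_score(y_test, predictions)
recall = recall_score(y_test, predictions)
f1 = f1_score(y_test, predictions)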
Feature Importance:
If using a tree-based model, analyze feature importance via the feature_importances_ attribute (a short sketch follows the sample code below).
Sample implementation for the above task:
Code:-
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
# Data Collection
titanic_data = pd.read_csv('titanic.csv')
# Data Preprocessing
titanic_data.dropna(subset=['Age', 'Embarked'], inplace=True)
titanic_data['Sex'] = titanic_data['Sex'].map({'male': 0, 'female': 1})
X = titanic_data[['Pclass', 'Sex', 'Age']]
y = titanic_data['Survived']
# Data Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model Selection
model = RandomForestClassifier()
# Model Training
model.fit(X_train, y_train)
# Model Evaluation
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
report = classification_report(y_test, predictions)
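The feature-importance step from the outline is not in the sample code; continuing from the fitted model above, a minimal sketch:

# Pair each importance score with its column name and rank them.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))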
3) Number Recognition using a Neural Network and the MNIST dataset:
Data Collection:
Download the MNIST dataset, which is often available through TensorFlow or PyTorch
datasets.
Data Preprocessing:
Normalize pixel values to a scale between 0 and 1. Reshape the images to a format suitable
for neural networks.
Data Splitting:
Split the dataset into training and testing sets (MNIST ships pre-split; mnist.load_data() returns both).
Model Building:
Build a neural network using TensorFlow or PyTorch. For TensorFlow, use the Sequential
model and add dense layers.
Model Training:
Train the neural network on the training dataset, specifying the number of epochs and batch
size.
Model Evaluation:
Evaluate the model on the testing dataset using accuracy or other relevant metrics.
Prediction:
Use the trained model to predict the digit in new handwritten images.
Code:-
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
# Data Collection
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Data Preprocessing
X_train, X_test = X_train / 255.0, X_test / 255.0
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
# Model Building
model = Sequential()
model.add(Flatten(input_shape=(28, 28)))  # flatten each 28x28 image to a 784-vector
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
# Model Training
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32)
# Model Evaluation
loss, accuracy = model.evaluate(X_test, y_test)
# Prediction
predictions = model.predict(X_test)
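predictions holds one 10-way softmax distribution per test image; to read off the predicted digits (a short sketch continuing from the code above):

import numpy as np

# argmax over the class axis picks the most probable digit for each image.
predicted_digits = np.argmax(predictions, axis=1)
print(predicted_digits[:10])  # predicted digits for the first ten test images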