0% found this document useful (0 votes)

63 views63 pages

Final RSR Word Report

Uploaded by

dadsgift0077

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

63 views63 pages

Final RSR Word Report

Uploaded by

dadsgift0077

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

Optimizing Stock Prediction Using Hybrid Neural

Network : Unified Evaluation Method Approach

A MAJOR PROJECT REPORT

Submitted by
SAI SUSHANTH C [RA2111028020054]
RIZWAN AHMED JAWED [RA2111028020049]
RAHUL S [RA2111028020052]

Under the guidance of

Dr. M. Prabu
(Assistant Professor, Department of Computer Science and
Engineering)
in fulfillment for the award of the degree

BACHELOR OF TECHNOLOGY
in

COMPUTER SCIENCE AND ENGINEERING

FACULTY OF ENGINEERING AND TECHNOLOGY

SRM INSTITUTE OF SCIENCE AND TECHNOLOGY

RAMAPURAM, CHENNAI -600089
MAY 2024
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
(Deemed to be University U/S 3 of UGC Act, 1956)

BONAFIDE CERTIFICATE

Certified that this project report titled “Optimizing Stock Prediction Using
Hybrid Neural Network : Unified Evaluation Method Approach ” is
the bonafide work of SAI SUSHANTH C [RA2111028020054], RIZWAN AHMED
JAWED [RA2111028020049], RAHUL S [RA2111028020052]who carried out the project
work under my supervision. Certified further, that to the best of my knowledge the work
reported herein does not form any other project report or dissertation on the basis of which a
degree or award was conferred on an occasion on this or any other candidate.

SIGNATURE SIGNATURE
Dr. M. Prabhu,
Assistant Professor, Professor and Head,
Computer Science and Engineering, Computer Science and Engineering,
SRM Institute of Science and Technology, SRM Institute of Science and Technology,
Ramapuram, Chennai. Ramapuram, Chennai.

Submitted for the project viva-voce held on____________________at SRM Institute of

Science and Technology, Ramapuram, Chennai -600089.

INTERNAL EXAMINER EXTERNAL EXAMINER

SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
RAMAPURAM, CHENNAI - 89

DECLARATION

We hereby declare that the entire work contained in this project report titled “Optimizing
Stock Prediction Using Hybrid Neural Network : Unified Evaluation Method
Approach” has been carried out by SAI SUSHANTH C [RA2111028020054],
RIZWAN AHMED JAWED [RA2111028020049], RAHUL S [RA2111028020052] at
SRM Institute of Science and Technology, Ramapuram Campus, Chennai- 600089, under the
guidance of [Link], Assistant Professor, Department of Computer Science and
Engineering.

Place: Chennai
Date:

SAI SUSHANTH C

RIZWAN AHMED JAWED

RAHUL S
ABSTRACT

In the realm of economic market trading, predicting charge movements stands

as a paramount difficulty for buyers aiming to reinforce their income. The vast
volume of statistics coupled with complicated interrelations necessitates the
adoption of algorithmic buying and selling and artificial intelligence. Despite
the plethora of techniques hired to forecast inventory market developments, the
results often fall quick of expectations due to the inherent complexity of
inventory markets. Many existing procedures both lack a precise definition of
fashion or forget about the temporal strong point of inventory facts, treating it
comparable to other attributes and using one-length-suits-all fashions to address
this critical time-series hassle. Our enterprise on this paper is to harness the
temporal characteristic of inventory statistics to beautify prediction accuracy.
Initially, as opposed to treating facts uniformly, we hire a time weight
characteristic to meticulously assign weights to statistics based totally on their
proximity in time to the records to be expected. Furthermore, we delineate
inventory trend definitions via drawing upon monetary theories and enterprise
exceptional practices. Time collection forecasting poses a mission due to the
plethora of techniques to be had and the myriad hyperparameters associated
with each method. The TensorFlow library emerges as a flexible device in this
area. As an open-supply library especially designed for forecasting univariate
time collection datasets, TensorFlow simplifies the technique and automates the
look for foremost hyperparameters. It endeavors to craft adept forecasts for
records displaying trends and seasonal patterns via default, thereby providing a
person-pleasant and green answer for time series evaluation in economic
markets.
TABLE OF CONTENTS

Page. No

ABSTRACT vi

LIST OF FIGURES ix

LIST OF ACRONYMS AND ABBREVIATIONS x

1 INTRODUCTION 1
1.1 Introduction 1
1.2 Problem Statement 3
1.3 Aim of the project 4
1.4 Project Domain 4
1.5 Scope of the Project. 4
1.6 Methodology 5

2 LITERATURE REVIEW 6

3 PROJECT DESCRIPTION 11
3.1 Existing System 11
3.2 Proposed System 11
3.2.1 Advantages 12
3.3 Feasibility Study 12
3.3.1 Economic Feasibility 12
3.3.2 Technical Feasibility 13
3.3.3 Social Feasibility 13
3.4 System Specification 13
3.4.1 Hardware Specification 13
3.4.2 Software Specification 14

4 MODULE DESCRIPTION 15
4.1 General Architecture 15
4.2 Design Phase 16
4.2.1 Data Flow Diagram 16
4.2.2 UML Diagram 17
4.2.3 Activity Diagram 18
4.2.4 Sequence Diagram 19
4.3 Module Description 20
4.3.1 Data Visualization 20
4.3.2 Feature Selection 20
4.3.3 Train the Model 21
4.3.4 Testing the model 21
4.3.5 Implementing the model 22

5 IMPLEMENTATION AND TESTING 23

5.1 Input and Output 23
5.1.1 Stock Price Prediction 24
5.1.2 View of Final Prediction 25
5.2 Testing 25
5.2.1 Types of Testing 26
5.2.2 Unit testing 26
5.2.3 Integration testing 26
5.2.4 Validation testing 27
5.2.5 Performance testing 28
5.2.6 Regression testing 28
5.2.7 Test Result 29
5.3 Testing Strategy 30

6 RESULTS AND DISCUSSIONS 31

6.1 Efficiency of the Proposed system 31
6.2 Comparison of Existing and Proposed System 31
6.3 Sample Code 32

7 CONCLUSION AND FUTURE ENHANCEMENTS 41

7.1 Conclusion 41
7.2 Future Enhancements 41

8 SOURCE CODE 43
8.1 Code 43

REFERENCES 48

PLAGARISM REPORT 50

PAPER PUBLICATION PROOF 54

LIST OF FIGURES

Fig. No FIGURES NAMES Pg. No

4.1 Architecture Diagram 15

4.2 Data Flow Diagram 16
4.3 UML Diagram 17
4.4 Activity Diagram 18
4.5 Sequence Diagram 19
5.1 Stock Price Prediction 24
5.2 Final Prediction 25
5.3 Test Image 29
6.1 Output Image 1 40
6.2 Output Image 2 40
LIST OF ACRONYMS AND ABBREVIATIONS

API Application programming interface

AI Artificial intelligence
ARIMA Auto- Regressive Integrated Moving Average
CNN Convolutional Neural network
CV Curriculum Vitae
DL Deep learning
HNN Hybrid Neural Network
KOSPI Korean Composite Stock Price Indexes
KOSDAQ Korean Securities Dealers Automated Quotations
LSTM Long Short-Term Memory
LAN Local Area Network
MAE Mean Absolute Error
MSE Mean Squared Error
NLP Natural Language Processing
NSE National Stock Exchange
PLR Process Level Redundancy
RNN Recurrent Neural Network
Rf Random Forest Algorithm
RMSE Root Mean Squared Error
SPM Software Project Management
TC Trader Company
TPM Trend prediction model
CHAPTER 1
INTRODUCTION

1.1 Introduction

Understanding and predicting inventory prices is a critical vicinity of take a look at,
each in instructional studies and realistic utility, because it reflects the functioning of our
economic and social structures. The stock marketplace's pivotal position inside the worldwide
economic system underscores the importance of comprehending its dynamics. Despite the
inherent complexity and unpredictability of economic pastime, vast efforts have been
committed to elucidating those dynamics. Scholars often conceptualize inventory markets as
tricky, nonlinear, and evolutionary systems, acknowledging their dynamic nature. In current
years, device gaining knowledge of has emerged as a treasured tool for inventory price
prediction, because of its capacity to discern and make the most nonlinear relationships inside
records mechanically. Among the plethora of system gaining knowledge of strategies, the
Trader Company (TC) approach has garnered interest as a promising technique. The TC
approach, akin to an real economic group, accommodates core components: the Trader, liable
for prediction, and the Company, tasked with aggregating predictions. This approach is
designed to house the dynamic nature of the inventory market, supplying each high predictive
accuracy and interpretability. By simulating the roles of buyers within a monetary group, the
TC method not most effective captures the intricacies of market dynamics however
additionally offers insights into the elements influencing stock charge actions. Its capacity to
fuse superior machine getting to know strategies with interpretability makes it a strong device
for traders and analysts looking for to navigate the complexities of the stock marketplace.

Many modern methods of stock forecasting, such as the TC method, mainly focus
on horizontal point estimates. However, the lack of statistical uncertainty that accompanies
these point forecasts raises reliability concerns and security concerns, especially the level of
forecast uncertainty with which the main effects of investment decisions are generated in the
economy market sector where a large portion of trading is automated and algorithmic trading
based machine learning strategies prevail Visibility is critical Algorithmic trading strategies,
which operate in large investor universes and often use data use multiple applications (such as
intraday or tick-to-tick data), requires reliable uncertainty to effective decision making.
Consequently, the main challenge addressed in this paper revolves around budgeting that not

1
only boasts high predictability but also provides robust levels of uncertainty. Forecasting price
movements in financial markets, adhering to the efficient market hypothesis, is inherently
challenging given the nature of stock market prices as random steps and unpredictable changes
it reveals therefore This challenge is further compounded in the case of bitcoin, whose price
fluctuates widely and reveals complex factors. Traders have traditionally relied on two main
tools—technology and fundamental research—to formulate their trading strategies. Based on
price trend analysis and trading volume analysis, technical analysis provides insights into
possible trading signals. In contrast, mainstream research delves into the economic and
economic dimensions of protectionism and examines its sensitivity.

Humans and computers analyze information, but as humans can manage budgets
and make decisions based on experience, algorithmic trading has emerged as a solution to
overcome these challenges, which can be difficult to manage large amounts of data due to
material various effects due to inflation. Algorithmic trading uses preprogrammed computers
with specific mathematical rules. There are two main channels in financial markets: price
prediction and algorithmic trading. Price forecasting focuses on building models to accurately
predict future prices, while algorithmic trading goes beyond forecasting to actively participate
in the market, such as choosing areas to maximize profits and trading volume What
fortunately, getting an accurate forecast doesn’t always mean the most benefit. The total loss
incurred by the trader in unfair practices may be greater than the just loss. Stock price data
exhibit time series characteristics, and the auto-regressive integrated moving average
(ARIMA) method is often used for time series forecasting.

The ARIMA model is a broadly utilized approach in time collection evaluation, in

particular for predicting the conduct of a variable through the years. It treats the data
collection as a stochastic manner and employs a mathematical version to approximate its
behavior. One of its key strengths lies in its capability to capture vital statistical homes of the
statistics. Moreover, ARIMA is quite bendy, bearing in mind the illustration of numerous time
collection through the adjustment of its order parameters. In parallel, leveraging gadget
studying strategies like Long Short-Term Memory (LSTM) gives a wonderful avenue for
forecasting, specially in stock price prediction. LSTM, an extension of Recurrent Neural
Networks (RNN), is famend for its ability to figure and make use of lengthy-term
dependencies within sequential statistics via its memory cell mechanism. Unlike conventional
RNNs, LSTM consists of a gating mechanism that enables it to determine the relevance of
incoming data, thereby mitigating the vanishing gradient problem commonly encountered in
RNNs. By incorporating LSTM, the forecasting version gains the capability to research tricky

2
relationships within time collection facts, probably main to more accurate predictions. A
terrific distinction between LSTM and conventional RNN lies of their dealing with of
temporal information. While RNNs rely upon quick-term reminiscence, recycling past data for
fast use, LSTMs excel in shooting lengthy-term dependencies, thereby improving their
predictive talents. To examine and examine the forecasting performance of these fashions,
empirical analyses are conducted using records from outstanding tech businesses like Google,
Apple, Netflix, and Amazon. By examining the forecasting consequences derived from each
ARIMA and LSTM fashions, insights into their relative efficacy in predicting stock prices
across different temporal horizons can be gleaned.

Economics and finance have long been fertile grounds for studies, drawing hobby
from commercial, governmental, and academic sectors alike. With their complex
dependencies on severa tangible and intangible elements, these fields gift a challenging yet
attractive arena for analysis and prediction. In specific, the volatility of stock markets adds a
further layer of complexity to prediction endeavors. Nevertheless, the ability for excessive
rewards serves as a sturdy motivator for the vast take a look at of these structures. Over the
years, a plethora of works have delved into stock-charge prediction, using various statistical
models and time-series analyses. Given the non-linear nature of stock fees' dependence on
multiple variables, conventional techniques frequently fall brief. To address this venture,
many have turned to large-data-driven device gaining knowledge of strategies. While earlier
tries the usage of techniques like random forests, help vector regression, and shallow neural
networks confirmed evidence-of-idea applicability, latest advancements in deep studying,
including Long Short-Term Memory (LSTM) networks and encoder-decoder structures, show
more promise, specifically due to their capability to address the time-series nature of market
records. In addition to prediction, the layout of optimized portfolios has been a focus of
studies in quantitative and statistical finance. The purpose of an optimal portfolio is to allocate
weights to a set of capital belongings in a manner that maximizes the go back while handling
chance. Markowitz's mean-variance optimization approach, based on asset returns suggest and
covariance matrix, laid a foundational framework. However, this concept has wonderful
obstacles, in particular concerning estimation errors in predicted returns and covariance
matrix.

1.2Problem Statement
Fundamental evaluation involves assessing the intrinsic value of a security by means
of analyzing the underlying factors impacting a organisation's present operations and future
potential. This approach delves into diverse aspects including monetary statements, industry
3
conditions, control excellent, and economic signs to determine whether or not a protection is
hyped up, undervalued, or pretty priced. On the other hand, technical analysis specializes in
reading statistical styles derived from buying and selling records, together with rate actions
and buying and selling volumes. Its goal is to become aware of trends and styles inside the
marketplace conduct to forecast future price moves and pinpoint ability access or go out
factors for trades. Both fundamental and technical evaluation provide awesome methods to
evaluating investment opportunities and are frequently selected primarily based on elements
like market situations, funding horizon, and man or woman options.

1.3 Objective of the Project

This project aims for a more accurate stock prediction system by using a time-sensitive
approach to analyze historical data and a Hybrid Neural Network (HNN) that leverages
financial theories. The HNN extracts patterns from stock prices for reliable, short-term
predictions.

1.4 Project Domain

The project focuses on creating a more unified and objective approach to predicting
stock trends using a specialized neural network. Here's a breakdown of the key aspects:
Unifying Evaluation Objective Methods: This highlights the project's aim to move beyond
subjective or fragmented evaluation methods for stock prediction. It seeks to establish a single,
standardized approach for assessing the effectiveness of prediction models. Stock Prediction:
The core objective lies in developing a system capable of forecasting future stock price
movements. Hybrid Neural Network Algorithm: The project utilizes a unique neural network
architecture (HNN). This HNN combines the strengths of traditional neural networks, known
for their ability to learn complex patterns from data, with insights gleaned from established
financial theories. The goal is to create a model that not only identifies patterns in historical
stock prices but also incorporates financial knowledge to improve prediction accuracy. In
essence, this project seeks to bridge the gap between machine learning techniques and
financial theory to create a more robust and reliable system for predicting stock market trends.

1.5 Scope of the Project

The project focuses on creating a more comprehensive and accurate system for

4
predicting stock market trends. Here's a breakdown of its scope: Time-Varying Data Analysis:
The project moves beyond static analysis of historical stock data. It acknowledges that recent
data holds greater significance for predicting future trends compared to distant historical
information. The scope encompasses developing techniques to assign weights to data points
based on their timeliness, allowing the model to prioritize the most relevant information for
improved prediction accuracy. Hybrid Neural Network (HNN) Development: The project
centers around creating a specialized neural network architecture called the HNN. This
network combines the strengths of traditional neural networks, known for their ability to learn
complex patterns from data, with insights gleaned from established financial theories. While
the specific details of the evaluation method might not be explicitly mentioned in the provided
scope description, the project title suggests a focus on creating a unified approach for
assessing the effectiveness of the HNN model. This likely involves establishing clear metrics
and benchmarks to gauge the model's prediction accuracy and potentially comparing its
performance against other existing stock prediction methods.

1.6 Methodology

The project "Unifying Evaluation Objective Methods for Stock Prediction using
Hybrid Neural Network Algorithm" likely involves collecting historical stock price data,
preprocessing it, and developing a specialized Hybrid Neural Network (HNN) architecture.
The methodology includes incorporating time-varying importance into the analysis, training
the HNN on preprocessed data, and evaluating its performance using standardized metrics.
Model refinement based on evaluation results is also anticipated. These educated guesses
outline a framework for the project's approach to stock prediction using a time-sensitive
methodology and specialized HNN.

5
CHAPTER 2
LITERATURE REVIEW

There are many papers published on Unifying Evaluation Objective Methods for Stock
Prediction using Hybrid Neural Network Algorithm each using their own algorithms. A few of
those papers are mentioned below.

The paper “How to Handle Data Imbalance and Feature Selection Problems in CNN-
Based Stock Price Forecasting”, 2022 by authors Zinnet Duygu Akehir and Erdal Kili,
Forecasting inventory market movements is a hard task due to the inherent uncertainty and the
multitude of influencing factors. Traditional time series strategies regularly conflict to achieve
correct predictions in this complicated environment. In latest literature, Convolutional Neural
Networks (CNNs) have emerged as a promising method for inventory marketplace
forecasting, demonstrating wonderful fulfillment. However, issues inclusive of information
imbalance stemming from labeling discrepancies and challenges in function selection were
determined in using these models. To address those shortcomings, this have a look at
introduces a singular rule-based totally labeling set of rules and an revolutionary characteristic
choice technique. The proposed labeling algorithm targets to mitigate data imbalances by
using providing an improved framework for assigning labels to inventory market information.
Simultaneously, the novel characteristic choice method seeks to decorate model overall
performance with the aid of figuring out the maximum relevant input variables. Leveraging
these improvements, a CNN-based totally model is built to expect tomorrow's exchange action
for stocks within the Dow30 index. Multiple units of image-based input variables are
generated, incorporating technical signs, gold, and oil rate data, to feed into the CNN version.
Comparative evaluation of prediction overall performance is conducted against existing
research within the literature. The experimental findings demonstrate that the CNN prediction
model, leveraging the proposed feature choice and labeling approaches, achieves a remarkable
development in accuracy, starting from 3% to 22%, in comparison to preceding CNN-
primarily based fashions. Moreover, the effectiveness of the proposed labeling technique
surpasses that of conventional records weighting methods, as established by means of
comparisons with Chen and Huang's approach. Overall, these findings underscore the
significance of innovative strategies in addressing data imbalance and feature selection
challenges in stock market forecasting, thus advancing the efficacy of CNN-based models in
this domain.
6
The paper “Novel Stock Crisis Prediction Technique-A Study on Indian Stock
Market”, 2021 by Nagaraj Naik and Biju R. Mohan, Predicting inventory fees has emerge as a
focus of research, and traditional techniques frequently rely upon statistical and econometric
fashions. Yet, these fashions face demanding situations in coping with nonstationary time
collection information efficiently. With the net's speedy evolution and the surge in social
media utilization, online information and comments function indicators of investor sentiments
and attitudes closer to shares, imparting treasured insights for inventory charge prediction.
This paper seeks to introduce a novel technique leveraging deep studying strategies,
amalgamating conventional financial index variables with social media text features to
beautify prediction accuracy.

The paper “A stock price prediction method based on deep learning technology”, 2021
by authors Xuan Ji , Jiachen Wang and Zhijun Yan, An innovative technique to personnel
recruitment selection involves leveraging a probabilistic computerized advice approach. This
technique contains numerous key components geared toward facilitating green matching
among candidates and task necessities. By harnessing the energy of automation, this technique
seeks to streamline the recruitment technique and optimize personnel selection. Notably, the
absence of computerized structures in medical establishments and hospitals has spurred the
development of automation inside the healthcare sector. This shift underscores the pressing
want for technological solutions to address staffing demanding situations in important sectors.
One such era, Natural Language Processing (NLP), has emerged as a transformative device
reaping benefits society at large. NLP's capacity to handle text-based totally records alleviates
the manual burden related to processing enormous amounts of textual data. Consequently, its
application in recruitment tactics offers great capacity for reinforcing performance and
accuracy even as riding fine outcomes for each employers and applicants. Automation in
resume classification is a game-changer for several reasons. First and foremost, it saves time
and resources.

The paper “Automated Resume Screening System”, 2020 by authors Frank Färber,
Utilizing a Vector Space Model, we can efficiently pair each CV with its corresponding
activity description. This approach entails using a vectorization version coupled with cosine
similarity to gauge the relevance among candidate profiles and job necessities. By computing
ranking scores thru this technique, we can become aware of the maximum suitable candidates
for the given job role. Sure thing! An Automated Resume Screening System is like a digital

7
gatekeeper for job applications. Its main job is to sift through the mountains of resumes that
flood in when a job is posted. This system uses algorithms and predefined criteria set by
recruiters or hiring managers to quickly categorize and filter resumes. The system typically
analyzes resumes based on keywords, skills, education, work experience, and other relevant
factors. It helps streamline the hiring process by narrowing down the pool of applicants to
those who closely match the job requirements. Think of it like a smart assistant for recruiters,
saving them time and effort by automating the initial screening process. It's like having a
helpful assistant that does the heavy lifting in the early stages of recruitment.

The paper “System for screening candidates for recruitment”, 2020 by authors Momin
Adnan, To search for the resumes that are closest to the specified description of the job, the
model used cosine similarity, CNN and content-based Recommendation. Resume
classification systems are designed to streamline the recruitment process by automatically
sorting and categorizing resumes based on predefined criteria. These systems often use a
combination of natural language processing (NLP) and machine learning algorithms to analyze
the content of resumes and extract relevant information. The system scans resumes for specific
keywords and phrases relevant to the job requirements. This helps filter out resumes that don't
contain the essential skills or qualifications. NLP algorithms are employed to identify and
extract key skills, qualifications, and experiences mentioned in the resumes. This allows the
system to assess whether candidates possess the necessary background for the position.

The paper “Stock Trend Prediction Using Candlestick Charting and Ensemble Machine
Learning Techniques With a Novelty Feature Engineering Scheme”, 2021 by authors Yaohu
Lin , Shancun Liu , Haijun Yang and Harris Wu, This system sorts all resumes according to
the company's requirements and sends them to the HR for further consideration. The required
resume is chosen from a pool of applicants, and the others are discarded. Resume sorting is a
crucial step in the resume classification process. It involves the use of automated systems to
analyze and categorize resumes based on predefined criteria. This process helps recruiters and
hiring managers efficiently manage large volumes of resumes and identify the most relevant
candidates for a particular job. Automated systems extract relevant information from resumes,
such as education history, work experience, skills, and contact details. The system compares
the extracted information with predefined keywords or criteria set by the employer. For
example, if a job requires specific skills or qualifications, the system looks for those keywords
in the resume.

8
The paper “Integrated Long-Term Stock Selection Models Based on Feature Selection
and Machine Learning Algorithms for China Stock Market”, 2020 by authors Xianghui Yuan ,
Jin Yuan , Tianzhao Jiang and Qurat Ul Ai, Which helps the recruiters in selecting the resumes
based on job-description in a short duration of time. It helps in an easy and efficient hiring
process by extracting the requirements automatically. It showcases your skills in developing
systematic approaches to solving problems or handling tasks within a professional or technical
context. In the context of resume classification, mentioning your proficiency in the
formulation of systems implies that you are adept at creating and implementing efficient
systems, methodologies, or frameworks relevant to your field of expertise.

The paper “Global Stock Market Prediction Based on Stock Chart Images Using Deep
Q-Network”, 2019 by authors Jinho Lee , Raehyun Kim , Yookyung Koh and Jaewoo Kang,
In Research shows that country-specific stock charts have the potential to generate returns not
only in that country but also in global markets Our findings suggest that a model raised in the
US. market alone showed strong performance that equaled or exceeded results in many other
markets over a 12 -year test period. This means that machine learning and artificial
intelligence approaches to currency price forecasting, which are typically only applied in
single-country studies, can be successfully applied globally and crucially, as a modeling
framework , the inputs and training methods are complex.

The paper “Stock Volatility Prediction by Hybrid Neural Network”, 2019 by authors
yujie Wang , Hui Liu , Qiang Guo , Shenxiang Xie and Xiaofeng Zhang, This paper offers a
singular technique that integrates sophisticated textual features with fundamental stock
records. Unlike traditional strategies that solely focus on volatility trends, our approach offers
a more complete extraction of stock capabilities. As a result, HTPNN demonstrates advanced
overall performance by means of correctly balancing prediction accuracy and computational
performance in forecasting stock volatility.

The paper “A Dual-Attention-Based Stock Price Trend Prediction Model With Dual
Features”, 2019 by authors Yingxuan Chen , Weiwei Lin and James Z. Wang, In order to
overcome the limitation of traditional methods of extracting relevant factors for analyzing
economic time series data, we propose a new method called Two Phase Trend Prediction
9
Model (TPM) This model uses factors two uses, two concepts designed for investment banks.
Initially, in the data preprocessing step we use the PLR method and CNN to extract two
factors, capture long-term trends and short-term market trends from historical data and then in
the time series modeling phase we apply a new system including short-term features to coder
and long-term decoder. We combine cognitive techniques for encoder and decoder
components, enabling adaptive selection and combining the most appropriate feature
dimensions across time Finally, TPM exhibits high accuracy in predicting trend slopes and
time length of the. The experimental results confirm the effectiveness of our method, showing
a remarkable reduction of 13 in RMSE.

10
CHAPTER 3

PROJECT DESCRIPTION

3.1 Existing System

This study uses machine learning algorithms to examine the relationship between
the Korea Composite Group Price Index (KOSPI), a national statistical index administered by
the Korean government, and the analysis of Korean stock market sales analysis examines data
that from the 1,470 companies listed in KOSPI and KOSDAQ and 20-year history This ranges
from 2000 to 2021 with physical indices It uses various machine learning algorithms such as
random forest, gradient growth, extreme gradient enhancement, adaptive enhancement, and
categorical enhancement on the spanned [Link] findings show that various changes in
national accounting indicators affect corporate sales in different sectors. For example,
industrial accidents greatly affect manufacturing, finance, and insurance, while other sectors
are affected by factors such as the price of gold, the number of automobiles produced, and
foreign exchange reserves. Consequently, the study suggests the use of national statistical
indicators to develop management strategies based on machine learning techniques. It
attempts to understand the impact of these indices on the company’s sales and determine the
best machine learning algorithms for analysis. By scrutinizing various indices and the
company’s sales data, the study identifies key variables affecting sales performance and
measures algorithm performance. Notably, gradient-boost appears as the best algorithm for all
tasks, with particular tasks favoring different algorithms. The study emphasizes the
importance of industry-specific variation of the Impact of macroeconomic indicators on
enterprise income. In particular, it finds the industrial twist of fate charge to be a giant
predictor of sales performance across various sectors, offering empirical insights into the
limitations of preceding studies.

3.2 Proposed System

In the proposed system, the fundamental principle of technical analysis in the

stock market is that prices move in trends, often depicted as zigzags rather than linear
movements. Peaks and troughs tend to occur in pairs, with temporary reversals before
resuming the original direction. The system focuses on leveraging the intrinsic time nature of
time-series stock data through three main aspects: recognizing the varying importance of data
over time, constructing portfolios using an HNN model for future price and return predictions,
11
and developing a decision-making model to maximize profits through buy/sell predictions.
Parameter adjustment is crucial for training neural networks, with optimal parameters
determined by comparing different settings. History points, essential for modeling with
recurrent neural networks, are determined by analyzing correlations between past and future
prices to gauge memory requirements.

3.2.1 Advantages
• Achieve a well-balanced tradeoff among various parameters.

• Minimizes the workload on infrastructures.

• Low Deployment Cost.

• Determine the significance of features or feature subsets.

• Quick Calculation Time.

• Can improve the worst-case performance.

3.3 Feasibility Study

The project "Unifying Evaluation Objective Methods for Stock Prediction using
Hybrid Neural Network Algorithm" presents both promising aspects and challenges to
feasibility. Promising elements include acknowledging the dynamic nature of market data
through time-varying importance, leveraging neural networks and financial theories for a more
robust model, and establishing a unified evaluation method for objective assessment.
However, challenges such as the complexity of stock market dynamics, neural network design,
and data availability and quality must be addressed for successful implementation. Overall
feasibility relies on effectively navigating these challenges while capitalizing on the project's
promising features to enhance stock prediction accuracy.

3.1.2 Economic Feasibility

The economic feasibility of the project "Unifying Evaluation Objective Methods for
Stock Prediction using Hybrid Neural Network Algorithm" involves weighing costs against
potential benefits. Costs include data acquisition expenses, computational resources for
training the HNN, and ongoing development and maintenance efforts. Benefits could include
improved investment decisions, reduced risk through trend insights, and potential for trading
automation. However, economic viability depends on the HNN's ability to generate returns
12
that outweigh costs. Factors like prediction accuracy, market efficiency, and transaction costs
must be carefully considered. While promising, the project may be most beneficial for
institutional investors or niche applications where inefficiencies can be exploited.

3.3.2 Technical Feasibility

The project "Unifying Evaluation Objective Methods for Stock Prediction using Hybrid
Neural Network Algorithm" is technically feasible based on several factors: Feasible
Techniques: Established methods like time-varying importance weighting and hybrid neural
networks are well-suited for stock prediction tasks. Abundant neural network libraries and
readily available historical stock data facilitate HNN development. Challenges: Data quality,
HNN architecture design, and market unpredictability are key challenges that needs
addressed.

3.3.3 Social Feasibility

The Social feasibility in this project, it examines the mission's impact on stakeholders
and society at massive. It considers elements inclusive of reputation via cease-users, ability
process introduction or displacement, and the broader societal implications of adopting
computerized power prediction structures. Engaging with stakeholders and addressing any
concerns regarding privateness, equity, and accessibility are crucial factors of ensuring social
acceptance and help for the undertaking.

3.4 System Specifications

Generally, operating systems and software applications specify a set of
requirements for physical computing resources, known as hardware. These
requirements are often listed along with the hardware compatibility list, especially for
operating systems. Minimum hardware specifications are usually,

3.4.1 Hardware Specification

 Processor: Minimum i3 Dual Core

 Ethernet connection (LAN) OR a wireless adapter (Wi-Fi)
 Hard Drive: Minimum 100 GB; Recommended 200 GB or more

13
 Memory (RAM): Minimum 8 GB; Recommended 32 GB or above

3.4.2 Software Specification

• Python
• Anaconda
• Jupyter Notebook
• TensorFlow

14
CHAPTER 4
MODULE DESCRIPTION

4.1 General Architecture

Figure 4.1 Architecture Diagram

Figure 4.1 represents the architecture diagram of the project. The architecture
diagram depicts a typical machine learning process for stock price prediction using a neural
network. The data is collected, preprocessed, and then split into training and testing sets. The
training set is used to train the HNN model, and the testing set is used to evaluate the model's
performance.

15
4.2 Design Phase

4.2.1 Data Flow Diagram

Figure 4.2 Data flow diagram

Figure 4.2 likely depicts the project's data flow, illustrating the journey of stock price data
within the system. It would showcase various stages of data processing, likely ranging from data
collection and cleaning to feature extraction, model training, evaluation, and finally, prediction
generation. The core of the diagram would be the Hybrid Neural Network (HNN) ensemble
technique. This would show how stock price data is fed into the ensemble model, how predictions
are generated, and how they are ultimately used to reduce the human workload in stock price
prediction tasks. Overall, Figure 4.2 would serve as a visual representation of the data pipeline and
workflow for stock price prediction, emphasizing the key components and interactions involved in
the system.

16
4.2.2 UML Diagram
Class Diagram

Figure 4.3 Class Diagram

Figure 4.3 represents the Class Diagram of the project. The class diagram of this
project illustrates likely depicts the numerous instructions and their relationships within the
device. This diagram might outline the key entities and their attributes, alongside the
techniques they employ to achieve the venture's goals. Given the nature of the assignment,
instructions might consist of additives such as data preprocessing modules, various
predictive models (including neural networks), optimization algorithms, and evaluation
metrics. Relationships between these components would illustrate how they interact and
collaborate to predict stock prices successfully. The diagram would possibly comprise
inheritance, aggregation, and association to represent the hierarchical structure and
dependencies between different components of the system, such as feature extraction
modules, model training components, and result evaluation methods. Overall, the class
diagram serves as a blueprint for developers, providing a visual representation of the
project's architecture and assisting in the implementation of the proposed methodology for
enhancing stock price prediction accuracy.

17
4.2.3 Activity Diagram

Figure 4.4 Activity Diagram

Figure 4.4 represents the Activity diagram of the project. It represents an employed to
visualize the sequence of moves or methods worried in stock prediction. This diagram detail
the steps taken through the deep learning ensemble version, from statistics preprocessing to
model schooling and assessment. It may illustrate statistics collection, characteristic
engineering, model choice, and performance evaluation. By depicting those activities, enables
stakeholders apprehend the go with the flow of operations inside the prediction gadget,
facilitating, evaluation, and potential optimizations inside the method.

18
4.2.4 Sequence Diagram

Figure 4.5 Sequence diagram

Figure 4.5 represents the sequence diagram. The sequence diagram might depict the
flow of interactions and messages between the distinct components involved in the stock price
prediction process. It could illustrate how data is collected from various sources such as
financial markets or news feeds, how it's preprocessed to extract relevant features, how these
features are utilized by the predictive models, and how predictions are generated.
Additionally, it can show interactions with external systems or databases for retrieving
historical stock data or validating predictions. The diagram serves as a visual representation of
the system's behavior, aiding stakeholders in understanding the process from a high-level
perspective.

19
4.3 Module Description
4.3.1 Module 1: DATA VISUALIZATION
Introduction to Data Visualization:

Data visualization involves presenting raw data through graphical representations, enabling a
more intuitive understanding of complex datasets. Its primary purpose is to explore the data
and uncover deep insights that may not be immediately apparent from the raw data alone.

Exploratory Data Analysis:

Using visualization, we can conduct exploratory data analysis to identify patterns, trends,
correlations, and anomalies within the dataset. This process helps in gaining a deeper
understanding of the data and its underlying structures.

Summary and Insight Generation:

Visualizations provide a concise summary of the dataset, offering an overview of its

characteristics and implications. They help in revealing insights and telling the story behind
the data, facilitating a deeper understanding of its implications and potential applications.

Detection of Data Issues:

Data visualization aids in identifying areas of the dataset that require attention and
improvement. It helps in detecting data types, missing or duplicated values, and outliers,
which are crucial for refining the dataset and ensuring its quality.

Enhancing Model Performance:

By addressing data issues identified through visualization, we can enhance model

performance and prediction accuracy. Visual exploration helps in refining the dataset,
leading to better model outcomes and more reliable predictions.

4.3.2 Module 2: Feature Selection and Feature Extraction

Feature selection and feature extraction are dimensionality reduction techniques in

machine learning, which are important steps to reduce model complexity and overfitting
Dimensionality reduction greatly affects model training. Feature selection is choosing a subset
of features from the initial set, aiming to simplify model complexity, improve computational
efficiency, and reduce generalization errors due to irrelevant features On the other hand,
20
feature extraction is finding features new from using original features, increasing model
efficiency, Overfitting is resisted and normalization errors are reducedFeature selection and
feature extraction are dimensionality reduction techniques in machine learning, which are
important steps to reduce model complexity and overfitting Dimensionality reduction greatly
affects model training.

4.3.3 Module 3: Train the Model

Creating a department among schooling and checking out sets within your dataset serves
as a important method to rapidly examine the efficacy of an set of rules to your unique
assignment. The training set is applied to assemble and refine the model, basically serving as a
simulation ground for the set of rules. Conversely, the check set is treated as novel statistics,
holding returned the actual output values from the algorithm. By using the trained version at the
check set inputs and evaluating its predictions towards the withheld outputs, we derive a overall
performance metric, gauging the model's effectiveness on unseen statistics. This process yields
an approximation of the algorithm's functionality whilst making predictions on unfamiliar
instances. A final new release of the machine getting to know model represents the model
deemed suitable for predicting outcomes on new information. To facilitate model training, get
entry to to the dataset, in conjunction with several utility functions, is fundamental. The
schooling phase involves a couple of iterations or passes thru the dataset, throughout which the
version's parameters are initialized randomly and step by step delicate.

4.3.4 Module 4: Testing the Model

Ensemble mastering amalgamates more than one character models to beautify

generalization overall performance. Presently, deep gaining knowledge of models with
multilayer processing structure reveal advanced performance compared to shallow or
conventional classification models. Deep ensemble mastering models amalgamate the strengths
of deep mastering models and ensemble mastering, resulting in advanced generalization overall
performance. The procedure of making a educate-check split includes hastily assessing set of
rules performance. The training dataset is utilized to prepare and educate the version, while the
take a look at dataset acts as new statistics with withheld output values. Predictions from the
educated version on the test dataset are then as compared to the withheld output values, enabling
computation of the version's overall performance measure. This estimate displays the set of
rules's skill in making predictions on unseen facts. A final gadget getting to know model is one
applied for making predictions on new statistics. To train a model we need access to the data,
several utility functions, and we need multiple iterations / passes through the dataset. The
21
training process will effectively use both functions repeatedly: Initially, the parameters of the
model are randomly instantiated. Next, the score of the model is checked. If the score is deemed
insufficient (often because it has improved compared to the previous iteration), the model
parameters are updated and the process is repeated.

4.3.5 Implementing the Model

 Integrates the trained model right into a production surroundings or application

infrastructure.
 Develops APIs or provider endpoints for interacting with the model for prediction
requests.
 Designs facts pipelines for eating new statistics and generating real-time or batch
predictions.
 Deploys the version on scalable and reliable computing infrastructure, thinking about
factors like latency, throughput, and useful resource utilization.
 Implements tracking and logging functionalities to track version performance, usage
metrics, and machine health.
 Ensures compatibility with present system.
 Provides documentation and support resources for developers and quit-customers to
facilitate adoption and utilization.

22
CHAPTER 5

IMPLEMENTATION AND TESTING

5.1 Input and Output

In both the existing and proposed systems, the process of implementation and
testing involves similar foundational steps, yet varies in specific details based on chosen
algorithms and data characteristics. The existing system necessitates substantial historical data
acquisition, including annual sales data sourced from financial databases or the Korea Listed
Companies Association, alongside national statistical indicators from sources like Statistics
Korea e-Nara Index and Bank of Korea Economic Statistical System. Data preprocessing
involves standardizing formats, handling missing values, and ensuring consistency across
companies and timeframes. Outputs entail feature importance analysis using machine learning
algorithms to discern national statistical indicators' influence on sales and performance
evaluation through metrics like MAE, MSE, and RMSE. Conversely, the proposed system
mandates historical stock price data acquisition, covering daily or weekly closing prices for
several years, and preprocessing akin to the existing system. Model training involves defining
the architecture and hyperparameters of a Hybrid Neural Network model and feeding
preprocessed data for training. Outputs encompass predicted stock prices and returns for
specified time horizons, alongside model evaluation using metrics like Sharpe Ratio and
Sortino Ratio, comparing predictions with actual market outcomes. This comprehensive
analysis of inputs and outputs is pivotal for effective model implementation and testing.

23
5.1.1 Stock Price Prediction

Figure 5.1: Stock Price Prediction

Figure 5.1 likely shows a comparison between a price prediction model's output and
the actual market price. One line would represent the predicted prices over time, while the other
line shows the real Bitcoin prices. Ideally, the predicted price line would closely follow the
actual price line, with minimal deviations. Significant and consistent gaps between the lines
would indicate that the model is not accurately capturing the price movements. This analysis
helps assess the model's effectiveness in forecasting prices.

24
5.1.2 View of Final Prediction

Figure 5.2: Final Prediction

Figure 5.2 likely shows a comparison between a Bitcoin price prediction model's output
and the actual market price. One line would represent the predicted prices over time, while the
other line shows the real Bitcoin prices. Ideally, the predicted price line would closely follow the
actual price line, with minimal deviations. Significant and consistent gaps between the lines would
indicate that the model is not accurately capturing the price movements. This analysis helps assess
the model's effectiveness in forecasting Bitcoin prices.

5.2 Testing
Testing involves systematically validating the functionality and performance of the
predictive models developed for stock price prediction. It ensures that the models produce accurate
forecasts and behave as expected under various conditions. For the stock price prediction project
described, testing is a critical phase to ensure the accuracy and reliability of the predictive models.

25
5.2.1 Types of testing
 Unit testing

 Validation Testing
 Integration testing

 Performance Testing

 Regression Testing

5.2.2 Unit testing

This involves testing individual components or units of the software independently. In the
context of the stock price prediction project, unit testing would focus on testing specific functions
or modules responsible for data preprocessing, model training, and evaluation.

INPUT:
import pytest

from your_application_file import cleanResume

def test_clean_resume_removes_urls(): input_text = "This is a test URL

[Link] expected_output "This is a test URL: assert
cleanResume(input_text) expected_output

def test Lean_resume_removes_punctuations(): input xt "Hello, world! This is a test..

expected_output "Hello world This is a test assert cleanResume(input_text)
expected_output

Test result:
 Testing data preprocessing functions to ensure they handle missing data,
outliers, and transformations accurately.
 Testing model training functions to verify that the models are trained with the
correct hyperparameters and architecture.
 Testing evaluation metrics functions to ensure they calculate performance
metrics such as Mean Absolute Error and Sharpe Ratio accurately.
5.2.3 Integration testing
Integration testing ensures that individual components of a system work
together as intended, detecting any interface issues. Integration testing focus on testing
specific functions or modules responsible for data preprocessing, model training, and
evaluation.

26
INPUT:
import pytest

from your_application_file import cleanResume

from your_application_file import resumeDataSet

def test_clean_resume_integration(): sample_resume resumeDatapet['Resume'][0]

cleaned_resume cleanResume (sample_resume) assert "http" not in cleaned_resume assert
"!" not in cleaned_resume

5.2.4 Validation testing

Validation testing for a stock prediction system employing a hybrid neural

network algorithm involves assessing its accuracy and reliability in forecasting stock
prices. Here's a general outline of how you could approach validation testing,
including inputs and outputs:

INPUT:
import pandas as pd
from sklearn.model_selection import train_test_split
from [Link]
ing import MinMaxScaler
# Load historical stock datastock_data = pd.read_csv('historical_stock_data.csv')
# Extract features
features = stock_data[['Volume', 'Price_Trend', 'Moving_Average',
'Technical_Indicator_1', 'Technical_Indicator_2']]
# Normalize feature data
scaler = MinMaxScaler()

27
scaled_features = scaler.fit_transform(features)
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(scaled_features,
stock_data['Next_Day_Price'], test_size=0.2, random_state=42)

5.2.5 Performance Testing

Performance Testing is a type of software testing that ensures software applications
perform properly under their expected workload. It is a testing technique carried out to
determine system performance in terms of sensitivity, reactivity, and stability under a
particular workload.
INPUT:
import time
# Measure prediction time
start_time = [Link]()
predictions = [Link](X_test)
end_time = [Link]()
prediction_time = end_time - start_time
print("Prediction Time:", prediction_time, "seconds")

Test result:
 It involves assessing the collaboration of different components to ensure seamless
functionality.
 The results indicated a successful integration, as the modules responsible for parsing,
feature extraction, and classification worked cohesively.
 Overall, the integration testing phase instilled confidence in the project's ability to
handle the intricacies of resume data processing.

5.2.6 Regression Testing
Regression Testing is defined as a type of software testing to confirm that a recent
program or code change has not adversely affected existing features. We can also say it is
nothing but a full or partial selection of already executed test cases that are re-executed to
ensure existing functionalities work fine.

INPUT:
28
# Retrain the model with the same data and [Link](X_train, y_train)
# Make predictions
predictions = [Link](X_test)
# Compare with previous MSE
mse_regression = mean_squared_error(y_test, predictions)
print("Regression Testing MSE:", mse_regression)
Test result:
 It involves assessing the collaboration of different components to ensure seamless
functionality.
 The results indicated a successful integration, as the modules responsible for parsing,
feature extraction, and classification worked cohesively.
 Overall, the integration testing phase instilled confidence in the project's ability to
handle the intricacies of resume data processing.

5.2.7 Test Result

Figure 5.3: Test Image

29
Figure 5.3 likely shows a test image of stock price prediction model's output and
the actual market price. One line would represent the predicted prices over time, while
the other line shows the real prices. Ideally, the predicted price line would closely
follow the actual price line, with minimal deviations. Significant and consistent gaps
between the lines would indicate that the model is not accurately capturing the price
movements. This analysis helps assess the model's effectiveness in forecasting prices.

5.3 Testing Strategy

• Unit testing: Unit testing verifies the bits of code to check the viability of the code.
• Integration testing: Integration testing is carried out to the efficiency of the model
with functional requirements.
• Validation Testing: Validation testing is carried out to the efficiency of the model
with functional requirements.
• Regression testing: The regression testing is done to verify the output with
the provided input against the functional requirements.

30
CHAPTER 6
RESULTS AND DISCUSSIONS

6.1 Efficiency of the proposed system

Improved Accuracy: Hybrid neural networks can combine the strengths of different
neural network architectures, potentially leading to more accurate predictions compared to
individual models. This translates to making better use of computational resources. Reduced
Training Time: Certain hybrid approaches might achieve similar accuracy with less training data
or shorter training times compared to standalone models. Efficiency Considerations:
Computational Complexity: Hybrid models can be computationally expensive to train, especially
if they involve deep learning architectures. This could impact efficiency on machines with
limited resources. Feature Engineering: Selecting and engineering relevant features for the model
can be a time-consuming process. The efficiency of the system depends on how well this is
addressed. Overall, the efficiency of the system hinges on the specific design choices and
implementation.

6.2 Comparison of Existing and Proposed System

The existing system analyzes how national statistical indicators affect company sales
across different industries. It uses machine learning to identify the most relevant indicators for
each industry (e.g., industrial accident rate for manufacturing). The proposed system focuses on
predicting stock prices and portfolio returns using a hybrid neural network (HNN) model. It
considers the time-series nature of stock data and uses historical price data to predict future
trends. The proposed system also incorporates techniques to optimize the HNN model for better
accuracy.

31
6.3 Sample Code

Get required Columns

32
33
LMS

34
LSTM

35
36
Epoch Printing Callback

LSTM Algorithm

37
38
Get Predictions From Model

39
40
Output

Figure 6.1: Output Image 1

Figure 6.2: Output Image 2

41
CHAPTER 7

CONCLUSION AND FUTURE ENHANCEMENTS

7.1 Conclusion

Our proposed model utilizes a time-series neural network with time-varying

importance functions to predict stock trends. This approach leverages the power of neural
networks for time-series data analysis and incorporates insights from classic financial theories.
By following time series theory, we formally defined the form of the trend and validated the
hypothesis that more recent data holds greater influence on prediction accuracy. The results
suggest a quasi-linear relationship between the importance of data and its position in the time
series. The Hybrid Neural Network (HNN) employed in this model offers advantages in terms
of simplicity and reliability. This stems from the inherent characteristics of historical stock
price data. Each price change reflects a culmination of various factors, and the entire historical
time series embodies the underlying foundation for future growth or decline. HNN essentially
unveils these internal patterns within the price data. Because the model reflects these inherent
relationships, it demonstrates feasibility and effectiveness in predicting future price
movements.

However, it's important to acknowledge the inherent complexities involved in stock

price prediction. Factors like political climate, economic conditions, investor psychology, and
company management all influence price movements, and external economic forces can
introduce significant randomness and even abrupt changes. These factors contribute to the
difficulty of predicting stock prices with absolute certainty. Despite these challenges, the HNN
model can provide valuable insights for stock price changes, particularly for short-term
predictions.

7.2 Future Enhancements

Building on the project's foundation of a unified evaluation method and a hybrid

neural network for stock prediction, future enhancements can focus on two key areas: Time-
Varying Importance of Data: The project currently analyzes data, but future work can delve
deeper into how the importance of that data changes over time. This is crucial for time-series
data like stock prices, where recent trends may hold more weight than historical data. By
incorporating time-varying importance, the model can adapt and potentially improve its
prediction accuracy. Algorithmic Trading Integration: The project focuses on prediction. To

42
create a more practical system, future work can integrate the prediction model with trading
strategies. This would involve developing algorithms that translate predictions into buy/sell
decisions, taking into account factors like risk management and portfolio allocation.
Combining these elements would lead to a complete algorithmic trading system that
leverages the project's prediction capabilities for real-world application. These enhancements
hold promise for building a more robust and practical stock prediction system.

43
CHAPTER 8
SOURCE CODE

8.1 Code

# In[1]:

import numpy as np
import pandas as pd
import [Link] as plt import
warnings
[Link]('ignore')
from sklearn.naive_bayes import MultinomialNB from
[Link] import OneVsRestClassifier from sklearn
import metrics
from [Link] import accuracy_score from
[Link] import scatter_matrix
from [Link] import KNeighborsClassifier from
sklearn import metrics

resumeDataSet = pd.read_csv('[Link]')
resumeDataSet['cleaned_resume'] = ''
[Link]()

# In[2]:

print ("Displaying the distinct categories of resume -") print

(resumeDataSet['Category'].unique())

# In[3]:

print ("Displaying the distinct categories of resume and the number of records
belonging to each category -")

print (resumeDataSet['Category'].value_counts())

44
# In[4]:

import seaborn as sns

[Link](figsize=(15,15))
[Link](rotation=90)

[Link](y="Category", data=resumeDataSet)

# In[5]:

from [Link] import GridSpec

targetCounts = resumeDataSet['Category'].value_counts()
targetLabels = resumeDataSet['Category'].unique()
# Make square figures and axes
[Link](1, figsize=(25,25))
the_grid = GridSpec(2, 2)

cmap = plt.get_cmap('coolwarm')

colors = [cmap(i) for i in [Link](0, 1, 3)]

[Link](the_grid[0, 1], aspect=1, title='CATEGORY DISTRIBUTION')

source_pie = [Link](targetCounts, labels=targetLabels, autopct='%1.1f%%',

shadow=True, colors=colors)

[Link]()

# In[6]:

import re

def cleanResume(resumeText):

resumeText = [Link]('http\S+\s*', ' ', resumeText) # remove URLs resumeText =

[Link]('RT|cc', ' ', resumeText) # remove RT and c

resumeText = [Link]('#\S+', '', resumeText) #

remove hashtags resumeText = [Link]('@\S+', ' ',

resumeText) # remove mentions

resumeText = [Link]('[%s]' % [Link]("""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""), ' ',

resumeText) # remove punctuations

resumeText = [Link](r'[^\x00-\x7f]',r' ', resumeText)

45
resumeText = [Link]('\s+', ' ', resumeText) # remove extra whitespace return

resumeText
# In[7]:

for company in company_list:

company['Daily Return'] = company['Adj Close'].pct_change()

print (resumeDataSet['cleaned_resume'][31])

# In[8]:

fig, axes = [Link](nrows=2, ncols=2)

fig.set_figheight(10)
fig.set_figwidth(15)

AAPL['Daily Return'].plot(ax=axes[0,0], legend=True, linestyle='--', marker='o')

axes[0,0].set_title('APPLE')

GOOG['Daily Return'].plot(ax=axes[0,1], legend=True, linestyle='--', marker='o')

axes[0,1].set_title('GOOGLE')

MSFT['Daily Return'].plot(ax=axes[1,0], legend=True, linestyle='--', marker='o')

axes[1,0].set_title('MICROSOFT')

AMZN['Daily Return'].plot(ax=axes[1,1], legend=True, linestyle='--', marker='o')

axes[1,1].set_title('AMAZON')

fig.tight_layout()

[Link](figsize=(12, 9))

for i, company in enumerate(company_list, 1):

[Link](2, 2, i)
company['Daily Return'].hist(bins=50)
[Link]('Daily Return')
[Link]('Counts')
[Link](f'{company_name[i - 1]}')

plt.tight_layout()
closing_df = pdr.get_data_yahoo(tech_list, start=start, end=end)['Adj Close']
[Link](x='GOOG', y='GOOG', data=tech_rets, kind='scatter', color='seagreen')
[Link](x='GOOG', y='MSFT', data=tech_rets, kind='scatter')
[Link](tech_rets, kind='reg')

46
# In[9]:

[Link](figsize=(12, 10))

[Link](2, 2, 2)

[Link](closing_df.corr(), annot=True, cmap='summer')

[Link]('Correlation of stock closing price')

df = pdr.get_data_yahoo('AAPL', start='2012-01-01', end=[Link]())

[Link](figsize=(16,6))

[Link]('Close Price History')

[Link](df['Close'])

[Link]('Date', fontsize=18)

[Link]('Close Price USD ($)', fontsize=18)

[Link]()

data = [Link](['Close'])

dataset = [Link]

training_data_len = int([Link]( len(dataset) * .95 ))

training_data_len

scaler = MinMaxScaler(feature_range=(0,1))

scaled_data = scaler.fit_transform(dataset)

scaled_data

train_data = scaled_data[0:int(training_data_len), :]

x_train = []

y_train = []

# In[10]:

for i in range(60, len(train_data)):

x_train.append(train_data[i-60:i, 0])

y_train.append(train_data[i, 0])
47
if i<= 61:

print(x_train)

print(y_train)

print()

x_train, y_train = [Link](x_train), [Link](y_train)

x_train = [Link](x_train, (x_train.shape[0], x_train.shape[1], 1))

def create_model():

model = Sequential()

[Link](LSTM(128, return_sequences=True, input_shape=

(x_train.shape[1], 1)))

[Link](LSTM(64, return_sequences=False))

[Link](Dense(25))

[Link](Dense(1))

return model

model = create_model()

[Link](optimizer='adam', loss='mean_squared_error')

[Link](x_train, y_train, batch_size=1, epochs=1)

test_data = scaled_data[training_data_len - 60: , :])

# In[11]:

for i in range(60, len(test_data)):

x_test.append(test_data[i-60:i, 0])

x_test = [Link](x_test)

x_test = [Link](x_test, (x_test.shape[0], x_test.shape[1], 1 ))

48
predictions = [Link](x_test)

predictions = scaler.inverse_transform(predictions)

rmse = [Link]([Link](((predictions - y_test) ** 2)))

rmse

# In[12]:

train = data[:training_data_len]

valid = data[training_data_len:]

valid['Predictions'] = predictions

[Link](figsize=(16,6))

[Link]('Model')

[Link]('Date', fontsize=18)

[Link]('Close Price USD ($)', fontsize=18)

[Link](train['Close'])

[Link](valid[['Close', 'Predictions']])

[Link](['Train', 'Val', 'Predictions'], loc='lower right')

[Link]()

49
REFERENCES

[1] A. L. P. SelÃ§uk, E. Yigit and Z. Ersoy, Prediction of bist price indices: A comparative study
between traditional and deep learning methods, Sigma J. Eng. Natural Sci., vol. 38, no. 4, pp.
1693-1704, (2020).

[2] A. Oueslati and Y. Hammami, Forecasting stock returns in Saudi Arabia and Malaysia, Rev.
Accounting Finance, vol. 17, no. 2, pp. 259-279, May 2018.

[3] B. Unal and C. Aladag, Stock exchange prediction via long short-term memory networks,
Proceedings Book, pp. 246, (2019).
[4] Gwangsu Lee, Exploring Predictive Variables Affecting the Sales of Companies Listed With
Korean Stock Indices Through Machine Learning Analysis, IEEE Access, (2023).

[5] K. Tissaoui and J. Azibi, International implied volatility risk indexes and Saudi stock return-
volatility predictabilities, North Amer. J. Econ. Finance, vol. 47, pp. 65-84, Jan. (2019).

[6] M. Vijh, D. Chandola, V. A. Tikkiwal and A. Kumar, Stock closing price prediction using
machine learning techniques, Proc. Computer. Sci., vol. 167, pp. 599-606, (Jan. 2020).

[7] Nagaraj Naik, Biju R. Mohan, Novel Stock Crisis Prediction Technique - A Study on Indian
Stock Market, IEEE Access, (2021).

[8] N. T. Hung, Stock market volatility and exchange rate movements in the Gulf Arab countries:
A Markov-state switching model, J. Islamic Accounting Bus. Res., vol. 11, no. 9, pp. 1969-
1987, Aug. (2020).

[9] R. Achkar, F. Elias-Sleiman, H. Ezzidine and N. Haidar, Comparison of BPA-MLP and

LSTM-RNN for stocks prediction, 2018 6th International Symposium on Computational and
Business Intelligence, pp. 48-51, (2018)

[10] Saud S. Alotaibi, Ensemble Technique With Optimal Feature Selection for Saudi Stock
Market Prediction: A Novel Hybrid Red Deer-Grey Algorithm, IEEE Access, (2021).

[11] S. M. Idrees, M. A. Alam and P. Agarwal, A prediction approach for stock market volatility
based on time series data, IEEE Access, vol. 7, pp. 17287-17298, (2019).

[12] S. Tekin and E. Canakoglu, Analysis of price models in Istanbul stock exchange, Proc. 27th
Signal Process. Commun. Appl. Conf. (SIU), pp. 1-4, Apr. (2019).
50
[13] Xuan Ji, Jiachen Wang, Zhijun Yan, A stock price prediction method based on deep learning
technology, International Journal of Crowd Science, (2020).

[14] Y. Trichi li, M. B. Abbes and A. Masmoudi, Predicting the effect of Googling investor
sentiment on Islamic stock market returns: A five-state hidden Markov model, Int. J. Islamic
Middle Eastern Finance Manage., vol. 13, no. 2, pp. 165-193, Feb. (2020).

[15] Yaohu Lin, Shancun Liu, Haijun Yang, Harris Wu Stock Trend Prediction Using
Candlestick Charting and Ensemble Machine Learning Techniques With a Novelty Feature
Engineering Scheme IEEE Access, 2021.

51
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
(Deemed to be University u / s 3 of UGC Act, 1956)

Office of Controller of Examinations

REPORT FOR PLAGIARISM CHECK ON THE DISSERTATION / PROJECT REPORT FOR UG / PG
PROGRAMMES
(To be attached in the dissertation / project report)
BANDI KULAVARDHAN
1 Name of the Candidate (IN YASWANTH KUMAR
BLOCK LETTERS) PULLA INDRA SAI

Chennai 600091
Chennai 600122
2 Address of Candidate Chennai 600089

Mobile Number: 9391243437, 9505376347, 6301598234

RA2011003020245, RA2011003020275, RA2011003020282
3 Registration Number

02/06/2003, 14/09/2002, 11/12/2002

4 Date of Birth

Computer Science and Engineering

5 Department

Mrs. J. Juslin Sega

6 Faculty

Unifying Evaluation Objective Methods For Stock Prediction

7 Title of the Dissertation / Project Using Hybrid Neural Network Algorithm

Individual or group : Group

(Strike whichever is not applicable)

8 Whether the above project / dissertation a) If the project / dissertation is done in

is done by group, then how many students
together completed the project : #03

b) Mention the Name &

Register number of other
candidates :
BANDI KULAVARDHAN RA2011003020245
YASWANTH KUMAR RA2011003020275
PULLA INDRA SAI RA2011003020282

[Link] Sega, Assistant Professor, Department of Computer

9 Name and address of the Supervisor Science and Engineering, SRM Institute of Science and
/ Guide Technology, Ramapuram Campus, Chennai 89

Mail ID : juslinsj@[Link]
Mobile Number : 9597694549

52
10 Name and address of the Co-Supervisor /
Guide

Mail ID: Mobile Number:

11 Software Used Turnitin

12 Date of Verification 03/05/2024

13 Plagiraism Details : (to attach the final report from the software)

Percentage of Percentage of % of plagiarism

Chapter Title of the Report similarity index similarity index after excluding
(including self (Excluding self Quotes,
citation) citation) Bibliography,
etc.,
Unifying Evaluation Objective Methods
1 For Stock Prediction Using Hybrid NA NA 8%
Neural Network Algorithm

NA NA NA
Appendices

I / We declare that the above information have been verified and found true to the best of my / our knowledge.

Name & Signature of the Staff

Signature of the Candidate ( Who uses the plagiarism check software )

Name & Signature of the Co-Supervisor / Co-Guide

Name & Signature of the Supervisor / Guide

Dr. K. Raja

Name & Signature of the HOD

53
54
55

21kn1a4250 1
No ratings yet
21kn1a4250 1
58 pages
A Novel Image Style Transfer Model Using Generative AI
No ratings yet
A Novel Image Style Transfer Model Using Generative AI
72 pages
Black Book On Automatic TimeTable Generator
80% (5)
Black Book On Automatic TimeTable Generator
59 pages
Cryptocurrency Price Prediction Using Deep Learning
No ratings yet
Cryptocurrency Price Prediction Using Deep Learning
52 pages
Project R 19
No ratings yet
Project R 19
94 pages
Project
No ratings yet
Project
63 pages
Batch 22
No ratings yet
Batch 22
40 pages
Finalproject Report Flight Price
No ratings yet
Finalproject Report Flight Price
44 pages
Sinemn Pro
No ratings yet
Sinemn Pro
54 pages
FINALREPORTCHETHAN
No ratings yet
FINALREPORTCHETHAN
41 pages
YOLO for Object Detection and Recognition
No ratings yet
YOLO for Object Detection and Recognition
54 pages
Bonafide Certificate for Project Work
No ratings yet
Bonafide Certificate for Project Work
5 pages
Smart Room Occupancy Analysis: Employing IOT Data and Predictive Insights
No ratings yet
Smart Room Occupancy Analysis: Employing IOT Data and Predictive Insights
63 pages
National Institute of Technology Calicut: Signature Verification Project Report
No ratings yet
National Institute of Technology Calicut: Signature Verification Project Report
39 pages
Front Pages1
No ratings yet
Front Pages1
6 pages
Content Part - Merged
No ratings yet
Content Part - Merged
76 pages
Accountable Privacy-Preserving Mechanism For Cloud Computing Based On Identity-Based Encryption
No ratings yet
Accountable Privacy-Preserving Mechanism For Cloud Computing Based On Identity-Based Encryption
103 pages
DCCCCCCCCCCC
No ratings yet
DCCCCCCCCCCC
41 pages
Telecom Report
No ratings yet
Telecom Report
45 pages
Rina Project
No ratings yet
Rina Project
46 pages
Final Documentation-1
No ratings yet
Final Documentation-1
47 pages
Major Project CONTENTS (1) Project
No ratings yet
Major Project CONTENTS (1) Project
3 pages
Sat - 19.Pdf - Prediction of Network Attacks Using Superrvised Machine Learning Algorithm
No ratings yet
Sat - 19.Pdf - Prediction of Network Attacks Using Superrvised Machine Learning Algorithm
11 pages
Report HFP
No ratings yet
Report HFP
71 pages
TABLE OF CONTENTS1 6th Sem Mini Project
No ratings yet
TABLE OF CONTENTS1 6th Sem Mini Project
3 pages
PDL Lab 4
No ratings yet
PDL Lab 4
32 pages
Report
No ratings yet
Report
47 pages
Facemask Detection Using Convolutional Neural Networks
100% (1)
Facemask Detection Using Convolutional Neural Networks
11 pages
New Report
No ratings yet
New Report
73 pages
Final Doc Fin
No ratings yet
Final Doc Fin
87 pages
Roshini Project
No ratings yet
Roshini Project
74 pages
CKD Prediction with ML Algorithms
No ratings yet
CKD Prediction with ML Algorithms
33 pages
Intern Report (Corrected)
No ratings yet
Intern Report (Corrected)
64 pages
Chapter No Description Page No I 1 2 2 Literature Survey 6 3 System Analysis
No ratings yet
Chapter No Description Page No I 1 2 2 Literature Survey 6 3 System Analysis
2 pages
A Machine Learning Project Report Fake News Prediction
No ratings yet
A Machine Learning Project Report Fake News Prediction
24 pages
Indian Airline Ticket Price Analysis
No ratings yet
Indian Airline Ticket Price Analysis
60 pages
AUTOMATIC AGE AND GENDER DETECTION MCA - Documentes
No ratings yet
AUTOMATIC AGE AND GENDER DETECTION MCA - Documentes
68 pages
Fixed Project Report
No ratings yet
Fixed Project Report
48 pages
DMS-Railway Management System
No ratings yet
DMS-Railway Management System
33 pages
Sat - 67.Pdf - Human Activity Recognition With Smartphones Using Machine Learning Process
No ratings yet
Sat - 67.Pdf - Human Activity Recognition With Smartphones Using Machine Learning Process
11 pages
Final Document Recent f4
No ratings yet
Final Document Recent f4
52 pages
Final Document Recent f5
No ratings yet
Final Document Recent f5
52 pages
1X3 BK Gol HP Sheet Blood Bank Management Systembb
No ratings yet
1X3 BK Gol HP Sheet Blood Bank Management Systembb
222 pages
Face
No ratings yet
Face
46 pages
Tejasinterncontent
100% (1)
Tejasinterncontent
6 pages
21091F0026 B.neelimaRani
No ratings yet
21091F0026 B.neelimaRani
70 pages
Bank Management System Report
No ratings yet
Bank Management System Report
47 pages
Predicting Health Insurance Claim Frauds Using Machine Learning
No ratings yet
Predicting Health Insurance Claim Frauds Using Machine Learning
11 pages
1922 B.SC Cs Batchno 24
No ratings yet
1922 B.SC Cs Batchno 24
64 pages
Online Cab Booking System
100% (3)
Online Cab Booking System
37 pages
Group 11 Final Book
No ratings yet
Group 11 Final Book
56 pages
Calicut Bank Loan Management System
No ratings yet
Calicut Bank Loan Management System
5 pages
Big Data Job Market Insights
No ratings yet
Big Data Job Market Insights
5 pages
Deep Learning for Skin Lesion Analysis
No ratings yet
Deep Learning for Skin Lesion Analysis
8 pages
Uber Price Prediction Using ML Techniques
No ratings yet
Uber Price Prediction Using ML Techniques
42 pages
Mini Project PDF
No ratings yet
Mini Project PDF
6 pages
Rptback01 1
No ratings yet
Rptback01 1
6 pages
Machine Learning Report
No ratings yet
Machine Learning Report
49 pages
Introduction To Time Series
No ratings yet
Introduction To Time Series
7 pages
Data-Driven Insights for XYZ Café
No ratings yet
Data-Driven Insights for XYZ Café
11 pages
Sta457 Week 1 Lec Note
No ratings yet
Sta457 Week 1 Lec Note
48 pages
Business Forecasting Techniques Explained
No ratings yet
Business Forecasting Techniques Explained
3 pages
Module 2 - Demand Forecasting
No ratings yet
Module 2 - Demand Forecasting
27 pages
CSE 4334/5334 Data Mining
No ratings yet
CSE 4334/5334 Data Mining
22 pages
Tutorial Letter 101/0/2016: Forecasting
No ratings yet
Tutorial Letter 101/0/2016: Forecasting
11 pages
A Deep Learning Approach For Predictive Maintenance - Predicting Future Event Faults Within The TimeKeeper by EFACEC System
No ratings yet
A Deep Learning Approach For Predictive Maintenance - Predicting Future Event Faults Within The TimeKeeper by EFACEC System
11 pages
Data Viz with ggplot2 for Analysts
No ratings yet
Data Viz with ggplot2 for Analysts
30 pages
Football Analytics: Now and Beyond: A Deep Dive Into The Current State of Advanced Data Analytics
No ratings yet
Football Analytics: Now and Beyond: A Deep Dive Into The Current State of Advanced Data Analytics
25 pages
Data Analysis & Regression Techniques
No ratings yet
Data Analysis & Regression Techniques
8 pages
Forecasting PDF
No ratings yet
Forecasting PDF
101 pages
Forecast Error Analysis
No ratings yet
Forecast Error Analysis
3 pages
Understanding Gcorr 2020 Europe Retail: Whitepaper
100% (1)
Understanding Gcorr 2020 Europe Retail: Whitepaper
41 pages
Predicting Stock Market Price Using Support Vector Regression
No ratings yet
Predicting Stock Market Price Using Support Vector Regression
7 pages
Time Series Forecasting-Monograph
No ratings yet
Time Series Forecasting-Monograph
58 pages
Sustainability of Sharia Rural Bank in Central Java
No ratings yet
Sustainability of Sharia Rural Bank in Central Java
8 pages
Machine Learning Roadmap 2020
No ratings yet
Machine Learning Roadmap 2020
1 page
Data Mining Concepts Models and Techniques 1st Edition by Florin Gorunescu ISBN 3642197213 9783642197215 Download
100% (4)
Data Mining Concepts Models and Techniques 1st Edition by Florin Gorunescu ISBN 3642197213 9783642197215 Download
46 pages
Practical Business Statistics 7th Edition Andrew Siegel Digital Version 2025
No ratings yet
Practical Business Statistics 7th Edition Andrew Siegel Digital Version 2025
102 pages
Joyeux and Milunovich - ECONOMETRIC METHODS
No ratings yet
Joyeux and Milunovich - ECONOMETRIC METHODS
14 pages
Student Demand Forecasting Exercise
No ratings yet
Student Demand Forecasting Exercise
4 pages
Statistics For Management and Economics Keller Ebook Polished PDF Edition
100% (5)
Statistics For Management and Economics Keller Ebook Polished PDF Edition
45 pages
Enhancing Churn Forecasting With Sentiment Analysis of Steam Reviews
No ratings yet
Enhancing Churn Forecasting With Sentiment Analysis of Steam Reviews
13 pages
Sales Management Notes (Unit-2)
No ratings yet
Sales Management Notes (Unit-2)
20 pages
Azure Cognitive Services Anomaly Detector
No ratings yet
Azure Cognitive Services Anomaly Detector
238 pages
Uwave: Accelerometer-Based Personalized Gesture Recognition and Its Applications
No ratings yet
Uwave: Accelerometer-Based Personalized Gesture Recognition and Its Applications
9 pages
Forecasting Time Series with Outliers
No ratings yet
Forecasting Time Series with Outliers
23 pages
Practice Questions Lecture 6-8
No ratings yet
Practice Questions Lecture 6-8
8 pages
MBAF 502: Quantitative Analysis Course
No ratings yet
MBAF 502: Quantitative Analysis Course
2 pages