Sign Language Recognition Using Machine Learning
IJISRT24MAY273 www.ijisrt.com 73
Volume 9, Issue 5, May – 2024 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24MAY273
An ensemble model builds upon several separate models to provide a prediction that is more reliable and powerful than any of the individual models working alone. Random Forest is such an ensemble, made up of several decision trees. Because of its scalability, resilience, and capacity to manage high-dimensional data, Random Forest is a robust and adaptable ensemble learning method that is frequently employed for both regression and classification applications. Each decision tree that makes up a Random Forest is built from a random portion of the training data and a random subset of the features.

II. LITERATURE SURVEY

[1] The basic concept of a sign language recognition system is presented, along with a review of its existing techniques and their comparison. The main objective of the survey is to highlight the importance of vision-based methods with a specific focus on sign language. It covers most of the currently known methods for SLR tasks based on deep neural architectures developed over the past several years, and divides them into clusters based on their chief traits. The most common design deploys a CNN to derive discriminative features from raw data, since this type of network offers the best properties for the task. In many cases, multiple types of networks were combined in order to improve final performance.
[2] Works on American Sign Language (ASL) words share similar characteristics, usually along the sign trajectory, which yields similarity issues and hinders ubiquitous application. Recognition of similar ASL words confuses translation algorithms, which leads to misclassification. Based on the fast Fisher vector (FFV) and bi-directional Long Short-Term Memory (Bi-LSTM) method, a recognition algorithm for a large database of dynamic sign words, called FFV-Bi-LSTM, is designed. The performance of FFV-Bi-LSTM is further evaluated on an ASL dataset, the Leap Motion dynamic hand gestures dataset (LMDHG), and the semaphoric hand gestures contained in the Shape Retrieval Contest (SHREC) dataset.
[6] The research article investigates the impact of machine learning on the state of sign language recognition and classification. It highlights the issues faced by present recognition systems, for which the research frontier on sign language recognition intends solutions. In the article, around 240 different approaches that explore sign language recognition for recognizing multilingual signs are compared. The research done by various authors is also studied, and some of the important research articles are discussed. The article discusses how machine learning methods could benefit the field of automatic sign language recognition and the potential gaps that machine learning approaches need to address for real-time sign language recognition.
[8] An important application of hand gesture recognition is the translation of sign language. In sign language, the fingers' configuration, the hand's orientation, and the hand's relative position to the body are the primitives of structured expressions. The importance of hand gesture recognition has increased due to the rapid growth of the hearing-impaired population. In this paper, a system is proposed for dynamic hand gesture recognition using multiple deep learning models.
[9] The approach is a vision-based system in which a sequence of images representing a word in ISL is translated into the equivalent English word. The translation is done by means of deep learning algorithms, namely convolutional neural networks and recurrent neural networks. Because the system analyzes sequences of images, CNNs analyze each image, and their sequence is analyzed by an LSTM (an implementation of an RNN). The dataset was divided into a training dataset and a testing dataset, and the system obtained 73.60% accuracy. The image distributions are kept fairly different in the training and testing datasets.
[11] Indian Sign Language (ISL) is an alternative to the written and spoken languages used in India and the Indian subcontinent. People who are deaf or mute and are unable to hear or talk frequently use it. Compared to other sign languages used in developed nations, ISL is a novel sign language. Given its current application characteristics, automatic recognition of any sign language, including ISL, is necessary. ISL automation will benefit both communities, those who can exclusively communicate in ISL and those who do not know the language at all, because deaf individuals frequently have trouble interacting in public settings like airports, train stations, banks, and hospitals.
[12] Communication is the basis of every human interaction, whether personal or professional, and is among the necessities for surviving in a community. Without a clear, mutually understood language, verbal communication is impossible. In India, sign language is used for communication by about 26% of the disabled population. Therefore, it is imperative to close the communication gap that exists between the general public and those who are speech challenged. The objective is to create a pair of sensor gloves that can translate motions used in Indian Sign Language (ISL) into audible speech.
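The bootstrap-sampling idea behind Random Forest described earlier, where each tree sees a random portion of the training data and a random subset of the features, can be sketched in plain Python. Here one-feature decision stumps stand in for full decision trees; the toy dataset and tree count are illustrative assumptions, not the configuration used in the proposed system:

```python
import random
from collections import Counter

def train_stump(X, y, feature):
    """Train a one-feature decision stump: threshold at the feature's mean."""
    thresh = sum(x[feature] for x in X) / len(X)
    left = [y[i] for i, x in enumerate(X) if x[feature] <= thresh]
    right = [y[i] for i, x in enumerate(X) if x[feature] > thresh]
    left_label = Counter(left).most_common(1)[0][0] if left else y[0]
    right_label = Counter(right).most_common(1)[0][0] if right else y[0]
    return feature, thresh, left_label, right_label

def predict_stump(stump, x):
    feature, thresh, left_label, right_label = stump
    return left_label if x[feature] <= thresh else right_label

def train_forest(X, y, n_trees=25, seed=0):
    """Each 'tree' sees a bootstrap sample of the rows and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]   # random portion of the data
        feature = rng.randrange(len(X[0]))         # random feature subset (size 1)
        forest.append(train_stump([X[i] for i in idx],
                                  [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, x):
    """Majority vote over the individual stumps, as in Random Forest."""
    votes = Counter(predict_stump(s, x) for s in forest)
    return votes.most_common(1)[0][0]
```

In a real system one would use a library implementation such as scikit-learn's RandomForestClassifier, which grows full trees and samples feature subsets at every split rather than once per tree.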
This section discusses the system architecture of sign language recognition using deep learning techniques and an ensemble model.

A. Architecture Design
As shown in Figure 1, the proposed system mainly consists of two modules, namely the training phase and the testing phase.
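The division of data between the two phases can be sketched as a simple shuffled split; the 80/20 ratio and the seed are illustrative assumptions, since the split used in this work is not stated:

```python
import random

def split_dataset(samples, test_fraction=0.2, seed=42):
    """Shuffle labelled samples and divide them between the two modules:
    the larger part feeds the training phase, the rest the testing phase."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training set, testing set)
```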
Preprocessing:
The first stage is to prepare the input data for the network. Common methods include: hand segmentation, which separates the hand from the rest of the image; normalization, which scales the pixel intensity values to a specific range for better network training; and background subtraction, which removes the static background so that only the hand region remains.
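The normalization and background-subtraction steps can be illustrated on small grayscale arrays; plain lists of pixel intensities stand in for camera frames here, and a real system would use an image library such as OpenCV:

```python
def normalize(image, new_min=0.0, new_max=1.0):
    """Scale 8-bit pixel intensities (0-255) into a target range."""
    return [[(p / 255.0) * (new_max - new_min) + new_min for p in row]
            for row in image]

def subtract_background(image, background, threshold=30):
    """Zero out pixels close to a reference background frame, leaving
    only regions (such as the hand) that differ from it."""
    return [[p if abs(p - b) > threshold else 0
             for p, b in zip(row, bg_row)]
            for row, bg_row in zip(image, background)]
```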
Segmentation:
Segmentation mainly involves separating the hand region from the background image. The isolated hand region is then evaluated to determine the exact gesture being conveyed. Segmentation applies the segmentation mask to the original image and saves the resulting masked image.

Fig 6: Image of a Hand Gesture Representing Indian Sign Language
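The mask-application step described above can be sketched directly; the intensity-threshold mask is a crude stand-in for whatever segmentation method produces the real mask:

```python
def threshold_mask(image, lo=80, hi=255):
    """A crude intensity-threshold mask standing in for real hand segmentation."""
    return [[1 if lo <= p <= hi else 0 for p in row] for row in image]

def apply_mask(image, mask):
    """Apply a binary mask to the original image, zeroing background pixels
    so that only the segmented hand region remains."""
    return [[p if m else 0 for p, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]
```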
Feature Extraction:
Feature extraction is the process of identifying and extracting informative characteristics from the hand region in an image. The technique for detecting and tracking the locations of particular points on an individual's hand is known as hand landmarking. These locations, sometimes referred to as landmarks or key points, can be the wrist, the tips and bases of the fingers, or other hand points. One can use landmarks to recognize the various signs that the person is making. Hand landmarks can be identified using the MediaPipe library.

V. RESULT AND CONCLUSION

The performance of the CNN, Residual Network (ResNet), and ensemble models in the proposed system was assessed and compared. Compared to the other two models, the ensemble model provides the most accurate performance.

In conclusion, the development of a sign language recognition system is a significant step towards fostering inclusive communication for the deaf and hard-of-hearing communities. It can provide real-time translation of sign language into spoken language or text, enabling deaf individuals to interact and convey their messages more easily in situations such as education, employment, and social settings. By leveraging advancements in deep learning and human-computer interaction, such a system has the potential to bridge the communication gap between individuals who use sign language and those who do not.

In terms of future work, the preprocessing can be improved to predict gestures even in low-light conditions with higher accuracy. The proposed system can be further built into a web/mobile application for users. It currently works only for single-handed sign language gestures, so it can be enhanced to accept gestures made with both hands.
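As a supplement to the feature-extraction stage described above, the hand-landmarking step can be sketched as follows. The helper flattens MediaPipe-style (x, y, z) landmarks into a wrist-relative feature vector; the MediaPipe calls appear only in comments, and the helper name and wrist-relative encoding are illustrative assumptions rather than the exact pipeline used in this work:

```python
def landmarks_to_features(landmarks):
    """Flatten (x, y, z) hand landmarks into one feature vector, expressed
    relative to the wrist (landmark 0) so the features do not depend on
    where the hand sits in the frame."""
    wx, wy, wz = landmarks[0]
    return [c for (x, y, z) in landmarks for c in (x - wx, y - wy, z - wz)]

# With MediaPipe itself (not executed here), the landmarks would come from:
#   import mediapipe as mp
#   hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)
#   result = hands.process(rgb_image)  # rgb_image: an RGB frame (NumPy array)
#   landmarks = [(p.x, p.y, p.z)
#                for p in result.multi_hand_landmarks[0].landmark]
```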