Speech Recognition System Overview

The document discusses the evolution and functionality of speech recognition systems, detailing their interdisciplinary nature and the technology's advancements over the past 40 years. It outlines the components of a generic speech recognition system, user interface design principles, and applications in telecommunications, healthcare, and military settings. The paper emphasizes the importance of accuracy, speed, and environmental factors in the performance of speech recognition systems.

Uploaded by

sudam rathod


SPEECH RECOGNITION SYSTEM

More Sachin Angad (1), Nisargandh Prachi Ashwin (2), Rathod S.M. (3)
(1, 2) Student, Dept. of Electronics & Telecommunication, Aditya Polytechnic, Beed, Maharashtra, India
(3) Head of Dept., Electronics & Telecommunication, Aditya Polytechnic, Beed, Maharashtra, India

____________________________________________________________________________________________

ABSTRACT

Speech recognition basically means talking to a computer and having it recognize what we are saying. This process fundamentally functions as a pipeline that converts PCM (Pulse Code Modulation) digital audio from a sound card into recognized speech. Speech recognition technology has evolved for more than 40 years, spurred on by advances in signal processing, algorithms, architectures, and hardware. During that time it has gone from a laboratory curiosity to an art, and eventually to a full-fledged technology understood by a wide range of engineers, scientists, linguists, psychologists, and systems designers. Over those four decades, the technology of speech recognition has evolved, leading to a steady stream of increasingly difficult tasks which have been tackled and solved.

Key words: Thing speak, ECG, temperature, oxygen level, heart rate, ESP32, Arduino, Android app.

A. INTRODUCTION

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment"), where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" [1] systems.

B. Generic Speech Recognition System:

Fig 1:- (Block Diagram of Generic Speech Recognition System)

The figure shows a block diagram of a typical integrated continuous speech recognition system. Interestingly enough, this generic block diagram can be made to work on virtually any speech recognition task that has been devised in the past 40
years, i.e. isolated word recognition, connected word recognition, continuous speech recognition, etc. The feature analysis module provides the acoustic feature vectors used to characterize the spectral properties of the time-varying speech signal. The word-level acoustic match module evaluates the similarity between the input feature vector sequence (corresponding to a portion of the input speech) and a set of acoustic word models for all words in the recognition task vocabulary to determine which words were most likely spoken. The sentence-level match module uses a language model (i.e., a model of syntax and semantics) to determine the most likely sequence of words. Syntactic and semantic rules can be specified either manually, based on task constraints, or with statistical models such as word and class N-gram probabilities. Search and recognition decisions are made by considering all likely word sequences.

Almost every aspect of the continuous speech recognizer of Figure 1 has been studied and optimized over the years. As a result, we have obtained a great deal of knowledge about how to design the feature analysis module, how to choose appropriate recognition units, how to populate the word lexicon, how to build acoustic word models, how to model language syntax and semantics, how to decode word matches against word models, how to efficiently determine a sentence match, and finally how to choose the best recognized sentence.

Building Good Speech-Based Applications:

In addition to having good speech recognition technology, effective speech-based applications heavily depend on several factors, including:

• Good models of dialogue that keep the conversation moving forward, even in periods of great uncertainty on the part of either the user or the machine.

We now expand somewhat on each of these factors:

User Interface Design: In order to make a speech interface as simple and as effective as Graphical User Interfaces (GUI), 3 key design principles should be followed as closely as possible, namely:
• Provide a continuous representation of the objects and actions of interest.
• Provide a mechanism for rapid, incremental, and reversible operations whose impact on the object of interest is immediately visible.
• Use physical actions or labeled button presses instead of text commands.

C. Dialogue Design Principles:

(Fig 2:- Block Diagram Of Speech Recognition System)

For many interactions between a person and a machine, a dialogue is needed to establish a complete interaction with the machine. The
‘ideal’ dialogue allows either the user or the machine to initiate queries or to choose to respond to queries initiated by the other side. (Such systems are called ‘mixed initiative’ systems.) A complete set of design principles for dialogue systems has not yet evolved (it is far too early yet). However, much as we have learned good speech interface design principles, many of the same or similar principles are evolving for dialogue management. The key principles that have evolved are the following:
• Summarize actions to be taken, whenever possible.
• Provide real-time, low-delay responses from the machine and allow the user to barge in at any time.
• Orient users to their ‘location’ in task space as often as possible.
• Use flexible grammars to provide incrementality of the dialogue.
• Whenever possible, customize and personalize the dialogue (novice/expert).

D. Match Task to the Technology:

It is essential that any application of speech recognition be realistic about the capabilities of the technology, and build in failure correction modes. Hence building a credit card recognition application before digit error rates fell below 0.5% per digit is a formula for failure, since for a 16-digit credit card, the string error rate will be at the 10% level or higher, thereby frustrating customers who speak clearly and distinctly, and making the system totally unusable for customers who slur their speech or otherwise make it difficult to understand their spoken inputs. Utilizing this principle, the following successful applications have been built:

Game/aids-to-the-handicapped: voice control of selected features of the game, the wheelchair, the environment (climate control).

The Telecommunications Need for Speech Recognition

The telecommunications network is evolving as the traditional POTS (Plain Old Telephony Services) network comes together with the dynamically evolving Packet network, in a structure which we believe will look something like the one shown in the Figure below.
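The credit card figure quoted under "Match Task to the Technology" can be checked with a short calculation. The sketch below is not taken from the paper: it assumes that each digit is misrecognized independently with the same probability, and the function names are hypothetical. Under that assumption a 0.5% per-digit error rate gives a 16-digit string error rate of roughly 7.7%, and a per-digit rate of about 0.66% is enough to reach the 10% level.

```python
def string_error_rate(digit_error_rate: float, num_digits: int = 16) -> float:
    """Probability that at least one digit in the string is wrong.

    Assumes every digit errs independently with the same probability
    (an assumption of this sketch, not a claim from the paper).
    """
    return 1.0 - (1.0 - digit_error_rate) ** num_digits


def digit_error_rate_for(target_string_error: float, num_digits: int = 16) -> float:
    """Per-digit error rate that produces the given string error rate."""
    return 1.0 - (1.0 - target_string_error) ** (1.0 / num_digits)


if __name__ == "__main__":
    # Show how quickly small per-digit errors compound over 16 digits.
    for p in (0.005, 0.0066, 0.01):
        print(f"digit error {p:.2%} -> string error {string_error_rate(p):.1%}")
    print(f"10% string error corresponds to a digit error of "
          f"{digit_error_rate_for(0.10):.2%}")
```

The point of the calculation is that per-digit accuracy has to be very high before long digit strings become usable, which is why the task has to be matched to what the recognizer can actually deliver.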

Telecommunication Applications of Speech Recognition

Speech recognition was introduced into the telecommunications network in the early 1990s for two reasons, namely to reduce costs via automation of attendant functions, and to provide new revenue-generating services that were previously impractical because of the associated costs of using attendants.
(Fig 3:- Block Diagram of Match Task to the Technology)
Examples of telecommunications services which were created to achieve cost reduction include the following:

Voice Dialing: Systems have been created for voice dialing by name (so-called alias dialing, such as Call Home, Call Office) from AT&T, NYNEX, and Bell Atlantic, and by number (AT&T SDN/NRA) to enable customers to complete calls without having to push buttons associated with the telephone number being called.

E. Replacing complicated and often frustrating ‘push button’ IVR:

Due to poorly implemented and managed systems, IVR and automated call handling systems are often unpopular with customers and frustrating to use. However, there is a way to improve this scenario. Termed ‘intelligent call steering’ (ICS), it does not involve any ‘button pushing’. The system simply asks the customer what they want (in their words, not yours) and then transfers them to the most suitable resource to handle their call. Callers dial one number and are greeted by the message “Welcome to XYZ Company, how can I help you?” The caller is routed to the right agent within 20 to 30 seconds of the call being answered, with misdirected calls reduced to as low as 3-5 percent.

By introducing Natural Language Speech Recognition (NLSR), general insurance company Suncorp replaced its original push button IVR, enabling the customer to simply say what they want. Using a financial services statistical language model of over 100,000 phrases, the system can more accurately assess the nature of the call and transfer it the first time to the appropriate department or advisor. The company reduced its call waiting times to around 30 seconds and misdirected calls to virtually nil.

In-car systems

Typically a manual control input, for example by means of a finger control on the steering wheel, enables the speech recognition system, and this is signaled to the driver by an audio prompt. Following the audio prompt, the system has a “listening window” during which it may accept a speech input for recognition.

Simple voice commands may be used to initiate phone calls, select radio stations or play music from a compatible smartphone, MP3 player or music-loaded flash drive. Voice recognition capabilities vary between car make and model. Some of the most recent car models offer natural-language speech recognition in place of a fixed set of commands, allowing the driver to use full sentences and common phrases. With such systems there is, therefore, no need for the user to memorize a set of fixed command words.

High-performance fighter aircraft

Substantial efforts have been devoted in the last decade to the test and evaluation of speech recognition in fighter aircraft. Of particular note have been the US program in speech recognition for the Advanced Fighter Technology Integration (AFTI)/F-16 aircraft (F-16 VISTA), the program in France for Mirage aircraft, and other programs in the UK dealing with a variety of aircraft platforms. In these programs, speech recognizers have been operated successfully in fighter aircraft, with applications including setting radio frequencies, commanding an autopilot system, setting steer-point coordinates and weapons release parameters, and controlling flight display.
Performance of speech recognition systems -

It is usually specified in terms of accuracy and speed. Accuracy is usually rated with word error rate, whereas speed is measured with the real time factor. Machines can achieve very high performance in controlled conditions and require only a short period of training. Such conditions usually assume that users:
- have speech characteristics which match the training data,
- can achieve proper speaker adaptation,
- work in a clean, noise-free environment.

There are 2 models on statistically-based Speech Recognition: Hidden Markov Model (HMM model) and Dynamic Time Warping (DTW model).

DTW-based Speech Recognition -

Dynamic time warping is an algorithm for measuring similarity between two sequences which may vary in time or speed. It is a historical technique. Similarities between speaking patterns would be detected. DTW has been applied to video, audio, and graphics; indeed, any data which can be turned into a linear representation can be analyzed with DTW. This sequence technique is also used in HMM models.

[...] school assignments by using speech-to-text programs. They can also utilize speech recognition technology to freely enjoy searching the Internet or using a computer at home without having to physically operate a mouse and keyboard.

Applications of Speech Recognition -

Health Care - Even in the wake of speech recognition technologies, medical transcriptionists (MTs) have not become obsolete.

Military - High-performance fighter aircraft - Speech recognizers have been operated successfully in fighter aircraft.

Helicopters - As in fighter applications, the overriding issue for voice in helicopters is the impact on pilot effectiveness.

Battle Management - Speech recognition equipment was tested in conjunction with an integrated information display for naval battle management.

Other domains - ASR in the field of computer gaming and simulation is becoming more [...]

F. CONCLUSION

This paper presents speech recognition in artificial intelligence systems. It is important to consider the environment in which the speech recognition system has to work. The grammar used by the speaker and accepted by the system, noise level, noise type, position of the microphone, and speed and manner of the user's speech are some factors that may affect the quality of speech recognition.

G. REFERENCES

1. John Levis and Ruslan Suvorov, "Automatic Speech Recognition".
2. B.H. Juang and Lawrence R. Rabiner, "Automatic Speech Recognition - A Brief History of the Technology Development".
3. S. Xue, X. Y. Kou and S. T. Tan, "Natural Voice-Enabled CAD: Modeling via Natural Discourse".
4. Ekenta Elizabeth Odokuma and Orluchukwu Great Ndidi, "Development Of A Voice-Controlled Personal Assistant For The Elderly And Disabled".
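As a supplementary illustration of the two measures named under "Performance of speech recognition systems", the sketch below computes a word error rate and a real time factor. It is a minimal example, not the authors' evaluation code: it assumes the common definition of word error rate as word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words, and real time factor as processing time divided by audio duration. The function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time over audio duration; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    print(word_error_rate("call home now", "call office now"))
    print(real_time_factor(2.0, 10.0))
```

Because insertions count as errors, the word error rate can exceed 1.0 when the hypothesis is much longer than the reference.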

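The dynamic time warping idea described under "DTW-based Speech Recognition" can be sketched as a small dynamic program. This is an illustrative textbook formulation rather than the paper's implementation: it assumes sequences of plain numbers with absolute difference as the local cost, whereas a real recognizer would compare acoustic feature vectors. The function name and the sample sequences are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b's element repeats
                cost[i][j - 1],      # b advances while a's element repeats
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    slow = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]  # same shape, spoken slowly
    fast = [1, 2, 3, 2, 1]                 # same shape, spoken quickly
    other = [3, 1, 3, 1, 3]                # a different shape
    print(dtw_distance(slow, fast))
    print(dtw_distance(fast, other))
```

The slow and fast sequences trace the same pattern at different speeds, so the warping path can align them with no accumulated cost, while the differently shaped sequence cannot be aligned for free. That is the sense in which DTW detects similar patterns that vary in time or speed.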