Understanding Speech Analysis Techniques

The document discusses various techniques for analyzing speech sounds, including waveform analysis, spectrograms, and linear prediction coding (LPC). Waveform analysis looks at changes in sound intensity over time. Spectrograms examine dynamic changes in a speech spectrum and are useful for segmenting phonemes. LPC separates resonant vocal tract characteristics from sound source characteristics to identify formant peaks representing resonances. Examples of applying these techniques to analyze vowels, stops, affricates and fricatives are shown through various figures.

Uploaded by

ilasundaram


Speech Analysis

What is Speech Analysis?

Analysis of speech sounds taking into consideration their method of production.

The level of processing between the digitised acoustic waveform and the acoustic feature vectors.

The extraction of "interesting" information as an acoustic vector.

Speech Waveforms
A waveform is a two-dimensional representation of a sound. The two dimensions in a waveform display are time and intensity. The vertical dimension is intensity and the horizontal dimension is time. Waveforms are known as time domain representations of sound as they display changes in intensity over time. The intensity dimension actually displays sound pressure. Sound pressure is a measure of the tiny variations in air pressure that we are able to perceive as sound. Intensity in these waveforms is a simple linear scaling of sound pressure (not dB).
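To make the difference between a linear pressure scale and a dB scale concrete, here is a minimal sketch. It assumes the conventional reference pressure of 20 micropascals for dB SPL in air; the function name and the example pressures are illustrative and do not come from the slides.

```python
import math

P_REF = 20e-6  # assumed reference pressure in pascals (conventional for dB SPL in air)

def pressure_to_db_spl(p_rms):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    return 20.0 * math.log10(p_rms / P_REF)

# Doubling the pressure doubles the height of a linear waveform display,
# but adds only about 6 dB on the logarithmic scale.
for p in (0.01, 0.02, 0.04):
    print(f"{p:.2f} Pa -> {pressure_to_db_spl(p):.1f} dB SPL")
```

On a waveform display the three example pressures would differ in height by factors of two, whereas on a dB scale they are evenly spaced about 6 dB apart.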

Resonances and Formants
Resonances are vibratory characteristics of a resonating body. In the case of an air-filled tube the resonance characteristics exist even when there is no sound being produced. When we produce vowel sounds the resonances of the vocal tract selectively enhance sound vibrations close to the resonance frequencies and selectively attenuate sound vibrations remote from the resonance frequencies. This results in peaks in the acoustic spectrum of the resulting speech sound. These acoustic spectral peaks are called formants, particularly when they occur in vowels and vowel-like consonants.
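As a rough illustration of tube resonances, the sketch below uses the textbook idealisation of the vocal tract as a uniform tube closed at one end (the glottis) and open at the other (the lips), whose resonances fall at odd multiples of c / (4L). The tube length of 17.5 cm and the speed of sound of 350 m/s are assumed example values, not figures from the slides, and a real vocal tract is not uniform.

```python
def tube_resonances(length_m, n_resonances=3, speed_of_sound=350.0):
    """Resonances of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    """
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_resonances + 1)
    ]

# Assumed example: a 17.5 cm tube gives resonances near 500, 1500 and 2500 Hz.
for n, freq in enumerate(tube_resonances(0.175), start=1):
    print(f"resonance {n}: {freq:.0f} Hz")
```

These evenly spaced values correspond roughly to a neutral, schwa-like vowel; changing the shape of the tube moves the resonances and so changes the vowel quality.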

Spectrograms
Spectrograms permit the examination of the dynamic changes in a speech spectrum.

This is particularly useful for the examination of rapidly changing consonants (e.g. stop bursts) and also for vowel transitions (between vowels and consonants and between the targets in diphthongs).

Spectrograms, usually in conjunction with waveforms, are essential during the segmenting and labeling of speech.

Spectrograms usually provide the clearest visual cues to the boundaries between phonemes.

Spectrograms do not, however, provide accurate measurements of vowel formants, as broadband spectrograms have a poor frequency resolution (about 300 Hz), and so there is a high degree of intrinsic error in formant measurements taken visually from spectrograms.

That is why we tend to use FFTs and LPCs for the accurate measurement of formant frequencies.
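The poor frequency resolution of a broadband spectrogram follows from its short analysis window. The sketch below uses the common rule of thumb that analysis bandwidth is roughly the reciprocal of window duration; the exact constant depends on the window shape. The 16 kHz sampling rate and the 45 Hz narrowband figure are assumptions for illustration, and only the 300 Hz figure comes from the text above.

```python
def window_for_bandwidth(bandwidth_hz, sample_rate_hz):
    """Approximate window duration (s) and length (samples) for a target bandwidth.

    Rule of thumb only: bandwidth is taken as 1 / window duration.
    """
    duration_s = 1.0 / bandwidth_hz
    n_samples = round(duration_s * sample_rate_hz)
    return duration_s, n_samples

SAMPLE_RATE_HZ = 16000  # assumed sampling rate for this example

for label, bandwidth_hz in (("broadband", 300.0), ("narrowband", 45.0)):
    duration_s, n_samples = window_for_bandwidth(bandwidth_hz, SAMPLE_RATE_HZ)
    print(f"{label}: {bandwidth_hz:.0f} Hz -> {duration_s * 1000:.1f} ms window, {n_samples} samples")
```

A window of only a few milliseconds resolves rapid events such as stop bursts well but smears frequency detail, while a window of about 20 ms resolves individual harmonics at the cost of time resolution.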

Figure: Waveform and broadband spectrogram of the word "heard".

Figure: A narrowband spectrogram of the word "heard".

Figure: A broadband spectrogram of the word "hide" with the formant tracks for formants 1 to 5 superimposed over it.

Figure: Annotated waveform labeled "1_aam" (tier labels g1, aag, aa1, aa2, aam, m2; time axis in seconds, 0 to 0.491).

Figure: Annotated waveform of the word "aayvu" (segments g, aa, ay, y, yv, v, vu, u; time axis in seconds, 0 to 0.8455).

Word/segment  Duration (s)  Intensity (dB)  Pitch (Hz)  F1 (Hz)  F2 (Hz)  F3 (Hz)  F4 (Hz)
aayvu         0.77          80.4            160.2       540.7    1484.6   3750.3   3750.2
g             0.19          62.4            128         900.78   1853.0   2899.3   4078.2
aa            0.2           81.3            137.1       810.4    1181.6   2865.5   3792.2
ay            0.1           84.0            171.1       654.07   1755.3   2599.9   3753.5
y             0.07          80.5            179         362.1    2275.9   2570.3   3878.4
yv            0.04          78.7            174.5       349.3    1928.6   2365.0   3876.5
v             0.07          73.4            162.2       348.7    1154.98  2418.4   3636.0
vu            0.06          78.2            166.5       3636.0   1147.2   2570.8   3568.2
u             0.2           77.8            167.2       387.36   1488.5   2611.5   3693.2

Figure: LPC of aa in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 886.4, 1212.5, 2916.7, 3754.0 and 4813.6 Hz.

Figure: LPC of ay in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 671.6, 1694.1, 2272.1 and 3679.9 Hz.

Figure: LPC of y in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 352.9, 2323.9, 3939.3 and 4939.6 Hz.

Figure: LPC of v in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 323.3, 1190.2, 2346.2 and 3613.2 Hz.

Figure: LPC of vu in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 360.3, 1108.7, 2612.9 and 3583.6 Hz.

Figure: LPC of u in aayvu (sound pressure level in dB/Hz against frequency, 0 to 5500 Hz). Labeled peaks: 397.4, 1486.3, 2590.7 and 3583.6 Hz.

Linear Prediction Coefficient (LPC)

Linear Prediction Coefficient (LPC) analysis attempts to estimate the poles (related to resonances or formants) that, when combined with the speech source spectrum (the "residual" in LPC analysis), would result in the original waveform.

An LPC analysis separates the analysis of the resonant characteristics of a speech sound from the source characteristics of that sound.

The resulting LPC spectrum is a smoothed spectrum with the peaks representing the formants (resulting from the vocal tract resonances) of the spectrum of a vowel or vowel-like consonant.
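The idea of recovering poles from a waveform can be sketched with the autocorrelation method and the Levinson-Durbin recursion, one common way of computing LPC coefficients; the slides do not say which method was used. Everything in the example is an assumption made for illustration: a toy "vocal tract" with a single resonance at 700 Hz and a bandwidth of 100 Hz, an 8 kHz sampling rate, and a single impulse standing in for the sound source.

```python
import cmath
import math

FS = 8000.0      # assumed sampling rate in Hz
F_TRUE = 700.0   # assumed resonance (formant) frequency in Hz
BW_TRUE = 100.0  # assumed resonance bandwidth in Hz

def resonator_impulse_response(freq, bw, fs, n_samples):
    """Impulse response of a two-pole resonator (a toy one-formant vocal tract)."""
    radius = math.exp(-math.pi * bw / fs)
    c1 = 2.0 * radius * math.cos(2.0 * math.pi * freq / fs)
    c2 = -radius * radius
    y = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0  # single impulse as the source
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(x + c1 * y1 + c2 * y2)
    return y

def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, error) with a[0] == 1.

    The inverse filter is A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def pole_to_formant(a1, a2, fs):
    """Frequency and bandwidth (Hz) of the pole pair of 1 + a1 z^-1 + a2 z^-2."""
    pole = (-a1 + cmath.sqrt(a1 * a1 - 4.0 * a2)) / 2.0
    freq = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    bw = -math.log(abs(pole)) * fs / math.pi
    return freq, bw

signal = resonator_impulse_response(F_TRUE, BW_TRUE, FS, 800)
coeffs, residual_energy = levinson_durbin(autocorrelation(signal, 2), 2)
est_freq, est_bw = pole_to_formant(coeffs[1], coeffs[2], FS)
print(f"estimated formant: {est_freq:.1f} Hz, bandwidth: {est_bw:.1f} Hz")
```

In this idealised case the estimated pole pair lands very close to the resonance that generated the signal. Real speech needs a higher model order (so that several formants can be represented), a windowed analysis frame and a general polynomial root finder, and the estimates are correspondingly less exact.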

Figure: An LPC analysis of the vowel in "heard". Note the smooth spectrum clearly showing the positions of the main spectral peaks (formants) of this vowel.

Figure: White noise used as a simplified model of a fricative sound source. Note the random pattern of both the waveform (bottom) and the spectrum (top). Also note that the spectral envelope (LPC spectrum in red) is approximately flat.

Identification of Speech Waveforms

Figure: Three long vowels in an /h_d/ context.

Figure: Three English voiceless oral stops in CV context.

Figure: Three English voiced oral stops in CV context.

Figure: The two English affricates in CV context.

Figure 9: Waveforms of two of the English voiceless fricatives in CV context.
