Spoken Language Processing in Python, Chapter 4
Transcription helper functions
SPOKEN LANGUAGE PROCESSING IN PYTHON
Daniel Bourke
Machine Learning Engineer/YouTube Creator
Exploring audio files
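# Import os module
import os
# List the audio files (folder name assumed from the transcription example)
os.listdir("acme_audio_files")
['call_1.mp3',
 'call_2.mp3',
 'call_3.mp3',
 'call_4.mp3']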
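Folders often hold more than audio, and os.listdir() returns names in arbitrary order. A minimal sketch of a listing helper that keeps only audio files and sorts them is shown here. The function name list_audio_files, the extensions tuple and the temporary demo folder are assumptions for illustration, not part of the course code.

```python
import os
import tempfile


def list_audio_files(folder, extensions=(".mp3", ".wav")):
    """Return the sorted names of files in folder ending with one of extensions."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )


# Demo on a throwaway folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ["call_2.mp3", "call_1.mp3", "notes.txt"]:
        # Create empty placeholder files; only the names matter here
        open(os.path.join(demo_folder, name), "w").close()
    demo_files = list_audio_files(demo_folder)
    print(demo_files)
```

Sorting makes the order of later transcriptions reproducible across machines.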
# Transcribe call 1
import speech_recognition as sr

recognizer = sr.Recognizer()
call_1_file = sr.AudioFile("acme_audio_files/call_1.wav")
with call_1_file as source:
    # Read the whole file from the opened source
    call_1_audio = recognizer.record(source)
recognizer.recognize_google(call_1_audio)
# Print attributes
print(f"Channels: {audio_segment.channels}")
print(f"Sample width: {audio_segment.sample_width}")
print(f"Frame rate (sample rate): {audio_segment.frame_rate}")
print(f"Frame width: {audio_segment.frame_width}")
print(f"Length (ms): {len(audio_segment)}")
print(f"Frame count: {audio_segment.frame_count()}")
Channels: 2
Sample width: 2
Frame rate (sample rate): 32000
Frame width: 4
Length (ms): 54888
Frame count: 1756416.0
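The printed attributes are related to each other, which makes for a quick sanity check. The sketch below recomputes frame width and frame count from the other values. The numbers are copied from the output above, and the variable names are illustrative only.

```python
# Values copied from the printed attributes above
channels = 2
sample_width = 2      # bytes per sample
frame_rate = 32000    # frames per second
length_ms = 54888     # clip length in milliseconds

# One frame holds one sample for every channel
frame_width = channels * sample_width

# Frames in the clip = frames per second * duration in seconds
frame_count = frame_rate * length_ms / 1000

print(f"Frame width: {frame_width}")
print(f"Frame count: {frame_count}")
```

Because the reported length is in whole milliseconds, this check can be off by a fraction of a frame on other files.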
"hello welcome to Acme studio support line my name is Daniel how can I best help
you hey Daniel this is John I've recently bought a smart from you guys and I know
that's not good to hear John let's let's get your cell number and then we
can we can set up a way to fix it for you one number for 1757 varies how long do
you reckon this is going to take about an hour now while John we're going to try
our best hour I will we get the sealing member will start up this support case
I'm just really really really really I've been trying to contact 34 been put on
hold more than an hour and half so I'm not really happy I kind of wanna get this
issue 6 is fossil"
Installing sentiment analysis libraries
$ pip install nltk
"hey Dave is this any better do I order products are currently on July 1st and I haven't
received the product a three-week step down this parable 6987 5"
Installing spaCy
# Install spaCy
$ pip install spacy
I 0
'd 1
like 4
to 9
talk 12
about 17
a 23
smartphone 25...
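The number beside each token is the character offset at which that token starts in the original text (spaCy exposes this as token.idx). The sketch below reproduces those offsets using only the standard library. The regular expression is a hypothetical stand-in, not spaCy's tokenizer, and it only approximates how contractions such as I'd are split.

```python
import re

text = "I'd like to talk about a smartphone"

# Hypothetical stand-in pattern (not spaCy's rules): a clitic such as 'd,
# a run of word characters, or a single punctuation mark
TOKEN_PATTERN = re.compile(r"'\w+|\w+|[^\w\s]")


def token_offsets(text):
    """Return (token, start offset) pairs for each match in text."""
    return [(m.group(), m.start()) for m in TOKEN_PATTERN.finditer(text)]


for token, idx in token_offsets(text):
    print(token, idx)
```

Offsets count every character, including spaces, which is why like starts at 4 rather than 3.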
I'd like to talk about a smartphone I ordered on July 31st from your Sydney store,
my order number is 4093829.
I spoke to one of your customer service team, Georgia, yesterday.
smartphone PRODUCT
July 31st DATE
Sydney GPE
4093829 CARDINAL
one CARDINAL
Georgia GPE
yesterday DATE
Inspecting the data
# Inspect post purchase audio folder
import os
post_purchase_audio = os.listdir("post_purchase")
print(post_purchase_audio[:5])
['post-purchase-audio-0.mp3',
'post-purchase-audio-1.mp3',
'post-purchase-audio-2.mp3',
'post-purchase-audio-3.mp3',
'post-purchase-audio-4.mp3']
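# Assumed wrapper: function name and loop are reconstructed, not from the slide
def create_text_list(folder):
    text_list = []
    for file in folder:
        # Transcribe audio
        text = transcribe_audio(file)
        text_list.append(text)
    return text_list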
print(post_purchase_text[:5])
['hey man I just water product from you guys and I think is amazing but I leave a little
'these clothes I just bought from you guys too small is there anyway I can change the s
"I recently got these pair of shoes but they're too big can I change the size",
"I bought a pair of pants from you guys but they're way too small",
"I bought a pair of pants and they're the wrong colour is there any chance I can change
What you've done
1. Converted audio files into soundwaves with Python and NumPy.
4. Built a spoken language processing pipeline with NLTK, spaCy and sklearn.
print(one_last_transcription)