AJ Python Speech Recog Part Five

The document describes the pocketsphinx Python package for voice activity detection and speech recognition. It summarizes key classes like Vad for voice activity detection and Config for configuration. Vad analyzes audio frames to detect speech/non-speech. Config allows initializing models and parameters for speech recognition and acts like a dictionary. The document provides details on initializing, serializing, and accessing configuration parameters.

Uploaded by

Abhishek Jain
Main pocketsphinx package — PocketSphinx 5.0.1 documentation 06/08/23, 12:19 AM

class pocketsphinx.Vad(mode=PS_VAD_LOOSE,
sample_rate=PS_VAD_DEFAULT_SAMPLE_RATE,
frame_length=PS_VAD_DEFAULT_FRAME_LENGTH)

Voice activity detection class.

Parameters: mode (int) – Aggressiveness of voice activity detection (0-3).

sample_rate (int) – Sampling rate of input, default is 16000.
Rates other than 8000, 16000, 32000, 48000 are only
approximately supported, see note in frame_length.
Outlandish sampling rates like 3924 and 115200 will raise a
ValueError.

frame_length (float) – Desired input frame length in
seconds, default is 0.03. The actual frame length may be
different if an approximately supported sampling rate is
requested. You must always use the frame_bytes and
frame_length attributes to determine the input size.

Raises: ValueError – Invalid input parameter (see above).

frame_bytes

Number of bytes (not samples) required in an input frame.

You must pass input of this size, as bytes , to the Vad .

Type: int

frame_length

Length of a frame in seconds (may be different from the one requested in the
constructor!)

Type: float
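
The relationship between sample rate, frame length, and frame size in bytes can be sketched with a little arithmetic. This is only the nominal calculation for 16-bit samples; the helper name `nominal_frame_bytes` is hypothetical, and for approximately supported sampling rates the real `frame_bytes` attribute may differ, so it remains the value to trust.

```python
BYTES_PER_SAMPLE = 2  # input is 16-bit signed integers


def nominal_frame_bytes(sample_rate=16000, frame_length=0.03):
    """Nominal bytes per frame (hypothetical helper, not part of pocketsphinx)."""
    samples = int(round(sample_rate * frame_length))
    return samples * BYTES_PER_SAMPLE


print(nominal_frame_bytes())            # defaults: 16000 Hz, 0.03 s
print(nominal_frame_bytes(8000, 0.03))  # 8000 Hz, 0.03 s
```

With the default parameters this gives 480 samples, or 960 bytes, per frame.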

is_speech(self, frame, sample_rate=None)

Classify a frame as speech or not.

Parameters: frame (bytes) – Buffer containing speech data (16-bit signed
integers). Must be of length frame_bytes (in bytes).
https://pocketsphinx.readthedocs.io/en/latest/pocketsphinx.html Page 21 of 32

Returns: Classification as speech or not speech.

Return type: boolean

Raises: IndexError – buf is of invalid size.
ValueError – Other internal VAD error.

sample_rate

Sampling rate of input data.

Type: int
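
Because `is_speech` only accepts buffers of exactly `frame_bytes` bytes, a longer recording has to be cut into frames first. The following is a minimal sketch of that chunking, written against any object that exposes `frame_bytes` and `is_speech`. `StandInVad` is a hypothetical stand-in (it is not the real detector, and its "any non-zero byte" rule is made up) so that the example runs without audio data or the library installed.

```python
def iter_frames(audio, frame_bytes):
    """Yield consecutive chunks of frame_bytes bytes, dropping a short tail."""
    for start in range(0, len(audio) - frame_bytes + 1, frame_bytes):
        yield audio[start:start + frame_bytes]


def speech_flags(vad, audio):
    """Classify each full frame with any object offering the Vad interface."""
    return [vad.is_speech(frame) for frame in iter_frames(audio, vad.frame_bytes)]


class StandInVad:
    """Hypothetical stand-in for pocketsphinx.Vad, for illustration only."""

    frame_bytes = 960  # nominal size for 16 kHz, 0.03 s, 16-bit audio

    def is_speech(self, frame):
        if len(frame) != self.frame_bytes:
            raise IndexError("frame is of invalid size")
        return any(frame)  # made-up rule: any non-zero byte counts as speech


# One silent frame, one non-silent frame, then a tail too short to classify.
audio = bytes(960) + b"\x01\x10" * 480 + bytes(100)
flags = speech_flags(StandInVad(), audio)
print(flags)
```

With a real `Vad` instance, the same `speech_flags` call should work unchanged, since it reads the frame size from the object instead of hard-coding it.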

Other classes

class pocketsphinx.Config(*args, **kwargs)

Configuration object for PocketSphinx.

The PocketSphinx recognizer can be configured either implicitly, by passing
keyword arguments to Decoder, or by creating and manipulating Config objects.
There are a large number of parameters, most of which are not important or
subject to change.

A Config can be initialized with keyword arguments:

config = Config(hmm="path/to/things", dict="my.dict")

It can also be initialized by parsing JSON (either as bytes or str):

config = Config.parse_json('''{"hmm": "path/to/things",
                               "dict": "my.dict"}''')

The “parser” is very much not strict, so you can also pass a sort of pseudo-YAML
to it, e.g.:

config = Config.parse_json("hmm: path/to/things, dict: my.dict")
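
If the configuration is assembled in code, one way to avoid depending on the lenient parsing is to generate strict JSON with the standard library and pass that text to `parse_json`. This is a minimal sketch; `config_json` is a hypothetical helper, and the paths are the placeholder values used above.

```python
import json


def config_json(**params):
    """Serialize keyword parameters to strict JSON text (hypothetical helper)."""
    return json.dumps(params, sort_keys=True)


text = config_json(hmm="path/to/things", dict="my.dict")
print(text)
# The resulting str could then be passed to Config.parse_json(text).
```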

You can also initialize an empty Config and set arguments in it directly:

config = Config()
config["hmm"] = "path/to/things"

In general, a Config mostly acts like a dictionary, and can be iterated over in the
same fashion. However, attempting to access a parameter that does not already
exist will raise a KeyError.

Many parameters have default values. Also, when constructing a Config directly
(as opposed to parsing JSON), hmm , lm , and dict are set to the default models
(some kind of US English models of unknown origin + CMUDict). You can prevent
this by passing None for any of these parameters, e.g.:

config = Config(lm=None) # Do not load a language model

Decoder initialization will fail if more than one of lm, jsgf, fsg, keyphrase,
kws, allphone, or lmctl are set in the configuration. To make life easier, and
because there is no possible case in which you would do this intentionally, if you
initialize a Decoder or Config with any of these (and not lm), the default lm
value will be removed. This is not the case if you decide to set one of them in an
existing Config, so in that case you must make sure to set lm to None:

config["jsgf"] = "spam_eggs_and_spam.gram"
config["lm"] = None
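
The "at most one search mode" rule can be checked before a decoder is created. The sketch below is an assumption-laden illustration: `check_single_search` and `value_or_none` are hypothetical helpers, and the demonstration uses a plain dict in place of a Config. It relies only on item access and on KeyError for missing keys, which is the behaviour described above.

```python
SEARCH_KEYS = ("lm", "jsgf", "fsg", "keyphrase", "kws", "allphone", "lmctl")


def value_or_none(config, key):
    """Read a key, treating a missing parameter as unset."""
    try:
        return config[key]
    except KeyError:
        return None


def check_single_search(config):
    """Return the search keys that are set; raise if there is more than one."""
    active = [key for key in SEARCH_KEYS if value_or_none(config, key) is not None]
    if len(active) > 1:
        raise ValueError("more than one search mode set: " + ", ".join(active))
    return active


# A plain dict stands in for a Config here.
settings = {"lm": None, "jsgf": "spam_eggs_and_spam.gram"}
print(check_single_search(settings))
```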

You may also call default_search_args() after the fact to set hmm, lm, and
dict to the system defaults. Note that this will set them unconditionally.

See Configuration parameters for a description of existing parameters.

default_search_args(self)

Set arguments for the default acoustic and language model.

Set hmm , lm , and dict to the default ones (some kind of US English models
of unknown origin + CMUDict). This will overwrite any previous values for
these parameters, and does not check if the files exist.

describe(self)

Iterate over parameter descriptions.

This function returns a generator over the parameters defined in a
configuration, as Arg objects.

Returns: Descriptions of parameters including their default values
and documentation

Return type: Iterable[Arg]

dumps(self)

Serialize configuration to a JSON-formatted str.

This produces JSON from a configuration object, with default values included.

Returns: Serialized JSON

Return type: str

Raises: RuntimeError – if serialization fails somehow.

exists(self, key)

get_boolean(self, key)

get_float(self, key)

get_int(self, key)

get_string(self, key)

items(self)

static parse_file(unicode path)

DEPRECATED: Parse a config file.

This reads a configuration file in “command-line” format, for example:

-arg1 value -arg2 value
-arg3 value

Parameters: path (str) – Path to configuration file.

Returns: Parsed config, or None on error.

Return type: Config
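
Since parse_file is deprecated, an existing "command-line" format file may need converting to JSON. The sketch below shows one possible conversion using only the standard library. It is not the library's own parser: `parse_command_line_format` is a hypothetical helper, and it assumes every argument name is followed by exactly one value.

```python
import json
import shlex


def parse_command_line_format(text):
    """Turn '-name value' pairs into a dict (illustrative, not the real parser)."""
    tokens = shlex.split(text)  # splits on spaces and newlines, honours quotes
    if len(tokens) % 2:
        raise ValueError("expected -name value pairs")
    params = {}
    for name, value in zip(tokens[0::2], tokens[1::2]):
        if not name.startswith("-"):
            raise ValueError("expected an argument name, got %r" % name)
        params[name.lstrip("-")] = value
    return params


text = "-arg1 value1 -arg2 value2\n-arg3 value3\n"
params = parse_command_line_format(text)
print(json.dumps(params))
```

All values stay strings in this sketch; converting them to numbers or booleans is left to whatever consumes the JSON.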

static parse_json(json)

Parse JSON (or pseudo-YAML) configuration.

Parameters: json (bytes|str) – JSON data.

Returns: Parsed config, or None on error.

Return type: Config

set_boolean(self, key, val)

set_float(self, key, double val)

set_int(self, key, long val)

set_string(self, key, val)

set_string_extra(self, key, val)

class pocketsphinx.Arg(name, default, doc, type, required)

Description of a configuration parameter.

default

Default value of parameter.

doc

Description of parameter.

name

Parameter name (without leading dash).

required

Is this parameter required?

type

Type (as a Python type object) of parameter value.
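
The five attributes listed above are enough to print a compact parameter summary from whatever Config.describe() yields. The following sketch uses a namedtuple as a stand-in with the same attribute names; the sample defaults and descriptions are invented for illustration and are not the library's real values.

```python
from collections import namedtuple

# Stand-in with the same attribute names as pocketsphinx.Arg.
Arg = namedtuple("Arg", "name default doc type required")


def format_args(args):
    """Return one summary line per parameter, sorted by name."""
    lines = []
    for arg in sorted(args, key=lambda a: a.name):
        flag = "required" if arg.required else "optional"
        lines.append("%-6s %-4s %-8s default=%r  %s"
                     % (arg.name, arg.type.__name__, flag, arg.default, arg.doc))
    return lines


# Invented sample data; real code would pass config.describe() instead.
sample = [
    Arg("lm", None, "language model file (illustrative text)", str, False),
    Arg("hmm", None, "acoustic model directory (illustrative text)", str, False),
]
for line in format_args(sample):
    print(line)
```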

class pocketsphinx.LogMath(base=1.0001, shift=0, use_table=False)

Log-space computation object used by PocketSphinx.

PocketSphinx does various computations internally using integer math in
logarithmic space with a very small base (usually 1.0001 or 1.0003).

add(self, p, q)

exp(self, p)

get_zero(self)

ln_to_log(self, p)

log(self, p)

log10_to_log(self, p)

log_to_ln(self, p)

log_to_log10(self, p)
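
The idea of integer math in logarithmic space with a very small base can be illustrated in a few lines of plain Python. This is a sketch of the concept only, not the LogMath implementation: the function names are hypothetical, and it ignores the shift and use_table options.

```python
import math

BASE = 1.0001  # the usual base mentioned above


def to_log(p, base=BASE):
    """Probability to integer log value in the given base."""
    return int(round(math.log(p) / math.log(base)))


def from_log(logp, base=BASE):
    """Integer log value back to a probability."""
    return base ** logp


def log_add(logp, logq, base=BASE):
    """Integer log of (p + q), computed relative to the larger operand."""
    hi, lo = max(logp, logq), min(logp, logq)
    return hi + int(round(math.log1p(base ** (lo - hi)) / math.log(base)))


half = to_log(0.5)
print(half, from_log(half))
print(from_log(log_add(half, half)))  # 0.5 + 0.5 in log space
```

A base this close to 1 keeps the rounding error of the integer representation small: here 0.5 maps to -6932 and converts back to roughly 0.49999.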

class pocketsphinx.Jsgf(unicode path, Jsgf parent=None)

JSGF parser.

build_fsg(self, JsgfRule rule, LogMath logmath, float lw)

get_name(self)

get_rule(self, name)

class pocketsphinx.JsgfRule

JSGF Rule.

Do not create this class directly.

get_name(self)

is_public(self)

class pocketsphinx.NGramModel(Config config, LogMath logmath, unicode path)

N-Gram language model.

add_word(self, word, float weight)

casefold(self, ngram_case_t kase)

prob(self, words)

static readfile(unicode path)

size(self)

static str_to_type(unicode typestr)

static type_to_str(ngram_file_type_t _type)

write(self, unicode path, ngram_file_type_t ftype=NGRAM_AUTO)

class pocketsphinx.FsgModel(name, LogMath logmath, float lw, int nstate)

Finite-state recognition grammar.

accept(self, words)

add_alt(self, baseword, altword)

add_silence(self, silword, int state, float silprob)

static jsgf_read_file(unicode filename, LogMath logmath, float lw)

null_trans_add(self, int src, int dst, int logp)

static readfile(unicode filename, LogMath logmath, float lw)

set_final_state(self, state)

set_start_state(self, state)

tag_trans_add(self, int src, int dst, int logp, int wid)

trans_add(self, int src, int dst, int logp, int wid)

word_add(self, word)

word_id(self, word)

word_str(self, wid)

writefile(self, unicode path)

writefile_fsm(self, unicode path)

writefile_symtab(self, unicode path)

class pocketsphinx.Lattice

Word lattice.

static readfile(unicode path)

write(self, unicode path)

write_htk(self, unicode path)

class pocketsphinx.Segment

Word segmentation, as generated by Decoder.seg.

word

Name of word.

Type: str

start_frame

Index of start frame.

Type: int

end_frame

Index of end frame (inclusive!)

Type: int

ascore

Acoustic score (density).

Type: float

lscore

Language model score (joint probability).

Type: float

lback

Language model backoff order.

Type: int
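
Because end_frame is inclusive, converting a segment to times needs a small off-by-one adjustment. The sketch below assumes a frame rate of 100 frames per second; that value is an assumption and should be checked against the actual configuration. The namedtuple is a stand-in for Segment carrying only the fields used here, and `segment_times` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in carrying only the Segment fields used in this sketch.
Segment = namedtuple("Segment", "word start_frame end_frame")

FRAME_RATE = 100  # frames per second; ASSUMED, check your configuration


def segment_times(seg, frame_rate=FRAME_RATE):
    """Return (start, end) in seconds; end_frame is inclusive, hence the +1."""
    start = seg.start_frame / frame_rate
    end = (seg.end_frame + 1) / frame_rate
    return start, end


print(segment_times(Segment("hello", 12, 45)))
```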

class pocketsphinx.Hypothesis(hypstr, score, prob)

Recognition hypothesis, as returned by Decoder.hyp.

hypstr

Recognized text.

Type: str

score

Recognition score.

Type: float

best_score

Alias for score for compatibility.

Type: float

prob

Posterior probability.

Type: float

class pocketsphinx.Alignment

Sub-word alignment as returned by get_alignment.

For the moment this is read-only. You are able to iterate over the words, phones,
or states in it, as well as sub-iterating over each of their children, as described in
AlignmentEntry.

phones(self)

Iterate over phones in the alignment.

states(self)

Iterate over states in the alignment.

words(self)

Iterate over words in the alignment.

class pocketsphinx.AlignmentEntry

Entry (word, phone, state) in an alignment.

Iterating over this will iterate over its children (i.e. the phones in a word or the
states in a phone) if any. For example:

for word in decoder.get_alignment():
    print("%s from %.2f to %.2f" % (word.name, word.start,
                                    word.start + word.duration))
    for phone in word:
        print("%s at %.2f duration %.2f" %
              (phone.name, phone.start, phone.duration))

name

Name of segment (word, phone name, state id)

Type: str

start

Index of start frame.

Type: int

duration

Duration in frames.

Type: int

score

Acoustic score (density).

Type: float
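
The nested iteration described for AlignmentEntry lends itself to a small recursive walk that flattens words, phones, and states into rows. In this sketch, `Entry` is a hypothetical stand-in exposing only name, start, duration, and iteration over children, and the sample word and phone timings are invented.

```python
class Entry:
    """Hypothetical stand-in for AlignmentEntry, for illustration only."""

    def __init__(self, name, start, duration, children=()):
        self.name, self.start, self.duration = name, start, duration
        self._children = list(children)

    def __iter__(self):
        return iter(self._children)


def flatten(entries, depth=0):
    """Yield (depth, name, start, end) rows, with end = start + duration."""
    for entry in entries:
        yield depth, entry.name, entry.start, entry.start + entry.duration
        yield from flatten(entry, depth + 1)


# Invented sample: one word with two phones, times in frames.
word = Entry("go", 10, 20, [Entry("G", 10, 8), Entry("OW", 18, 12)])
rows = list(flatten([word]))
for depth, name, start, end in rows:
    print("  " * depth + "%s %d-%d" % (name, start, end))
```

Depth 0 corresponds to words, depth 1 to phones, and depth 2 to states when those are present.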
