Joey S2T


JoeyS2T is a JoeyNMT extension for Speech-to-Text tasks such as Automatic Speech Recognition (ASR) and end-to-end Speech Translation (ST). It inherits the core philosophy of JoeyNMT, a minimalist, novice-friendly toolkit built on PyTorch, and strives for simplicity and accessibility.

What's new

  • Upgraded to JoeyNMT v2.3.
  • Our paper has been accepted at EMNLP 2022 System Demo Track!

Features

JoeyS2T implements the following features:

  • Transformer Encoder-Decoder
  • 1d-Conv Subsampling
  • Cross-entropy and CTC joint objective
  • Mel filterbank spectrogram extraction
  • CMVN, SpecAugment
  • WER evaluation
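
As a rough illustration of the audio frontend features listed above (log-mel filterbank extraction, utterance-level CMVN, and SpecAugment), here is a minimal torchaudio sketch. It is not JoeyS2T's internal implementation; the file name, mask parameters, and mono-audio assumption are placeholders.

import torch
import torchaudio

# Placeholder mono audio file
waveform, sample_rate = torchaudio.load('example.wav')

# 80-dimensional log-mel filterbank features
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=400, hop_length=160, n_mels=80)(waveform)
feats = torch.log(mel + 1e-10).squeeze(0).transpose(0, 1)   # (time, n_mels)

# CMVN: per-utterance mean and variance normalization
feats = (feats - feats.mean(dim=0)) / (feats.std(dim=0) + 1e-10)

# SpecAugment-style frequency and time masking (training-time augmentation)
spec = feats.transpose(0, 1).unsqueeze(0)                   # (1, n_mels, time)
spec = torchaudio.transforms.FrequencyMasking(freq_mask_param=27)(spec)
spec = torchaudio.transforms.TimeMasking(time_mask_param=100)(spec)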

Furthermore, all the functionalities of JoeyNMT v2 are also available in JoeyS2T:

  • BLEU and ChrF evaluation
  • BPE tokenization (with BPE dropout option)
  • Beam search and greedy decoding (with repetition penalty, ngram blocker)
  • Customizable initialization
  • Attention visualization
  • Learning curve plotting
  • Scoring hypotheses and references
  • Multilingual translation with language tags

Installation

JoeyS2T is built on PyTorch. Please make sure you have a compatible environment. We tested JoeyS2T v2.3 with:

  • python 3.11
  • torch 2.1.2
  • torchaudio 2.1.2
  • cuda 12.1

Clone this repository and install via pip:

$ git clone https://github.com/may-/joeys2t.git
$ cd joeys2t
$ python -m pip install -e .
$ python -m unittest

📝 Note: You may need to install extra dependencies (torchaudio backends) such as ffmpeg, sox, or soundfile. See the torchaudio installation instructions.
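
If you are unsure whether your environment matches the tested setup above, a quick sanity check (a minimal sketch, not part of the JoeyS2T test suite) is to print the installed versions and the available torchaudio backends:

import torch
import torchaudio

print(torch.__version__, torchaudio.__version__)  # e.g. 2.1.2 2.1.2
print(torch.cuda.is_available())                  # True if a CUDA build and GPU are present
print(torchaudio.list_audio_backends())           # e.g. ['ffmpeg', 'sox', 'soundfile']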

Documentation & Tutorials

If you are not yet familiar with JoeyNMT, please check JoeyNMT's documentation first.

For details, follow the tutorials in the notebooks directory.

Benchmarks & Pretrained models

We provide benchmarks and pretrained models for Speech Recognition (ASR) and Speech Translation (ST) with JoeyS2T.

The models are also available via Torch Hub!

import torch

# Load a pretrained model (here: MuST-C v2 En-De speech translation) via Torch Hub
model = torch.hub.load('may-/joeys2t', 'mustc_v2_ende_st')
translations = model.generate(['test.wav'])
print(translations[0])
# 'Hallo, Welt!'

⚠️ Warning: The 1d-conv subsampling layer may raise an error for audio inputs that are too short. (Frame sequences shorter than the kernel size cannot be convolved!)
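
Continuing the Torch Hub example above, one possible workaround (a sketch, not an official JoeyS2T API; the 0.5-second threshold is an arbitrary assumption) is to check the audio duration before calling generate():

import torchaudio

# Read metadata only; no full decode needed
info = torchaudio.info('test.wav')
duration = info.num_frames / info.sample_rate  # length in seconds
if duration >= 0.5:                            # arbitrary minimum length
    print(model.generate(['test.wav'])[0])     # 'model' from the Torch Hub snippet above
else:
    print('Audio too short for the 1d-conv subsampling layer.')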

Reference

If you use JoeyS2T in a publication or thesis, please cite the following paper:

@inproceedings{ohta-etal-2022-joeys2t,
    title = "{JoeyS2T}: Minimalistic Speech-to-Text Modeling with {JoeyNMT}",
    author = "Ohta, Mayumi  and
      Kreutzer, Julia  and
      Riezler, Stefan",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-demos.6",
    pages = "50--59",
}

Contact

Please open an issue if you find a bug in the code.

For general questions, email me at ohta <at> cl.uni-heidelberg.de.