setup: update most dependencies (#1323)
- setup: switch to torch 2.0+ and lightning 2.0+
- setup: switch to torchaudio 2.0+ and soundfile 0.12+
- setup: switch to pyannote.core 5.0+ and pyannote.database 5.0+
- setup: switch to speechbrain 0.5.14+

- BREAKING(task): rename `Segmentation` task to `SpeakerDiarization`
- BREAKING(task): remove support for variable chunk duration

- BREAKING(pipeline): remove `SpeakerSegmentation` pipeline (in favor of `SpeakerDiarization` pipeline)
- BREAKING(pipeline): remove support for `FINCHClustering` and `HiddenMarkovModelClustering`

- BREAKING: drop support for Python 3.7
hbredin authored Apr 17, 2023
1 parent 7121a73 commit 9faa8fc
Showing 20 changed files with 75 additions and 859 deletions.
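
The version floors listed in the commit message can be checked against an existing environment before upgrading. The following is a minimal sketch using only the standard library; it assumes the distribution names match the PyPI package names, and the helper names (`release_tuple`, `meets_floor`, `check_environment`) are hypothetical, not part of pyannote.audio.

```python
import re
from importlib import metadata

# Minimum versions named in the commit message. The distribution names are
# an assumption (taken to be the same as the PyPI package names).
MINIMUM_VERSIONS = {
    "torch": "2.0",
    "lightning": "2.0",
    "torchaudio": "2.0",
    "soundfile": "0.12",
    "pyannote.core": "5.0",
    "pyannote.database": "5.0",
    "speechbrain": "0.5.14",
}


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release segment as a tuple of ints.

    Local and pre-release suffixes (e.g. '+cu117', 'rc1') are ignored,
    so this is coarser than a full PEP 440 comparison.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_floor(installed: str, minimum: str) -> bool:
    """Compare release tuples after padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> dict:
    """Map each dependency to True, False, or None (not installed)."""
    report = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            report[name] = meets_floor(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_environment()` returns one entry per dependency, which makes it easy to spot packages that are still below the new minimum.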
46 changes: 23 additions & 23 deletions .github/workflows/test.yml
@@ -2,9 +2,9 @@ name: Tests

on:
push:
branches: [ develop ]
branches: [develop]
pull_request:
branches: [ develop ]
branches: [develop]

jobs:
build:
@@ -13,28 +13,28 @@ jobs:
strategy:
matrix:
os: [ubuntu-latest]
python-version: [3.7, 3.8, 3.9]
python-version: [3.8, 3.9, "3.10"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install libsndfile
if: matrix.os == 'ubuntu-latest'
run: |
sudo apt-get install libsndfile1
- name: Install pyannote.audio
run: |
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install libsndfile
if: matrix.os == 'ubuntu-latest'
run: |
sudo apt-get install libsndfile1
- name: Install pyannote.audio
run: |
pip install -e .[dev,testing]
- name: Test with pytest
run: |
- name: Test with pytest
run: |
export PYANNOTE_DATABASE_CONFIG=$GITHUB_WORKSPACE/tests/data/database.yml
pytest --cov-report=xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
file: ./coverage.xml
env_vars: PYTHON
name: codecov-pyannote-audio
fail_ci_if_error: false
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
file: ./coverage.xml
env_vars: PYTHON
name: codecov-pyannote-audio
fail_ci_if_error: false
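
The updated matrix quotes `"3.10"` while leaving `3.8` and `3.9` bare. The commit does not state why, but a likely reason is that an unquoted `3.10` is a plain YAML scalar that resolves to the float 3.1. The sketch below mimics that with Python's own `float`; it is an illustration of the numeric collapse, not a YAML parser, and the helper name is hypothetical.

```python
def as_float_then_text(scalar: str) -> str:
    """Mimic a loader that resolves a plain numeric scalar to a float and
    later renders it back to text (an assumption about loader behaviour)."""
    return str(float(scalar))


# Unquoted entries: each one goes through float resolution.
unquoted = [as_float_then_text(v) for v in ("3.8", "3.9", "3.10")]

# Quoted entries stay strings, so "3.10" survives unchanged.
quoted = ["3.8", "3.9", "3.10"]
```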
8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -2,12 +2,18 @@

## Version 3.0 (xxxx-xx-xx)

- setup: switch to pyannote.database 5.0
- feat(task): add support for label scope in speaker diarization task (from pyannote.database 5.0)
- feat(task): add support for missing classes in multi-label segmentation task (from pyannote.database 5.0)
- improve(task): load metadata as tensors rather than pyannote.core instances
- setup: switch to torch 2.0+ and lightning 2.0+
- setup: switch to torchaudio 2.0+ and soundfile 0.12+
- setup: switch to pyannote.core 5.0+ and pyannote.database 5.0+
- setup: switch to speechbrain 0.5.14+
- BREAKING(task): rename `Segmentation` task to `SpeakerDiarization`
- BREAKING(task): remove support for variable chunk duration
- BREAKING(pipeline): remove `SpeakerSegmentation` pipeline (in favor of `SpeakerDiarization` pipeline)
- BREAKING(pipeline): remove support `FINCHClustering` and `HiddenMarkovModelClustering`
- BREAKING: drop support for Python 3.7

## Version 2.1.1 (2022-10-27)

22 changes: 6 additions & 16 deletions README.md
@@ -30,31 +30,21 @@ for turn, _, speaker in diarization.itertracks(yield_label=True):
# ...
```

## What's new in `pyannote.audio` 2.x?
## Highlights

For version 2.x of `pyannote.audio`, [I](https://herve.niderb.fr) decided to rewrite almost everything from scratch.
Highlights of this release are:

- :exploding_head: much better performance (see [Benchmark](#benchmark))
- :snake: Python-first API
- :hugs: pretrained [pipelines](https://hf.co/models?other=pyannote-audio-pipeline) (and [models](https://hf.co/models?other=pyannote-audio-model)) on [:hugs: model hub](https://huggingface.co/pyannote)
- :exploding_head: state-of-the-art performance (see [Benchmark](#benchmark))
- :snake: Python-first API
- :zap: multi-GPU training with [pytorch-lightning](https://pytorchlightning.ai/)
- :control_knobs: data augmentation with [torch-audiomentations](https://github.com/asteroid-team/torch-audiomentations)
- :boom: [Prodigy](https://prodi.gy/) recipes for model-assisted audio annotation

## Installation

Only Python 3.8+ is officially supported (though it might work with Python 3.7)
Only Python 3.8+ is supported.

```bash
conda create -n pyannote python=3.8
conda activate pyannote

# pytorch 1.11 is required for speechbrain compatibility
# (see https://pytorch.org/get-started/previous-versions/#v1110)
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 -c pytorch

pip install pyannote.audio
# install from develop branch
pip install -qq https://github.com/pyannote/pyannote-audio/archive/refs/heads/develop.zip
```

## Documentation
3 changes: 3 additions & 0 deletions doc/source/index.rst
@@ -9,6 +9,9 @@ Installation

::

$ conda create -n pyannote python=3.10
$ conda activate pyannote
$ conda install pytorch torchvision torchaudio -c pytorch
$ pip install pyannote.audio


2 changes: 1 addition & 1 deletion pyannote/audio/cli/train.py
@@ -26,7 +26,7 @@

import hydra
from hydra.utils import instantiate
from lightning_lite.utilities.seed import seed_everything
from lightning.pytorch import seed_everything
from omegaconf import DictConfig, OmegaConf

# from pyannote.audio.core.callback import GraduallyUnfreeze
11 changes: 3 additions & 8 deletions pyannote/audio/cli/train_config/trainer/default.yaml
@@ -2,9 +2,7 @@
_target_: pytorch_lightning.Trainer
accelerator: auto
accumulate_grad_batches: 1
auto_scale_batch_size: False
auto_lr_find: False
benchmark: False
benchmark: null # TODO: automatically set to True when using fixed duration chunks
deterministic: False
check_val_every_n_epoch: 1
devices: auto
@@ -13,7 +11,7 @@ enable_checkpointing: True
enable_model_summary: True
enable_progress_bar: True
fast_dev_run: False
gradient_clip_val: 0
gradient_clip_val: null
gradient_clip_algorithm: norm
limit_predict_batches: 1.0
limit_test_batches: 1.0
@@ -25,16 +23,13 @@ max_steps: -1
max_time: null
min_epochs: 1
min_steps: null
move_metrics_to_cpu: False
multiple_trainloader_mode: max_size_cycle
num_nodes: 1
num_sanity_val_steps: 2
overfit_batches: 0.0
precision: 32
profiler: null
reload_dataloaders_every_n_epochs: 0
replace_sampler_ddp: True
use_distributed_sampler: True # TODO: check what this does exactly
strategy: null
sync_batchnorm: False
track_grad_norm: -1
val_check_interval: 1.0
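
The key changes in this trainer config can be applied mechanically to a custom config written for the previous version. The sketch below works on a plain dictionary and only drops or renames keys; it does not touch values such as `benchmark` or `gradient_clip_val`, which this commit also changes. Treating `replace_sampler_ddp` as renamed to `use_distributed_sampler` is an assumption drawn from the one-for-one substitution in the diff, and `migrate_trainer_config` is a hypothetical helper name.

```python
# Keys deleted from default.yaml in this commit.
REMOVED_KEYS = {
    "auto_scale_batch_size",
    "auto_lr_find",
    "move_metrics_to_cpu",
    "multiple_trainloader_mode",
    "track_grad_norm",
}

# Key substituted in place in this commit (assumed to be a rename).
RENAMED_KEYS = {"replace_sampler_ddp": "use_distributed_sampler"}


def migrate_trainer_config(old: dict) -> dict:
    """Return a copy of `old` without removed keys and with renamed keys."""
    new = {}
    for key, value in old.items():
        if key in REMOVED_KEYS:
            continue  # no longer present in the updated config
        new[RENAMED_KEYS.get(key, key)] = value
    return new
```

For example, `migrate_trainer_config({"auto_lr_find": False, "replace_sampler_ddp": True})` keeps only the sampler setting, under its new name.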
5 changes: 1 addition & 4 deletions pyannote/audio/core/model.py
@@ -35,8 +35,8 @@
import torch.optim
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import RepositoryNotFoundError
from lightning_fabric.utilities.cloud_io import _load as pl_load
from pyannote.core import SlidingWindow
from lightning_lite.utilities.cloud_io import _load as pl_load
from pytorch_lightning.utilities.model_summary import ModelSummary
from semver import VersionInfo
from torch.utils.data import DataLoader
@@ -523,9 +523,6 @@ def val_dataloader(self) -> DataLoader:
def validation_step(self, batch, batch_idx):
return self.task.validation_step(batch, batch_idx)

def validation_epoch_end(self, outputs):
return self.task.validation_epoch_end(outputs)

def configure_optimizers(self):
return torch.optim.Adam(self.parameters(), lr=1e-3)
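
The deleted `validation_epoch_end(outputs)` method matches a hook that Lightning 2.0 no longer calls. Its replacement, `on_validation_epoch_end()`, receives no `outputs` argument, so code that needs per-step results has to store them itself. This commit does not add such storage (the task-level method was a no-op). The sketch below only illustrates the pattern in plain Python, without importing Lightning; the class name is hypothetical.

```python
class ValidationLossTracker:
    """Plain-Python stand-in for a module that aggregates validation results."""

    def __init__(self):
        self._val_losses = []

    def validation_step(self, batch, batch_idx):
        # Placeholder "loss": the mean of the batch values.
        loss = sum(batch) / len(batch)
        self._val_losses.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # Aggregate what validation_step stored, then reset for the next epoch.
        # The mean is returned here only so that the sketch can be inspected;
        # a real module would typically log it instead.
        if not self._val_losses:
            return None
        mean = sum(self._val_losses) / len(self._val_losses)
        self._val_losses.clear()
        return mean
```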

17 changes: 3 additions & 14 deletions pyannote/audio/core/task.py
@@ -23,31 +23,23 @@

from __future__ import annotations

from functools import partial

import scipy.special

try:
from functools import cached_property
except ImportError:
from backports.cached_property import cached_property

import multiprocessing
import sys
import warnings
from dataclasses import dataclass
from enum import Enum
from functools import cached_property, partial
from numbers import Number
from typing import Dict, List, Optional, Sequence, Text, Tuple, Union
from typing import Dict, List, Literal, Optional, Sequence, Text, Tuple, Union

import pytorch_lightning as pl
import scipy.special
import torch
from pyannote.database import Protocol
from torch.utils.data import DataLoader, Dataset, IterableDataset
from torch_audiomentations import Identity
from torch_audiomentations.core.transforms_interface import BaseWaveformTransform
from torchmetrics import Metric, MetricCollection
from typing_extensions import Literal

from pyannote.audio.utils.loss import binary_cross_entropy, nll_loss
from pyannote.audio.utils.protocol import check_protocol
@@ -447,9 +439,6 @@ def val_dataloader(self) -> Optional[DataLoader]:
def validation_step(self, batch, batch_idx: int):
return self.common_step(batch, batch_idx, "val")

def validation_epoch_end(self, outputs):
pass

def default_metric(self) -> Union[Metric, Sequence[Metric], Dict[str, Metric]]:
"""Default validation metric"""
msg = f"Missing '{self.__class__.__name__}.default_metric' method."
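
The import cleanup in this file follows from dropping Python 3.7: both `functools.cached_property` and `typing.Literal` are in the standard library from Python 3.8, so the `backports.cached_property` fallback and the `typing_extensions` import are no longer needed. A minimal sketch of the two features follows; the class, attribute, and alias names are illustrative and not taken from pyannote.audio.

```python
from functools import cached_property
from typing import Literal, get_args

# Illustrative alias restricting a value to a fixed set of strings.
Subset = Literal["train", "development", "test"]


class DurationSummary:
    """Toy class whose expensive property is computed once, then cached."""

    def __init__(self, durations):
        self.durations = list(durations)
        self.computations = 0  # counts how often the property body runs

    @cached_property
    def total_duration(self) -> float:
        self.computations += 1
        return sum(self.durations)
```

Accessing `total_duration` repeatedly runs the method body only once per instance; later accesses read the cached value.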
2 changes: 0 additions & 2 deletions pyannote/audio/pipelines/__init__.py
@@ -24,13 +24,11 @@
from .overlapped_speech_detection import OverlappedSpeechDetection
from .resegmentation import Resegmentation
from .speaker_diarization import SpeakerDiarization
from .speaker_segmentation import SpeakerSegmentation
from .voice_activity_detection import VoiceActivityDetection

__all__ = [
"VoiceActivityDetection",
"OverlappedSpeechDetection",
"SpeakerSegmentation",
"SpeakerDiarization",
"Resegmentation",
"MultiLabelSegmentation",