- New York University
- Brooklyn, NY
- sivannavis.github.io
- @NavissivanD
- in/sivanding
Starred repositories
Latte: Cross-framework Python Package for Evaluation of Latent-based Generative Models
A simple and elegant Jekyll theme for an academic personal homepage
A beautiful, simple, clean, and responsive Jekyll theme for academics
[ICLR'24] Learning to Compose: Improving Object Centric Learning by Injecting Compositionality
Multi-view-AE: An extensive collection of multi-modal autoencoders implemented in a modular, scikit-learn style framework.
Multi-VAE: Learning Disentangled View-common and View-peculiar Visual Representations for Multi-view Clustering
Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
This toolbox aims to unify audio generation model evaluation for easier comparison.
Python packaging and dependency management made easy
This repo implements a Stable Diffusion model in PyTorch with all the essential components.
This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch
Vector (and Scalar) Quantization, in PyTorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
AudioLDM training, finetuning, evaluation and inference.
Official repository supporting the L3DAS23 IEEE ICASSP Grand Challenge
[OBSOLETE] Plugin that adds OAuth2 login support to yt-dlp's YouTube extractors
A feature-rich command-line audio/video downloader
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Stereo, Binaural, Surround -- The more the better
Learning audio concepts from natural language supervision
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)