koustuvsinha

Organizations

@nodeschool @freeCodeCamp @ReScience @iemdatagroup @iem-devs @rllabmcgill @reproducibility-challenge @ml-retrospectives


ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).

Jupyter Notebook 184 19 Updated Dec 13, 2024

🔥 Aurora Series: A more efficient multimodal large language model series for video.

Python 57 4 Updated Nov 16, 2024
Python 3,089 265 Updated Oct 16, 2024
Python 323 22 Updated Nov 5, 2024

💭👀 precognition.nvim - Precognition uses virtual text and gutter signs to show available motions.

Lua 929 11 Updated Dec 5, 2024

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,164 168 Updated Dec 14, 2024
Python 53 3 Updated Sep 19, 2024

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 494 254 Updated Jul 4, 2024

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Python 48 4 Updated Jul 10, 2024

Machine Learning Engineering Open Book

Python 11,980 727 Updated Dec 4, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,792 117 Updated Oct 30, 2024

Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)

Python 94 9 Updated Oct 21, 2023

PyTorch native finetuning library

Python 4,468 457 Updated Dec 13, 2024

The official Meta Llama 3 GitHub site

Python 27,494 3,132 Updated Aug 12, 2024

Cross-platform, fast, feature-rich, GPU based terminal

Python 25,009 999 Updated Dec 12, 2024

A neovim plugin for interactively running code with the jupyter kernel. Fork of magma-nvim with improvements in image rendering, performance, and more

Python 634 34 Updated Nov 14, 2024

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python 1,283 55 Updated Dec 10, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 31,864 4,844 Updated Dec 14, 2024

utilities for decoding deep representations (like sentence embeddings) back to text

Python 753 85 Updated Sep 22, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,607 2,216 Updated Nov 28, 2024

EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties

Python 119 9 Updated Nov 10, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,840 264 Updated Jun 4, 2024

A PyTorch-based Speech Toolkit

Python 9,055 1,412 Updated Dec 9, 2024

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 10,906 1,084 Updated Dec 9, 2024

HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos

Python 17 1 Updated Mar 20, 2024

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]

Python 353 44 Updated May 19, 2022

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 2,720 255 Updated Aug 9, 2024

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,106 253 Updated Nov 26, 2024

[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering

Python 181 22 Updated Jan 14, 2024

Fast Differentiable Tensor Library in JavaScript and TypeScript with Bun + Flashlight

TypeScript 1,147 26 Updated Jul 23, 2024