Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation"
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
[ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.
[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?"
Hydra is a framework for elegantly configuring complex applications
PyTorch code and models for V-JEPA self-supervised learning from video.
Open-Sora: Democratizing Efficient Video Production for All
✨✨Latest Advances on Multimodal Large Language Models
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
Official PyTorch implementation of the paper "TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis" ICCV 2023
[CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding."
Showing how to use CLIP-ViP for video search
Scalable and user-friendly neural 🧠 forecasting algorithms.
WildCapture: code and dataset used in the paper "Leveraging Visual Attention for Out-of-Distribution Detection", published at ICCV 2023, Paris, Out Of Distribution …
The AI-native open-source embedding database
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
DSPy: The framework for programming—not prompting—language models
An Evaluation Framework for Temporal Information Extraction Systems
An Image/Text Retrieval Test Collection to Support Multimedia Content Creation
[ECCV 2022] A PyTorch implementation of TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset