  • National University of Singapore
  • Singapore
  • LinkedIn in/ngzhili


Curated list of datasets and tools for post-training.

2,629 228 Updated Jan 29, 2025

Multilingual Dialogue Datasets

19 4 Updated Aug 18, 2022
Jupyter Notebook 57 8 Updated Apr 2, 2024

This is the official implementation of NeurIPS 2021 "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

Python 71 10 Updated Apr 1, 2022

A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.

174 4 Updated Jul 31, 2024
7 Updated Feb 4, 2025

Benchmarking library for RAG

Jupyter Notebook 161 14 Updated Feb 6, 2025

This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 11,793 1,197 Updated Feb 2, 2025

Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).

Python 18,656 1,685 Updated Jan 18, 2025

A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.

530 48 Updated Jan 23, 2025

RAG Citation enhances Retrieval-Augmented Generation (RAG) by automatically generating relevant citations for AI-generated content. It ensures credibility by backing responses with accurate referen…

Python 21 1 Updated Nov 4, 2024

Effortlessly run LLM backends, APIs, frontends, and services with one command.

Python 1,119 86 Updated Feb 7, 2025

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 66,177 7,074 Updated Feb 7, 2025

Interactively explore unstructured datasets from your dataframe.

TypeScript 1,145 84 Updated Jan 10, 2025

Uniform Manifold Approximation and Projection

Python 7,613 817 Updated Nov 29, 2024

Open-source vector similarity search for Postgres

C 13,890 654 Updated Jan 19, 2025

BM25 search implemented in PL/pgSQL

Jupyter Notebook 23 Updated Dec 6, 2024

pgvector support for Python

Python 1,067 67 Updated Dec 19, 2024

Postgres for Search and Analytics

Rust 6,657 209 Updated Feb 7, 2025

Generic RAG framework to apply the power of LLMs to any given dataset

Python 518 84 Updated Jan 15, 2025

pgvector + embeddings API

Python 20 4 Updated Dec 14, 2023

Attribute (or cite) statements generated by LLMs back to in-context information.

Jupyter Notebook 195 16 Updated Oct 8, 2024

Cache-Augmented Generation: A Simple, Efficient Alternative to RAG

Python 944 142 Updated Feb 4, 2025

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Rust 9,348 836 Updated Jan 28, 2025

The GUI for MongoDB.

TypeScript 1,222 195 Updated Feb 7, 2025

JS tokenizer for LLaMA 3 and LLaMA 3.1

JavaScript 102 6 Updated Aug 11, 2024

<textarea /> component for React which grows with content

TypeScript 2,283 249 Updated Jan 10, 2025

AWS Certified Cloud Practitioner Short Notes And Practice Exams (CLF-C02)

HTML 2,323 829 Updated Jan 9, 2025

Completely unstyled, fully accessible UI components, designed to integrate beautifully with Tailwind CSS.

TypeScript 26,719 1,100 Updated Feb 6, 2025

Radix Themes is an open-source component library optimized for fast development, easy maintenance, and accessibility. Maintained by @workos.

TypeScript 6,156 221 Updated Jan 25, 2025