Stars
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Fully open reproduction of DeepSeek-R1
Recognize bio-medical entities from a text corpus
Precision Medicine Knowledge Graph (PrimeKG)
An Open Large Reasoning Model for Real-World Solutions
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Medical Graph RAG: Graph RAG for Medical Data
The modified Panda-gym benchmark for evaluating skill-aware RL algorithms.
[Paper List] Papers integrating knowledge graphs (KGs) and large language models (LLMs)
Empower Large Language Models (LLMs) using Knowledge Graph based Retrieval-Augmented Generation (KG-RAG) for knowledge-intensive tasks
A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
Library of my solutions for various programming contests
Solution to the PFSP problem
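A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; it runs on a single 24 GB GPU and yields a chat-llama2 capable of simple Chinese question answering.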
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
⭐️ A cute and willful unrestricted downloader for Ximalaya album audio O(∩_∩)O
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
✨✨Latest Advances on Multimodal Large Language Models
Official implementation of Skill-aware Mutual Information (SaMI) from the paper Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning.
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.