Tensoic AI
- Mumbai, India
- https://adrx.me
- in/adarshxs
- @adarshxs
Stars
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
Convert PDF to HTML without losing text or format.
A port of muerrilla's sd-webui-Detail-Daemon as a node for ComfyUI, to adjust sigmas that control detail.
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
An awesome, curated list of the best LLMOps tools for developers
An open-source RAG-based tool for chatting with your documents.
noise_step: Training in 1.58b With No Gradient Memory
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
SGLang is a fast serving framework for large language models and vision language models.
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Open and efficient video watermarking
A generative world for general-purpose robotics & embodied AI learning.
A Blender add-on for generating meshes with AI
An interactive multilingual learning platform powered by Sarvam AI, AI4Bharat, and OpenAI.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
[WACV 2024] Training-Free Layout Control with Cross-Attention Guidance
A holistic way of understanding how Llama and its components run in practice, with code and detailed documentation.
Efficient Triton Kernels for LLM Training
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadow, dewarping, deblur, and binarization.
Following master Karpathy with a GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
A generative speech model for daily dialogue.