Stars
Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution (ICCV2023)
[ECCV2020] A super-resolution dataset of paired LR-HR scene text images
A collection of papers and resources on scene text image super-resolution.
Awesome LLMs on Device: A Comprehensive Survey
GitHub page for "Large Language Model-Brained GUI Agents: A Survey"
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX (a short usage sketch follows this list).
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
A 2021 up-to-date roundup of recommended books for engineers: computer science, software technology, entrepreneurship, ideas and philosophy, mathematics, and biographies.
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support for Optimum's hardware optimizations and quantization schemes.
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX (usage sketch after the list).
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution
LPIPS perceptual similarity metric. pip install lpips (see the sketch after the list).
Tool for converting ONNX models to Keras or TFLite.
calflops calculates FLOPs, MACs, and parameter counts for a wide range of neural networks, including linear layers, CNNs, RNNs, GCNs, and Transformers (BERT, LLaMA, and other large language models); see the sketch after the list.
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Open-Sora: Democratizing Efficient Video Production for All
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
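
A minimal text-to-image sketch for the 🤗 Diffusers entry above, assuming a CUDA GPU and using stabilityai/stable-diffusion-2-1 purely as an example checkpoint; any diffusion pipeline on the Hub loads the same way.

```python
import torch
from diffusers import DiffusionPipeline

# Load an example Stable Diffusion checkpoint in half precision.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available

# Generate one image from a text prompt and save it.
image = pipe("a photo of a street sign with crisp, readable text").images[0]
image.save("sign.png")
```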
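
For the 🤗 Transformers entry, a minimal pipeline sketch; the sentiment-analysis task is just an illustrative choice, and the default model is downloaded on first use.

```python
from transformers import pipeline

# The pipeline API picks a default model for the task on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("This reading list is surprisingly well curated."))
```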
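
For the LPIPS entry, a minimal sketch of scoring two images; per the library's convention, inputs are NxCxHxW tensors scaled to [-1, 1], and the dummy tensors here are stand-ins for real images.

```python
import torch
import lpips

# AlexNet backbone is the library's recommended default for the metric.
loss_fn = lpips.LPIPS(net="alex")

# Two dummy RGB images, NxCxHxW, values in [-1, 1].
img0 = torch.zeros(1, 3, 64, 64)
img1 = torch.rand(1, 3, 64, 64) * 2 - 1

distance = loss_fn(img0, img1)  # lower means perceptually more similar
print(distance.item())
```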
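
For the calflops entry, a sketch under the assumption that calculate_flops is the package's main entry point, as shown in its README; the torchvision ResNet-18 is only an example stand-in for any PyTorch module.

```python
from calflops import calculate_flops
from torchvision.models import resnet18

model = resnet18()
flops, macs, params = calculate_flops(
    model=model,
    input_shape=(1, 3, 224, 224),  # one 224x224 RGB image
    output_as_string=True,         # human-readable strings instead of raw numbers
)
print(flops, macs, params)
```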