data_processing
OpenRefine is a free, open source power tool for working with messy data and improving it
A high performance Python graph library implemented in Rust.
Interact with your SQL database: natural language to SQL using LLMs
Pure Python 3 MTProto API Telegram client library, for bots too!
Automatic Generation of Visualizations and Infographics using Large Language Models
No-code in the front, Python in the back. An open-source framework for creating data apps.
Dara is a dynamic application framework designed for creating interactive web apps with ease, all in pure Python.
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
💫 Industrial-strength Natural Language Processing (NLP) in Python
esProc SPL is a scripting language for data processing, with well-designed rich library functions and powerful syntax, which can be executed in a Java program through JDBC interface and computing i…
ClickHouse® is a real-time analytics DBMS
Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
Vizro is a toolkit for creating modular data visualization applications.
Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.
lakeFS - Data version control for your data lake | Git for data
Scratch is a Swiss Army knife for big data.
chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
Distributed Task Queue (development branch)
A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
Open Source Metering and Usage Based Billing API ⭐️ Consumption tracking, Subscription management, Pricing iterations, Payment orchestration & Revenue analytics
Proxy server to bypass Cloudflare protection
A tool that formats the contents of MySQL's Performance Schema and PostgreSQL's pg_stat_statements and outputs them in TSV format
Extremely fast LINQ aggregation operations implementation optimized by Burst Compiler
Rust implementation of SIF and uSIF: Simple and fast sentence embedding
Skytable is a modern scalable NoSQL database with BlueQL, designed for performance, scalability and flexibility. Skytable gives you spaces, models, data types, complex collections and more to build…
Panel: The powerful data exploration & web app framework for Python
Apache ECharts is a powerful, interactive charting and data visualization library for the browser