chgl

🧊

42 followers · 31 following

Stars

🪠 Data Integration

115 repositories

ConduitIO / conduit

Conduit streams data between data stores. Kafka Connect replacement. No JVM required.

Go 401 50 Updated Nov 27, 2024

jitsucom / jitsu

Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.

TypeScript 4,125 295 Updated Nov 27, 2024

dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.

Python 11,938 1,493 Updated Nov 27, 2024

orchest / orchest

Build data pipelines, the easy way 🛠️

TypeScript 4,083 259 Updated Jun 6, 2023

apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.

Java 5,463 2,430 Updated Nov 27, 2024

OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.

Python 276 77 Updated Oct 17, 2024

apache / iceberg

Apache Iceberg

Java 6,514 2,252 Updated Nov 27, 2024

projectnessie / nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

Java 1,044 130 Updated Nov 27, 2024

GoogleCloudPlatform / healthcare-data-harmonization

This is an engine that converts data of one structure to another, based on a configuration file which describes how. There is an accompanying syntax to make writing mappings easier and more robust.

Java 213 66 Updated Nov 21, 2024

open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…

TypeScript 5,631 1,056 Updated Nov 27, 2024

microsoft / FHIR-Converter

Conversion utility to translate legacy data formats into FHIR

Liquid 413 180 Updated Nov 26, 2024

nocodb / nocodb

🔥 🔥 🔥 Open Source Airtable Alternative

TypeScript 49,935 3,427 Updated Nov 27, 2024

adobe / S3Mock

A simple mock implementation of the AWS S3 API startable as Docker image, TestContainer, JUnit 4 rule, JUnit Jupiter extension or TestNG listener

Kotlin 844 181 Updated Nov 21, 2024

capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets

Python 1,434 163 Updated Nov 13, 2024

apache / camel-karavan

Apache Camel Karavan: a low-code data integration platform

TypeScript 454 157 Updated Nov 27, 2024

dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Python 10,013 1,634 Updated Nov 27, 2024

kwai / blaze

Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.

Rust 1,302 121 Updated Nov 26, 2024

neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.

Rust 15,276 445 Updated Nov 27, 2024

pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 30,562 1,979 Updated Nov 27, 2024

redpanda-data / connect

Fancy stream processing made operationally mundane

Go 8,147 840 Updated Nov 26, 2024

raystack / dagger

Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.

Java 267 41 Updated Aug 29, 2023

crate / crate

CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.

Java 4,127 567 Updated Nov 27, 2024

redpanda-data / console

Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, manag…

TypeScript 3,836 352 Updated Nov 27, 2024

OpenLineage / OpenLineage

An Open Standard for lineage metadata collection

Java 1,777 309 Updated Nov 27, 2024

treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data

Go 4,461 359 Updated Nov 26, 2024

masesgroup / KNet

KNet is a comprehensive .NET suite for Apache Kafka™ providing all features: Producer, Consumer, Admin, Streams, Connect, backends (ZooKeeper and Kafka)

C# 40 6 Updated Nov 25, 2024

provectus / kafka-ui

Open-Source Web UI for Apache Kafka Management

Java 9,878 1,195 Updated Jul 26, 2024

infinyon / fluvio

Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.

Rust 3,901 489 Updated Nov 26, 2024

cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses

JavaScript 1,182 372 Updated Nov 27, 2024

apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

Java 12,791 3,294 Updated Nov 27, 2024
