Stars
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apereo CAS - Identity & Single Sign On for all earthlings and beyond.
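A visual web interface integrated with DataX: select a data source to generate a data synchronization task in one click. Supports RDBMS, Hive, HBase, ClickHouse, MongoDB, and other data sources, as well as batch creation of RDBMS synchronization tasks. Integrates an open-source scheduling system and supports distributed execution, incremental data synchronization, real-time viewing of run logs, monitoring of executor resources, killing running processes, and encryption of data source information.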
REST job server for Apache Spark
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
A simple, expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.
Chinese translation of "Designing Data-Intensive Applications" (DDIA)
crossoverJie / JGrowing
Forked from javagrowing/JGrowing
Java is growing up, but not only Java. A Java growth roadmap, though what you learn goes beyond Java.
👨🎓 Java Core Sprout : basic, concurrent, algorithm
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Livy is an open source REST interface for interacting with Apache Spark from anywhere
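A web system built for Flink. Supports defining UDFs in the web page and submitting SQL and JAR jobs; supports managing sources, sinks, and jobs; can manage Flink clusters on OpenShift.
jQuery multiselect plugin based on Twitter Bootstrap.
A distributed agile development system architecture based on Spring + SpringMVC + MyBatis. It provides a complete set of common microservice modules: centralized permission management (single sign-on), content management, payment center, user management (with third-party login), WeChat platform, storage system, configuration center, log analysis, tasks, and notifications. It supports service governance, monitoring, and tracing, and aims to give small and medium-sized enterprises a comprehensive J2EE enterprise development solution.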
A multipurpose plugin for alert, confirm & dialog, with extended features.
Design patterns implemented in Java