Open Source Linux Data Management Systems

Data Management Systems for Linux

Browse free open source Data Management systems and projects for Linux below. Use the toggles on the left to filter open source Data Management systems by OS, license, language, programming language, and project status.

  • 1
    TurboVNC

    High-speed, 3D-friendly, TightVNC-compatible remote desktop software

    TurboVNC is a high-performance, enterprise-quality version of VNC based on TightVNC, TigerVNC, and X.org. It contains a variant of Tight encoding that is tuned for maximum performance and compression with 3D applications (VirtualGL), video, and other image-intensive workloads. TurboVNC, in combination with VirtualGL, provides a complete solution for remotely displaying 3D applications with interactive performance. TurboVNC's high-speed encoding methods have been adopted by TigerVNC and libvncserver, and TurboVNC is also compatible with any other TightVNC derivative. TurboVNC forked from TightVNC in 2004 and still covers all of the TightVNC 1.3.x features, but TurboVNC contains numerous feature enhancements and bug fixes relative to TightVNC, and it compresses 3D and video workloads much better than TightVNC while using generally only 5-20% of the CPU time of the latter. Using non-default settings, TurboVNC can also be made to compress 2D workloads as "tightly" as TightVNC.
    Downloads: 128,844 This Week
  • 2
    Azure Data Studio

    A data management tool that enables working with other SQL tools

    Azure Data Studio is a cross-platform database tool for data professionals who use on-premises and cloud data platforms on Windows, macOS, and Linux. Azure Data Studio offers a modern editor experience with IntelliSense, code snippets, source control integration, and an integrated terminal. It's engineered with the data platform user in mind, with the built-in charting of query result sets and customizable dashboards. Use Azure Data Studio to query, design, and manage your databases and data warehouses wherever they are, on your local computer or in the cloud. Azure Data Studio offers a modern, keyboard-focused SQL coding experience that makes your everyday tasks easier with built-in features, such as multiple tab windows, a rich SQL editor, IntelliSense, keyword completion, code snippets, code navigation, and source control integration (Git).
    Downloads: 463 This Week
  • 3
    Gephi

    Gephi the open graph Viz platform

    Gephi is the leading visualization and exploration software for all kinds of graphs and networks. Gephi is open-source and free. Gephi is an award-winning open-source platform for visualizing and manipulating large graphs. It runs on Windows, Mac OS X and Linux. Localization is available in English, French, Spanish, Japanese, Russian, Brazilian Portuguese, Chinese, Czech and German. Fast: Powered by a built-in OpenGL engine, Gephi is able to push the envelope with very large networks. Visualize networks of up to a million elements. All actions (e.g. layout, filter, drag) run in real-time. Simple: Easy to install and get started. A UI that is centered around the visualization. Like Photoshop™ for graphs. Modular: Extend Gephi with plug-ins. The architecture is built on top of Apache NetBeans Platform and can be extended or reused easily through well-written APIs.
    Downloads: 35 This Week
  • 4
    Serial Studio

    Multi-purpose serial data visualization & processing

    Serial Studio is a simple, multi-platform, and multi-purpose serial data visualization program that allows embedded developers to visualize, analyze, and present data generated from their projects and devices while avoiding the need to write project-specific visualization software. Over my many CanSat-based competitions, I found myself writing and maintaining separate Ground Station software for each program. I decided that it would be easier and more sustainable to define one flexible Ground Station Software that lets developers define how each CanSat presents data using an extensible communication protocol for easy data visualization. Developers can also use Serial Studio for almost any data acquisition and visualization project outside of CanSat; it now supports data retrieval from hardware serial ports, software serial ports, MQTT, and network sockets (TCP/UDP). You can download and install Serial Studio for your preferred platform.
    Downloads: 26 This Week
  • 5
    lakeFS

    lakeFS - Git-like capabilities for your object storage

    Increase data quality and reduce the painful cost of errors. Data engineering best practices using git-like operations on data. lakeFS is an open-source data version control system for data lakes. It enables zero-copy Dev / Test isolated environments, continuous quality validation, atomic rollback on bad data, reproducibility, and more. Data is dynamic; it changes over time. Dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake is version controlled and you can easily time-travel between consistent snapshots of the lake. Easier ETL testing: test your ETLs on top of production data, in isolation, without copying anything. Safely experiment and test on full production data. Easily collaborate on production data with your team. Automate data quality checks within data pipelines.
    Downloads: 21 This Week
  • 6
    TTA Lossless Audio Codec
    Lossless compressor for multichannel 8-, 16-, and 24-bit audio data, with optional password protection of the data. Being 'lossless' means that no data or quality is lost in compression: when decompressed, the data is identical to the original.
    Downloads: 95 This Week
  • 7
    Conduit

    Conduit streams data between data stores. Kafka Connect replacement

    Conduit is a data streaming tool written in Go. It aims to provide the best user experience for building and running real-time data pipelines. Conduit comes with batteries included: it provides a UI, common connectors, processors and observability data out of the box. Sync data between your production systems using an extensible, event-first experience with minimal dependencies that fits within your existing workflow. Eliminate the multi-step process you go through today. Just download the binary and start building. Conduit connectors give you the ability to pull and push data to any production datastore you need. If a datastore is missing, the simple SDK allows you to extend Conduit where you need it. Conduit pipelines listen for changes to a database, data warehouse, etc., and allow your data applications to act upon those changes in real-time. Run it in a way that works for you; use it as a standalone service or orchestrate it within your infrastructure.
    Downloads: 16 This Week
  • 8
    EtherApe
    EtherApe is a graphical network monitor modeled after etherman. Featuring Ethernet, IP, TCP, FDDI, Token Ring and wireless modes, it displays network activity graphically. Hosts and links change in size with traffic. Color coded protocols display.
    Downloads: 89 This Week
  • 9
    Logstash

    Centralize, transform and stash your data

    Logstash is a server-side data processing pipeline that dynamically ingests data from numerous sources, transforms it, and ships it to your favorite “stash” regardless of format or complexity. It supports and ingests data of all shapes, sizes and sources, dynamically transforms and prepares this data, and transports it to the output of your choice. Logstash is extensible, with over 200 plugins available to let you create and configure your pipeline how you choose.
    Downloads: 10 This Week
  • 10
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust multi-tenant, multi-tool engine that scales technically and organizationally. Dagster as a unified control plane: The ‘single plane of glass’ data teams love to use. Rein in the chaos and maintain control over your data as the complexity scales. Centralize your metadata in one tool with built-in observability, diagnostics, cataloging, and lineage. Spot any issues and identify performance improvement opportunities.
    Downloads: 7 This Week
  • 11
    MyCAT

    Active, high-performance open source database middleware

    MyCAT is open-source software: a large database cluster oriented to enterprises. MyCAT is an enhanced database that can replace MySQL and supports transactions and ACID. Regarded as an enterprise-grade MySQL cluster, MyCAT can take the place of an expensive Oracle cluster. MyCAT is also a new type of database, resembling a SQL server that integrates memory cache technology, NoSQL technology, and HDFS big data. As a modern enterprise database product, MyCAT combines the traditional database with the new distributed data warehouse. In a word, MyCAT is a novel piece of database middleware. MyCAT's objective is to smoothly migrate existing stand-alone databases and applications to the cloud at low cost and to solve the bottlenecks caused by the rapid growth of data storage and business scale.
    Downloads: 7 This Week
  • 12
    OSHMI - Open Substation HMI

    SCADA HMI for substations, IoT and automation applications

    Now with IEC61850 support! This project combines existing open source projects and tools to create a very capable, mobile and cloud-friendly HMI system that can rival proprietary software. This approach makes it possible to combine the strengths of each project (Chromium, SVG/HTML5, PHP, Lua, SQLite, Inkscape, Lib61850, OpenDNP3, Nginx, Vega, PostgreSQL, Grafana,…) to achieve a great set of open, evergreen, modular and customizable tools for building great HMIs for automation projects. This is not a toy project! It has actually been used in dozens of substations up to the 230kV level and also in control centers with configurations of up to 70,000 tags. Feel free to ask questions in the "Discussion" section. Help sponsor OSHMI here: https://github.com/sponsors/riclolsen. Also have a look at my new SCADA project here: https://github.com/riclolsen/json-scada
    Downloads: 49 This Week
  • 13
    Apache HBase

    Get random, realtime read/write access to your Big Data

    Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables, billions of rows X millions of columns, atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. Features include a Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options; support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX; and convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
    Downloads: 5 This Week
  • 14
    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative) are an amazing technology that will power many future ML use cases. A large set of these technologies is being deployed into businesses (the real world) in what we consider a production setting.
    Downloads: 5 This Week
  • 15
    ODD Platform

    First open-source data discovery and observability platform

    Unlock the power of big data with OpenDataDiscovery Platform. Experience seamless end-to-end insights, powered by unprecedented observability and trust, from ingestion to production, while building your ideal tech stack! Democratize data and accelerate insights. Find data that fits your use case and discover hints left by your peers to leverage existing knowledge. Explore tags, ownership details, links to other sources and other information to shorten and simplify the data discovery phase. Forget unnerved stakeholders and wasting too much time digging for the root cause of data issues when something fails. With ODD’s automatic company-wide ingestion-to-product lineage you’ll have answers in just seconds and stakeholders won’t need to wait. Sleep well, knowing all your data is in check. Forget manual testing, days of debugging, and weeks of worrying. Know the impact of each code change with automatic testing. Enjoy lineage and alerts powered by data quality information.
    Downloads: 5 This Week
  • 16
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability. Messaging patterns including publish/subscribe, request/reply and streaming. Financial-grade transactional messages. Built-in fault tolerance and high availability configuration options based on DLedger. A variety of cross-language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, with support for OpenTracing. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strictly ordered messaging in the same queue. Efficient pull and push consumption model. Million-level message accumulation capacity in a single queue. Multiple messaging protocols like JMS and OpenMessaging. Flexible distributed scale-out deployment architecture. Lightning-fast batch message exchange system.
    Downloads: 3 This Week
  • 17
    LaraAdmin

    Open Source Laravel Admin Panel

    LaraAdmin is a multi-purpose, open source Laravel Admin Panel / CMS that can be used to build an Admin Backend, Data Management Tool or CRM boilerplate for Laravel. It offers a complete set of utilities and features, including Advanced CRUD Generation, Module Manager, Schema Manager, Backups and Workflows. LaraAdmin controls your Models, Data and their Role Permissions without touching any code at all, saving you time and effort and allowing you to focus on Data representation rather than Data Handling. It has a modular architecture that makes Data management a breeze; it generates CRUD methods and views that are very easy to customize; and it is very easy to install: all it takes is one command.
    Downloads: 3 This Week
  • 18
    HPCC Systems

    End-to-end big data in a massively scalable supercomputing platform.

    HPCC Systems® (www.hpccsystems.com) from LexisNexis® Risk Solutions is a proven, open source solution for Big Data insights that can be implemented by businesses of all sizes. With HPCC Systems, developers can design applications with Big Data at their core, enabling businesses to better analyze and understand data at scale, improving business time to results and decisions. HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete end-to-end architecture for efficient processing. Read our blog (http://hpccsystems.com/blog), or connect with us on Twitter (@hpccsystems), Facebook (https://www.facebook.com/hpccsystems) and LinkedIn (http://www.linkedin.com/company/hpcc-systems). HPCC Systems is available on AWS and can be configured through the Instant Cloud Solution.
    Downloads: 20 This Week
  • 19
    Broot

    A new way to see and navigate directory trees

    Get an overview of a directory, even a big one. That's what makes it usable where the old tree command would produce pages of output. Hit alt/enter and you're back to the terminal in the desired location. This way, you can navigate to a directory with the minimum amount of keystrokes, even if you don't exactly remember where it is. Broot is fast and doesn't block (any keystroke interrupts the current search to start the next one). Never lose track of file hierarchy while you search. Broot tries to select the most relevant file. You can still go from one match to another one using tab or arrow keys. You may also search with a regular expression. To do this, add a / before the pattern. You may also apply logical operators or combine patterns, for example searching test in all files except json ones could be !/json$/&c/test and searching carg both in file names and file contents would be carg|c/carg.
    Downloads: 2 This Week
  • 20
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 2 This Week
  • 21
    Riemann

    A network event stream processing system, in Clojure

    Riemann aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception in your app. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite. Track user activity from second to second. Riemann streams are just functions which accept an event. Events are just structs with some common fields like :host and :service. You can use dozens of built-in streams for filtering, altering, and combining events, or write your own. Since Riemann's configuration is a Clojure program, its syntax is concise, regular, and extendable. Configuration-as-code minimizes boilerplate and gives you the flexibility to adapt to complex situations.
    Downloads: 2 This Week
  • 22
    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques: classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the SDV project, or input your own data. Choose from any of the SDV synthesizers and baselines, or write your own custom machine learning model. In addition to performance and memory usage, you can also measure synthetic data quality and privacy through a variety of metrics. Install SDGym using pip or conda. We recommend using a virtual environment to avoid conflicts with other software on your device.
    Downloads: 2 This Week
  • 23
    fluentbit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX

    Fluent Bit is a super-fast, lightweight, and highly scalable logging and metrics processor and forwarder. It is the preferred choice for cloud and containerized environments. A robust, lightweight, and portable architecture for high throughput with low CPU and memory usage from any data source to any destination. Proven across distributed cloud and container environments. Highly available with I/O handlers to store data for disaster recovery. Granular management of data parsing and routing. Filtering and enrichment to optimize security and minimize cost. The lightweight, asynchronous design optimizes resource usage: CPU, memory, disk I/O, network. No more OOM errors! Integration with all your technology, cloud-native services, containers, streaming processors, and data backends. Fully event-driven design leverages the operating system API for performance and reliability. All operations to collect and deliver data are asynchronous.
    Downloads: 2 This Week
  • 24
    memphis

    Next-Generation Event Processing Platform

    Memphis enables building modern queue-based applications that require large volumes of streamed and enriched data, modern protocols, zero ops, up to x9 faster development, up to x46 lower costs, and significantly lower dev time for data-oriented developers and data engineers. Queues and brokers are a mission-critical component in the modern application architecture and should be as highly available and stable as possible. They should provide great performance while maintaining efficient resource consumption. Memphis aims to increase observability through integrations with 3rd-party monitoring tools, real-time notifications, and stream lineage, and therefore to reduce troubleshooting time. It also aims to enable rapid development and ultra-short time-to-production.
    Downloads: 2 This Week
  • 25
    jpcap is a set of Java classes which provide an interface and system for network packet capture. A protocol library and tool for visualizing network traffic is included. jpcap utilizes libpcap, a widely deployed system library for packet capture.
    Downloads: 11 This Week