Stars
DuckDB is an analytical in-process SQL database management system
Apache Doris is an easy-to-use, high-performance, unified analytics database.
Apache Spark - A unified analytics engine for large-scale data processing
JDK main-line development https://openjdk.org/projects/jdk
ClickHouse® is a real-time analytics database management system
Spring Boot helps you to create Spring-powered, production-grade applications and services with absolute minimum fuss.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
An open-source AI agent that brings the power of Gemini directly into your terminal.
A composable and fully extensible C++ execution engine library for data management systems.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Production-ready platform for agentic workflow development.
The Java implementation of Apache Dubbo. An RPC and microservice framework.
Confluent Schema Registry for Kafka
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Apache Druid: a high-performance real-time analytics database.
Apache Fluss is a streaming storage built for real-time analytics.
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
Upserts, Deletes And Incremental Processing on Big Data.
An Open Standard for lineage metadata collection
The official home of the Presto distributed SQL query engine for big data