Starred repositories
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various datasource. It is used to solve various data quality problems ca…
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable.
AutoMQ is a diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.
Kubernetes-native platform to run massively parallel data/streaming jobs
High performance Rust stream processing engine seamlessly integrates AI capabilities, providing powerful real-time data processing and intelligent analysis.
Harness Open Source is an end-to-end developer platform with Source Control Management, CI/CD Pipelines, Hosted Developer Environments, and Artifact Registries.
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on und…
Java demos for the General SQL Parser library
Document, sample code and other materials for SQLFlow
Apache Doris is an easy-to-use, high performance and unified analytics database.
Real-time observability system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building capabilities.
🐳 A most popular sql audit platform for mysql
阿里云计算平台DataWorks(https://help.aliyun.com/document_detail/137663.html) 团队出品,为监控而生的数据库连接池
😎 Awesome lists about all kinds of interesting topics
Interact with your SQL database, Natural Language to SQL using LLMs
Chat-based SQL Client and Editor for the next decade
SuperSonic is the next-generation AI+BI platform that unifies Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
中文 NLP 预处理、解析工具包,准确、高效、易用 A Chinese NLP Preprocessing & Parsing Package www.jionlp.com
A powerful tool for creating fine-tuning datasets for LLM
🔥🔥🔥AI-driven database tool and SQL client, The hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, DB2, SQLite, H2, ClickHouse, and more.
The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
Production-ready platform for agentic workflow development.