Factor House Local v2.1: Introducing Full-Stack Observability & Data Lineage
We are excited to announce the release of Factor House Local v2.1, a significant update that brings comprehensive observability and data lineage capabilities to our suite of pre-configured Docker Compose environments. This release introduces a brand new, out-of-the-box Observability Stack featuring Marquez (the reference implementation of OpenLineage) alongside the industry-standard Prometheus, Grafana, and Alertmanager.
With this addition, the local environments do more than run data platforms: they also show you how healthy each system is and where your data comes from.
Release Highlights
🚀 Introducing the Centralized Observability & Data Lineage Stack
The cornerstone of this release is a new, pre-configured stack that gives you a unified view of your entire data ecosystem. It brings together two aspects of data platform management that are usually handled separately: data lineage and system monitoring.
- Benefit: Holistic Visibility. The stack gives you one place to see both the "what" of your data pipelines (which jobs read and write which datasets) and the "how" (how those jobs and the systems beneath them perform). Combining data lineage with performance metrics gives you a complete picture that speeds up development and simplifies debugging.
- Automated Data Lineage: Powered by Marquez and OpenLineage, the stack automatically captures metadata from Flink and Spark jobs. This lets you answer critical questions like "What downstream services will this change affect?" or "Where did this bad data originate?" by tracing a dataset's path visually from source to destination.
- Centralized System Monitoring: The industry-standard Prometheus, Grafana, and Alertmanager suite provides robust, real-time monitoring. You can track resource utilization, monitor application performance, visualize trends on pre-built dashboards, and set up proactive alerts to identify issues before they become critical.
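To make the lineage idea above concrete, here is a minimal sketch of the kind of run event OpenLineage describes, and of how a set of such events can answer the "what downstream jobs will this change affect?" question. It is an illustration, not the stack's actual integration: the Flink and Spark integrations emit these events for you. The namespace, job names, dataset names, and producer URI are hypothetical, the schema version in `schemaURL` is illustrative, and the Marquez URL assumes Marquez's default API port, which may differ from the port this stack exposes.

```python
import uuid
from datetime import datetime, timezone

# Assumption: Marquez's default API port; the port mapped by this stack may differ.
MARQUEZ_LINEAGE_URL = "http://localhost:5000/api/v1/lineage"


def build_run_event(job_name, inputs, outputs, namespace="demo", event_type="COMPLETE"):
    """Build a minimal OpenLineage-style run event as a plain dict."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": name} for name in inputs],
        "outputs": [{"namespace": namespace, "name": name} for name in outputs],
        # Hypothetical producer URI and an illustrative schema version.
        "producer": "https://example.com/hypothetical-producer",
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }


def downstream_jobs(events, dataset):
    """Return the jobs that consume `dataset` directly or through intermediate datasets."""
    affected = []
    seen = {dataset}
    frontier = [dataset]
    while frontier:
        current = frontier.pop()
        for event in events:
            if any(item["name"] == current for item in event["inputs"]):
                if event["job"]["name"] not in affected:
                    affected.append(event["job"]["name"])
                # Follow this job's outputs to find jobs further downstream.
                for out in event["outputs"]:
                    if out["name"] not in seen:
                        seen.add(out["name"])
                        frontier.append(out["name"])
    return affected


events = [
    build_run_event("clean_orders", ["raw_orders"], ["orders"]),
    build_run_event("daily_report", ["orders"], ["report"]),
]
print(downstream_jobs(events, "raw_orders"))
```

Marquez stores events like these and renders the resulting graph in its UI, so in practice you would browse the lineage there instead of walking the events yourself. If you want to experiment by hand, each event can be sent as JSON in a POST request to the lineage endpoint above.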
Core Environments
This release includes the following updated and refined local development stacks:
- Kafka Development & Monitoring with Kpow: A robust, 3-node Apache Kafka environment including Schema Registry, Kafka Connect, and the Kpow UI/API for enterprise-grade observability and management.
- Unified Analytics Platform with Flex, Flink, Spark, Iceberg & HMS: A comprehensive lakehouse environment featuring Flink, Spark, Iceberg, Hive Metastore, PostgreSQL (CDC-ready), and MinIO (S3), managed with the Flex UI.
- Apache Pinot Real-Time OLAP Cluster: A real-time distributed OLAP datastore designed for ultra-low-latency, user-facing analytics and dashboards.
- NEW! Centralized Observability & Data Lineage: A complete monitoring and data provenance solution featuring Marquez, Prometheus, Grafana, and Alertmanager to provide a unified view of your entire data platform's health and data flows.
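Once the observability stack is running, one quick way to see which monitored services are unhealthy is to ask Prometheus for its built-in `up` metric through its HTTP query API. The sketch below builds the query URL and extracts the targets that report as down from a response of the shape that API returns. The base URL assumes Prometheus's default port, and the instance names in the sample response are hypothetical; check the stack's Compose file for the actual port mapping and scrape targets.

```python
from urllib.parse import urlencode

# Assumption: Prometheus's default port; the port mapped by this stack may differ.
PROMETHEUS_URL = "http://localhost:9090"


def instant_query_url(expr, base_url=PROMETHEUS_URL):
    """Build the URL for a Prometheus instant query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def down_targets(response):
    """Return the instances whose `up` value is 0 in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return sorted(
        sample["metric"].get("instance", "unknown")
        for sample in response["data"]["result"]
        # Each sample value is a [timestamp, "value-as-string"] pair.
        if float(sample["value"][1]) == 0.0
    )


# Hypothetical response, shaped like the result of querying `up`.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "kafka-1:9404"}, "value": [1700000000, "0"]},
            {"metric": {"__name__": "up", "instance": "flink-jm:9249"}, "value": [1700000000, "1"]},
        ],
    },
}

print(instant_query_url("up"))
print(down_targets(sample_response))
```

To run this against a live stack, fetch the URL with `urllib.request.urlopen` and pass the decoded JSON to `down_targets`. The same condition can also drive an Alertmanager notification, so a script like this is mainly useful for ad hoc checks.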