@beengud beengud commented Oct 7, 2025

Summary

Initializes Grafana Mimir distributed metrics storage in the mop-central environment, running in single-tenant mode with zone-aware replication disabled.

This PR adds comprehensive Mimir support including:

  • ✅ Mimir deployed in distributed mode with separate components
  • ✅ MinIO enabled for S3-compatible object storage
  • ✅ Single-tenant mode (multitenancy disabled)
  • ✅ Zone-aware replication disabled for ingester and store-gateway
  • ✅ Mimir datasource added to Grafana
  • ✅ ServiceMonitor integration for metrics collection

Changes

  • Mimir: Configured distributed deployment with multitenancy_enabled: false and zone_awareness_enabled: false for ingester/store-gateway
  • MinIO: Enabled for local S3-compatible storage required by distributed mode
  • Grafana: Added Mimir as Prometheus-compatible datasource
  • Infrastructure Fixes:
    • Fixed config.jsonnet to use hidden field (::) preventing invalid K8s object error
    • Fixed Loki to use SingleBinary mode with filesystem storage
    • Updated spec.json to orbstack context with monitoring namespace
    • Added explicit monitoring namespace creation
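The hidden-field fix can be illustrated with a minimal sketch. Tanka treats every visible object in the evaluated output as a Kubernetes manifest, so a plain `config:` object without `kind` and `apiVersion` is rejected; declaring it with `::` keeps it usable inside Jsonnet while leaving it out of the rendered result. The field layout and the single `monitoring` entry below are assumptions for illustration, not the repository's actual files.

```jsonnet
// Sketch only: names and values are illustrative assumptions.
local base = {
  // Hidden field (::): readable from Jsonnet, omitted from the rendered output.
  config:: {
    namespaces: ['monitoring'],
  },
};

base {
  // Visible field: this is the only object that reaches the output.
  namespace: {
    apiVersion: 'v1',
    kind: 'Namespace',
    metadata: { name: base.config.namespaces[0] },
  },
}
```

Evaluating this file yields only the `namespace` object; `config` does not appear in the output.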

Architecture

Mimir distributed components deployed:

  • Distributor: Receives metrics from Prometheus remote write
  • Ingester: Stores metrics in memory and flushes to MinIO
  • Querier: Queries metrics from ingesters and long-term storage
  • Query Frontend: Provides query API and caching
  • Query Scheduler: Manages query queuing and execution
  • Ruler: Evaluates recording and alerting rules
  • Store Gateway: Queries long-term storage blocks
  • Compactor: Compacts and deduplicates blocks
  • MinIO: S3-compatible object storage for blocks
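As a sketch of the write path described above, a Prometheus Operator `Prometheus` resource could forward samples to the distributor through `spec.remoteWrite`. The resource name, the `mimir-distributor` service name, and port 8080 are assumptions inferred from the `mimir-query-frontend` naming used elsewhere in this PR; `/api/v1/push` is Mimir's remote-write endpoint. With multitenancy disabled, no `X-Scope-OrgID` header is needed.

```jsonnet
// Sketch only: metadata.name and the distributor service address are assumptions.
{
  apiVersion: 'monitoring.coreos.com/v1',
  kind: 'Prometheus',
  metadata: { name: 'prometheus', namespace: 'monitoring' },
  spec: {
    remoteWrite: [
      {
        // Distributor receives remote-write traffic and hands it to the ingesters.
        url: 'http://mimir-distributor.monitoring.svc.cluster.local:8080/api/v1/push',
      },
    ],
  },
}
```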

Test plan

  • Configured Mimir with distributed mode
  • Disabled multitenancy (single-tenant)
  • Disabled zone-aware replication
  • Enabled MinIO for S3-compatible storage
  • Added Mimir datasource to Grafana
  • Fixed config/spec issues for mop-central
  • Deployed to orbstack cluster
  • Verified 15+ Mimir pods running
  • Verified MinIO deployment
  • Verified query-frontend service available

🤖 Generated with Claude Code

Configures Grafana Mimir distributed metrics storage in mop-central environment
with single-tenant mode and disabled zone-aware replication as requested.

Changes:
- **Mimir Configuration**: Deployed Mimir in distributed mode with MinIO for
  S3-compatible storage, multitenancy disabled, and zone-aware replication
  disabled for both ingester and store-gateway components.
- **Grafana Integration**: Added Mimir as Prometheus-compatible datasource in
  Grafana for metrics visualization and querying.
- **Infrastructure Fixes**:
  - Fixed config.jsonnet to use hidden field (::) to prevent invalid K8s object
  - Fixed Loki configuration to use SingleBinary mode with filesystem storage
  - Updated spec.json to use orbstack context and monitoring namespace
  - Added explicit monitoring namespace creation in main.jsonnet

Features:
- Distributed Mimir deployment with separate components (distributor, ingester,
  querier, query-frontend, query-scheduler, ruler, store-gateway, compactor)
- MinIO for local S3-compatible object storage
- Single-tenant mode (multitenancy_enabled: false)
- Zone-awareness disabled for ingester and store-gateway
- ServiceMonitor integration for Prometheus metrics collection
- Grafana datasource for metrics querying via PromQL

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@beengud beengud merged commit a8c3d16 into main Oct 7, 2025
1 check passed

claude bot commented Oct 7, 2025

Code Review: Initialize Mimir in mop-central with distributed mode

✅ Positive Aspects

Well-structured Mimir configuration:

  • Clear separation of concerns with distributed mode components
  • Appropriate disabling of multitenancy for single-tenant use case
  • Good integration with existing observability stack (Grafana datasource, ServiceMonitor)

Infrastructure fixes:

  • Correctly fixed config.jsonnet:5 to use hidden field (::) preventing invalid K8s object generation
  • Proper namespace creation in main.jsonnet:15-19
  • Consistent use of monitoring namespace across components

🔍 Issues & Concerns

1. Configuration Redundancy (mimir.jsonnet)

Zone-aware replication is disabled in two places, which may cause confusion:

// Lines 15-18 (structuredConfig)
ingester: {
  ring: {
    zone_awareness_enabled: false,
  },
},

// Lines 31-35 (chart level - redundant?)
ingester: {
  zoneAwareReplication: {
    enabled: false,
  },
},

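Recommendation: Verify whether both settings are necessary. They act at different layers: structuredConfig.ingester.ring.zone_awareness_enabled sets the option in Mimir's own configuration, while the chart-level ingester.zoneAwareReplication.enabled value tells the mimir-distributed chart whether to render per-zone ingester workloads. Removing either one may therefore change the rendered manifests, so compare the output of tk show before and after dropping a setting.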
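If both settings turn out to be needed, one way to keep them from drifting apart is to drive them from a single Jsonnet local. This is a sketch using only the keys quoted above; the local name `zoneAware` is hypothetical, and the store-gateway settings would follow the same pattern.

```jsonnet
// Sketch only: `zoneAware` is a hypothetical local, not part of the PR.
local zoneAware = false;

{
  mimir: {
    structuredConfig: {
      ingester: {
        // Mimir-level ring option.
        ring: { zone_awareness_enabled: zoneAware },
      },
    },
  },
  ingester: {
    // Chart-level switch, kept in step with the ring option above.
    zoneAwareReplication: { enabled: zoneAware },
  },
}
```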

2. Namespace Duplication (main.jsonnet)

Namespaces are defined in both config.jsonnet:6-8 and main.jsonnet:15-19, with config.jsonnet being imported but its namespaces ignored:

// config.jsonnet now only defines (hidden, unused):
config:: {
  namespaces: [...]
}

// main.jsonnet redefines:
config + {
  namespaces: [...]  // This shadows config.jsonnet namespaces
}

Recommendation: Either:

  • Remove namespace definitions from config.jsonnet entirely, OR
  • Use config.config.namespaces in main.jsonnet to avoid duplication
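The second option might look like the sketch below. It assumes that `config.jsonnet` exposes `namespaces` as a list of plain strings under the hidden `config` field and that it is importable by that relative path; neither detail is confirmed by the diff shown here.

```jsonnet
// main.jsonnet (sketch; the import path and list-of-strings shape are assumptions)
local config = import 'config.jsonnet';

{
  // One Namespace object per entry, so the list lives only in config.jsonnet.
  namespaces: {
    [name]: {
      apiVersion: 'v1',
      kind: 'Namespace',
      metadata: { name: name },
    }
    for name in config.config.namespaces
  },
}
```

Hidden fields remain accessible through normal field access, so `config.config.namespaces` resolves even though `config::` never appears in the rendered output.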

3. Missing Mimir Storage Configuration

While MinIO is enabled and backend is set to 's3', there's no explicit S3 bucket configuration:

minio: { enabled: true },
backend: 's3',
// Missing: bucket names, access credentials, etc.

Recommendation: Add explicit storage bucket configuration:

mimir: {
  structuredConfig: {
    blocks_storage: {
      s3: {
        bucket_name: 'mimir-blocks',
        endpoint: 'mimir-minio.monitoring.svc.cluster.local:9000',
      },
    },
  },
},

4. Grafana Datasource URL Path Concerns (grafana.jsonnet:60)

The Mimir datasource URL includes the /prometheus path:

url: 'http://mimir-query-frontend.monitoring.svc.cluster.local:8080/prometheus'

Recommendation: Verify this is the correct path for the mimir-distributed chart. Some Mimir configurations use:

  • /prometheus for Prometheus-compatible API
  • Root path with different port configurations

Check the chart documentation or deployed service to confirm the correct endpoint.
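For reference, a Grafana datasource provisioning document for this setup could be sketched as follows. Mimir serves its Prometheus-compatible query API under the `/prometheus` prefix by default, which matches the URL in the PR. The datasource name and the `isDefault` value are assumptions; how this document is wired into grafana.jsonnet is not shown in the review.

```jsonnet
// Sketch only: `name` and `isDefault` are illustrative assumptions.
{
  apiVersion: 1,
  datasources: [
    {
      name: 'Mimir',
      // Mimir is queried through Grafana's standard Prometheus datasource type.
      type: 'prometheus',
      access: 'proxy',
      url: 'http://mimir-query-frontend.monitoring.svc.cluster.local:8080/prometheus',
      isDefault: false,
    },
  ],
}
```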

5. Reliability: No Resource Limits

Distributed Mimir can be resource-intensive, but no resource limits are defined.

Recommendation: Add resource limits for production readiness:

mimir: {
  structuredConfig: { /* ... */ },
},
ingester: {
  resources: {
    limits: { cpu: '2', memory: '4Gi' },
    requests: { cpu: '500m', memory: '2Gi' },
  },
},
// Similar for other components

6. Context Change (spec.json:8)

The spec changed from minikube to orbstack:

"contextNames": ["orbstack"]

Concern: This ties the environment to a specific developer's local context.

Recommendation: Document this requirement in README or use a more generic approach that works across development environments.

📝 Best Practices

  1. Missing retention configuration: Consider adding retention policies for Mimir:

    mimir: {
      structuredConfig: {
        limits: {
          compactor_blocks_retention_period: '30d',
        },
      },
    },
  2. No validation tests: Add validation in CI/CD:

    tk eval tanka/environments/mop-central > /dev/null
    tk show --dangerous-allow-redirect tanka/environments/mop-central | kubectl apply --dry-run=server -f -
  3. Documentation: Update CLAUDE.md or environment README with:

    • Mimir deployment architecture
    • Storage requirements (MinIO)
    • How to verify deployment
    • Troubleshooting steps

🎯 Summary

Priority Issues:

  1. 🔴 HIGH: Verify/add explicit Mimir S3 bucket configuration (issue 3)
  2. 🟡 MEDIUM: Remove namespace duplication between config.jsonnet and main.jsonnet (issue 2)
  3. 🟡 MEDIUM: Verify Mimir datasource URL path is correct (issue 4)
  4. 🟢 LOW: Clean up redundant zone-awareness configuration (issue 1)

Overall Assessment: The PR makes solid progress on Mimir integration, but needs configuration validation and cleanup before merging. The infrastructure fixes (config.jsonnet hidden field, Loki SingleBinary mode) are well-executed.


Review generated with Claude Code
