
[Feature]: Multi-Tenant Proxy with Dynamic User Configuration and Webhook Telemetry #1021

@homanp

Feature Description

Overview

Implement multi-tenant support for the AI Firewall proxy to enable SaaS deployments while maintaining backward compatibility for self-hosted users. Add webhook-based telemetry for external analytics integration.

Problem Statement

Currently, the proxy uses a static vibekit.yaml configuration file, making it suitable only for single-user self-hosted deployments. To support SaaS use cases, we need:

  1. Per-user dynamic configuration loading
  2. Scalable caching layer for configuration retrieval
  3. Telemetry export to external services
  4. Backward compatibility with existing self-hosted deployments

Proposed Solution

Deployment Modes

The proxy will support two deployment modes, automatically detected via environment variables:

1. Static Mode (Self-Hosted - Default)

  • Environment: No special variables required
  • Config: Uses static vibekit.yaml file
  • Usage: Single user, traditional deployment
  • URL Pattern: https://proxy.example.com/v1/chat/completions

2. Multi-Tenant Mode (SaaS)

  • Environment: Both REDIS_URL and CONFIG_API_URL are set
  • Config: Fetches user-specific configs from external API
  • Usage: Multiple users, SaaS deployment
  • URL Pattern: https://proxy.example.com/users/{user_id}/v1/chat/completions
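
A minimal sketch of how the mode detection above might look. The type and function names (`DeploymentMode`, `detect_mode`) are hypothetical and not part of the existing codebase, and treating an empty value as "not set" is an assumption.

```rust
/// Deployment mode, chosen once at startup.
#[derive(Debug, PartialEq, Eq)]
pub enum DeploymentMode {
    /// Single user, static vibekit.yaml.
    Static,
    /// Per-user configs fetched from the Config API and cached in Redis.
    MultiTenant {
        redis_url: String,
        config_api_url: String,
    },
}

/// Assumption: an empty or whitespace-only value counts as "not set".
fn non_empty(value: Option<&str>) -> Option<String> {
    value
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
}

/// Multi-tenant mode requires both variables; anything else is static mode.
pub fn detect_mode(redis_url: Option<&str>, config_api_url: Option<&str>) -> DeploymentMode {
    match (non_empty(redis_url), non_empty(config_api_url)) {
        (Some(redis_url), Some(config_api_url)) => DeploymentMode::MultiTenant {
            redis_url,
            config_api_url,
        },
        _ => DeploymentMode::Static,
    }
}

/// Convenience wrapper that reads the process environment.
pub fn detect_mode_from_env() -> DeploymentMode {
    let redis = std::env::var("REDIS_URL").ok();
    let api = std::env::var("CONFIG_API_URL").ok();
    detect_mode(redis.as_deref(), api.as_deref())
}
```

Keeping `detect_mode` free of direct environment access makes it easy to unit test; only `detect_mode_from_env` touches the process environment.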

Technical Architecture

User Identification

  • Extract user_id from URL path: /users/{user_id}/v1/chat/completions
  • Rewrite path for upstream requests: /v1/chat/completions
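
The extraction and rewrite could be sketched without the `regex` crate, using only string operations. `TenantRoute` and `split_tenant_path` are hypothetical names, and the set of characters allowed in a user ID is an assumption, since the proposal does not define one.

```rust
/// Result of splitting a multi-tenant request path.
#[derive(Debug, PartialEq, Eq)]
pub struct TenantRoute<'a> {
    pub user_id: &'a str,
    /// Path to forward upstream, always starting with '/'.
    pub upstream_path: &'a str,
}

/// Splits `/users/{user_id}/<rest>` into the user ID and `/<rest>`.
/// Returns `None` when the path does not have that shape.
pub fn split_tenant_path(path: &str) -> Option<TenantRoute<'_>> {
    let rest = path.strip_prefix("/users/")?;
    let slash = rest.find('/')?;
    let (user_id, upstream_path) = rest.split_at(slash);

    // Assumption: IDs are limited to ASCII alphanumerics, '_' and '-'.
    let id_ok = !user_id.is_empty()
        && user_id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-');

    // Require something after the slash so a bare "/users/abc/" is rejected.
    if !id_ok || upstream_path.len() < 2 {
        return None;
    }
    Some(TenantRoute {
        user_id,
        upstream_path,
    })
}
```

Both returned slices borrow from the input path, so the rewrite does not allocate.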

Configuration Flow

Client Request → Extract User ID → Redis Cache Check → Config API (if miss) → Cache & Route

Caching Strategy

  • Cache: Redis with TTL (default 1 hour, configurable via CONFIG_CACHE_TTL)
  • Cache Key: config:{user_id}
  • Storage Format: JSON string (the serialized user config)
  • Invalidation: TTL-based + manual via API
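
The configuration flow and caching strategy above amount to a cache-aside lookup. The sketch below illustrates that logic with an in-memory map standing in for Redis and a closure standing in for the Config API call. `TtlCache`, `cache_key`, and `load_config` are hypothetical names; a real implementation would use the `redis` crate and an async HTTP client.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory stand-in for Redis: JSON strings stored with an expiry time.
pub struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached JSON if present and not yet expired at `now`.
    pub fn get(&self, key: &str, now: Instant) -> Option<&str> {
        self.entries
            .get(key)
            .filter(|(_, expires)| now < *expires)
            .map(|(json, _)| json.as_str())
    }

    pub fn set(&mut self, key: String, json: String, now: Instant) {
        self.entries.insert(key, (json, now + self.ttl));
    }

    /// Manual invalidation, complementing TTL expiry.
    pub fn invalidate(&mut self, key: &str) {
        self.entries.remove(key);
    }
}

/// Cache key format from the proposal: config:{user_id}.
pub fn cache_key(user_id: &str) -> String {
    format!("config:{}", user_id)
}

/// Cache-aside lookup: return the cached config, or call `fetch`
/// (the Config API) on a miss and cache a successful result.
pub fn load_config<F>(
    cache: &mut TtlCache,
    user_id: &str,
    now: Instant,
    fetch: F,
) -> Result<String, String>
where
    F: FnOnce(&str) -> Result<String, String>,
{
    let key = cache_key(user_id);
    if let Some(json) = cache.get(&key, now) {
        return Ok(json.to_owned());
    }
    let json = fetch(user_id)?; // Failed fetches are not cached.
    cache.set(key, json.clone(), now);
    Ok(json)
}
```

Passing `now` explicitly keeps expiry deterministic in tests; production code would rely on Redis key expiry instead.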

Telemetry Integration

  • Webhook URL: Configurable via TELEMETRY_WEBHOOK_URL
  • Format: JSON payloads with structured event data
  • Events: Request processing, config changes, errors
  • Reliability: Asynchronous, non-blocking POST requests with a 1-second timeout; failures are logged and do not affect request processing
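
One way to keep telemetry off the request path is a bounded queue that drops events when full, with a background worker draining the queue and performing the POST. The sketch below shows only the queueing side and uses std channels to stay self-contained; the proxy itself would more likely use a tokio channel. `TelemetrySender` and `telemetry_channel` are hypothetical names, and dropping events on overflow is an assumption rather than something the proposal specifies.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Handle used on the request path. Cloneable so each handler can hold one.
#[derive(Clone)]
pub struct TelemetrySender {
    tx: SyncSender<String>,
}

impl TelemetrySender {
    /// Queues a JSON payload without blocking.
    /// Returns false if the event was dropped (queue full or worker gone),
    /// in which case the caller would log a warning and continue.
    pub fn emit(&self, payload_json: String) -> bool {
        match self.tx.try_send(payload_json) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
        }
    }
}

/// Creates the sender plus the receiving end for a background worker,
/// which would POST each payload to TELEMETRY_WEBHOOK_URL with a timeout.
pub fn telemetry_channel(capacity: usize) -> (TelemetrySender, Receiver<String>) {
    let (tx, rx) = sync_channel(capacity);
    (TelemetrySender { tx }, rx)
}
```

A bounded queue caps memory use if the webhook endpoint is slow or down, at the cost of losing events under sustained backpressure.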

Implementation Details

Environment Variables

Multi-Tenant Mode

REDIS_URL=redis://localhost:6379
CONFIG_API_URL=https://your-config-api.com
CONFIG_CACHE_TTL=3600                    # Optional: Cache TTL in seconds
TELEMETRY_WEBHOOK_URL=https://analytics.com/webhook  # Optional: Telemetry endpoint

Static Mode

VIBEKIT_CONFIG=/app/vibekit.yaml         # Optional: Config file path
TELEMETRY_WEBHOOK_URL=https://analytics.com/webhook  # Optional: Telemetry endpoint
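
A small sketch of parsing the optional CONFIG_CACHE_TTL value with the 1-hour default. `parse_cache_ttl` is a hypothetical helper, and falling back to the default for non-numeric or zero values is an assumption; rejecting them at startup would be an equally reasonable choice.

```rust
use std::time::Duration;

/// Default cache TTL from the proposal: 1 hour.
pub const DEFAULT_CACHE_TTL_SECS: u64 = 3600;

/// Parses CONFIG_CACHE_TTL (seconds).
/// Assumption: missing, non-numeric, or zero values fall back to the default.
pub fn parse_cache_ttl(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&s| s > 0)
        .unwrap_or(DEFAULT_CACHE_TTL_SECS);
    Duration::from_secs(secs)
}
```

At startup this could be called as `parse_cache_ttl(std::env::var("CONFIG_CACHE_TTL").ok().as_deref())`.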

Config API Specification

Endpoint: GET /users/{user_id}/config

Response Format:

{
  "models": [
    {
      "model_name": "gpt-4",
      "provider": "openai",
      "api_base": "https://api.openai.com"
    },
    {
      "model_name": "claude-3-sonnet",
      "provider": "anthropic",
      "api_base": "https://api.anthropic.com"
    }
  ],
  "default": {
    "provider": "anthropic",
    "api_base": "https://api.anthropic.com"
  }
}

Error Responses:

  • 404: User config not found
  • 500: Server error
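
Given the response format above, routing a request could be sketched as a lookup by model name with a fallback to `default`. The struct names are hypothetical, JSON deserialization is omitted (it would typically be done with serde), and using `default` for any model not listed in `models` is an assumption about the intended semantics.

```rust
/// Provider and base URL for an upstream API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Upstream {
    pub provider: String,
    pub api_base: String,
}

impl Upstream {
    pub fn new(provider: &str, api_base: &str) -> Self {
        Self {
            provider: provider.to_string(),
            api_base: api_base.to_string(),
        }
    }
}

/// One entry of the `models` array.
#[derive(Debug, Clone)]
pub struct ModelEntry {
    pub model_name: String,
    pub upstream: Upstream,
}

/// Per-user config mirroring the Config API response.
#[derive(Debug, Clone)]
pub struct UserConfig {
    pub models: Vec<ModelEntry>,
    pub default: Upstream,
}

impl UserConfig {
    /// Returns the upstream for `model`.
    /// Assumption: models not listed fall back to `default`.
    pub fn resolve(&self, model: &str) -> &Upstream {
        self.models
            .iter()
            .find(|m| m.model_name == model)
            .map(|m| &m.upstream)
            .unwrap_or(&self.default)
    }
}
```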

Telemetry Webhook Specification

Endpoint: POST {TELEMETRY_WEBHOOK_URL}

Payload Format:

{
  "event_type": "request_processed",
  "timestamp": "2025-08-28T10:00:00Z",
  "user_id": "user_123",
  "model": "gpt-4",
  "provider": "openai",
  "duration_ms": 1250,
  "status": "success",
  "input_tokens": 45,
  "output_tokens": 123,
  "request_id": "req_abc123",
  "deployment_mode": "multi_tenant"
}

Field Notes:

  • user_id: Only present in multi-tenant mode
  • deployment_mode: Either "static" or "multi_tenant"

Event Types:

  • request_processed: Successful/failed request completion
  • config_cache_hit: Config found in Redis cache
  • config_cache_miss: Config fetched from API
  • config_load_failed: Unable to load user config
  • server_start: Proxy startup
  • server_stop: Proxy shutdown
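
The event types listed above map naturally onto an enum, which keeps the `event_type` strings in one place. `EventType` and its methods are hypothetical names used for illustration.

```rust
/// Telemetry event types from the webhook specification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    RequestProcessed,
    ConfigCacheHit,
    ConfigCacheMiss,
    ConfigLoadFailed,
    ServerStart,
    ServerStop,
}

impl EventType {
    /// Every variant, in the order listed in the specification.
    pub const ALL: [EventType; 6] = [
        EventType::RequestProcessed,
        EventType::ConfigCacheHit,
        EventType::ConfigCacheMiss,
        EventType::ConfigLoadFailed,
        EventType::ServerStart,
        EventType::ServerStop,
    ];

    /// The string used in the `event_type` payload field.
    pub fn as_str(self) -> &'static str {
        match self {
            EventType::RequestProcessed => "request_processed",
            EventType::ConfigCacheHit => "config_cache_hit",
            EventType::ConfigCacheMiss => "config_cache_miss",
            EventType::ConfigLoadFailed => "config_load_failed",
            EventType::ServerStart => "server_start",
            EventType::ServerStop => "server_stop",
        }
    }

    /// Inverse of `as_str`; returns `None` for unknown strings.
    pub fn parse(s: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|e| e.as_str() == s)
    }
}
```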

Dependencies

New Rust Dependencies

redis = { version = "0.24", features = ["tokio-comp"] }
regex = "1.0"
moka = { version = "0.12", features = ["future"] }  # Optional: L1 memory cache

Deployment Examples

Railway Multi-Tenant Setup

railway add redis
railway variables set REDIS_URL=redis://...
railway variables set CONFIG_API_URL=https://config.example.com
railway variables set TELEMETRY_WEBHOOK_URL=https://analytics.example.com/webhook
railway deploy

Self-Hosted Docker

docker run -p 8080:8080 \
  -v "$(pwd)/vibekit.yaml:/app/vibekit.yaml" \
  -e TELEMETRY_WEBHOOK_URL=https://analytics.example.com/webhook \
  superagentai/proxy

Performance Requirements

  • Cache Hit Latency: <1ms (Redis lookup)
  • Cache Miss Latency: <50ms (API fetch + Redis store)
  • Config API Timeout: 5 seconds
  • Telemetry Timeout: 1 second (non-blocking)
  • Redis Connection Pool: Reuse connections

Error Handling

  • Redis Unavailable: Log warning, fetch from API each time
  • Config API Unavailable: Return 503 Service Unavailable
  • Invalid Config: Return 500 Internal Server Error
  • Telemetry Failure: Log warning, continue processing
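
The error-handling rules above can be written as a single mapping from failure kind to action, which makes the degrade-versus-fail decision explicit. `Failure`, `Action`, and `action_for` are hypothetical names.

```rust
/// Failure kinds covered by the error-handling rules.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Failure {
    RedisUnavailable,
    ConfigApiUnavailable,
    InvalidConfig,
    TelemetryFailed,
}

/// What the proxy does in response to a failure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Log a warning and keep serving the request.
    WarnAndContinue,
    /// Abort the request with this HTTP status code.
    Respond(u16),
}

pub fn action_for(failure: Failure) -> Action {
    match failure {
        // Degrade gracefully: fetch from the Config API on every request.
        Failure::RedisUnavailable => Action::WarnAndContinue,
        // No config can be obtained: 503 Service Unavailable.
        Failure::ConfigApiUnavailable => Action::Respond(503),
        // Config was fetched but is unusable: 500 Internal Server Error.
        Failure::InvalidConfig => Action::Respond(500),
        // Telemetry never affects the request outcome.
        Failure::TelemetryFailed => Action::WarnAndContinue,
    }
}
```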

Backward Compatibility

  • Existing self-hosted deployments continue working unchanged
  • No breaking changes to CLI arguments or Docker image
  • Static vibekit.yaml format remains the same
  • Default behavior remains single-user static mode

Testing Requirements

  • Unit tests for user ID extraction
  • Integration tests for Redis caching
  • Mock Config API for testing
  • Load testing with 1000+ concurrent users
  • Telemetry webhook delivery verification

Documentation Updates

  • README with deployment mode examples
  • Environment variable reference
  • Config API specification
  • Telemetry webhook documentation
  • Migration guide for SaaS deployments

Success Metrics

  • Support 10,000+ concurrent users in multi-tenant mode
  • <1ms config lookup latency for cached configs
  • <50ms end-to-end request latency overhead
  • 99.9% telemetry delivery success rate
  • Zero downtime for existing self-hosted users

Timeline

  • Week 1: Core multi-tenant architecture and Redis integration
  • Week 2: Config API integration and caching logic
  • Week 3: Telemetry webhook implementation
  • Week 4: Testing, documentation, and Railway deployment guide

Labels: enhancement, multi-tenant, saas, redis, telemetry
Priority: High
Estimated Effort: 3-4 weeks
