Model Context Protocol (MCP) server for Ray distributed computing. Enables LLM agents to manage Ray clusters, submit jobs, and monitor workloads through natural language prompts.
- Single Tool: ray, with automatic operation detection
- Natural Language Interface: Single prompt parameter per tool
- Kubernetes-Only: Focused on KubeRay, GKE, and AWS EKS clusters
- Intelligent Routing: Direct cloud provider and operation detection
uv add ray-mcp
Add to your MCP client configuration:
{
"mcpServers": {
"ray-mcp": {
"command": "uv",
"args": ["run", "ray-mcp"],
"cwd": "/path/to/ray-mcp"
}
}
}
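If the client configuration is managed by a script, the entry above can be merged into an existing configuration without disturbing other servers. This is a minimal sketch: the helper name `add_ray_mcp_server` is hypothetical, and where your MCP client stores its configuration file is not covered here.

```python
import json


def add_ray_mcp_server(config: dict, cwd: str) -> dict:
    """Return a copy of `config` with the ray-mcp entry added (hypothetical helper)."""
    merged = dict(config)
    # Copy the server map so the caller's dictionary is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["ray-mcp"] = {
        "command": "uv",
        "args": ["run", "ray-mcp"],
        "cwd": cwd,
    }
    merged["mcpServers"] = servers
    return merged


# Print the resulting JSON for an otherwise empty configuration.
print(json.dumps(add_ray_mcp_server({}, "/path/to/ray-mcp"), indent=2))
```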
# Job operations
ray: "submit job with script train.py"
ray: "list all running jobs"
ray: "get logs for job raysubmit_123"
# Service operations
ray: "deploy service with inference model serve.py"
ray: "list all services"
ray: "scale service model-api to 3 replicas"
# Cloud providers
ray: "authenticate with GCP project ml-experiments"
ray: "list all GKE clusters"
ray: "authenticate with AWS region us-west-2"
ray: "list all EKS clusters"
ray: "connect to cluster production-cluster"
Unified Ray management with automatic operation detection.
Job Operations:
"submit job with script train.py"
"submit job with script train.py and 2 CPUs"
"list all running jobs"
"get logs for job raysubmit_123"
"cancel job raysubmit_456"
Service Operations:
"deploy service with inference model serve.py"
"create service named image-classifier with model classifier.py"
"list all services"
"scale service model-api to 5 replicas"
"get status of service inference-engine"
"delete service recommendation-api"
Cloud Operations:
"authenticate with GCP project ml-experiments"
"list all GKE clusters"
"authenticate with AWS region us-west-2"
"list all EKS clusters"
"connect to GKE cluster production-cluster"
"connect to EKS cluster training-cluster in region us-west-2"
"check environment setup"
"create GKE cluster ml-cluster with 3 nodes"
Key Components:
- LLM Parser: Uses OpenAI to convert natural language prompts to structured actions
- Kubernetes Managers: Direct operation routing to cloud providers
- MCP Tool: Single ray tool with automatic operation routing
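To illustrate what "structured action" could mean for the components above, here is a rough sketch. It uses a small regular-expression matcher as a stand-in for the OpenAI call, so it shows only the shape of the output, not how ray-mcp actually parses prompts. The `Action` class, the `RULES` table, and `parse_prompt` are all hypothetical names and are not part of ray-mcp's API.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical structured action; not ray-mcp's actual schema."""
    kind: str        # "job", "service", or "cloud"
    operation: str   # e.g. "submit", "logs", "scale"
    params: dict = field(default_factory=dict)


# Ordered (pattern, kind, operation) rules; the first match wins.
RULES = [
    (r"^submit job with script (?P<script>\S+)", "job", "submit"),
    (r"^get logs for job (?P<job_id>\S+)", "job", "logs"),
    (r"^cancel job (?P<job_id>\S+)", "job", "cancel"),
    (r"^scale service (?P<name>\S+) to (?P<replicas>\d+) replicas", "service", "scale"),
    (r"^list all (?P<provider>GKE|EKS) clusters", "cloud", "list_clusters"),
]


def parse_prompt(prompt: str) -> Action:
    """Map a prompt to an Action using keyword rules (stand-in for the LLM)."""
    text = prompt.strip()
    for pattern, kind, operation in RULES:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            # Named groups become the action parameters (all values are strings).
            return Action(kind, operation, match.groupdict())
    raise ValueError(f"unrecognized prompt: {prompt!r}")


print(parse_prompt("scale service model-api to 3 replicas"))
```

A rule table like this only covers phrasings it was written for, which is presumably why the server delegates parsing to an LLM instead.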
# Install with GKE support
uv add "ray-mcp[gke]"
# Set up authentication
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
# Install with EKS support
uv add "ray-mcp[eks]"
# Set up authentication
export AWS_ACCESS_KEY_ID="your_access_key"
export AWS_SECRET_ACCESS_KEY="your_secret_key"
export AWS_DEFAULT_REGION="us-west-2"
# Install KubeRay operator
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.8/deploy/kuberay-operator.yaml
# LLM Processing Configuration (Required)
export OPENAI_API_KEY="your_api_key_here" # Required for natural language parsing
# LLM Processing Configuration (Optional)
export LLM_MODEL="gpt-3.5-turbo" # OpenAI model for prompt processing
# Output and Logging
export RAY_MCP_ENHANCED_OUTPUT=true # Enhanced LLM-friendly responses
export RAY_MCP_LOG_LEVEL=INFO # Logging level (DEBUG, INFO, WARNING, ERROR)
# Ray Configuration
export RAY_DISABLE_USAGE_STATS=1 # Disable Ray usage statistics
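As a sketch of how a process might read the variables above, the following loads them into a small settings object and fails early if the required key is missing. The `Settings` class and `load_settings` function are hypothetical, and the fallback values (`gpt-3.5-turbo`, `INFO`, enhanced output off) are assumptions taken from the examples above, not confirmed defaults of ray-mcp.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the environment variables listed above."""
    openai_api_key: str
    llm_model: str
    enhanced_output: bool
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (defaults to os.environ) and validate them."""
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required for natural language parsing")

    log_level = env.get("RAY_MCP_LOG_LEVEL", "INFO").upper()  # assumed default
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"unsupported RAY_MCP_LOG_LEVEL: {log_level}")

    # Treat common truthy spellings as enabled; anything else is disabled.
    enhanced = env.get("RAY_MCP_ENHANCED_OUTPUT", "false").lower() in {"1", "true", "yes"}

    return Settings(
        openai_api_key=api_key,
        llm_model=env.get("LLM_MODEL", "gpt-3.5-turbo"),  # assumed default
        enhanced_output=enhanced,
        log_level=log_level,
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests.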
# Install development dependencies
make dev-install
# Run tests
make test-fast # Unit tests with mocking
make test # Complete test suite
# Code quality
make lint # Run linting
make format # Format code
- Python: 3.11+
- Ray: 2.47.0+
- Kubernetes: 1.20+ (for KubeRay features)
Optional:
- Google Cloud SDK: For GKE integration
- AWS SDK: For EKS integration
- kubectl: For Kubernetes management
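The requirements above can be checked from Python before starting the server. This is an illustrative sketch only: `parse_version` and `check_requirements` are hypothetical helpers, the version comparison is deliberately simple (it looks at the first three numeric components), and the Kubernetes server version is not checked.

```python
import re
import shutil
import sys
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn a version string such as "2.47.0" into a tuple of up to three ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def check_requirements() -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    if sys.version_info[:2] < (3, 11):
        problems.append("Python 3.11+ is required")

    try:
        if parse_version(metadata.version("ray")) < (2, 47, 0):
            problems.append("Ray 2.47.0+ is required")
    except metadata.PackageNotFoundError:
        problems.append("Ray is not installed")

    # kubectl is listed as optional, so report it as a note rather than a failure.
    if shutil.which("kubectl") is None:
        problems.append("kubectl not found on PATH (optional)")

    return problems


for problem in check_requirements():
    print(problem)
```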
Common usage patterns for Ray MCP Server (Kubernetes-only).
# Submit a job
ray: "submit job with script train.py"
# Submit job with resources
ray: "submit job with script train.py requiring 2 CPUs and 1 GPU"
# Submit job with runtime environment
ray: "submit job with script train.py and pip packages pandas numpy"
# List jobs
ray: "list all running jobs"
ray: "list jobs in namespace production"
# Get job status and logs
ray: "get status for job raysubmit_123"
ray: "get logs for job raysubmit_123"
# Cancel job
ray: "cancel job raysubmit_123"
# Deploy a service
ray: "deploy service with inference model serve.py"
# Deploy named service
ray: "create service named image-classifier with model classifier.py"
# List services
ray: "list all services"
ray: "list services in namespace production"
# Manage services
ray: "get status of service image-classifier"
ray: "scale service model-api to 5 replicas"
ray: "get logs for service text-analyzer"
ray: "delete service old-model-service"
# Google Cloud (GKE)
ray: "authenticate with GCP project ml-experiments"
ray: "authenticate with GCP"
# Amazon Web Services (EKS)
ray: "authenticate with AWS region us-west-2"
ray: "authenticate with AWS"
# Azure (AKS)
ray: "authenticate with Azure"
# List clusters
ray: "list all GKE clusters"
ray: "list all EKS clusters"
ray: "list all AKS clusters"
# Connect to clusters
ray: "connect to GKE cluster production-cluster in us-west1-c"
ray: "connect to EKS cluster training-cluster in us-west-2"
ray: "connect to AKS cluster ml-cluster in eastus2"
# Check environment
ray: "check environment setup"
# 1. Authenticate
ray: "authenticate with GCP project my-ml-project"
# 2. Connect to cluster
ray: "connect to cluster dev-cluster in us-central1-a"
# 3. Submit test job
ray: "submit job with script test_model.py"
# 4. Check results
ray: "get logs for job raysubmit_123"
# 1. Connect to production cluster
ray: "connect to cluster production-cluster in us-west1-c"
# 2. Deploy service
ray: "create service named prod-inference with model production_model.py in namespace production"
# 3. Scale for load
ray: "scale service prod-inference to 10 replicas"
# 4. Monitor
ray: "get status of service prod-inference"
# 1. Connect to compute cluster
ray: "connect to cluster batch-cluster"
# 2. Submit batch job
ray: "submit job with script batch_processing.py requiring 8 CPUs"
# 3. Monitor progress
ray: "get status for job batch-processing-job"
# 4. Get results
ray: "get logs for job batch-processing-job"
# Development
ray: "authenticate with GCP project dev-project"
ray: "connect to cluster dev-cluster"
ray: "submit job with script experiment.py"
# Staging
ray: "authenticate with AWS region us-east-1"
ray: "connect to cluster staging-cluster"
ray: "submit job with script validation.py"
# Production
ray: "authenticate with GCP project prod-project"
ray: "connect to cluster prod-cluster"
ray: "deploy service with model final_model.py"
# CPU-intensive job
ray: "submit job with script data_processing.py requiring 4 CPUs"
# GPU training job
ray: "submit job with script gpu_training.py requiring 2 GPUs"
# Mixed workload
ray: "submit job with script hybrid_task.py requiring 2 CPUs and 1 GPU"
# Check cluster status
ray: "check environment setup"
# Get detailed logs
ray: "get error logs for job raysubmit_456"
# List failed jobs
ray: "list failed jobs"
# Restart failed job
ray: "cancel job raysubmit_456"
ray: "submit job with script fixed_model.py"
Licensed under the Apache License 2.0.