How to Set Up Multi-Machine GPU Orchestration in 5 Minutes
Complete step-by-step tutorial: install rbee, configure hives, deploy workers, and download models. Multi-machine GPU orchestration in five minutes, no Kubernetes required.
Prerequisites: What You Need
Before you begin, make sure you have:
- 2+ machines with GPUs (NVIDIA, AMD, or Apple Silicon)
- SSH access to all machines
- Linux, macOS, or Windows (WSL2)
- Basic command-line knowledge
Step 1: Install rbee on Your Main Machine
Install rbee on your main machine (the "keeper"):
```bash
# Download and install rbee
curl -fsSL https://rbee.dev/install.sh | sh
```
```bash
# Verify installation
rbee --version
```
Step 2: Configure Your Hives with SSH
A "hive" in rbee terminology is a machine that can run workers. Configure your hives using SSH connection details:
```bash
# Add your first machine
rbee hive add gaming-pc 192.168.1.100 \
  --user your-username \
  --ssh-key ~/.ssh/id_rsa
```
```bash
# Add a Mac Studio
rbee hive add mac-studio 192.168.1.101 \
  --user your-username
```
```bash
# Add a remote server
rbee hive add cloud-gpu ssh.example.com \
  --user ubuntu \
  --port 2222
```
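Each hive entry maps to a standard SSH connection. For comparison, the cloud-gpu machine above would correspond to an ~/.ssh/config entry like this (hypothetical values):

```
Host cloud-gpu
    HostName ssh.example.com
    User ubuntu
    Port 2222
```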
```bash
# List all configured hives
rbee hive list
```
rbee reads from your ~/.ssh/config file. If you already have SSH aliases set up, you can reference them directly:
```bash
rbee hive add gaming-pc my-ssh-alias
```
Step 3: Deploy Workers to Your Hives
rbee will automatically deploy workers to your hives, but you can also install manually:
```bash
# SSH into each machine and run:
curl -fsSL https://rbee.dev/install-worker.sh | sh
```
```bash
# Or let rbee deploy automatically:
rbee worker deploy --all-hives
```
Step 4: Download AI Models to Your Hives
Download a model to one or more of your machines:
```bash
# Download Llama 3.1 8B to all hives
rbee model download llama-3.1-8b --all-hives
```
```bash
# Or download to specific hives
rbee model download llama-3.1-70b \
  --hive gaming-pc \
  --hive mac-studio
```
```bash
# List available models
rbee model list
```
Step 5: Start the Keeper (Central Coordinator)
The keeper is the central coordinator that manages workers and routes requests:
```bash
# Start the keeper (runs on port 7833 by default)
rbee keeper start
```
```bash
# Or run in the background
rbee keeper start --daemon
```
```bash
# Check status
rbee keeper status
```
Step 6: Start Workers on All Hives
Start workers on your hives to begin processing requests:
```bash
# Start workers on all hives
rbee worker start --all-hives
```
```bash
# Or start on specific hives
rbee worker start --hive gaming-pc --hive mac-studio
```
```bash
# Check worker status
rbee worker list
```
Step 7: Test Your Setup with API Requests
Make your first API request to verify everything works:
```bash
# Test with curl
curl http://localhost:7833/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [
      {"role": "user", "content": "Hello! What can you do?"}
    ]
  }'
```
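Because the endpoint is OpenAI-compatible, any HTTP client can call it. As a minimal sketch using only the Python standard library (it assumes the keeper is running on the default localhost:7833; the `build_request` helper is our own, not part of rbee):

```python
import json
import urllib.request

KEEPER_URL = "http://localhost:7833/v1/chat/completions"  # default keeper port

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the rbee keeper."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        KEEPER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("llama-3.1-8b", "Hello! What can you do?")
# Send it with: reply = json.load(urllib.request.urlopen(req))
print(req.full_url)
```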
```bash
# Or use the rbee CLI
rbee chat "Tell me about GPU orchestration"
```
Configuration File: ~/.config/rbee/config.toml
All configuration is stored in ~/.config/rbee/config.toml. Here's an example:
```toml
[keeper]
host = "0.0.0.0"
port = 7833
log_level = "info"

[[hives]]
name = "gaming-pc"
host = "192.168.1.100"
user = "your-username"
ssh_key = "~/.ssh/id_rsa"

[[hives]]
name = "mac-studio"
host = "192.168.1.101"
user = "your-username"

[routing]
strategy = "round-robin" # or "least-loaded", "custom"
```
Next Steps: Advanced Features
Now that you have rbee running, explore these advanced features:
- Custom routing: Use Rhai scripts for A/B testing and canary deployments
- Model management: Download and manage multiple models across your system
- Monitoring: Set up the web UI for real-time monitoring
- GDPR compliance: Enable audit logging and data retention policies
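To make the round-robin strategy from the config concrete, here is a minimal sketch in Python of what such a router does. This is our illustration, not rbee's actual implementation; the hive names are the ones configured earlier:

```python
from itertools import cycle

class RoundRobinRouter:
    """Cycle through hives so consecutive requests land on different machines."""

    def __init__(self, hives):
        self._next_hive = cycle(hives)

    def route(self, request):
        # A real router would also check model availability and worker health.
        return next(self._next_hive)

router = RoundRobinRouter(["gaming-pc", "mac-studio", "cloud-gpu"])
print([router.route({"model": "llama-3.1-8b"}) for _ in range(4)])
# → ['gaming-pc', 'mac-studio', 'cloud-gpu', 'gaming-pc']
```

A "least-loaded" strategy would instead pick the hive reporting the fewest in-flight requests; "custom" hands the decision to a user-supplied Rhai script.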
Troubleshooting: Common Issues
Workers not connecting?
- Check SSH connectivity: `ssh user@host`
- Verify that firewall rules allow port 7833
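A quick way to verify the firewall point is a TCP probe from one of your hives toward the keeper. A small Python sketch (the function name and example IP are ours):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (e.g. the keeper's 7833)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from a hive to confirm it can reach the keeper, e.g.:
# port_open("192.168.1.50", 7833)
```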
Model not found?
- Run `rbee model list` to see available models
- Download with `rbee model download MODEL_NAME`
Slow inference?
- Check GPU utilization: `rbee worker stats`
- Consider using smaller quantized models (Q4, Q5)
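As a rule of thumb behind the quantization advice, the VRAM a model needs scales with parameter count times bits per weight. A rough estimator (the 20% overhead factor for KV cache and activations is our assumption; real usage varies with context length):

```python
def approx_vram_gb(params_billion: float, bits_per_weight: float,
                   overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights (params * bits / 8) plus ~20% overhead."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

# An 8B model at Q4 (~4.5 bits per weight including scales) vs. FP16:
print(round(approx_vram_gb(8, 4.5), 1), round(approx_vram_gb(8, 16), 1))
# → 5.4 19.2
```

So a Q4 quantization of an 8B model fits comfortably on an 8 GB GPU, while the FP16 version does not.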
You've successfully set up multi-machine GPU orchestration with rbee! Your GPUs are now unified into a single system, accessible via an OpenAI-compatible API.
What's next? Explore advanced routing, set up monitoring, or integrate rbee into your existing applications.
Get Help: Community & Support
Need assistance? We're here to help.
Ready to orchestrate your GPU system?
You've completed the setup tutorial. Now explore advanced features like custom routing, monitoring, and GDPR compliance. Free core (GPL-3.0) + optional premium modules (€129-€499 lifetime).