How to Set Up Multi-Machine GPU Orchestration in 5 Minutes
Complete step-by-step tutorial: install rbee, configure hives, deploy workers, and download models. Multi-machine GPU orchestration in five minutes, no Kubernetes required.
Prerequisites: What You Need
Before you begin, make sure you have:
- 2+ machines with GPUs (NVIDIA, AMD, or Apple Silicon)
- SSH access to all machines
- Linux, macOS, or Windows (WSL2)
- Basic command-line knowledge
Step 1: Install rbee on Your Main Machine
Install rbee on your main machine (the "keeper"):
```bash
# Download and install rbee
curl -fsSL https://rbee.dev/install.sh | sh
```
```bash
# Verify installation
rbee --version
```
Step 2: Configure Your Hives with SSH
A "hive" in rbee terminology is a machine that can run workers. Configure your hives using SSH connection details:
```bash
# Add your first machine
rbee hive add gaming-pc 192.168.1.100 \
  --user your-username \
  --ssh-key ~/.ssh/id_rsa
```
```bash
# Add a Mac Studio
rbee hive add mac-studio 192.168.1.101 \
  --user your-username
```
```bash
# Add a remote server
rbee hive add cloud-gpu ssh.example.com \
  --user ubuntu \
  --port 2222
```
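Each hive entry maps to a standard SSH connection. For comparison, the cloud-gpu machine above would correspond to an ~/.ssh/config entry like this (hypothetical values):

```
Host cloud-gpu
    HostName ssh.example.com
    User ubuntu
    Port 2222
```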
```bash
# List all configured hives
rbee hive list
```
rbee reads from your ~/.ssh/config file. If you already have SSH aliases set up, you can reference them directly:
```bash
rbee hive add gaming-pc my-ssh-alias
```
Step 3: Deploy Workers to Your Hives
rbee will automatically deploy workers to your hives, but you can also install manually:
```bash
# SSH into each machine and run:
curl -fsSL https://rbee.dev/install-worker.sh | sh
```
```bash
# Or let rbee deploy automatically:
rbee worker deploy --all-hives
```
Step 4: Download AI Models to Your Hives
Download a model to one or more of your machines:
```bash
# Download Llama 3.1 8B to all hives
rbee model download llama-3.1-8b --all-hives
```
```bash
# Or download to specific hives
rbee model download llama-3.1-70b \
  --hive gaming-pc \
  --hive mac-studio
```
```bash
# List available models
rbee model list
```
Step 5: Start the Keeper (Central Coordinator)
The keeper is the central coordinator that manages workers and routes requests:
```bash
# Start the keeper (runs on port 7833 by default)
rbee keeper start
```
```bash
# Or run in the background
rbee keeper start --daemon
```
```bash
# Check status
rbee keeper status
```
Step 6: Start Workers on All Hives
Start workers on your hives to begin processing requests:
```bash
# Start workers on all hives
rbee worker start --all-hives
```
```bash
# Or start on specific hives
rbee worker start --hive gaming-pc --hive mac-studio
```
```bash
# Check worker status
rbee worker list
```
Step 7: Test Your Setup with API Requests
Make your first API request to verify everything works:
```bash
# Test with curl
curl http://localhost:7833/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [
      {"role": "user", "content": "Hello! What can you do?"}
    ]
  }'
```
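Because the endpoint is OpenAI-compatible, any HTTP client can call it. As a minimal sketch using only the Python standard library (it assumes the keeper is running on the default localhost:7833; the `build_request` helper is our own, not part of rbee):

```python
import json
import urllib.request

KEEPER_URL = "http://localhost:7833/v1/chat/completions"  # default keeper port

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the rbee keeper."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        KEEPER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("llama-3.1-8b", "Hello! What can you do?")
# Send it with: reply = json.load(urllib.request.urlopen(req))
print(req.full_url)
```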
```bash
# Or use the rbee CLI
rbee chat "Tell me about GPU orchestration"
```
Configuration File: ~/.config/rbee/config.toml
All configuration is stored in ~/.config/rbee/config.toml. Here's an example:
```toml
[keeper]
host = "0.0.0.0"
port = 7833
log_level = "info"

[[hives]]
name = "gaming-pc"
host = "192.168.1.100"
user = "your-username"
ssh_key = "~/.ssh/id_rsa"

[[hives]]
name = "mac-studio"
host = "192.168.1.101"
user = "your-username"

[routing]
strategy = "round-robin" # or "least-loaded", "custom"
```
Next Steps: Advanced Features
Now that you have rbee running, explore these advanced features:
- Custom routing: Use Rhai scripts for A/B testing and canary deployments
- Model management: Download and manage multiple models across your system
- Monitoring: Set up the web UI for real-time monitoring
- GDPR compliance: Enable audit logging and data retention policies
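To make the round-robin strategy from the config concrete, here is a minimal sketch in Python of what such a router does. This is our illustration, not rbee's actual implementation; the hive names are the ones configured earlier:

```python
from itertools import cycle

class RoundRobinRouter:
    """Cycle through hives so consecutive requests land on different machines."""

    def __init__(self, hives):
        self._next_hive = cycle(hives)

    def route(self, request):
        # A real router would also check model availability and worker health.
        return next(self._next_hive)

router = RoundRobinRouter(["gaming-pc", "mac-studio", "cloud-gpu"])
print([router.route({"model": "llama-3.1-8b"}) for _ in range(4)])
# → ['gaming-pc', 'mac-studio', 'cloud-gpu', 'gaming-pc']
```

A "least-loaded" strategy would instead pick the hive reporting the fewest in-flight requests; "custom" hands the decision to a user-supplied Rhai script.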
Troubleshooting: Common Issues
Workers not connecting?
- Check SSH connectivity: `ssh user@host`
- Verify that firewall rules allow port 7833
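A quick way to verify the firewall point is a TCP probe from one of your hives toward the keeper. A small Python sketch (the function name and example IP are ours):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (e.g. the keeper's 7833)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from a hive to confirm it can reach the keeper, e.g.:
# port_open("192.168.1.50", 7833)
```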
Model not found?
- Run `rbee model list` to see available models
- Download with `rbee model download MODEL_NAME`
Slow inference?
- Check GPU utilization: `rbee worker stats`
- Consider using smaller quantized models (Q4, Q5)
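As a rule of thumb behind the quantization advice, the VRAM a model needs scales with parameter count times bits per weight. A rough estimator (the 20% overhead factor for KV cache and activations is our assumption; real usage varies with context length):

```python
def approx_vram_gb(params_billion: float, bits_per_weight: float,
                   overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights (params * bits / 8) plus ~20% overhead."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

# An 8B model at Q4 (~4.5 bits per weight including scales) vs. FP16:
print(round(approx_vram_gb(8, 4.5), 1), round(approx_vram_gb(8, 16), 1))
# → 5.4 19.2
```

So a Q4 quantization of an 8B model fits comfortably on an 8 GB GPU, while the FP16 version does not.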
You've successfully set up multi-machine GPU orchestration with rbee! Your GPUs are now unified into a single system, accessible via an OpenAI-compatible API.
What's next? Explore advanced routing, set up monitoring, or integrate rbee into your existing applications.
Get Help: Community & Support
Need assistance? We're here to help.
Ready to orchestrate your GPU system?
You've completed the setup tutorial. Now explore advanced features like custom routing, monitoring, and GDPR compliance. Free core (GPL-3.0) + optional premium modules (€129-€499 lifetime).