Introducing rbee: Multi-Machine GPU Orchestration Without Kubernetes
Stop overpaying for cloud AI. rbee turns scattered GPUs into one unified system. SSH-based deployment, 5-minute setup, OpenAI-compatible API. Free core + premium modules (€129-€499).
The Problem: Scattered GPUs, Expensive Solutions
We had a problem. Like many developers, we had powerful GPUs sitting across multiple machines: a gaming PC with an RTX 4090, a Mac Studio with M2 Ultra, and a few old servers with Tesla cards. Each machine could run AI models, but coordinating them was a nightmare. And the alternatives were all terrible.
- Ollama: Great for single machines, but can't orchestrate multiple GPUs across your network
- Kubernetes + Ray/KServe: 6 months of setup, requires a dedicated DevOps team, overkill for most use cases
- Cloud APIs: $100-3,000/month when you already own the hardware
- vLLM: Powerful single-model serving, but multi-node deployments typically lean on Ray or Kubernetes, and it doesn't orchestrate heterogeneous hardware (NVIDIA + Apple + AMD) as one pool
We wanted something simple: SSH into our machines, start workers, and get one unified API. That's rbee.
What is rbee? The Solution
rbee is an open-source AI orchestration platform that turns scattered GPUs into one unified system. Think of it as "SSH + OpenAI API" for your own hardware. No Kubernetes. No Docker complexity. Just SSH.
Who Is This For? Homelabbers to Enterprises
How It Works: Three Components
rbee has three main components working together:
Keeper (Central Coordinator)
Manages workers, routes requests, and exposes the OpenAI-compatible API. Runs on your main machine.
Workers (GPU Executors)
Run on each machine with a GPU. Execute model inference and report status back to the keeper.
Queen (Optional Advanced Routing)
Premium module for advanced scheduling, A/B testing, and custom routing logic via Rhai scripts.
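Because the keeper exposes an OpenAI-compatible API, calling it looks exactly like an OpenAI chat-completions call. Here is a minimal sketch using only the Python standard library; the port 7833 comes from the quick-start example in this post, and since it assumes a keeper is already running, the snippet only builds the request (sending it is shown commented out):

```python
import json
import urllib.request

# Keeper endpoint; port 7833 is taken from the quick-start example below.
RBEE_URL = "http://localhost:7833/v1/chat/completions"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request aimed at the rbee keeper."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        RBEE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With a keeper running, send the request and read the reply:
# with urllib.request.urlopen(chat_request("llama-3.1-70b", "Hello")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the wire format matches OpenAI's, existing client code usually only needs its base URL pointed at the keeper.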
```shell
# Install rbee
curl -fsSL https://rbee.dev/install.sh | sh

# Configure your machines (SSH-based)
rbee hive add home-pc 192.168.1.100
rbee hive add mac-studio 192.168.1.101

# Start the keeper
rbee keeper start

# Use the OpenAI-compatible API
curl http://localhost:7833/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-70b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
Why SSH-Based Deployment? Universal, Simple, Secure
We chose SSH because it's universal, secure, and simple. Every developer already has SSH access to their machines. No need to learn Kubernetes, set up Docker registries, or manage complex networking.
Deployment Method Comparison
| Feature | rbee (SSH, recommended) | Kubernetes | Docker Swarm | Manual Scripts |
|---|---|---|---|---|
| Setup Time | 5 minutes | 2-6 months | 1-2 weeks | Varies |
| Complexity | Low | Very High | Medium | High |
| Prerequisites | SSH access | K8s cluster, DevOps expertise | Docker registry, networking | Custom tooling, maintenance |
Configure your machines in ~/.config/rbee/hives.conf (like SSH config), and rbee handles the rest.
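Since the post compares hives.conf to SSH config, here is what such a file might look like. This is a hypothetical sketch; the exact keys and layout are assumptions modeled on ssh_config, not documented rbee syntax:

```text
# Hypothetical ~/.config/rbee/hives.conf -- field names are assumptions,
# modeled on ssh_config, not rbee's documented format.
Host home-pc
    HostName 192.168.1.100
    User rbee
    Backend cuda

Host mac-studio
    HostName 192.168.1.101
    User rbee
    Backend metal
```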
Open Source + Lifetime Pricing: No Subscriptions
rbee core is GPL-3.0 open source and free forever. We believe AI infrastructure should be accessible. Premium modules are one-time purchases (€129-€499 lifetime), not monthly subscriptions.
Free Forever (GPL-3.0)
Core orchestration, multi-machine support, OpenAI API, basic routing
Premium Modules (€129-€499 one-time)
Advanced scheduling, Rhai scripting, telemetry, GDPR auditing, priority support
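To give a flavor of the Rhai scripting module, a custom routing rule might look like the following sketch. Every function and field name here is a hypothetical illustration, not rbee's documented scripting API:

```rhai
// Hypothetical Queen routing script -- fn signature and worker fields
// are assumptions for illustration, not the documented rbee API.
fn route(request, workers) {
    // Steer large models toward high-VRAM workers...
    if request.model.contains("70b") {
        for w in workers {
            if w.free_vram_gb >= 40 { return w; }
        }
    }
    // ...otherwise fall back to the first available worker.
    workers[0]
}
```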
Real-World Example: SaaS Startup ROI
SaaS Startup Use Case
Cost Breakdown:
- Hardware (one-time): $800
- rbee license (one-time): €129
- Power: ~$30/month
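The numbers above give a quick break-even estimate against cloud APIs. A small sketch, using the post's $100-3,000/month cloud range and an assumed EUR-to-USD rate of 1.08 for the license:

```python
# Break-even estimate: one-time hardware + license vs. a recurring cloud bill.
# The 1.08 EUR->USD rate is an assumption for illustration.
HARDWARE = 800          # one-time, USD
LICENSE = 129 * 1.08    # one-time, EUR converted to USD at an assumed rate
POWER = 30              # recurring, USD per month

def breakeven_months(cloud_monthly: float) -> float:
    """Months until self-hosting costs less than a given cloud bill."""
    upfront = HARDWARE + LICENSE
    saved_per_month = cloud_monthly - POWER
    return upfront / saved_per_month

# Against the post's $100/month low end vs. its $3,000/month high end:
print(f"{breakeven_months(100):.1f} months")   # ~13.4 months
print(f"{breakeven_months(3000):.1f} months")  # ~0.3 months
```

So even at the cheap end of the cloud range the hardware pays for itself in just over a year; at the expensive end, within the first month.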
What's Next? Our Roadmap to Enterprise
We're currently in pre-launch with these milestones:
M1 (Q1 2026) - Foundation
Core orchestration, chat models (GGUF), basic GUI, multi-machine support
M2 (Q2 2026) - Expansion
Image generation (Stable Diffusion), TTS, premium modules launch, advanced routing
M3 (Q3 2026) - Enterprise
Advanced scheduling, multi-tenant support, enterprise features, fine-tuning support
Get Started Today: Stop Overpaying for Cloud AI
rbee is available now on GitHub. Join our community, try it out, and let us know what you think. We're building this with feedback from homelabbers, startups, and enterprises. Pre-launch pricing available through Q2 2026.
Stop overpaying for cloud AI. Turn your GPUs into a unified system.
Free core (GPL-3.0) + premium modules (€129-€499 lifetime). No subscriptions. No per-token fees. No vendor lock-in.