Introducing rbee: Multi-Machine GPU Orchestration Without Kubernetes
Stop overpaying for cloud AI. rbee turns scattered GPUs into one unified system. SSH-based deployment, 5-minute setup, OpenAI-compatible API. Free core + premium modules (€129-€499).
The Problem: Scattered GPUs, Expensive Solutions
We had a problem. Like many developers, we had powerful GPUs sitting across multiple machines: a gaming PC with an RTX 4090, a Mac Studio with M2 Ultra, and a few old servers with Tesla cards. Each machine could run AI models, but coordinating them was a nightmare. And the alternatives were all terrible.
- Ollama: Great for single machines, but can't orchestrate multiple GPUs across your network
- Kubernetes + Ray/KServe: 6 months of setup, requires a dedicated DevOps team, overkill for most use cases
- Cloud APIs: $100-3,000/month when you already own the hardware
- vLLM: Powerful single-model serving, but multi-node deployments typically lean on Ray or Kubernetes, and it doesn't orchestrate heterogeneous hardware (NVIDIA + Apple + AMD) as one pool
We wanted something simple: SSH into our machines, start workers, and get one unified API. That's rbee.
What is rbee? The Solution
rbee is an open-source AI orchestration platform that turns scattered GPUs into one unified system. Think of it as "SSH + OpenAI API" for your own hardware. No Kubernetes. No Docker complexity. Just SSH.
Who Is This For? Homelabbers to Enterprises
How It Works: Three Components
rbee has three main components working together:
Keeper (Central Coordinator)
Manages workers, routes requests, and exposes the OpenAI-compatible API. Runs on your main machine.
Workers (GPU Executors)
Run on each machine with a GPU. Execute model inference and report status back to the keeper.
Queen (Optional Advanced Routing)
Premium module for advanced scheduling, A/B testing, and custom routing logic via Rhai scripts.
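Because the keeper exposes an OpenAI-compatible API, calling it looks exactly like an OpenAI chat-completions call. Here is a minimal sketch using only the Python standard library; the port 7833 comes from the quick-start example in this post, and since it assumes a keeper is already running, the snippet only builds the request (sending it is shown commented out):

```python
import json
import urllib.request

# Keeper endpoint; port 7833 is taken from the quick-start example below.
RBEE_URL = "http://localhost:7833/v1/chat/completions"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request aimed at the rbee keeper."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        RBEE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With a keeper running, send the request and read the reply:
# with urllib.request.urlopen(chat_request("llama-3.1-70b", "Hello")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the wire format matches OpenAI's, existing client code usually only needs its base URL pointed at the keeper.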
```shell
# Install rbee
curl -fsSL https://rbee.dev/install.sh | sh

# Configure your machines (SSH-based)
rbee hive add home-pc 192.168.1.100
rbee hive add mac-studio 192.168.1.101

# Start the keeper
rbee keeper start

# Use the OpenAI-compatible API
curl http://localhost:7833/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-70b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
Why SSH-Based Deployment? Universal, Simple, Secure
We chose SSH because it's universal, secure, and simple. Every developer already has SSH access to their machines. No need to learn Kubernetes, set up Docker registries, or manage complex networking.
Deployment Method Comparison
| Feature | rbee (SSH, recommended) | Kubernetes | Docker Swarm | Manual Scripts |
|---|---|---|---|---|
| Setup Time | 5 minutes | 2-6 months | 1-2 weeks | Varies |
| Complexity | Low | Very High | Medium | High |
| Prerequisites | SSH access | K8s cluster, DevOps expertise | Docker registry, networking | Custom tooling, maintenance |
Configure your machines in ~/.config/rbee/hives.conf (like SSH config), and rbee handles the rest.
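Since the post compares hives.conf to SSH config, here is what such a file might look like. This is a hypothetical sketch; the exact keys and layout are assumptions modeled on ssh_config, not documented rbee syntax:

```text
# Hypothetical ~/.config/rbee/hives.conf -- field names are assumptions,
# modeled on ssh_config, not rbee's documented format.
Host home-pc
    HostName 192.168.1.100
    User rbee
    Backend cuda

Host mac-studio
    HostName 192.168.1.101
    User rbee
    Backend metal
```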
Open Source + Lifetime Pricing: No Subscriptions
rbee core is GPL-3.0 open source and free forever. We believe AI infrastructure should be accessible. Premium modules are one-time purchases (€129-€499 lifetime), not monthly subscriptions.
Free Forever (GPL-3.0)
Core orchestration, multi-machine support, OpenAI API, basic routing
Premium Modules (€129-€499 one-time)
Advanced scheduling, Rhai scripting, telemetry, GDPR auditing, priority support
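To give a flavor of the Rhai scripting module, a custom routing rule might look like the following sketch. Every function and field name here is a hypothetical illustration, not rbee's documented scripting API:

```rhai
// Hypothetical Queen routing script -- fn signature and worker fields
// are assumptions for illustration, not the documented rbee API.
fn route(request, workers) {
    // Steer large models toward high-VRAM workers...
    if request.model.contains("70b") {
        for w in workers {
            if w.free_vram_gb >= 40 { return w; }
        }
    }
    // ...otherwise fall back to the first available worker.
    workers[0]
}
```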
Real-World Example: SaaS Startup ROI
SaaS Startup Use Case
Cost Breakdown:
- Hardware (one-time): $800
- rbee license (one-time): €129
- Power: ~$30/month
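The numbers above give a quick break-even estimate against cloud APIs. A small sketch, using the post's $100-3,000/month cloud range and an assumed EUR-to-USD rate of 1.08 for the license:

```python
# Break-even estimate: one-time hardware + license vs. a recurring cloud bill.
# The 1.08 EUR->USD rate is an assumption for illustration.
HARDWARE = 800          # one-time, USD
LICENSE = 129 * 1.08    # one-time, EUR converted to USD at an assumed rate
POWER = 30              # recurring, USD per month

def breakeven_months(cloud_monthly: float) -> float:
    """Months until self-hosting costs less than a given cloud bill."""
    upfront = HARDWARE + LICENSE
    saved_per_month = cloud_monthly - POWER
    return upfront / saved_per_month

# Against the post's $100/month low end vs. its $3,000/month high end:
print(f"{breakeven_months(100):.1f} months")   # ~13.4 months
print(f"{breakeven_months(3000):.1f} months")  # ~0.3 months
```

So even at the cheap end of the cloud range the hardware pays for itself in just over a year; at the expensive end, within the first month.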
What's Next? Our Roadmap to Enterprise
We're currently in pre-launch with these milestones:
M1 (Q1 2026) - Foundation
Core orchestration, chat models (GGUF), basic GUI, multi-machine support
M2 (Q2 2026) - Expansion
Image generation (Stable Diffusion), TTS, premium modules launch, advanced routing
M3 (Q3 2026) - Enterprise
Advanced scheduling, multi-tenant support, enterprise features, fine-tuning support
Get Started Today: Stop Overpaying for Cloud AI
rbee is available now on GitHub. Join our community, try it out, and let us know what you think. We're building this with feedback from homelabbers, startups, and enterprises. Pre-launch pricing available through Q2 2026.
Stop overpaying for cloud AI. Turn your GPUs into a unified system.
Free core (GPL-3.0) + premium modules (€129-€499 lifetime). No subscriptions. No per-token fees. No vendor lock-in.