Key Features

SLURM-as-a-Service

Controller + login node pre-configured; GPU compute nodes enrolled via Ansible. Standard SLURM 23.x CLI out of the box.
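
To illustrate how little changes for users, here is a minimal smoke-test job using the stock SLURM CLI; the GPU count and the script name are illustrative placeholders, not cluster defaults.

#!/bin/bash
#SBATCH --job-name=smoke-test      # name shown in squeue
#SBATCH --nodes=1                  # single node
#SBATCH --gpus=1                   # one GPU via the generic GRES syntax
#SBATCH --time=00:10:00            # ten-minute wall-clock limit

# List the GPUs SLURM allocated to this job
srun nvidia-smi -L

Submit and watch it with the usual commands: sbatch smoke_test.sh, then squeue --me.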

GPU Partitions

Queues for H100, B200, and A6000 nodes; fair-share scheduling enabled. No backfill or preemption in the MVP.
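
Targeting a GPU type is then just a partition choice in the job script; the partition names and file name below are illustrative and may be spelled differently on a given cluster.

#SBATCH --partition=h100       # queue of H100 nodes (name assumed for illustration)
#SBATCH --gres=gpu:2           # two GPUs from that partition

The same job can be redirected at submit time, e.g. sbatch --partition=a6000 job.sh, without touching the script.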

Elastic Capacity

Submit a request, and we add or remove nodes. Hours, not weeks. Pay only for reserved GPUs.

Shared Storage

NFS home/project space plus local NVMe scratch. Parallel file system and object storage are roadmap items.
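
A typical pattern is to stage hot data from NFS onto the node-local NVMe before an I/O-heavy run; the /project and /scratch paths and the training script below are illustrative, not a confirmed mount layout.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus=1

SCRATCH=/scratch/$SLURM_JOB_ID               # node-local NVMe (path assumed)
mkdir -p "$SCRATCH"

# Copy the dataset from shared NFS project space to local scratch
cp -r /project/$USER/dataset "$SCRATCH/"

# Read from the fast local copy; write checkpoints back to NFS home
python train.py --data "$SCRATCH/dataset" --out "$HOME/checkpoints"

# Free the scratch space when the job finishes
rm -rf "$SCRATCH"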

Essential Monitoring

Prometheus + Grafana dashboards; hardware alerts route automatically to Buzz ops, who swap out failing nodes.
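
Alongside the dashboards, node health is visible from the login node with the standard SLURM CLI, for example:

sinfo                  # partition and node state overview
sinfo -R               # reasons recorded for any down or drained nodes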

Secure, Single-Tenant

VPN-isolated cluster; Unix user/group separation. Optional identity integration coming soon.

Expert Support

HPC veterans on call (9×5) with 24×7 hardware escalation.

Why Buzz HPC Managed SLURM

Bare-metal GPU horsepower, zero scheduler upkeep, and people who speak SLURM fluently. It’s the shortest path from research idea to results—no data-center build-out required.

Use Cases
University & Industrial Research
Port existing SLURM workloads to faster GPUs without rewriting job scripts.
Large-Scale AI Training
Schedule multi-node PyTorch jobs under a familiar batch system; see the sample launch script after this list.
Burst Capacity for On-Prem HPC
Keep local clusters small; overflow to Buzz when demand spikes.
Teaching & Workshops
Provision a temporary GPU supercomputer for a course or hackathon, then spin it down.
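
As a sketch of the multi-node training case above: the script launches one torchrun per node and lets torchrun start one worker per local GPU. Node and GPU counts, the rendezvous port, and train.py are illustrative assumptions, not prescribed values.

#!/bin/bash
#SBATCH --job-name=ddp-train
#SBATCH --nodes=2                    # two GPU nodes (illustrative)
#SBATCH --ntasks-per-node=1          # one launcher task per node
#SBATCH --gpus-per-node=4            # four GPUs per node (illustrative)
#SBATCH --time=04:00:00

# Use the first node in the allocation as the rendezvous host
head_node=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

# srun starts one torchrun per node; torchrun spawns one worker per GPU
srun torchrun \
  --nnodes="$SLURM_NNODES" \
  --nproc_per_node=4 \
  --rdzv_backend=c10d \
  --rdzv_endpoint="$head_node:29500" \
  train.py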

Take the complexity out of HPC.

Get your SLURM cluster running on world-class GPUs in a matter of days.