GPU counts you can actually get — no more silent deploy timeouts

Some GPUs on the network are passed through to VMs in indivisible groups; paired consumer GPUs, for example, come as 2-GPU units. Previously, a request for a GPU count that didn't match that granularity (like a single GPU on a paired-GPU fleet) could never be placed: it sat silently in the queue and surfaced as a failure only at the deploy timeout, 45 minutes later.

We've fixed this at every layer:

Availability API — /v1/gpu-availability now reports a per-model groupSize, the smallest GPU increment that model can be allocated in.
Fail-fast validation — creating a VM or cluster with a GPU count that isn't a multiple of the model's groupSize is now rejected immediately with a clear 400 error listing the valid counts, instead of being accepted and timing out later.
Dashboard wizard — the create-VM flow now shows a GPU count dropdown containing only counts that can actually be placed, and defaults to the first GPU model with free capacity.
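The fail-fast rule described above comes down to a multiple-of-groupSize check. Here is a minimal sketch of that rule, not the platform's actual implementation: the function names, the max_gpus cap, and the error wording are all hypothetical, and the ValueError simply stands in for the 400 response.

```python
from typing import List


def valid_gpu_counts(group_size: int, max_gpus: int) -> List[int]:
    """Return the positive multiples of group_size up to max_gpus.

    max_gpus is a hypothetical upper bound (for example, the free capacity
    for the model); the changelog does not say how the real list is capped.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return list(range(group_size, max_gpus + 1, group_size))


def validate_gpu_count(requested: int, group_size: int, max_gpus: int) -> None:
    """Reject a count that cannot be placed, listing the counts that can.

    Raising ValueError here stands in for the immediate 400 error.
    """
    allowed = valid_gpu_counts(group_size, max_gpus)
    if requested not in allowed:
        raise ValueError(
            f"gpu count {requested} is not valid for groupSize {group_size}; "
            f"valid counts: {allowed}"
        )
```

With a groupSize of 2, a request for 1 GPU fails right away and the message lists 2, 4, 6, and so on, which is the behavior the change is meant to replace the 45-minute timeout with.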

If you use the API directly, check the new groupSize field on /v1/gpu-availability and request a GPU count that is a multiple of it. In the dashboard, there is nothing to do: invalid options simply no longer appear.
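For API users, one way to apply groupSize on the client side is to round the desired count up to the next valid increment before sending the request. The sketch below works on an already-parsed availability response. Only the groupSize field is named in this changelog; the model and available keys, the sample values, and the function name are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical parsed response from /v1/gpu-availability.
# Only "groupSize" is confirmed; "model" and "available" are assumed keys.
SAMPLE_AVAILABILITY = [
    {"model": "paired-consumer-gpu", "groupSize": 2, "available": 6},
    {"model": "datacenter-gpu", "groupSize": 1, "available": 3},
]


def smallest_placeable_count(entry: dict, desired: int) -> Optional[int]:
    """Round desired up to a multiple of the model's groupSize.

    Returns None when the rounded count exceeds the (assumed) free capacity.
    """
    if desired < 1:
        raise ValueError("desired must be at least 1")
    group_size = entry["groupSize"]
    # Integer ceiling division, then scale back up to a multiple of group_size.
    rounded = (desired + group_size - 1) // group_size * group_size
    return rounded if rounded <= entry["available"] else None
```

In this sample data, asking for 1 GPU on the paired model rounds up to 2, while asking for 7 returns None because the next valid count, 8, is more than the 6 assumed to be free.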

Try it on OpenRelay