Unlimited Tokens
Send as many requests as you need every month. No per-token billing, no surprise invoices — just a flat rate under our fair-use policy.
Dedicated GPU infrastructure, a full dashboard, and API keys — all for a flat $1,000/mo. We spin up clusters in cohorts of 80–100 customers so the economics work for everyone. Join the waitlist to lock in your seat.
We're gauging interest for our first cohort. Drop your email and we'll notify you as soon as we hit the minimum to launch.
A clean web dashboard to monitor usage, rotate keys, and manage your account. Standard REST API — drop it into any stack in minutes.
Each cohort runs on its own provisioned GPU infrastructure. No noisy neighbours, no throttling — consistent, low-latency inference.
We group customers into cohorts of 80–100 to share the cost of high-end GPU infrastructure fairly. Here's the process:
Drop your email. You'll be placed in the current forming cohort and kept updated on progress toward the minimum.
Once 80 confirmed customers commit, we lock in the cohort. You'll receive an invoice for the first month's payment.
First-month payment is collected (non-refundable, unless we fail to deliver). We provision dedicated GPUs and spin up your cluster.
You receive your dashboard login, API keys, and documentation. Start sending requests immediately.
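The launch rules in the steps above can be sketched in Python. This is a minimal illustration, not the actual signup system: the `Cohort` class and its method names are hypothetical, and only the 80-customer minimum, the 100-seat cap, and the $1,000 monthly fee come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MIN_COHORT_SIZE = 80     # from the text: the cohort locks once 80 customers commit
MAX_COHORT_SIZE = 100    # from the text: seats are limited to 100 per cohort
MONTHLY_FEE_USD = 1_000  # from the text: flat $1,000/mo


@dataclass
class Cohort:
    """Hypothetical model of a forming cohort (illustration only)."""
    committed: List[str] = field(default_factory=list)
    locked: bool = False

    def join(self, email: str) -> bool:
        """Add a signup; return False if all 100 seats are taken."""
        if len(self.committed) >= MAX_COHORT_SIZE:
            return False
        self.committed.append(email)
        return True

    def try_lock(self) -> bool:
        """Lock the cohort once the minimum is reached."""
        if not self.locked and len(self.committed) >= MIN_COHORT_SIZE:
            self.locked = True
        return self.locked

    def first_invoices(self) -> Dict[str, int]:
        """First-month invoices; empty until the cohort is locked,
        mirroring the rule that nobody is charged before the minimum."""
        if not self.locked:
            return {}
        return {email: MONTHLY_FEE_USD for email in self.committed}
```

In this sketch a cohort with 79 signups produces no invoices; the 80th signup lets `try_lock()` succeed, and only then does `first_invoices()` return one $1,000 entry per committed customer.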
If an active cohort has fewer than 100 customers, you can skip the waitlist by paying a premium on top of the regular monthly fee. Once the next cohort launches, you'll be moved into it and your price reverts to the standard $1,000/mo, with no penalty.
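The early-access rule can be expressed as a short Python sketch. The function names are hypothetical, and the premium amount is left as a parameter because the text does not state it; only the 100-seat cap and the $1,000 standard fee come from the page.

```python
from typing import Optional

STANDARD_FEE_USD = 1_000  # from the text: standard $1,000/mo
MAX_COHORT_SIZE = 100     # from the text: early access needs fewer than 100 customers


def can_join_early(active_cohort_size: Optional[int]) -> bool:
    """Early access requires an active cohort with a free seat.
    Pass None when no cohort is active yet."""
    return active_cohort_size is not None and active_cohort_size < MAX_COHORT_SIZE


def monthly_price(on_early_access: bool, premium_usd: int) -> int:
    """Monthly price in USD. premium_usd is an assumed input; the page
    does not publish the actual premium."""
    if on_early_access:
        return STANDARD_FEE_USD + premium_usd
    # After moving to the next cohort, the price reverts to the standard fee.
    return STANDARD_FEE_USD
```

With an assumed premium of 250, `monthly_price(True, 250)` yields 1250 while on early access, and `monthly_price(False, 250)` yields 1000 once the customer has been moved to their own cohort.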
Premium Tier
$1,000/mo
unlimited AI agent usage under fair use
Enterprise Tier
Custom pricing
for sustained high-volume workloads
Premium is designed for most teams and includes unlimited access under the fair-use policy. If your usage consistently exceeds Premium's soft caps, we will help you move to Enterprise, which offers higher quotas, faster queue priority, and stronger support and SLA terms.
GLM 5.1 is a powerful large language model suitable for a wide range of tasks including code generation, content creation, data analysis, and conversational AI.
There is no hard cap on the number of tokens you can send or receive each month. Usage is governed by a fair-use policy (details to be published before launch) that prevents abuse while ensuring legitimate workloads run without interruption.
We collect signups until a cohort reaches its minimum of 80 committed customers; each cohort is capped at 100. At that point we collect the first month's payment, provision dedicated GPU infrastructure, and hand out dashboard credentials and API keys. Meanwhile, signups for the next cohort continue on a separate waitlist.
No payment is collected until the minimum is reached. If it takes longer than expected, we'll keep you updated — but you won't be charged a cent until the cohort is confirmed and infrastructure is being provisioned.
Yes. If an active cohort has fewer than 100 customers, you can pay a premium to join it immediately. When the next cohort goes live, you'll be moved into it and your rate drops back to the standard $1,000/mo.
Only if we deliver. If we fail to provision the infrastructure or the service doesn't go live, you receive a full refund. The non-refundable clause protects us from covering large upfront hardware costs without committed revenue.
Yes. After the first month, the service is month-to-month. Cancel anytime before the next billing cycle and you won't be charged again.
Your inference requests are processed on dedicated infrastructure and are not used to train or fine-tune any models. We do not sell or share your data with third parties.
Each cohort runs on enterprise-grade NVIDIA GPUs provisioned specifically for that group. Exact specs will be shared before launch, but expect hardware optimised for high-throughput, low-latency LLM inference.
Also Available
Need your own isolated AI infrastructure instead of shared inference? We also offer one-off OpenClaw deployments — private, fully managed instances on dedicated hardware. We handle setup, configuration, and ongoing maintenance so you can focus on using it.
Seats are limited to 100 per cohort. Drop your email and be the first to know when we launch.