Usage-Based Pricing

Pay for what
you use.

Transparent, GPU-hour based metering with clear quotas. From solo developers to enterprise deployment fleets, only pay for the compute you consume.

Developer

Free

Start building with OpenAI-compatible inference. Perfect for prototyping and side projects.

5 GPU Hours / mo

OPENAI-COMPATIBLE API

3 Model Deployments

DEPLOYMENT-AWARE ROUTING

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Community Support

DISCORD + GITHUB

Recommended

Team

$49/mo

Production inference with usage metering and priority support. For teams shipping AI features.

100 GPU Hours / mo

MULTI-BACKEND ROUTING

Unlimited Deployments

PER-API-KEY QUOTAS

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Priority Support

4H RESPONSE GUARANTEE

Enterprise

Contact

Custom SLA, dedicated inference fleet, and white-glove onboarding for production at scale.

Unlimited GPU Hours

H100 CLUSTER READY

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Custom Governance

SAML SSO & RBAC

Dedicated Support

NAMED SOLUTIONS ENGINEER

Full Capability Matrix

Inference & Routing
Developer
Team
Enterprise
GPU Hours / mo
5
100
Unlimited
Model Deployments
3
Unlimited
Unlimited
OpenAI-Compatible API
Yes
Yes
Yes
Deployment-Aware Routing
Single
Multi-Region
Global
Training & Jobs
Developer
Team
Enterprise
Fine-Tuning Jobs
5 / mo
Unlimited
RL Training Jobs
2 / mo
Unlimited
Parallel Job Limit
10
Infinite
Job Observability
Basic
Advanced
Custom Dashboards
Platform & Security
Developer
Team
Enterprise
API Keys
3
25
Unlimited
Usage Metering & Quotas
Basic
Per-Key
Custom Policies
SSO / SAML
Yes
Data Retention
7 Days
90 Days
Custom
Developer Experience
Developer
Team
Enterprise
SDK Languages
All Languages
All Languages
All Languages
Dashboard
Basic
Full
White-Label
Support
Community
Priority (4H)
Dedicated Engineer
Docs & API Reference
Yes
Yes
Yes

Deploy your first model today.

OpenAI-compatible inference, GPU-hour based pricing, and per-API-key metering. No GPU cluster required.