Usage-Based Pricing

Pay for what
you use.

Transparent, GPU-hour based metering with clear quotas. From solo developers to enterprise deployment fleets, only pay for the compute you consume.

Developer

Free

Start building with OpenAI-compatible inference. Perfect for prototyping and side projects.

5 GPU Hours / mo

OPENAI-COMPATIBLE API

3 Model Deployments

DEPLOYMENT-AWARE ROUTING

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Community Support

DISCORD + GITHUB

Recommended

Team

$49/mo

Production inference with usage metering and priority support. For teams shipping AI features.

100 GPU Hours / mo

MULTI-BACKEND ROUTING

Unlimited Deployments

PER-API-KEY QUOTAS

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Priority Support

4H RESPONSE GUARANTEE

Enterprise

Contact

Custom SLA, dedicated inference fleet, and white-glove onboarding for production at scale.

Unlimited GPU Hours

H100 CLUSTER READY

All SDK Languages

TYPESCRIPT, PYTHON, GO, RUST

Custom Governance

SAML SSO & RBAC

Dedicated Support

NAMED SOLUTIONS ENGINEER

Full Capability Matrix

Inference & Routing

Developer

Team

Enterprise

GPU Hours / mo

100

Unlimited

Model Deployments

Unlimited

OpenAI-Compatible API

Yes

Deployment-Aware Routing

Single

Multi-Region

Global

Training & Jobs

Developer

Team

Enterprise

Fine-Tuning Jobs

—

5 / mo

Unlimited

RL Training Jobs

—

2 / mo

Unlimited

Parallel Job Limit

—

Infinite

Job Observability

Basic

Advanced

Custom Dashboards

Platform & Security

Developer

Team

Enterprise

API Keys

Unlimited

Usage Metering & Quotas

Basic

Per-Key

Custom Policies

SSO / SAML

—

Yes

Data Retention

7 Days

90 Days

Custom

Developer Experience

Developer

Team

Enterprise

SDK Languages

All Languages

Dashboard

Basic

Full

White-Label

Support

Community

Priority (4H)

Dedicated Engineer

Docs & API Reference

Yes

Deploy your first model today.

OpenAI-compatible inference, GPU-hour based pricing, and per-API-key metering. No GPU cluster required.

Documentation Contact Us

Pay for what you use.

Developer

Team

Enterprise

Full Capability Matrix

Deploy your first model today.

Pay for what
you use.