Pharos

One parameter · three lanes

Tune cost against quality, per request.

No other gateway lets you pivot model class on a single key. Set quality per request. Get the exact tradeoff for that exact call.

Cheapest

$0.003/ image

Open-weights backends. Draft speed, bulk jobs, high-volume pipelines where cost dominates quality.

Stable Diffusion XL · Flux Schnell

~1.4s p50

quality: "cheapest"

Best

$0.06/ image

Frontier models for customer-facing output. Hero shots, covers, anything that represents your brand.

GPT-Image-1 · Flux Ultra

~5.8s p50

quality: "best"

Balanced

$0.04/ image

The sensible default. Quality you can ship, cost you can justify. Covers most production traffic.

Flux Pro · Ideogram v2

~3.2s p50

quality: "balanced"

Six modalities · one base URL

Everything you call OpenAI for. Plus everything you don't.

All routes are OpenAI-compatible. Drop-in replacement, same SDK, same response shapes. Video is async (custom 202 + poll) because nobody returns a 30-second clip in under 30 seconds.

Scroll · six modalities01 → 06

01modality

Chat.

Streaming completions across frontier and open-weight models.

POST /v1/chat/completions

$0.003/ 1K input · streaming

gpt-5.1 · claude-4.6 · llama-4-405b

How do I light a lighthouse?

Start with a beacon. Then route requests by quality.

02modality

Embeddings.

1536-dim vectors for semantic search and retrieval, batched.

POST /v1/embeddings

$0.00013/ 1K tokens

text-embedding-3-large · voyage-3

03modality

Speech.

Text-to-speech, multi-voice, multi-language, low latency.

POST /v1/audio/speech

$0.015/ 1K chars

elevenlabs-v3 · openai-tts-hd

04modality

Transcription.

Speech-to-text with timestamps. Multipart upload, any audio.

POST /v1/audio/transcriptions

$0.006/ minute

whisper-large-v3 · gpt-4o-transcribe

recording · 00:14

[00:00.4] We need a single inference layer for the agent.

[00:06.1] One that doesn't sleep when traffic spikes at 3am.

[00:11.8] Pharos handles all six modalities

05modality

Image.

Generation across SDXL, Flux, GPT-Image-1. quality: param routes the call.

POST /v1/images/generations

$0.003 → $0.06/ image · quality param

flux-pro · gpt-image-1 · sdxl

06modality

Video.

Long-running generation. Returns 202 + poll_url. No timeouts.

POST /v1/videos/generations

$0.40 → $1.20/ second · async

veo-3 · runway-gen4 · kling-2

frame · 048 / 2404.0s @ 60fps

poll_url · 202 accepted~19s remaining

Routes through

GPT-5.1Claude 4.7Llama 4 405BMistral Large 3Gemini 2.5 ProGrok 3Command R+DeepSeek R1DeepSeek V3Qwen 2.5GPT-4oClaude 4.6Yi Largetext-embedding-3Voyage 3Cohere Embed v4ElevenLabs v3OpenAI TTS HDCartesia SonicPlayHT 2Whisper Large v3GPT-4o TranscribeDeepgram Nova 3AssemblyAI UniversalFlux ProFlux SchnellFlux UltraGPT-Image-1Stable Diffusion XLSD 3.5Ideogram v2Recraft v3Veo 3Runway Gen-4Kling 2Pika 2Minimax VideoLuma Ray 2ReplicateTogether AIFalGroqAnthropicCerebrasHyperbolicLeptonModalOpenAIMistralDeepInfra

Pricing · usage-based

No seat fees. No tier traps. Flat markup.

You pay 30% over what we pay upstream. Routing, retries, one consolidated invoice.

Start

$5

Minimum top-up. Every modality unlocked from the first dollar.

Pay as you go

+30%

Flat markup over upstream cost. No subscriptions, no seats.

Scale

Reload

Top up in $5 increments. One invoice across every provider.

Tune cost against quality, per request.

Routes through 50+ models.

Everything you call OpenAI for. Plus everything you don't.

Chat.

Embeddings.

Speech.

Transcription.

Image.

Video.

Two lines change. Everything else stays.

No seat fees. No tier traps. Flat markup.

PHAROS

Routes through 50+ models.

Chat.

Embeddings.

Speech.

Transcription.

Image.

Video.

No seat fees. No tier traps. Flat markup.

Six modalities. One key. Start with $5.