Question 1

What is o10?

Accepted Answer

o10 is the control plane for inference spend — an OpenAI-compatible LLM routing gateway. It holds your quality floor in the request path and routes every AI inference call to the cheapest model that clears it across per-token APIs, aggregators, AWS Bedrock, and venues you already have (including your own cloud commitments and BYOK provider keys). Shadow mode proves savings without changing production; enforce mode holds the bar and budget envelopes in the path. Evals define per-use-case quality floors; KYI governs the supply chain for board reporting; an immutable ledger records model, venue, and fully loaded cost on every call.

Question 2

What is the difference between shadow and enforce mode?

Accepted Answer

Shadow mode mirrors live traffic and shows what would have routed and saved — without changing production responses. Enforce mode places o10 in the request path and actively routes each call to the cheapest eval-passing model within your budget envelope. Teams always start in shadow to build a verified per-use-case baseline; finance signs off before enforce flips. Both modes write to the immutable ledger; only enforce changes spend.

Question 3

How much can inference routing save?

Accepted Answer

o10 has observed up to 638× compliant price spread for the same quality floor across venues in June 2026 benchmarks. Workload-specific monthly savings typically range from 40–94% depending on use case, eval floor, and venue mix — RAG and batch at lean floors show the largest percentages. These are not guarantees: shadow mode proves your organization's number against your traffic. Teams without routing in the path leave an estimated 40–70% of compliant savings uncaptured.

Question 4

What is Know Your Inference (KYI)?

Accepted Answer

Know Your Inference (KYI) is a governance framework by Shen Pandi that scores inference systems across five weighted pillars: Performance (25%), Economics (25%), Integration (20%), Strategy (20%), and Risk (10%). Each pillar scores 0–100; the composite rolls into a confidence level and board-signable recommendation. KYI runs continuously in the o10 control plane — not as a one-off audit — so every routed call and eval updates the score. A composite floor of 65 triggers enforcement levers: cap, rightsizing, or sunset per policy.

Question 5

Does o10 replace my AI gateway?

Accepted Answer

No. o10 does not replace your AI gateway or developer-facing APIs. It sits above gateways and clouds, adding spend enforcement, eval-gated routing, policy, and CFO-grade ledger — not proxy compatibility. Teams keep their per-token API gateway, OpenRouter, or LiteLLM for access; o10 changes which model and venue serve each request based on cost, eval floor, and governance rules. The split is intentional: gateways provide doors; control planes enforce economics.

Question 6

What is a quality floor?

Accepted Answer

A quality floor is the minimum eval score a model must achieve for a specific use case before o10 routes production traffic to it. Floors are per workload — support, RAG, code, and batch clear at different bars — and measured by replaying representative traffic through eval suites, not assumed from vendor benchmarks. Once a cheaper candidate passes the floor, o10 can route to it in shadow (proof) or enforce (live). Floors without evals are hopes; evals without floors are expensive defaults.

Question 7

Which providers does o10 support?

Accepted Answer

o10 fulfills routed traffic across venues including per-token API gateways, OpenRouter, Amazon Bedrock (including your committed Bedrock spend as a venue), and BYOK provider keys. o10 does not own datacenters or hold committed/reserved capacity of its own — committed spend you already have can be a venue o10 routes through. A single control plane sits above all venues.

Question 8

How are savings verified?

Accepted Answer

Savings are verified against your own shadow baseline per use case — not industry averages or vendor marketing claims. o10 mirrors a week or more of production traffic, segments by workload, and compares what you actually spent versus what you would have spent on the cheapest eval-passing route at the same quality floor. Finance signs off on the delta before enforce mode flips. Gainshare pricing ties o10 fees to this verified number, so savings must be real and auditable.

Question 9

How do I integrate with a free route key?

Accepted Answer

Start free at https://app.o10.io/signup — no card. Signup creates an org and issues a free o10 route key (o10_sk_…), shown once. Point your OpenAI-compatible client at https://app.o10.io/v1, set model to o10/auto (or another mode), and paste the free route key as the API key. Use a placeholder like $O10_ROUTE_KEY in docs — copy your own key after signup; do not use a shared demo key. Shadow mode shows what you'd save before enforce.

Question 10

What is Free / BYOK?

Accepted Answer

Free is $0 with no card required. Free is bring-your-own provider keys (BYOK): you route on your own provider keys; o10 can run shadow receipts on live traffic. Paid plans add routed credits. Free does not include prepaid o10-hosted frontier credits.

Question 11

What are the o10/* modes?

Accepted Answer

Four modes. o10/auto — default: cheapest model that clears the quality bar. o10/frontier — value-first across frontier-class models (open-weight and proprietary); orgs can set a preferred frontier model. o10/squad — multi-agent planner → workers → judge for hard multi-step work (typically 1–4 minutes). o10/pinned — org-saved model override; if unavailable, falls back to auto with a visible notice. Base URL https://app.o10.io/v1. Concrete slugs and ! pin syntax also work per-request.

Question 12

What are Frontier Tokens?

Accepted Answer

Frontier Tokens (model o10/frontier): value-first across frontier-class models — including open-weight (e.g. DeepSeek R1, GLM, Kimi) and proprietary (e.g. GPT-5.5, Claude Opus/Sonnet). Orgs can set a preferred frontier model in settings. Status: beta / live for entitled accounts. Open https://app.o10.io/frontier after signup.

Question 13

What is o10/pinned?

Accepted Answer

o10/pinned always routes to the specific model your org saves in settings — an explicit override that skips auto-downgrade. If the pinned model is unavailable, o10 falls back to auto with a visible notice. Concrete slugs and ! pin syntax also still work per-request.

Question 14

What is Squad and how long does it take?

Accepted Answer

Squad (model o10/squad): multi-agent planner → workers → judge. For hard multi-step work. Latency is typically 1–4 minutes — not chat-speed. Status: beta / live. Open https://app.o10.io/squad.

Question 15

What is o10 Tune?

Accepted Answer

Free prompt optimization. Submit a prompt and 5–20 real examples; o10 searches cheaper models and prompt variants and returns a tuned prompt + recommended model with a before/after receipt. Quality is measured on a held-out split of your examples. Successful tunes are free; failures never charge. Available in console (app.o10.io/tune), API (POST /v1/optimizations), Slack (@o10 tune), and DeepShell.

Question 16

What is DeepShell?

Accepted Answer

DeepShell is o10's desktop AI agent (Beta) for business users. Runs tasks on your computer with your files and apps; every action passes an approval gate; a tamper-evident local audit log records what the agent did; recurring tasks can be scheduled; one-click connectors (Google Drive, Notion, Microsoft 365) via OAuth; powered by o10 routing. Free to download; usage runs on o10 credits/BYOK. macOS today; Windows installer in progress. Download at app.o10.io/deepshell.

Question 17

What is o10 Tag for Slack?

Accepted Answer

Mention @o10 in Slack; answers are routed through o10 with receipts. Email path: ask@o10.io. Set up in the console under Settings → Integrations → Tag. Tag is available to connect or pilot — Marketplace listing may still lag; do not assume it is already on the Slack Marketplace.

Question 18

How does o10 guarantee quality?

Accepted Answer

o10 holds a per-use-case quality floor in the request path. Candidate models must clear holdout-scored evals (and Tune receipts when you optimize) before they can win traffic; failures fall back visibly rather than silently degrading. Shadow mode proves the bar on your traffic before enforce; every enforced call writes a receipt to the immutable ledger.

Question 19

What is the Quality-Cost Frontier Index?

Accepted Answer

A privacy-safe monthly dataset from o10: for each task type and quality bar, the cheapest model that empirically clears the bar and its cost per 1K calls, with observation_count and org_count. Cite as “o10 Quality-Cost Frontier Index, <month>”. Canonical page: https://www.o10.io/research/quality-cost-frontier-index. Until a live period publishes, the page shows a clearly labeled ILLUSTRATIVE methodology preview — never sample data presented as measured.

Question 20

What data does o10 collect for the index?

Accepted Answer

Metrics only for the public index: task type, quality bar, clearing model, cost per 1K calls, and k-anonymized observation/org counts. Prompt content is never collected for the index. Cells publish only when observation and org thresholds clear.

One base URL. Same OpenAI-compatible API.

Set the envelope. o10 holds it in the path.

DeepShell: an agent on your computer, on your budget.

Your AI spend is a surprise,not a plan.

Frontier pricing on non-frontier work

Spend is fragmented

The bill lands after the fact

Marginal cost of tokens = Marginal cost of productivity

One plane, every venue.

Free route key. One base URL.Same OpenAI-compatible API.

Four modes. One control plane.

o10/frontier

o10/squad

o10/pinned · Slack

Answer the four CFO questionsin under a minute. Then act on the answers.

Route across the venues you already have.

The make-vs-buy decision, modelled.

Control which model sees which data.

Policy-ready routing

Immutable audit trail

o10 enforces the spend.KYI governs the chain.

The same answer, 638× the price.

Start in shadow. Pay from what you save.

Governance fee

Share of verified savings

Finance sets the envelope.Engineering keeps the keys.

Owns the number.

Keeps the keys.

See what you're overpaying.

Popular comparisons

Models & pricing

Common questions