How does the context window affect content moderation?
A context window is the maximum number of tokens a model accepts in one request (prompt plus completion). Larger windows allow richer prompts, but per-call cost rises linearly with the tokens used. For content moderation at 28.0B/mo, the context window is tied to a compliant routing opportunity of up to 91% at a lean quality floor.
The spread between the most and least expensive compliant routes can reach 638× for identical workloads at the same quality floor (o10 State of Inference Spend 2026).
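To make the relationship between window size and cost concrete, here is a minimal Python sketch. It assumes a flat per-token price applied to prompt and completion alike. The `Route` type, the route names, the window sizes, and the prices are all hypothetical placeholders for illustration and are not taken from the report or any real provider.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Route:
    """A hypothetical model route; every figure here is a placeholder."""
    name: str
    context_window: int            # max tokens per request (prompt + completion)
    usd_per_million_tokens: float  # assumed flat price for all tokens


def fits(route: Route, prompt_tokens: int, completion_tokens: int) -> bool:
    """True if prompt plus completion stays within the route's context window."""
    return prompt_tokens + completion_tokens <= route.context_window


def call_cost(route: Route, prompt_tokens: int, completion_tokens: int) -> float:
    """Per-call cost in USD, linear in the total tokens used."""
    if not fits(route, prompt_tokens, completion_tokens):
        raise ValueError(f"request exceeds the context window of {route.name}")
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens * route.usd_per_million_tokens / 1_000_000


def cheapest_fitting(
    routes: Sequence[Route], prompt_tokens: int, completion_tokens: int
) -> Optional[Route]:
    """Cheapest route whose window can hold the request, or None if none can."""
    candidates = [r for r in routes if fits(r, prompt_tokens, completion_tokens)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: call_cost(r, prompt_tokens, completion_tokens))


# Placeholder routes: a small, cheap window and a large, pricier one.
ROUTES = [
    Route("small-window", context_window=4_096, usd_per_million_tokens=0.5),
    Route("large-window", context_window=128_000, usd_per_million_tokens=4.0),
]

if __name__ == "__main__":
    # A short moderation request fits the cheap route.
    print(cheapest_fitting(ROUTES, prompt_tokens=1_000, completion_tokens=100))
    # A long one only fits the larger window, so it costs more per call.
    print(cheapest_fitting(ROUTES, prompt_tokens=6_000, completion_tokens=200))
```

In this sketch, a request is routed to the cheapest option whose window can hold it, so prompts that stay short keep the cheaper routes available. A real router would also need to check the quality floor and compliance constraints, which are not modeled here.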