AI Model Service Tier - Open Dev Kit Documentation

Open Dev Kit Documentation :: Open AI :: AI Model Service Tier

OpenAI.AIModelServiceTier

Auto (`auto`): Uses the OpenAI projectâ€™s configured service tier. On Scale Tierâ€“enabled projects this consumes Scale Tier capacity while available; otherwise it behaves like 'default'.
Default (`default`): Uses the standard processing tier (normal pricing and availability, with no special latency or uptime guarantees).
Flex (`flex`): Uses the Flex processing tier: lower per-token cost with higher and more variable latency; requests may queue or be rate-limited during busy periods.
Priority (`priority`): Uses the Priority processing tier: higher per-token cost for faster and more consistent latency, especially under load.

If you think anything is missing, please feel free to: submit documentation feedback on this page