LLM Providers - OpenSRE Documentation

OpenSRE is provider-agnostic: bring your own model. Selection is controlled by the LLM_PROVIDER environment variable, with LLM_AUTH_METHOD selecting API-key or OAuth auth where both are supported. Defaults are tracked in config/config.py and routing lives in core/llm/llm_client.py.

Quick reference

Provider	`LLM_PROVIDER`	Auth	Reasoning model default	Toolcall model default
Anthropic API key	`anthropic` + `LLM_AUTH_METHOD=api_key`	`ANTHROPIC_API_KEY`	`claude-sonnet-4-6`	`claude-haiku-4-5-20251001`
Anthropic OAuth	`anthropic` + `LLM_AUTH_METHOD=oauth`	Onboarding launches `claude auth login`	Claude Code CLI default	Claude Code CLI default
OpenAI API key	`openai` + `LLM_AUTH_METHOD=api_key`	`OPENAI_API_KEY`	`gpt-5.4-mini`	`gpt-5.4-mini`
OpenAI OAuth	`openai` + `LLM_AUTH_METHOD=oauth`	OpenSRE opens `localhost:1455` for Codex-compatible OAuth	Codex CLI default	Codex CLI default
OpenRouter	`openrouter`	`OPENROUTER_API_KEY`	`openrouter/auto`	`openrouter/auto`
DeepSeek	`deepseek`	`DEEPSEEK_API_KEY`	`deepseek-v4-pro`	`deepseek-v4-flash`
Google Gemini API key	`gemini`	`GEMINI_API_KEY`	`gemini-3.1-pro-preview`	`gemini-3.1-flash-lite-preview`
Google Gemini CLI	`gemini-cli`	`gemini` interactive login or API key env	Gemini CLI default	Gemini CLI default
Google Antigravity CLI	`antigravity-cli`	`agy` browser OAuth / OS keyring	Antigravity CLI configured model	same as reasoning model
NVIDIA NIM	`nvidia`	`NVIDIA_API_KEY`	`meta/llama-3.1-405b-instruct`	`meta/llama-3.1-8b-instruct`
MiniMax	`minimax`	`MINIMAX_API_KEY`	`MiniMax-M3`	`MiniMax-M2.7-highspeed`
Amazon Bedrock	`bedrock`	AWS IAM (`AWS_REGION`)	`us.anthropic.claude-sonnet-4-6`	`us.anthropic.claude-haiku-4-5-20251001-v1:0`
Ollama (local)	`ollama`	None (local daemon)	`llama3.2`	`llama3.2`
GitHub Copilot CLI	`copilot`	`copilot login` or `gh auth login` (CLI)	Copilot CLI default	Copilot CLI default
Groq API key	`groq`	`GROQ_API_KEY`	`llama-3.3-70b-versatile`	`llama-3.1-8b-instant`
xAI Grok Build CLI	`grok-cli`	`grok login` (CLI)	Grok Build CLI default	Grok Build CLI default
Pi CLI (BYOK)	`pi`	provider API key env or `pi` → `/login`	Pi configured model (`PI_MODEL` to override)	same as reasoning model

OpenSRE distinguishes two model slots per provider:

Reasoning model — full-capability model used for diagnosis, claim validation, and multi-step analysis.
Toolcall model — lightweight, lower-cost model used for tool selection and routing.

Selecting a provider

Set LLM_PROVIDER (default: anthropic) in your environment or .env file:

export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...

Or run the onboarding wizard, which writes the same values to .env:

opensre onboard

When a provider has more than one supported auth route, onboarding asks for the provider first, then the auth method. For example, choose Anthropic and then OAuth to use a Claude subscription through the onboarding flow, or choose API key to paste ANTHROPIC_API_KEY. OpenSRE keeps the provider as anthropic or openai; LLM_AUTH_METHOD=oauth selects the OAuth-backed runtime.

OAuth browser login, token storage, refresh, and logout are delegated to the vendor CLI that owns that account session. OpenSRE owns the onboarding UX and does not persist OAuth tokens directly.

In the interactive shell, /model shows curated quick-pick choices for common models. Providers with fast-changing or account-gated catalogs (OpenAI, OpenRouter, Gemini, NVIDIA, Bedrock, local CLIs, Ollama, and DeepSeek) also accept custom model IDs:

/model set openai gpt-5.5
/model set openai gpt-5.5 --toolcall-model gpt-5.4-mini

Override the default model for a slot via env vars:

export OPENAI_REASONING_MODEL=gpt-5.4-mini
export OPENAI_TOOLCALL_MODEL=gpt-5.4-mini

A shared LLM_MAX_TOKENS (default 4096) controls the response token budget for every provider.

Login and secret storage

Use opensre auth for provider login without writing secrets to .env:

Command	What it does
`opensre auth`	Show auth status for subscription and API-key providers
`opensre auth login deepseek`	Open DeepSeek setup guidance, validate `DEEPSEEK_API_KEY`, store it in the system keychain, and select DeepSeek
`opensre auth login claude`	Configure the `claude-code` provider through Claude Code CLI subscription login
`opensre auth login chatgpt`	Configure the `codex` provider through OpenSRE-managed ChatGPT OAuth
`opensre auth verify deepseek`	Intentionally resolve DeepSeek credentials and refresh stale local metadata
`opensre auth logout deepseek`	Remove OpenSRE-managed DeepSeek credentials and metadata

opensre auth login never reads browser cookies, browser profiles, browser local storage, or IndexedDB. API-key providers use hidden paste prompts plus keyring storage. OpenAI OAuth is handled by OpenSRE’s local Codex-compatible callback server; other subscription providers delegate OAuth/session handling to the vendor CLI that owns the browser login flow.

opensre auth and /auth status are prompt-safe: they do not read API-key secrets from Keychain. For API-key providers they inspect environment variables plus non-secret metadata in ~/.opensre/llm-auth.json. If a key was deleted directly from Keychain, status may show the old metadata until you run opensre auth verify <provider> or start a request; that verification marks the provider stale when the secret is gone.

For Codex CLI auth, status checks do not run codex login status by default, because some Codex versions can open browser OAuth while checking a session. Run /login chatgpt or opensre auth login chatgpt from an interactive terminal when you need to refresh the browser login. OpenSRE starts its own temporary callback server on http://localhost:1455/auth/callback, exchanges the short-lived OAuth code, and writes Codex-compatible tokens to the local Codex auth store before redirecting the browser to the Codex-style /success?id_token=... completion page. If a browser flow reaches /success with token material directly, OpenSRE stores that token material instead of dropping the callback. Use codex login only as a direct CLI fallback.

Inside the interactive shell, use the same flows through /auth or /login:

/login chatgpt
/login claude
/login deepseek
/auth status

API providers

Anthropic

export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# Optional overrides:
export ANTHROPIC_REASONING_MODEL=claude-sonnet-4-6
export ANTHROPIC_TOOLCALL_MODEL=claude-haiku-4-5-20251001

The default. Uses the Anthropic Python SDK directly. Get an API key at console.anthropic.com.

OpenAI

export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
# Optional overrides:
export OPENAI_REASONING_MODEL=gpt-5.4-mini
export OPENAI_TOOLCALL_MODEL=gpt-5.4-mini

Uses the OpenAI SDK. Reasoning models (o1, o3, o4, gpt-5*) automatically use max_completion_tokens instead of max_tokens.
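The parameter switch described above can be sketched as follows. This is a minimal illustration, not OpenSRE's code: the function name and the prefix-matching rule are assumptions, and the real logic lives in core/llm/llm_client.py.

```python
# Model-name prefixes treated as reasoning models, taken from the list above.
# Matching by prefix is an assumption made for this sketch.
REASONING_PREFIXES = ("o1", "o3", "o4", "gpt-5")


def token_limit_kwargs(model: str, max_tokens: int = 4096) -> dict[str, int]:
    """Return the token-budget keyword argument appropriate for `model`."""
    if model.lower().startswith(REASONING_PREFIXES):
        # Reasoning models take max_completion_tokens instead of max_tokens.
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}
```

The returned dict can be splatted into a chat-completion call, so the caller does not need to know which family the configured model belongs to.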

OpenRouter

export LLM_PROVIDER=openrouter
export OPENROUTER_API_KEY=sk-or-...
# Optional override (single value applies to both slots if set):
export OPENROUTER_MODEL=openrouter/auto
# Or per-slot:
export OPENROUTER_REASONING_MODEL=anthropic/claude-sonnet-4-6
export OPENROUTER_TOOLCALL_MODEL=openai/gpt-4o-mini

OpenAI-compatible proxy — pick any model on openrouter.ai/models. Base URL: https://openrouter.ai/api/v1.
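To make the interaction between the single-value and per-slot overrides concrete, here is a small sketch of one plausible resolution order. The function name is hypothetical, and the precedence (per-slot variable, then shared variable, then built-in default) is an assumption; check config/config.py for the actual behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_model(
    provider: str,
    slot: str,
    default: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the model for a slot ("reasoning" or "toolcall").

    Assumed precedence: <PROVIDER>_<SLOT>_MODEL, then <PROVIDER>_MODEL,
    then the built-in default.
    """
    env = os.environ if env is None else env
    prefix = provider.upper()
    per_slot = env.get(f"{prefix}_{slot.upper()}_MODEL", "").strip()
    shared = env.get(f"{prefix}_MODEL", "").strip()
    return per_slot or shared or default
```

With OPENROUTER_MODEL=openrouter/auto and OPENROUTER_TOOLCALL_MODEL=openai/gpt-4o-mini set, this sketch returns openai/gpt-4o-mini for the toolcall slot and openrouter/auto for the reasoning slot.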

DeepSeek

export LLM_PROVIDER=deepseek
export DEEPSEEK_API_KEY=sk-...
# Optional override (single value applies to both slots if set):
export DEEPSEEK_MODEL=deepseek-v4-pro
# Or per-slot:
export DEEPSEEK_REASONING_MODEL=deepseek-v4-pro
export DEEPSEEK_TOOLCALL_MODEL=deepseek-v4-flash

Uses DeepSeek’s official OpenAI-compatible API endpoint at https://api.deepseek.com. Run opensre auth login deepseek for browser-assisted key setup and secure local storage.

Google Gemini

export LLM_PROVIDER=gemini
export GEMINI_API_KEY=...
# Optional override:
export GEMINI_MODEL=gemini-3.1-pro-preview
# Or per-slot:
export GEMINI_REASONING_MODEL=gemini-3.1-pro-preview
export GEMINI_TOOLCALL_MODEL=gemini-3.1-flash-lite-preview

Uses Google’s OpenAI-compatible endpoint at https://generativelanguage.googleapis.com/v1beta/openai/. Get an API key at aistudio.google.com.

NVIDIA NIM

export LLM_PROVIDER=nvidia
export NVIDIA_API_KEY=nvapi-...
# Optional override:
export NVIDIA_MODEL=meta/llama-3.1-405b-instruct
# Or per-slot:
export NVIDIA_REASONING_MODEL=meta/llama-3.1-405b-instruct
export NVIDIA_TOOLCALL_MODEL=meta/llama-3.1-8b-instruct

Uses NVIDIA’s OpenAI-compatible API at https://integrate.api.nvidia.com/v1. Browse available models on build.nvidia.com.

MiniMax

export LLM_PROVIDER=minimax
export MINIMAX_API_KEY=...
# Optional override (single value applies to both slots if set):
export MINIMAX_MODEL=MiniMax-M3
# Or per-slot:
export MINIMAX_REASONING_MODEL=MiniMax-M3
export MINIMAX_TOOLCALL_MODEL=MiniMax-M2.7-highspeed

OpenAI-compatible endpoint at https://api.minimax.io/v1. Temperature is fixed to 1.0 to match MiniMax recommendations.

Amazon Bedrock

export LLM_PROVIDER=bedrock
export AWS_REGION=us-east-1
# Optional overrides:
export BEDROCK_REASONING_MODEL=us.anthropic.claude-sonnet-4-6
export BEDROCK_TOOLCALL_MODEL=us.anthropic.claude-haiku-4-5-20251001-v1:0

No API key — auth uses the AWS credential chain (environment variables, shared credentials file, or IAM role). Your principal needs permission to invoke the model IDs you configure (for example Bedrock InvokeModel / Converse access scoped to those resources in IAM). Model routing:

Anthropic Claude models on Bedrock (anthropic.claude-*, us.anthropic.claude-*, and foundation-model ARNs that contain anthropic.claude) use the existing AnthropicBedrock SDK path.
Other Bedrock foundation models (for example Mistral, Meta Llama, Amazon Titan IDs you enable in your account) use the Bedrock Converse API via boto3, so you can set BEDROCK_REASONING_MODEL to a non-Claude model ID when your use case requires it.
Application inference profile ARNs (…:application-inference-profile/…) do not encode the vendor in the ID; those are always sent through Converse, which works for any backing model in the profile.

Defaults in config/config.py are US cross-region inference profile IDs for Anthropic Claude; override with IDs or ARNs that are inference-access enabled in your account and region.
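The routing rules above can be summarised in a few lines. This is a sketch under stated assumptions: the function name and return labels are hypothetical, and the string checks are one reading of the rules rather than the checks OpenSRE performs.

```python
def bedrock_route(model_id: str) -> str:
    """Return "anthropic_bedrock" or "converse" for a Bedrock model ID or ARN."""
    # Application inference profiles do not encode the vendor, so they
    # always go through Converse.
    if ":application-inference-profile/" in model_id:
        return "converse"
    # Bare Claude IDs and US cross-region inference profile IDs.
    if model_id.startswith(("anthropic.claude-", "us.anthropic.claude-")):
        return "anthropic_bedrock"
    # Foundation-model ARNs that name a Claude model.
    if ":foundation-model/" in model_id and "anthropic.claude" in model_id:
        return "anthropic_bedrock"
    # Everything else (Mistral, Meta Llama, Amazon Titan, ...) uses Converse.
    return "converse"
```

Checking the application-inference-profile case first keeps the "always Converse" rule from being shadowed by a later substring match.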

Ollama (local)

export LLM_PROVIDER=ollama
# Optional overrides:
export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODEL=llama3.2

Run any local model exposed by an Ollama daemon. No API key required — OpenSRE talks to Ollama’s OpenAI-compatible endpoint at ${OLLAMA_HOST}/v1.
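Several of the API providers above share the OpenAI-compatible wire format and differ mainly in base URL. The lookup below collects the URLs quoted in this section; the function and table names are hypothetical, and the Ollama default host is assumed from the override example above.

```python
import os
from typing import Mapping, Optional

# Base URLs as documented in the provider sections above.
OPENAI_COMPATIBLE_BASE_URLS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "deepseek": "https://api.deepseek.com",
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "nvidia": "https://integrate.api.nvidia.com/v1",
    "minimax": "https://api.minimax.io/v1",
}


def base_url_for(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI-compatible base URL for a provider."""
    env = os.environ if env is None else env
    if provider == "ollama":
        # Ollama's endpoint is derived from OLLAMA_HOST (default assumed here).
        host = env.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/")
        return f"{host}/v1"
    return OPENAI_COMPATIBLE_BASE_URLS[provider]
```

A provider that is not in the table raises KeyError, which is deliberate in this sketch: Anthropic and Bedrock use their own SDK paths and have no entry here.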

CLI providers (subprocess)

CLI-backed providers shell out to a vendor CLI instead of an HTTP API during inference. OpenSRE detects the binary on PATH (or via an explicit env var) and reuses the existing session. OpenAI OAuth is stored by OpenSRE in Codex-compatible auth format; other CLI-backed providers authenticate via the vendor’s own login command.

Investigation timeouts: Each ReAct turn runs one full CLI subprocess with the system prompt, tool schemas, and conversation history. The shared default subprocess budget is 300 seconds (Python adds a small buffer). Override per provider when needed, for example GEMINI_CLI_TIMEOUT_SECONDS, CLAUDE_CODE_TIMEOUT_SECONDS, or ANTIGRAVITY_CLI_TIMEOUT_SECONDS (clamped 30–600 where the adapter supports it).
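As an illustration of the clamping behaviour, the sketch below parses a timeout override and keeps it within 30 to 600 seconds. The function name is hypothetical, and falling back to the default on an empty or non-numeric value is an assumption, not documented behaviour.

```python
from typing import Optional


def clamp_timeout(raw: Optional[str], default: int = 300,
                  low: int = 30, high: int = 600) -> int:
    """Parse a *_TIMEOUT_SECONDS value and clamp it to [low, high]."""
    try:
        value = int(raw) if raw not in (None, "") else default
    except ValueError:
        # Assumption: unparseable values fall back to the shared default.
        value = default
    return max(low, min(high, value))
```

For example, a setting of 900 would be reduced to 600 and a setting of 5 raised to 30, while an unset variable yields the shared 300-second budget.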

OpenAI OAuth backend

export LLM_PROVIDER=openai
export LLM_AUTH_METHOD=oauth
# Authenticate through onboarding or `/login chatgpt`:
opensre auth login chatgpt
# Optional overrides (all blank-by-default):
export CODEX_MODEL=
export CODEX_BIN=

Requires the OpenAI Codex CLI. If CODEX_MODEL is unset, OpenSRE omits -m so codex exec uses the CLI’s currently configured model. If CODEX_BIN is unset, the binary is resolved via PATH and known install locations. Run opensre onboard, /login chatgpt, or opensre auth login chatgpt to launch OpenSRE-managed Codex browser login on localhost:1455 and persist LLM_PROVIDER=openai with LLM_AUTH_METHOD=oauth. Existing LLM_PROVIDER=codex configs still work for backward compatibility.

Anthropic OAuth backend

export LLM_PROVIDER=anthropic
export LLM_AUTH_METHOD=oauth
# Authenticate through onboarding, `/login claude`, or the Claude Code CLI directly:
claude auth login
# Optional overrides (all blank-by-default):
export CLAUDE_CODE_MODEL=
export CLAUDE_CODE_BIN=

Requires the Claude Code CLI (npm i -g @anthropic-ai/claude-code). If CLAUDE_CODE_MODEL is unset, OpenSRE omits the --model flag and the CLI uses its configured default. If CLAUDE_CODE_BIN is unset, the binary is resolved via PATH and known install locations. Run opensre onboard, /login claude, or opensre auth login claude to launch Claude browser login when needed and persist LLM_PROVIDER=anthropic with LLM_AUTH_METHOD=oauth. Existing LLM_PROVIDER=claude-code configs still work for backward compatibility.
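One way to picture how the legacy provider names relate to the current provider-plus-auth-method pair is the sketch below. It is illustrative only: the function and table names are hypothetical, and both the alias mapping and the api_key default for an unset LLM_AUTH_METHOD are assumptions about how backward compatibility could be implemented.

```python
from typing import Mapping, Tuple

# Assumed mapping from legacy LLM_PROVIDER values to (provider, auth method).
LEGACY_ALIASES = {
    "codex": ("openai", "oauth"),
    "claude-code": ("anthropic", "oauth"),
}


def resolve_provider(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (provider, auth_method) from LLM_PROVIDER and LLM_AUTH_METHOD."""
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if provider in LEGACY_ALIASES:
        return LEGACY_ALIASES[provider]
    # Assumption: API-key auth is used when LLM_AUTH_METHOD is unset.
    auth_method = env.get("LLM_AUTH_METHOD", "api_key").strip().lower()
    return provider, auth_method
```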

GitHub Copilot

export LLM_PROVIDER=copilot
# Authenticate the Copilot CLI separately. Either flow works — the adapter
# detects both. The interactive `/login` slash command inside `copilot` writes
# to the platform credential store; `gh auth login` is an equivalent path that
# Copilot CLI delegates to automatically.
copilot login          # OAuth device flow; preferred CLI-first onboarding
# or:
gh auth login          # logs you into the gh CLI; Copilot will use that token
# Optional overrides (all blank-by-default):
export COPILOT_MODEL=
export COPILOT_BIN=
# Optional auth bypass for automation (only used when no CLI login is detected):
# export COPILOT_GITHUB_TOKEN=
# export GH_TOKEN=
# export GITHUB_TOKEN=

Requires the GitHub Copilot CLI (npm i -g @github/copilot). Login uses the interactive /login slash command or copilot login.

OpenSRE detects auth in this order:

(1) COPILOT_GITHUB_TOKEN / GH_TOKEN / GITHUB_TOKEN env.
(2) gh auth status when gh is on PATH (including ✓ Logged in to github.com account …, - Active account: true, or a supported - Token: prefix: gho_, github_pat_, ghu_ per Copilot docs, but not ghp_), with gh auth status --hostname … when COPILOT_GH_HOST or GH_HOST targets a non-github.com host.

It does not read plaintext $COPILOT_HOME/config.json (keychain-backed installs may omit it; mis-parsing arbitrary JSON risks false positives). If nothing matches, detection reports logged_in=None and the runner verifies at invoke time.

If COPILOT_MODEL is unset, OpenSRE omits --model. Invocations run as copilot -p PROMPT --no-color --no-ask-user --silent so they never block on user input.

BYOK / COPILOT_OFFLINE: GitHub auth may be unnecessary; a None probe can still be fine if Copilot is configured for offline or external providers only.
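The Copilot invocation described above can be sketched as an argument-list builder. The function name is hypothetical and the position of --model within the argument list is an assumption; only the flags themselves come from the text.

```python
from typing import List, Optional


def build_copilot_argv(prompt: str, model: Optional[str] = None,
                       binary: str = "copilot") -> List[str]:
    """Build a non-interactive Copilot CLI command line."""
    argv = [binary, "-p", prompt, "--no-color", "--no-ask-user", "--silent"]
    if model:
        # Only pass --model when COPILOT_MODEL is set; otherwise the CLI
        # uses its own default.
        argv += ["--model", model]
    return argv
```

Passing the result to subprocess.run as a list (rather than a shell string) avoids quoting problems when the prompt contains spaces or shell metacharacters.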

Google Antigravity CLI

export LLM_PROVIDER=antigravity-cli
# Authenticate the Antigravity CLI separately (browser OAuth on first run):
agy                       # interactive launch triggers Google Sign-In; token cached by OS keyring
# Stay current — 1.0.0 had OAuth hangs (fixed in 1.0.1):
agy update
# Optional overrides (all blank-by-default):
export ANTIGRAVITY_CLI_BIN=
export ANTIGRAVITY_CLI_TIMEOUT_SECONDS=300   # default 300; clamped 30–600; maps to `--print-timeout {N}s`
# Note: ANTIGRAVITY_CLI_MODEL is registered for forward-compat but currently no-op
# (agy v1.0.2 does not expose --model in headless `-p` mode). Each invocation uses
# whatever model is persisted in agy's local config; switch it interactively with
# `/models` inside the `agy` REPL. The wizard's model picker is a forward-compat
# catalog: once Google ships `--model` in headless, picking a value here will start
# being forwarded to agy via a one-line change in the adapter.

Antigravity CLI (agy) is Google’s successor to Gemini CLI. Install via curl -fsSL https://antigravity.google/cli/install.sh | bash, then run agy install to configure your shell PATH. The minimum tested version is 1.0.1; older builds log a warning via the probe and direct you to agy update.

Why two Google providers? Google’s transition announcement states that on 2026-06-18 Gemini CLI stops serving Pro/Ultra and free users. Paid Gemini Code Assist licences keep Gemini CLI indefinitely. OpenSRE keeps both gemini-cli (deprecated alias with a probe-time notice) and antigravity-cli so either group can run without surprises.

As a best-effort fallback, the probe treats explicit GEMINI_API_KEY / GOOGLE_API_KEY / GOOGLE_APPLICATION_CREDENTIALS env credentials as authenticated (mirroring the Gemini CLI adapter), so users migrating across the two CLIs can keep their existing env-var-based auth without re-running the browser flow.

Invocations run as agy -p PROMPT --print-timeout {N}s. The adapter never passes --continue / --conversation / --sandbox / --dangerously-skip-permissions, keeping every opensre call ephemeral.
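The minimum-version check mentioned above amounts to a tuple comparison. The sketch below is illustrative: the function names are hypothetical, it handles only plain dotted numeric versions such as 1.0.2, and how the probe actually obtains the version string from agy is not covered here.

```python
from typing import Tuple

MIN_TESTED_VERSION = (1, 0, 1)  # minimum tested agy version per the text above


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse "1.0.2" or "v1.0.2" into a tuple of integers."""
    parts = text.strip().lstrip("v").split(".")
    return tuple(int(part) for part in parts[:3])


def needs_update(version: str) -> bool:
    """True when the installed version is older than the minimum tested one."""
    return parse_version(version) < MIN_TESTED_VERSION
```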

xAI Grok Build CLI

export LLM_PROVIDER=grok-cli
# Authenticate the Grok Build CLI separately. Either path works:
grok login                 # OAuth sign-in with a SuperGrok / X Premium+ account
# ...or, for headless / CI runs, use an API key instead of a browser login:
export XAI_API_KEY=xai-...  # get one from the xAI console
# Optional overrides (all blank-by-default):
export GROK_CLI_MODEL=          # e.g. grok-build; unset → CLI configured default
export GROK_CLI_BIN=            # explicit path to the `grok` binary
export GROK_CLI_TIMEOUT_SECONDS=300   # default 300; clamped 30-600

Requires the xAI Grok Build CLI (binary: grok). Install with curl -fsSL https://x.ai/cli/install.sh | bash (macOS/Linux) or irm https://x.ai/cli/install.ps1 | iex (Windows). If GROK_CLI_MODEL is unset, OpenSRE omits -m and the CLI uses its configured default. The wizard populates the model list live from grok models at onboarding time so newly released models appear without an OpenSRE update.

Invocations run as grok -p PROMPT --output-format plain, so each opensre call is a single non-interactive turn. The adapter deliberately omits --always-approve: OpenSRE drives its own tools, so Grok is used purely as a text responder and never auto-executes shell commands or file edits.

Auth detection: auth is probed via grok models (~0.5 s, no LLM call), which prints “You are logged in” on success. XAI_API_KEY is treated as an authenticated fallback for headless / CI runs even when the probe result is unclear. XAI_API_KEY is forwarded only to the Grok subprocess (never via the shared CLI env allowlist), so it cannot leak into other CLI adapters.
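The credential-isolation idea (XAI_API_KEY reaches only the Grok subprocess) can be sketched as follows. Everything named here is hypothetical: the contents of the shared allowlist, the per-provider table, and the function name are assumptions chosen to show the pattern, not OpenSRE's actual values.

```python
from typing import Dict, Mapping

# Hypothetical shared allowlist; the real list is defined by OpenSRE.
SHARED_ENV_ALLOWLIST = ("PATH", "HOME", "LANG")

# Secrets forwarded only to the adapter that owns them.
PROVIDER_ONLY_ENV = {"grok-cli": ("XAI_API_KEY",)}


def subprocess_env(provider: str, parent_env: Mapping[str, str]) -> Dict[str, str]:
    """Build the environment for one CLI adapter's subprocess."""
    env = {key: parent_env[key] for key in SHARED_ENV_ALLOWLIST if key in parent_env}
    for key in PROVIDER_ONLY_ENV.get(provider, ()):
        if key in parent_env:
            env[key] = parent_env[key]
    return env
```

Because the secret is added per provider rather than through the shared list, a Copilot or Pi subprocess built with the same helper never sees XAI_API_KEY.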

Not to be confused with groq. The grok-cli provider is xAI’s Grok Build CLI. The separate groq provider is the Groq HTTP API (a different company); the two are unrelated.

Pi CLI

export LLM_PROVIDER=pi
# Authenticate Pi separately. Either path works — the adapter detects both:
pi                       # then run /login for an OAuth subscription or to store a key
# …or export a provider API key Pi understands (BYOK), e.g. for Gemini:
export GEMINI_API_KEY=...

export PI_MODEL=google/gemini-2.5-flash-lite  # provider/model; unset → Pi configured default
export PI_BIN=                                # explicit path to the `pi` binary (optional)

Requires the Pi CLI (npm i -g @earendil-works/pi-coding-agent). Pi is bring-your-own-key across ~30 providers, so PI_MODEL uses the provider/model form (for example google/gemini-2.5-flash-lite, anthropic/claude-haiku-4-5, openai/gpt-4o-mini); run pi --list-models for the full catalog. If PI_MODEL is unset, OpenSRE omits --model and Pi uses its configured default. If PI_BIN is unset, the binary is resolved via PATH and known install locations. Invocations run as pi -p PROMPT (non-interactive print mode), so each OpenSRE call is a single headless turn with no TTY.

Auth detection: Pi has no non-interactive auth-status command, so OpenSRE detects auth from state:

(1) a supported provider API key in the environment (GEMINI_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, …) → authenticated;
(2) otherwise, credentials stored in ~/.pi/agent/auth.json (written by pi’s /login, covering OAuth subscriptions and stored keys) → authenticated;
(3) neither → not authenticated.

Provider API keys are forwarded only to the Pi subprocess, never via the shared CLI env allowlist, so they cannot leak into other CLI adapters.

See integrations/llm_cli/AGENTS.md for the adapter pattern used to add new CLI providers.
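The three-step Pi auth detection can be sketched as a small state check. This is an illustration under assumptions: the function name is hypothetical, the key list contains only the three variables named in the text (the rest of the list is elided there), and treating a non-empty auth.json as "credentials stored" is a simplification.

```python
from pathlib import Path
from typing import Mapping, Optional

# Only the keys named in the text; the full supported list is elided there.
PI_PROVIDER_KEYS = ("GEMINI_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def pi_is_authenticated(env: Mapping[str, str],
                        auth_file: Optional[Path] = None) -> bool:
    """Detect Pi auth from state, without running the CLI."""
    # (1) A provider API key in the environment counts as authenticated.
    if any(env.get(key) for key in PI_PROVIDER_KEYS):
        return True
    # (2) Otherwise look for stored credentials written by pi's /login.
    path = auth_file if auth_file is not None else Path.home() / ".pi" / "agent" / "auth.json"
    try:
        # Assumption: a non-empty file means credentials are stored.
        return path.is_file() and path.stat().st_size > 0
    except OSError:
        # (3) Neither source is available.
        return False
```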

Reasoning effort (interactive shell)

In the TTY REPL (opensre with no subcommand), /effort stores a session preference for how strongly reasoning models should think before answering. It applies only when LLM_PROVIDER is openai (HTTP API) or codex (Codex CLI); other providers ignore the setting and the shell notes that.

Input	Sent to the model
`low`, `medium`, `high`, `xhigh`	same string
`max`	`xhigh`

Run /effort alone to show the current choice (or (default) when unset) and the usage line. /new starts a fresh session but keeps /effort (and trust mode), consistent with other session prefs.

Outside the REPL, set an optional default with the environment variable:

export OPENSRE_REASONING_EFFORT=high   # low | medium | high | xhigh

Session /effort overrides this for interactive runs. Implementation: config/llm_reasoning_effort.py.
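The mapping table and the session-over-environment precedence can be sketched together. The names below are hypothetical and this is not the code in config/llm_reasoning_effort.py; case-insensitive matching and returning None for unrecognised input are assumptions.

```python
from typing import Mapping, Optional

EFFORT_LEVELS = ("low", "medium", "high", "xhigh")
EFFORT_ALIASES = {"max": "xhigh"}  # /effort max is sent as xhigh


def normalize_effort(value: Optional[str]) -> Optional[str]:
    """Map /effort input to the string sent to the model, or None if invalid."""
    if not value:
        return None
    key = value.strip().lower()
    key = EFFORT_ALIASES.get(key, key)
    return key if key in EFFORT_LEVELS else None


def resolve_effort(session_choice: Optional[str],
                   env: Mapping[str, str]) -> Optional[str]:
    """Session /effort wins; otherwise fall back to OPENSRE_REASONING_EFFORT."""
    session = normalize_effort(session_choice)
    if session is not None:
        return session
    env_value = (env.get("OPENSRE_REASONING_EFFORT") or "").strip().lower()
    return env_value if env_value in EFFORT_LEVELS else None
```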

Provider diagnostics

OpenSRE does not silently switch LLM providers when the provider in LLM_PROVIDER is missing credentials. It keeps the configured provider selected and reports missing or stale auth status before starting LLM work.

opensre auth and /auth status show prompt-safe status from environment variables, provider metadata, CLI probes, or ambient/local config.
opensre auth verify <provider> intentionally checks request-time credentials and refreshes metadata.
opensre config llm and opensre doctor report the configured provider plus credential status without resolving secrets.

Provider errors are prefixed with the configured provider that served the request:

[LLM provider: openai]
Missing credential for LLM provider 'openai'. Set OPENAI_API_KEY or run `opensre auth login openai`.

If credentials are missing, set the provider’s API-key environment variable, run opensre auth login <provider>, or change LLM_PROVIDER to a provider you have configured.
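For readers wiring OpenSRE output into log parsers, the error shape shown above can be reproduced with a short sketch. The function names are hypothetical; only the message format is taken from the example.

```python
def format_provider_error(provider: str, message: str) -> str:
    """Prefix an error message with the provider that served the request."""
    return f"[LLM provider: {provider}]\n{message}"


def missing_credential_message(provider: str, env_var: str) -> str:
    """Build the missing-credential text in the format shown above."""
    return (
        f"Missing credential for LLM provider '{provider}'. "
        f"Set {env_var} or run `opensre auth login {provider}`."
    )
```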

Switching providers at runtime

OpenSRE caches LLM clients on first use. To switch providers within a single process (tests, benchmarks), call reset_llm_singletons() from core.llm.llm_client after updating the env vars; otherwise a fresh process picks up the new LLM_PROVIDER automatically.
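Why a reset is needed follows from how first-use caching behaves. The sketch below is a self-contained stand-in, not OpenSRE's implementation: get_client_config and reset_clients are hypothetical names that mimic a cached client factory and the role reset_llm_singletons() plays.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_client_config() -> dict:
    """Stand-in for building an LLM client once and reusing it."""
    return {"provider": os.environ.get("LLM_PROVIDER", "anthropic")}


def reset_clients() -> None:
    """Drop the cached value so the next call re-reads the environment."""
    get_client_config.cache_clear()


os.environ["LLM_PROVIDER"] = "openai"
reset_clients()
first = get_client_config()["provider"]   # built under openai

os.environ["LLM_PROVIDER"] = "ollama"
stale = get_client_config()["provider"]   # still the cached value

reset_clients()
fresh = get_client_config()["provider"]   # rebuilt after the reset
```

In a test suite the same idea applies: update the env vars first, then call reset_llm_singletons(), so the next client is built from the new settings.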

Where this lives in the code

Provider literals and defaults: config/config.py (LLMProvider, LLMSettings).
Runtime routing: core/llm/llm_client.py (_create_llm_client).
API-backed provider guide: core/llm/AGENTS.md.
CLI-backed provider guide: integrations/llm_cli/AGENTS.md.