Nexus

Open-Source AI Gateway

One endpoint.
Every model.
Total control.

Nexus is a drop-in replacement for the OpenAI, Anthropic, and Gemini APIs. It adds BYOK, virtual keys, guardrails, semantic caching, and observability without giving up your provider accounts.

Apache 2.0 · self-hostable · MIT-licensed UI · production-grade observability

from openai import OpenAI

# Same SDK, drop in Nexus as the base URL.
client = OpenAI(
    base_url="https://nexus.ffx.ai/v1",
    api_key="nxs_live_...",
)

resp = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello, Nexus."}],
)
# Print the assistant's reply text.
print(resp.choices[0].message.content)

Features

Everything you need between
your app and the LLM.

Built for teams shipping AI features in production — not demos.

BYOK, the right way

Users bring their own OpenAI / Anthropic / Gemini keys. Keys are encrypted at rest, never logged, and never leave your tenant.

Virtual keys, real isolation

Mint per-team or per-app virtual keys with their own budgets, RPM limits, allowed models, and audit trail.
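As a rough illustration of what per-key limits involve, here is a minimal sketch of a budget, rate-limit, and model check. The `VirtualKey` record and `check_request` helper are hypothetical names invented for this example. They are not the Nexus API, and the sliding 60-second window is just one possible way to count RPM.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualKey:
    """Hypothetical virtual-key record; field names are illustrative only."""
    name: str
    budget_usd: float
    rpm_limit: int
    allowed_models: frozenset
    spent_usd: float = 0.0
    recent: deque = field(default_factory=deque)  # timestamps of accepted requests


def check_request(key, model, est_cost_usd, now):
    """Return (allowed, reason) for one request at time `now` (in seconds)."""
    if model not in key.allowed_models:
        return False, "model_not_allowed"
    if key.spent_usd + est_cost_usd > key.budget_usd:
        return False, "budget_exceeded"
    # Sliding 60-second window: drop timestamps that are a minute old or more.
    while key.recent and now - key.recent[0] >= 60.0:
        key.recent.popleft()
    if len(key.recent) >= key.rpm_limit:
        return False, "rate_limited"
    # Accept the request and record its time and estimated cost.
    key.recent.append(now)
    key.spent_usd += est_cost_usd
    return True, "ok"
```

A caller would run `check_request` before forwarding each call and return the reason string to the client when a request is rejected.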

Quality-aware routing

Route by speed, cost, or rolling quality score. Add provider fallbacks that kick in before users see errors.
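The idea of ranking providers by a strategy and falling back on failure can be sketched as follows. The `Route` fields, the strategy names, and the `send` callback are assumptions made for this example, not the gateway's actual configuration or internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical per-provider stats used for ranking."""
    provider: str
    latency_ms: float
    cost_per_1k_tokens: float
    quality: float  # rolling score in [0, 1], higher is better


def rank_routes(routes, strategy):
    """Order candidate routes best-first for the chosen strategy."""
    if strategy == "speed":
        return sorted(routes, key=lambda r: r.latency_ms)
    if strategy == "cost":
        return sorted(routes, key=lambda r: r.cost_per_1k_tokens)
    if strategy == "quality":
        return sorted(routes, key=lambda r: r.quality, reverse=True)
    raise ValueError(f"unknown strategy: {strategy}")


def call_with_fallback(routes, strategy, send):
    """Try routes in ranked order; `send(route)` raises on provider failure."""
    errors = []
    for route in rank_routes(routes, strategy):
        try:
            return route.provider, send(route)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append((route.provider, exc))
    raise RuntimeError(f"all providers failed: {errors}")
```

The caller only sees an error if every ranked provider fails, which is the behavior the fallback claim describes.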

Guardrails in-line

PII redaction, JSON-schema enforcement, and self-correction. Block bad inputs/outputs before they cost a cent.
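To make the two checks concrete, here is a simplified sketch using only the Python standard library. The regular expressions cover just email addresses and US SSN-shaped numbers, and `check_json_output` is a much reduced stand-in for real JSON-schema validation. Both function names are hypothetical and do not describe how Nexus implements its guardrails.

```python
import json
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text):
    """Replace email addresses and US SSN-shaped numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return US_SSN.sub("[SSN]", text)


def check_json_output(raw, required):
    """Parse model output and verify required keys and their Python types.

    `required` maps key -> expected type. Returns (ok, parsed_or_error).
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return False, "expected a JSON object"
    for key, expected_type in required.items():
        if key not in data:
            return False, f"missing key: {key}"
        if not isinstance(data[key], expected_type):
            return False, f"wrong type for key: {key}"
    return True, data
```

A self-correction loop could feed the error string back to the model and retry until `check_json_output` succeeds or a retry limit is reached.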

Semantic cache

Embeddings-based cache cuts repeat traffic. 10× cheaper re-prompts, 0 code changes in your app.
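An embeddings-based cache looks up a stored response whose prompt embedding is close enough to the new prompt's embedding. The sketch below is a toy in-memory version with a linear scan. The `SemanticCache` class, the 0.95 threshold, and the `embed` callable are assumptions for illustration; a production cache would use a real embedding model and a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy in-memory cache keyed by embedding similarity (linear scan)."""

    def __init__(self, embed, threshold=0.95):
        self.embed = embed          # callable: str -> list of floats
        self.threshold = threshold  # minimum similarity to count as a hit
        self.entries = []           # list of (embedding, response) pairs

    def get(self, prompt):
        """Return the most similar cached response, or None on a miss."""
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store a response under the embedding of its prompt."""
        self.entries.append((self.embed(prompt), response))
```

On a hit the gateway can return the stored response without calling the provider, which is where the savings on repeated prompts come from.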

OpenTelemetry out of the box

Every request is traced with model, tokens, cost, latency, cache hit, and quality score. Export to any OTLP backend.

SSO via OIDC / SAML

Connect Okta, Azure AD, Google Workspace, or self-hosted Keycloak. JIT provisioning + per-org policy.

Self-host in 5 minutes

Single binary, Postgres, Redis, ClickHouse. Docker Compose for dev, Helm for prod. No SaaS lock-in.

1-line drop-in replacement: for the OpenAI / Anthropic / Gemini SDKs
10+ providers: OpenAI, Anthropic, Gemini, Mistral, Bedrock, Azure, ...
5 min self-host time: Docker Compose for dev, Helm for prod
0 data leaves your tenant: no third-party LLM proxy, no telemetry mining

How it works

From pip install to $1M/mo spend — same code.

  1. Self-host or use the cloud

    Run our Helm chart in your own Kubernetes cluster, or sign up for a managed tenant on nexus.ffx.ai.

  2. Plug in your provider keys

    Per-user BYOK or org-level platform credentials. Encrypted at rest with KMS-managed keys.

  3. Mint virtual keys for your apps

    Each app or team gets a virtual key with its own budget, allowed models, and rate limit.

  4. Watch the dashboard

    Traces, costs, eval scores, and quality, broken down by user, team, model, or virtual key.
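The per-dimension breakdown in the last step amounts to grouping trace records and totaling them. Here is a minimal sketch, assuming hypothetical trace records stored as plain dictionaries. The field names, the `summarize` helper, and the sample values are made up for illustration and do not reflect the actual trace schema.

```python
from collections import defaultdict

# Hypothetical trace records; values are invented for this example.
TRACES = [
    {"team": "search", "model": "gemini-2.5-flash", "tokens": 1000, "cost_usd": 0.5, "cache_hit": False},
    {"team": "search", "model": "gemini-2.5-flash", "tokens": 500, "cost_usd": 0.0, "cache_hit": True},
    {"team": "support", "model": "example-model", "tokens": 800, "cost_usd": 0.125, "cache_hit": False},
]


def summarize(traces, dimension):
    """Group trace records by `dimension` and total requests, tokens, and cost."""
    totals = defaultdict(
        lambda: {"requests": 0, "tokens": 0, "cost_usd": 0.0, "cache_hits": 0}
    )
    for trace in traces:
        bucket = totals[trace[dimension]]
        bucket["requests"] += 1
        bucket["tokens"] += trace["tokens"]
        bucket["cost_usd"] += trace["cost_usd"]
        bucket["cache_hits"] += int(trace["cache_hit"])
    return dict(totals)
```

Calling `summarize(TRACES, "team")` or `summarize(TRACES, "model")` gives the same totals sliced along a different dimension.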

FAQ

Questions, answered.

How is Nexus different from LiteLLM?

LiteLLM was built for the shared-key model (operator holds the LLM key). Nexus is built for BYOK and SSO first — your users bring their own provider keys, and your admin team gets org-level policy, audit, and quota on top. Both are open source; pick the model that fits your org.

Do my users' LLM keys leave my infrastructure?

No. Keys are encrypted at rest in Postgres using AES-GCM with a key managed by your deployment. We never see plaintext keys. Calls go from your Nexus pod directly to the LLM provider, with no extra proxy hop.

Which providers are supported?

OpenAI, Anthropic, Gemini, Mistral, Cohere, Groq, AWS Bedrock, Azure OpenAI, and any OpenAI-compatible endpoint (Ollama, vLLM, Together, Fireworks, OpenRouter, etc.).

Can I self-host?

Yes. Apache 2.0. Docker Compose for dev, Helm chart for prod. Postgres + Redis + ClickHouse are the only required deps. The whole stack runs on a single 2-CPU node for small teams.

How does pricing work?

Self-hosted Nexus is free and Apache 2.0. We charge for managed deployments and enterprise features (SSO/SAML, multi-region, audit export). See the pricing page for tiers.

What's on the roadmap?

Multi-tenant Keycloak realms, platform credentials (org-level LLM keys), MCP gateway, prompt playground, and a Vercel-style Edge runtime. See the GitHub issues for the full backlog.

Ready to ship?

The fastest way to put an LLM in production.

Self-host in 5 minutes, or sign up for a managed tenant.