## Why AI Gateway?
Instead of managing API keys per-model and per-environment, the AI Gateway provides a centralized, authenticated entry point for all LLM interactions. It handles authentication, agent ID validation, and policy enforcement—while maintaining full OpenAI SDK compatibility.
The gateway wraps the OpenAI SDK—any framework that works with OpenAI works with AI Gateway. No code changes required.
> **Prerequisite:** Ensure you've installed the SDK and configured your environment before proceeding.
## Quick Start

### 1. Set Up Environment Variables

Before using the gateway, configure your credentials:

```bash
# Required for AI Gateway
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
```

> **Warning:** Never commit API keys to version control. Add `.env` to your `.gitignore`.
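If you keep these values in a `.env` file, one common approach is to load them at startup. A minimal sketch, assuming the third-party `python-dotenv` package (not part of the SDK):

```python
# Hypothetical setup: load .env so AI_GATEWAY_API_KEY and
# AI_GATEWAY_ENDPOINT are set before AIGateway.create() falls back to them.
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
```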
### 2. Make Your First Call

```python
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```

That's it—you're making LLM calls through the Backbase AI Platform.
## Auto-Instrumentation with Observability

When you initialize observability, all gateway calls are automatically traced—no additional code required:

```python
from bb_ai_sdk.observability import init
from bb_ai_sdk.ai_gateway import AIGateway

# Initialize observability first
init(agent_name="my-agent")

# Gateway calls are now traced automatically
gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
# This call appears in LangFuse with full context: tokens, latency, cost
```

> **Note:** Initialize observability before creating the gateway to ensure all calls are captured. See Observability for full configuration options.
## Sync vs Async

Choose based on your application architecture:

Use `AIGateway` for synchronous applications (scripts, simple APIs):

```python
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Use `AsyncAIGateway` for async applications (FastAPI, LangGraph):

```python
from bb_ai_sdk.ai_gateway import AsyncAIGateway

gateway = AsyncAIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
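Note that `await` is only valid inside a coroutine. In a standalone script, wrap the async calls in an entry point and hand it to `asyncio.run`; a minimal sketch (the agent ID is the same placeholder as above):

```python
import asyncio

from bb_ai_sdk.ai_gateway import AsyncAIGateway

async def main() -> None:
    gateway = AsyncAIGateway.create(
        model_id="gpt-4o",
        agent_id="550e8400-e29b-41d4-a716-446655440000"
    )
    response = await gateway.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())  # not needed in FastAPI, which runs its own event loop
```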
## Common Use Cases

### Streaming Responses

For real-time responses (chatbots, interactive UIs), enable streaming:

**Sync Streaming**

```python
stream = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

**Async Streaming**

```python
stream = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

async for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
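If you also need the complete text once streaming finishes (for example, to persist the conversation), collect the deltas as they arrive. A small sketch, extending the sync example above:

```python
# Stream to the console while also accumulating the full completion
stream = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")  # render incrementally
        parts.append(delta)   # keep for later use

full_text = "".join(parts)
```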
### Framework Adapters

The AI Gateway is OpenAI-compatible out of the box, but if you're using LangChain, LangGraph, or Agno, adapters convert the gateway into framework-native objects—no manual configuration required.

#### LangChain

```python
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain(gateway)  # Returns a ChatOpenAI-compatible model

# Use with LangChain components
from langchain.schema.output_parser import StrOutputParser

chain = model | StrOutputParser()
response = chain.invoke("Tell me a joke")
```
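Since `to_langchain` returns a standard chat model, it composes with other LangChain primitives as well. A sketch reusing `model` from above with a prompt template (imports follow the `langchain_core` paths used in recent releases):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])

# prompt -> gateway-backed model -> plain string
chain = prompt | model | StrOutputParser()
answer = chain.invoke({"question": "What is an AI gateway?"})
```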
#### LangGraph

```python
from bb_ai_sdk.ai_gateway import AsyncAIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain_async

gateway = AsyncAIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain_async(gateway)  # Returns an async-compatible model

# Use in LangGraph nodes
async def generate(state):
    response = await model.ainvoke(state["messages"])
    return {"messages": [response]}
```
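To see that node in context, here is a minimal graph around it, assuming the standard `langgraph` package and its prebuilt `MessagesState`:

```python
from langgraph.graph import StateGraph, MessagesState, START, END

builder = StateGraph(MessagesState)
builder.add_node("generate", generate)  # the async node defined above
builder.add_edge(START, "generate")
builder.add_edge("generate", END)
graph = builder.compile()

# Inside an async context:
# result = await graph.ainvoke({"messages": [("user", "Hello!")]})
```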
#### Agno

```python
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.agno import to_agno
from agno import Agent

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_agno(gateway)  # Returns an Agno-compatible model

agent = Agent(
    name="Assistant",
    model=model,
    instructions="You are helpful."
)
response = agent.run("Hello!")
```
## Common Patterns

### Token Usage Tracking

Extract token consumption from responses:

```python
from bb_ai_sdk.ai_gateway import get_token_usage

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

usage = get_token_usage(response)
if usage:
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Total tokens: {usage.total_tokens}")
```
### Error Handling

Handle errors gracefully using the SDK's specific exception types for different failure scenarios:

```python
from bb_ai_sdk.ai_gateway import (
    AIGateway,
    InvalidAgentIdError,
    ConfigurationError,
    AuthenticationError,
    RateLimitError,
)

# Handle creation errors
try:
    gateway = AIGateway.create(
        model_id="gpt-4o",
        agent_id="invalid"
    )
except InvalidAgentIdError:
    print("Invalid agent ID format—must be UUID v4")
except ConfigurationError:
    print("Missing API key or gateway URL")

# Handle request errors
try:
    response = gateway.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded—implement backoff")
```
### Error Types Reference

| Error | HTTP Code | Description |
|---|---|---|
| `InvalidAgentIdError` | — | Agent ID not in UUID v4 format |
| `ConfigurationError` | — | Missing API key or invalid gateway URL |
| `AuthenticationError` | 401 | Invalid or expired API key |
| `AuthorizationError` | 403 | Insufficient permissions for this operation |
| `RateLimitError` | 429 | Rate limit exceeded |
| `ValidationError` | 400 | Invalid request parameters |
| `ModelNotFoundError` | 404 | Requested model not available |
| `ServiceError` | 500+ | Server-side error |
| `NetworkError` | — | Connection failed |
## Configuration

### Environment Variables

Configure credentials via environment variables (recommended):

```bash
# Required
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
```

> **Warning:** Never commit API keys to version control. Add `.env` to your `.gitignore`.
### Create Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_id` | string | — | Model identifier (e.g., `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`). |
| `agent_id` | string | — | Your agent's unique identifier in UUID v4 format. Obtained from the platform when you register your agent. |
| `api_key` | string | env var | API key for authentication. Falls back to the `AI_GATEWAY_API_KEY` environment variable if not provided. |
| `base_url` | string | env var | Gateway URL. Falls back to the `AI_GATEWAY_ENDPOINT` environment variable or the platform default. |
| `api_version` | string | `"2024-10-21"` | API version for the gateway. |
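Putting these together, credentials can also be passed explicitly instead of via environment variables. A sketch assuming the keyword names in the table above (the endpoint URL is a placeholder):

```python
gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000",
    api_key="your-api-key",                       # overrides AI_GATEWAY_API_KEY
    base_url="https://your-gateway.example.com",  # overrides AI_GATEWAY_ENDPOINT
    api_version="2024-10-21",                     # the documented default
)
```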
## Advanced: Accessing the Underlying Client

Access the underlying OpenAI client or raw configuration for advanced use cases:

### get_client()

Get the underlying OpenAI client for direct SDK access:

```python
client = gateway.get_client()
# Returns an OpenAI (sync) or AsyncOpenAI (async) instance
```

### get_config()

Get the configuration dictionary for manual framework setup:

```python
config = gateway.get_config()
# Returns:
# {
#     "api_key": "...",
#     "base_url": "...",
#     "default_headers": {"x-agent-id": "...", "api-key": "..."},
#     "default_query": {"api-version": "2024-10-21"},
#     "model": "gpt-4o"
# }
```
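As an example of manual setup, that dictionary can seed a raw client from the official `openai` package, which accepts the same keyword arguments; a sketch assuming the keys shown above:

```python
from openai import OpenAI

config = gateway.get_config()
model = config.pop("model")  # OpenAI() doesn't accept a "model" kwarg

client = OpenAI(**config)  # api_key, base_url, default_headers, default_query
response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Hello!"}]
)
```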
## API Reference

### AIGateway

| Property/Method | Returns | Description |
|---|---|---|
| `chat` | Chat interface | OpenAI-compatible chat completions |
| `model_id` | `str` | Configured model ID |
| `agent_id` | `str` | Validated agent ID |
| `get_client()` | `OpenAI` | Underlying OpenAI client |
| `get_config()` | `dict` | Configuration dictionary |
| `create()` | `AIGateway` | Factory method (class method) |

### AsyncAIGateway

Same interface as `AIGateway`, but it returns an `AsyncOpenAI` client and supports async operations.