## Why AI Gateway?
Instead of managing API keys per-model and per-environment, the AI Gateway provides a centralized, authenticated entry point for all LLM interactions. It handles authentication, agent ID validation, and policy enforcement—while maintaining full OpenAI SDK compatibility.
The gateway wraps the OpenAI SDK—any framework that works with OpenAI works with AI Gateway. No code changes required.
> **Prerequisite:** Ensure you've installed the SDK and configured your environment before proceeding.
## Quick Start

### 1. Set Up Environment Variables

Before using the gateway, configure your credentials:

```bash
# Required for AI Gateway
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
```

> **Warning:** Never commit API keys to version control. Add `.env` to your `.gitignore`.
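If you keep these values in a `.env` file, one common approach is to load them at startup. A minimal sketch, assuming the third-party `python-dotenv` package (not part of the SDK):

```python
# Hypothetical setup: load .env so AI_GATEWAY_API_KEY and
# AI_GATEWAY_ENDPOINT are set before AIGateway.create() falls back to them.
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
```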
### 2. Make Your First Call

```python
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```

That's it—you're making LLM calls through the Backbase AI Platform.
## Auto-Instrumentation with Observability

When you initialize observability, all gateway calls are automatically traced—no additional code required:

```python
from bb_ai_sdk.observability import init
from bb_ai_sdk.ai_gateway import AIGateway

# Initialize observability first
init(agent_name="my-agent")

# Gateway calls are now traced automatically
gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
# This call appears in LangFuse with full context: tokens, latency, cost
```

> **Note:** Initialize observability before creating the gateway to ensure all calls are captured. See Observability for full configuration options.
## Sync vs Async

Choose based on your application architecture:

Use `AIGateway` for synchronous applications (scripts, simple APIs):

```python
from bb_ai_sdk.ai_gateway import AIGateway

gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Use `AsyncAIGateway` for async applications (FastAPI, LangGraph):

```python
from bb_ai_sdk.ai_gateway import AsyncAIGateway

gateway = AsyncAIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000"
)

response = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
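Note that `await` is only valid inside a coroutine. In a standalone script, wrap the async calls in an entry point and hand it to `asyncio.run`; a minimal sketch (the agent ID is the same placeholder as above):

```python
import asyncio

from bb_ai_sdk.ai_gateway import AsyncAIGateway

async def main() -> None:
    gateway = AsyncAIGateway.create(
        model_id="gpt-4o",
        agent_id="550e8400-e29b-41d4-a716-446655440000"
    )
    response = await gateway.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())  # not needed in FastAPI, which runs its own event loop
```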
## Common Use Cases

### Streaming Responses

For real-time responses (chatbots, interactive UIs), enable streaming:

**Sync Streaming**

```python
stream = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

**Async Streaming**

```python
stream = await gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

async for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
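If you also need the complete text once streaming finishes (for example, to persist the conversation), collect the deltas as they arrive. A small sketch, extending the sync example above:

```python
# Stream to the console while also accumulating the full completion
stream = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")  # render incrementally
        parts.append(delta)   # keep for later use

full_text = "".join(parts)
```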
### Framework Adapters

The AI Gateway is OpenAI-compatible out of the box, but if you're using LangChain, LangGraph, or Agno, adapters convert the gateway into framework-native objects—no manual configuration required.

#### LangChain

```python
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain(gateway)  # Returns a ChatOpenAI-compatible model

# Use with LangChain components
from langchain.schema.output_parser import StrOutputParser

chain = model | StrOutputParser()
response = chain.invoke("Tell me a joke")
```
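Since `to_langchain` returns a standard chat model, it composes with other LangChain primitives as well. A sketch reusing `model` from above with a prompt template (imports follow the `langchain_core` paths used in recent releases):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])

# prompt -> gateway-backed model -> plain string
chain = prompt | model | StrOutputParser()
answer = chain.invoke({"question": "What is an AI gateway?"})
```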
#### LangGraph

```python
from bb_ai_sdk.ai_gateway import AsyncAIGateway
from bb_ai_sdk.ai_gateway.adapters.langchain import to_langchain_async

gateway = AsyncAIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_langchain_async(gateway)  # Returns an async-compatible model

# Use in LangGraph nodes
async def generate(state):
    response = await model.ainvoke(state["messages"])
    return {"messages": [response]}
```
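To see that node in context, here is a minimal graph around it, assuming the standard `langgraph` package and its prebuilt `MessagesState`:

```python
from langgraph.graph import StateGraph, MessagesState, START, END

builder = StateGraph(MessagesState)
builder.add_node("generate", generate)  # the async node defined above
builder.add_edge(START, "generate")
builder.add_edge("generate", END)
graph = builder.compile()

# Inside an async context:
# result = await graph.ainvoke({"messages": [("user", "Hello!")]})
```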
#### Agno

```python
from bb_ai_sdk.ai_gateway import AIGateway
from bb_ai_sdk.ai_gateway.adapters.agno import to_agno
from agno import Agent

gateway = AIGateway.create(model_id="gpt-4o", agent_id="...")
model = to_agno(gateway)  # Returns an Agno-compatible model

agent = Agent(
    name="Assistant",
    model=model,
    instructions="You are helpful."
)
response = agent.run("Hello!")
```
## Common Patterns

### Token Usage Tracking

Extract token consumption from responses:

```python
from bb_ai_sdk.ai_gateway import get_token_usage

response = gateway.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

usage = get_token_usage(response)
if usage:
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Total tokens: {usage.total_tokens}")
```
### Error Handling

Handle errors gracefully using the SDK's specific exception types for different failure scenarios:

```python
from bb_ai_sdk.ai_gateway import (
    AIGateway,
    InvalidAgentIdError,
    ConfigurationError,
    AuthenticationError,
    RateLimitError,
)

# Handle creation errors
try:
    gateway = AIGateway.create(
        model_id="gpt-4o",
        agent_id="invalid"
    )
except InvalidAgentIdError:
    print("Invalid agent ID format—must be UUID v4")
except ConfigurationError:
    print("Missing API key or gateway URL")

# Handle request errors
try:
    response = gateway.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded—implement backoff")
```
### Error Types Reference

| Error | HTTP Code | Description |
|---|---|---|
| `InvalidAgentIdError` | — | Agent ID not in UUID v4 format |
| `ConfigurationError` | — | Missing API key or invalid gateway URL |
| `AuthenticationError` | 401 | Invalid or expired API key |
| `AuthorizationError` | 403 | Insufficient permissions for this operation |
| `RateLimitError` | 429 | Rate limit exceeded |
| `ValidationError` | 400 | Invalid request parameters |
| `ModelNotFoundError` | 404 | Requested model not available |
| `ServiceError` | 500+ | Server-side error |
| `NetworkError` | — | Connection failed |
## Configuration

### Environment Variables

Configure credentials via environment variables (recommended):

```bash
# Required
AI_GATEWAY_API_KEY=your-api-key
AI_GATEWAY_ENDPOINT=your-ai-gateway-endpoint
```

> **Warning:** Never commit API keys to version control. Add `.env` to your `.gitignore`.
### Create Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_id` | string | — | Model identifier (e.g., `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`). |
| `agent_id` | string | — | Your agent's unique identifier in UUID v4 format. Obtained from the platform when you register your agent. |
| `api_key` | string | env var | API key for authentication. Falls back to the `AI_GATEWAY_API_KEY` environment variable if not provided. |
| `base_url` | string | env var | Gateway URL. Falls back to the `AI_GATEWAY_ENDPOINT` environment variable or the platform default. |
| `api_version` | string | `"2024-10-21"` | API version for the gateway. |
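Putting these together, credentials can also be passed explicitly instead of via environment variables. A sketch assuming the keyword names in the table above (the endpoint URL is a placeholder):

```python
gateway = AIGateway.create(
    model_id="gpt-4o",
    agent_id="550e8400-e29b-41d4-a716-446655440000",
    api_key="your-api-key",                       # overrides AI_GATEWAY_API_KEY
    base_url="https://your-gateway.example.com",  # overrides AI_GATEWAY_ENDPOINT
    api_version="2024-10-21",                     # the documented default
)
```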
## Advanced: Accessing the Underlying Client

Access the underlying OpenAI client or raw configuration for advanced use cases:

### get_client()

Get the underlying OpenAI client for direct SDK access:

```python
client = gateway.get_client()
# Returns an OpenAI (sync) or AsyncOpenAI (async) instance
```

### get_config()

Get the configuration dictionary for manual framework setup:

```python
config = gateway.get_config()
# Returns:
# {
#     "api_key": "...",
#     "base_url": "...",
#     "default_headers": {"x-agent-id": "...", "api-key": "..."},
#     "default_query": {"api-version": "2024-10-21"},
#     "model": "gpt-4o"
# }
```
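As an example of manual setup, that dictionary can seed a raw client from the official `openai` package, which accepts the same keyword arguments; a sketch assuming the keys shown above:

```python
from openai import OpenAI

config = gateway.get_config()
model = config.pop("model")  # OpenAI() doesn't accept a "model" kwarg

client = OpenAI(**config)  # api_key, base_url, default_headers, default_query
response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Hello!"}]
)
```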
## API Reference

### AIGateway

| Property/Method | Returns | Description |
|---|---|---|
| `chat` | Chat interface | OpenAI-compatible chat completions |
| `model_id` | `str` | Configured model ID |
| `agent_id` | `str` | Validated agent ID |
| `get_client()` | `OpenAI` | Underlying OpenAI client |
| `get_config()` | `dict` | Configuration dictionary |
| `create()` | `AIGateway` | Factory method (class method) |

### AsyncAIGateway

Same interface as `AIGateway`, but it returns an `AsyncOpenAI` client and supports async operations.