Architecture overview

Platform layers
The architecture diagram shows five foundational layers that work together:
1. foundation (infrastructure)
- Kubernetes: Container orchestration across any cloud provider
- Service Mesh: Istio for traffic management, security, and observability
- Container Runtime: Docker for containerization
- Packaging: Helm charts for deployment configuration
2. data
- Vector Storage: PostgreSQL with pgvector for embeddings and RAG
- Caching: Redis for session state and fast lookups
- Object Storage: Cloud-native storage for artifacts and models
- Message Bus: Azure Service Bus for async workflows
- Registries: Container registries (ACR/ECR/GCR) and package registries (GitHub Packages)
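As a rough illustration of what the vector storage bullet implies for RAG, the sketch below ranks stored embeddings by cosine distance, the same measure pgvector exposes through its `<=>` operator. It is plain Python, not the platform's code: the document names, the three-dimensional vectors, and the `top_k` helper are all assumptions made up for the example.

```python
import math
from typing import List, Sequence, Tuple

def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query: Sequence[float],
          rows: List[Tuple[str, List[float]]],
          k: int = 2) -> List[Tuple[str, List[float]]]:
    """Return the k rows whose embeddings are closest to the query."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]

# Hypothetical (name, embedding) rows standing in for a pgvector-backed table.
DOCS = [
    ("wire-transfer-limits", [0.9, 0.1, 0.0]),
    ("mortgage-rates", [0.1, 0.9, 0.1]),
    ("card-fraud-faq", [0.0, 0.2, 0.9]),
]

nearest = top_k([1.0, 0.0, 0.0], DOCS, k=1)
```

In a PostgreSQL deployment the same ranking would typically be pushed into the database with an `ORDER BY embedding <=> :query LIMIT k` clause, so only the top matches leave the data layer.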
3. AI gateway
- Provider Abstraction: Unified interface to Azure AI Foundry, OpenAI, Gemini, Anthropic, or bring-your-own (BYO) models
- Policy Enforcement: Guardrails, rate limiting, cost management, and content safety
- Traffic Management: Semantic caching, load balancing, and intelligent request routing
- PII Protection: Automatic detection and sanitization
- SDK Access: BB AI SDK for standardized integration
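The gateway responsibilities listed above (one interface over several providers, rate limiting, PII sanitization) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the BB AI SDK or the actual gateway: `Gateway`, `redact_pii`, `echo_provider`, and the email-only regular expression are hypothetical, and a real deployment would use a far more thorough PII detector.

```python
import re
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative pattern only: matches simple email addresses, nothing else.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(text: str) -> str:
    """Replace email addresses with a placeholder before the prompt leaves the gateway."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

class RateLimitExceeded(Exception):
    """Raised when a client exceeds its request budget for the current window."""

@dataclass
class Gateway:
    # Maps a provider name to a callable that takes a prompt and returns a completion.
    providers: Dict[str, Callable[[str], str]]
    max_requests: int = 2
    window_seconds: float = 60.0
    clock: Callable[[], float] = time.monotonic
    _calls: Dict[str, List[float]] = field(default_factory=dict, init=False)

    def complete(self, client_id: str, provider: str, prompt: str) -> str:
        now = self.clock()
        # Sliding-window rate limit: keep only timestamps inside the window.
        recent = [t for t in self._calls.get(client_id, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.max_requests:
            raise RateLimitExceeded(client_id)
        recent.append(now)
        self._calls[client_id] = recent
        # Policy runs before routing, so every provider sees sanitized input.
        return self.providers[provider](redact_pii(prompt))

def echo_provider(prompt: str) -> str:
    """Stub standing in for a real LLM provider client."""
    return f"echo: {prompt}"
```

Because callers only name a provider and pass a prompt, swapping one backend for another is a change to the `providers` mapping rather than to agent code, which is the point of the abstraction.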
4. agent runtime
- Agent Workloads: FastAPI-based APIs with background workers
- Orchestration: Agno and LangGraph for multi-step workflows
- MCP Integration: Model Context Protocol servers for tool access
- Banking Services: Pre-integrated domain services (deposits, payments, loans, fraud)
- Guardrails and Safety: Real-time evaluations, content safety filters, PII detection/sanitization, jailbreak prevention, prompt validation
- Observability: OpenTelemetry traces, Langfuse runs, metrics, and logs
5. control plane and ingress
- Ingress: API Management (APIM) with DNS, WAF, SSL termination
- GitOps: Argo CD for declarative, automated deployments
- CI/CD: Automated pipelines with PR checks, builds, and releases
- Self-Service: Repository provisioning and infrastructure automation
- Monitoring: Grafana dashboards, PagerDuty alerts, real-time evaluations

Runtime flow

- Client → APIM → Agent API: Request enters through API Management with authentication and rate limiting
- Agent → AI Gateway: Agent calls LLM through gateway with guardrails and traffic control applied
- AI Gateway → LLM Provider: Request routed to selected provider (Azure AI Foundry or BYO) with policies enforced
- Tool Execution: Agent calls MCP servers (banking domains) or direct REST/GraphQL APIs
- Observability: OTel traces and Langfuse runs capture every step; logs and metrics flow to platform sinks
- Real-time Evaluation: Automated evaluations flag risks (safety, PII, jailbreaks) and feed continuous improvements
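The flow above can be walked through with a small simulation. Everything in it is a stand-in chosen for illustration: the `demo-key` credential check plays the role of APIM authentication, `call_gateway` returns a canned reply instead of calling an LLM, `TOOLS` replaces an MCP server, and `Trace` is a plain list rather than OpenTelemetry or Langfuse.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Trace:
    """Collects the steps a request passes through, in order."""
    steps: List[str] = field(default_factory=list)

    def record(self, step: str) -> None:
        self.steps.append(step)

def call_gateway(prompt: str, trace: Trace) -> str:
    """Stub for Agent -> AI Gateway -> LLM Provider; returns a canned tool request."""
    trace.record("agent->gateway")
    trace.record("gateway->provider")
    return "TOOL:get_balance"

# Hypothetical tool registry standing in for a banking MCP server.
TOOLS: Dict[str, Callable[[], str]] = {"get_balance": lambda: "balance=100.00"}

def run_tool(name: str, trace: Trace) -> str:
    trace.record(f"tool:{name}")
    return TOOLS[name]()

def evaluate(output: str, trace: Trace) -> List[str]:
    """Toy real-time evaluation: flag output that looks like it contains an email."""
    trace.record("evaluation")
    return ["possible_pii"] if "@" in output else []

def handle_request(api_key: str, prompt: str) -> dict:
    trace = Trace()
    # Ingress: authentication happens before the agent sees the request.
    if api_key != "demo-key":
        trace.record("apim:rejected")
        return {"status": 401, "body": None, "flags": [], "trace": trace.steps}
    trace.record("client->apim->agent")
    reply = call_gateway(prompt, trace)
    # Tool execution: the stub reply asks the agent to run a named tool.
    if reply.startswith("TOOL:"):
        reply = run_tool(reply[len("TOOL:"):], trace)
    flags = evaluate(reply, trace)
    return {"status": 200, "body": reply, "flags": flags, "trace": trace.steps}

result = handle_request("demo-key", "What is my balance?")
```

The recorded `trace` lists the hops in the same order as the bullets, which is the property the observability step relies on: each stage emits a span, so a single request can be reconstructed end to end.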