Skip to main content
This page provides a comprehensive overview of the Agentic Platform architecture. It shows how the platform is assembled across fabrics, how it stays cloud-agnostic, and how you ship agents end-to-end.

Reference architecture

Agentic Platform cloud-agnostic architecture diagram

Agentic Platform - Cloud Agnostic Architecture

Core building blocks

The platform consists of these foundational components:
  • AI Gateway: Provider-agnostic LLM ingress with policy enforcement, PII detection, semantic caching, traffic control, and model routing. Accessed via BB AI SDK.
  • Agent Compute: Kubernetes workloads (agent API + workers) deployed via Argo CD with 99.9% availability SLA.
  • Storage: PostgreSQL vector database for long-term storage (embeddings and RAG-based applications); Redis for short-term caching and session state.
  • Observability: OpenTelemetry traces, Langfuse for agent runs, Grafana dashboards, PagerDuty alerts, standard logs/metrics.
  • API Management: APIM fronts agent APIs; MCP bridges to domain tools and banking services.
  • Service mesh & packaging: Optional Istio for traffic policies; Helm charts for deployment packaging.
  • Agent Orchestration: Manages multi-step agent processes, team coordination, and workflow execution.

Platform layers & fabrics

The platform operates across three main fabric layers:
FabricKey servicesWhat you deploy/useDeveloper impact
Digital Banking FabricDigital services, Identity & Entitlements, Process orchestrationExpose agent APIs via APIM; consume digital/identity/process APIsPublish OpenAPI, map to service URL.
Intelligence FabricAgentic Automation Services (this platform), Hybrid AI Services, Data ServicesAgents, tools, knowledge, evaluations (src/agents, src/api, src/knowledge)Core build surface for agents and tooling.
Integration FabricIntegration connectors, marketplace partnersMCP/domain connectors, marketplace toolsCall MCP servers or direct APIs through the tool catalog.
Platform context showing Intelligence, Integration, and Digital Banking fabrics

Platform Context across Fabrics

Reference: Backbase Confluence “Agentic Platform as a Service – Cloud Agnostic Architecture”.

High-level topology

Ingress layer

  • External: APIM in front of agent APIs with DNS, WAF, and SSL termination
  • Internal: Optional internal ingress for MCP/tooling and service-to-service communication

Control plane

  • GitOps: Argo CD for declarative deployments
  • CI/CD: Automated pipelines feed images to container registries (ACR/ECR/GCR/GitHub Packages)
  • Self-service: Repository provisioning and infrastructure management

Data plane

  • Agent workloads: FastAPI pods, background workers, agent orchestration
  • RAG stores: Vector databases (PostgreSQL), term-based search (Redis), object storage
  • Observability sinks: OTel collectors, Langfuse, Grafana, PagerDuty

AI Gateway layer

  • Provider abstraction: Fronts LLM providers (Azure AI Foundry, OpenAI, Gemini, Anthropic, BYO)
  • Policy enforcement: Guardrails, rate limiting, cost management
  • Traffic management: Semantic caching, load balancing, request routing

Agent runtime path

The end-to-end flow when a client makes a request:
  1. Client → APIM → agent API (FastAPI): Request enters through API Management
  2. Agent calls AI Gateway: With guardrails and traffic control applied
  3. AI Gateway → LLM: Calls the selected LLM (Azure AI Foundry or BYO model provider) with policies applied
  4. Agent tools execution: Call MCP servers (domain/third-party) or direct REST/GraphQL APIs
  5. Observability emission: OTel traces and Langfuse runs; logs/metrics go to platform sinks
  6. Real-time evaluation: Evaluations and guardrails run on traces to flag risks (safety, PII, jailbreak) and feed improvements

AI Gateway capabilities

The AI Gateway provides a unified, provider-agnostic entry point with:
  • Multi-provider support: Azure AI Foundry, OpenAI, Gemini, Anthropic, or BYO models
  • Policy enforcement: Centralized access control, cost management, rate limiting
  • Traffic control: AI-driven traffic pattern prediction and load balancing
  • PII sanitization: Automatic detection and redaction of sensitive data
  • Semantic caching: Reduces latency and costs by recognizing repetitive queries
  • Content safety: Default filters for toxicity, bias, and harmful content
  • Prompt guard: Compliance filtering at the gateway level

MCP integration & banking services

Agents discover and call tools via the Model Context Protocol (MCP) for standardized communication:
  • Tool discovery: Agents locate tools via MCP servers
  • Standardized communication: MCP JSON-RPC protocol
  • Integration methods: Direct, via Grand Central, or public MCP servers

Available Banking Domain MCPs

The platform provides pre-integrated MCP servers for banking services:
  • Deposits: Account management and operations
  • Loans: Seamless lending journeys
  • Transactions: Account history and queries
  • Payments Initiation: End-to-end payment lifecycle
  • Currency Exchange: FX operations
  • Investment Account: Investment management
  • Party Lifecycle: Onboarding and verification
  • Device: Card plastics lifecycle
  • Party Reference Data: Customer data management
  • Batch Payment: Bulk payment processing
  • Fraud: Behavioral fraud management
  • Party Access Entitlement: Access control
The platform is adopting BIAN Coreless for unified banking API and connectors.

Caching & performance

Agents utilize caching to reduce latency and costs while maintaining quality:
  • Response caching: Store previous LLM responses for similar queries
  • Embedding caching: Cache vector embeddings for knowledge retrieval
  • Semantic cache: AI Gateway-level semantic similarity matching
Configure caching in your agent code:
from agno import Agent
from agno.cache import InMemoryCache

agent = Agent(
    name="my-agent",
    cache=InMemoryCache(),  # Or RedisCache, FileCache, etc.
    # ... other config
)

Guardrails & governance

AI Gateway Guardrails

  • PII detection: Regex-based detection and sanitization
  • Content safety filters: Default toxicity, bias, and harmful content filters
  • Rate limits: Per-agent and per-user rate limiting
  • Model routing: Intelligent routing based on workload and cost
  • Semantic cache: Reduces redundant LLM calls
  • Traffic control: AI-driven traffic pattern management

Security layers

  • Input/output guardrails: Programmable guardrails at AI Gateway and prompt levels
  • Content safety filters: Multi-layer content filtering
  • Red teaming: Regular adversarial testing for injection, jailbreaking, misuse
  • Prompt validations: Pre-execution validation and sanitization
  • Secure SDLC: Dependency scans, container scans, code quality checks in CI/CD
  • Agent sandboxes: Mock API testing before real-world deployment
  • RBAC: Role-based access control for infrastructure and models

Platform components

AreaComponents
Evaluations & GuardrailsLangfuse, LangWatch, Promptfoo, Nemo Guardrails
LLMsAzure AI Foundry, BYO Models (OpenAI, Gemini, Anthropic)
DataPostgreSQL (vector), Redis, Storage Account, Azure Service Bus, Container Registry, GitHub Packages
ObservabilityGrafana, PagerDuty, Langfuse, OpenTelemetry
LLM OrchestrationAgent Orchestration (Agno, LangGraph)
Interoperability & ConnectivityProductised connectors, Custom connectors, HTTPS/mTLS, Site-to-site VPN, Privatelink
DevOpsSelf Service, CI Automation, Applications Live, Repository Template, Argo CD, Argo Workflows, Reference Agents, Reference MCPs
Service MeshKubernetes, Docker, Istio, Helm
Ingress & APIDNS, WAF, API Management (Azure APIM)

What you get

Cloud Agnostic

Kubernetes-first with pluggable ingress, storage, DNS, WAF, and APIM.

AI Gateway & Models

Provider-agnostic gateway with guardrails; use AI Foundry or BYO models.

GitOps & CI/CD

Argo CD plus opinionated workflows (provisioning, PR checks, build/publish, release).

Observability & Evals

OTel + Langfuse traces; real-time evaluations and guardrails on every trace.

Starter Kits

Prewired single-, multi-, MCP-, and knowledge-agent templates to ship fast.

Data & RAG

Vector DB (Postgres/Redis), object storage, Service Bus, package registries for artifacts.

Connectivity & Security

Productised/custom connectors, HTTPS/mTLS, VPN/Privatelink, DNS/WAF fronted via APIM.

Where to start

1

Provision a repo

Use Self Service and the Agno base template: see Create Your First Agent.The template includes:
  • Pre-configured CI/CD workflows
  • AI Gateway integration
  • Observability hooks (OTel, Langfuse)
  • Standard project structure
2

Build & run locally

Configure your .env file. Install dependencies and run locally.
3

Ship with CI/CD

The default workflows handle:
  • Repository provisioning: Auto-configures project on first push
  • PR checks: Linting, testing, security scans
  • Build/publish: Docker images to container registry
  • Release: Production releases with quality gates
See CI/CD Workflows for details.
4

Deploy to runtime

Update Applications Live with your Helm values and service configuration. Argo CD syncs automatically.
5

Expose via APIM

Add your OpenAPI spec in Applications Live and map to your service URL. Consumers call APIM endpoints, not the cluster directly.
Quick start: Check Starter Kits for prewired templates:
  • Starter Agent (Level 0): Basic instructions + tools
  • Multi Agent (Level 1): Teams with coordination
  • MCP Agent (Level 2): MCP integration examples
  • Knowledge Agent (Level 3): RAG implementations