Current Version: v0.1.0
The Knowledge Agent (Level 3) is a production-ready RAG (Retrieval-Augmented Generation) starter kit for building knowledge-powered AI agents. It provides complete document ingestion, intelligent chunking, hybrid search, and RAG-enabled query capabilities—everything needed to build agents that answer questions from your organization’s documents and data.
This repository can be used as a base template for creating your own application. Select starter-knowledge-base-agent as the repository_template when provisioning a new repository via Self Service.

GitHub Repository

View source code, releases, and issues

Why RAG for Banking?

Financial institutions manage vast repositories of policies, regulations, product documentation, and customer-facing content. RAG enables agents to provide accurate, source-cited answers from this knowledge—critical for compliance, trust, and customer experience. Common use cases:
  • Policy and compliance Q&A
  • Product information lookup
  • Internal knowledge base assistants
  • Document-grounded customer support

What You’ll Get

This starter provides a complete RAG pipeline with three core capabilities:

Document Ingestion

Load PDFs, text, markdown, and web pages with multiple chunking strategies (character, token, semantic, recursive).

Hybrid Search

Semantic vector search, keyword full-text search, and hybrid search with optional reranking for improved relevance.

RAG Agent

Knowledge-augmented responses with source citations. Answers grounded in your documents.

This documentation explains two implementation approaches:

Starter Kit Implementation

The native implementation included in this repository—custom pipelines using raw SQL, embeddings API, and search logic.

Agno Framework Alternative

Expandable examples throughout each section showing how to achieve similar functionality using Agno’s built-in knowledge base, readers, and search features.

Prerequisites

  • Python 3.11+: Managed via UV
  • UV Package Manager: Modern Python package manager (replaces pip/poetry)
  • PostgreSQL 14+: With pgvector extension for vector storage
  • Docker: For running PostgreSQL locally
Database Required: This starter requires PostgreSQL with the pgvector extension. Unlike other starters, you must set up a database before running the application.

Quick Start

1. Clone and Install UV

git clone https://github.com/bb-ecos-agbs/starter-knowledge-base-agent.git
cd starter-knowledge-base-agent

# Install UV (macOS)
brew install uv

# Install UV (Linux/WSL)
curl -LsSf https://astral.sh/uv/install.sh | sh

2. Setup Environment

# Create virtual environment
uv venv --python 3.11
source .venv/bin/activate  # macOS/Linux
# or .venv\Scripts\activate  # Windows

# Copy environment template
cp env.template .env

3. Configure Credentials

Edit .env with required values:
# Required - Artifactory credentials (for bb-ai-sdk)
export UV_INDEX_BACKBASE_USERNAME=your-email-address
export UV_INDEX_BACKBASE_PASSWORD=your-artifactory-token

# AI Gateway
AI_GATEWAY_ENDPOINT=https://ai-gateway.backbase.cloud
AI_GATEWAY_API_KEY=your-api-key

# Database
DB_HOST=localhost
DB_PORT=5432
DB_NAME=knowledge_base
DB_USER=postgres
DB_PASSWORD=postgres

# Observability (Langfuse)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com

# Web Proxy (Required for local dev)
HTTP_PROXY=http://webproxy.infra.backbase.cloud:8888
HTTPS_PROXY=http://webproxy.infra.backbase.cloud:8888
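Before moving on, it can help to sanity-check that the required variables above are actually set. A minimal sketch (the `missing_vars` helper and the variable list are illustrative, not part of the starter):

```python
import os

# Variables the .env template above expects; adjust the list if your
# deployment differs.
REQUIRED_VARS = [
    "AI_GATEWAY_ENDPOINT",
    "AI_GATEWAY_API_KEY",
    "DB_HOST",
    "DB_PORT",
    "DB_NAME",
    "DB_USER",
    "DB_PASSWORD",
]

def missing_vars(required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.getenv(name)]

print("Missing:", missing_vars() or "none")
```

Running this after `source .env` lists any variables you still need to fill in.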

4. Setup Database and Install Dependencies

Set up PostgreSQL with pgvector. See Database Setup for detailed instructions on provisioning via Azure or running locally with Docker.
# Export env vars and sync dependencies
source .env && uv sync

5. Run the Server

uv run python -m src.main
The server runs at http://localhost:8000. Access the API documentation at /docs.
VPN and Web Proxy: Required for local development. Configure Aviatrix VPN and web proxy settings. See Onboarding Guide for setup instructions.

Database Setup

The starter uses PostgreSQL with pgvector for vector storage and similarity search. You have two options for provisioning the database:
Use Self Service to provision a managed PostgreSQL instance on Azure. This is the recommended approach for production deployments and shared development environments.
See Self Service for instructions on requesting PostgreSQL with pgvector extension and managing infrastructure access.
Once provisioned, configure your .env with the provided credentials:
DB_HOST=your-postgres-instance-host
DB_PORT=5432
DB_NAME=knowledge_base
DB_USER=your-username
DB_PASSWORD=your-password
DB_SSLMODE=require

Initialize Database

After configuring your database connection, run the setup script to create the required tables:
# Initialize database tables
uv run python scripts/setup_db.py
This creates three normalized tables:
Table        Purpose
documents    Source documents with metadata
chunks       Text chunks with token counts
embeddings   Vector embeddings (1536 dimensions)
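As a rough mental model of what the setup script creates, here is a hypothetical DDL for the three tables. The actual schema lives in scripts/setup_db.py and may differ in column names, constraints, and indexes:

```python
# Hypothetical schema sketch; the authoritative DDL is in scripts/setup_db.py.
DDL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
    id          SERIAL PRIMARY KEY,
    title       TEXT NOT NULL,
    source      TEXT,
    metadata    JSONB DEFAULT '{}'::jsonb,
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE IF NOT EXISTS chunks (
    id          SERIAL PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
    content     TEXT NOT NULL,
    token_count INTEGER
);

CREATE TABLE IF NOT EXISTS embeddings (
    chunk_id    INTEGER PRIMARY KEY REFERENCES chunks(id) ON DELETE CASCADE,
    embedding   vector(1536)
);
"""
```

The `vector(1536)` column matches the embedding dimensions in the table above and requires the pgvector extension to be installed first.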

Database Management

# Test connection
uv run python scripts/setup_db.py --test

# Check database status
uv run python scripts/setup_db.py --status

# Reset database (drop and recreate)
uv run python scripts/setup_db.py --reset
Agno manages its own schema automatically via PgVector. To use the AI Gateway for embeddings:
import os
from agno.vectordb.pgvector import PgVector, SearchType
from agno.knowledge.embedder.openai import OpenAIEmbedder

# Configure embedder to use AI Gateway
gateway_url = os.getenv("AI_GATEWAY_ENDPOINT")
gateway_key = os.getenv("AI_GATEWAY_API_KEY")

embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
    api_key=gateway_key,
    base_url=f"{gateway_url}/deployments/text-embedding-3-small",
)

# Create knowledge base with the embedder
kb = PgVector(
    table_name="my_kb",
    db_url="postgresql://user:pass@localhost:5432/db",
    embedder=embedder,
    search_type=SearchType.hybrid,
)
kb.create()  # Creates table automatically
See Agno PgVector Documentation for more details on PgVector and other supported vector stores.

Document Ingestion

Add documents to the knowledge base using CLI scripts or the REST API.
# Ingest a single file
uv run python scripts/ingest.py --file ./data/samples/document.pdf

# With specific chunking strategy
uv run python scripts/ingest.py --file ./document.pdf --chunking semantic

# Ingest a directory
uv run python scripts/ingest.py --dir ./docs/

# Ingest from URL
uv run python scripts/ingest.py --url https://example.com/article

# PDF with OCR (for scanned documents)
uv run python scripts/ingest.py --file ./scanned.pdf --ocr

Chunking Strategies

Strategy     Description                          Best For
character    Fixed character-based splits         Simple text
token        Token-aware splitting (tiktoken)     LLM-optimized chunks
semantic     Sentence boundary splitting          Preserving meaning
recursive    Hierarchical splitting (default)     Structured documents
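To illustrate the idea behind the default recursive strategy, here is a minimal pure-Python sketch (not the starter's actual implementation): it tries coarse separators first and recurses with finer ones until every chunk fits the size limit.

```python
def recursive_chunk(text, max_len=1000, separators=("\n\n", "\n", ". ", " ")):
    """Split text hierarchically: paragraphs first, then lines, sentences,
    and finally words, recursing until each piece fits max_len."""
    if len(text) <= max_len:
        return [text]
    for sep in separators:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks, current = [], ""
            for part in parts:
                piece = part + sep
                if current and len(current) + len(piece) > max_len:
                    # Recurse in case the accumulated run is still too large.
                    chunks.extend(recursive_chunk(current.rstrip(), max_len, separators))
                    current = ""
                current += piece
            if current:
                chunks.extend(recursive_chunk(current.rstrip(), max_len, separators))
            return chunks
    # No separator helped: fall back to a hard character split.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

chunks = recursive_chunk("para one.\n\npara two is a bit longer.", max_len=20)
```

Because paragraph and sentence boundaries are preferred over arbitrary cuts, this tends to keep related text together, which is why it works well for structured documents.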

Supported File Formats

Format     Extensions
PDF        .pdf
Text       .txt
Markdown   .md
Web URLs   http://, https://
Agno provides built-in Readers that transform raw content from various sources into structured Document objects. Readers handle parsing, text extraction, and automatic chunking.
from agno.knowledge.reader.pdf_reader import PDFReader

# Configure reader with chunking
reader = PDFReader(
    chunk=True,
    chunk_size=1000,
)

# Read and process document
documents = reader.read("document.pdf")

# Add to knowledge base
kb.load(documents)
Agno supports multiple reader types including PDF, CSV, Markdown, JSON, and web content. See Agno Readers Documentation for the full list of supported readers and configuration options.
For chunking strategies, Agno supports document chunking, fixed-size chunking, semantic chunking, and agentic chunking. See Agno Chunking Documentation for details.

Search & Retrieval

Query the knowledge base with multiple search strategies.
# Hybrid search (default - combines semantic and keyword)
uv run python scripts/search.py "What is attention?"

# Semantic search only (vector similarity)
uv run python scripts/search.py "transformer architecture" --strategy semantic

# Keyword search only (full-text)
uv run python scripts/search.py "neural network" --strategy keyword

# With reranking for better quality
uv run python scripts/search.py "attention mechanism" --rerank

Search Strategies

Strategy   Description                                     Best For
Semantic   Vector similarity using embeddings              Conceptual queries, finding related content
Keyword    Full-text search with PostgreSQL tsvector       Exact terms, specific phrases
Hybrid     Combined semantic + keyword with RRF fusion     General use, best overall accuracy
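The RRF (Reciprocal Rank Fusion) step can be sketched in a few lines; this toy `rrf_fuse` helper is illustrative, not the starter's code. Each ranked list contributes `1 / (k + rank)` per document, so items ranked well by either strategy rise to the top:

```python
def rrf_fuse(semantic_ids, keyword_ids, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) across both rankings,
    then sort documents by the fused score."""
    scores = {}
    for ranking in (semantic_ids, keyword_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks 2nd semantically and 1st by keyword, so it fuses to the top.
fused = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])
```

The constant `k` (60 is the value commonly used in the RRF literature) dampens the advantage of a single top rank, which is what makes the fusion robust to one strategy being noisy.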

Reranking

Reranking improves search result quality by re-scoring retrieved documents. Two backends are supported:
Backend               Latency      Setup               Best For
Cohere                ~100-300ms   Requires API keys   Production, high throughput
Local Cross-Encoder   ~1-5s        No setup needed     Development, testing
For production deployments, configure Cohere reranking via COHERE_RERANK_ENDPOINT and COHERE_RERANK_API_KEY environment variables.
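Conceptually, reranking is just a re-sort of the retrieved candidates under a slower but more accurate scoring model. A minimal sketch, with a toy word-overlap scorer standing in for the Cohere or cross-encoder call (all names here are illustrative):

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-score retrieved candidates with a more accurate model and
    return the top_k by the new score. score_fn stands in for a
    Cohere or cross-encoder backend."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

# Toy relevance score: word overlap between query and candidate text.
def overlap(query, text):
    return len(set(query.lower().split()) & set(text.lower().split()))

best = rerank(
    "attention mechanism",
    ["the attention mechanism", "feed-forward layer"],
    overlap,
    top_k=1,
)
```

Because the reranker only sees the handful of candidates that retrieval returned, its higher per-pair cost stays bounded, which is the trade-off behind the latency figures above.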

RAG Agent

The starter includes a RAG-enabled agent that searches the knowledge base to answer questions with source citations.
curl -X POST http://localhost:8000/agent/crude \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the attention mechanism?"}'
Response with sources:
{
  "query": "What is the attention mechanism?",
  "answer": "Based on the knowledge base, the attention mechanism is a component that allows models to focus on relevant parts of the input...",
  "sources": [
    {
      "title": "attention_is_all_you_need.pdf",
      "source": "/data/samples/attention_is_all_you_need.pdf",
      "snippet": "An attention function can be described as mapping a query and a set of key-value pairs to an output..."
    }
  ]
}
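Under the hood, a "crude" RAG agent boils down to stuffing the retrieved chunks into the prompt with citation markers before calling the LLM. A sketch of that assembly step (the field names are illustrative, not the starter's schema):

```python
def build_rag_prompt(query, chunks):
    """Assemble a grounded prompt: numbered context snippets followed by
    the user's question, so the model can cite [1], [2], ... as sources."""
    context = "\n".join(
        f"[{i}] ({c['title']}) {c['text']}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using ONLY the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is the attention mechanism?",
    [{"title": "attention_is_all_you_need.pdf",
      "text": "An attention function maps a query and key-value pairs to an output."}],
)
```

The citation markers in the prompt are what let the agent map the model's answer back to the `sources` array shown in the response above.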
Agno agents can automatically search the knowledge base when configured with knowledge:
from agno.agent import Agent

agent = Agent(
    model=model,
    knowledge=kb,
    search_knowledge=True,  # Automatic retrieval
)
response = agent.run("Explain transformers")
When search_knowledge=True, the agent automatically queries the knowledge base for relevant context before generating a response. You can also configure the number of results to retrieve and filtering options. See Agno Knowledge Getting Started for more details on integrating knowledge bases with agents.

API Reference

Method   Endpoint          Description
GET      /                 Service status
GET      /health           Health check
POST     /ingest           Ingest documents (file, URL, or text)
GET      /documents        List all documents
GET      /documents/{id}   Get document details
DELETE   /documents/{id}   Delete a document
POST     /search           Search the knowledge base
POST     /agent/crude      RAG agent query
Access interactive API documentation at:
  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
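For scripting against the API, a small client along these lines works with only the Python standard library. The payload field names follow the examples on this page; check /docs for the authoritative request schemas:

```python
import json
from urllib import request

BASE = "http://localhost:8000"

def post_json(path, payload):
    """POST a JSON payload to the running starter and decode the reply."""
    req = request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example payloads (illustrative field names):
search_payload = {"query": "attention mechanism", "strategy": "hybrid"}
agent_payload = {"query": "What is the attention mechanism?"}
# post_json("/search", search_payload)
# post_json("/agent/crude", agent_payload)
```

The commented-out calls require the server from the Quick Start to be running locally.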

Project Structure

starter-knowledge-base-agent/
├── .github/                    # CI/CD workflows
├── scripts/
│   ├── setup_db.py             # Database setup
│   ├── ingest.py               # Ingestion CLI
│   └── search.py               # Search CLI
├── src/
│   ├── main.py                 # App entry point
│   ├── agents/                 # RAG agent implementation
│   │   ├── assistant.py        # Agent with knowledge base tool
│   │   └── tools/              # Knowledge base search tool
│   ├── api/                    # FastAPI routes and schemas
│   │   ├── app.py
│   │   ├── schemas.py
│   │   └── routers/            # Ingest, search, documents, agent
│   ├── config/                 # Configuration management
│   ├── ingestion/              # Document ingestion pipeline
│   │   ├── loaders/            # PDF, text, markdown, web loaders
│   │   ├── chunkers/           # Chunking strategies
│   │   ├── embeddings/         # Embedding generation
│   │   └── pipeline.py         # Ingestion orchestrator
│   ├── search/                 # Search pipeline
│   │   ├── semantic.py         # Vector similarity search
│   │   ├── keyword.py          # Full-text search
│   │   ├── hybrid.py           # RRF fusion
│   │   ├── rerankers/          # Cohere, cross-encoder
│   │   └── pipeline.py         # Search orchestrator
│   └── storage/                # Database repositories
│       ├── document.py
│       ├── chunk.py
│       └── embedding.py
├── tests/                      # Test suite
├── redteam.yaml                # Red teaming configuration
├── Dockerfile                  # Container definition
└── pyproject.toml              # Dependencies

Observability

The starter automatically integrates with Langfuse via the bb-ai-sdk.
  • Automatic Tracing: Captures full traces for agent runs, search queries, and LLM calls.
  • Embedding Tracking: Monitors embedding generation costs and latency.
  • Search Analytics: Tracks search queries, strategies, and result quality.
  • Configuration: Managed via LANGFUSE_* environment variables.
This starter is integrated with bb-ai-sdk to connect with observability tools (Langfuse) and AI Gateway. See BB AI SDK Observability for advanced configuration and custom tracing.

Development

Run Tests

source .env && uv sync --extra dev
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ -v --cov=src

Build Docker Image

docker build -t starter-knowledge-base-agent:local .

CI/CD

Standard workflows are pre-configured in .github/workflows:
  • PR Checks: Linting, testing, and validation.
  • Build & Publish: Docker image creation on merge.
  • Release: Automated versioning and release notes.
See CI/CD Workflows for pipeline details.

Next Steps