Current Version: v0.1.0
GitHub Repository
View source code, releases, and issues
Why RAG for Banking?
Financial institutions manage vast repositories of policies, regulations, product documentation, and customer-facing content. RAG enables agents to provide accurate, source-cited answers from this knowledge, which is critical for compliance, trust, and customer experience. Common use cases:
- Policy and compliance Q&A
- Product information lookup
- Internal knowledge base assistants
- Document-grounded customer support
What You’ll Get
This starter provides a complete RAG pipeline with three core capabilities:
Document Ingestion
Load PDFs, text, markdown, and web pages with multiple chunking strategies (character, token, semantic, recursive).
Hybrid Search
Semantic vector search, keyword full-text search, and hybrid search with optional reranking for improved relevance.
RAG Agent
Knowledge-augmented responses with source citations. Answers grounded in your documents.
Starter Kit Implementation
The native implementation included in this repository: custom pipelines using raw SQL, the embeddings API, and search logic.
Agno Framework Alternative
Expandable examples throughout each section showing how to achieve similar functionality using Agno’s built-in knowledge base, readers, and search features.
Prerequisites
- Python 3.11+: Managed via UV
- UV Package Manager: Modern Python package manager (replaces pip/poetry)
- PostgreSQL 14+: With pgvector extension for vector storage
- Docker: For running PostgreSQL locally
Quick Start
Setup Database and Install Dependencies
Set up PostgreSQL with pgvector. See Database Setup for detailed instructions on provisioning via Azure or running locally with Docker.
Database Setup
The starter uses PostgreSQL with pgvector for vector storage and similarity search. You have two options for provisioning the database:
- Azure PostgreSQL (Production)
- Docker (Local Development)
Use Self Service to provision a managed PostgreSQL instance on Azure. This is the recommended approach for production deployments and shared development environments.
See Self Service for instructions on requesting PostgreSQL with pgvector extension and managing infrastructure access.
Once provisioned, configure your .env with the provided credentials:
Initialize Database
After configuring your database connection, run the setup script to create the required tables:
| Table | Purpose |
|---|---|
| documents | Source documents with metadata |
| chunks | Text chunks with token counts |
| embeddings | Vector embeddings (1536 dimensions) |
Database Management
Using Agno Framework for Database
Agno manages its own schema automatically via PgVector. To use the AI Gateway for embeddings:
See Agno PgVector Documentation for more details on PgVector and other supported vector stores.
Document Ingestion
Add documents to the knowledge base using CLI scripts or the REST API.
- CLI
- API
Chunking Strategies
| Strategy | Description | Best For |
|---|---|---|
| character | Fixed character-based splits | Simple text |
| token | Token-aware splitting (tiktoken) | LLM-optimized chunks |
| semantic | Sentence boundary splitting | Preserving meaning |
| recursive | Hierarchical splitting (default) | Structured documents |
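To make the recursive strategy concrete, here is a minimal sketch of hierarchical splitting. It is not the starter's implementation: the function name `recursive_split`, the separator order, and the character-based size limit are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical separator hierarchy: paragraphs, lines, sentences, words.
SEPARATORS = ("\n\n", "\n", ". ", " ")


def recursive_split(text: str, max_chars: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into chunks of at most max_chars, preferring coarse boundaries."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No boundary left to try: fall back to fixed-size character splits.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        piece = piece.strip()
        if not piece:
            continue  # skip empty fragments produced by repeated separators
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too large: recurse with finer separators.
            chunks.extend(recursive_split(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


chunks = recursive_split("aaa bbb\n\nccc ddd eee", max_chars=7)
```

The sketch tries the coarsest boundary first and only descends to finer ones for pieces that still exceed the limit, which is why this style of splitting tends to suit structured documents.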
Supported File Formats
| Format | Extensions |
|---|---|
| PDF | .pdf |
| Text | .txt |
| Markdown | .md |
| Web URLs | http://, https:// |
Using Agno Framework for Ingestion
Agno provides built-in Readers that transform raw content from various sources into structured Document objects. Readers handle parsing, text extraction, and automatic chunking.
Agno supports multiple reader types including PDF, CSV, Markdown, JSON, and web content. See Agno Readers Documentation for the full list of supported readers and configuration options.
For chunking strategies, Agno supports document chunking, fixed-size chunking, semantic chunking, and agentic chunking. See Agno Chunking Documentation for details.
Search & Retrieval
Query the knowledge base with multiple search strategies.
- CLI
- API
Search Strategies
| Strategy | Description | Best For |
|---|---|---|
| Semantic | Vector similarity using embeddings | Conceptual queries, finding related content |
| Keyword | Full-text search with PostgreSQL tsvector | Exact terms, specific phrases |
| Hybrid | Combined semantic + keyword with RRF fusion | General use, best overall accuracy |
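The hybrid strategy combines two ranked result lists using Reciprocal Rank Fusion (RRF). The sketch below shows the general technique only; the function name `rrf_fuse`, the chunk IDs, and the constant `k = 60` (a commonly used default) are assumptions, and the starter's own fusion code and parameters may differ.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists with RRF: score(id) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


semantic = ["c3", "c1", "c7"]  # hypothetical vector-search ranking
keyword = ["c1", "c9", "c3"]   # hypothetical full-text ranking
fused = rrf_fuse([semantic, keyword])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Because RRF uses only rank positions, it avoids having to normalize vector-similarity scores against full-text scores, which live on different scales.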
Reranking
Reranking improves search result quality by re-scoring retrieved documents. Two backends are supported:
| Backend | Latency | Setup | Best For |
|---|---|---|---|
| Cohere | ~100-300ms | Requires API keys | Production, high throughput |
| Local Cross-Encoder | ~1-5s | No setup needed | Development, testing |
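As a rough illustration of the re-scoring step, the sketch below reranks candidates with a pluggable scoring function. The names `rerank` and `overlap_score` are hypothetical, and the token-overlap scorer is only a toy stand-in for a real backend such as Cohere or a cross-encoder model.

```python
from typing import Callable, List, Tuple


def rerank(query: str, candidates: List[str],
           score_fn: Callable[[str, str], float],
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Re-score each candidate against the query and keep the top_k best."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in scorer: fraction of query tokens that appear in the text."""
    q_tokens = set(query.lower().split())
    t_tokens = set(text.lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0


results = rerank(
    "wire transfer limits",
    [
        "Savings account rates",
        "Daily wire transfer limits for retail accounts",
        "Transfer fees",
    ],
    overlap_score,
    top_k=2,
)
```

Keeping the scorer behind a single callable is one way to swap a hosted backend for a local one without touching the retrieval code.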
Using Agno Framework for Search
Agno provides built-in search capabilities on the knowledge base with support for different search types.
Agno’s PgVector supports multiple search types configured at initialization:
See Agno Search & Retrieval for advanced search configuration and filtering options.
RAG Agent
The starter includes a RAG-enabled agent that searches the knowledge base to answer questions with source citations.
Using Agno Framework for RAG Agent
Agno agents can automatically search the knowledge base when configured with knowledge:
When search_knowledge=True, the agent automatically queries the knowledge base for relevant context before generating a response. You can also configure the number of results to retrieve and filtering options.
See Agno Knowledge Getting Started for more details on integrating knowledge bases with agents.
API Reference
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Service status |
| GET | /health | Health check |
| POST | /ingest | Ingest documents (file, URL, or text) |
| GET | /documents | List all documents |
| GET | /documents/{id} | Get document details |
| DELETE | /documents/{id} | Delete a document |
| POST | /search | Search the knowledge base |
| POST | /agent/crude | RAG agent query |
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
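A minimal sketch of calling the search endpoint from Python using only the standard library. The JSON field names (`query`, `strategy`, `limit`) and the helper `build_search_request` are assumptions for illustration; check the Swagger UI for the actual request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host as the Swagger UI above


def build_search_request(query: str, strategy: str = "hybrid",
                         limit: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST /search request.

    The JSON field names are assumptions, not the documented schema.
    """
    payload = {"query": query, "strategy": strategy, "limit": limit}
    return urllib.request.Request(
        url=f"{BASE_URL}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("What is the overdraft policy?")

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the example runnable without a live server and makes the payload easy to inspect.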
Project Structure
Observability
The starter automatically integrates with Langfuse via the bb-ai-sdk.
- Automatic Tracing: Captures full traces for agent runs, search queries, and LLM calls.
- Embedding Tracking: Monitors embedding generation costs and latency.
- Search Analytics: Tracks search queries, strategies, and result quality.
- Configuration: Managed via LANGFUSE_* environment variables.
This starter is integrated with bb-ai-sdk to connect with observability tools (Langfuse) and AI Gateway. See BB AI SDK Observability for advanced configuration and custom tracing.
Development
Run Tests
Build Docker Image
CI/CD
Standard workflows are pre-configured in .github/workflows:
- PR Checks: Linting, testing, and validation.
- Build & Publish: Docker image creation on merge.
- Release: Automated versioning and release notes.
See CI/CD Workflows for pipeline details.
Next Steps
- Create Your First Agent: Deploy to a runtime
- Starter Agent: Start with basic agent patterns
- Multi-Agent: Build agent teams
- MCP Agent: Integrate with MCP servers
- BB AI SDK: AI Gateway and observability