Tool selection
Enable only the tools your agent genuinely needs. Start with 2 to 3 essential operations (e.g., getCustomerProfile, searchKnowledgeBase) and expand as use cases evolve. Automated validation runs faster for focused tool sets, so enabling 20 tools upfront means longer validation time. You can always enable additional tools later through the admin portal as your agent’s capabilities grow.
Match tools to your agent’s specific purpose. A customer support agent needs account lookup and order history - it doesn’t need payment processing or admin operations. A lending advisor needs credit checks and loan calculations - it doesn’t need transaction search. Narrow tool scopes accelerate automated validation and minimize the blast radius if credentials leak.
Avoid redundancy. If two backend APIs provide similar functionality (e.g., GET /customers/{id} and GET /users/{id} both return customer data), enable only one through the portal. Multiple overlapping tools confuse agents during tool selection and waste rate limit capacity on duplicate functionality.
Configure minimal permissions for each tool. Read-only access passes automated validation faster than write access. User-scoped tools (limited to authenticated user’s data) are safer than admin tools (access to all customers). Be explicit when configuring tool access: “This agent needs read-only, user-scoped access to customer profiles for personalization.”
Enabling new tools
When enabling MCP exposure for an API operation through the admin portal, provide clear configuration details that help automated validation understand the security implications. Well-configured tools pass validation in 1 to 3 days for low-risk operations; incomplete configurations may require manual review, extending to 3 days for high-risk operations. A good configuration is explicit about scope and purpose, for example: “Expose GET /customers/{id} as getCustomerProfile with read-only, user-scoped access; used to personalize support conversations, returns name and email only.”
AI agent design
Well-designed agents follow predictable patterns that improve reliability and user experience. Here’s the workflow that works in production: Call tools/list once at startup to discover available tools. Cache the result for the agent’s lifetime - tool definitions don’t change mid-session. This avoids wasting rate limit quota on redundant discovery calls. During startup, parse tool descriptions and parameter schemas so the agent understands what each tool does and what inputs it requires.
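A minimal sketch of the discover-once-and-cache pattern, assuming a client object that exposes list_tools() (the class and method names are illustrative, not the platform’s actual SDK):

```python
# Sketch: discover tools once at startup and cache them for the agent's lifetime.
# The mcp_client object and its list_tools() method are illustrative names.
class Agent:
    def __init__(self, mcp_client):
        self.mcp = mcp_client
        # One tools/list call per agent lifetime; definitions don't change mid-session.
        self.tools = {tool["name"]: tool for tool in self.mcp.list_tools()}

    def tool_summaries(self):
        """Names, descriptions, and input schemas for the planning prompt."""
        return [
            {
                "name": name,
                "description": tool.get("description", ""),
                "inputSchema": tool.get("inputSchema", {}),
            }
            for name, tool in self.tools.items()
        ]
```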
Match tool descriptions to user intent. When a user asks “What’s my account balance?”, the agent should recognize that getAccountBalance (description: “Retrieve current balance for a bank account”) is the right tool. Train your agent to read tool metadata and choose appropriately. Generic instructions like “You have access to tools, use them when needed” lead to agents that guess randomly or ask users which tool to invoke.
Validate parameters before invocation. Check required fields, data types, and format patterns from the tool’s inputSchema before calling tools/call. If the schema requires accountId matching pattern ^ACC-[0-9]{6}$, and the user provides “12345”, catch that error in agent logic rather than firing a doomed API call. Better user experience: “That doesn’t look like a valid account ID - they usually start with ACC- followed by 6 digits.”
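A sketch of that pre-flight check, assuming the tool definition carries a JSON-Schema-style inputSchema as described above (the accountId example mirrors the text; the helper itself is illustrative):

```python
import re

# Sketch: check required fields and format patterns from the tool's inputSchema
# before invoking tools/call.
def validate_args(tool, args):
    schema = tool.get("inputSchema", {})
    errors = []
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"Missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        pattern = spec.get("pattern")
        if field in args and pattern and not re.search(pattern, str(args[field])):
            errors.append(f"{field} doesn't match the expected format ({pattern})")
    return errors

# The user typed "12345" where an ACC-style ID is required:
tool = {"inputSchema": {"required": ["accountId"],
                        "properties": {"accountId": {"type": "string",
                                                     "pattern": "^ACC-[0-9]{6}$"}}}}
print(validate_args(tool, {"accountId": "12345"}))
# -> ["accountId doesn't match the expected format (^ACC-[0-9]{6}$)"]
```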
Handle tool failures gracefully. Don’t let 401 errors or rate limits crash the conversation. Implement fallback behavior: retry with exponential backoff for transient errors, ask users to re-authenticate for auth failures, apologize and offer human escalation for persistent problems. Never fabricate data if a tool call fails - tell users the truth about what went wrong.
Prompt engineering for tools
Effective prompts explicitly connect tools to use cases so agents know when to invoke each one:
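A sketch of such a prompt; the tool names follow the examples used on this page and the wording is illustrative, not a prescribed template:

```python
# Illustrative system prompt that maps tools to user intents.
SYSTEM_PROMPT = """You are a customer support agent with access to these tools:

- getCustomerProfile: retrieve the authenticated customer's profile (name, email,
  preferences). Use it to personalize responses or confirm account details.
- getOrderHistory: list the customer's recent orders. Use it for questions like
  "Where is my order?" or "Have I returned anything this year?"
- searchKnowledgeBase: look up product documentation and policies that are not
  user-specific.

Rules:
- Answer general policy questions from your own knowledge; don't call tools for them.
- Pick the tool whose description matches the user's intent; never guess randomly.
- If a tool call fails, explain the problem plainly and offer human escalation.
"""
```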
Rate limit awareness
Agent design directly impacts how quickly you hit rate limits. Cache tool results in conversation context to avoid redundant calls - if you fetch a customer profile at the start of a conversation, store it and reference the cached data instead of calling getCustomerProfile five times. Batch operations when possible - if the backend offers a listAccountBalances (plural) endpoint that accepts multiple account IDs, use that instead of sequential getAccountBalance calls. Implement exponential backoff when you hit 429 errors: wait 1s, 2s, 4s, 8s between retries rather than hammering the API.
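A sketch of the backoff loop, assuming a call_tool() helper and a RateLimitError raised on 429 responses (both names are illustrative):

```python
import time

class RateLimitError(Exception):
    """Illustrative exception raised when the platform returns HTTP 429."""

# Sketch: exponential backoff around a tool call.
def call_with_backoff(call_tool, name, args, delays=(1, 2, 4, 8)):
    for delay in delays:
        try:
            return call_tool(name, args)
        except RateLimitError:
            time.sleep(delay)  # wait 1s, 2s, 4s, 8s between retries
    raise RuntimeError(f"{name} is still rate limited after {len(delays)} retries")
```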
Here’s what good caching looks like in practice:
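A minimal sketch of the pattern - the Conversation wrapper and call_tool() helper stand in for your own agent code, not a platform SDK:

```python
# Sketch: per-conversation caching of tool results.
class Conversation:
    def __init__(self, call_tool, customer_id):
        self.call_tool = call_tool
        self.customer_id = customer_id
        self._cache = {}

    def customer_profile(self):
        # First access invokes getCustomerProfile; later accesses reuse the cached result.
        if "profile" not in self._cache:
            self._cache["profile"] = self.call_tool(
                "getCustomerProfile", {"customerId": self.customer_id}
            )
        return self._cache["profile"]
```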
Security best practices
Security failures with AI agents are often subtle - accidentally logging PII, invoking tools without proper user authentication, or leaking API keys in error messages. Follow these patterns to avoid common pitfalls.
Protect sensitive data
Never log tool responses without data governance approval. Tool responses may contain PII (names, emails, account numbers) or sensitive business data (transaction amounts, credit scores). Logging “customer profile lookup succeeded” is fine. Logging the actual profile JSON ({"name": "Jane Doe", "ssn": "123-45-6789", ...}) violates privacy policies in most jurisdictions. If you need audit logs, log request metadata (tool name, customer ID, timestamp, success/failure) rather than response payloads.
Validate user context before calling user-scoped tools. Don’t accept unauthenticated user input like “show me profile for customer 12345” and directly call getCustomerProfile(id=12345). The attacker just tricked your agent into exposing someone else’s data. Instead, extract the customer ID from the authenticated JWT token and use that: getCustomerProfile(id=jwt.claims.customerId). If users need to look up other customers (e.g., support agents helping end users), verify the support agent has appropriate permissions via role checks.
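A sketch of taking the customer ID from the verified token rather than the chat message, assuming PyJWT for verification; the claim name customerId and the RS256 algorithm are illustrative assumptions:

```python
import jwt  # PyJWT

# Sketch: derive the customer ID from the verified JWT, never from user input.
def get_profile_for_authenticated_user(call_tool, bearer_token, signing_key):
    claims = jwt.decode(bearer_token, signing_key, algorithms=["RS256"])
    return call_tool("getCustomerProfile", {"customerId": claims["customerId"]})
```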
Remember that Grand Central logs all tool invocations for audit purposes. Your agent’s actions are traceable: who called what tool, when, with what parameters, and whether it succeeded. Don’t use tools for purposes outside their approved use case - invoking getCustomerProfile to scrape customer data for marketing analytics will get flagged in audit reviews. Stick to the use case you justified in your MCP access request.
API key management
Store API keys in secure secret management systems (Azure Key Vault, AWS Secrets Manager, HashiCorp Vault) rather than configuration files. Rotate keys regularly (every 90 days, or immediately if compromise is suspected): request a new key, update agent config, test with the new key, revoke the old key. Use different API keys for dev/staging/production environments - if a dev key leaks, production isn’t compromised. Monitor Grand Central’s usage dashboard for unexpected activity (midnight API calls when your agent should be idle, calls from unexpected IP ranges).
Never hardcode API keys in source code, even in private repositories - git history is permanent, and repos get forked or made public accidentally. Don’t share API keys between different applications - if one app is compromised, all apps using that key are exposed. Don’t log or print API keys in application output (error messages, debug logs, dashboards). Don’t store keys in version control, even in private repositories protected by access controls.
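A sketch of loading the key from a secret manager at startup, using the Azure Key Vault SDK as one example; the vault URL and secret name are illustrative, and the same pattern applies to AWS Secrets Manager or HashiCorp Vault:

```python
# Sketch: load the API key from Azure Key Vault instead of a config file.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def load_api_key(vault_url: str, secret_name: str = "grand-central-api-key") -> str:
    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    return client.get_secret(secret_name).value  # never log or print this value
```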
Error handling
Implement graceful degradation
Your AI agent should handle tool failures without breaking the user experience. The table below summarizes common scenarios, and a sketch translating them into agent responses follows it.
Common error scenarios
| Error Type | User Experience | Agent Response |
|---|---|---|
| Rate limit exceeded | Temporary delay | “I’m processing many requests. Please wait 30 seconds.” |
| Authentication failed | Configuration issue | Contact your administrator |
| Tool not found | Feature unavailable | “That feature is currently unavailable.” |
| Invalid parameters | Agent mistake | Retry with corrected parameters |
| Timeout | Backend slow/down | “This is taking longer than expected. Let me try again.” |
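A sketch of how an agent might translate these failures into the responses above; the status codes follow HTTP and JSON-RPC conventions and the wording is illustrative:

```python
# Sketch: map failed tool calls to the user-facing behavior in the table above.
def user_message_for(status):
    if status == 429:        # rate limit exceeded
        return "I'm processing many requests. Please wait 30 seconds."
    if status == 401:        # authentication failed - a configuration issue
        return "I can't access that system right now. Please contact your administrator."
    if status == 404:        # tool not found or disabled
        return "That feature is currently unavailable."
    if status == -32602:     # invalid parameters - correct them and retry, no message needed
        return None
    return "This is taking longer than expected. Let me try again."  # timeout / backend errors
```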
Performance optimization
Agent performance directly affects user satisfaction. Slow agents feel broken even if they’re technically working. Optimize aggressively: Call tools/list once at startup and cache the result - don’t waste latency and rate limit quota discovering tools on every request. Tool definitions don’t change mid-session. Cache them for the agent’s lifetime (or until you detect a deployment that adds new tools).
Store tool results in conversation context to avoid redundant calls. If a user asks “What’s my name?” and you call getCustomerProfile, cache that result. When they later ask “What’s my email?”, reference the cached profile instead of calling the tool again. This pattern reduces API calls by 60-70% in typical conversations.
Use batch endpoints when processing collections. If you need balances for 5 accounts, check whether the backend offers listAccountBalances (plural) that accepts an array of IDs. One batched call (200ms) beats five sequential calls (5x150ms = 750ms). Not all tools support batching, but check during discovery - array-type parameters often indicate batch support.
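A sketch of preferring the batched endpoint, assuming a call_tool() helper; listAccountBalances follows the example in the text:

```python
# Sketch: one batched call instead of five sequential ones.
def fetch_balances(call_tool, account_ids):
    # ~200ms for one round trip vs ~750ms for five sequential getAccountBalance calls
    return call_tool("listAccountBalances", {"accountIds": account_ids})
```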
Response time expectations
Different operations have different performance profiles. Set user expectations appropriately:
- Tool discovery (tools/list): under 500ms. Fast, should happen invisibly at startup.
- Read operations (getCustomer, searchTransactions): 1 to 3 seconds. Moderate - users tolerate brief waits for data retrieval.
- Write operations (createPayment, updateAccount): 2 to 5 seconds. Slower due to validation, database writes, and audit logging.
- Complex operations (generateMonthlyReport, calculateCreditScore): 10 to 30+ seconds. Very slow - backend processing, aggregations, third-party API calls.
For slow operations, provide feedback so users don’t think the agent is frozen:
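A sketch of that feedback, assuming send_message() and call_tool() hooks into your agent framework (both names are illustrative):

```python
# Sketch: warn the user before a known-slow tool call so the agent doesn't appear frozen.
SLOW_TOOL_NOTICES = {
    "generateMonthlyReport": "Generating your report - this usually takes about 30 seconds...",
    "calculateCreditScore": "Calculating your score - this can take up to a minute...",
}

def call_with_feedback(call_tool, send_message, name, args):
    if name in SLOW_TOOL_NOTICES:
        send_message(SLOW_TOOL_NOTICES[name])  # set expectations before the long wait
    return call_tool(name, args)
```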
Testing your agent
Before deploying to production, test failure scenarios to verify your agent degrades gracefully. Production environments are hostile - rate limits trigger, authentication expires, backends time out. Agents that handle these conditions gracefully provide better user experiences than agents optimized only for the happy path.
Rate limit testing: Fire 150 requests rapidly to trigger 429 errors (typical limit: 100/minute). Does your agent implement exponential backoff? Does it inform users about delays? Or does it crash with an unhandled exception?
Authentication failure testing: Revoke your API key (or use an invalid key) and attempt tool invocations. Does your agent detect 401 errors and prompt users to re-authenticate? Or does it retry indefinitely, burning CPU and confusing users?
Tool unavailability testing: Simulate Grand Central platform outages by blocking network access to the MCP endpoint. Does your agent fall back to knowledge-only responses? Does it offer escalation to human support? Or does it show cryptic connection errors?
Invalid parameter testing: Call tools with malformed data (wrong data types, missing required fields, format pattern violations). Does your agent parse validation errors and ask users for corrected input? Or does it expose raw JSON-RPC error codes?
Here’s a smoke test script to run before every deployment:
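A sketch of such a script - the agent object, its methods, the tool names, and the test customer ID are illustrative stand-ins for your own code and fixtures:

```python
import sys

# Sketch: pre-deployment smoke test covering discovery, happy path, and error handling.
def smoke_test(agent):
    failures = []

    # 1. Discovery: the expected tools are enabled and described.
    tools = {t["name"] for t in agent.list_tools()}
    for required in ("getCustomerProfile", "searchKnowledgeBase"):
        if required not in tools:
            failures.append(f"missing tool: {required}")

    # 2. Happy path: a read-only call returns data for a known test customer.
    try:
        agent.call_tool("getCustomerProfile", {"customerId": "TEST-00001"})
    except Exception as exc:
        failures.append(f"happy-path call failed: {exc}")

    # 3. Invalid parameters surface as friendly messages, not raw JSON-RPC codes.
    reply = agent.respond("Show me the balance for account 12345")
    if "-32602" in reply:
        failures.append("raw JSON-RPC error leaked to the user")

    # 4. Bad credentials prompt re-authentication instead of retrying forever.
    reply = agent.with_api_key("invalid-key").respond("What's my account balance?")
    if "administrator" not in reply.lower() and "sign in" not in reply.lower():
        failures.append("401 handling did not surface an authentication problem")

    return failures

if __name__ == "__main__":
    failures = smoke_test(build_agent())  # build_agent() is your own illustrative factory
    print("\n".join(failures) or "smoke test passed")
    sys.exit(1 if failures else 0)
```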
Monitoring and observability
Production agents need observability to detect problems before users complain. Track metrics from both your agent’s perspective (client-side telemetry) and Grand Central’s dashboard (server-side platform metrics).
Client-side metrics (instrument your agent code - see the sketch after this list):
- Tool call success rate: % of invocations that return results vs errors. Target: >95%.
- Average response time: P50/P95/P99 latency for tool invocations. Watch for degradation trends.
- Rate limit hit frequency: How often you hit 429 errors. If greater than 5%, adjust limits through admin portal.
- Tool usage distribution: Which tools get invoked most frequently. Informs caching strategy.
- Error type breakdown: Authentication (401), validation (-32602), backend (5xx). Helps prioritize fixes.
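A minimal instrumentation sketch for the client-side metrics above; the ToolError exception carrying a status code and the call_tool() helper are illustrative assumptions:

```python
import time
from collections import Counter

class ToolError(Exception):
    """Illustrative exception type carrying the failing status code."""
    def __init__(self, status):
        super().__init__(f"tool call failed with status {status}")
        self.status = status

# Sketch: wrap every tool call to collect counts, error types, and latencies.
class ToolMetrics:
    def __init__(self):
        self.calls = Counter()     # per-tool invocation counts (usage distribution)
        self.errors = Counter()    # error types: 401, 429, -32602, 5xx, ...
        self.latencies_ms = []     # feed into P50/P95/P99 and success-rate reports

    def record(self, call_tool, name, args):
        start = time.monotonic()
        try:
            return call_tool(name, args)
        except ToolError as exc:
            self.errors[exc.status] += 1
            raise
        finally:
            self.calls[name] += 1
            self.latencies_ms.append((time.monotonic() - start) * 1000)
```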
Server-side metrics (Grand Central dashboard):
- Total tool invocations: Overall request volume, trends over time.
- Rate limit consumption: % of limit used. Alert at 90% to avoid hitting hard stops.
- Cost per tool: If your subscription has usage-based pricing.
- Authentication failures: Spike indicates credential issues or attacks.
Dashboard access is automatically configured when you enable MCP through the admin portal. Dashboards show aggregated metrics across all agents using your subscription key.
Alert configuration
Set up alerts that trigger before problems impact users.
Critical alerts (page on-call immediately):
- Authentication failures greater than 5%: Indicates API key expired, revoked, or misconfigured. Check secret management system.
- Tool error rate greater than 5%: Backend APIs are failing. Check Grand Central status page and system status in admin portal.
- Rate limit hits greater than 10%: You’re hitting limits frequently. Implement better caching or adjust limits through admin portal.
- P95 response time greater than 5s: Performance degrading. Check backend API status, review agent caching strategy.
- Rate limit consumption greater than 90%: Approaching limit. Monitor closely and prepare to adjust through portal if sustained.
Production readiness checklist
Before deploying your agent to production, verify these requirements:
Security:
- API keys stored in secret management system (Azure Key Vault, AWS Secrets Manager, not code)
- Separate API keys for dev/staging/production environments
- Rate limit handling implemented (exponential backoff, user feedback)
- Authentication failure handling (prompt re-auth, don’t crash)
- No sensitive data logged (log metadata, not response payloads)
- Tool failure graceful degradation tested (what happens when tools fail?)
- Retry logic with exponential backoff (1s, 2s, 4s, 8s delays)
- Timeout handling for slow operations (>30s operations have user feedback)
- Circuit breaker for repeated failures (stop hammering failing APIs)
- Fallback behavior documented (offer human escalation when tools unavailable)
- Tool discovery cached at startup (don’t call tools/list on every request)
- Conversation context stores tool results (avoid redundant calls within conversation)
- Redundant tool calls eliminated (profile caching, batch operations)
- Response time expectations set for users (“Generating report, ~30s…”)
- Internal metrics collection enabled (success rate, latency, error types)
- Access to Grand Central dashboard requested and granted
- Alerts configured for authentication failures, rate limits, error rates
- Runbook documented for common issues (see “Common Issues” section below)
Common pitfalls
Overusing tools wastes rate limit quota and slows response times. Don’t call searchKnowledgeBase when the user asks “What’s your return policy?” - that’s static information the agent should know from training data. Save tool invocations for dynamic, user-specific data: “Have I returned anything this year?” requires getOrderHistory, but general policy questions don’t. Train agents to distinguish knowledge questions (answer immediately) from data retrieval questions (invoke tools).
Ignoring tool descriptions leads to agents using the wrong tools. If your agent doesn’t read the description for searchOrders (“Returns order history for the last 90 days only”), it might invoke that tool when users ask “Show me my orders from 2020” - which will return zero results even though the backend has older data available. Include tool descriptions in agent prompts and train the agent to match descriptions to user intent.
Exposing technical errors to users creates terrible UX. Never show messages like Error -32602: Invalid parameter 'customerId' must match regex ^[0-9]{5}$ to end users. Translate technical errors to plain language: “I couldn’t find that account. Please check the account number and try again.” Parse error codes in agent logic and provide context-appropriate responses.