The Learning & Memory System enables your AI assistants to remember information across calls, personalize conversations, and continuously improve through evaluation and prompt optimization.

Overview

Burki’s Learning & Memory System provides:

Persistent Memory

Store and recall facts about callers, preferences, and past interactions across multiple calls.

Prompt Versioning

Manage system prompts with version control, A/B testing, and gradual rollouts.

Evaluation System

Test prompt changes against datasets before deploying to production.

Privacy-First Design

GDPR-compliant with caller opt-out, PII redaction, and soft deletion.

Memory Types

Memories are categorized by what kind of information they represent:
| Type | Description | Example |
|------|-------------|---------|
| Semantic | General facts and knowledge | “Customer prefers email contact” |
| Episodic | Specific events and interactions | “Discussed order #123 on Jan 5” |
| Procedural | How-to knowledge and processes | “To escalate, transfer to ext 1234” |
Semantic memories are best for persistent preferences. Episodic memories capture conversation history. Procedural memories help the AI follow specific workflows.

Memory Scopes

Control where memories apply with scoping:
| Scope | Applies To | Use Case |
|-------|-----------|----------|
| Organization | All calls in your organization | Company policies, business hours |
| Assistant | One specific assistant | Assistant-specific knowledge |
| Location | Store or branch | Location-specific info (hours, address) |
| Caller | Individual customer | Personal preferences, history |

Scope Hierarchy

When retrieving memories, the system searches from most specific to most general:
Caller → Location → Assistant → Organization
This allows organization-wide defaults to be overridden at more specific levels.
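As a minimal sketch of this override behavior (the function and field names are illustrative, not Burki’s actual API), more specific scopes can simply overwrite more general ones for the same memory key:

```python
# Illustrative sketch: resolve scoped memories so that more specific
# scopes (caller) override more general ones (organization).
SCOPE_ORDER = ["caller", "location", "assistant", "organization"]

def resolve_memories(memories):
    """memories: list of dicts with 'scope', 'key', 'value'.
    Returns one value per key, preferring the most specific scope."""
    resolved = {}
    # Apply general scopes first; specific scopes overwrite them.
    for scope in reversed(SCOPE_ORDER):
        for m in memories:
            if m["scope"] == scope:
                resolved[m["key"]] = m["value"]
    return resolved
```

A caller-level value for a key thus shadows the organization-wide default, while keys with no specific override fall through to the general value.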

Memory Features

Caller phone numbers are hashed before storage using SHA-256. The original phone number is never stored with memories, protecting caller privacy while still enabling personalization.
+1234567890 → a3f2b8c9d4e5f6...
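A sketch of this hashing step (real deployments typically also normalize the number and may add a salt; this shows only the core idea):

```python
import hashlib

def hash_caller_id(phone: str) -> str:
    """Hash a caller's phone number with SHA-256 so the raw
    number is never stored alongside memories."""
    return hashlib.sha256(phone.encode("utf-8")).hexdigest()
```

Because the hash is deterministic, the same caller maps to the same key on every call, which is what makes personalization possible without storing the number itself.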
Each memory has a confidence score from 0 to 1:
| Score | Meaning |
|-------|---------|
| 0.9 - 1.0 | High confidence (explicit statement) |
| 0.7 - 0.9 | Medium confidence (inferred) |
| 0.5 - 0.7 | Low confidence (uncertain) |
| < 0.5 | Very low (speculation) |
Lower confidence memories may be excluded from context or presented with caveats.
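A sketch of how such filtering could work (thresholds follow the table above; the function and field names are illustrative):

```python
def filter_by_confidence(memories, threshold=0.5):
    """Drop memories below the threshold; attach a caveat to
    anything below the high-confidence band (0.9)."""
    kept = []
    for m in memories:
        if m["confidence"] < threshold:
            continue  # excluded from context entirely
        entry = dict(m)
        if m["confidence"] < 0.9:
            entry["caveat"] = "inferred, may be inaccurate"
        kept.append(entry)
    return kept
```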
Set automatic expiration for memories:
  • Short-term: 24 hours (temporary preferences)
  • Medium-term: 30 days (campaign-related)
  • Long-term: 1 year (customer preferences)
  • Permanent: Never expires (critical facts)
Expired memories are automatically soft-deleted.
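The TTL classes above can be sketched as a simple expiry check (class names and the function are illustrative, not Burki’s schema):

```python
from datetime import datetime, timedelta, timezone

# TTL classes from the docs: None means the memory never expires.
TTL = {
    "short": timedelta(hours=24),
    "medium": timedelta(days=30),
    "long": timedelta(days=365),
    "permanent": None,
}

def is_expired(created_at, ttl_class, now=None):
    """Return True if a memory created at `created_at` has outlived
    its TTL class and should be soft-deleted."""
    now = now or datetime.now(timezone.utc)
    ttl = TTL[ttl_class]
    return ttl is not None and now - created_at > ttl
```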
Memories are embedded as vectors for semantic search. When the AI needs context, it searches for memories semantically related to the current conversation, not just keyword matches.
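At its core, this retrieval is a nearest-neighbor search over embedding vectors. A toy sketch with cosine similarity (production systems use a vector index and a real embedding model; the vectors here are stand-ins):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, memories, k=3):
    """memories: list of (text, embedding) pairs.
    Return the k texts most similar to the query vector."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

Because similarity is computed in embedding space, “contact preferences” can retrieve “prefers email” even though the two share no keywords.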

Memory Graph

Memories are connected in a graph structure with relationship types:
| Relationship | Description | Example |
|--------------|-------------|---------|
| Temporal | Happened before/after | “Order placed” → “Order shipped” |
| Causal | Caused by | “Complaint filed” → “Refund issued” |
| Resolution | Resolved by | “Issue reported” → “Issue resolved” |
| Semantic | Related to | “Prefers email” ↔ “Contact preferences” |
| Conflict | Contradicts | “Wants refund” vs “Satisfied with resolution” |
| Supersedes | Replaces | “New address” supersedes “Old address” |
The graph enables context expansion—when retrieving memories, related memories are included for fuller context.
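Context expansion amounts to a bounded graph traversal from the memories that matched the search. A sketch (edge representation and names are illustrative):

```python
def expand_context(seed_ids, edges, hops=1):
    """edges: list of (src, relation, dst) tuples.
    Return the seed memory ids plus all neighbors reachable
    within `hops` relationship steps, in either direction."""
    context = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(hops):
        nxt = set()
        for src, _rel, dst in edges:
            if src in frontier:
                nxt.add(dst)
            if dst in frontier:
                nxt.add(src)
        frontier = nxt - context  # only newly discovered nodes
        context |= frontier
    return context
```

Retrieving “Complaint filed” with one hop of expansion would then also pull in the causally linked “Refund issued”.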

Caller Privacy

Callers can opt out of memory storage:
  1. Verbal request: If a caller says “don’t remember this” or similar, the AI should respect the request
  2. API opt-out: Mark a caller as opted-out via API
  3. Dashboard: Manage opted-out callers in the Learning dashboard
Opted-out callers are handled as follows:
  • No new memories are created
  • Existing memories are soft-deleted
  • Only session-level context is maintained (within a single call)
The memory system supports GDPR data subject rights:
| Right | Implementation |
|-------|----------------|
| Access | Export all memories for a caller |
| Rectification | Edit or correct memories |
| Erasure | Soft-delete all caller memories |
| Portability | Export in standard JSON format |
Use the Caller Privacy Manager in the dashboard or API to handle these requests.
Configure which types of facts assistants can store:
{
  "memory_write_policy": {
    "allowed_types": ["preferences", "contact_info", "order_history"],
    "blocked_types": ["health_info", "financial_details"],
    "require_explicit_consent": true
  }
}
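A sketch of how such a policy could be enforced at write time (the field names mirror the JSON above; the function itself is illustrative):

```python
def may_store(fact_type, has_consent, policy):
    """Check a candidate fact against the memory write policy
    before it is persisted."""
    if fact_type in policy["blocked_types"]:
        return False  # never store blocked categories
    if fact_type not in policy["allowed_types"]:
        return False  # allow-list: unknown types are rejected
    if policy["require_explicit_consent"] and not has_consent:
        return False
    return True
```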

PII Redaction

Personally identifiable information is automatically detected and redacted before storage.

Detected PII Types

| Type | Pattern | Replacement |
|------|---------|-------------|
| Phone numbers | US and international | [PHONE] |
| Email addresses | Standard email format | [EMAIL] |
| SSN | 9-digit US format | [SSN] |
| Credit cards | Visa, MC, Amex, Discover | [CREDIT_CARD] |
| Addresses | Street addresses | [ADDRESS] |
| Dates of birth | Date patterns with DOB context | [DOB] |
| IP addresses | IPv4 and IPv6 | [IP_ADDRESS] |
PII redaction is conservative—it over-redacts rather than risk missing sensitive data. Configure which patterns to detect based on your compliance requirements.
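A simplified sketch of pattern-based redaction for three of the types above (these regexes are deliberately broad, in the spirit of over-redacting, and are not Burki’s actual patterns):

```python
import re

# Order matters: the more specific SSN pattern runs before the
# broad phone pattern, which would otherwise swallow it.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s().-]{8,}\d"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Replace detected PII spans with placeholder tokens."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text
```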

Prompt Versioning

Manage system prompts with full version control and safe rollout capabilities.

Lifecycle Stages

| Stage | Description |
|-------|-------------|
| Draft | Being edited, not in use |
| Testing | Running through evaluation harness |
| Approved | Human approval obtained |
| Canary | Serving 5-10% of traffic |
| Shadow | Parallel execution (logged but no impact) |
| Production | Serving 100% of traffic |
| Retired | No longer in use |
| Rejected | Failed evaluation |

Creating a New Version

  1. Go to Learning > Prompt Versions for your assistant
  2. Click New Version
  3. Edit the system prompt
  4. Add version notes explaining changes
  5. Save as Draft
  6. Run evaluation to move to Testing

Canary Rollouts

Gradually roll out prompt changes to reduce risk:
  1. Start Canary: Deploy to 5% of traffic
  2. Monitor Metrics: Watch success rate, call duration, customer satisfaction
  3. Increase Traffic: Gradually increase to 25%, 50%, 75%
  4. Promote or Rollback: Move to production or revert if issues arise
Always run evaluations before starting a canary. Never promote directly from draft to production.

Evaluation System

Test prompt changes against curated datasets before deployment.

Evaluation Datasets

Create datasets of test cases representing expected conversations:
{
  "name": "Customer Support Golden Set",
  "dataset_type": "golden",
  "cases": [
    {
      "case_name": "Refund Request",
      "conversation": [
        {"role": "user", "content": "I want a refund for order #123"},
        {"role": "assistant", "content": "I'd be happy to help..."}
      ],
      "expected_response": {
        "should_contain": ["refund", "process"],
        "intent": "refund_request",
        "should_call_tool": "check_order_status"
      }
    }
  ]
}

Dataset Types

| Type | Purpose |
|------|---------|
| Golden | Curated high-quality examples for comprehensive testing |
| Regression | Cases that previously failed (prevent regressions) |
| Synthetic | AI-generated cases for broader coverage |

Running Evaluations

curl -X POST "https://api.burki.dev/api/learning/assistants/{assistant_id}/eval/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_version_id": 42,
    "dataset_id": 1,
    "pass_threshold": 0.8
  }'

Evaluation Metrics

| Metric | Description |
|--------|-------------|
| Keyword Matching | Required words present in response |
| Intent Accuracy | Correct intent detected |
| Tool Call Accuracy | Correct tools called with correct parameters |
| Fluency Score | Response quality and naturalness |
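The keyword-matching metric and the `pass_threshold` from the run request can be sketched together (the `should_contain` field matches the dataset format above; the scoring functions themselves are illustrative):

```python
def score_case(response: str, expected: dict) -> float:
    """Keyword matching: fraction of required keywords present
    in the response (case-insensitive)."""
    required = expected.get("should_contain", [])
    if not required:
        return 1.0
    hits = sum(1 for kw in required if kw.lower() in response.lower())
    return hits / len(required)

def run_eval(cases, pass_threshold=0.8):
    """Average per-case scores and compare against the threshold."""
    scores = [score_case(c["response"], c["expected"]) for c in cases]
    avg = sum(scores) / len(scores)
    return {"average": avg, "passed": avg >= pass_threshold}
```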

Dashboard UI

The Learning Dashboard provides visual management of all features:
Memory Browser
  • Search and filter stored memories
  • Filter by type (semantic, episodic, procedural)
  • Filter by scope (organization, assistant, caller)
  • View confidence scores and TTL
  • Edit or delete individual memories

Prompt Versions
  • Visual timeline of all prompt versions
  • Compare versions side-by-side
  • See evaluation results for each version
  • One-click promote/demote/rollback
  • View rollout percentages

Eval Datasets
  • Create and manage test datasets
  • Add cases from real call transcripts
  • Run evaluations on demand
  • View detailed results and scores
  • Export datasets for sharing

Caller Privacy
  • View opted-out callers
  • Add manual opt-outs
  • Export caller data (GDPR access requests)
  • Delete caller data (GDPR erasure requests)

API Reference

Memory Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/learning/memories | List memories with filters |
| GET | /api/learning/memories/{id} | Get a specific memory |
| POST | /api/learning/memories | Create a memory manually |
| PUT | /api/learning/memories/{id} | Update a memory |
| DELETE | /api/learning/memories/{id} | Soft-delete a memory |
| POST | /api/learning/memories/search | Semantic search |
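As an illustration of calling the search endpoint, a helper can assemble the URL and JSON body (the base URL comes from the curl example earlier in this page; the payload field names are assumptions, not a documented schema):

```python
import json
from urllib.parse import urljoin

BASE_URL = "https://api.burki.dev"

def build_search_request(query, scope=None, limit=5):
    """Assemble the URL and JSON body for a semantic memory search.
    Payload fields ('query', 'scope', 'limit') are illustrative."""
    url = urljoin(BASE_URL, "/api/learning/memories/search")
    body = {"query": query, "limit": limit}
    if scope:
        body["scope"] = scope
    return url, json.dumps(body)
```

The resulting URL and body would be sent as a POST with the same `Authorization: Bearer` header shown in the evaluation example.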

Prompt Version Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/learning/assistants/{assistant_id}/prompts | List prompt versions |
| POST | /api/learning/assistants/{assistant_id}/prompts | Create new version |
| POST | /api/learning/assistants/{assistant_id}/prompts/{version_id}/approve | Approve version |
| POST | /api/learning/assistants/{assistant_id}/prompts/{version_id}/start-canary | Start canary rollout |
| POST | /api/learning/assistants/{assistant_id}/prompts/{version_id}/promote | Promote to production |
| POST | /api/learning/assistants/{assistant_id}/prompts/{version_id}/rollback | Rollback version |

Eval Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/learning/assistants/{assistant_id}/eval-datasets | List datasets |
| POST | /api/learning/assistants/{assistant_id}/eval-datasets | Create dataset |
| POST | /api/learning/assistants/{assistant_id}/eval-datasets/{dataset_id}/cases | Add test case |
| POST | /api/learning/assistants/{assistant_id}/eval/run | Run evaluation |

Best Practices

  • Start with organization memories: Add common facts (business hours, policies) at the org level
  • Use appropriate TTL: Don’t store temporary information permanently
  • Test before deploying: Always run evaluations before canary rollouts
  • Monitor canary metrics: Watch for degradation in success rates or call duration
  • Review memories regularly: Audit stored memories for accuracy and relevance
  • Respect privacy: Honor opt-out requests promptly and completely
The Learning & Memory System is most powerful when combined: store memories about what works, evaluate prompt changes against real scenarios, and gradually roll out improvements with confidence.