⚠️ IMPORTANT: All features are experimental and under active development. Use at your own risk; adapt them to your workflow. © 2026 GLG, a.s.
11. Memory Usage Patterns — Practical Examples
11.1 Browsing Records
```python
# List all memories with pagination
memories = uaml.search("", limit=20)             # first 20
memories = uaml.search("", limit=20, offset=20)  # next 20
```

Via REST, with filters (all code-related memories from last week):

```
GET /api/knowledge?topic=code&limit=50&offset=0
```

Search by content:

```python
results = uaml.search("deployment server", limit=10)
for r in results:
    print(f"[{r.confidence:.0%}] {r.topic}: {r.content[:100]}")
```
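Walking the whole store by hand with `limit`/`offset` gets repetitive, so the pattern above can be wrapped in a small generator. A minimal sketch: `paginate` is a hypothetical helper, not part of the uaml API, and it assumes the `(query, limit, offset)` search signature shown above, demonstrated here against a stand-in backend.

```python
def paginate(search_fn, query, page_size=20):
    """Yield pages of results until the backend returns a short or empty page.

    `search_fn` is any callable accepting (query, limit=..., offset=...),
    matching the uaml.search signature used above. This wrapper is a
    hypothetical convenience, not part of the uaml API.
    """
    offset = 0
    while True:
        page = search_fn(query, limit=page_size, offset=offset)
        if not page:
            break
        yield page
        if len(page) < page_size:
            break  # last (partial) page
        offset += page_size

# Demonstration with a stand-in backend of 45 fake records:
records = list(range(45))
fake_search = lambda q, limit, offset: records[offset:offset + limit]
pages = list(paginate(fake_search, "", page_size=20))  # three pages: 20 + 20 + 5
```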
11.2 Filtering for Relevance
```python
# BAD — returns everything, blows up context:
all_data = uaml.search("", limit=1000)

# GOOD — targeted query with a topic filter:
results = uaml.search("API authentication", topic="security", limit=5)

# BETTER — Focus Engine with a token budget:
context = uaml.recall(
    query="How do we handle API authentication?",
    budget_tokens=800,  # strict token limit
)
# Returns only the most relevant entries that fit within 800 tokens
```
11.3 Preventing Context Overload
The #1 problem with AI memory systems is context overload — too much memory injected into the prompt, wasting tokens and confusing the model.
Rules of thumb:
| Context need | Budget | Method |
|---|---|---|
| Quick fact lookup | 200–500 tokens | uaml.search(query, limit=3) |
| Detailed analysis | 500–1500 tokens | uaml.recall(query, budget_tokens=1000) |
| Comprehensive report | 1500–3000 tokens | uaml.recall(query, budget_tokens=2500) |
| Never exceed | 50% of context window | Always set explicit budget |
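The last row of the table can be enforced mechanically rather than by convention. A minimal sketch, assuming a hypothetical `clamp_budget` helper rather than anything built into uaml:

```python
def clamp_budget(requested_tokens: int, context_window: int) -> int:
    """Cap a recall budget at half the model's context window.

    Hypothetical helper enforcing the "never exceed 50% of context window"
    rule of thumb from the table above; not part of the uaml API.
    """
    return min(requested_tokens, context_window // 2)

# A 2,500-token report budget fits a 128K window but gets clamped on a 4K one:
print(clamp_budget(2500, 128_000))  # 2500
print(clamp_budget(2500, 4_096))    # 2048
```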
```python
# Anti-pattern: dumping everything
all_memories = uaml.search("", limit=999)  # ❌ 50K+ tokens wasted

# Correct: tiered recall
# Step 1 — quick check: do I have relevant data?
quick = uaml.search(query, limit=3)
if not quick:
    # No relevant memories — answer from current context
    pass
else:
    # Step 2 — focused recall with budget
    context = uaml.recall(query, budget_tokens=800)
```
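The two-step pattern above can be packaged as a reusable function. This is an illustrative sketch, not part of the uaml API: the search and recall callables are injected so the example can run against stand-in backends.

```python
def tiered_recall(query, search_fn, recall_fn, budget_tokens=800):
    """Two-step recall: cheap existence check, then budgeted retrieval.

    `search_fn` and `recall_fn` mirror the uaml.search/uaml.recall calls
    used above; the wrapper itself is a hypothetical sketch.
    """
    quick = search_fn(query, limit=3)
    if not quick:
        return None  # nothing relevant stored; answer from current context
    return recall_fn(query, budget_tokens=budget_tokens)

# Demonstration with stand-in backends:
store = {"deploy": "rollback steps for the deploy pipeline"}
fake_search = lambda q, limit: [store[q]] if q in store else []
fake_recall = lambda q, budget_tokens: store[q][:budget_tokens]

print(tiered_recall("deploy", fake_search, fake_recall))   # hits the store
print(tiered_recall("unknown", fake_search, fake_recall))  # None
```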
11.4 Requesting Only Relevant Data
```python
# Filter by topic — only see what matters:
code_memories = uaml.search("refactor", topic="code")
infra_memories = uaml.search("server", topic="infrastructure")

# Filter by time — recent context only:
from datetime import datetime, timedelta

recent = uaml.search(
    "deployment",
    point_in_time=(datetime.now() - timedelta(days=7)).isoformat(),
)

# Filter by confidence — only high-quality data:
config = {"min_confidence": 0.7}  # reject uncertain memories

# Combine filters via the Focus Engine:
context = uaml.recall(
    query="deployment issues this week",
    budget_tokens=600,
    preset="conservative",  # strict relevance + confidence thresholds
)
```
11.5 Memory Lifecycle Management
```python
# Store with appropriate confidence:
uaml.learn("Server IP changed to 10.0.0.5", confidence=0.95)  # fact
uaml.learn("Might need to upgrade DB", confidence=0.5)        # speculation

# Update when information changes:
old = uaml.search("server IP", limit=1)[0]
uaml.learn("Server IP is now 10.0.0.10", topic=old.topic,
           confidence=0.95)  # new entry supersedes old (freshness decay)

# Soft-delete outdated info:
uaml.forget(entry_id)  # marks as deleted, preserved in the audit trail

# Temporal validity — knowledge that expires:
uaml.learn(
    "Conference discount code: UAML2026",
    valid_from="2026-03-01",
    valid_until="2026-04-30",  # auto-excluded from recall after expiry
)
```
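The validity window above reduces to plain date comparisons. A sketch of the auto-exclusion check, assuming inclusive ISO-date bounds; the helper is hypothetical, and the real engine's semantics (inclusivity, time zones) are not specified here.

```python
from datetime import date

def is_currently_valid(valid_from: str, valid_until: str, today: date) -> bool:
    """Check whether `today` falls inside an entry's validity window.

    Hypothetical sketch of the auto-exclusion rule described above;
    bounds are treated as inclusive ISO dates.
    """
    return date.fromisoformat(valid_from) <= today <= date.fromisoformat(valid_until)

# The discount code above is valid in mid-March but not in May:
print(is_currently_valid("2026-03-01", "2026-04-30", date(2026, 3, 15)))  # True
print(is_currently_valid("2026-03-01", "2026-04-30", date(2026, 5, 1)))   # False
```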