⚠️ IMPORTANT: All features are experimental and under active development. Use at your own risk; adapt them to your workflow. © 2026 GLG, a.s.
11. Memory Usage Patterns — Practical Examples
11.1 Browsing Records
```python
# List all memories with pagination
memories = uaml.search("", limit=20)             # first 20
memories = uaml.search("", limit=20, offset=20)  # next 20
```

Via REST, with filters (all code-related memories from last week):

```
GET /api/knowledge?topic=code&limit=50&offset=0
```

Search by content:

```python
results = uaml.search("deployment server", limit=10)
for r in results:
    print(f"[{r.confidence:.0%}] {r.topic}: {r.content[:100]}")
```
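Walking the whole store by hand with `limit`/`offset` gets repetitive, so the pattern above can be wrapped in a small generator. A minimal sketch: `paginate` is a hypothetical helper, not part of the uaml API, and it assumes the `(query, limit, offset)` search signature shown above, demonstrated here against a stand-in backend.

```python
def paginate(search_fn, query, page_size=20):
    """Yield pages of results until the backend returns a short or empty page.

    `search_fn` is any callable accepting (query, limit=..., offset=...),
    matching the uaml.search signature used above. This wrapper is a
    hypothetical convenience, not part of the uaml API.
    """
    offset = 0
    while True:
        page = search_fn(query, limit=page_size, offset=offset)
        if not page:
            break
        yield page
        if len(page) < page_size:
            break  # last (partial) page
        offset += page_size

# Demonstration with a stand-in backend of 45 fake records:
records = list(range(45))
fake_search = lambda q, limit, offset: records[offset:offset + limit]
pages = list(paginate(fake_search, "", page_size=20))  # three pages: 20 + 20 + 5
```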
11.2 Filtering for Relevance
```python
# BAD — returns everything, blows up context:
all_data = uaml.search("", limit=1000)

# GOOD — targeted query with a topic filter:
results = uaml.search("API authentication", topic="security", limit=5)

# BETTER — Focus Engine with a token budget:
context = uaml.recall(
    query="How do we handle API authentication?",
    budget_tokens=800,  # strict token limit
)
# Returns only the most relevant entries that fit within 800 tokens
```
11.3 Preventing Context Overload
The #1 problem with AI memory systems is context overload — too much memory injected into the prompt, wasting tokens and confusing the model.
Rules of thumb:
| Context need | Budget | Method |
|---|---|---|
| Quick fact lookup | 200–500 tokens | uaml.search(query, limit=3) |
| Detailed analysis | 500–1500 tokens | uaml.recall(query, budget_tokens=1000) |
| Comprehensive report | 1500–3000 tokens | uaml.recall(query, budget_tokens=2500) |
| Never exceed | 50% of context window | Always set explicit budget |
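The last row of the table can be enforced mechanically rather than by convention. A minimal sketch, assuming a hypothetical `clamp_budget` helper rather than anything built into uaml:

```python
def clamp_budget(requested_tokens: int, context_window: int) -> int:
    """Cap a recall budget at half the model's context window.

    Hypothetical helper enforcing the "never exceed 50% of context window"
    rule of thumb from the table above; not part of the uaml API.
    """
    return min(requested_tokens, context_window // 2)

# A 2,500-token report budget fits a 128K window but gets clamped on a 4K one:
print(clamp_budget(2500, 128_000))  # 2500
print(clamp_budget(2500, 4_096))    # 2048
```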
```python
# Anti-pattern: dumping everything
all_memories = uaml.search("", limit=999)  # ❌ 50K+ tokens wasted

# Correct: tiered recall
# Step 1 — quick check: do I have relevant data?
quick = uaml.search(query, limit=3)
if not quick:
    # No relevant memories — answer from current context
    pass
else:
    # Step 2 — focused recall with budget
    context = uaml.recall(query, budget_tokens=800)
```
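The two-step pattern above can be packaged as a reusable function. This is an illustrative sketch, not part of the uaml API: the search and recall callables are injected so the example can run against stand-in backends.

```python
def tiered_recall(query, search_fn, recall_fn, budget_tokens=800):
    """Two-step recall: cheap existence check, then budgeted retrieval.

    `search_fn` and `recall_fn` mirror the uaml.search/uaml.recall calls
    used above; the wrapper itself is a hypothetical sketch.
    """
    quick = search_fn(query, limit=3)
    if not quick:
        return None  # nothing relevant stored; answer from current context
    return recall_fn(query, budget_tokens=budget_tokens)

# Demonstration with stand-in backends:
store = {"deploy": "rollback steps for the deploy pipeline"}
fake_search = lambda q, limit: [store[q]] if q in store else []
fake_recall = lambda q, budget_tokens: store[q][:budget_tokens]

print(tiered_recall("deploy", fake_search, fake_recall))   # hits the store
print(tiered_recall("unknown", fake_search, fake_recall))  # None
```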
11.4 Requesting Only Relevant Data
```python
# Filter by topic — only see what matters:
code_memories = uaml.search("refactor", topic="code")
infra_memories = uaml.search("server", topic="infrastructure")

# Filter by time — recent context only:
from datetime import datetime, timedelta

recent = uaml.search(
    "deployment",
    point_in_time=(datetime.now() - timedelta(days=7)).isoformat(),
)

# Filter by confidence — only high-quality data:
config = {"min_confidence": 0.7}  # reject uncertain memories

# Combine filters via the Focus Engine:
context = uaml.recall(
    query="deployment issues this week",
    budget_tokens=600,
    preset="conservative",  # strict relevance + confidence thresholds
)
```
11.5 Memory Lifecycle Management
```python
# Store with appropriate confidence:
uaml.learn("Server IP changed to 10.0.0.5", confidence=0.95)  # fact
uaml.learn("Might need to upgrade DB", confidence=0.5)        # speculation

# Update when information changes:
old = uaml.search("server IP", limit=1)[0]
uaml.learn("Server IP is now 10.0.0.10", topic=old.topic,
           confidence=0.95)  # new entry supersedes old (freshness decay)

# Soft-delete outdated info:
uaml.forget(entry_id)  # marks as deleted, preserved in the audit trail

# Temporal validity — knowledge that expires:
uaml.learn(
    "Conference discount code: UAML2026",
    valid_from="2026-03-01",
    valid_until="2026-04-30",  # auto-excluded from recall after expiry
)
```
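The validity window above reduces to plain date comparisons. A sketch of the auto-exclusion check, assuming inclusive ISO-date bounds; the helper is hypothetical, and the real engine's semantics (inclusivity, time zones) are not specified here.

```python
from datetime import date

def is_currently_valid(valid_from: str, valid_until: str, today: date) -> bool:
    """Check whether `today` falls inside an entry's validity window.

    Hypothetical sketch of the auto-exclusion rule described above;
    bounds are treated as inclusive ISO dates.
    """
    return date.fromisoformat(valid_from) <= today <= date.fromisoformat(valid_until)

# The discount code above is valid in mid-March but not in May:
print(is_currently_valid("2026-03-01", "2026-04-30", date(2026, 3, 15)))  # True
print(is_currently_valid("2026-03-01", "2026-04-30", date(2026, 5, 1)))   # False
```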