Knowledge
Pluggable knowledge retrieval with LRU caching and thread-safe operations.
Register custom retrieval functions, query them by name, and benefit from automatic caching via an OrderedDict-based LRU cache.
v2.0 Improvements
KnowledgeRetriever now uses an OrderedDict LRU cache (max 1024 entries), threading.Lock for thread safety, and structured logging.
Overview
The KnowledgeRetriever class provides a unified interface for knowledge retrieval across multiple sources:
| Feature | Description |
|---|---|
| Pluggable sources | Register any callable as a retrieval function |
| LRU cache | Automatic caching with bounded OrderedDict (max 1024) |
| Direct storage | Add key-value knowledge entries directly |
| Thread safety | All operations protected by threading.Lock |
Quick Start
Registering Sources
A source is any callable that accepts a query string and returns a result:
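The snippet below is a minimal, self-contained sketch of the KnowledgeRetriever API described on this page — the class name, method signatures, cache bound, and locking behavior come from this document, but the method bodies are an illustrative reconstruction, not the library's actual source:

```python
from collections import OrderedDict
from threading import Lock

MAX_CACHE_SIZE = 1024  # cache bound stated in this document

class KnowledgeRetriever:
    """Illustrative sketch of the documented API; bodies are assumptions."""

    def __init__(self):
        self._sources = {}           # name -> retrieval callable
        self._knowledge = {}         # direct key -> content entries
        self._cache = OrderedDict()  # (source, query) -> result, in LRU order
        self._lock = Lock()

    @property
    def cache(self):
        return self._cache

    def register_source(self, name, retrieval_fn):
        """Register a named retrieval function (any callable taking a query string)."""
        with self._lock:
            self._sources[name] = retrieval_fn

    def add_knowledge(self, key, content):
        """Store a static knowledge entry directly."""
        with self._lock:
            self._knowledge[key] = content

    def retrieve(self, source, query, use_cache=True):
        """Query a registered source, consulting the LRU cache first."""
        key = (source, query)
        with self._lock:
            if use_cache and key in self._cache:
                self._cache.move_to_end(key)  # mark as most recently used
                return self._cache[key]
            result = self._sources[source](query)
            if use_cache:
                self._cache[key] = result
                if len(self._cache) > MAX_CACHE_SIZE:
                    self._cache.popitem(last=False)  # evict least recently used
            return result

    def clear_cache(self):
        with self._lock:
            self._cache.clear()

# Registering a source: any callable mapping a query string to a result.
retriever = KnowledgeRetriever()
retriever.register_source("docs", lambda q: f"docs result for {q!r}")
print(retriever.retrieve("docs", "caching"))  # docs result for 'caching'
```

Note that the retrieval function runs while the lock is held, which is why the best practices below stress keeping retrieval functions fast.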
Direct Knowledge Storage
Add static knowledge entries directly without a retrieval function:
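A hypothetical stand-in for the direct-storage path — only the add_knowledge behavior described above is sketched here, with entries held in a lock-protected dict:

```python
from threading import Lock

class KnowledgeStore:
    """Sketch of direct storage only; not the library's implementation."""

    def __init__(self):
        self._knowledge = {}
        self._lock = Lock()

    def add_knowledge(self, key, content):
        with self._lock:
            self._knowledge[key] = content

store = KnowledgeStore()
store.add_knowledge("release-notes", "v2.0 shipped with LRU caching.")
store.add_knowledge("cache-size", "The cache holds up to 1024 entries.")
print(store._knowledge["cache-size"])  # The cache holds up to 1024 entries.
```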
LRU Cache
The cache uses OrderedDict with a maximum of 1024 entries. When the limit is reached, the least recently used entry is evicted.
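A standalone demo of the OrderedDict-based LRU policy. The maximum is set to 3 here so eviction is easy to see (the real cache allows 1024); touching a key with move_to_end marks it most recently used, and popitem(last=False) drops the oldest entry:

```python
from collections import OrderedDict

cache = OrderedDict()
MAX = 3  # demo bound; the documented cache uses 1024

def put(key, value):
    if key in cache:
        cache.move_to_end(key)
    cache[key] = value
    if len(cache) > MAX:
        cache.popitem(last=False)  # evict the least recently used entry

def get(key):
    cache.move_to_end(key)  # touching a key makes it most recently used
    return cache[key]

put(("docs", "a"), 1)
put(("docs", "b"), 2)
put(("docs", "c"), 3)
get(("docs", "a"))      # ("docs", "a") is now most recently used
put(("docs", "d"), 4)   # evicts ("docs", "b"), the least recently used
print(list(cache))      # [('docs', 'c'), ('docs', 'a'), ('docs', 'd')]
```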
Cache Key
The cache key is (source_name, query), so the same query sent to different sources produces separate cache entries.
Bypassing the Cache
Pass use_cache=False to skip the cache for a specific query:
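A simplified stand-in for the retrieve path makes the effect visible: with use_cache=True the retrieval function runs once, while use_cache=False forces a fresh call every time:

```python
calls = {"count": 0}

def fetch(query):
    # stand-in retrieval function that counts how often it is invoked
    calls["count"] += 1
    return f"result-{calls['count']}"

cache = {}

def retrieve(query, use_cache=True):
    # simplified sketch of the documented retrieve(..., use_cache=...) behavior
    if use_cache and query in cache:
        return cache[query]
    result = fetch(query)
    if use_cache:
        cache[query] = result
    return result

print(retrieve("latest metrics"))                   # result-1 (fetched, then cached)
print(retrieve("latest metrics"))                   # result-1 (cache hit, no fetch)
print(retrieve("latest metrics", use_cache=False))  # result-2 (fresh fetch, not cached)
```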
API Reference
KnowledgeRetriever
Methods
| Method | Returns | Description |
|---|---|---|
| register_source(name, retrieval_fn) | None | Register a named retrieval function |
| add_knowledge(key, content) | None | Add a static knowledge entry |
| retrieve(source, query, use_cache=True) | Any | Query a source with optional caching |
| clear_cache() | None | Clear the LRU cache |
Properties
| Property | Type | Description |
|---|---|---|
| cache | OrderedDict | Current cache contents |
Internal
| Attribute | Type | Description |
|---|---|---|
| _sources | dict | Registered retrieval functions |
| _knowledge | dict | Direct knowledge entries |
| _cache | OrderedDict | LRU cache (max 1024 entries) |
| _lock | threading.Lock | Thread synchronisation |
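The _lock attribute is what makes concurrent access safe: every cache read and mutation happens while the lock is held. A self-contained sketch (an illustrative reimplementation, not the library's source) shows many threads hitting one shared cache without corrupting it:

```python
from collections import OrderedDict
from threading import Lock, Thread

cache = OrderedDict()
lock = Lock()

def retrieve(source, query):
    key = (source, query)
    with lock:  # all cache reads and writes happen under the lock
        if key in cache:
            cache.move_to_end(key)
            return cache[key]
        result = f"{source}:{query}"  # stand-in for a real retrieval call
        cache[key] = result
        if len(cache) > 1024:
            cache.popitem(last=False)
        return result

# 100 threads issue 10 distinct queries concurrently.
threads = [Thread(target=retrieve, args=("docs", f"q{i % 10}")) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(cache))  # 10 distinct (source, query) keys
```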
Integration with RAG
Combine KnowledgeRetriever with an LLM for retrieval-augmented generation:
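A hedged sketch of the RAG loop: retrieve context for a query, then feed it to a model. Here search_docs and call_llm are hypothetical stand-ins (in real use, search_docs would be a registered retrieval source queried via retrieve, and call_llm would be your LLM client); only the prompt assembly is the point:

```python
def search_docs(query):
    # hypothetical retrieval function; in practice this might hit a vector store
    return "KnowledgeRetriever caches results in an OrderedDict."

def call_llm(prompt):
    # hypothetical LLM call; replace with your provider's client
    return f"(answer based on {len(prompt)} chars of prompt)"

def answer(query):
    context = search_docs(query)  # e.g. retriever.retrieve("docs", query)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return call_llm(prompt)

print(answer("How does caching work?"))
```

Because retrieve caches by (source, query), repeated RAG questions with the same wording skip the retrieval step entirely.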
Best Practices
Do
- Register sources during application startup.
- Use clear_cache() when underlying data changes.
- Use use_cache=False for time-sensitive queries.
- Keep retrieval functions fast; they block the calling thread.
Don't
- Register sources with the same name (the second overwrites the first).
- Store large binary blobs via add_knowledge(); use object storage instead.
- Forget to call clear_cache() after data updates.