feat: add ValkeyMemoryService with vector similarity search (Valkey Search module) #156
Open
daric93 wants to merge 7 commits into
…eMemoryService backed by Valkey using the valkey-glide client library. Stores memories as JSON in Valkey lists keyed by app_name and user_id, with simple text-based substring search.
- ValkeyMemoryServiceConfig: configurable search_top_k, key_prefix, ttl
- ValkeyMemoryService: add_session_to_memory, search_memory, close
- Optional dependency: valkey-glide>=2.4.0 under [valkey] extra
- 23 unit tests (mocked client)
- 5 integration tests (requires running Valkey instance)
Ref: AEA-497
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…s the simple substring-matching implementation with full-text search powered by the Valkey Search module (FT.CREATE / FT.SEARCH).
Changes:
- Memories stored as Valkey Hash keys (indexed automatically)
- FT.CREATE with TEXT field for content, TAG fields for app_name/user_id
- FT.SEARCH for full-text search with TAG filtering
- Expanded integration tests (11 tests covering isolation, top_k, etc.)
- Added memory module README with usage documentation
Ref: AEA-497
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…) Replaces full-text search with vector similarity search powered by the Valkey Search module, matching VertexAiRagMemoryService in functionality.
Key changes:
- Configurable embedding function (users bring their own embedder)
- FT.CREATE with VECTOR field (HNSW, FLOAT32) for KNN search
- FT.SEARCH with KNN pre-filtered by app_name/user_id TAG fields
- Configurable vector_distance_threshold for filtering low-quality matches
- Configurable distance_metric (COSINE, L2, IP)
- Batch embedding generation for efficient ingestion
- Implements add_events_to_memory for incremental ingestion
- 30 unit tests, 12 integration tests (all passing)
Ref: AEA-497
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
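The KNN search with TAG pre-filtering described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual _build_knn_query: the field names (app_name, user_id, embedding), the $query_vec parameter name, the distance alias, and the "keep hits at or below the threshold" rule are all assumptions, and TAG values are assumed to be escaped before they reach this function.

```python
from typing import List, Optional, Tuple


def build_knn_query(app_name: str, user_id: str, top_k: int) -> str:
    """Build an FT.SEARCH query string: TAG pre-filter, then KNN on the vector field.

    Field names and the $query_vec parameter are illustrative assumptions.
    The caller is assumed to have escaped app_name and user_id already.
    """
    tag_filter = f"(@app_name:{{{app_name}}} @user_id:{{{user_id}}})"
    return f"{tag_filter}=>[KNN {top_k} @embedding $query_vec AS distance]"


def filter_by_threshold(
    hits: List[Tuple[str, float]], threshold: Optional[float]
) -> List[Tuple[str, float]]:
    """Keep (text, distance) hits whose distance is at or below the threshold.

    A threshold of None disables filtering. Treating smaller distances as
    better matches is an assumption of this sketch.
    """
    if threshold is None:
        return list(hits)
    return [(text, dist) for text, dist in hits if dist <= threshold]
```

The query vector itself would be passed separately as a binary parameter when the command is issued; that part depends on the client API and is left out here.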
…ation
- Fix incomplete TAG value escaping in _build_knn_query: now escapes all Valkey Search metacharacters (dots, colons, @, spaces, etc.), not just hyphens. This prevents query injection and ensures correct scoping when app_name/user_id contain special characters.
- Add _escape_tag_value() static helper with full Valkey Search spec coverage.
- Add type annotation for client parameter (Union[GlideClient, GlideClusterClient]).
- Add Pydantic field_validator for distance_metric to reject invalid values at config time instead of silently falling back to COSINE.
- Fix TTL check to use 'is not None' instead of truthiness for Optional[int].
- Fix create_index docstring (removed incorrect 'timestamp: NUMERIC field').
- Add unit tests: special char escaping (dots, colons, @, spaces), distance_metric validation, _escape_tag_value coverage.
Ref: AEA-497
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
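For readers unfamiliar with TAG escaping, a helper along the lines of _escape_tag_value might look like the sketch below. The character set is a guess assembled for illustration and may not match the PR's _TAG_SPECIAL_CHARS; the only claim is the general technique of prefixing each metacharacter with a backslash.

```python
# Hypothetical set of characters treated as special inside a TAG query.
# The PR's actual _TAG_SPECIAL_CHARS may differ from this list.
TAG_SPECIAL_CHARS = frozenset(" ,.<>{}[]\"':;!@#$%^&*()-+=~|/\\?")


def escape_tag_value(value: str) -> str:
    """Backslash-escape every special character so the value matches literally."""
    return "".join("\\" + ch if ch in TAG_SPECIAL_CHARS else ch for ch in value)
```

With this sketch, a user_id such as user@example.com becomes user\@example\.com before it is placed inside the braces of a TAG filter, so it cannot terminate the filter or introduce extra query syntax.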
…s in docstring and README highlighting that client_name should be set on GlideClientConfiguration for visibility in CLIENT LIST, monitoring dashboards, and CloudWatch metrics.
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…, DRY refactor
- Fix tenant isolation bypass: add '?' to _TAG_SPECIAL_CHARS escape set (single-char wildcard glob in Valkey Search TAG queries)
- Use Batch pipelining for hset/expire calls (1 round trip vs 2N)
- Add asyncio.Lock with double-check locking in _ensure_index() to prevent redundant FT.CREATE calls under concurrent access
- Extract shared _ingest_events() method to DRY up add_session_to_memory and add_events_to_memory
- Update unit tests for Batch-based approach; add tests for ? escaping and _ensure_index concurrency safety
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…s, remove dead code
- Move field_validator import to top-level (was polluting class namespace)
- Re-raise RuntimeError on batch exec failure instead of swallowing
- Remove unreachable ternary fallback (guarded by earlier len check)
- Update unit test to expect RuntimeError on batch exec failure
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary
Closes: #155
Implements ValkeyMemoryService for the ADK community memory module, backed by Valkey using the Valkey Search module for vector similarity search (HNSW). Uses the valkey-glide client library (v2.4.0+). This provides functionality analogous to VertexAiRagMemoryService for developers with Valkey infrastructure.
Changes
- src/google/adk_community/memory/valkey_memory_service.py: New memory service implementing BaseMemoryService
  - ValkeyMemoryServiceConfig: Pydantic config with similarity_top_k, vector_distance_threshold, embedding_dimensions, key_prefix, index_name, distance_metric, ttl_seconds
  - ValkeyMemoryService: Accepts a configurable async embedding function (users bring their own embedder)
  - FT.CREATE with VECTOR field (HNSW algorithm) + TAG fields for app_name/user_id
  - FT.SEARCH with KNN for vector similarity retrieval, pre-filtered by TAG
  - … add_session_to_memory and add_events_to_memory
  - … Batch for single round-trip ingestion
  - asyncio.Lock with double-check locking for index creation
  - … ? wildcard)
- src/google/adk_community/memory/__init__.py: Exports ValkeyMemoryService and ValkeyMemoryServiceConfig
- src/google/adk_community/memory/README.md: Documentation with usage examples
- pyproject.toml: Added valkey-glide>=2.4.0 as optional dependency under [valkey] extra
- tests/unittests/memory/test_valkey_memory_service.py: 46 unit tests (mocked client)
- tests/integration/test_valkey_memory_service_integration.py: 12 integration tests against live Valkey with Search module
How it works
Ingestion (add_session_to_memory / add_events_to_memory): text is extracted from events, embeddings are generated in batch via the user-provided embedding function, and each event is stored as a Valkey Hash containing the text content, metadata, and the embedding vector. All writes are pipelined via Batch for a single network round-trip.
Search (search_memory): an embedding is generated for the query text, then FT.SEARCH performs a KNN search over the HNSW index with pre-filtering by app_name and user_id TAG fields. Results are ranked by vector distance and optionally filtered by vector_distance_threshold.
Testing plan
Unit tests (46): All public methods tested with mocked valkey-glide client
Integration tests (12): Run against Valkey 9.1 (valkey-bundle image with Search module)
Integration test coverage:
Requirements
- … valkey/valkey-bundle image)
- valkey-glide >= 2.4.0
Ref: AEA-497
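As a rough illustration of the ingestion step described under How it works, the sketch below shows how one event might be turned into the field mapping of a Valkey Hash, with the embedding serialized as raw FLOAT32 bytes. The helper names, the hash field names, and the little-endian byte order are assumptions made for this example; the PR's actual schema and metadata fields are not shown in this description.

```python
import struct
from typing import Dict, List, Union


def pack_float32(vector: List[float]) -> bytes:
    """Serialize a vector as consecutive little-endian 32-bit floats."""
    return struct.pack(f"<{len(vector)}f", *vector)


def build_memory_hash(
    app_name: str, user_id: str, content: str, embedding: List[float]
) -> Dict[str, Union[str, bytes]]:
    """Build the field mapping for one stored memory (hypothetical field names)."""
    return {
        "app_name": app_name,    # TAG field used for tenant scoping
        "user_id": user_id,      # TAG field used for tenant scoping
        "content": content,      # extracted event text
        "embedding": pack_float32(embedding),  # VECTOR field payload
    }
```

Each mapping would then be written with a hash-set command, and per the description all such writes (plus any expiry calls) are queued in a single Batch so that ingestion costs one network round-trip.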