Add typed raw accessors with drift logging; migrate workspace tabs and resolver#94
Add typed raw accessors with drift logging; migrate workspace tabs and resolver#94bradjin8 wants to merge 4 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds defensive, raw-backed properties to Composer and Bubble, a new models/raw_access module with optional-field readers and schema-drift warnings, refactors workspace resolver/tabs to use these typed accessors and model objects, updates text extraction to accept model objects, and adds tests for drift-logging behavior. ChangesTyped Raw Accessor Layer
Sequence Diagram(s)sequenceDiagram
participant Client as service caller
participant Tabs as services.workspace_tabs
participant Resolver as services.workspace_resolver
participant Models as models.raw_access / models.conversation
Client->>Tabs: request tabs assembly (composer raw rows)
Tabs->>Models: Composer.from_dict / Composer properties (usage_data, model_name_from_config)
Tabs->>Models: conversation_header_bubble_id(header)
Tabs->>Resolver: determine_project_for_conversation(composer)
Resolver->>Models: composer_newly_created_files / composer_code_block_data / bubble helpers
Models-->>Resolver: validated lists/dicts or safe defaults
Resolver-->>Tabs: resolved project/workspace
Tabs-->>Client: assembled tab payloads
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
tests/test_raw_accessors.py (1)
26-30: ⚡ Quick winMake the “missing key” setup explicit.
This test currently relies on
GOOD_COMPOSER_RAWnot containingnewlyCreatedFiles. To keep the test stable and self-descriptive, remove the key explicitly in-test before callingComposer.from_dict(...).Suggested patch
- bare = Composer.from_dict(GOOD_COMPOSER_RAW, composer_id="cid-2") + raw = dict(GOOD_COMPOSER_RAW) + raw.pop("newlyCreatedFiles", None) + bare = Composer.from_dict(raw, composer_id="cid-2")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_raw_accessors.py` around lines 26 - 30, The test test_composer_newly_created_files_empty_when_key_missing should explicitly remove the 'newlyCreatedFiles' key from the input before calling Composer.from_dict to avoid relying on the global GOOD_COMPOSER_RAW shape; create a shallow or deep copy of GOOD_COMPOSER_RAW, pop the 'newlyCreatedFiles' key if present on that copy, then pass the modified dict into Composer.from_dict(..., composer_id="cid-2") and keep the existing with self.assertNoLogs and self.assertEqual(bare.newly_created_files, []). Ensure you do not mutate the global GOOD_COMPOSER_RAW variable.utils/text_extract.py (1)
24-27: ⚡ Quick winTyping precision:
bubble: dict | objectis effectively justobjectfor static type checking (sincedictis a subtype ofobject), so it loses most type value; narrow it to the specific “hasraw” shape (e.g., aProtocol) to keep mypy/pyright helpful.Suggested patch
+from typing import Any, Protocol + +class HasRaw(Protocol): + raw: dict[str, Any] + -def extract_text_from_bubble(bubble: dict | object) -> str: +def extract_text_from_bubble(bubble: dict[str, Any] | HasRaw) -> str: """Extract displayable text from a bubble object (text, richText, codeBlocks).""" - if hasattr(bubble, "raw"): - bubble = bubble.raw # type: ignore[union-attr] + if hasattr(bubble, "raw"): + bubble = bubble.raw🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@utils/text_extract.py` around lines 24 - 27, The parameter typing for extract_text_from_bubble is too broad (dict | object) so static checkers lose precision; define a small Protocol (e.g., HasRaw with attribute raw: dict) or a TypedDict for the expected bubble shape and change the function signature to use that Protocol/TypedDict (e.g., bubble: HasRaw | dict) so mypy/pyright can infer attributes like raw and the contained fields used later in extract_text_from_bubble; update any imports and type comments to keep compatibility with Python versions in the project.models/conversation.py (1)
3-9: 💤 Low valueMove
_loggerbelow the imports to avoid E402.The
_logger = logging.getLogger(__name__)assignment at Line 7 sits betweenfrom typing import Anyand thefrom models.errors import SchemaErrorblock, so the latter is a module-level import that no longer appears at the top of the file. Most linters (flake8/ruffE402) will flag this.♻️ Proposed reordering
import logging from dataclasses import dataclass, field from typing import Any -_logger = logging.getLogger(__name__) - from models.errors import SchemaError from models.from_dict_validation import ( require_dict, require_key, require_non_empty_str, require_non_empty_str_field, require_type, ) + +_logger = logging.getLogger(__name__)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@models/conversation.py` around lines 3 - 9, _the module-level logger `_logger` is defined before the final import `from models.errors import SchemaError`, causing an import-after-code lint error (E402); move the `_logger = logging.getLogger(__name__)` line so it appears after all import statements (including `from models.errors import SchemaError`) — locate `_logger` and the import `SchemaError` in this file and reorder so all imports come first, then initialize `_logger`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@models/raw_access.py`:
- Around line 158-190: composer_headers currently calls optional_raw_list which
emits a missing-key WARNING, while composer_newly_created_files silently returns
[] when the key is absent; align them to the silent-on-missing contract by
changing composer_headers to mirror composer_newly_created_files: if data is a
Composer return data.full_conversation_headers_only, otherwise read value =
data.get("fullConversationHeadersOnly") when data is a dict, return [] if value
is None, warn only if value exists but is not a list (similar warning message
pattern used in composer_newly_created_files), and avoid relying on
optional_raw_list to emit missing-key warnings.
- Around line 203-242: The three accessors bubble_relevant_files,
bubble_attached_file_uris, and bubble_context silently drop values of the wrong
type instead of logging drift; change their dict-path handling to call the
existing helpers optional_raw_list (for keys "relevantFiles" and
"attachedFileCodeChunksUris") and optional_raw_dict (for key "context") rather
than manually checking types, so type-mismatches are logged consistently (remove
the duplicated extraction logic and return the helper results when present).
---
Nitpick comments:
In `@models/conversation.py`:
- Around line 3-9: _the module-level logger `_logger` is defined before the
final import `from models.errors import SchemaError`, causing an
import-after-code lint error (E402); move the `_logger =
logging.getLogger(__name__)` line so it appears after all import statements
(including `from models.errors import SchemaError`) — locate `_logger` and the
import `SchemaError` in this file and reorder so all imports come first, then
initialize `_logger`.
In `@tests/test_raw_accessors.py`:
- Around line 26-30: The test
test_composer_newly_created_files_empty_when_key_missing should explicitly
remove the 'newlyCreatedFiles' key from the input before calling
Composer.from_dict to avoid relying on the global GOOD_COMPOSER_RAW shape;
create a shallow or deep copy of GOOD_COMPOSER_RAW, pop the 'newlyCreatedFiles'
key if present on that copy, then pass the modified dict into
Composer.from_dict(..., composer_id="cid-2") and keep the existing with
self.assertNoLogs and self.assertEqual(bare.newly_created_files, []). Ensure you
do not mutate the global GOOD_COMPOSER_RAW variable.
In `@utils/text_extract.py`:
- Around line 24-27: The parameter typing for extract_text_from_bubble is too
broad (dict | object) so static checkers lose precision; define a small Protocol
(e.g., HasRaw with attribute raw: dict) or a TypedDict for the expected bubble
shape and change the function signature to use that Protocol/TypedDict (e.g.,
bubble: HasRaw | dict) so mypy/pyright can infer attributes like raw and the
contained fields used later in extract_text_from_bubble; update any imports and
type comments to keep compatibility with Python versions in the project.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 563cb83f-22bd-4ffb-a6d0-7d58c209538a
📒 Files selected for processing (6)
models/conversation.pymodels/raw_access.pyservices/workspace_resolver.pyservices/workspace_tabs.pytests/test_raw_accessors.pyutils/text_extract.py
1e98ae3 to
19eb9da
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
services/workspace_tabs.py (2)
136-143:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDon't treat an empty raw bubble payload as "missing".
bubble_map.get()returnsNoneonly when the id is absent. With the newBubble | dictbridge,if not bubble_entryalso drops{}payloads beforeBubble.from_dict(...)runs, so context-only bubbles can disappear silently in the single-tab path.Proposed fix
bubble_id = conversation_header_bubble_id(header, composer_id=composer_id) if not bubble_id: continue bubble_entry = bubble_map.get(bubble_id) - if not bubble_entry: + if bubble_entry is None: continue if isinstance(bubble_entry, Bubble): bubble = bubble_entry else:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/workspace_tabs.py` around lines 136 - 143, The code is incorrectly treating empty dict payloads as missing because it uses a falsy check; change the presence check to a strict None test so empty mappings are preserved — replace "if not bubble_entry" with "if bubble_entry is None" (or alternately check "if bubble_id not in bubble_map") before the isinstance/Bubble.from_dict branch to ensure {} payloads are passed into Bubble.from_dict rather than skipped.
573-617:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate summary rows with
Composer.from_dict()too.This path still bypasses the typed composer contract and reads
cd.get(...)directly. A drifted row can therefore appear inlist_workspace_tab_summaries()even thoughassemble_single_tab()andassemble_workspace_tabs()reject the same composer viaComposer.from_dict(), which leaves the sidebar and detail endpoints disagreeing on what a valid conversation is.Proposed fix
if not isinstance(cd, dict): parse_warnings.record_composer_skipped() continue try: + composer = Composer.from_dict(cd, composer_id=composer_id) if ( composer_id not in composer_id_to_ws and composer_id not in project_layouts_map ): project_layouts_map[composer_id] = load_project_layouts_for_composer( global_db, composer_id, ) pid = determine_project_for_conversation( - cd, composer_id, project_layouts_map, + composer, composer_id, project_layouts_map, project_name_map, workspace_path_map, workspace_entries, {}, composer_id_to_ws, invalid_workspace_ids, ) @@ - headers = cd.get("fullConversationHeadersOnly") or [] + headers = composer.full_conversation_headers_only if not headers: continue - title = cd.get("name") or f"Conversation {composer_id[:8]}" + title = composer.name or f"Conversation {composer_id[:8]}" - _early_model_config = cd.get("modelConfig") or {} - _early_model_name = _early_model_config.get("modelName") + _early_model_name = composer.model_name_from_config() _early_model_names = [_early_model_name] if _early_model_name and _early_model_name != "default" else None @@ tab_entry: dict = { "id": composer_id, "title": title, - "timestamp": to_epoch_ms(cd.get("lastUpdatedAt")) or to_epoch_ms(cd.get("createdAt")) or int(datetime.now().timestamp() * 1000), + "timestamp": to_epoch_ms(composer.last_updated_at) or to_epoch_ms(composer.created_at) or int(datetime.now().timestamp() * 1000), "messageCount": len(headers), } + except SchemaError as e: + _logger.warning( + "Failed to parse Composer from composerData:%s: %s", + composer_id, + e, + ) + parse_warnings.record_composer_skipped() + continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/workspace_tabs.py` around lines 573 - 617, The summary-row path in list_workspace_tab_summaries currently reads raw composer dicts (cd) directly and can include drifted rows that Composer.from_dict() would reject; update the logic in list_workspace_tab_summaries to attempt constructing a typed Composer via Composer.from_dict(cd) (same validation used by assemble_single_tab()/assemble_workspace_tabs()), and if Composer.from_dict raises/returns invalid, skip that summary row (continue) so the sidebar and detail endpoints remain consistent; reference the composer dict variable cd and functions Composer.from_dict, assemble_single_tab, assemble_workspace_tabs when locating where to insert this validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@services/workspace_tabs.py`:
- Around line 136-143: The code is incorrectly treating empty dict payloads as
missing because it uses a falsy check; change the presence check to a strict
None test so empty mappings are preserved — replace "if not bubble_entry" with
"if bubble_entry is None" (or alternately check "if bubble_id not in
bubble_map") before the isinstance/Bubble.from_dict branch to ensure {} payloads
are passed into Bubble.from_dict rather than skipped.
- Around line 573-617: The summary-row path in list_workspace_tab_summaries
currently reads raw composer dicts (cd) directly and can include drifted rows
that Composer.from_dict() would reject; update the logic in
list_workspace_tab_summaries to attempt constructing a typed Composer via
Composer.from_dict(cd) (same validation used by
assemble_single_tab()/assemble_workspace_tabs()), and if Composer.from_dict
raises/returns invalid, skip that summary row (continue) so the sidebar and
detail endpoints remain consistent; reference the composer dict variable cd and
functions Composer.from_dict, assemble_single_tab, assemble_workspace_tabs when
locating where to insert this validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d9b0fc91-65e8-46d2-955e-ee7f60388a4f
📒 Files selected for processing (2)
services/workspace_tabs.pytests/test_blob_parsing_fuzz.py
Summary
models/raw_access.pywith shared optional-key readers and schema-driftWARNINGlogging when structurally important keys are missing or mistyped (projectLayouts,bubbleIdin headers, etc.).ComposerandBubble(newly_created_files,relevant_files,usage_data,bubble_timestamp_ms(), and related fields) so services can use model fields instead of silent.get()chains afterfrom_dictvalidation.services/workspace_tabs.pyandservices/workspace_resolver.pyto passComposer/Bubblethrough the hot path (bubble_mapstoresBubbleinstances;determine_project_for_conversationacceptsComposer | dict).tests/test_raw_accessors.pyfor drift logging and bridge helpers.Closes #90
Test plan
python -m pytest tests/test_raw_accessors.py -vpython -m pytest tests/test_workspace_assignment_fallback.py tests/test_models.py -vpython -m pytest tests/ -q(371 passed)python -m mypy models services/workspace_tabs.py services/workspace_resolver.pynewlyCreatedFiles,relevantFiles)Notes
.get()usage inworkspace_listing.py,api/search.py, andscripts/export.py.Summary by CodeRabbit
New Features
Bug Fixes
Tests