Lazy-load chat histories, fast list summaries, and mtime disk cache (closes #84) by bradjin8 · Pull Request #88 · cppalliance/cppa-cursor-browser

bradjin8 · 2026-06-02T20:04:54Z

Summary

Implements issue #84: split summary and full-assembly code paths so list views avoid scanning global bubbleId:% rows, the workspace UI lazy-loads conversation bodies on tab selection, and repeat page loads are served from an mtime-keyed disk cache.

Home (GET /api/workspaces) — no global bubble scan; SQL-filtered composerData; per-composer messageRequestContext (MRC) load only when needed; cached composer_id_to_ws and project list
Workspace sidebar — GET /api/workspaces/<id>/tabs?summary=1 returns titles/metadata only; full payloads via GET /api/workspaces/<id>/tabs/<composer_id>
Phase 3 cache — project list, composer map, and tab summaries under ~/.cache/cursor-chat-browser/; invalidate on storage mtimes; bypass with ?nocache=1 or CURSOR_CHAT_BROWSER_NOCACHE=1
Full /tabs, export, and search — unchanged; monolithic /tabs still available for backward compatibility
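
A minimal sketch of the mtime-keyed disk cache idea from the list above. This is not the code in `services/summary_cache.py`: the helper names `fingerprint` and `load_or_build`, the cache-file layout, and the choice to still write the cache on a `nocache` request are all assumptions made for illustration.

```python
import json
import os
import tempfile


def fingerprint(paths):
    """Return JSON-stable [path, mtime_ns] pairs for the watched files that exist."""
    return [[p, os.stat(p).st_mtime_ns] for p in sorted(paths) if os.path.exists(p)]


def load_or_build(cache_file, paths, build, nocache=False):
    """Return the cached payload unless a watched file changed; otherwise rebuild."""
    fp = fingerprint(paths)
    if not nocache:
        try:
            with open(cache_file, encoding="utf-8") as f:
                entry = json.load(f)
            if (
                isinstance(entry, dict)
                and entry.get("fingerprint") == fp
                and "payload" in entry
            ):
                return entry["payload"]
        except (OSError, ValueError):
            pass  # missing or corrupt cache file: fall through and rebuild
    payload = build()
    # Write to a temp file, then rename, so readers never see a partial file.
    directory = os.path.dirname(cache_file) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"fingerprint": fp, "payload": payload}, f)
    os.replace(tmp, cache_file)
    return payload
```

The fingerprint uses lists rather than tuples so that it compares equal to itself after a JSON round trip.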

Problem

On large local Cursor datasets, the home page and workspace page each rescanned global KV storage on every load (often 1–2+ minutes). The UI blocked on full tab assembly before the sidebar could render.

Solution

Backend

| Area | Change |
| --- | --- |
| `services/workspace_listing.py` | Summary list path: no `load_bubble_map`; SQL filter for non-empty headers; lazy `load_project_layouts_for_composer`; disk cache wrapper |
| `services/workspace_tabs.py` | `list_workspace_tab_summaries`, `assemble_single_tab`, shared `_assemble_tab_from_composer_data`; scoped bubble/MRC/diff loaders |
| `services/workspace_db.py` | Scoped loaders + `build_composer_id_to_workspace_id_cached` |
| `services/summary_cache.py` | New mtime-keyed cache layer |
| `api/workspaces.py` | `?summary=1`, `/tabs/<composer_id>`, `?nocache=1` |
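
The "SQL filter for non-empty headers" idea can be sketched as follows. This is an illustration only: the table name `kv`, the constant `NON_EMPTY_HEADERS_SQL`, and the sample rows are assumptions, not the PR's actual `COMPOSER_ROWS_WITH_HEADERS_SQL`. The predicate excludes both the compact and the spaced empty-array forms and never touches `bubbleId:%` rows.

```python
import sqlite3

# Hypothetical filter: keep composerData rows whose headers array is non-empty.
NON_EMPTY_HEADERS_SQL = """
    SELECT key FROM kv
    WHERE key LIKE 'composerData:%'
      AND value LIKE '%fullConversationHeadersOnly%'
      AND value NOT LIKE '%"fullConversationHeadersOnly":[]%'
      AND value NOT LIKE '%"fullConversationHeadersOnly": []%'
    ORDER BY key
"""


def composer_keys_with_headers(conn):
    """Return keys of composer rows that have at least one conversation header."""
    return [row[0] for row in conn.execute(NON_EMPTY_HEADERS_SQL)]


# In-memory stand-in for the global KV store (schema is assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [
        ("composerData:a", '{"fullConversationHeadersOnly":[{"bubbleId":"b1"}]}'),
        ("composerData:b", '{"fullConversationHeadersOnly":[]}'),
        ("composerData:c", '{"fullConversationHeadersOnly": []}'),
        ("bubbleId:a:b1", '{"text":"hello"}'),
    ],
)
print(composer_keys_with_headers(conn))  # ['composerData:a']
```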

Frontend

templates/workspace.html — fetch summary tabs on load; lazy-fetch full tab on selection; tabCache for loaded conversations
static/css/style.css — stable grid width during load; .main-content.is-loading for centered spinner

Tests (356 pass)

tests/test_workspace_listing_performance.py — spy: no global bubbleId:% on list path
tests/test_workspace_tabs_summary.py — summary/single-tab scoped queries and payload shape
tests/test_summary_cache.py — cache hit/miss and fingerprint invalidation

Performance notes

First load after a cache clear or Cursor data change still rebuilds summaries (expected; much faster than before #84 because the bubble scan and the full MRC scan are skipped).
Repeat loads (refresh home/workspace with unchanged storage) should be near-instant from disk cache.
To force cold rebuild: GET /api/workspaces?nocache=1

Document representative row counts and timings from your local fixture in review comments if helpful.

Test plan

pytest — all tests pass locally
Home page: project cards appear; second refresh noticeably faster than first
Workspace page: sidebar titles render quickly; selecting a conversation loads bubbles with per-tab spinner
Copy All / Download work on a loaded conversation
Export script and full GET /api/workspaces/<id>/tabs (no summary=1) still work
?nocache=1 bypasses cache; cache files appear under ~/.cache/cursor-chat-browser/

Closes #84

Summary by CodeRabbit

New Features
- Lazy-loaded workspace UI with per-conversation summary endpoint and on-demand full conversation loading; sidebar shows per-conversation model badges.
Performance
- Disk-backed summary cache for workspace/tab summaries (nocache option available).
- Scoped data loading to fetch only targeted conversation rows, reducing expensive global scans.
Style
- Updated layout and loading skeleton styles for consistent full-width loading states.
Documentation
- Changelog updated with Phase 3 summary, caching, and API notes.
Tests
- New/regression tests covering summary cache, tab summaries, scoped loading, and listing performance.
Bug Fixes
- Improved inline error handling for workspace and conversation load failures.

coderabbitai · 2026-06-02T20:05:06Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3f1b551a-973c-4fd3-998c-f2b33da775a7

📥 Commits

Reviewing files that changed from the base of the PR and between 14e9f54 and f6d3ff9.

📒 Files selected for processing (6)

services/summary_cache.py
services/workspace_db.py
templates/workspace.html
tests/test_summary_cache.py
tests/test_workspace_listing_performance.py
tests/test_workspace_tabs_summary.py

🚧 Files skipped from review as they are similar to previous changes (6)

tests/test_workspace_tabs_summary.py
tests/test_summary_cache.py
templates/workspace.html
tests/test_workspace_listing_performance.py
services/summary_cache.py
services/workspace_db.py

📝 Walkthrough

Adds a fingerprinted disk-backed summary cache, composer-scoped KV loaders to avoid global bubble scans, refactors workspace listing and tab assembly to use cached composer↔workspace maps, exposes summary-only and per-composer tab endpoints, and updates the frontend to lazy-load full tabs on demand with tests preventing unscoped scans.

Changes

Lazy-load workspace tabs with summary cache and scoped loaders

| Layer / File(s) | Summary |
| --- | --- |
| Summary cache infrastructure `services/summary_cache.py`, `CHANGELOG.md`, `tests/test_summary_cache.py` | Disk-backed cache with fingerprinting, nocache bypass, atomic JSON I/O, and accessors for projects, composer→workspace mapping, and per-workspace tab summaries; unit tests validate hits/misses and nocache behavior. |
| Composer-scoped KV loaders `services/workspace_db.py` | New helpers to extract root paths from messageRequestContext, load project layouts per composer, and scoped loaders for bubbles, message contexts, and code-block diffs; centralizes global DB path resolution and adds cached composer→workspace mapping. |
| Workspace listing optimization `services/workspace_listing.py`, `tests/test_workspace_listing_performance.py`, `tests/test_project_layouts_dict_shape.py` | `list_workspace_projects` gains `nocache` flag and fingerprint-based caching, uses lightweight composer validation and constant composer-row SQL, and skips unscoped bubble scans; performance tests assert no `bubbleId:%` global LIKE queries and preserve project-card shape. |
| Tab assembly modularization `services/workspace_tabs.py` | Extracts `_assemble_tab_from_composer_data` and `_build_matching_ws_ids`, centralizes bubble/context injection and metadata aggregation, and updates assembly loop to delegate per-composer tab construction to the helper. |
| Summary and single-tab endpoints `services/workspace_tabs.py`, `tests/test_workspace_tabs_summary.py` | Adds `list_workspace_tab_summaries` (cached summary-only payload) and `assemble_single_tab` (scoped per-composer loads returning full tab); uncached builders and tests ensure scoped SQL and expected payload shapes, returning 404 for missing/unauthorized composer IDs. |
| API endpoint routing `api/workspaces.py` | Parses `nocache` on workspace listing requests, branches `GET /api/workspaces/<id>/tabs` on `summary=true` to return summaries or full assembly, and adds `GET /api/workspaces/<workspace_id>/tabs/<composer_id>` for lazy per-composer fetches (CLI workspaces rejected). |
| Frontend lazy-loading and UI `templates/workspace.html`, `static/css/style.css` | Workspace page renders skeleton immediately and requests summary tabs (`?summary=1`). `selectTab` becomes `async`, uses `tabCache` and lazy-fetch of full tabs with inline loading/error UIs, and sidebar shows a per-tab model badge. CSS adjusted for loading layout and consistent heights. |
| Test updates and fixtures `tests/*` | Adds cache tests, regression/performance fixtures asserting no unscoped global scans, tab-summary and assemble-single-tab tests verifying scoped SQL and payloads, and updates parse-warning fixtures to new malformed JSON cases. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

cppalliance/cppa-cursor-browser#78: Modifies workspace listing/tab assembly surfaces and parse-warning metadata; overlaps with tab/list changes in this PR.
cppalliance/cppa-cursor-browser#61: Changes Cursor KV loader helpers and workspace DB access patterns—related to the scoped loaders and mapping changes here.
cppalliance/cppa-cursor-browser#2: Touches API routing and tab assembly behavior overlapping at the workspaces/tabs API level.

Suggested reviewers

timon0305
clean6378-max-it
wpak-ai

Poem

🐇 A rabbit nibbles at the cache,
Fingerprints snug in a hidden stash,
Summaries first, then bubbles on call,
No more global scans that stall—
Hop, fetch, render, dash.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 43.04% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the three main changes: lazy-loading chat histories, fast list summaries, and mtime disk cache. It directly maps to the primary objectives in the PR description. |
| Linked Issues check | ✅ Passed | The PR implements all core requirements from issue `#84`: summary endpoints without global bubbleId scans, lazy-load full tabs on selection, and mtime-keyed disk cache. Tests verify no global scans and cache behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are directly tied to implementing lazy-loading and caching: new cache service, scoped loaders, summary endpoints, frontend lazy-loading, CSS for loading states, and corresponding tests. No unrelated refactoring detected. |

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

templates/workspace.html (1)
124-175: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Possible stale-render race when switching tabs quickly.

selectTab is now async and awaits a network fetch before calling renderChat. If a user clicks conversation A (slow fetch) then conversation B, B may render first and A's later-resolving fetch will overwrite main-content with A's content while the sidebar shows B as active. Consider tracking the latest requested id and bailing out of stale resolutions.
🛠️ Suggested guard
 async function selectTab(id) {
   const summary = allTabs.find(t => t.id === id);
   if (!summary) return;
+  selectTab._latest = id;
Then after each await, before mutating selectedTab/main-content:
if (selectTab._latest !== id) return; // a newer selection superseded this one
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/workspace.html` around lines 124 - 175, selectTab can race when
multiple async fetches complete out-of-order; set a per-call marker (e.g. assign
selectTab._latest = id at the start of selectTab) and after any await or before
mutating shared state (tabCache, selectedTab, main-content, calling renderChat)
check if selectTab._latest !== id and if so return early to avoid overwriting a
newer tab; apply these guards around the fetch/try-catch resolution and any
other async points so only the most-recent selection updates the UI.
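
The "latest request wins" guard is language-neutral. The following is a small Python `asyncio` analogue, not the template's JavaScript: the `TabView` class, its attributes, and the simulated fetch delays are hypothetical stand-ins for `selectTab`, `selectTab._latest`, and the network fetch.

```python
import asyncio


class TabView:
    """Stand-in for page state: the latest requested tab and the rendered tab."""

    def __init__(self):
        self.latest = None    # id of the most recently requested tab
        self.rendered = None  # id of the tab whose content is displayed

    async def select_tab(self, tab_id, fetch_delay):
        self.latest = tab_id
        await asyncio.sleep(fetch_delay)  # simulated network fetch
        if self.latest != tab_id:
            return  # a newer selection superseded this one; drop the stale result
        self.rendered = tab_id


async def main():
    view = TabView()
    # Click A (slow fetch), then B (fast fetch) before A resolves.
    slow = asyncio.create_task(view.select_tab("A", 0.05))
    await asyncio.sleep(0)  # let A start and record itself as latest
    fast = asyncio.create_task(view.select_tab("B", 0.01))
    await asyncio.gather(slow, fast)
    return view.rendered


print(asyncio.run(main()))  # B
```

Without the check after the `await`, the slower request for A would finish last and overwrite B.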

🧹 Nitpick comments (1)

tests/test_summary_cache.py (1)

37-66: ⚡ Quick win

Add a round-trip test with a realistic fingerprint.

Current tests only exercise scalar-only fingerprints, so they don't catch the tuple→list serialization mismatch in fingerprint_workspace_storage (see the services/summary_cache.py finding). A test that calls fingerprint_workspace_storage (so workspace_files is populated), then set_cached_projects/get_cached_projects with that fingerprint, would guard against the cache silently missing on every repeat load.

💚 Suggested regression test

def test_cache_hit_with_workspace_files_fingerprint(self):
    with tempfile.TemporaryDirectory() as ws:
        entry_dir = os.path.join(ws, "entry1")
        os.makedirs(entry_dir)
        with open(os.path.join(entry_dir, "state.vscdb"), "wb") as f:
            f.write(b"x")
        entries = [{"name": "entry1"}]
        fp = fingerprint_workspace_storage(ws, entries, global_db_path=None, rules=[])
        set_cached_projects(fp, [{"id": "a"}], [])
        # Recompute the fingerprint to mimic a fresh process/page load.
        fp2 = fingerprint_workspace_storage(ws, entries, global_db_path=None, rules=[])
        self.assertIsNotNone(get_cached_projects(fp2))
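
The mismatch this test guards against can be reproduced in isolation. In the sketch below only the `workspace_files` key comes from the review; the path and mtime values are made up for illustration.

```python
import json

# A freshly computed fingerprint that stores (path, mtime) as a tuple.
fresh = {"workspace_files": [("entry1/state.vscdb", 1717358694.0)]}

# What a later page load reads back from the JSON cache file.
cached = json.loads(json.dumps(fresh))

print(fresh == cached)  # False: JSON turned the tuple into a list

# Emitting lists up front keeps the fingerprint stable across the round trip.
stable = {"workspace_files": [["entry1/state.vscdb", 1717358694.0]]}
print(stable == json.loads(json.dumps(stable)))  # True
```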

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_summary_cache.py` around lines 37 - 66, Add a regression test that
round-trips a realistic fingerprint containing workspace_files to catch the
tuple→list serialization mismatch: use fingerprint_workspace_storage to create
fp (with a temp workspace and an entry that creates a state.vscdb file), call
set_cached_projects(fp, projects, warnings), then recompute fp2 =
fingerprint_workspace_storage(...) to mimic a fresh process and assert
get_cached_projects(fp2) is not None; this ensures
set_cached_projects/get_cached_projects handle the workspace_files shape
consistently (see fingerprint_workspace_storage, set_cached_projects,
get_cached_projects, and services/summary_cache.py).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/summary_cache.py`:
- Around line 64-88: The fingerprint uses Python tuples in workspace_files but
JSON serialization converts tuples to lists, so cached fingerprints (lists)
never equal freshly computed fingerprints (tuples); change the construction of
workspace_files in the fingerprint generator to emit JSON-stable lists instead
of tuples (i.e., append [f"{name}/{rel}", mtime] rather than (f"{name}/{rel}",
mtime)) so set_cached_* and get_cached_* compare identical types, and optionally
make _fingerprint_equal normalize sequence types (e.g., convert tuples to lists
or use deep structural equality) to be robust to future tuple/list differences.

In `@services/workspace_db.py`:
- Around line 294-299: COMPOSER_ROWS_WITH_HEADERS_SQL's NOT LIKE predicate only
excludes the compact empty-array form and should also exclude the spaced JSON
form; update the SQL string constant COMPOSER_ROWS_WITH_HEADERS_SQL so the
predicate excludes both "\"fullConversationHeadersOnly\":[]" and
"\"fullConversationHeadersOnly\": []" (e.g. add an additional AND value NOT LIKE
'%fullConversationHeadersOnly\": []%' or normalize/remove whitespace in the
filter) to ensure rows with empty headers are filtered out.

In `@tests/test_workspace_listing_performance.py`:
- Around line 88-109: The test's spy patch is targeting
services.workspace_db.open_global_db but list_workspace_projects calls the
module-local open_global_db imported into services.workspace_listing, so update
the patch to target services.workspace_listing.open_global_db (e.g. import
services.workspace_listing as _ws_listing_mod and use
patch.object(_ws_listing_mod, "open_global_db", _spying_open_global_db)) so the
spy (_spying_open_global_db) actually intercepts calls used by
list_workspace_projects; also ensure executed_queries is asserted non-empty
before filtering for "bubbleId:%" to avoid vacuous success (refer to symbols
_spying_open_global_db, executed_queries, list_workspace_projects,
open_global_db, and patch.object).

In `@tests/test_workspace_tabs_summary.py`:
- Around line 111-133: The SQL spy wrapper in _collect_queries is patching
services.workspace_db.open_global_db but workspace_tabs imports open_global_db
directly, so change the patch target to services.workspace_tabs.open_global_db
in _collect_queries to ensure the wrapper runs; additionally harden the tests to
assert that the captured executed list is non-empty (fail the test if no SQL was
recorded) and update test_no_global_bubble_scan to call the exercised function
with nocache=True to force DB access; use the unique names _collect_queries,
executed, and test_no_global_bubble_scan/test_scoped_bubble_query_only to locate
the spots to modify.
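
The patch-target point in the two test findings above follows from how `from module import name` binds names. A self-contained sketch using throwaway modules; `demo_db`, `demo_listing`, `list_projects`, and `spy` are hypothetical stand-ins for `services.workspace_db`, the consuming service module, and the tests' spy wrapper.

```python
import sys
import types
from unittest.mock import patch

# Module that defines the function (stands in for services.workspace_db).
db_mod = types.ModuleType("demo_db")
exec("def open_global_db():\n    return 'real'\n", db_mod.__dict__)
sys.modules["demo_db"] = db_mod

# Consumer that imports the name directly into its own namespace.
listing_mod = types.ModuleType("demo_listing")
sys.modules["demo_listing"] = listing_mod
exec(
    "from demo_db import open_global_db\n"
    "def list_projects():\n"
    "    return open_global_db()\n",
    listing_mod.__dict__,
)


def spy():
    return "spy"


# Patching the defining module leaves the consumer's imported name untouched.
with patch.object(db_mod, "open_global_db", spy):
    wrong_target = listing_mod.list_projects()

# Patching the name in the consumer's namespace is what the call site sees.
with patch.object(listing_mod, "open_global_db", spy):
    right_target = listing_mod.list_projects()

print(wrong_target, right_target)  # real spy
```

A spy that never runs records no queries, so an assertion that no `bubbleId:%` query was recorded passes vacuously. That is why the findings also ask for a non-empty check on the captured list.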


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8f448e2-81de-4319-86b7-18abdc0bb2c8

📥 Commits

Reviewing files that changed from the base of the PR and between 3bd7ce0 and 02fcb65.

📒 Files selected for processing (13)

CHANGELOG.md
api/workspaces.py
services/summary_cache.py
services/workspace_db.py
services/workspace_listing.py
services/workspace_tabs.py
static/css/style.css
templates/workspace.html
tests/test_parse_warnings.py
tests/test_project_layouts_dict_shape.py
tests/test_summary_cache.py
tests/test_workspace_listing_performance.py
tests/test_workspace_tabs_summary.py

clean6378-max-it · 2026-06-02T23:12:03Z

Should fix

services/workspace_tabs.py:693 — In assemble_single_tab, replace load_project_layouts_map(global_db) with load_project_layouts_for_composer(global_db, composer_id) (and a minimal map {composer_id: …} for determine_project_for_conversation). (Per-tab load still full-scans all messageRequestContext:% rows, contradicting the scoped-load design and CHANGELOG wording for the single-tab endpoint.)

services/workspace_tabs.py:696-701 — Gate the broad composerData:% alias query behind invalid_workspace_ids (same pattern as list_workspace_tab_summaries / listing), or scope it to the one composer. (Every tab open re-scans all composers for alias resolution even when no invalid workspaces exist.)

api/workspaces.py:167-192 — Add Flask test-client coverage for GET /api/workspaces//tabs?summary=1 and GET /api/workspaces//tabs/<composer_id> (status, shape, no bubbles on summary). (New user-facing routes are only exercised via direct service calls; aligns with eval verification void.)

Nice to have

services/workspace_tabs.py:594 — messageCount uses len(headers) while _assemble_tab_from_composer_data may omit empty bubbles; consider counting renderable messages or documenting the difference. (Sidebar count can disagree with opened conversation.)

templates/workspace.html:113 — Prefer data-id + addEventListener over onclick="selectTab('${tab.id}')". (Avoids breakage if IDs ever contain quotes; minor hardening.)

services/workspace_listing.py:167-224, services/workspace_tabs.py:531-606 — Shared helper for composer-row → project assignment loop. (Reduces drift between list, summary, and full assembly under monolith-duplication.)

bradjin8 added 3 commits May 29, 2026 12:04

feat: initial implementation for lazy and dynamic loading

7058507

fix: loading spinner positioning

a3f3bbe

update: improve project-loading speed by implementing disk cache.

02fcb65

bradjin8 self-assigned this Jun 2, 2026

coderabbitai Bot reviewed Jun 2, 2026

bradjin8 added 2 commits June 2, 2026 16:30

Fix mypy errors in summary cache fingerprint comparison.

14e9f54

fix: review findings

f6d3ff9

bradjin8 requested a review from clean6378-max-it June 2, 2026 21:00

bradjin8 wants to merge 5 commits into master from feat/lazy-load-history
