
fix(sources): skip redundant file fingerprinting for already-watched files #275

Open
vparfonov wants to merge 1 commit into ViaQ:v0.54.0-rh from vparfonov:log9436

vparfonov commented Jun 9, 2026


On each glob cycle, FileServer fingerprinted every file returned by the paths provider, including files it was already actively watching. Each fingerprint costs several syscalls (open, seek, read, and so on). On clusters with 500+ pods this caused thousands of unnecessary read syscalls per minute, saturating disk I/O and disrupting etcd on control plane nodes.
Add a path-based reverse lookup before fingerprinting. If a file path is already tracked in fp_map and the file has not been truncated (file size >= read position), skip fingerprinting entirely. Truncated files still fall through to full fingerprinting to preserve correct behavior.
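
The skip condition described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the `Watched` struct and the `build_path_lookup` and `can_skip_fingerprint` functions are hypothetical names, and the real fp_map holds more state than a path and a read position.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the per-file state tracked by the file server.
pub struct Watched {
    pub path: PathBuf,
    pub read_position: u64,
}

/// Builds a reverse lookup from file path to the stored read position.
pub fn build_path_lookup(watched: &[Watched]) -> HashMap<PathBuf, u64> {
    watched
        .iter()
        .map(|w| (w.path.clone(), w.read_position))
        .collect()
}

/// Returns true when fingerprinting can be skipped: the path is already
/// tracked and the file is at least as large as the stored read position,
/// which means it has not been truncated.
pub fn can_skip_fingerprint(
    lookup: &HashMap<PathBuf, u64>,
    path: &Path,
    file_size: u64,
) -> bool {
    match lookup.get(path) {
        Some(&read_position) => file_size >= read_position,
        // Unknown path: it must be fingerprinted as before.
        None => false,
    }
}
```

An untracked path and a file that shrank below its read position both return false, so they keep taking the existing fingerprinting path.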

Measured impact (500 files, 35s trace):

  • open: 1,503 → 5 (99.7% reduction)
  • lseek: 3,000 → 0 (100% reduction)
  • read: 4,500 → 2,500 (44% reduction, remaining are data reads)
  • total: 12,033 → 2,555 (78.8% reduction)
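
The percentages above follow from (before - after) / before. A small sketch for re-checking them, where `reduction_percent` is a hypothetical helper and not part of the PR:

```rust
/// Percentage reduction from `before` to `after`, rounded to one decimal place.
/// Returns 0.0 when `before` is zero to avoid dividing by zero.
pub fn reduction_percent(before: u64, after: u64) -> f64 {
    if before == 0 {
        return 0.0;
    }
    let raw = (before as f64 - after as f64) / before as f64 * 100.0;
    (raw * 10.0).round() / 10.0
}
```

Applied to the listed counts, this gives about 99.7 for open, 100 for lseek, 44.4 for read (reported as 44%), and 78.8 for the total. The three itemized counters sum to 9,003 before and 2,505 after, so the totals presumably include syscalls that are not itemized here.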

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Summary by CodeRabbit

  • Refactor
    • Optimized file server's discovery process to reduce processing overhead and improve performance when monitoring file changes.

coderabbitai Bot commented Jun 9, 2026


Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Enterprise

Run ID: c2b43135-a52e-453e-bf24-50f7960085cb


Walkthrough

FileServer's periodic glob/discovery loop now builds a reverse lookup from watched file paths to their stored fingerprints and read positions. During path iteration, it checks whether the current file size is not smaller than the stored read position; if so, it marks the watcher findable and skips expensive re-fingerprinting. Existing fingerprinting and watcher logic remains as fallback.
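
One way to picture the loop described in this walkthrough is the sketch below. It is an illustration under assumptions, not the PR's implementation: `Watcher`, `discovery_pass`, and the use of a plain `u64` as the fingerprint are hypothetical, and file sizes are passed in rather than read from disk.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical watcher state; the real FileServer keeps more than this.
#[derive(Debug)]
pub struct Watcher {
    pub path: PathBuf,
    pub read_position: u64,
    pub findable: bool,
}

/// Runs one discovery pass over `(path, current_size)` pairs.
/// Returns the paths that still need the fallback fingerprinting path.
pub fn discovery_pass(
    watchers: &mut HashMap<u64, Watcher>, // keyed by fingerprint
    discovered: &[(PathBuf, u64)],
) -> Vec<PathBuf> {
    // Reverse lookup: path -> (fingerprint, read position).
    let by_path: HashMap<PathBuf, (u64, u64)> = watchers
        .iter()
        .map(|(fp, w)| (w.path.clone(), (*fp, w.read_position)))
        .collect();

    let mut needs_fingerprint = Vec::new();
    for (path, size) in discovered {
        if let Some(&(fp, read_position)) = by_path.get(path) {
            // Fast path: the file has not shrunk below the read position.
            if *size >= read_position {
                if let Some(w) = watchers.get_mut(&fp) {
                    w.findable = true;
                }
                continue;
            }
        }
        // Fallback: unknown path or truncated file.
        needs_fingerprint.push(path.clone());
    }
    needs_fingerprint
}
```

Only the paths returned by `discovery_pass` would incur the open, seek, and read calls of a full fingerprint in this sketch.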

Changes

FileServer fingerprinting optimization

| Layer / File(s) | Summary |
| --- | --- |
| Fast-path fingerprinting with lookup cache (lib/file-source/src/file_server.rs) | Reverse lookup map from watched paths to fingerprints/positions is constructed (lines 188–193) and used to conditionally bypass fingerprinting when file size is not smaller than recorded read position (lines 195–215); existing fingerprinting logic retained as fallback. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A fast path hops through the fingerprint field,
No re-scanning if size gains reveal,
The lookup cache keeps watch with care,
While fallback logic waits still there—
Optimization makes the loop more real! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main optimization: skipping redundant fingerprinting for already-watched files in the FileServer's periodic discovery loop. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@vparfonov

Author

/hold

@jcantrill

Member

@coderabbitai review

coderabbitai Bot commented Jun 9, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/file-source/src/file_server.rs`:
- Around line 199-203: The code currently treats metadata errors as if the file
is not dominated, which incorrectly triggers the fast-path and marks the watcher
findable; instead, change the logic around fs::metadata(&path).await so that you
only compute and use dominated when metadata() returns Ok — e.g., match on
fs::metadata(&path).await and set let dominated = metadata.len() <
*file_position only in the Ok branch, and in the Err branch fall through into
the existing fingerprinting path (do not set or use dominated nor mark the
watcher findable). Update the block that references dominated (the fast-path
check) to only run when metadata succeeded.
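
The finding above can be sketched as follows. This is an illustration under assumptions, not the PR's code: it uses the synchronous std::fs::metadata rather than the awaited fs::metadata call mentioned in the comment, and the `Decision` enum and `decide` function are hypothetical names.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical outcome of the fast-path check for one already-watched path.
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// File is intact: mark the watcher findable and skip fingerprinting.
    SkipFingerprint,
    /// Fall through to the existing fingerprinting path.
    Fingerprint,
}

/// Decides whether the fast path applies, given the stored read position.
pub fn decide(path: &Path, file_position: u64) -> Decision {
    match fs::metadata(path) {
        Ok(metadata) => {
            // `dominated` follows the review comment's naming: the file is
            // now shorter than the position already read, so it was truncated.
            let dominated = metadata.len() < file_position;
            if dominated {
                Decision::Fingerprint
            } else {
                Decision::SkipFingerprint
            }
        }
        // A metadata error must not trigger the fast path.
        Err(_) => Decision::Fingerprint,
    }
}
```

Because `dominated` is only computed in the `Ok` branch, a missing or unreadable file can never be treated as intact.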
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Enterprise

Run ID: 62dea80f-3c1b-4150-8d06-2353d3e0ad3c

📥 Commits

Reviewing files that changed from the base of the PR and between a329118 and 7d5aa86.

📒 Files selected for processing (1)
  • lib/file-source/src/file_server.rs

Comment thread lib/file-source/src/file_server.rs Outdated

jcantrill left a comment

Member


/approve

openshift-ci Bot commented Jun 9, 2026


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill, vparfonov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label Jun 9, 2026
…iles

  On each glob cycle, FileServer fingerprinted every file returned by the
  paths provider, even files already being actively watched. Each
  fingerprint involves syscalls (open, seek, read magic bytes, seek,
  read first line, EOF check). On clusters with 500+ pods this caused
  thousands of unnecessary read syscalls per minute, saturating disk I/O
  and disrupting etcd on control plane nodes.

  Add a path-based reverse lookup before fingerprinting. If a file path
  is already tracked in fp_map and hasn't been truncated (file size >=
  read position), skip fingerprinting entirely. Truncated files still
  fall through to full fingerprinting to preserve correct behavior.

  Measured impact (500 files, 35s trace):
  - open:  1,503 → 5    (99.7% reduction)
  - lseek: 3,000 → 0    (100% reduction)
  - read:  4,500 → 2,500 (44% reduction, remaining are data reads)
  - total: 12,033 → 2,555 (78.8% reduction)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Signed-off-by: Vitalii Parfonov <vparfono@redhat.com>
@vparfonov

Author

/test cluster-logging-operator-e2e

openshift-ci Bot commented Jun 9, 2026


@vparfonov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/cluster-logging-operator-e2e | cd88077 | link | true | /test cluster-logging-operator-e2e |

Full PR test history. Your PR dashboard.
