test(core): tolerate grader timing jitter #1355

Merged. christso merged 1 commit into main from fix/orchestrator-timing-flake on Jun 11, 2026.

@christso

Collaborator

Summary

  • Relax the flaky per-grader timing assertions to require a positive recorded duration instead of a lower bound equal to the setTimeout delay.
  • Keep the existing timestamp assertions that prove duration_ms is internally consistent with started_at/ended_at timing.
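
The two checks above can be sketched roughly as follows. This is a minimal illustration, not the code in orchestrator.test.ts: the `GraderTiming` shape, its camelCase field names, and the `timingProblems` helper are all assumptions, and "internally consistent" is assumed here to mean that the duration equals the millisecond difference between the two timestamps.

```typescript
// Hypothetical shape of the per-grader timing metadata (field names are assumed).
interface GraderTiming {
  startedAt: string; // ISO 8601 timestamp
  endedAt: string; // ISO 8601 timestamp
  durationMs: number;
}

// Returns a list of problems rather than throwing, so it works with any test runner.
function timingProblems(t: GraderTiming): string[] {
  const start = Date.parse(t.startedAt);
  const end = Date.parse(t.endedAt);
  if (Number.isNaN(start) || Number.isNaN(end)) {
    return ["timestamps must parse"];
  }

  const problems: string[] = [];
  // Relaxed check: any positive duration is accepted, with no bound tied to the timer delay.
  if (!(t.durationMs > 0)) {
    problems.push("durationMs must be positive");
  }
  // Consistency check (assumed to be exact): duration matches the timestamp difference.
  if (t.durationMs !== end - start) {
    problems.push("durationMs must equal endedAt - startedAt");
  }
  return problems;
}
```

Under these assumptions, a recorded duration of 49 ms with matching timestamps produces no problems, which is the case the old `>= 50` bound rejected.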

Root cause

Main CI failed in GitHub Actions run https://github.com/EntityProcess/agentv/actions/runs/27316820557/job/80699130076 because packages/core/test/evaluation/orchestrator.test.ts asserted durationMs >= 50 after awaiting a 50ms timer. CI recorded 49ms. The orchestrator records millisecond wall-clock timestamps before and after the evaluator call, so timer/clock granularity can make the measured elapsed time one millisecond below the requested delay even though timing metadata is present and correct.
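
One way to picture the off-by-one is the simplified model below. It assumes the recorded timestamps are truncated to whole milliseconds and that the timer callback can run a fraction of a millisecond before the full delay has elapsed on the wall clock. The `measuredDurationMs` helper and the sample values are hypothetical and are not taken from the orchestrator or from the CI run.

```typescript
// Simplified model: each recorded timestamp is truncated to a whole millisecond,
// so the measured duration is the difference of two truncated values.
function measuredDurationMs(trueStartMs: number, trueEndMs: number): number {
  return Math.floor(trueEndMs) - Math.floor(trueStartMs);
}

// Assumed scenario: a 50 ms timer whose callback runs after 49.7 ms of wall-clock time.
// The same true elapsed time is measured differently depending on the sub-millisecond phase.
const earlyPhase = measuredDurationMs(1000.2, 1049.9); // 1049 - 1000 = 49
const latePhase = measuredDurationMs(1000.6, 1050.3); // 1050 - 1000 = 50

console.log({ earlyPhase, latePhase });
```

In this model both measurements describe the same underlying interval, so an assertion pinned to the requested delay passes or fails depending on phase, while a positive-duration check holds in both cases.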

Verification

  • cd packages/core && bun test test/evaluation/orchestrator.test.ts -t "includes per-grader timing in scores" repeated 30x
  • cd packages/core && bun test test/evaluation/orchestrator.test.ts -t "includes per-grader timing even when evaluator fails" repeated 30x
  • cd packages/core && bun test test/evaluation/orchestrator.test.ts
  • bun --filter @agentv/core test
  • Pre-push hook: typecheck + biome check .

@cloudflare-workers-and-pages


Deploying agentv with Cloudflare Pages

Latest commit: 60e4eb6
Status: ✅  Deploy successful!
Preview URL: https://a04bb565.agentv.pages.dev
Branch Preview URL: https://fix-orchestrator-timing-flak.agentv.pages.dev


@christso merged commit f790a85 into main on Jun 11, 2026
8 checks passed
@christso deleted the fix/orchestrator-timing-flake branch on June 11, 2026 02:17