[SEA-NodeJS] (2/3) SEA execution + result fetching #410

Open: msrathore-db wants to merge 2 commits into main from msrathore/sea-execution-results

@msrathore-db msrathore-db commented Jun 1, 2026

Second of three stacked PRs (base: 1/3 connect + auth). Wires statement execution + result reads.

  • SeaSessionBackend.executeStatement — real impl; runs SQL via the napi Connection, returns a SeaOperationBackend.
  • SeaOperationBackend — fetch pipeline (Statement.fetchNextBatch → SeaResultsProvider → ArrowResultConverter → ResultSlicer) + cancel/close/finished via SeaOperationLifecycle.
  • SeaResultsProvider / SeaArrowIpc / SeaArrowIpcDurationFix — Arrow IPC decode for inline + CloudFetch (duration-fix rewrites Arrow Duration → Int64 for apache-arrow@13).
  • ArrowResultConverter — neutral { schema? } constructor so Thrift + SEA both construct it without an adapter.
  • flatbuffers pinned to 23.5.26 (matches apache-arrow@13's nested copy).
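The last stage of the fetch pipeline described above re-chunks variable-sized batches into the chunk size the caller asked for. A minimal sketch of that idea follows; `Row`, `BatchSource`, `Slicer`, and `sourceFrom` are hypothetical stand-ins rather than the PR's actual types, and the fetch is synchronous here only for brevity (the real napi fetch is presumably asynchronous).

```typescript
// Hypothetical stand-ins for illustration; not taken from the PR's source.
type Row = Record<string, unknown>;

interface BatchSource {
  // Returns the next decoded batch, or null once the stream is exhausted.
  // Synchronous for brevity; a real fetch would presumably be async.
  nextBatch(): Row[] | null;
}

// Re-chunks variable-sized batches into chunks of at most `chunkSize` rows.
class Slicer {
  private readonly source: BatchSource;
  private buffer: Row[] = [];
  private done = false;

  constructor(source: BatchSource) {
    this.source = source;
  }

  fetchChunk(chunkSize: number): Row[] {
    // Pull batches until the buffer can satisfy the request or the source ends.
    while (!this.done && this.buffer.length < chunkSize) {
      const batch = this.source.nextBatch();
      if (batch === null) {
        this.done = true;
      } else {
        this.buffer.push(...batch);
      }
    }
    // Hand back up to chunkSize rows and keep the remainder buffered.
    return this.buffer.splice(0, chunkSize);
  }

  hasMore(): boolean {
    return !this.done || this.buffer.length > 0;
  }
}

// Builds an in-memory source from a fixed list of batches (test helper).
function sourceFrom(batches: Row[][]): BatchSource {
  let i = 0;
  return { nextBatch: () => (i < batches.length ? batches[i++] : null) };
}
```

With batches of sizes 3, 0, and 1 and a chunk size of 2, the first two calls each return two rows and the third returns an empty chunk. The sketch only learns the stream has ended when it sees `null`, so `hasMore()` stays true until that third call.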

Tests: executeStatement/openSession forwarding, M0 datatype round-trip (primitives + ARRAY/MAP/STRUCT), multi-batch streaming, neutral-metadata contract.

Stack: 1/3 → 2/3#411

This pull request and its description were written by Isaac.

Second of three stacked PRs (base: [1/3] connect + auth). Wires the
statement-execution + result-read path:

- SeaSessionBackend.executeStatement: real implementation — runs SQL via the
  napi Connection and returns a SeaOperationBackend (replaces [1/3]'s stub).
- SeaOperationBackend: fetch pipeline (napi Statement.fetchNextBatch →
  SeaResultsProvider → ArrowResultConverter → ResultSlicer) plus operation
  cancel/close/finished via the SeaOperationLifecycle helpers.
- SeaResultsProvider / SeaArrowIpc / SeaArrowIpcDurationFix: Arrow IPC decode
  for inline + CloudFetch result batches (the duration-fix pre-processor
  rewrites Arrow Duration → Int64 so apache-arrow@13 can read it).
- ArrowResultConverter: constructor now takes the neutral { schema? } shape so
  both the Thrift and SEA backends construct it without an adapter.
- flatbuffers pinned to 23.5.26 to match apache-arrow@13's nested copy.
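The neutral `{ schema? }` constructor shape can be illustrated with a short sketch. Everything here except the `schema` field name is an assumption: the `Uint8Array` type, the `Converter` class, and the extra metadata fields are hypothetical and not the PR's actual definitions.

```typescript
// Hypothetical shapes for illustration; only the `schema?` field name comes from the PR text.
interface NeutralResultMetadata {
  // Serialized Arrow schema bytes, when the backend has them (type is an assumption).
  schema?: Uint8Array;
}

class Converter {
  private readonly schema: Uint8Array | undefined;

  constructor(metadata: NeutralResultMetadata) {
    // Only the neutral field is read, so any backend metadata carrying it qualifies.
    this.schema = metadata.schema;
  }

  hasSchema(): boolean {
    return this.schema !== undefined && this.schema.length > 0;
  }
}

// Backend-specific metadata objects with extra, made-up fields.
const thriftMeta = { schema: new Uint8Array([1, 2, 3]), resultFormat: "hypothetical-format" };
const seaMeta = { schema: new Uint8Array([4, 5]), statementId: "hypothetical-id" };

// Structural typing lets both be passed directly, with no adapter in between.
const fromThrift = new Converter(thriftMeta);
const fromSea = new Converter(seaMeta);
const withoutSchema = new Converter({});
```

Because the constructor depends only on the optional `schema` field, neither backend has to wrap its metadata before constructing the converter.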

Tests: executeStatement + openSession forwarding, M0 datatype round-trip
through the shared converter (primitives + ARRAY/MAP/STRUCT), multi-batch
streaming, and the neutral-metadata converter contract. Full INTERVAL-type
value parity + exhaustive operation-lifecycle coverage land in [3/3].

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
@msrathore-db msrathore-db force-pushed the msrathore/sea-execution-results branch from 33252ab to 59fa37a Compare June 1, 2026 17:57
@msrathore-db msrathore-db changed the base branch from msrathore/sea-connect-auth to main June 1, 2026 17:57
…rity, docs

Validated each finding against a live pecotesting warehouse first; the headline
INTERVAL story turned out to be split-artifact, not breakage.

- F7: getResultMetadata stored the *unpatched* Duration IPC bytes in
  meta.arrowSchema while advertising ArrowBased — store the patched bytes so an
  ArrowBased consumer doesn't hit `Unrecognized type "Duration" (18)`.
- F3: fetchChunk now honors the `isClosed` cooperative-cancel probe (parity with
  ThriftOperationBackend) at its yield points.
- F6: on a fetch error, best-effort close the statement (napi contract: stream
  is unspecified after Err) and surface a typed kernel error via decodeNapiKernelError.
- F9: cancel-after-fetch now throws the canonical OperationStateError(Canceled)
  ("The operation was canceled by a client") — byte-matches the Thrift message.
- F10: typed HiveDriverError (not raw Error) in the schema/fetchNextBatch guards.
- F1: corrected SeaArrowIpcDurationFix docs — on this layer the rewriter only
  makes Duration *decodable* (raw Int64); the duration_unit formatter lands in
  #411 (verified live: byte-identical to Thrift).
- F5: documented that nested Duration is a SHARED apache-arrow@13 limitation —
  verified the Thrift backend throws the identical error, so SEA matches parity.
- F2: added a live e2e that drives a real Arrow Duration column through the
  rewriter (asserts no "Duration (18)" crash + raw-Int64 on this layer).
- F8: pinned the no-`Failed` invariant in status() (failures reject at submit).
- F12: renamed SeaResultsProvider's SeaStatementHandle → SeaFetchHandle (was a
  name collision with the lifecycle interface of a different shape).
- F13: dropped the no-op await on the synchronous statement.schema().
- F14: fixed the Float-precision comment (Precision enum, not bit-width).
- F15: SeaResultsProvider.prime loops instead of self-recursing on empty batches.
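Three of the fixes above (F15's loop in place of self-recursion, F3's `isClosed` probe, and F6's best-effort close on a fetch error) can be sketched together. This is a minimal sketch under stated assumptions: `FetchHandle` and `primeNextBatch` are hypothetical, the fetch is synchronous for brevity, and returning `null` when the probe fires is this sketch's own choice, since the text does not say what the real code does at that point.

```typescript
// Hypothetical stand-in for illustration; not the PR's actual interface.
interface FetchHandle {
  // Returns raw batch bytes, or null at end of stream. Sync for brevity.
  fetchNextBatch(): Uint8Array | null;
  close(): void;
}

function primeNextBatch(handle: FetchHandle, isClosed: () => boolean): Uint8Array | null {
  // Loop rather than recurse, so a long run of empty batches cannot grow the stack.
  for (;;) {
    // Cooperative-cancel probe at the yield point.
    // Assumption: this sketch simply stops fetching when the probe fires.
    if (isClosed()) {
      return null;
    }

    let batch: Uint8Array | null;
    try {
      batch = handle.fetchNextBatch();
    } catch (err) {
      // Best-effort close: the stream state is unspecified after an error,
      // so try to release it, but never let a close failure mask the original error.
      try {
        handle.close();
      } catch {
        // Intentionally ignored.
      }
      throw err;
    }

    if (batch === null) {
      return null; // End of stream.
    }
    if (batch.length > 0) {
      return batch; // First non-empty batch.
    }
    // Empty batch: fall through and fetch again.
  }
}
```

The real code additionally surfaces a typed kernel error via `decodeNapiKernelError`; the sketch rethrows the original error unchanged because that helper's signature is not shown here.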

Deferred (noted on the PR): F4 (per-batch triple-decode perf) and F11
(hasResultSet() hard-coded true for M0).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
@msrathore-db msrathore-db deployed to azure-prod June 1, 2026 20:40 — with GitHub Actions Active