feat(kernel): surface kernel logs through Python logging on use_kernel by vikrantpuppala · Pull Request #824 · databricks/databricks-sql-python

vikrantpuppala · 2026-06-04T05:15:17Z

Summary

On the use_kernel=True path, the Rust kernel's logs now surface through Python's stdlib logging automatically — no extra setup. Kernel logs land under the databricks.sql.kernel logger, a child of the connector's databricks.sql.* namespace, so a customer's existing logging config just works:

import logging
logging.getLogger("databricks.sql").setLevel(logging.DEBUG)  # connector + kernel

Before this, kernel-backed sessions emitted nothing into Python logging — the kernel produced tracing events but nothing on the Python path consumed them.

Changes

backend/kernel/_errors.py calls databricks_sql_kernel.init_logging() once when the kernel extension loads (the canonical kernel-import site). The call is getattr-guarded, so an older kernel wheel without the function still works — just without kernel logs.
tests/e2e/test_kernel_backend.py adds two creds-gated e2e tests (matching the existing kernel e2e convention):
- test_kernel_logs_reach_python_logging — a query at DEBUG produces records on databricks.sql.kernel, and the pyo3 boundary surfaces under databricks.sql.kernel.pyo3.
- test_kernel_log_level_is_respected — at WARNING, the chatty DEBUG kernel records are filtered out (proving level control, not unconditional forwarding).

Cross-layer visibility

With this, a single query shows the full vertical in one stream:

databricks.sql.session       DEBUG  Creating kernel-backed client for use_kernel=True
databricks.sql.kernel.pyo3   DEBUG  Session.__new__: opening kernel session (GIL released)
databricks.sql.kernel        DEBUG  CreateSession: POST .../sessions (attempt 1/6)
databricks.sql.kernel        INFO   kernel.session; session_id=...
databricks.sql.client        DEBUG  Cursor.execute(operation=SELECT 1 AS a)
databricks.sql.kernel.pyo3   DEBUG  Statement.execute: running statement (GIL released)
databricks.sql.kernel        DEBUG  ExecuteStatement: POST .../statements (attempt 1/6)

Dependency

Requires the companion kernel/pyo3 change that exposes init_logging(): databricks/databricks-sql-kernel#120. Until a kernel wheel containing that function is installed, this is a no-op (the getattr guard), so it's safe to merge independently.

Testing

Both e2e tests pass against a live warehouse with a locally-built kernel wheel (maturin develop).

🤖 Generated with Claude Code

When the kernel backend loads, auto-initialize the kernel's tracing -> Python logging bridge so `use_kernel=True` users see kernel logs with no extra setup. Kernel logs land under the `databricks.sql.kernel` logger (a child of the connector's `databricks.sql.*` namespace), so an existing `logging.getLogger("databricks.sql").setLevel(...)` cascades to them. - `_errors.py` calls `databricks_sql_kernel.init_logging()` once at extension load (it's the canonical kernel-import site). The call is `getattr`-guarded so an older kernel wheel without the function still works — just without kernel logs. - e2e tests assert kernel records reach the `databricks.sql.kernel` logger (and the pyo3 boundary under `databricks.sql.kernel.pyo3`) and that the level set on the logger is respected. Creds-gated per the existing kernel e2e convention. Requires the companion kernel/pyo3 change (databricks-sql-kernel#120) that exposes `init_logging()`. Co-authored-by: Isaac Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

gopalldb · 2026-06-04T05:26:29Z

🟠 P1 (Important — should fix before merge)

init_logging() call is not wrapped against exceptions/panics — _errors.py:74-76 (confirmed: bare _kernel_init_logging()). The getattr guards a missing function, not a throwing one, and the surrounding
try/except ImportError only catches the import. A pyo3 panic surfaces as pyo3_runtime.PanicException — a BaseException, not caught by except Exception — and would propagate out of module import, failing
every use_kernel=True connection in the process over a non-essential logging feature. → wrap in try/except BaseException (deliberate, since pyo3 panics derive from it), log a debug breadcrumb, and continue
without the bridge.
The "idempotent on the Rust side" claim is load-bearing and unverifiable here — _errors.py:71-72. Safety rests on the Rust side using try_init/a guard rather than
pyo3_log::init()/tracing_subscriber...init(), both of which panic if a global logger was already set. The logger is process-global, so if the host app (or another lib) already installed a log/tracing
subscriber, a non-idempotent init panics → lands on P1#1. → land the try/except (neutralizes it regardless of the Rust impl), and confirm/tighten the comment to the exact contract ("safe to call repeatedly;
never panics").
Forwarding kernel DEBUG logs widens the data-exposure surface — once installed, any app setting databricks.sql to DEBUG (a common troubleshooting step) receives whatever the Rust kernel emits at
DEBUG/TRACE — commonly SQL text, bound parameter values, request URLs, and Authorization headers. None of that is visible/redactable from this repo, so the connector can't assert no leakage. Unlike
pure-Python paths where the connector controls/redacts its own log content, this routes kernel internals into app logs/aggregation. → audit what the kernel emits at DEBUG/TRACE for the databricks.sql.kernel
target; confirm credentials/auth headers are never logged; decide whether SQL/param exposure at DEBUG is acceptable, and if not, cap the forwarded level (e.g. INFO+) until the Rust side guarantees redaction.
Document the exposure in user docs.

🟡 P2 (Minor)

Level-filtering test proves caplog handler filtering, not source-side suppression — test_kernel_backend.py:173-194. caplog.at_level(WARNING, ...) sets the logger level and attaches a handler, so a DEBUG
record that crossed the FFI would still be dropped by Python before reaching caplog.records. The test can't distinguish "kernel suppressed at source" from "Python filtered after FFI" — false confidence that
a chatty TRACE loop avoids per-record FFI + LogRecord cost. → assert source-side suppression explicitly (emit counter) or soften the docstring + note the per-record FFI cost.
Hard-asserting databricks.sql.kernel.pyo3 records — :165-167 couples the connector test suite to the kernel's internal pyo3-log target naming; a benign kernel change breaks e2e. → treat .pyo3 as a
soft/optional check.
Skip hygiene — verified clean: importorskip + fake-stub file guard + kernel_conn_params gating mean both new tests skip cleanly without creds/wheel.

Review feedback from PR #824 (gopalldb): - P1: guard the init_logging() call against throwing, not just a missing function. A panic across the PyO3 boundary surfaces as pyo3_runtime.PanicException (a BaseException, not Exception), so a bare call could escape module import and fail every use_kernel=True connection over a non-essential logging feature. Wrap in try/except BaseException, log a debug breadcrumb, continue. This also neutralizes the idempotency concern regardless of the Rust impl. - P2: soften the level-control test docstring to make clear it asserts the effective customer-visible outcome (sub-threshold records don't surface), not source-side suppression — caplog filters after the FFI. - P2: downgrade the databricks.sql.kernel.pyo3 assertion to a soft warning so a benign kernel change to the boundary breadcrumb target doesn't break the connector e2e suite. The core databricks.sql.kernel contract is still hard-asserted. Co-authored-by: Isaac Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

vikrantpuppala · 2026-06-04T05:33:45Z

Thanks @gopalldb — these are great catches. Dispositions below; pushed in ca9234b.

P1#1 — bare `init_logging()` vulnerable to a throwing/panicking call ✅ Fixed

You're right. Wrapped in try/except BaseException (deliberately broad, since a PyO3 panic surfaces as pyo3_runtime.PanicException, a BaseException), logging a debug breadcrumb and continuing without the bridge. Logging is non-essential and must never take down a use_kernel=True connection.

P1#2 — idempotency claim load-bearing & unverifiable here ✅ Covered

The try/except above neutralizes this regardless of the Rust impl, as you suggested. FWIW the kernel side is safe — it guards with a OnceLock and uses pyo3_log::Logger::install() which returns a Result (matched, returns false on Err), so it returns rather than panics on a double install — but we no longer rely on that. Tightened the comment to state the guard holds regardless.

P1#3 — DEBUG data-exposure surface ✅ Audited; documented; no level cap

I audited what the kernel emits for the databricks::sql::kernel target:

Auth headers / tokens: never logged. No Authorization/Bearer/token-value logging. OAuth lines log cache state ("loaded cached token with refresh_token"), not secrets.
Bound parameter values: never logged.
SQL text: logged at DEBUG (ExecuteStatement + metadata queries).
URLs: logged, but they're API endpoints (/api/2.0/sql/statements), not credential-bearing.

On SQL-at-DEBUG: this is parity with the existing connector, which already logs both SQL and parameter values at DEBUG (client.py:1331: Cursor.execute(operation=%s, parameters=%s)). The kernel path is actually more conservative — SQL but not params. So this isn't a new exposure class for the driver; capping the kernel to INFO+ would (a) break that parity, (b) gut the feature's primary use case (debugging queries), and (c) still leave the connector's own DEBUG SQL/param logging in place. I'd rather keep parity and document it. Happy to add a note to user-facing docs that databricks.sql.kernel at DEBUG includes SQL text — let me know if you'd like that in this PR or a docs follow-up.

P2 — level test proves handler filtering, not source suppression ✅ Docstring softened

Correct. Reworded the docstring to scope the claim to the effective customer-visible outcome (sub-threshold records don't surface), and noted that source-side suppression / per-record FFI cost avoidance is a kernel-side concern, not asserted here.

P2 — hard-asserting `databricks.sql.kernel.pyo3` couples to kernel internals ✅ Softened

Downgraded to a warnings.warn so a benign kernel change to the boundary breadcrumb target doesn't break the connector e2e. The core databricks.sql.kernel contract stays hard-asserted.

P2 — skip hygiene

Thanks for confirming.

vikrantpuppala temporarily deployed to azure-prod June 4, 2026 05:15 — with GitHub Actions Inactive

vikrantpuppala temporarily deployed to azure-prod June 4, 2026 05:33 — with GitHub Actions Inactive

gopalldb approved these changes Jun 4, 2026

View reviewed changes

vikrantpuppala added this pull request to the merge queue Jun 4, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(kernel): surface kernel logs through Python logging on use_kernel#824

feat(kernel): surface kernel logs through Python logging on use_kernel#824
vikrantpuppala wants to merge 2 commits into
mainfrom
feat/kernel-logging-init

vikrantpuppala commented Jun 4, 2026

Uh oh!

gopalldb commented Jun 4, 2026

Uh oh!

vikrantpuppala commented Jun 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

vikrantpuppala commented Jun 4, 2026

Summary

Changes

Cross-layer visibility

Dependency

Testing

Uh oh!

gopalldb commented Jun 4, 2026

Uh oh!

vikrantpuppala commented Jun 4, 2026

P1#1 — bare init_logging() vulnerable to a throwing/panicking call ✅ Fixed

P1#2 — idempotency claim load-bearing & unverifiable here ✅ Covered

P1#3 — DEBUG data-exposure surface ✅ Audited; documented; no level cap

P2 — level test proves handler filtering, not source suppression ✅ Docstring softened

P2 — hard-asserting databricks.sql.kernel.pyo3 couples to kernel internals ✅ Softened

P2 — skip hygiene

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

P1#1 — bare `init_logging()` vulnerable to a throwing/panicking call ✅ Fixed

P2 — hard-asserting `databricks.sql.kernel.pyo3` couples to kernel internals ✅ Softened