Skip to content

Fix client *Dma address translation (KeyExport / NvmAdd / NvmRead) and KeyCacheDma use-after-free#403

Open
Frauschi wants to merge 4 commits into
wolfSSL:mainfrom
Frauschi:dma-client-translation-fixes
Open

Fix client *Dma address translation (KeyExport / NvmAdd / NvmRead) and KeyCacheDma use-after-free#403
Frauschi wants to merge 4 commits into
wolfSSL:mainfrom
Frauschi:dma-client-translation-fixes

Conversation

@Frauschi

@Frauschi Frauschi commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Background

A few client *Dma APIs wrote the raw client pointer straight into the wire message instead of routing it through the DMA address-translation callback like the rest of the family.

This is invisible on flat-memory ports (the POSIX SHM reference, where client and server share an address space), so it survived CI. It breaks on a real split-address-space port — found during DMA bring-up on one of the new hardware platforms for wolfHSM, where the server runs on a separate security CPU with no view of host RAM, so it bounds-checks the untranslated pointer and rejects the request (WH_ERROR_BADARGS).

What this fixes

  • Missing translation in KeyExportDma, KeyExportPublicDma, NvmAddObjectDma, NvmReadDma. Each now runs *_PRE translation before sending and the matching *_POST on the response, mirroring RngGenerateDma.
  • Use-after-free in KeyCacheDmaRequest, which ran its POST inside the request — releasing/recycling the scratch before the server read the buffer (benign on POSIX, a real UAF on any port whose callback allocates). It now defers the POST to KeyCacheDmaResponse.

Notable implementation details

  • The per-call-site async-DMA structs are consolidated into a single whClientDmaAsyncBuf (carrying the POST direction) plus a two-buffer nvmAdd variant, with a shared wh_Client_DmaAsyncPost() helper that returns the POST status (a failed unmap/copy-back is surfaced, not swallowed).
  • Each request clears its mapping slot(s) before the PRE, so a path that leaves a slot unpopulated (e.g. metadata-only NvmAddObjectDma) can't make the response POST stale union memory; data_hostaddr is 0 when there is no data buffer, so a raw client pointer is never forwarded.
  • Requests fail fast with WH_ERROR_REQUEST_PENDING on a busy transport (no mapping acquired then leaked).
  • wh_{Client,Server}_DmaRegisterAllowList() now accept NULL to unregister, symmetric with DmaRegisterCb(NULL).

Tests

The reason these bugs survived CI is that no POSIX test ever registered a non-identity DMA callback — a missing translation looked identical to a correct call. This PR adds a shared bounce-pool harness (test/wh_test_dma.c) that models a split-address-space port: the client callback bounces each buffer through a dedicated pool and hands the server a pool address; the server callback rejects any untranslated address. Freed slots are poisoned (so a premature POST corrupts the data) and a POST matching no live slot is recorded as a stray/double free.

  • wh_test_clientserver.c drives the keystore/NVM *Dma APIs end-to-end — including metadata-only and injected PRE-failure cases — in the single-thread pump harness, where ordering makes the use-after-free deterministic.
  • wh_test_crypto.c registers the same callback on the threaded harness, so the entire crypto/cert *Dma suite (AES/SHA/CMAC/RNG/Ed25519/ML-DSA/ML-KEM/cert, plus KeyExportPublicDma) runs through translation automatically.
  • Also fixes a pre-existing bug the new coverage surfaced: _testDma left a stack-local allow list registered after returning (a dangling pointer ASAN flags once a later test uses server DMA).

Runs under make DMA=1 (already set by every CI job).

Validation

  • make DMA=1 ASAN=1: passes, 0 ASAN findings.
  • Default non-DMA build: passes.
  • Each fix was confirmed to be caught by the new tests (regressions re-introduced → tests fail).

@Frauschi Frauschi self-assigned this Jun 5, 2026
@Frauschi Frauschi force-pushed the dma-client-translation-fixes branch 2 times, most recently from 364edf6 to b611887 Compare June 5, 2026 14:30
@Frauschi Frauschi marked this pull request as ready for review June 5, 2026 14:37
@Frauschi Frauschi assigned wolfSSL-Bot and unassigned Frauschi Jun 5, 2026
@Frauschi Frauschi requested a review from bigbrett June 5, 2026 14:37
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client.c
Comment thread src/wh_client.c Outdated
Comment thread src/wh_client_nvm.c Outdated
Comment thread src/wh_client_nvm.c Outdated
@Frauschi Frauschi force-pushed the dma-client-translation-fixes branch from b611887 to d126890 Compare June 9, 2026 14:21
rizlik
rizlik previously approved these changes Jun 9, 2026

@rizlik rizlik left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, lgtm!

@rizlik rizlik assigned bigbrett and unassigned Frauschi Jun 9, 2026
Comment thread src/wh_client.c Outdated
Comment on lines +1462 to +1465
c->dma.asyncCtx.buf.xformedAddr = keyAddrPtr;
c->dma.asyncCtx.buf.clientAddr = (uintptr_t)keyAddr;
c->dma.asyncCtx.buf.sz = keySz;
c->dma.asyncCtx.buf.postOper = WH_DMA_OPER_CLIENT_READ_POST;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems really clunky - should we not have a generic pre- wrapper like you added for the post operation that primes the asyncCtx under the hood based on the input args?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by adding the new wh_Client_DmaAsyncPre() helper analogous to the post helper.

Comment thread src/wh_client.c
}

ret = wh_Client_RecvResponse(c, &group, &action, &size, (uint8_t*)resp);
if (ret == WH_ERROR_NOTREADY) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be helpful to have a comment about why NOTREADY gets an early return but other errors do not (releasing the DMA mapping)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an explanatory comment (on all locations)

Comment thread src/wh_client.c
Comment thread src/wh_client_dma.c Outdated
buf->postOper, (whDmaFlags){0});
/* Clear the slot even on failure so a later Response cannot re-run the
* POST; the failure is returned to the caller. */
buf->sz = 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part of me thinks we should just clear the whole struct since it is no longer in use? Makes it seem more explicit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread src/wh_client_nvm.c
* unset, and the Response must not POST a stale (shared-union) size. */
c->dma.asyncCtx.nvmAdd.meta.sz = 0;
c->dma.asyncCtx.nvmAdd.data.sz = 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above re- clearing the whole struct. I wonder if this should have a helper just to make it explicit (1wh_Client_DmaAsyncCtxReset()`) or "clear" or something like that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleared now completely within the post helper.

bigbrett
bigbrett previously approved these changes Jun 9, 2026

@bigbrett bigbrett left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Frauschi I'm approving in light of my comments earlier. If you want it in ASAP you can go ahead and click merge since both Marco and I approved, though I'd like the comments addressed at some point. If you would rather do it in a follow-up PR, go ahead and merge. If you want to follow up with another PR on this branch, then you are welcome to do that too. Up to you, both fine with me.

@bigbrett bigbrett assigned Frauschi and unassigned bigbrett and wolfSSL-Bot Jun 9, 2026
Frauschi added 4 commits June 10, 2026 09:43
… buffers

KeyExportDma, KeyExportPublicDma, NvmAddObjectDma and NvmReadDma put the raw
client pointer in the wire message instead of running the DMA translation
callback -- invisible on flat-memory ports (POSIX SHM) but rejected by a
split-address-space server. Each now runs the PRE translation before sending
and the matching POST on the Response.

KeyCacheDmaRequest also ran its POST inside the Request, before the server read
the buffer (a use-after-free on ports whose callback allocates); it now defers
the POST to KeyCacheDmaResponse.

The per-call-site async structs are consolidated into whClientDmaAsyncBuf with a
shared wh_Client_DmaAsyncPost() helper that returns the POST status (so a failed
unmap/copy-back is surfaced). Each Request clears its slot(s) before the PRE, so
an unpopulated slot (e.g. metadata-only NvmAddObjectDma) cannot POST stale union
memory, and forwards data_hostaddr = 0 when there is no data buffer. Requests
fail fast with WH_ERROR_REQUEST_PENDING on a busy transport, and
wh_{Client,Server}_DmaRegisterAllowList() now accept NULL to unregister.
The existing POSIX tests never registered a non-identity DMA callback, so a
*Dma API that skipped translation looked identical to a correct one -- which is
why the keystore/NVM translation bugs survived CI. Add a shared "bounce-pool"
harness (test/wh_test_dma.c) modeling a split-address-space port: the client
callback bounces each buffer through a pool, the server callback rejects any
untranslated address, freed slots are poisoned so a premature POST
(use-after-free) corrupts the data, and a POST matching no live slot is recorded
as a stray/double free.

- wh_test_clientserver.c drives the keystore/NVM *Dma APIs (including
  metadata-only and injected PRE-failure cases) in the single-thread pump
  harness, where ordering makes the use-after-free deterministic.
- wh_test_crypto.c registers the same callback on the threaded harness so the
  whole crypto/cert *Dma suite runs through translation.
- _testDma now unregisters its stack-local allow list before returning (a
  pre-existing dangling pointer ASAN flags once a later test uses DMA).

Runs under `make DMA=1`.
@Frauschi Frauschi dismissed stale reviews from bigbrett and rizlik via 089eee2 June 10, 2026 09:21
@Frauschi Frauschi force-pushed the dma-client-translation-fixes branch from d126890 to 089eee2 Compare June 10, 2026 09:21
@Frauschi

Copy link
Copy Markdown
Contributor Author

@bigbrett I addressed your comments right away. The new pre and post helper are currently only applied to the *DMA methods touched in this PR. I'll do a follow-up PR that migrates all the other *Dma methods (mostly the many crypto operations) to these as well.

@Frauschi Frauschi assigned bigbrett and unassigned Frauschi Jun 10, 2026
@Frauschi Frauschi requested a review from bigbrett June 10, 2026 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants