Code generation for ThingClient subclasses by rwb27 · Pull Request #89 · labthings/labthings-fastapi

rwb27 · 2024-12-04T14:08:57Z

Currently, ThingClient subclasses are generated on-the-fly from a Thing Description. This "gets the job done" and means it's always possible to have a client that's up to date with the server. However, it would be nice to support static analysis, e.g. type checking, autocompletion, etc. with importable client classes for specific instruments.

Functionality

This PR implements client code generation, specifically:

A client class is created, subclassing ThingClient and named as per the title of the Thing Description.
Properties are added using their names from the TD
- Property types are converted to Python type hints (with very basic conversion)
- Read-only properties are made read-only. Write-only properties still have a read method.
- Implementations use get_property, set_property as provided by ThingClient
Actions are added using names from the TD
- The input schema is broken down into its properties (i.e. arguments), and each property's schema is converted to an argument with a type hint and default.
- Required properties of the input schema become required arguments
- Additional keyword arguments are always accepted and passed to the endpoint - TD DataSchema objects don't distinguish whether extra properties are allowed or not.
- The output schema is converted to a return type hint
- The function body calls invoke_action.
- ... is used to distinguish between explicitly set None values, and unspecified values (i.e. not included in the JSON). This allows the server to fill in default values, which is safer than trying to generate default values in the Python code.
ast is used to generate a syntax tree that's then turned into code. This has the significant advantage that:
- the generated code should always be valid Python
- it should be very robust against attack.

Schema conversion

Conversion from DataSchema to Python types is implemented, and can generate client types for everything in the current OFM simulation. Julian's more recent PR (#300) independently implemented very similar logic, which I take as a good sign.

Types are converted recursively, with support for:

integer -> int
number -> float
boolean -> bool
string -> str
anyOf -> Union
null -> None
array -> list (if an item type is specified, it's converted recursively) or tuple (if several item types are specified)
object -> dict[str, Any] or a generated pydantic.BaseModel subclass (if properties are specified)
Anything else -> Any

object schemas will be recursively converted to pydantic models. This can be turned off by disabling recursion, and more fine-grained control may be helpful.

Key differences from #300 are:

array schemas are converted to tuples if items is an array of Data Schema objects, as per JSONSchema 2019 (there's a link in the relevant comment in DataSchemaConverter.convert().
object schemas are converted to dynamically-generated pydantic.BaseModel subclasses, if they specify properties. If properties are not specified, object is still rendered as dict.
By default, this implementation is recursive, and will nest up to ten levels of type conversion. This may be increased or decreased by passing an argument to the constructor of DataSchemaConverter (though that's not yet exposed in e.g. ThingClient.from_url().

Still to do

Before this can be merged, we need to add a few things:

More testing, including all the data types
Actual use of the created class. This is done, because ThingClient.from_url now uses this module.
Better deduplication of models.
Consider how to expose (or not generate) models when returning only a client object, e.g. in ThingClient.from_url.
Recognition of Blob in the output (rather than autogenerating a model for it).
Expose controls for recursion and/or model generation in ThingClient.from_url so we can revert to something closer to the current client if that's desired.
It would be nice to check that the methods/properties of the generated client class match the Thing Description of the thing we're connecting to. This may be left for the future.

Future work

In a subsequent PR, it would be very good to:

Allow import of some input/output types, so we can re-use existing models rather than autogenerate copies.
Process values that come from the server, as a means of "reinflating" them to Python objects (particularly useful for np.array). This probably uses pydantic validation, probably with an option to skip it if needed.
Make it possible to wrap a Thing (analogous to DirectThingClient) to ease the transition between client-side and server-side code.

VigneshVSV · 2024-12-04T15:32:54Z

I remember us talking about this. Now that I am somewhat finished with my obligations, I will have deeper look at the latest changes in labthings.

FYI, I already released pydantic based property types from this code base as part of my repository.

rwb27 · 2025-11-14T09:38:33Z

I'm going to bump this to 0.0.13 as it's not yet critical. I am keen to get in in though, as it will really improve the experience of using lab things from e.g. a python notebook.

This generates Python code for a module containing a client. The Python code generated for the test thing executes successfully, though the name of the class is not quite right yet.

Dataschemas of type Object that have defined properties are now converted to Pydantic models. This should allow much better autocompletion. The inconsistency of using `dict[str, any]` as a fallback rankles, but I think it's the best option.

This should be significantly more secure: I believe it should now be impossible for a malcious Thing Description to create dangerous Python code. In doing this, I've simplified the way the client creates properties - I need to check if it type hints them correctly. There are still a few things to fix: * Default values probably want to be `...` rather than constants, so they're filled in server-side. * This doesn't yet validate/convert return values. * This isn't tested enough and probably doesn't actually work! * We need configurability: as a minimum, recursion limits and fallback to `Any` so it fails gracefully if given a bad TD.

This commit adds: * a limit on recursion * a bunch of tidying up * many more comments * cleaner generated code * generated docstrings for actions, properties, and Things. * thing client properties now derive from BaseDescriptor - this required a few changes, but makes the generated code neater. * BaseDescriptor may now be defined on non-Thing classes.

This moves the single file out of the folder, keeping the module name the same. I've also provided the `type_params` argument on Python > 3.11.

barecheck · 2026-06-10T23:11:53Z

Barecheck - Code coverage report

Total: 95.9%

Your code coverage diff: -1.07% ▾

Uncovered files and lines

File	Lines
src/labthings_fastapi/base_descriptor.py	181, 195-196
src/labthings_fastapi/client/__init__.py	41, 77, 80-81, 237, 286-289, 403
src/labthings_fastapi/code_generation.py	41-42, 138, 151, 182, 299, 303, 323-324, 331, 333, 368-369, 388-390, 463, 468-469, 484-485, 487, 598, 602, 697, 700, 725-726, 758-766, 769-770, 774, 779, 784-785

This compares unparsed strings rather than AST nodes, as the latter don't properly show as equal. `ast.compare` will help here, but not until we drop a few Python versions.

rwb27 · 2026-06-11T08:42:10Z

I think this is now sufficiently close to complete that it's better than the existing ThingClient implementation.

The unpinned tests are failing because of a new Ruff rule, this should be fixed separately (and I'm trying not to establish a precedent that upstream changes hold up merge requests if the pinned tests are passing).

The OFM test failure is, I believe, a bug in the OFM test that's uncovered by these changes, it should be a trivial fix in a feature branch.

I'm very aware this needs a lot more test code, and I'd be very grateful for advice from @bprobert97 on how to do that. I think what I really want is a way to randomly construct challenging type hints, but I think that would be a pig to implement, unless hypothesis is very much cleverer than I guessed.

For a first pass of review, I think I'm mostly hoping for feedback on:

Is code generation using ast from the Thing Description safe? I think it's OK; unless there's a bug in ast.unparse there's no way to achieve code execution from any string passed in from the Thing Description.
Is writing to a temporary file and loading it as a module a reasonable way to dynamically generate a subclass? It would also be possible to use eval, I don't think there's a huge change in safety but it would break the current implementation of properties, and potentially stop some documentation tools from picking up their docstrings (as this is commonly done by inspecting the source, which you can't do for modules created with eval).

I can't pretend I don't have an opinion here - I like this implementation. However, I'm also aware this sort of code generation needs a lot of care, and I do want to get it right.

rwb27 · 2026-06-11T09:30:09Z

For the record, I've also run the generated code for the microscope client through pdoc and at first glance the docs look really pretty nice.

One-line docstrings end up in both the `title` and `description` fields. We now deduplicate this, so the generated docstrings match the originals.

This executes `ruff` in a subprocess, which `ruff` flags as potentially insecure. I've thought through this carefully and I'm happy we can ignore the warning, as I don't see any opportunity to inject malicious input.

rwb27 mentioned this pull request Dec 18, 2024

Client should consistently return either dict or BaseModel #90

Open

rwb27 added this to the v0.0.12 milestone Jul 31, 2025

rwb27 changed the title ~~Draft: Code generation for ThingClient subclasses~~ Code generation for ThingClient subclasses Jul 31, 2025

rwb27 marked this pull request as draft July 31, 2025 08:37

rwb27 modified the milestones: v0.0.12, v0.0.13 Nov 14, 2025

rwb27 modified the milestones: v0.1.0, v0.1.1, v0.2.0 Mar 3, 2026

rwb27 mentioned this pull request Mar 30, 2026

Fix signatures for ThingClient #300

Draft

rwb27 modified the milestones: v0.2.0, v0.3.0 Apr 2, 2026

rwb27 added 2 commits June 4, 2026 12:45

Code generation for ThingClient subclasses

92fb4fa

This generates Python code for a module containing a client. The Python code generated for the test thing executes successfully, though the name of the class is not quite right yet.

Convert objets to Models

542917b

Dataschemas of type Object that have defined properties are now converted to Pydantic models. This should allow much better autocompletion. The inconsistency of using `dict[str, any]` as a fallback rankles, but I think it's the best option.

rwb27 force-pushed the client-code-generation branch from 2c68771 to bff6809 Compare June 4, 2026 16:20

rwb27 force-pushed the client-code-generation branch from bff6809 to 4a46348 Compare June 9, 2026 14:11

rwb27 added 2 commits June 10, 2026 23:32

Fix typing for python > 3.11 and move module.

7b6f87d

This moves the single file out of the folder, keeping the module name the same. I've also provided the `type_params` argument on Python > 3.11.

Deduplicated generated models

8469cdc

This compares unparsed strings rather than AST nodes, as the latter don't properly show as equal. `ast.compare` will help here, but not until we drop a few Python versions.

rwb27 requested a review from bprobert97 June 11, 2026 08:42

rwb27 mentioned this pull request Jun 11, 2026

Literal doesn't pass through to ThingClient. #355

Open

rwb27 added 2 commits June 11, 2026 10:46

Don't duplicate one-line docstrings

7db58cf

One-line docstrings end up in both the `title` and `description` fields. We now deduplicate this, so the generated docstrings match the originals.

Add a function to generate a module on disk.

71394cd

This executes `ruff` in a subprocess, which `ruff` flags as potentially insecure. I've thought through this carefully and I'm happy we can ignore the warning, as I don't see any opportunity to inject malicious input.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code generation for ThingClient subclasses#89

Code generation for ThingClient subclasses#89
rwb27 wants to merge 8 commits into
mainfrom
client-code-generation

rwb27 commented Dec 4, 2024 •

edited

Loading

Uh oh!

VigneshVSV commented Dec 4, 2024 •

edited

Loading

Uh oh!

rwb27 commented Nov 14, 2025

Uh oh!

barecheck Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

rwb27 commented Jun 11, 2026

Uh oh!

rwb27 commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rwb27 commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Functionality

Schema conversion

Still to do

Future work

Uh oh!

VigneshVSV commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rwb27 commented Nov 14, 2025

Uh oh!

barecheck Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Barecheck - Code coverage report

Uh oh!

rwb27 commented Jun 11, 2026

Uh oh!

rwb27 commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rwb27 commented Dec 4, 2024 •

edited

Loading

VigneshVSV commented Dec 4, 2024 •

edited

Loading

barecheck Bot commented Jun 10, 2026 •

edited

Loading