chore(proto): add buf-based proto skeleton + tooling (Phase 0)#614
chore(proto): add buf-based proto skeleton + tooling (Phase 0)#614lakhansamani wants to merge 1 commit into
Conversation
Sets up the foundation for the upcoming multi-protocol public API surface (GraphQL + gRPC + REST). No behaviour change: only the proto module, its generated Go bindings, Makefile helpers, and a CI lint/breaking-check job. - proto/ rooted at buf.build/authorizerdev/authorizer with v2 layout - common/v1: annotations (required_permissions, mcp_tool, audit_log, public), pagination, errors (ErrorReason enum), shared AppData - Generated Go committed under gen/go so go build works without protoc - Makefile: proto-tools (auto-installs buf), proto-lint, proto-breaking, proto-gen - CI: new "proto" job runs buf lint always and buf breaking on PRs Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakhansamani
left a comment
There was a problem hiding this comment.
Principal-engineer review — Phase 0 proto skeleton
Overall this is a well-scoped Phase-0 PR: it lands the foundations (buf v2 module, custom annotations, pagination, errors, AppData) without touching any runtime code, the lint/breaking CI gate is wired up correctly, and the generated Go is committed so consumers don't need protoc. I have no blocking objections, but there are a handful of things I'd tighten before more services pile on top of this surface — they get progressively harder to change once Phase 2/4 services depend on them.
Strengths
- buf v2 module layout with a single named module (
buf.build/authorizerdev/authorizer) is the right shape for a monorepo. Pinningbufbuild/protovalidateviabuf.lockand deferringgoogleapisto Phase 2 (to avoid the "unused dep" warning) is exactly the discipline I'd want. - Managed mode with a
disableforprotovalidateis the correct workaround for BSR-published Go SDKs — easy to miss, glad it's there with a comment explaining why. go_package_prefixoverride giving a stablegithub.com/authorizerdev/authorizer/gen/gopath means consumers cango getthe module without a separate publish step. Good.- Annotation design is sensible:
required_permissionsmodelled as repeatedPermissionRequirement(AND semantics) cleanly mirrors the existing authz resource/scope model.mcp_tool.destructiveis forward-thinking — many MCP hosts already key off this for confirmation prompts. - Error taxonomy maps 1:1 to gRPC canonical codes and documents the HTTP mapping inline. Using
google.rpc.ErrorInfo.reasonis the right vehicle (clients branch on the enum, not the string). - Pagination dual-shape (
page/limitfor back-compat,page_tokenfor AIP-158) is pragmatic and the doc comment makes the precedence explicit. - CI: lint on every push, breaking only on PRs against main. Correct — running
buf breakingon push-to-main would always be a no-op or false positive. - Makefile auto-installs buf with a pinned version; commits generated Go so
go build ./...works without protoc — both reduce contributor friction.
Issues / suggestions
Important (worth fixing before Phase 2 lands)
-
No
git diff --exit-codecheck afterbuf generatein CI. The proto skeleton commits generated Go but nothing verifies it stays in sync with.proto. A reviewer editing a proto can forget to regenerate and the PR will still be green (no Go code importsgen/yet, so evengo buildwon't catch it). Add a third step to theprotojob:- run: make proto-gen - run: git diff --exit-code -- gen/ || (echo "::error::run make proto-gen and commit gen/"; exit 1)
This becomes load-bearing the moment Phase 2 lands, so easier to add now.
-
ERROR_REASON_VERIFICATION_REQUIREDandTOKEN_EXPIREDandACCOUNT_DEACTIVATEDandOPERATION_DISABLEDall map toFAILED_PRECONDITION/ HTTP 412. That's four distinct user-actionable failure modes collapsing into one transport status. The reason enum disambiguates, which is the whole point — but worth confirming: (a)FAILED_PRECONDITIONis canonically HTTP 400, not 412 per google.rpc.Code and grpc-gateway's default mapping; 412 isprecondition_failedwhich gateway doesn't emit by default. The doc comments say 412 for all of these — that won't match what grpc-gateway actually returns unless you customise the error handler. Suggest either correcting the comments to 400, or noting the planned gateway override. -
Pagination semantics for
limit=0andpage=0.(buf.validate.field).int64 = {gte: 0}allows zero for both. What doeslimit=0mean — "default" (10), "none" (0 results), or "all"? Same forpage=0vs. the "1-based" doc comment. Aoneof { offset_paging, page_token }would express the precedence more honestly than the "server picks page_token when set, else falls back" prose, and(buf.validate.field).int64 = {gt: 0}plus an optional wrapper would let "unset" mean "default" without conflating with zero. As-is, every server implementation will independently re-invent the "is this 0 or unset" logic. -
required_permissionsAND-only semantics will hurt. The comment acknowledges this: "to express OR, list alternatives in a single PermissionRequirement with a wildcard scope and let the policy engine evaluate." That pushes OR semantics into the resource/scope strings — i.e. you encode policy in identifier strings instead of the annotation. Two cleaner options to consider before Phase 2: (a) make the option aPermissionRequirementSetmessage withrepeated PermissionRequirement all_ofandrepeated PermissionRequirement any_of; or (b) keep the field repeated but document that the interceptor evaluates a boolean expression ((resource, scope) OR (resource, scope2)) — but then you need a grammar. Option (a) is simpler and forward-compatible. Doing this after services already use the simple form is a breaking proto change. -
Pagination.offsetis redundant withpage * limitand not present in the GraphQLPaginationtype's intent (it's there but always derivable). For a new API surface I'd drop it — clients should compute it or usepage_token. Keeping it perpetuates a small piece of the legacy shape into a v1 you're committing to.
Nit
-
Extension field numbers 50001-50004 are in the "internal use" range (1000-99999 per protobuf docs). Fine for a server-internal options module, but if you ever publish
authorizer.common.v1.annotationsfor third parties to consume (which the BSR module name suggests is plausible), consider requesting a range in the protobuf-global-extension-registry. Not blocking; worth a comment in the .proto noting the rationale. -
Lint excepts
PACKAGE_VERSION_SUFFIXbut every current package (authorizer.common.v1) already has the.v1suffix, so the rule would pass. The exception signals "future packages may not have the suffix" which is the opposite of what we want for an evolving public API. Drop the exception or pin a comment explaining why it's there. -
MakefileBUF ?= $(shell command -v buf …)is evaluated at parse time. Insideproto-toolstheif [ -z "$(BUF)" ]works the first time, but downstream targets invoke barebuf(not$(BUF)), which requires$GOBINto already be onPATH. CI is unaffected (usesbufbuild/buf-action), but a fresh local dev withoutGOBINinPATHwill seeproto-lintfail right after install. Either pointBUFat$(shell go env GOBIN)/bufafter install, or document thePATHrequirement. -
proto-genrunsbuf dep updateunconditionally. Fine for the common case but flakes offline. Considerbuf dep prune/skip whenbuf.lockis fresh, or split intoproto-deps+proto-gen. -
buf_actionbreaking_againstURL useshttps://github.com/${{ github.repository }}.git#branch=main. Works for public repos. If the repo ever goes private, this needs a token. Worth a comment in the workflow as a future-self reminder. -
AppDatawraps a singlegoogle.protobuf.Struct. Defensible (lets you addschema_versionetc. later without breaking) but if no further fields are anticipated, nakedStructremoves a layer. Either way, document that JSON-int64 → float64 fidelity loss is expected (this bites people every time). -
Field ordering disconnect: GraphQL
PaginationRequestis{limit, page}; proto is{page=1, limit=2, page_token=3}. Cosmetic; doesn't affect wire compatibility (different schemas) but a teammate trained on one will reach for the other's order. Worth aligning if cheap. -
go.modaddsgoogle.golang.org/protobufas indirect. Once the gRPC server lands in Phase 2 it'll become direct. No action needed now, just flagging thatgo mod tidywill reshuffle it.
Test coverage gaps
This PR adds zero test files, which is reasonable for a code-generation PR — but a couple of cheap tests would lock in the contract:
- Generated-code drift test. As noted above: a CI step that runs
make proto-genandgit diff --exit-code -- gen/. Catches stale generated files. - Annotation round-trip test (defer to Phase 1/2). A tiny Go test that constructs a
descriptorpb.MethodOptions, attaches each of the four extensions, marshals + unmarshals, and asserts you can read them back viaproto.GetExtension. Cheap insurance that the option numbers don't drift and that consumers (interceptors, MCP scanner) can rely on the contract. - buf-format check. Consider adding
format: trueto the buf-action step (or amake proto-fmt-checkrunningbuf format -d --exit-code) so style drift doesn't accumulate.
Recommendation
Approve with comments. None of the issues block merging Phase 0 — but items 1 (gen-drift CI), 2 (FAILED_PRECONDITION docs), 3 (pagination zero semantics), and 4 (AND-only permissions) all become breaking-to-fix the moment Phase 2 services start using these types, so I'd prefer to see them resolved before or in the same window as Phase 2.
Summary
_/_authz_prefixed) stay GraphQL.proto/rooted atbuf.build/authorizerdev/authorizer(buf v2 layout) withcommon/v1:required_permissions,mcp_tool,audit_log,publicPaginationRequest/Paginationaccepting both legacypage/limitand AIP-158page_tokenErrorReasonenum surfaced viagoogle.rpc.StatusErrorInfoAppData(mirrors the GraphQLMapscalar)gen/go/sogo buildworks withoutprotoc.Makefiletargets:proto-tools(auto-installsbufv1.47.2),proto-lint,proto-breaking,proto-gen.protousingbufbuild/buf-action@v1: lint always, breaking-check on PRs vsmain.Test plan
make proto-tools && make proto-lintexits 0make proto-genproduces 4.pb.gofiles cleanlygo build ./...succeeds (includinggen/)protojob green on this PRtestjob still greenFollow-ups (separate PRs per the approved plan)
internal/servicelayer🤖 Generated with Claude Code