Highlight the remaining YAML monogram#12 items (#4/#5/#8/#10)#21

Merged: johnsoncodehk merged 2 commits into master from fix-yaml-issue12-remaining on Jun 7, 2026

johnsoncodehk (Owner) commented Jun 7, 2026

The remaining four monogram#12 items, all highlighter-only: the yaml.ts parser is unaffected (src-coverage-yaml alignment stays 100%), and the other six grammars regenerate byte-identical.

#5 — invalid escape in a double-quoted scalar

"quoted \' scalar": \' is not a valid YAML escape and was left as plain string content. The derived quoted-string region now emits an invalid.illegal.constant.character.escape pattern after the valid-escape pattern, so an unrecognised escape is highlighted while valid escapes still win the leftmost tie. This applies only to backslash-escape strings, not to doubled-delimiter ('') ones, where a lone \ is literal.

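The ordering described above (valid escapes first, a catch-all invalid escape second) can be sketched outside the grammar as a small scanner. This is only an illustration: `findEscapes`, `EscapeHit`, and the exact character class are assumptions written for this note, not code from the PR, and the valid set follows the escape list in the YAML 1.2 spec as I read it.

```typescript
// Hypothetical helper, not taken from the PR: classify backslash escapes
// found in the body of a YAML double-quoted scalar.
interface EscapeHit {
  text: string;
  index: number;
  kind: "valid" | "invalid";
}

function findEscapes(body: string): EscapeHit[] {
  // Alternative 1 (captured) lists the valid escapes; alternative 2 is the
  // catch-all. Because alternative 1 is tried first at the same offset,
  // a valid escape always wins over the invalid fallback.
  const pattern =
    /\\(?:([0abt\tnvfre "\/\\N_LP]|x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})|.)/g;
  const hits: EscapeHit[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(body)) !== null) {
    hits.push({
      text: m[0],
      index: m.index,
      kind: m[1] !== undefined ? "valid" : "invalid",
    });
  }
  return hits;
}
```

Calling `findEscapes("quoted \\' scalar")` reports the `\'` as a single invalid escape, while `\n` or `\"` in the same position would be reported as valid.
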
#8 — glued # after a directive

%YAML 1.1#...: the glued #... (no preceding space) was scoped as a comment. Per §6.6 a # is a comment only at line start or after whitespace, so the Comment token gains a notPrecededBy(nonWhitespace) guard (portable fixed-width (?<!\S)).
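
The effect of the `(?<!\S)` guard can be checked in isolation. The sketch below is an assumption-laden illustration: `COMMENT` and `commentStart` are hypothetical names, the pattern ignores quoting and every other YAML context, and it uses JavaScript's regex engine rather than Onigmo.

```typescript
// Hypothetical stand-in for the Comment token: a '#' starts a comment only
// when it is not preceded by a non-whitespace character.
const COMMENT = /(?<!\S)#.*$/;

// Returns the offset where a comment starts, or -1 if the line has none.
function commentStart(line: string): number {
  const m = COMMENT.exec(line);
  return m ? m.index : -1;
}
```

With this guard, `commentStart("%YAML 1.1#...")` returns -1 because the `#` is glued to `1`, while `commentStart("key: value # note")` still finds the comment after the space.
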

#4 — malformed directive line

%YAML 1.2 foo: a directive owns its whole line (§6.8), so the trailing param is illegal and the parser rejects it. Neither directive token matches, so foo fell through to the plain-scalar tokens as a stray string.unquoted. A % can never begin a plain scalar (§7.3.3), so a %-led line that the clean directive tokens did not claim is always a malformed directive. A #directive-malformed fallback re-scopes the whole line as an invalid directive. The % indicator is read from the directive tokens rather than hardcoded, and the fallback is ranked below the clean directives and above the plain scalars.
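
The ranking (clean directives first, malformed fallback second, everything else last) can be sketched as a plain classifier. All names and patterns here are hypothetical simplifications: the real tokens are derived by the generator, and the indicator is hardcoded below only for brevity, whereas the PR reads it from the directive tokens.

```typescript
type DirectiveClass = "clean-yaml" | "clean-generic" | "malformed" | "other";

// Assumed for the sketch; the PR derives this from the directive tokens.
const INDICATOR = "%";

// %YAML takes exactly one version parameter, optionally followed by a comment.
const CLEAN_YAML = /^%YAML[ \t]+\d+\.\d+(?:[ \t]+#.*)?[ \t]*$/;
// Any other directive name; the "%YAML " prefix is excluded on purpose.
const CLEAN_GENERIC = /^%(?!YAML[ \t])\S+(?:[ \t]+\S+)*[ \t]*$/;

function classifyDirectiveLine(line: string): DirectiveClass {
  if (CLEAN_YAML.test(line)) return "clean-yaml";
  if (CLEAN_GENERIC.test(line)) return "clean-generic";
  // Ranked after the clean directives: a %-led line nobody claimed.
  if (line.startsWith(INDICATOR)) return "malformed";
  return "other";
}
```

Here `%YAML 1.2 foo` fails the arity check in `CLEAN_YAML`, is excluded from `CLEAN_GENERIC` by its prefix, and therefore lands in the `malformed` branch instead of reaching any plain-scalar handling.
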

#10 — explicit-indent block scalar

abc: |5: an explicit |N pins the content indent at parent+N, overriding the body's auto-detect. Auto-detect floors at the first content line, so a deeper first line releases a real body line at parent+N as a comment. TextMate can't use a captured digit as a repeat count portably: RedCMD does, via Oniguruma {\N} backref-as-count + conditionals + subroutines, but all of these are rejected by Onigmo / GitHub-Linguist. The portable spelling is therefore one region per digit with a literal {N} count. The structure is the same as the auto-detect block scalars; the while bound becomes \1 {N} and the body is painted via contentName. Regions are emitted for digits 1–9 in both value position (covering nested and doc-root --- |N) and sequence position (- a: |N, where the floor adds the dash column via \3). Verified: empty-line survival, deeper key-shaped body lines stay opaque, >N and chomping in either order.
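
A rough sketch of the "one region per digit" idea follows. The rule shape, the begin pattern, and the scope name are assumptions made for illustration (only the value position is covered, not the sequence position); the point is that the repeat count in the `while` pattern is a literal baked in at generation time, never a backreference.

```typescript
// Hypothetical rule shape, loosely modelled on a TextMate begin/while rule.
interface BlockScalarRegion {
  digit: number;
  begin: string;
  while: string;
  contentName: string;
}

function explicitIndentRegions(): BlockScalarRegion[] {
  const regions: BlockScalarRegion[] = [];
  for (let n = 1; n <= 9; n++) {
    regions.push({
      digit: n,
      // Group 1 captures the parent indent; the digit and the chomping
      // indicator may appear in either order.
      begin: `^( *)[^\\s#][^\\n]*?:[ \\t]+[|>](?:${n}[+-]?|[+-]${n})[ \\t]*$`,
      // Body lines need parent indent plus exactly-at-least N spaces;
      // blank lines also keep the region open.
      while: `^(?:\\1 {${n}}|[ \\t]*$)`,
      contentName: "string.unquoted.block.yaml", // assumed scope name
    });
  }
  return regions;
}

// Simulates how an engine would resolve \1 against the captured indent.
function continuesBody(region: BlockScalarRegion, parentIndent: string, line: string): boolean {
  const concrete = region.while.replace("\\1", parentIndent);
  return new RegExp(concrete).test(line);
}
```

For `abc: |5` at column 0, the digit-5 region keeps a line indented by five spaces and an empty line inside the body, and releases a line indented by only three spaces.
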

Verification

  • yaml-issue12-regressions: 10 pass / 0 known-bug / 0 regression — every case is hard-gated (no bug: flags remain), so any future regression fails the run.
  • src-coverage-yaml parser alignment 100%; other six grammars byte-identical; scope-gap-yaml monogramWrong 8 → 7 (99.66% > official 99.51%); agnostic 9/9; sanity 15/15; RedCMD Onigmo diagnostics clean; tsc clean.
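
The pass / known-bug / regression split in the first bullet can be read as the following tally. This is a guess at the bookkeeping, not the actual harness: `RegressionCase`, `Tally`, and `tallyCases` are hypothetical names, and only the `bug` flag and the three buckets come from the text above.

```typescript
// Hypothetical case record: `bug: true` marks a failure that is tolerated.
interface RegressionCase {
  id: string;
  passed: boolean;
  bug?: boolean;
}

interface Tally {
  pass: number;
  knownBug: number;
  regression: number;
}

function tallyCases(cases: RegressionCase[]): Tally {
  const tally: Tally = { pass: 0, knownBug: 0, regression: 0 };
  for (const c of cases) {
    if (c.passed) tally.pass++;
    else if (c.bug) tally.knownBug++; // tolerated failure
    else tally.regression++; // hard-gated: this should fail the run
  }
  return tally;
}
```

Under this reading, once no case carries a `bug` flag, any failing case can only land in `regression`, which is what makes every case hard-gated.
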

Closes the YAML half of #12.

Refs #12

Two more monogram#12 items, highlighter-only:

#5 — a double-quoted scalar's INVALID escape (`"quoted \' scalar"`: `\'` is not a
valid YAML escape) was left as plain string content. The derived quoted-string
region now emits an `invalid.illegal.constant.character.escape` pattern after the
valid-escape pattern, so an unrecognised `\.` is highlighted (the valid escapes
still win the leftmost tie). Only for backslash-escape strings, not doubled-
delimiter (`''`) ones, where a lone `\` is literal.

#8 — `%YAML 1.1#...`: the glued `#...` (no preceding space) was scoped as a comment.
Per YAML §6.6 a `#` is a comment only at line start or after whitespace, so the
Comment token gains a `notPrecededBy(nonWhitespace)` guard (a portable fixed-width
`(?<!\S)`). Plain scalars already keep a glued `#` as content; this stops the Comment
token from claiming a glued `#` a directive left behind. Parser is unaffected
(src-coverage-yaml alignment still 100%) — the lexer skips comments via the indent
config, so this only changes the highlighter's Comment pattern.

Also un-flag the now-fixed regression cases (#5/#6/#7/#8/#9 drop `bug:true` → hard-
gated; #6/#7/#9 were stale from the earlier fixes). yaml-issue12-regressions now
8 pass / 2 known-bug (#4, #10) / 0 regression. The other six grammars regenerate
byte-identical; agnostic 9/9; sanity 15/15; RedCMD Onigmo diagnostics clean.

Refs #12
…alars (#10)

The last two monogram#12 items, highlighter-only (parser unaffected — src-coverage-yaml
alignment stays 100%; the other six grammars regenerate byte-identical).

#4 — `%YAML 1.2 foo` (a malformed directive). A directive owns its whole line (§6.8), so a
trailing param is illegal: YamlDirective's arity lookahead fails and the generic Directive
excludes the `%YAML ` prefix, so neither token matches and `foo` falls through to the plain-
scalar tokens (mis-scoped as a stray string.unquoted). A `%` can never begin a plain scalar
(§7.3.3 — `%` is a c-indicator), so a `%`-led line the clean directive tokens did not claim is
always a malformed directive. A `#directive-malformed` fallback re-scopes the whole line as an
invalid directive; the indicator is read from the directive tokens' leading literal (not
hardcoded), ranked below the clean directives and above the plain scalars (scopeOrder 6.5).

#10 — `abc: |5` (explicit indentation indicator). An explicit `|N` pins the content indent at
parent+N, overriding the body's auto-detect (which floors at the FIRST content line, so a
deeper first line releases a real body line at parent+N as a comment). TextMate cannot use a
captured digit as a repeat count portably (RedCMD does, via Oniguruma `{\N}` backref-as-count +
conditionals + subroutines — all rejected by Onigmo / GitHub-Linguist), so the portable spelling
is a region per digit with a literal `{N}` count. Same structure as the auto-detect block scalars
(forward-captured node indent + an inner introducer rule); the `while` bound becomes `\1 {N}` and
the body is painted via `contentName` (the floor is known, so no auto-detect is needed). Emitted
for digits 1–9 in both value position (covering nested and doc-root `--- |N`) and sequence
position (`- a: |N`, whose floor adds the dash column via `\3`). Verified: empty-line survival,
deeper key-shaped body lines stay opaque, `>N` and chomping in either order, all Onigmo-clean.

yaml-issue12-regressions now passes 10/10 with no `bug:` flags (all hard-gated). scope-gap-yaml
monogramWrong 8 → 7 (99.66% > official 99.51%); agnostic 9/9; tm-diagnostics clean; tsc clean.

Refs #12
johnsoncodehk changed the title from "Highlight YAML invalid escapes (#5) and glued directive '#' (#8)" to "Highlight the remaining YAML monogram#12 items (#4/#5/#8/#10)" on Jun 7, 2026
johnsoncodehk linked an issue on Jun 7, 2026 that may be closed by this pull request
johnsoncodehk merged commit c404452 into master on Jun 7, 2026
2 checks passed