Highlight the remaining YAML monogram#12 items (#4/#5/#8/#10) #21
Merged
Two more monogram#12 items, highlighter-only:

#5: a double-quoted scalar's invalid escape (`"quoted \' scalar"`; `\'` is not a valid YAML escape) was left as plain string content. The derived quoted-string region now emits an `invalid.illegal.constant.character.escape` pattern after the valid-escape pattern, so an unrecognised `\.` is highlighted (the valid escapes still win the leftmost tie). This applies only to backslash-escape strings, not doubled-delimiter (`''`) ones, where a lone `\` is literal.

#8: in `%YAML 1.1#...`, the glued `#...` (no preceding space) was scoped as a comment. Per YAML §6.6 a `#` is a comment only at line start or after whitespace, so the Comment token gains a `notPrecededBy(nonWhitespace)` guard (a portable fixed-width `(?<!\S)`). Plain scalars already keep a glued `#` as content; this stops the Comment token from claiming a glued `#` that a directive left behind.

Parser is unaffected (src-coverage-yaml alignment still 100%): the lexer skips comments via the indent config, so this only changes the highlighter's Comment pattern.

Also un-flag the now-fixed regression cases (#5/#6/#7/#8/#9 drop `bug:true` and become hard-gated; #6/#7/#9 were stale from the earlier fixes). yaml-issue12-regressions now 8 pass / 2 known-bug (#4, #10) / 0 regression. The other six grammars regenerate byte-identical; agnostic 9/9; sanity 15/15; RedCMD Onigmo diagnostics clean.

Refs #12
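The two pattern changes in this commit can be illustrated with a minimal sketch. It is not the generated grammar: `commentStart`, `classifyEscapes`, and both pattern strings are hypothetical stand-ins, the escape list is written from the YAML 1.2 escape set, and the comment check ignores quoting context entirely.

```typescript
// Simplified, hypothetical stand-ins for the generated patterns (not the real grammar output).

// '#' starts a comment only at line start or after whitespace (YAML §6.6).
// (?<!\S) is a fixed-width negative lookbehind: "not preceded by a non-whitespace char".
const COMMENT = new RegExp(String.raw`(?<!\S)#.*$`);

/** Index where a comment starts on `line`, or -1 when there is none. */
function commentStart(line: string): number {
  const m = COMMENT.exec(line);
  return m ? m.index : -1;
}

// Escapes in a backslash-escape (double-quoted) string. The valid alternative is listed
// first, so when both could start at the same backslash the valid one wins the tie.
const VALID_ESCAPE = String.raw`\\(?:[0abtnvfre "/\\N_LP\t]|x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})`;
const INVALID_ESCAPE = String.raw`\\.`;

const VALID = "constant.character.escape";
const INVALID = "invalid.illegal.constant.character.escape";

interface EscapeHit {
  text: string;
  scope: string;
}

/** Classify every backslash sequence in the body of a double-quoted scalar. */
function classifyEscapes(body: string): EscapeHit[] {
  const re = new RegExp(`(${VALID_ESCAPE})|(${INVALID_ESCAPE})`, "g");
  const hits: EscapeHit[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(body)) !== null) {
    // Group 1 is set only when the valid alternative matched.
    hits.push({ text: m[0], scope: m[1] !== undefined ? VALID : INVALID });
  }
  return hits;
}
```

With these definitions, `commentStart("%YAML 1.1#...")` finds no comment because the `#` follows `1`, while `commentStart("%YAML 1.1 #...")` does. In `classifyEscapes`, an escaped backslash is consumed whole by the valid alternative, so a following `'` is not misread as the start of `\'`.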
…alars (#10)

The last two monogram#12 items, highlighter-only (parser unaffected: src-coverage-yaml alignment stays 100%; the other six grammars regenerate byte-identical).

#4: `%YAML 1.2 foo` (a malformed directive). A directive owns its whole line (§6.8), so a trailing param is illegal: YamlDirective's arity lookahead fails and the generic Directive excludes the `%YAML ` prefix, so neither token matches and `foo` falls through to the plain-scalar tokens (mis-scoped as a stray string.unquoted). A `%` can never begin a plain scalar (§7.3.3: `%` is a c-indicator), so a `%`-led line the clean directive tokens did not claim is always a malformed directive. A `#directive-malformed` fallback re-scopes the whole line as an invalid directive; the indicator is read from the directive tokens' leading literal (not hardcoded), and the fallback is ranked below the clean directives and above the plain scalars (scopeOrder 6.5).

#10: `abc: |5` (explicit indentation indicator). An explicit `|N` pins the content indent at parent+N, overriding the funky body's auto-detect (which floors at the FIRST content line, so a deeper first line releases a real body line at parent+N as a comment). TextMate cannot use a captured digit as a repeat count portably (RedCMD does, via Oniguruma `{\N}` backref-as-count + conditionals + subroutines, all rejected by Onigmo / GitHub-Linguist), so the portable spelling is a region per digit with a literal `{N}` count. Same structure as the auto-detect block scalars (forward-captured node indent + an inner introducer rule); the `while` bound becomes `\1 {N}` and the body is painted via `contentName` (the floor is known, so no auto-detect is needed). Emitted for digits 1–9 in both value position (covering nested and doc-root `--- |N`) and sequence position (`- a: |N`, whose floor adds the dash column via `\3`). Verified: empty-line survival, deeper key-shaped body lines stay opaque, `>N` and chomping in either order, all Onigmo-clean.

yaml-issue12-regressions now passes 10/10 with no `bug:` flags (all hard-gated). scope-gap-yaml monogramWrong 8 → 7 (99.66% > official 99.51%); agnostic 9/9; tm-diagnostics clean; tsc clean.

Refs #12
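The per-digit spelling for #10 can be sketched as a small generator. This is an illustration under stated assumptions, not the project's emitter: `explicitIndentRegions`, `bodyLineTester`, the region and scope names, and the simplified `begin` pattern are all hypothetical, and only value position is modelled (sequence position and the inner introducer rule are left out). Because a plain regex engine has no TextMate `while` rule, the helper substitutes the captured indent for `\1` by hand.

```typescript
interface BlockScalarRegion {
  name: string;
  begin: string; // group 1 captures the parent node indent
  while: string; // TextMate resolves \1 against the begin capture
  contentName: string;
}

/** One region per digit 1-9, each with a literal {N} repeat count. */
function explicitIndentRegions(): BlockScalarRegion[] {
  const regions: BlockScalarRegion[] = [];
  for (let n = 1; n <= 9; n++) {
    // '|' or '>', then the digit and an optional chomping sign in either order.
    const header = `[|>](?:${n}[+-]?|[+-]${n})`;
    regions.push({
      name: `block-scalar-indent-${n}`,
      begin: `^( *)\\S.*?: +${header} *$`,
      // A body line sits at parent indent + N spaces; blank lines also survive.
      while: `^(?:\\1 {${n}}|[ ]*$)`,
      contentName: "string.unquoted.block.yaml",
    });
  }
  return regions;
}

/**
 * Emulate the begin/while pairing for one header line: returns a predicate that
 * says whether a following line still belongs to the block scalar body.
 */
function bodyLineTester(
  region: BlockScalarRegion,
  headerLine: string,
): ((line: string) => boolean) | null {
  const m = new RegExp(region.begin).exec(headerLine);
  if (m === null) return null;
  const indent = m[1] ?? ""; // spaces only, so no regex escaping is needed
  const resolved = new RegExp(region.while.replace("\\1", indent));
  return (line: string) => resolved.test(line);
}

/** Pick the single region whose begin pattern accepts the header line. */
function regionFor(headerLine: string): BlockScalarRegion | undefined {
  return explicitIndentRegions().find((r) => new RegExp(r.begin).test(headerLine));
}
```

For `abc: |5` the resolved `while` pattern requires five leading spaces, so a first body line at seven spaces and a later line at five spaces both stay inside the scalar, which is the case the auto-detect floor got wrong. The cost of avoiding a backref-as-count is nine near-identical regions per position.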
The remaining four monogram#12 items, all highlighter-only: the `yaml.ts` parser is unaffected (`src-coverage-yaml` alignment stays 100%), and the other six grammars regenerate byte-identical.

#5: invalid escape in a double-quoted scalar

`"quoted \' scalar"`: `\'` is not a valid YAML escape and was left as plain string content. The derived quoted-string region now emits an `invalid.illegal.constant.character.escape` after the valid-escape pattern (valid escapes still win the leftmost tie); only for backslash-escape strings, not doubled-delimiter (`''`) ones.

#8: glued `#` after a directive

`%YAML 1.1#...`: the glued `#...` (no preceding space) was scoped as a comment. Per §6.6 a `#` is a comment only at line start or after whitespace, so the `Comment` token gains a `notPrecededBy(nonWhitespace)` guard (portable fixed-width `(?<!\S)`).

#4: malformed directive line

`%YAML 1.2 foo`: a directive owns its whole line (§6.8), so the trailing param is illegal and the parser rejects it. Neither directive token matches, and `foo` fell through to the plain-scalar tokens as a stray `string.unquoted`. A `%` can never begin a plain scalar (§7.3.3), so a `%`-led line the clean tokens left is always a malformed directive: a `#directive-malformed` fallback re-scopes the whole line as an invalid directive (the `%` indicator is read from the directive tokens, not hardcoded; ranked below the clean directives and above the plain scalars).

#10: explicit-indent block scalar

`abc: |5`: an explicit `|N` pins the content indent at parent+N, overriding the body's auto-detect (which floors at the first content line, so a deeper first line releases a real body line at parent+N as a comment). TextMate can't use a captured digit as a repeat count portably (RedCMD does, via Oniguruma `{\N}` backref-as-count + conditionals + subroutines, all rejected by Onigmo / GitHub-Linguist), so the portable spelling is one region per digit with a literal `{N}` count. Same structure as the auto-detect block scalars; the `while` bound becomes `\1 {N}` and the body is painted via `contentName`. Emitted for digits 1–9 in both value position (covering nested and doc-root `--- |N`) and sequence position (`- a: |N`, floor adds the dash column via `\3`). Verified: empty-line survival, deeper key-shaped body lines stay opaque, `>N` and chomping in either order.

Verification

`yaml-issue12-regressions`: 10 pass / 0 known-bug / 0 regression. Every case is hard-gated (no `bug:` flags remain), so any future regression fails the run. `src-coverage-yaml` parser alignment 100%; other six grammars byte-identical; `scope-gap-yaml` monogramWrong 8 → 7 (99.66% > official 99.51%); `agnostic` 9/9; `sanity` 15/15; RedCMD Onigmo diagnostics clean; `tsc` clean.

Closes the YAML half of #12.
Refs #12
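The ranking of the #4 malformed-directive fallback can be pictured with a toy line classifier. Everything here is an assumption made for illustration: the `Token` shape, the token names, the orders 6 and 7 around the stated 6.5, and the simplified patterns are hypothetical and do not reproduce the project's grammar.

```typescript
interface Token {
  name: string;
  order: number; // hypothetical scopeOrder: lower ranks are tried first
  lead: string;  // leading literal of the token ("" when it has none)
  body: string;  // regex source for the rest of the line
}

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Clean directive tokens: %YAML takes exactly one version param; the generic
// directive excludes the "%YAML " prefix.
const cleanDirectives: Token[] = [
  { name: "YamlDirective", order: 6, lead: "%", body: String.raw`YAML [0-9]+\.[0-9]+[ \t]*$` },
  { name: "Directive", order: 6, lead: "%", body: String.raw`(?!YAML )\S+(?:[ \t]+\S+)*[ \t]*$` },
];

// Toy plain scalar: '%' is an indicator, so it can never start one.
const plainScalar: Token = { name: "PlainScalar", order: 7, lead: "", body: String.raw`[A-Za-z0-9].*$` };

/** Build the fallback from the clean tokens' leading literal instead of hardcoding '%'. */
function malformedFallback(clean: Token[]): Token {
  const first = clean[0];
  if (first === undefined) throw new Error("need at least one clean directive token");
  return { name: "DirectiveMalformed", order: 6.5, lead: first.lead, body: ".*$" };
}

/** Return the name of the first token, by rank, that claims the whole line. */
function classify(line: string, tokens: Token[]): string | null {
  const ranked = [...tokens].sort((a, b) => a.order - b.order);
  for (const t of ranked) {
    if (new RegExp("^" + escapeRegExp(t.lead) + t.body).test(line)) return t.name;
  }
  return null;
}

const withoutFallback: Token[] = [...cleanDirectives, plainScalar];
const withFallback: Token[] = [...withoutFallback, malformedFallback(cleanDirectives)];
```

Here `classify("%YAML 1.2", withFallback)` is still claimed by the clean token because it ranks first, while `classify("%YAML 1.2 foo", withFallback)` lands on the fallback. In this toy model the unclaimed line simply gets no token when the fallback is absent; in the grammar described above it was `foo` that leaked to the plain-scalar tokens.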