Tya v0.35 Specification — Canonical Syntax, Step 3

Tya v0.35 is the third step in the multi-version landing of Canonical Syntax described in docs/CANONICAL_SYNTAX.md. It extends the v0.34 comment-capture infrastructure with per-statement comment attachment so the formatter (v0.36) can preserve and emit comments deterministically.

v0.35 is intentionally additive and non-breaking. The formatter itself remains the v0.2 conservative text pass; AST-driven unparse work lands in v0.36.

Goals (v0.35)

Non-Goals (v0.35)

Surface

// internal/ast
type Program struct {
    Stmts          []Stmt
    HeaderComments []string                   // v0.34
    Comments       map[Stmt]StmtComments      // v0.35
}

type StmtComments struct {
    Leading []string  // contiguous # lines immediately before stmt, same indent
    LineEnd string    // single # comment on the same source line as stmt's start
}

ParseWithComments populates Comments for each top-level statement that has leading comments, a line-end comment, or both, as defined in docs/CANONICAL_SYNTAX.md §3.1 and §3.2. Statements without comments do not appear in the map.
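As a rough illustration of how a consumer might read the map, here is a minimal sketch. It assumes that Stmt is an interface implemented by pointer types (so nodes are usable as map keys) and that comment strings keep their leading #. Neither is stated in this spec. Assign, Source, and render are hypothetical names used only for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Stmt stands in for the real ast.Stmt. Assumption: it is an interface
// implemented by pointer types, which makes values valid map keys.
type Stmt interface{ Source() string }

// Assign is a hypothetical statement node used only in this example.
type Assign struct{ Text string }

func (a *Assign) Source() string { return a.Text }

// StmtComments and Program mirror the declarations in this spec.
type StmtComments struct {
	Leading []string
	LineEnd string
}

type Program struct {
	Stmts          []Stmt
	HeaderComments []string
	Comments       map[Stmt]StmtComments
}

// render writes each statement preceded by its leading comments and
// followed by its line-end comment, if it has one.
func render(p *Program) string {
	var b strings.Builder
	for _, s := range p.Stmts {
		c := p.Comments[s] // zero value when s has no entry
		for _, line := range c.Leading {
			b.WriteString(line + "\n")
		}
		b.WriteString(s.Source())
		if c.LineEnd != "" {
			b.WriteString(" " + c.LineEnd)
		}
		b.WriteString("\n")
	}
	return b.String()
}

func main() {
	x := &Assign{"x = 1"}
	y := &Assign{"y = 2"}
	p := &Program{
		Stmts: []Stmt{x, y},
		Comments: map[Stmt]StmtComments{
			x: {Leading: []string{"# explains x"}, LineEnd: "# inline"},
		},
	}
	fmt.Print(render(p))
}
```

Only x has an entry in the map; y is rendered without comments because a lookup on a missing key yields the zero StmtComments.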

Attachment rules (v0.35 scope)

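Given a top-level statement at start line S and indent 0, Leading holds the contiguous # lines immediately before S at the same indent, and LineEnd holds the single # comment on line S, if any.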

The header-comment block (§3.3) is computed first, by v0.34’s logic. Its comments are excluded from the leading-comment pool.
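The attachment step can be sketched over raw source lines. This is not the parser's actual implementation: it follows only the field comments on StmtComments and the header exclusion described here. The helper attachTopLevel and its headerEnd parameter (the number of lines consumed by the header block) are hypothetical. The sketch also assumes that a blank line breaks contiguity and that # never appears inside a string literal.

```go
package main

import (
	"fmt"
	"strings"
)

// StmtComments mirrors the declaration in this spec.
type StmtComments struct {
	Leading []string
	LineEnd string
}

// attachTopLevel is a hypothetical helper. lines holds the source split
// on newlines (0-indexed), start is the statement's first line, and
// lines[0:headerEnd] belong to the header-comment block.
func attachTopLevel(lines []string, start, headerEnd int) StmtComments {
	var c StmtComments

	// Walk upward while the previous line is an indent-0 comment and
	// is not part of the header block.
	first := start
	for first-1 >= headerEnd && strings.HasPrefix(lines[first-1], "#") {
		first--
	}
	if first < start {
		c.Leading = append(c.Leading, lines[first:start]...)
	}

	// A # after the start of the statement line begins a line-end
	// comment. Assumption: no # inside string literals.
	if i := strings.Index(lines[start], "#"); i > 0 {
		c.LineEnd = strings.TrimSpace(lines[start][i:])
	}
	return c
}

func main() {
	lines := []string{
		"# file header",
		"",
		"# explains x",
		"# more",
		"x = 1 # inline",
		"y = 2",
	}
	c := attachTopLevel(lines, 4, 1)
	fmt.Printf("%q %q\n", c.Leading, c.LineEnd)
}
```

Passing headerEnd explicitly keeps the header computation separate, which matches the ordering stated above: the header block is settled first, and only the remaining comment lines are candidates for Leading.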

Acceptance Criteria

A v0.35 build is acceptable when:

  1. parser.ParseWithComments populates Program.Comments for every top-level statement with leading or line-end comments per the rules above.
  2. Inputs without comments leave Program.Comments as nil (or an empty map) and existing tests pass unchanged.
  3. The default parser.Parse continues to return programs with Comments == nil. CLI behavior is unchanged.
  4. go test ./... -count=1 passes, including the self-host invariant.
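Because criterion 2 allows either nil or an empty map, a test that checks it may prefer a length comparison over a nil comparison. A minimal sketch, using stand-in types and a hypothetical helper named hasNoComments:

```go
package main

import "fmt"

// Stand-ins for the ast types in this spec; Stmt is simplified to an
// empty interface for the purposes of this example.
type Stmt interface{}

type StmtComments struct {
	Leading []string
	LineEnd string
}

type Program struct {
	Stmts    []Stmt
	Comments map[Stmt]StmtComments
}

// hasNoComments is a hypothetical helper: len of a nil map is 0, so
// nil and empty maps are treated the same.
func hasNoComments(p *Program) bool {
	return len(p.Comments) == 0
}

func main() {
	fmt.Println(hasNoComments(&Program{}))
	fmt.Println(hasNoComments(&Program{Comments: map[Stmt]StmtComments{}}))
}
```

Criterion 3 is stricter: for parser.Parse the spec requires Comments == nil, so a direct nil comparison is the appropriate check there.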

Multi-Version Plan (updated)

Version  Step
v0.33    Parenthesized multi-parameter lambda. (Done.)
v0.34    Lexer comment capture; Program.HeaderComments. (Done.)
v0.35    Per-stmt Comments map at top level. (This release.)
v0.36    Comment attachment for nested stmts; AST-driven formatter v1 (operator spacing, blank-line rules, single-line forms, empty else removal, import sort/grouping, empty-collection normalization, comment emission).
v0.37    Formatter v2 (per-construct multi-line wrap, 80-column limit, atomic-token exception, """...""" rewrite, idempotency).
v0.38    Normalize the entire codebase with the formatter; reject non-canonical forms at parse time.

(Originally, v0.36 was planned to ship both formatter v1 and v2. The schedule now shifts each subsequent step one release later to keep individual releases reviewable.)

docs/CANONICAL_SYNTAX.md remains the single source of truth.

Self-Host Invariant

parser.Parse is unchanged, and the self-host pipeline does not call ParseWithComments, so TestSelfhostV01Scripts continues to pass.