fix: better align parsing with elixir's matched/unmatched/no-parens #93

doorgan · 2026-02-05T20:32:43Z

Supercedes #87

I had originally intended to break down this into multiple PRs like #87, but changing one part of the parser often broke another or didn't fully fix the issues, so I apologize for the large diff. Tackling the root cause ended up being a larger change but it fixed a whole family of parser issues that would have taken 100 PRs applying bandaids. This is also the reason this PR includes some tests from that PR.

This PR brings the with_context helper from that PR to help scope the parser context to specific parsing contexts.

The main issue this PR tackles is how matched, unmatched and no-parens expressions are handled.
The elixir grammar distinguishes three kinds of expressions:

"matched": expressions with clear boundaries, like literals, parenthesized calls, tuples, lists, maps, identifiers, modules.
The key property of these is that they can safely appear as an operand without ambiguity.
"unmatched": expressions with do-end blocks. These are "unmatched" because their boundary is the end keyword, not a closing delimiter.
You cannot nest an unmatched expression inside another unmatched expression without parentheses:
if if true do true else false end do - invalid
if(if true do true else false end) do - valid

Operators can take unmatched expressions as their RHS (e.g., x = if true do 1 end), which is why unmatched_op_expr exists separately from matched_op_expr in elixir_parser.yrl

"no_parens": multi-argument calls without parentheses like foo a, b, c. These cannot be nested.
No-parens expressions are further subdivided:
- no_parens_one: single unambiguous arg (f a, f key: val)
- no_parens_many: multiple args (f a, b, c)
- no_parens_one_ambig: single arg that is itself a no_parens_many (f g a, b -> f(g(a, b)))

The grammar enforces that operators with a no_parens RHS can only combine with matched/unmatched LHS — not another no_parens expression.

In Spitfire we don't work with grammar rules but we rather consume tokens as we go, so to achieve the same we have to resort to flags like nesting, in_map, stab_state etc on top of precedence checks). This PR expands that to mimic the grammar rules behavior in elixir. I clarify this because to explain the parser behavior I need to refer to the elixir grammar, but when you look at the diff in this PR it's mostly flags and extra checks, not explicit grammar rules.

Some examples of the issues this fixes:

No-parens calls tended to consume too many tokens
Given %{foo a, b => c}, Elixir parses this as %{foo(a, b) => c} because foo a, b is a no_parend_many and => is an operator that expects matched_expr on both sides.
Spitfire would parse this as {foo(a, b =>c)} because it parsed no-parens calls at the @lowest precedence regardless of context.
The fix uses assoc_op precedence when is_map == true to force foo(a, b) to complete before continuing parsing the rhs of =>
-> would be parsed as infix in the wrong contexts
-> should only be allowed when wrapped around parens, do-end or fn-end, but never as a regular infix operator.
Spitfire would treat -> as an infix operator controlled by is_stab but this flag was not scoped.
In code like fn a, b -> (c -> d) end, is_stab from the fn body leaked into the grouped (c -> d) or viceversa.
Adding stop_before_stab_op? and scoping it with with_context each boundary is managed independently
keyword lists not stopping at stab boundary
There is a grammar rule call_args_no_parens_kw_expr -> kw_eol matched_expr, it basically says the value of a keyword pair in no-parens must be a matched expression, so in this code:
fn key: expr -> body end
expr stops at -> because -> can't appear in a matched expression

Spitfire would continue consuming tokens after -> because it wasn't aware of stabs in that context. This PR adds stab-aware functions to deal with this scenario. Also, differently to #87, this does so more generally so this takes care of the %{a do :ok end | b c, d => e, f => g} scenario that is broken in that PR.

Some other issues that aren't strictly related to matched/unmatched/no-parens but were still necessary:

. no-parens calls didn't parse args
foo.bar 1, 2 is a valid no-parens call, but Spitfire parsed it as foo.bar() followed by dangling 1, 2. Checking no_parens metadata like elixir does in build_identifier) fixes that
block_identifier not treated as terminal
In the Elixir grammar, block_identifier (else, rescue, catch, after) only appears in rules within do-blocks. In Spitfire, block_identifier wasn't in @terminals or @peeks so the parser could consume else/rescue as expression tokens instead of stopping at block boundaries. For example, a stab body inside a do-block could run past else.

A lot of those changes don't have corresponding tests in the PR specifically. I was working on fixing the issues in property tests added in #78, #80 and #81. The way I tested all this was to copy the new tests from those files into my branch and running them locally.

After the changes here:

Only 6 failures in tests from fix: operator precedence #78
All tests from test: grammar rule conformance #80 pass, so that PR could be merged after this
Only 3 failures in tests from test: simple operator test suite #81 (which I believe are unrelated to this PR)

doorgan · 2026-02-05T20:35:51Z

test/spitfire_test.exs

               {:error,
                [
-                  {:->, [line: 1, column: 3], [[{:a, [line: 1, column: 1], nil}], {:__block__, [], []}]},
+                  {:->, [line: 1, column: 3], [[{:a, [line: 1, column: 1], nil}], nil]},


The elixir parser allows nil here, this PR changes that to align with the elixir parser so this recovery scenario test needed to be updated

mhanberg · 2026-02-05T23:45:27Z

lib/spitfire.ex

          {ast, parser}
        end

-      rhs = build_block_nr(exprs)


Tbh, I have completely forgotten what _nr was supposed to mean 😭

mhanberg · 2026-02-07T18:34:19Z

lib/spitfire.ex

    end
  end

+  # Dot expression for struct types - never parses call arguments.


is this like %foo.bar{} ?

doorgan mentioned this pull request Feb 5, 2026

fix: ambiguous map update syntax #87

Closed

fix: better align parsing with elixir's matched/unmatched/no-parens

def1e3d

doorgan force-pushed the doorgan/matched-unmatched-no-parens branch from 31f6b5a to def1e3d Compare February 5, 2026 20:34

doorgan commented Feb 5, 2026

View reviewed changes

doorgan marked this pull request as draft February 5, 2026 20:37

This comment was marked as resolved.

Sign in to view

doorgan force-pushed the doorgan/matched-unmatched-no-parens branch from 961c494 to feefaa5 Compare February 5, 2026 22:01

fix: regressions on repo tests

7f8b273

doorgan force-pushed the doorgan/matched-unmatched-no-parens branch from feefaa5 to 7f8b273 Compare February 5, 2026 22:08

doorgan marked this pull request as ready for review February 5, 2026 22:11

mhanberg approved these changes Feb 7, 2026

View reviewed changes

mhanberg merged commit 808783e into main Feb 7, 2026
37 checks passed

mhanberg deleted the doorgan/matched-unmatched-no-parens branch February 7, 2026 18:37

github-actions bot mentioned this pull request Feb 7, 2026

chore(main): release 0.3.3 #94

Merged

mhanberg mentioned this pull request Feb 7, 2026

fix: add :eof to the terminator list in parse_stab_expression/1 #92

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: better align parsing with elixir's matched/unmatched/no-parens #93

fix: better align parsing with elixir's matched/unmatched/no-parens #93

Uh oh!

doorgan commented Feb 5, 2026

Uh oh!

doorgan Feb 5, 2026

Uh oh!

This comment was marked as resolved.

mhanberg Feb 5, 2026

Uh oh!

mhanberg Feb 7, 2026

Uh oh!

doorgan Feb 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

fix: better align parsing with elixir's matched/unmatched/no-parens #93

fix: better align parsing with elixir's matched/unmatched/no-parens #93

Uh oh!

Conversation

doorgan commented Feb 5, 2026

Uh oh!

doorgan Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

mhanberg Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

mhanberg Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

doorgan Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants