Commit 68c1a3d

Python: Fix syntax error when = is used as a format fill character
An example (provided by @redsun82) is the string `f"{x:=^20}"`. Parsing this (with unnamed nodes shown) illustrates the problem:

```
module [0, 0] - [2, 0]
  expression_statement [0, 0] - [0, 11]
    string [0, 0] - [0, 11]
      string_start [0, 0] - [0, 2]
      interpolation [0, 2] - [0, 10]
        "{" [0, 2] - [0, 3]
        expression: named_expression [0, 3] - [0, 9]
          name: identifier [0, 3] - [0, 4]
          ":=" [0, 4] - [0, 6]
          ERROR [0, 6] - [0, 7]
            "^" [0, 6] - [0, 7]
          value: integer [0, 7] - [0, 9]
        "}" [0, 9] - [0, 10]
      string_end [0, 10] - [0, 11]
```

Observe that we've managed to combine the format specifier token `:` and the fill character `=` in a single token (which doesn't match the `:` we expect in the grammar rule), and hence we get a syntax error.

If we change the `=` to some other character (e.g. a `-`), we instead get

```
module [0, 0] - [2, 0]
  expression_statement [0, 0] - [0, 11]
    string [0, 0] - [0, 11]
      string_start [0, 0] - [0, 2]
      interpolation [0, 2] - [0, 10]
        "{" [0, 2] - [0, 3]
        expression: identifier [0, 3] - [0, 4]
        format_specifier: format_specifier [0, 4] - [0, 9]
          ":" [0, 4] - [0, 5]
        "}" [0, 9] - [0, 10]
      string_end [0, 10] - [0, 11]
```

and in particular no syntax error.

To fix this, we want to ensure that the `:` is lexed on its own, and the `token(prec(1, ...))` construction can be used to do exactly this.

Finally, you may wonder why `=` is special here. I think what's going on is that the lexer knows that `:=` is a token on its own (because it's used in the walrus operator), and so it greedily consumes the following `=` with this in mind.
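To make the guess about `:=` concrete, here is a minimal toy model of how two overlapping tokens might be resolved. It is only a sketch, not tree-sitter's actual lexer: the `pickToken` helper and the token tables are hypothetical, and the rule it encodes (higher lexical precedence wins, ties go to the longer match) is a simplification of the conflict-resolution order described in the tree-sitter documentation.

```javascript
// Toy model of token conflict resolution: the highest precedence wins,
// and ties are broken by the longest match.
// Hypothetical helper for illustration; not tree-sitter's real implementation.
function pickToken(input, pos, tokens) {
  let best = null;
  for (const tok of tokens) {
    // Skip tokens that do not match at this position.
    if (!input.startsWith(tok.text, pos)) continue;
    const better =
      best === null ||
      tok.prec > best.prec ||
      (tok.prec === best.prec && tok.text.length > best.text.length);
    if (better) best = tok;
  }
  return best ? best.text : null;
}

// The interpolation from the example string f"{x:=^20}".
const input = '{x:=^20}';
const colonPos = input.indexOf(':');

// Before the fix: ':' and ':=' share the default precedence,
// so the longer ':=' wins and the fill character is swallowed.
const before = [
  { text: ':', prec: 0 },
  { text: ':=', prec: 0 },
];

// After the fix: ':' is raised to precedence 1, mirroring token(prec(1, ':')).
const after = [
  { text: ':', prec: 1 },
  { text: ':=', prec: 0 },
];

console.log(pickToken(input, colonPos, before)); // prints :=
console.log(pickToken(input, colonPos, after));  // prints :
```

With equal precedence the model picks `:=`, which matches the `named_expression` plus `ERROR` parse above. Once `:` outranks `:=`, the `=` is left over to be consumed as part of the format specifier text.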
1 parent 05bef12 commit 68c1a3d

1 file changed: +1 -1 lines changed

python/extractor/tsg-python/tsp/grammar.js

Lines changed: 1 addition & 1 deletion
@@ -1168,7 +1168,7 @@ module.exports = grammar({
     _not_escape_sequence: $ => token.immediate('\\'),

     format_specifier: $ => seq(
-      ':',
+      token(prec(1,':')),
       repeat(choice(
         token(prec(1, /[^{}\n]+/)),
         alias($.interpolation, $.format_expression)
