-
Notifications
You must be signed in to change notification settings - Fork 12
Use module-lexer to parse comments #750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Use syntax-color/module-lexer instead of br-parser-tools/lex - Support expression comments (#;) - Add tests for expression comments - All existing tests pass Co-authored-by: jackfirth <8175575+jackfirth@users.noreply.github.com>
- Track lexer mode state through iterations instead of rewinding port - Update skip-one-sexp to return final mode - Handle reencoded ports that don't support file-position - All 1037 tests pass Co-authored-by: jackfirth <8175575+jackfirth@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resyntax analyzed 1 file in this pull request and has added suggestions.
private/comment-reader.rkt
Outdated
| (if is-line-comment? | ||
| ;; Peek at the next token to see if it's a newline | ||
| (let () | ||
| (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode) | ||
| (module-lexer in 0 mode-out)) | ||
| (cond | ||
| [(and (equal? next-type 'white-space) (equal? next-lexeme "\n")) | ||
| ;; Include the newline in the comment range and continue with the mode after the newline | ||
| (loop (cons (closed-open-range comment-start (sub1 next-end) #:comparator natural<=>) ranges) | ||
| next-mode)] | ||
| [(eof-object? next-lexeme) | ||
| ;; EOF after comment | ||
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | ||
| next-mode)] | ||
| [else | ||
| ;; Non-newline token after comment; we need to "un-consume" it | ||
| ;; by processing it in the next iteration. But we can't easily do that | ||
| ;; with module-lexer. Let's use a different approach. | ||
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | ||
| next-mode)])) | ||
| ;; Block comment - don't peek ahead | ||
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | ||
| mode-out))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if-let-to-cond: cond with internal definitions is preferred over if with let, to reduce nesting
| (if is-line-comment? | |
| ;; Peek at the next token to see if it's a newline | |
| (let () | |
| (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode) | |
| (module-lexer in 0 mode-out)) | |
| (cond | |
| [(and (equal? next-type 'white-space) (equal? next-lexeme "\n")) | |
| ;; Include the newline in the comment range and continue with the mode after the newline | |
| (loop (cons (closed-open-range comment-start (sub1 next-end) #:comparator natural<=>) ranges) | |
| next-mode)] | |
| [(eof-object? next-lexeme) | |
| ;; EOF after comment | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | |
| next-mode)] | |
| [else | |
| ;; Non-newline token after comment; we need to "un-consume" it | |
| ;; by processing it in the next iteration. But we can't easily do that | |
| ;; with module-lexer. Let's use a different approach. | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | |
| next-mode)])) | |
| ;; Block comment - don't peek ahead | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges) | |
| mode-out))] | |
| (cond | |
| [is-line-comment? | |
| ;; Peek at the next token to see if it's a newline | |
| (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode) | |
| (module-lexer in 0 mode-out)) | |
| (cond | |
| [(and (equal? next-type 'white-space) (equal? next-lexeme "\n")) | |
| ;; Include the newline in the comment range and continue with the mode after the newline | |
| (loop (cons (closed-open-range comment-start (sub1 next-end) #:comparator natural<=>) | |
| ranges) | |
| next-mode)] | |
| [(eof-object? next-lexeme) | |
| ;; EOF after comment | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) | |
| ranges) | |
| next-mode)] | |
| [else | |
| ;; Non-newline token after comment; we need to "un-consume" it | |
| ;; by processing it in the next iteration. But we can't easily do that | |
| ;; with module-lexer. Let's use a different approach. | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) | |
| ranges) | |
| next-mode)])] | |
| ;; Block comment - don't peek ahead | |
| [else | |
| (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) | |
| ranges) | |
| mode-out)])] |
Debugging details
Textual replacement
(line-replacement
#:new-lines
'#(" (cond"
" [is-line-comment?"
" ;; Peek at the next token to see if it's a newline"
" (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode)"
" (module-lexer in 0 mode-out))"
" (cond"
" [(and (equal? next-type 'white-space) (equal? next-lexeme \"\\n\"))"
" ;; Include the newline in the comment range and continue with the mode after the newline"
" (loop (cons (closed-open-range comment-start (sub1 next-end) #:comparator natural<=>)"
" ranges)"
" next-mode)]"
" [(eof-object? next-lexeme)"
" ;; EOF after comment"
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>)"
" ranges)"
" next-mode)]"
" [else"
" ;; Non-newline token after comment; we need to \"un-consume\" it"
" ;; by processing it in the next iteration. But we can't easily do that"
" ;; with module-lexer. Let's use a different approach."
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>)"
" ranges)"
" next-mode)])]"
" ;; Block comment - don't peek ahead"
" [else"
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>)"
" ranges)"
" mode-out)])]")
#:original-lines
'#(" (if is-line-comment?"
" ;; Peek at the next token to see if it's a newline"
" (let ()"
" (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode)"
" (module-lexer in 0 mode-out))"
" (cond"
" [(and (equal? next-type 'white-space) (equal? next-lexeme \"\\n\"))"
" ;; Include the newline in the comment range and continue with the mode after the newline"
" (loop (cons (closed-open-range comment-start (sub1 next-end) #:comparator natural<=>) ranges)"
" next-mode)]"
" [(eof-object? next-lexeme)"
" ;; EOF after comment"
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges)"
" next-mode)]"
" [else"
" ;; Non-newline token after comment; we need to \"un-consume\" it"
" ;; by processing it in the next iteration. But we can't easily do that"
" ;; with module-lexer. Let's use a different approach."
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges)"
" next-mode)]))"
" ;; Block comment - don't peek ahead"
" (loop (cons (closed-open-range comment-start comment-end-base #:comparator natural<=>) ranges)"
" mode-out))]")
#:start-line 44)Syntactic replacement
(syntax-replacement
#:introduction-scope #<procedure:...and/syntax-local.rkt:148:2>
#:new-syntax
#<syntax:default-recommendations/let-replacement/cond-let-replacement.rkt:47:2 (cond (is-line-comment? (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode) (module-lexer in 0 mode-out)) (cond ((and (equal? next-type (quote white-space)) (equal? next-lexeme "\n")) (loop (cons (closed-open-rang...>
#:original-syntax
#<syntax:private/comment-reader.rkt:44:7 (if is-line-comment? (let () (define-values (next-lexeme next-type next-paren next-start next-end next-backup next-mode) (module-lexer in 0 mode-out)) (cond ((and (equal? next-type (quote white-space)) (equal? next-lexeme "\n")) (loop (cons (closed-open...>
#:source
(file-source
#<path:/home/runner/work/resyntax/resyntax/private/comment-reader.rkt>)
#:uses-universal-tagged-syntax? #f)|
@copilot Add some |
- Added comment-preservation-test.rkt with 6 test cases - Tests show expression comments (#;) are preserved - Tests show block comments (#|) are preserved - All 1043 tests pass Co-authored-by: jackfirth <8175575+jackfirth@users.noreply.github.com>
Added in 5d4ee83. Created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resyntax analyzed 3 files in this pull request and found no issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resyntax analyzed 4 files in this pull request and found no issues.
The custom comment parser was incomplete (missing expression comments like
#;(foo)), language-specific to#lang racket, and difficult to maintain.Changes
syntax-color/module-lexer: Removedbr-parser-tools/leximplementation in favor of Racket's standard lexer#;comments viaskip-one-sexphelper that tracks parenthesis depthfile-position, which fails on reencoded portscomment-preservation-test.rktwith 6#lang resyntax/testtest cases demonstrating expression (#;) and block (#|) comment preservation during refactoringsExample
Impact
Fixes Comment parser failures #149 (expression comments not preserved)
Enables automatic support for comments in other
#langvariants (Refactoring custom languages #270)Added 5 new test cases for comment parsing in
comment-reader.rkt, all 1043 tests passAdded 6 new test cases demonstrating comment preservation in refactorings
Fixes Use
module-lexerto parse comments #299Original prompt
module-lexerto parse comments #299✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.