Skip to content

Comments

feat(snowflake)!: Refactor RPAD/LPAD #6869

Merged
georgesittas merged 15 commits intomainfrom
RD-1147745-transpile_RPAD
Feb 3, 2026
Merged

feat(snowflake)!: Refactor RPAD/LPAD #6869
georgesittas merged 15 commits intomainfrom
RD-1147745-transpile_RPAD

Conversation

@fivetran-ashashankar
Copy link
Collaborator

@fivetran-ashashankar fivetran-ashashankar commented Jan 27, 2026

Refactor RPAD/LPAD transpilation for binary and string types.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 27, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:RD-1147745-transpile_RPAD, sqlglot version: RD-1147745-transpile_RPAD)
  • baseline (main, sqlglot version: 28.9.1.dev5)

⚠️ Limited to dialects: snowflake, duckdb, bigquery

By Dialect

dialect main sqlglot:RD-1147745-transpile_RPAD difference links
bigquery -> bigquery 2592/2624 passed (98.8%) 2592/2624 passed (98.8%) No change full result / delta
bigquery -> duckdb 1847/2623 passed (70.4%) 1848/2623 passed (70.5%) No change full result / delta
duckdb -> duckdb 4003/4003 passed (100.0%) 4003/4003 passed (100.0%) No change full result / delta
snowflake -> duckdb 1315/2623 passed (50.1%) 1325/2623 passed (50.5%) ⬆ improved by 0.4% full result / delta
snowflake -> snowflake 2606/2623 passed (99.4%) 2606/2623 passed (99.4%) No change full result / delta

Overall

main: 14496 total, 12363 passed (pass rate: 85.3%), sqlglot version: 28.9.1.dev5

sqlglot:RD-1147745-transpile_RPAD: 14496 total, 12374 passed (pass rate: 85.4%), sqlglot version: RD-1147745-transpile_RPAD

Difference: ⬆ improved by 0.1%

Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @fivetran-ashashankar, nice work! Leaving a few comments:

fivetran-ashashankar and others added 3 commits January 28, 2026 10:38
Co-authored-by: Vaggelis Danias <daniasevangelos@gmail.com>
…r binary and string types. Code review comments implemented
…r binary and string types. super.pad_sql for varchar case
or _is_binary(fill_arg)
or isinstance(string_arg, (exp.ToBinary, exp.Encode))
or isinstance(fill_arg, (exp.ToBinary, exp.Encode))
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VaggelisD - the below check is not covered in _is_binary

or isinstance(string_arg, (exp.ToBinary, exp.Encode))
                    or isinstance(fill_arg, (exp.ToBinary, exp.Encode))

I did verify the ASTs returned for SELECT CAST('Hi' AS BINARY) - returns BINARY
and SELECT TO_BINARY('Hi', 'UTF8') returns nothing .

hence the lines 2894 and 2895 the instance checks are important.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Asha, the suggestion is to leave type inference up to annotate_types, the generator should only do is_type(...) calls to choose a path. Although that was a common pattern in the generator before, lately we are straying away from it and instead our message is to annotate the AST before generating it to get the most accurate transpilation we can offer.

The reason for that is that there could be many different expressions & functions that result in a BINARY type and instance-checking each one of them is not a good pattern; It's error prone, violates DRY and mixes concerns between the generator & type annotator.

As for the tests cases, lately we are annotating the ASTs ourselves to validate the different transpilation paths, e.g:

# test_snowflake.py
lpad_binary = self.validate_identity("LPAD(TO_BINARY('foo'))")
lpad_binary = annotate_types(lpad_binary, dialect="snowflake")
self.assertEqual(lpad_binary.sql("duckdb"), "'foo' || REPEAT(...)")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. Removed the isInstance checks

Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a few comments, the PR is much cleaner, thanks for the quick iterations!

Copy link
Collaborator

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take this to the finish line

@georgesittas georgesittas merged commit 2e2ff03 into main Feb 3, 2026
8 checks passed
@georgesittas georgesittas deleted the RD-1147745-transpile_RPAD branch February 3, 2026 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants