feat(snowflake)!: Refactor RPAD/LPAD #6869
Conversation
…r binary and string types.
SQLGlot Integration Test ResultsComparing:
By Dialect
Overallmain: 14496 total, 12363 passed (pass rate: 85.3%), sqlglot version: sqlglot:RD-1147745-transpile_RPAD: 14496 total, 12374 passed (pass rate: 85.4%), sqlglot version: Difference: ⬆ improved by |
VaggelisD
left a comment
There was a problem hiding this comment.
Hey @fivetran-ashashankar, nice work! Leaving a few comments:
Co-authored-by: Vaggelis Danias <daniasevangelos@gmail.com>
…r binary and string types. Code review comments implemented
…r binary and string types. super.pad_sql for varchar case
Co-authored-by: Jo <46752250+georgesittas@users.noreply.github.com>
…RD-1147745-transpile_RPAD
| or _is_binary(fill_arg) | ||
| or isinstance(string_arg, (exp.ToBinary, exp.Encode)) | ||
| or isinstance(fill_arg, (exp.ToBinary, exp.Encode)) | ||
| ) |
There was a problem hiding this comment.
@VaggelisD - the below check is not covered in _is_binary
or isinstance(string_arg, (exp.ToBinary, exp.Encode))
or isinstance(fill_arg, (exp.ToBinary, exp.Encode))
I did verify the ASTs returned for SELECT CAST('Hi' AS BINARY) - returns BINARY
and SELECT TO_BINARY('Hi', 'UTF8') returns nothing .
hence the lines 2894 and 2895 the instance checks are important.
There was a problem hiding this comment.
Hey Asha, the suggestion is to leave type inference up to annotate_types, the generator should only do is_type(...) calls to choose a path. Although that was a common pattern in the generator before, lately we are straying away from it and instead our message is to annotate the AST before generating it to get the most accurate transpilation we can offer.
The reason for that is that there could be many different expressions & functions that result in a BINARY type and instance-checking each one of them is not a good pattern; It's error prone, violates DRY and mixes concerns between the generator & type annotator.
As for the tests cases, lately we are annotating the ASTs ourselves to validate the different transpilation paths, e.g:
# test_snowflake.py
lpad_binary = self.validate_identity("LPAD(TO_BINARY('foo'))")
lpad_binary = annotate_types(lpad_binary, dialect="snowflake")
self.assertEqual(lpad_binary.sql("duckdb"), "'foo' || REPEAT(...)")There was a problem hiding this comment.
okay. Removed the isInstance checks
VaggelisD
left a comment
There was a problem hiding this comment.
Leaving a few comments, the PR is much cleaner, thanks for the quick iterations!
georgesittas
left a comment
There was a problem hiding this comment.
Will take this to the finish line
Refactor RPAD/LPAD transpilation for binary and string types.