Skip to content

Conversation

@smaheshwar-pltr
Copy link
Contributor

// Do not merge.

For fun, writing out the call stack to avoid the recursion that caused #1759.

@smaheshwar-pltr smaheshwar-pltr changed the title Iterative bind with a stack instead of recursion [Do not merge] Iterative bind with a stack instead of recursion Mar 10, 2025
assert upd.rows_inserted == 4


def test_large_upsert_into_empty_table(catalog: Catalog) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now-passing test fails on main with

    def __getitem__(self, key):
>       return self.data[ref(key)]
E       RecursionError: maximum recursion depth exceeded in comparison

/<PATH>/weakref.py:416: RecursionError
!!! Recursion error detected, but an error occurred locating the origin of recursion.
  The following exception happened when comparing locals in the stack frame:
    RecursionError: maximum recursion depth exceeded
  Displaying first and last 10 stack frames out of 962.

Comment on lines +415 to +416
num_columns = 50
num_rows = 10000
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, 20 and 1000 is enough to make main fail this test with (default) recursion depth exceeded

@kevinjqliu
Copy link
Contributor

changing the visitor to an iterative approach seems like a sound solution. are there any reasons we dont want to do this?

@Fokko
Copy link
Contributor

Fokko commented Mar 11, 2025

I like the solution!

changing the visitor to an iterative approach seems like a sound solution. are there any reasons we dont want to do this?

My only concern is performance. We probably want to check what the impact is, since we bind to schemas all over the codebase. It would be good to see if we can get some numbers on the impact on the performance.

@smaheshwar-pltr
Copy link
Contributor Author

smaheshwar-pltr commented Mar 23, 2025

Thanks for taking a look folks, and apologies for the delayed response.

are there any reasons we don't want to do this?

This approach serves only to circumvent recursion. When it's said that "iteration is faster than recursion", I don't think it refers to just concretising the frames / call-stack in memory - I believe this aligns with @Fokko mentioning that performance might be worsened, which I agree with. If we do want to go with this PR's approach, then I think we should consider it for the other visitors in visitors.py (after benchmarking performance). I also wonder if (a) some decorator magic is possible for the conversion, because IMO these simple recursive visitors read nicer and feel less error-prone (we can rewrite to be tail-recursive and use tco but I think the rewrite introduces complexity similar to this PR), or if (b) there's some way to un-intrusively keep the current recursive approach.

I see some recent activity / PRs on the original issue so happy to wait for that discussion to conclude?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants