Skip to content

[FLINK-39015][table] Fix key extractor for multi join. Change GenericRowData to BinaryRowData#27508

Open
ldadima wants to merge 1 commit intoapache:masterfrom
ldadima:FLINK-39015
Open

[FLINK-39015][table] Fix key extractor for multi join. Change GenericRowData to BinaryRowData#27508
ldadima wants to merge 1 commit intoapache:masterfrom
ldadima:FLINK-39015

Conversation

@ldadima
Copy link
Contributor

@ldadima ldadima commented Feb 3, 2026

What is the purpose of the change

To fix FLINK-39015

Verifying this change

Run JoinITCase#testInnerMultiJoinWithEqualPk

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): ( no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? ( no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Copy link
Collaborator

flinkbot commented Feb 3, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@ldadima
Copy link
Contributor Author

ldadima commented Feb 4, 2026

Hey @snuyanzin Could you review, please?

Copy link
Contributor

@gustavodemorais gustavodemorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for picking this up @ldadima.

Could you just move the test as I mentioned in the comment? Thanks

}

@TestTemplate
def testInnerMultiJoinWithEqualPk(): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it either makes sense to

  1. Have the whole suite running with this the MULTI_JOIN flag as a param
  2. Add this to MultiJoinSemanticTests

Since 1. is high LOE, for this PR, I'd just go with 2. for now.

Copy link
Contributor Author

@ldadima ldadima Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review @gustavodemorais

1 Checked. All passed
image
2 I don't understand why we need to add SemanticTest. The problem only occurs after restoring from a checkpoint and only for HeapStateBackend. Also in SemanticTest there is MULTI_JOIN_TWO_WAY_INNER_JOIN_WITH_WHERE_IN, which checks similar sql (uses the same MultiJoinStateViews.JoinKeyContainsUniqueKey). I suggest to modify MultiJoinITCase (here) for failingDataSource and add this case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ldadima, if it only happens after restoring from a checkpoint, then it'd make sense to move it to `MultiJoinRestoreTest?

@github-actions github-actions bot added the community-reviewed PR has been reviewed by the community. label Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-reviewed PR has been reviewed by the community.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants