Skip to content

Conversation

@axreldable
Copy link
Contributor

@axreldable axreldable commented Oct 12, 2025

What's Changed

UnionMapWriter will null out the entire map struct entry instead of setting the value to null in:

childWriter.value().fixedSizeBinary().writeNull();

This PR overrides the fixedSizeBinary() method for the UnionMapWriter, resolving it.
In addition, it introduces the fixedSizeBinary(int byteWidth) which needs to be used to initialize key and value writes in UnionMapWriter if they have not been initialized.

Closes #586.

@github-actions

This comment has been minimized.

@axreldable axreldable marked this pull request as ready for review October 12, 2025 14:48
@axreldable
Copy link
Contributor Author

axreldable commented Oct 12, 2025

Thank you for opening a pull request!

Please label the PR with one or more of:

  • bug-fix
  • chore
  • dependencies
  • documentation
  • enhancement

Also, add the 'breaking-change' label if appropriate.

See CONTRIBUTING.md for details.

Hm, looks like I don't have enough permissions to add labels.

Could the maintainers add the bug-fix label, please?

@axreldable
Copy link
Contributor Author

Hi @nbauernfeind! It's my attempt to address the issue you opened. Could you please take a look? Is it what you had in mind for it?

@axreldable
Copy link
Contributor Author

Hi @lidavidm , @jbonofre! Could you please check this? Or at least add the bug-fix label to ensure PR format to release CI build. Thank you!

@jbonofre
Copy link
Member

@axreldable sure. I will. FYI we are working on fixing the CI right now. Your PR will probably need a rebase as soon as CI is green again. I will keep you posted.

@jbonofre jbonofre added the bug-fix PRs that fix a big. label Oct 14, 2025
@github-actions github-actions bot added this to the 18.4.0 milestone Oct 14, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it technically works but I find it very confusing to recycle the test, especially since the map value was always BigInt before. I'd rather see a new test that exercises both the key/value paths explicitly and is explicitly marked as a regression test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm , thank you for looking into it! Sorry for the confusion. Let me introduce a separate unit test.

--

Could you please clarify what do you mean by

explicitly marked as a regression test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** Regression test for https://github.com/apache/arrow-java/issues/586 */
@Test
void mapFixedSizeBinary() {
  // ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm , I've implemented a separate unit test for fixedSizeBinary.

However, I came across an issue. When I call writer.key().fixedSizeBinary() for the first time, it fails with NPE, as it falls back to NullableStructWriter which requires writer to be not-null (checkout the testFixedSizeBinaryWriterInitializationError test).
It work when I first initialize writers for both key and value fields with other writes, e.g. bigInt (checkout testFixedSizeBinaryWriter test).

Could you help me here? What would be the right way to initialize writes for key and value fields for the UnionMapWriter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm , I introduced public FixedSizeBinaryWriter fixedSizeBinary(int byteWidth) in the UnionMapWriter which supposed to be used for initialization of the key and value writers. I’d really appreciate your feedback if there’s a better approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm rather behind - I will try to review this soon but I can't make any promises right now

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @jbonofre?

@axreldable
Copy link
Contributor Author

axreldable commented Nov 10, 2025

@lidavidm , thank you for the approval! I added testFixedSizeBinaryFirstInitialization unit test to show the initialization of key or value writer failure when initialized with fixedSizeBinary().

      // require byteWidth parameter for first-time initialization of `key` or `value` writers
      assertThrows(NullPointerException.class, () -> writer.key().fixedSizeBinary().write(holder1));
      assertThrows(
          NullPointerException.class, () -> writer.value().fixedSizeBinary().write(holder2));
      writer.key().fixedSizeBinary(holder1.byteWidth).write(holder1);
      writer.value().fixedSizeBinary(holder2.byteWidth).write(holder2);

@axreldable
Copy link
Contributor Author

I also noticed one more thing here.

      // {null -> [32, 21]} - wrong "for a given entry, the "key" is non-nullable" - todo: it shouldn't work. Should it?

It's possible to add a map with null key value which contradicts the doc.

...
Also for a given entry, the "key" is non-nullable, however the "value" can be null.

It doesn't relate to the change as I can do the same for other types, e.g. for the bigInt() writer:
{1 -> 11, null -> 22, 3 -> 33}.

So, maybe it’s worth creating an issue about it. Or let me know if it’s expected.

@lidavidm
Copy link
Member

@axreldable sorry for the delay. It would be worth making an issue, however, as far as I understand, the APIs were originally designed not to do very much checking, instead assuming that you maintain the needed invariants. (For instance: I don't think anything stops you from putting nulls in a non-null column.)

@lidavidm lidavidm merged commit 266dfc1 into apache:main Nov 17, 2025
25 of 26 checks passed
@axreldable axreldable deleted the fix-UnionMapWriter branch November 17, 2025 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change bug-fix PRs that fix a big.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UnionMapWriter fails to write fixed size binary

3 participants