@zhongyujiang
Contributor

Rationale for this change

The `decompress` method in `ZStandardCodec` should return `bytes`, but it currently returns a `bytearray`, which raises a `TypeError` when reading Avro files compressed with zstd:

def new_decoder(b: bytes) -> BinaryDecoder:
        try:
            from pyiceberg.avro.decoder_fast import CythonBinaryDecoder

>           return CythonBinaryDecoder(b)
E           TypeError: Argument 'input_contents' has incorrect type (expected bytes, got bytearray)
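
For illustration, here is a minimal sketch of the failure mode and of the kind of fix applied. It does not reproduce the actual `ZStandardCodec` code: `strict_decoder`, `decompress_chunked`, and `decompress_as_bytes` are hypothetical names, `zlib` from the standard library stands in for zstd, and the explicit type check only emulates how a Cython argument typed as `bytes` rejects a `bytearray`.

```python
import zlib


def strict_decoder(input_contents: bytes) -> int:
    """Hypothetical stand-in for a Cython function whose argument is typed as bytes."""
    # Emulate the strict argument check: a bytearray is not accepted as bytes.
    if type(input_contents) is not bytes:
        raise TypeError(
            "Argument 'input_contents' has incorrect type "
            f"(expected bytes, got {type(input_contents).__name__})"
        )
    return len(input_contents)


def decompress_chunked(data: bytes, chunk_size: int = 4) -> bytearray:
    """Accumulate decompressed chunks in a bytearray (zlib is an assumption, not zstd)."""
    decompressor = zlib.decompressobj()
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        out.extend(decompressor.decompress(data[start:start + chunk_size]))
    out.extend(decompressor.flush())
    return out  # mutable buffer: this is the problematic return type


def decompress_as_bytes(data: bytes) -> bytes:
    """Same accumulation, but converted to immutable bytes before returning."""
    return bytes(decompress_chunked(data))


payload = b"avro block payload" * 8
compressed = zlib.compress(payload)

# Passing the bytearray straight through is rejected by the strict decoder.
try:
    strict_decoder(decompress_chunked(compressed))
    rejected = False
except TypeError:
    rejected = True

# Converting to bytes first lets the decoder accept the decompressed block.
decoded_length = strict_decoder(decompress_as_bytes(compressed))
```

The conversion copies the buffer once, which is a small cost compared with keeping the decoder's argument strictly typed.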

Are these changes tested?

Yes, by `test_write_manifest`.

Are there any user-facing changes?

No.

@zhongyujiang zhongyujiang force-pushed the yuj/fix-zstd-decompress branch from d09539c to 0b1d932 Compare June 20, 2025 07:39
Contributor

@Fokko Fokko left a comment

I'm able to reproduce it locally, great catch @zhongyujiang


    def new_decoder(b: bytes) -> BinaryDecoder:
        try:
            from pyiceberg.avro.decoder_fast import CythonBinaryDecoder
    
>           return CythonBinaryDecoder(b)
E           TypeError: Argument 'input_contents' has incorrect type (expected bytes, got bytearray)

../../pyiceberg/avro/decoder.py:181: TypeError

@Fokko Fokko merged commit c27028f into apache:main Jun 20, 2025
10 checks passed
@zhongyujiang zhongyujiang deleted the yuj/fix-zstd-decompress branch June 21, 2025 02:25
amitgilad3 pushed a commit to amitgilad3/iceberg-python that referenced this pull request Jul 7, 2025
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025