Skip to content

Conversation

@kevinjqliu
Copy link
Contributor

@kevinjqliu kevinjqliu commented Jun 19, 2025

Closes #2124

Rationale for this change

Refactor Makefile

  • Grouped commands and added comments
  • Added COVERAGE param to run test with coverage
  • Added COVERAGE_FAIL_UNDER param to specify coverage threshold to pass
  • Change test coverage threshold to 85 for unit tests, we are currently at 87
  • Change test coverage threshold to 75 for integration tests, we are currently at 77

CI

  • Add s3/adls/gcs integration tests to run in CI
  • Run tests with coverage report

Note the gcsfs issue was resolved by #2127

Are these changes tested?

Yes

Are there any user-facing changes?

No

@kevinjqliu
Copy link
Contributor Author

kevinjqliu commented Jun 19, 2025

gcsfs is throwing errors...

________________________ test_fsspec_new_input_file_gcs ________________________

fsspec_fileio_gcs = <pyiceberg.io.fsspec.FsspecFileIO object at 0x7f3934871ba0>

    @pytest.mark.gcs
    def test_fsspec_new_input_file_gcs(fsspec_fileio_gcs: FsspecFileIO) -> None:
        """Test creating a new input file from a fsspec file-io"""
        location = f"gs://warehouse/{uuid.uuid4()}.txt"
    
>       input_file = fsspec_fileio_gcs.new_input(location=location)

tests/io/test_fsspec.py:484: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyiceberg/io/fsspec.py:354: in new_input
    fs = self.get_fs(uri.scheme)
pyiceberg/io/fsspec.py:391: in _get_fs
    return self._scheme_to_fs[scheme](self.properties)
pyiceberg/io/fsspec.py:177: in _gs
    from gcsfs import GCSFileSystem
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/gcsfs/__init__.py:5: in <module>
    from .core import GCSFileSystem
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/gcsfs/core.py:29: in <module>
    from .credentials import GoogleCredentials
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/gcsfs/credentials.py:11: in <module>
    import google.auth.compute_engine
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/compute_engine/__init__.py:17: in <module>
    from google.auth.compute_engine.credentials import Credentials
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/compute_engine/credentials.py:29: in <module>
    from google.auth import iam
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/iam.py:28: in <module>
    from google.auth import crypt
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/crypt/__init__.py:39: in <module>
    from google.auth.crypt import rsa
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/crypt/rsa.py:20: in <module>
    from google.auth.crypt import _cryptography_rsa
../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/google/auth/crypt/_cryptography_rsa.py:28: in <module>
    import pkg_resources
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    import zipfile
    import zipimport
    from collections.abc import Iterable, Iterator, Mapping, MutableSequence
    from pkgutil import get_importer
    from typing import (
        TYPE_CHECKING,
        Any,
        BinaryIO,
        Callable,
        Literal,
        NamedTuple,
        NoReturn,
        Protocol,
        TypeVar,
        Union,
        overload,
    )
    
    sys.path.extend(((vendor_path := os.path.join(os.path.dirname(os.path.dirname(__file__)), 'setuptools', '_vendor')) not in sys.path) * [vendor_path])  # fmt: skip
    # workaround for #4[476](https://github.com/apache/iceberg-python/actions/runs/15764189458/job/44437233302?pr=2125#step:5:477)
    sys.modules.pop('backports', None)
    
    # capture these to bypass sandboxing
    from os import open as os_open, utime  # isort: skip
    from os.path import isdir, split  # isort: skip
    
    try:
        from os import mkdir, rename, unlink
    
        WRITE_SUPPORT = True
    except ImportError:
        # no write support, probably under GAE
        WRITE_SUPPORT = False
    
    import packaging.markers
    import packaging.requirements
    import packaging.specifiers
    import packaging.utils
    import packaging.version
    from jaraco.text import drop_comment, join_continuation, yield_lines
    from platformdirs import user_cache_dir as _user_cache_dir
    
    if TYPE_CHECKING:
        from _typeshed import BytesPath, StrOrBytesPath, StrPath
        from _typeshed.importlib import LoaderProtocol
        from typing_extensions import Self, TypeAlias
    
>   warnings.warn(
        "pkg_resources is deprecated as an API. "
        "See https://setuptools.pypa.io/en/latest/pkg_resources.html. "
        "The pkg_resources package is slated for removal as early as "
        "2025-11-30. Refrain from using this package or pin to "
        "Setuptools<81.",
        UserWarning,
        stacklevel=2,
    )
E   UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.

../../../.cache/pypoetry/virtualenvs/pyiceberg-bKWKvoA4-py3.10/lib/python3.10/site-packages/pkg_resources/__init__.py:98: UserWarning

Resolved for now by adding these errors to filterwarnings

Comment on lines -85 to -87
docker compose -f dev/docker-compose-integration.yml kill
docker compose -f dev/docker-compose-integration.yml rm -f
docker compose -f dev/docker-compose-integration.yml up -d
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor: remove these lines and reuse test-integration-setup

Comment on lines -51 to -53
test-s3: # Run tests marked with s3, can add arguments with PYTEST_ARGS="-vv"
sh ./dev/run-minio.sh
poetry run pytest tests/ -m s3 ${PYTEST_ARGS}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor: move lower and closer to test-adls and test-gcs

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving the Makefile some much-needed love and attention @kevinjqliu 🙌 I left some extra-nitpicky comments to clean it up a bit more, let me know what you think!

Fokko added a commit to Fokko/iceberg-python that referenced this pull request Jun 20, 2025
In apache#2125 we got a warning
from the google-auth package, and we're on `1.6.3` while `2.40.3`
is out there.
kevinjqliu pushed a commit that referenced this pull request Jun 20, 2025
In #2125 we got a warning
from the google-auth package, and we're on `1.6.3` while `2.40.3` is out
there. So maybe good to bump everything 👍

<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->

# Rationale for this change

# Are these changes tested?

# Are there any user-facing changes?

<!-- In the case of user-facing changes, please add the changelog label.
-->
@kevinjqliu kevinjqliu force-pushed the kevinjqliu/fix-integration-tests branch from 4ccc9e3 to cf061a2 Compare June 20, 2025 14:51
@kevinjqliu kevinjqliu changed the title CI: run FileIO integration tests (s3/adls/gcs) Refactor Makefile and run s3/adls/gcs integration tests in CI Jun 20, 2025
@kevinjqliu
Copy link
Contributor Author

TOTAL                                      14886   1893    87%
Coverage failure: total of 87 is less than fail-under=90
make: *** [Makefile:123: coverage-report] Error 2
Error: Process completed with exit code 2.

:(

@kevinjqliu kevinjqliu requested a review from Fokko June 20, 2025 16:35

help: ## Display this help
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m\033[0m\n"} /^[a-zA-Z_-]+:.*?##/ { printf " \033[36m%-20s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
ifeq ($(COVERAGE),1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it

@Fokko
Copy link
Contributor

Fokko commented Jun 20, 2025

@kevinjqliu This is the tricky part of the coverage. Looks like we have sufficient coverage for the unit tests, but not for the integration tests :( So maybe only report coverage for the unit-tests? Or we have to do some GitHub Actions wizardry to combine the reports of both the unit-tests and the integration tests.

@kevinjqliu
Copy link
Contributor Author

Made coverage fail percentage customizable.

  • 85% for unit tests, currently at 87%
  • 75% for integration tests, currently at 77%

Also added this comment :)

Coverage threshold should only increase over time — never decrease it!

@kevinjqliu kevinjqliu requested a review from Fokko June 20, 2025 19:34
Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, love the cleanup 🚀

@Fokko Fokko merged commit 89e71c3 into apache:main Jun 20, 2025
10 checks passed
@Fokko
Copy link
Contributor

Fokko commented Jun 20, 2025

Thanks @kevinjqliu 🙌

@kevinjqliu kevinjqliu deleted the kevinjqliu/fix-integration-tests branch June 20, 2025 22:41
amitgilad3 pushed a commit to amitgilad3/iceberg-python that referenced this pull request Jul 7, 2025
In apache#2125 we got a warning
from the google-auth package, and we're on `1.6.3` while `2.40.3` is out
there. So maybe good to bump everything 👍

<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->

# Rationale for this change

# Are these changes tested?

# Are there any user-facing changes?

<!-- In the case of user-facing changes, please add the changelog label.
-->
amitgilad3 pushed a commit to amitgilad3/iceberg-python that referenced this pull request Jul 7, 2025
<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->
Closes apache#2124

# Rationale for this change
Refactor `Makefile` 
* Grouped commands and added comments
* Added `COVERAGE` param to run test with coverage
* Added `COVERAGE_FAIL_UNDER` param to specify coverage threshold to
pass
* Change test coverage threshold to `85` for unit tests, we are
currently at `87`
* Change test coverage threshold to `75` for integration tests, we are
currently at `77`

CI 
* Add s3/adls/gcs integration tests to run in CI
* Run tests with coverage report

Note the `gcsfs` issue was resolved by apache#2127

# Are these changes tested?
Yes

# Are there any user-facing changes?
No

<!-- In the case of user-facing changes, please add the changelog label.
-->
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025
In apache#2125 we got a warning
from the google-auth package, and we're on `1.6.3` while `2.40.3` is out
there. So maybe good to bump everything 👍

<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->

# Rationale for this change

# Are these changes tested?

# Are there any user-facing changes?

<!-- In the case of user-facing changes, please add the changelog label.
-->
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025
<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->
Closes apache#2124

# Rationale for this change
Refactor `Makefile` 
* Grouped commands and added comments
* Added `COVERAGE` param to run test with coverage
* Added `COVERAGE_FAIL_UNDER` param to specify coverage threshold to
pass
* Change test coverage threshold to `85` for unit tests, we are
currently at `87`
* Change test coverage threshold to `75` for integration tests, we are
currently at `77`

CI 
* Add s3/adls/gcs integration tests to run in CI
* Run tests with coverage report

Note the `gcsfs` issue was resolved by apache#2127

# Are these changes tested?
Yes

# Are there any user-facing changes?
No

<!-- In the case of user-facing changes, please add the changelog label.
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI: run FileIO integration tests (s3/adls/gcs)

2 participants