-
Notifications
You must be signed in to change notification settings - Fork 414
Use version-hint.text for StaticTable #1887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use version-hint.text for StaticTable #1887
Conversation
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising this @arnaudbriche 🙌
How would you feel about:
- Adding a test to make sure that this works, and we don't break it in the future
- Adding a line of two to the docs so people know that it is here
Sure. I'm not doing much Python these days, so I must miss something. I tried: poetry install
poetry run pytestBut I go the following error: |
|
@arnaudbriche That's no problem. Can you try: make install
make testWe use a |
On OSX: Without virtualenv: make install
/bin/sh: pip: command not found
Poetry version does not match required version 2.0.1. Updating...
/bin/sh: pip: command not found
make: *** [install-poetry] Error 127With virtualenv: make install
WARNING: Package(s) not found: poetry
Poetry version does not match required version 2.0.1. Updating...
[notice] A new release of pip is available: 25.0 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.
make: *** [install-poetry] Error 1 |
|
Ok so I installed the right poetry version by hand. But now, while trying to run the tests, I get: make test
poetry run pytest tests/ -m "(unmarked or parametrize) and not integration"
/Users/arnaudbriche/Desktop/code/iceberg-python/venv/lib/python3.13/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: helpconfig, Hook: pytest_cmdline_parse
ConftestImportFailure: ModuleNotFoundError: No module named 'pyiceberg' (from /Users/arnaudbriche/Desktop/code/iceberg-python/tests/conftest.py)
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
config = pluginmanager.hook.pytest_cmdline_parse(
ImportError while loading conftest '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/conftest.py'.
tests/conftest.py:51: in <module>
from pyiceberg.catalog import Catalog, load_catalog
E ModuleNotFoundError: No module named 'pyiceberg'
make: *** [test] Error 4 |
|
And now: PYTHONPATH=".:$PYTHONPATH" make test
poetry run pytest tests/ -m "(unmarked or parametrize) and not integration"
=========================================================================================== test session starts ===========================================================================================
platform darwin -- Python 3.13.2, pytest-7.4.4, pluggy-1.5.0
rootdir: /Users/arnaudbriche/Desktop/code/iceberg-python
configfile: pyproject.toml
plugins: checkdocs-2.13.0, mock-3.14.0, lazy-fixture-0.6.3, requests-mock-1.12.1
collected 3705 items / 3 errors / 884 deselected / 2821 selected
================================================================================================= ERRORS ==================================================================================================
_______________________________________________________________________________ ERROR collecting tests/catalog/test_hive.py _______________________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/catalog/test_hive.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/catalog/test_hive.py:24: in <module>
from hive_metastore.ttypes import (
E ModuleNotFoundError: No module named 'hive_metastore'
____________________________________________________________________________ ERROR collecting tests/integration/test_reads.py _____________________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/integration/test_reads.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/integration/test_reads.py:29: in <module>
from hive_metastore.ttypes import LockRequest, LockResponse, LockState, UnlockRequest
E ModuleNotFoundError: No module named 'hive_metastore'
______________________________________________________________________ ERROR collecting tests/integration/test_writes/test_writes.py ______________________________________________________________________
ImportError while importing test module '/Users/arnaudbriche/Desktop/code/iceberg-python/tests/integration/test_writes/test_writes.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/integration/test_writes/test_writes.py:40: in <module>
from pyiceberg.catalog.hive import HiveCatalog
pyiceberg/catalog/hive.py:35: in <module>
from hive_metastore.ThriftHiveMetastore import Client
E ModuleNotFoundError: No module named 'hive_metastore'
========================================================================================= short test summary info =========================================================================================
ERROR tests/catalog/test_hive.py
ERROR tests/integration/test_reads.py
ERROR tests/integration/test_writes/test_writes.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=================================================================================== 884 deselected, 3 errors in 15.41s ====================================================================================
make: *** [test] Error 2 |
|
Hi @Fokko , Do you know if running the test suite on OSX is supported ? Or should I switch to working from a Linux box ? |
|
Hey @arnaudbriche OSX should work fine (using it myself). We vendor the docker run -v `pwd`:/pyiceberg -t -i python:3.12 bash
root@bdb0eca99544:/# cd /pyiceberg/
root@bdb0eca99544:/pyiceberg# pip install poetry --force
root@bdb0eca99544:/pyiceberg# make install
...
root@bdb0eca99544:/pyiceberg# make lint
...
root@bdb0eca99544:/pyiceberg# make test
... |
9b28cb2 to
9f7aa22
Compare
|
Hi @Fokko I have version |
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @arnaudbriche This looks great to me! 👍
This change allow making use of the `version-hint.text` file when a static table is instantiated with a `metadata_location` not ending with '.metadata.json'. User can just point to the table location, and metadata file path will be read from `version-hint.text`. Closes apache#763 # Rationale for this change `version-hint.text` is useful in context where you does not want or need a full-fledge catalog. Our use case is sharing datasets publicly as Iceberg tables on S3. # Are these changes tested? No yet. # Are there any user-facing changes? Yes. User can now points `StaticTable` to the table location rather than a specific version file.
| elif content.isnumeric(): | ||
| return os.path.join(metadata_location, "metadata", "v%s.metadata.json").format(content) | ||
| else: | ||
| return os.path.join(metadata_location, "metadata", "%s.metadata.json").format(content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a syntax error, should be "{}".format.
In pytest.fixture table_location, the case metadata_filename = f"{uuid.uuid4()}.metadata.json" does not cover the faulty code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call @arnaudbriche 🙌 Maybe go with f-strings instead?
This change allow making use of the
version-hint.textfile when a static table is instantiated with ametadata_locationnot ending with '.metadata.json'.User can just point to the table location, and metadata file path will be read from
version-hint.text.Closes #763
Rationale for this change
version-hint.textis useful in context where you does not want or need a full-fledge catalog.Our use case is sharing datasets publicly as Iceberg tables on S3.
Are these changes tested?
No yet.
Are there any user-facing changes?
Yes. User can now points
StaticTableto the table location rather than a specific version file.