Commit 6ca85ac
chore: merge latest commits on main
commit 1ed3abd
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Wed Jul 17 02:04:52 2024 -0400
Allow writing `pa.Table` that are either a subset of table schema or in arbitrary order, and support type promotion on write (apache#921)
* merge
* thanks @HonahX :)
Co-authored-by: Honah J. <undefined.newdb.newtable@gmail.com>
* support promote
* revert promote
* use a visitor
* support promotion on write
* fix
* Thank you @Fokko !
Co-authored-by: Fokko Driesprong <fokko@apache.org>
* revert
* add_files promotion test
* support promote for add_files
* add tests for uuid
* add_files subset schema test
---------
Co-authored-by: Honah J. <undefined.newdb.newtable@gmail.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
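The rule in this commit's title (a written table may be a subset of the table schema, in any column order, with types that can be promoted) can be pictured with a small library-free sketch. The names `check_compatible`, `PROMOTIONS`, and the tuple layout are hypothetical and are not PyIceberg's API. The sketch assumes that only optional fields may be omitted, and it lists only two of the promotions the Iceberg spec permits (int to long, float to double).

```python
from typing import Dict, List, Tuple

# Hypothetical promotion table: provided type -> table types it may widen to.
PROMOTIONS = {"int": {"long"}, "float": {"double"}}

# Table schema as name -> (type, required); a stand-in for a real schema object.
TableSchema = Dict[str, Tuple[str, bool]]


def check_compatible(table: TableSchema, provided: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the write is acceptable."""
    problems = []
    for name, ptype in provided.items():  # the order of provided fields is irrelevant
        if name not in table:
            problems.append(f"unknown field: {name}")
            continue
        ttype, _ = table[name]
        if ptype != ttype and ttype not in PROMOTIONS.get(ptype, set()):
            problems.append(f"{name}: cannot promote {ptype} to {ttype}")
    for name, (_, required) in table.items():
        if required and name not in provided:  # assumption: only optional fields may be omitted
            problems.append(f"missing required field: {name}")
    return problems


table_schema = {"id": ("long", True), "score": ("double", False), "note": ("string", False)}
print(check_compatible(table_schema, {"score": "float", "id": "int"}))  # subset, reordered, promoted
print(check_compatible(table_schema, {"note": "string"}))  # required "id" is missing
```

The first call reports no problems; the second reports the missing required field.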
commit 0f2e19e
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jul 15 23:25:08 2024 -0700
Bump zstandard from 0.22.0 to 0.23.0 (apache#934)
Bumps [zstandard](https://github.com/indygreg/python-zstandard) from 0.22.0 to 0.23.0.
- [Release notes](https://github.com/indygreg/python-zstandard/releases)
- [Changelog](https://github.com/indygreg/python-zstandard/blob/main/docs/news.rst)
- [Commits](indygreg/python-zstandard@0.22.0...0.23.0)
---
updated-dependencies:
- dependency-name: zstandard
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit ec73d97
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jul 15 23:24:47 2024 -0700
Bump griffe from 0.47.0 to 0.48.0 (apache#933)
Bumps [griffe](https://github.com/mkdocstrings/griffe) from 0.47.0 to 0.48.0.
- [Release notes](https://github.com/mkdocstrings/griffe/releases)
- [Changelog](https://github.com/mkdocstrings/griffe/blob/main/CHANGELOG.md)
- [Commits](mkdocstrings/griffe@0.47.0...0.48.0)
---
updated-dependencies:
- dependency-name: griffe
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit d05a423
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jul 15 23:24:16 2024 -0700
Bump mkdocs-material from 9.5.28 to 9.5.29 (apache#932)
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.28 to 9.5.29.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](squidfunk/mkdocs-material@9.5.28...9.5.29)
---
updated-dependencies:
- dependency-name: mkdocs-material
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit e27cd90
Author: Yair Halevi (Spock) <118175475+spock-abadai@users.noreply.github.com>
Date: Sun Jul 14 22:11:04 2024 +0300
Allow empty `names` in mapped field of Name Mapping (apache#927)
* Remove check_at_least_one field validator
The Iceberg spec permits an empty list of names in the default name mapping, so check_at_least_one is unnecessary.
* Remove irrelevant test case
* Fixing pydantic model
No longer requiring minimum length of names list to be 1.
* Added test case for empty names in name mapping
* Fixed formatting error
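As a rough illustration of what an empty `names` list means for lookups, the sketch below parses a name mapping in the JSON shape the Iceberg spec uses (`field-id`, `names`) and builds a name-to-ID lookup. The helper `names_to_ids` and the sample field names are hypothetical; this is not the pydantic model the commit changes.

```python
import json
from typing import Dict, List

# Sample mapping: field 2 has no names at all, which the spec allows.
mapping_json = '[{"field-id": 1, "names": ["id", "record_id"]}, {"field-id": 2, "names": []}]'


def names_to_ids(mapping: List[dict]) -> Dict[str, int]:
    """Build a lookup from every alias to its field ID."""
    lookup: Dict[str, int] = {}
    for field in mapping:
        for name in field["names"]:  # an empty list simply contributes no aliases
            lookup[name] = field["field-id"]
    return lookup


print(names_to_ids(json.loads(mapping_json)))
```

A field with no names is still a valid entry; it just cannot be resolved by name.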
commit 3f44dfe
Author: Soumya Ghosh <ghoshsoumya92@gmail.com>
Date: Sun Jul 14 00:35:38 2024 +0530
Lowercase bool values in table properties (apache#924)
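The motivation is easy to reproduce in plain Python: `str(True)` yields `"True"`, while property values are conventionally written as `"true"` and `"false"`. The helpers below are a hypothetical sketch of that normalization, not the code from the commit, and the property keys are only examples.

```python
from typing import Any, Dict


def property_as_string(value: Any) -> str:
    """Render a property value as a string, lowercasing booleans."""
    if isinstance(value, bool):
        return str(value).lower()  # str(True) is "True"; emit "true" instead
    return str(value)


def normalize_properties(props: Dict[str, Any]) -> Dict[str, str]:
    """Apply the conversion to every value in a properties dictionary."""
    return {key: property_as_string(value) for key, value in props.items()}


print(normalize_properties({"write.metadata.delete-after-commit.enabled": True, "commit.retry.num-retries": 4}))
```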
commit b11cdb5
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Fri Jul 12 16:45:04 2024 -0400
Deprecate to_requested_schema (apache#918)
* deprecate to_requested_schema
* prep for release
commit a3dd531
Author: Honah J <honahx@apache.org>
Date: Fri Jul 12 13:14:40 2024 -0700
Glue endpoint config variable, continue apache#530 (apache#920)
Co-authored-by: Seb Pretzer <24555985+sebpretzer@users.noreply.github.com>
commit 32e8f88
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Fri Jul 12 15:26:00 2024 -0400
support PyArrow timestamptz with Etc/UTC (apache#910)
Co-authored-by: Fokko Driesprong <fokko@apache.org>
commit f6d56e9
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Fri Jul 12 05:31:06 2024 -0400
fix invalidation logic (apache#911)
commit 6488ad8
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Jul 11 22:56:48 2024 -0700
Bump coverage from 7.5.4 to 7.6.0 (apache#917)
Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.5.4 to 7.6.0.
- [Release notes](https://github.com/nedbat/coveragepy/releases)
- [Changelog](https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst)
- [Commits](coveragepy/coveragepy@7.5.4...7.6.0)
---
updated-dependencies:
- dependency-name: coverage
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit dceedfa
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Thu Jul 11 20:32:14 2024 -0400
Check if schema is compatible in `add_files` API (apache#907)
Co-authored-by: Fokko Driesprong <fokko@apache.org>
commit aceed2a
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Jul 11 15:52:06 2024 +0200
Bump mypy-boto3-glue from 1.34.136 to 1.34.143 (apache#912)
Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.34.136 to 1.34.143.
- [Release notes](https://github.com/youtype/mypy_boto3_builder/releases)
- [Commits](https://github.com/youtype/mypy_boto3_builder/commits)
---
updated-dependencies:
- dependency-name: mypy-boto3-glue
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 1b9b884
Author: Fokko Driesprong <fokko@apache.org>
Date: Thu Jul 11 12:45:20 2024 +0200
PyArrow: Don't enforce the schema when reading/writing (apache#902)
* PyArrow: Don't enforce the schema
PyIceberg struggled with Arrow types that are distinct but equivalent,
such as `string` and `large_string`. They represent the same data, but
differ under the hood.
My take is that we should hide these kinds of details from the user
as much as possible. So far we have passed the
Iceberg schema into Arrow, but when doing this, Iceberg has to
decide whether to use the large or the non-large type.
This PR stops passing down the schema and lets Arrow decide,
unless:
- The type should be evolved
- In case of re-ordering, we reorder the original types
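To make "the same, but different under the hood" concrete: in the Arrow columnar format, `string` stores 32-bit offsets into its data buffer and `large_string` stores 64-bit offsets. The sketch below mimics that layout with the standard library only; `encode_strings` and `decode_strings` are hypothetical helpers and do not use PyArrow.

```python
import struct
from typing import List, Tuple


def encode_strings(values: List[str], offset_format: str) -> Tuple[bytes, bytes]:
    """Pack strings into (offsets buffer, data buffer); "i" = 32-bit, "q" = 64-bit offsets."""
    encoded = [value.encode("utf-8") for value in values]
    offsets, position = [0], 0
    for chunk in encoded:
        position += len(chunk)
        offsets.append(position)
    packed = struct.pack(f"<{len(offsets)}{offset_format}", *offsets)
    return packed, b"".join(encoded)


def decode_strings(packed: bytes, data: bytes, offset_format: str) -> List[str]:
    """Recover the original strings from the two buffers."""
    width = struct.calcsize("<" + offset_format)
    offsets = struct.unpack(f"<{len(packed) // width}{offset_format}", packed)
    return [data[start:end].decode("utf-8") for start, end in zip(offsets, offsets[1:])]


values = ["ice", "berg"]
small_offsets, small_data = encode_strings(values, "i")  # like `string`
large_offsets, large_data = encode_strings(values, "q")  # like `large_string`
print(len(small_offsets), len(large_offsets))  # offset buffers differ in size
print(small_data == large_data)  # the character data is identical
```

Both encodings decode to the same values, which is why the choice between them is a detail worth hiding from the user.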
* WIP
* Reuse Table schema
* Make linter happy
* Squash some bugs
* Thanks Sung!
Co-authored-by: Sung Yun <107272191+syun64@users.noreply.github.com>
* Moar code moar bugs
* Remove the variables wrt file sizes
* Linting
* Go with large ones for now
* Missed one there!
---------
Co-authored-by: Sung Yun <107272191+syun64@users.noreply.github.com>
commit 8f47dfd
Author: Soumya Ghosh <ghoshsoumya92@gmail.com>
Date: Thu Jul 11 11:52:55 2024 +0530
Move determine_partitions and helper methods to io.pyarrow (apache#906)
commit 5aa451d
Author: Soumya Ghosh <ghoshsoumya92@gmail.com>
Date: Thu Jul 11 07:57:05 2024 +0530
Rename data_sequence_number to sequence_number in ManifestEntry (apache#900)
commit 77a07c9
Author: Honah J <honahx@apache.org>
Date: Wed Jul 10 03:56:13 2024 -0700
Support MergeAppend operations (apache#363)
* add ListPacker + tests
* add merge append
* add merge_append
* fix snapshot inheritance
* test manifest file and entries
* add doc
* fix lint
* change test name
* address review comments
* rename _MergingSnapshotProducer to _SnapshotProducer
* fix a serious bug
* update the doc
* remove merge_append as public API
* make default to false
* add test description
* fix merge conflict
* fix snapshot_id issue
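The commit only names `ListPacker`; its internals are not shown here. As a generic illustration of the bin-packing idea behind merging small manifests into larger ones, the sketch below uses a plain first-fit strategy. The function `pack`, its parameters, and the sample sizes are hypothetical and should not be read as the actual `ListPacker` behavior.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def pack(items: List[T], target_weight: int, weight: Callable[[T], int]) -> List[List[T]]:
    """First-fit packing: put each item into the first bin that still has room."""
    bins: List[List[T]] = []
    totals: List[int] = []
    for item in items:
        item_weight = weight(item)
        for index, total in enumerate(totals):
            if total + item_weight <= target_weight:
                bins[index].append(item)
                totals[index] += item_weight
                break
        else:  # no existing bin had room, so open a new one
            bins.append([item])
            totals.append(item_weight)
    return bins


manifest_sizes = [4, 3, 2, 5, 1]  # hypothetical manifest sizes in MB
print(pack(manifest_sizes, target_weight=6, weight=lambda size: size))
```

Each resulting group stays at or below the target weight, so every group could be merged into one manifest of bounded size.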
commit 66b92ff
Author: Fokko Driesprong <fokko@apache.org>
Date: Wed Jul 10 10:09:20 2024 +0200
GCS: Fix incorrect token description (apache#909)
commit c25e080
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue Jul 9 20:50:29 2024 -0700
Bump zipp from 3.17.0 to 3.19.1 (apache#905)
Bumps [zipp](https://github.com/jaraco/zipp) from 3.17.0 to 3.19.1.
- [Release notes](https://github.com/jaraco/zipp/releases)
- [Changelog](https://github.com/jaraco/zipp/blob/main/NEWS.rst)
- [Commits](jaraco/zipp@v3.17.0...v3.19.1)
---
updated-dependencies:
- dependency-name: zipp
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 301e336
Author: Sung Yun <107272191+syun64@users.noreply.github.com>
Date: Tue Jul 9 23:35:11 2024 -0400
Cast 's', 'ms' and 'ns' PyArrow timestamp to 'us' precision on write (apache#848)
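The arithmetic behind such a cast is simple: seconds and milliseconds are scaled up to microseconds without loss, while nanoseconds must be scaled down and lose their last three digits. The sketch below shows only that arithmetic on plain integers; `to_microseconds` is a hypothetical helper, and whether the real cast truncates or rejects lossy values is not stated in the commit title.

```python
# (multiplier, divisor) needed to reach microseconds from each source unit.
_TO_US = {"s": (1_000_000, 1), "ms": (1_000, 1), "us": (1, 1), "ns": (1, 1_000)}


def to_microseconds(value: int, unit: str) -> int:
    """Convert an integer timestamp in the given unit to microseconds."""
    multiplier, divisor = _TO_US[unit]
    # Floor division drops sub-microsecond digits when the source unit is "ns".
    return value * multiplier // divisor


print(to_microseconds(2, "s"))
print(to_microseconds(1_500, "ms"))
print(to_microseconds(1_234_567, "ns"))
```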
commit 3f574d3
Author: Fokko Driesprong <fokko@apache.org>
Date: Tue Jul 9 11:36:43 2024 +0200
Support partial deletes (apache#569)
* Add option to delete datafiles
This is done through the Iceberg metadata, resulting
in efficient deletes if the data is partitioned correctly
* Pull in main
* WIP
* Change DataScan to accept Metadata and io
For the partial deletes, I want to run a scan on in-memory
metadata. Changing this API allows that.
* fix name-mapping issue
* WIP
* WIP
* Moar tests
* Oops
* Cleanup
* WIP
* WIP
* Fix summary generation
* Last few bits
* Fix the requirement
* Make ruff happy
* Comments, thanks Kevin!
* Comments
* Append rather than truncate
* Fix merge conflicts
* Make the tests pass
* Add another test
* Conflicts
* Add docs (apache#33)
* docs
* docs
* Add a partitioned overwrite test
* Fix comment
* Skip empty manifests
---------
Co-authored-by: HonahX <honahx@apache.org>
Co-authored-by: Sung Yun <107272191+syun64@users.noreply.github.com>
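The note above that deletes go through the metadata and are efficient when data is partitioned well can be sketched as a planning step: files that match the delete predicate entirely can be dropped without being read, files that cannot match are kept, and only files that match partially need a rewrite. The `FileStats` class and `plan_delete_less_than` function are hypothetical, use per-file column bounds as a stand-in for real metadata, and handle only a single `column < bound` predicate.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FileStats:
    path: str
    lower: int  # minimum value of the column in this file
    upper: int  # maximum value of the column in this file


def plan_delete_less_than(files: List[FileStats], bound: int) -> Dict[str, List[str]]:
    """Classify files for `DELETE WHERE column < bound` using bounds only."""
    plan: Dict[str, List[str]] = {"drop": [], "rewrite": [], "keep": []}
    for stats in files:
        if stats.upper < bound:
            plan["drop"].append(stats.path)  # every row matches: drop the whole file
        elif stats.lower >= bound:
            plan["keep"].append(stats.path)  # no row can match: leave untouched
        else:
            plan["rewrite"].append(stats.path)  # mixed: rewrite without matching rows
    return plan


files = [FileStats("a.parquet", 1, 5), FileStats("b.parquet", 6, 12), FileStats("c.parquet", 20, 30)]
print(plan_delete_less_than(files, bound=10))
```

The better the layout separates matching from non-matching rows, the more files land in "drop" or "keep" and the fewer need rewriting.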
commit cdc3e54
Author: Fokko Driesprong <fokko@apache.org>
Date: Tue Jul 9 08:28:27 2024 +0200
Disallow writing empty Manifest files (apache#876)
* Disallow writing empty Avro files/blocks
Raising an exception when doing this might look extreme, but
there is no real good reason to allow this.
* Relax the constraints a bit
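One way to picture "raise instead of writing an empty file" is a writer that checks, on exit, whether anything was added. The class `ManifestWriterSketch`, its methods, and the error message are hypothetical and only illustrate the guard; they are not PyIceberg's writer.

```python
from typing import List, Optional


class ManifestWriterSketch:
    """Hypothetical stand-in for a manifest writer; not PyIceberg's class."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    def __enter__(self) -> "ManifestWriterSketch":
        return self

    def add_entry(self, entry: str) -> None:
        self.entries.append(entry)

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Only complain when the block itself succeeded but wrote nothing.
        if exc_type is None and not self.entries:
            raise ValueError("refusing to write an empty manifest")


def try_write(entries: List[str]) -> Optional[str]:
    """Return the error message if the write was rejected, else None."""
    try:
        with ManifestWriterSketch() as writer:
            for entry in entries:
                writer.add_entry(entry)
    except ValueError as error:
        return str(error)
    return None


print(try_write([]))
print(try_write(["data-file-1.parquet"]))
```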
commit b68e109
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jul 8 22:16:23 2024 -0700
Bump fastavro from 1.9.4 to 1.9.5 (apache#904)
Bumps [fastavro](https://github.com/fastavro/fastavro) from 1.9.4 to 1.9.5.
- [Release notes](https://github.com/fastavro/fastavro/releases)
- [Changelog](https://github.com/fastavro/fastavro/blob/master/ChangeLog)
- [Commits](fastavro/fastavro@1.9.4...1.9.5)
---
updated-dependencies:
- dependency-name: fastavro
dependency-type: direct:development
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 90547bb
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jul 8 22:15:39 2024 -0700
Bump moto from 5.0.10 to 5.0.11 (apache#903)
Bumps [moto](https://github.com/getmoto/moto) from 5.0.10 to 5.0.11.
- [Release notes](https://github.com/getmoto/moto/releases)
- [Changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md)
- [Commits](getmoto/moto@5.0.10...5.0.11)
---
updated-dependencies:
- dependency-name: moto
dependency-type: direct:development
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 7dff359
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sun Jul 7 07:50:19 2024 +0200
Bump tenacity from 8.4.2 to 8.5.0 (apache#898)
commit 4aa469e
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sat Jul 6 22:30:59 2024 +0200
Bump certifi from 2024.2.2 to 2024.7.4 (apache#899)
Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.2.2 to 2024.7.4.
- [Commits](certifi/python-certifi@2024.02.02...2024.07.04)
---
updated-dependencies:
- dependency-name: certifi
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit aa7ad78
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sat Jul 6 20:37:51 2024 +0200
Bump deptry from 0.16.1 to 0.16.2 (apache#897)
Bumps [deptry](https://github.com/fpgmaas/deptry) from 0.16.1 to 0.16.2.
- [Release notes](https://github.com/fpgmaas/deptry/releases)
- [Changelog](https://github.com/fpgmaas/deptry/blob/main/CHANGELOG.md)
- [Commits](fpgmaas/deptry@0.16.1...0.16.2)
---
updated-dependencies:
- dependency-name: deptry
dependency-type: direct:development
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
1 parent e1893a0 commit 6ca85ac
File tree: 36 files changed (+3314, -838 lines)
- mkdocs
  - docs
- pyiceberg
  - catalog
  - io
  - table
  - utils
- tests
  - avro
  - catalog
  - cli
  - integration
    - test_writes
  - io
  - table
  - utils