Commit 24086cf

Merge remote-tracking branch 'github/main' into group_first
2 parents 5623009 + d38e42c commit 24086cf

53 files changed, with 3132 additions and 330 deletions

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
@@ -4,6 +4,31 @@
 
 [1]: https://pypi.org/project/bigframes/#history
 
+## [2.14.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.13.0...v2.14.0) (2025-08-05)
+
+
+### Features
+
+* Dynamic table width for better display across devices ([#1948](https://github.com/googleapis/python-bigquery-dataframes/issues/1948)) ([a6d30ae](https://github.com/googleapis/python-bigquery-dataframes/commit/a6d30ae3f4358925c999c53b558c1ecd3ee03e6c))
+* Retry AI/ML jobs that fail more often ([#1965](https://github.com/googleapis/python-bigquery-dataframes/issues/1965)) ([25bde9f](https://github.com/googleapis/python-bigquery-dataframes/commit/25bde9f9b89112db0efcc119bf29b6d1f3896c33))
+* Support series input in managed function ([#1920](https://github.com/googleapis/python-bigquery-dataframes/issues/1920)) ([62a189f](https://github.com/googleapis/python-bigquery-dataframes/commit/62a189f4d69f6c05fe348a1acd1fbac364fa60b9))
+
+
+### Bug Fixes
+
+* Enhance type error messages for bigframes functions ([#1958](https://github.com/googleapis/python-bigquery-dataframes/issues/1958)) ([770918e](https://github.com/googleapis/python-bigquery-dataframes/commit/770918e998bf1fde7a656e8f8a0ff0a8c68509f2))
+
+
+### Performance Improvements
+
+* Use promote_offsets for consistent row number generation for index.get_loc ([#1957](https://github.com/googleapis/python-bigquery-dataframes/issues/1957)) ([c67a25a](https://github.com/googleapis/python-bigquery-dataframes/commit/c67a25a879ab2a35ca9053a81c9c85b5660206ae))
+
+
+### Documentation
+
+* Add code snippet for storing dataframes to a CSV file ([#1943](https://github.com/googleapis/python-bigquery-dataframes/issues/1943)) ([a511e09](https://github.com/googleapis/python-bigquery-dataframes/commit/a511e09e6924d2e8302af2eb4a602c6b9e5d2d72))
+* Add code snippet for storing dataframes to a CSV file ([#1953](https://github.com/googleapis/python-bigquery-dataframes/issues/1953)) ([a298a02](https://github.com/googleapis/python-bigquery-dataframes/commit/a298a02b451f03ca200fe0756b9a7b57e3d1bf0e))
+
 ## [2.13.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.12.0...v2.13.0) (2025-07-25)
 
 
GEMINI.md

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
+# Contribution guidelines, tailored for LLM agents
+
+## Testing
+
+We use `nox` to run our tests.
+
+- To test your changes, run unit tests with `nox`:
+
+  ```bash
+  nox -r -s unit
+  ```
+
+- To run a single unit test:
+
+  ```bash
+  nox -r -s unit-3.13 -- -k <name of test>
+  ```
+
+- To run system tests, you can execute:
+
+      # Run all system tests
+      $ nox -r -s system
+
+      # Run a single system test
+      $ nox -r -s system-3.13 -- -k <name of test>
+
+- The codebase must have better coverage than it had previously after each
+  change. You can test coverage via `nox -s unit system cover` (takes a long
+  time).
+
+## Code Style
+
+- We use the automatic code formatter `black`. You can run it using
+  the nox session `format`. This will eliminate many lint errors. Run via:
+
+  ```bash
+  nox -r -s format
+  ```
+
+- PEP8 compliance is required, with exceptions defined in the linter configuration.
+  If you have `nox` installed, you can test that you have not introduced
+  any non-compliant code via:
+
+  ```bash
+  nox -r -s lint
+  ```
+
+## Documentation
+
+If a method or property is implementing the same interface as a third-party
+package such as pandas or scikit-learn, place the relevant docstring in the
+corresponding `third_party/bigframes_vendored/package_name` directory, not in
+the `bigframes` directory. Implementations may be placed in the `bigframes`
+directory, though.
+
+### Testing code samples
+
+Code samples are very important for accurate documentation. We use the "doctest"
+framework to ensure the samples are functioning as expected. After adding a code
+sample, please ensure it is correct by running doctest. To run the
+doctests for just a single method, refer to the following example:
+
+```bash
+pytest --doctest-modules bigframes/pandas/__init__.py::bigframes.pandas.cut
+```
+
+## Tips for implementing common BigFrames features
+
+### Adding a scalar operator
+
+For an example, see commit
+[c5b7fdae74a22e581f7705bc0cf5390e928f4425](https://github.com/googleapis/python-bigquery-dataframes/commit/c5b7fdae74a22e581f7705bc0cf5390e928f4425).
+
+To add a new scalar operator, follow these steps:
+
+1. **Define the operation dataclass:**
+   - In `bigframes/operations/`, find the relevant file (e.g., `geo_ops.py` for geography functions) or create a new one.
+   - Create a new dataclass inheriting from `base_ops.UnaryOp` for unary
+     operators, `base_ops.BinaryOp` for binary operators, `base_ops.TernaryOp`
+     for ternary operators, or `base_ops.NaryOp` for operators with many
+     arguments. Note that these operators are counting the number of column-like
+     arguments. A function that takes only a single column but several literal
+     values would still be a `UnaryOp`.
+   - Define the `name` of the operation and any parameters it requires.
+   - Implement the `output_type` method to specify the data type of the result.
+
+2. **Export the new operation:**
+   - In `bigframes/operations/__init__.py`, import your new operation dataclass and add it to the `__all__` list.
+
+3. **Implement the user-facing function (pandas-like):**
+
+   - Identify the canonical function from pandas / geopandas / awkward array /
+     other popular Python package that this operator implements.
+   - Find the corresponding class in BigFrames. For example, the implementation
+     for most geopandas.GeoSeries methods is in
+     `bigframes/geopandas/geoseries.py`. Pandas Series methods are implemented
+     in `bigframes/series.py` or one of the accessors, such as `StringMethods`
+     in `bigframes/operations/strings.py`.
+   - Create the user-facing function that will be called by users (e.g., `length`).
+   - If the SQL method differs from pandas or geopandas in a way that can't be
+     made the same, raise a `NotImplementedError` with an appropriate message and
+     link to the feedback form.
+   - Add the docstring to the corresponding file in
+     `third_party/bigframes_vendored`, modeled after pandas / geopandas.
+
+4. **Implement the user-facing function (SQL-like):**
+
+   - In `bigframes/bigquery/_operations/`, find the relevant file (e.g., `geo.py`) or create a new one.
+   - Create the user-facing function that will be called by users (e.g., `st_length`).
+   - This function should take a `Series` for any column-like inputs, plus any other parameters.
+   - Inside the function, call `series._apply_unary_op`,
+     `series._apply_binary_op`, or similar, passing the operation dataclass you
+     created.
+   - Add a comprehensive docstring with examples.
+   - In `bigframes/bigquery/__init__.py`, import your new user-facing function and add it to the `__all__` list.
+
+5. **Implement the compilation logic:**
+   - In `bigframes/core/compile/scalar_op_compiler.py`:
+     - If the BigQuery function has a direct equivalent in Ibis, you can often reuse an existing Ibis method.
+     - If not, define a new Ibis UDF using `@ibis_udf.scalar.builtin` to map to the specific BigQuery function signature.
+     - Create a new compiler implementation function (e.g., `geo_length_op_impl`).
+     - Register this function to your operation dataclass using `@scalar_op_compiler.register_unary_op` or `@scalar_op_compiler.register_binary_op`.
+     - This implementation will translate the BigQuery DataFrames operation into the appropriate Ibis expression.
+
+6. **Add tests:**
+   - Add system tests in the `tests/system/` directory to verify the end-to-end
+     functionality of the new operator. Test various inputs, including edge cases
+     and `NULL` values.
+
+     Where possible, run the same test code against pandas or GeoPandas and
+     check that the outputs match (except for dtypes if BigFrames
+     differs from pandas).
+   - If you are overriding a pandas or GeoPandas property, add a unit test to
+     ensure the correct behavior (e.g., raising `NotImplementedError` if the
+     functionality is not supported).
+
+
+## Constraints
+
+- Only add git commits. Do not change git history.
+- Follow the spec file for development.
+  - Check off items in the "Acceptance
+    criteria" and "Detailed steps" sections with `[x]`.
+  - Please do this as they are completed.
+  - Refer back to the spec after each step.

bigframes/_importing.py

Lines changed: 8 additions & 3 deletions
@@ -14,6 +14,7 @@
 import importlib
 from types import ModuleType
 
+import numpy
 from packaging import version
 
 # Keep this in sync with setup.py
@@ -22,9 +23,13 @@
 
 def import_polars() -> ModuleType:
     polars_module = importlib.import_module("polars")
-    imported_version = version.Version(polars_module.build_info()["version"])
-    if imported_version < POLARS_MIN_VERSION:
+    # Check for necessary methods instead of the version number because we
+    # can't trust the polars version until
+    # https://github.com/pola-rs/polars/issues/23940 is fixed.
+    try:
+        polars_module.lit(numpy.int64(100), dtype=polars_module.Int64())
+    except TypeError:
         raise ImportError(
-            f"Imported polars version: {imported_version} is below the minimum version: {POLARS_MIN_VERSION}"
+            f"Imported polars version is likely below the minimum version: {POLARS_MIN_VERSION}"
         )
     return polars_module
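
Review note: this change replaces a version comparison with a capability probe. The sketch below illustrates that general technique without depending on polars. `check_lit_accepts_dtype`, `old_module`, and `new_module` are hypothetical names, and the stand-in "old" module fails because of an unexpected keyword argument, which is an assumption and may not be the reason older polars releases raise `TypeError`.

```python
import types
from typing import Any


def check_lit_accepts_dtype(module: Any) -> None:
    """Probe `module.lit` with a `dtype` keyword instead of reading a version."""
    try:
        module.lit(100, dtype="int64")
    except TypeError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            "module is likely below the minimum supported version"
        ) from exc


# Hypothetical stand-ins for an old and a new release of some library.
old_module = types.SimpleNamespace(lit=lambda value: value)
new_module = types.SimpleNamespace(lit=lambda value, dtype=None: (value, dtype))

check_lit_accepts_dtype(new_module)  # passes silently

try:
    check_lit_accepts_dtype(old_module)
except ImportError as err:
    print(f"rejected: {err}")
```

A probe like this tests the behavior the caller actually needs, at the cost of a less precise error message, which matches the "is likely below" wording in the new `ImportError`.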

bigframes/bigquery/__init__.py

Lines changed: 11 additions & 5 deletions
@@ -29,6 +29,9 @@
 )
 from bigframes.bigquery._operations.geo import (
     st_area,
+    st_buffer,
+    st_centroid,
+    st_convexhull,
     st_difference,
     st_distance,
     st_intersection,
@@ -54,11 +57,18 @@
     # approximate aggregate ops
     "approx_top_count",
     # array ops
-    "array_length",
     "array_agg",
+    "array_length",
     "array_to_string",
+    # datetime ops
+    "unix_micros",
+    "unix_millis",
+    "unix_seconds",
     # geo ops
     "st_area",
+    "st_buffer",
+    "st_centroid",
+    "st_convexhull",
     "st_difference",
     "st_distance",
     "st_intersection",
@@ -81,8 +91,4 @@
     "sql_scalar",
     # struct ops
     "struct",
-    # datetime ops
-    "unix_micros",
-    "unix_millis",
-    "unix_seconds",
 ]
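
Review note: because each new function has to be both imported and listed in `__all__`, the two can drift apart. The following is a small sketch of a consistency check, assuming a hypothetical helper `missing_exports` and a synthetic `demo` module. Neither is part of BigFrames.

```python
import types
from typing import List


def missing_exports(module: types.ModuleType) -> List[str]:
    """Return names listed in `module.__all__` that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical module standing in for a package `__init__`.
demo = types.ModuleType("demo")
demo.st_area = lambda series: series
demo.__all__ = ["st_area", "st_buffer"]  # st_buffer is listed but never imported

print(missing_exports(demo))  # prints: ['st_buffer']
```

A unit test could call such a helper on the real package and assert that the result is empty.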
