Commit 54f48c6

Merge branch 'main' into feature/migrate-strconcat-op
2 parents d5ee487 + 60b28bf commit 54f48c6

10 files changed

+233
-14
lines changed

CHANGELOG.md

Lines changed: 28 additions & 0 deletions
@@ -4,6 +4,34 @@
 
 [1]: https://pypi.org/project/bigframes/#history
 
+## [2.25.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.24.0...v2.25.0) (2025-10-13)
+
+
+### Features
+
+* Add barh, pie plot types ([#2146](https://github.com/googleapis/python-bigquery-dataframes/issues/2146)) ([5cc3c5b](https://github.com/googleapis/python-bigquery-dataframes/commit/5cc3c5b1391a7dfa062b1d77f001726b013f6337))
+* Add Index.__eq__ for consts, aligned objects ([#2141](https://github.com/googleapis/python-bigquery-dataframes/issues/2141)) ([8514200](https://github.com/googleapis/python-bigquery-dataframes/commit/85142008ec895fa078d192bbab942d0257f70df3))
+* Add output_schema parameter to ai.generate() ([#2139](https://github.com/googleapis/python-bigquery-dataframes/issues/2139)) ([ef0b0b7](https://github.com/googleapis/python-bigquery-dataframes/commit/ef0b0b73843da2a93baf08e4cd5457fbb590b89c))
+* Create session-scoped `cut`, `DataFrame`, `MultiIndex`, `Index`, `Series`, `to_datetime`, and `to_timedelta` methods ([#2157](https://github.com/googleapis/python-bigquery-dataframes/issues/2157)) ([5e1e809](https://github.com/googleapis/python-bigquery-dataframes/commit/5e1e8098ecf212c91d73fa80d722d1cb3e46668b))
+* Replace ML.GENERATE_TEXT with AI.GENERATE for audio transcription ([#2151](https://github.com/googleapis/python-bigquery-dataframes/issues/2151)) ([a410d0a](https://github.com/googleapis/python-bigquery-dataframes/commit/a410d0ae43ef3b053b650804156eda0b1f569da9))
+* Support string literal inputs for AI functions ([#2152](https://github.com/googleapis/python-bigquery-dataframes/issues/2152)) ([7600001](https://github.com/googleapis/python-bigquery-dataframes/commit/760000122dc190ac8a3303234cf4cbee1bbb9493))
+
+
+### Bug Fixes
+
+* Address typo in error message ([#2142](https://github.com/googleapis/python-bigquery-dataframes/issues/2142)) ([cdf2dd5](https://github.com/googleapis/python-bigquery-dataframes/commit/cdf2dd55a0c03da50ab92de09788cafac0abf6f6))
+* Avoid possible circular imports in global session ([#2115](https://github.com/googleapis/python-bigquery-dataframes/issues/2115)) ([095c0b8](https://github.com/googleapis/python-bigquery-dataframes/commit/095c0b85a25a2e51087880909597cc62a0341c93))
+* Fix too many cluster columns requested by caching ([#2155](https://github.com/googleapis/python-bigquery-dataframes/issues/2155)) ([35c1c33](https://github.com/googleapis/python-bigquery-dataframes/commit/35c1c33b85d1b92e402aab73677df3ffe43a51b4))
+* Show progress even in job optional queries ([#2119](https://github.com/googleapis/python-bigquery-dataframes/issues/2119)) ([1f48d3a](https://github.com/googleapis/python-bigquery-dataframes/commit/1f48d3a62e7e6dac4acb39e911daf766b8e2fe62))
+* Yield row count from read session if otherwise unknown ([#2148](https://github.com/googleapis/python-bigquery-dataframes/issues/2148)) ([8997d4d](https://github.com/googleapis/python-bigquery-dataframes/commit/8997d4d7d9965e473195f98c550c80657035b7e1))
+
+
+### Documentation
+
+* Add a brief intro notebook for bbq AI functions ([#2150](https://github.com/googleapis/python-bigquery-dataframes/issues/2150)) ([1f434fb](https://github.com/googleapis/python-bigquery-dataframes/commit/1f434fb5c7c00601654b3ab19c6ad7fceb258bd6))
+* Fix ai function related docs ([#2149](https://github.com/googleapis/python-bigquery-dataframes/issues/2149)) ([93a0749](https://github.com/googleapis/python-bigquery-dataframes/commit/93a0749392b84f27162654fe5ea5baa329a23f99))
+* Remove progress bar from getting started template ([#2143](https://github.com/googleapis/python-bigquery-dataframes/issues/2143)) ([d13abad](https://github.com/googleapis/python-bigquery-dataframes/commit/d13abadbcd68d03997e8dc11bb7a2b14bbd57fcc))
+
 ## [2.24.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.23.0...v2.24.0) (2025-10-07)
 
 

bigframes/bigquery/_operations/ai.py

Lines changed: 51 additions & 1 deletion
@@ -65,14 +65,21 @@ def generate(
6565
1 Ottawa\\n
6666
Name: result, dtype: string
6767
68-
You get structured output when the `output_schema` parameter is set:
68+
You get structured output when the `output_schema` parameter is set:
6969
7070
>>> animals = bpd.Series(["Rabbit", "Spider"])
7171
>>> bbq.ai.generate(animals, output_schema={"number_of_legs": "INT64", "is_herbivore": "BOOL"})
7272
0 {'is_herbivore': True, 'number_of_legs': 4, 'f...
7373
1 {'is_herbivore': False, 'number_of_legs': 8, '...
7474
dtype: struct<is_herbivore: bool, number_of_legs: int64, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]
7575
76+
.. note::
77+
78+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
79+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
80+
and might have limited support. For more information, see the launch stage descriptions
81+
(https://cloud.google.com/products#product-launch-stages).
82+
7683
Args:
7784
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
7885
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series
@@ -165,6 +172,13 @@ def generate_bool(
165172
2 False
166173
Name: result, dtype: boolean
167174
175+
.. note::
176+
177+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
178+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
179+
and might have limited support. For more information, see the launch stage descriptions
180+
(https://cloud.google.com/products#product-launch-stages).
181+
168182
Args:
169183
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
170184
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series
@@ -240,6 +254,13 @@ def generate_int(
240254
2 8
241255
Name: result, dtype: Int64
242256
257+
.. note::
258+
259+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
260+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
261+
and might have limited support. For more information, see the launch stage descriptions
262+
(https://cloud.google.com/products#product-launch-stages).
263+
243264
Args:
244265
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
245266
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series
@@ -315,6 +336,13 @@ def generate_double(
315336
2 8.0
316337
Name: result, dtype: Float64
317338
339+
.. note::
340+
341+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
342+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
343+
and might have limited support. For more information, see the launch stage descriptions
344+
(https://cloud.google.com/products#product-launch-stages).
345+
318346
Args:
319347
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
320348
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series
@@ -371,6 +399,7 @@ def if_(
371399
provides optimization such that not all rows are evaluated with the LLM.
372400
373401
**Examples:**
402+
374403
>>> import bigframes.pandas as bpd
375404
>>> import bigframes.bigquery as bbq
376405
>>> bpd.options.display.progress_bar = None
@@ -386,6 +415,13 @@ def if_(
386415
1 Illinois
387416
dtype: string
388417
418+
.. note::
419+
420+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
421+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
422+
and might have limited support. For more information, see the launch stage descriptions
423+
(https://cloud.google.com/products#product-launch-stages).
424+
389425
Args:
390426
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
391427
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series
@@ -433,6 +469,13 @@ def classify(
433469
<BLANKLINE>
434470
[2 rows x 2 columns]
435471
472+
.. note::
473+
474+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
475+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
476+
and might have limited support. For more information, see the launch stage descriptions
477+
(https://cloud.google.com/products#product-launch-stages).
478+
436479
Args:
437480
input (str | Series | List[str|Series] | Tuple[str|Series, ...]):
438481
A mixture of Series and string literals that specifies the input to send to the model. The Series can be BigFrames Series
@@ -482,6 +525,13 @@ def score(
482525
2 3.0
483526
dtype: Float64
484527
528+
.. note::
529+
530+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
531+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
532+
and might have limited support. For more information, see the launch stage descriptions
533+
(https://cloud.google.com/products#product-launch-stages).
534+
485535
Args:
486536
prompt (str | Series | List[str|Series] | Tuple[str|Series, ...]):
487537
A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series

bigframes/core/compile/sqlglot/aggregations/ordered_unary_compiler.py

Lines changed: 31 additions & 8 deletions
@@ -14,11 +14,8 @@
 
 from __future__ import annotations
 
-import typing
-
 import sqlglot.expressions as sge
 
-from bigframes.core import window_spec
 import bigframes.core.compile.sqlglot.aggregations.op_registration as reg
 import bigframes.core.compile.sqlglot.expressions.typed_expr as typed_expr
 from bigframes.operations import aggregations as agg_ops
@@ -29,9 +26,35 @@
 def compile(
     op: agg_ops.WindowOp,
     column: typed_expr.TypedExpr,
-    window: typing.Optional[window_spec.WindowSpec] = None,
-    order_by: typing.Sequence[sge.Expression] = [],
+    *,
+    order_by: tuple[sge.Expression, ...],
+) -> sge.Expression:
+    return ORDERED_UNARY_OP_REGISTRATION[op](op, column, order_by=order_by)
+
+
+@ORDERED_UNARY_OP_REGISTRATION.register(agg_ops.ArrayAggOp)
+def _(
+    op: agg_ops.ArrayAggOp,
+    column: typed_expr.TypedExpr,
+    *,
+    order_by: tuple[sge.Expression, ...],
 ) -> sge.Expression:
-    return ORDERED_UNARY_OP_REGISTRATION[op](
-        op, column, window=window, order_by=order_by
-    )
+    expr = column.expr
+    if len(order_by) > 0:
+        expr = sge.Order(this=column.expr, expressions=list(order_by))
+    return sge.IgnoreNulls(this=sge.ArrayAgg(this=expr))
+
+
+@ORDERED_UNARY_OP_REGISTRATION.register(agg_ops.StringAggOp)
+def _(
+    op: agg_ops.StringAggOp,
+    column: typed_expr.TypedExpr,
+    *,
+    order_by: tuple[sge.Expression, ...],
+) -> sge.Expression:
+    expr = column.expr
+    if len(order_by) > 0:
+        expr = sge.Order(this=expr, expressions=list(order_by))
+
+    expr = sge.GroupConcat(this=expr, separator=sge.convert(op.sep))
+    return sge.func("COALESCE", expr, sge.convert(""))
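
Review note: the hunk above dispatches on the op and passes `order_by` as a keyword-only argument to a per-op handler. The following is a minimal, library-free sketch of that pattern. `OpRegistry`, `compile_op`, and the stand-in op classes are hypothetical names invented for illustration, and plain SQL strings replace sqlglot expressions; it is not bigframes' actual `op_registration` module, and the lookup-by-type behaviour is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Type


class OpRegistry:
    """Hypothetical type-keyed registry (illustration only)."""

    def __init__(self) -> None:
        self._impls: Dict[Type, Callable[..., str]] = {}

    def register(self, op_type: Type) -> Callable:
        def decorator(fn: Callable[..., str]) -> Callable[..., str]:
            self._impls[op_type] = fn
            return fn

        return decorator

    def __getitem__(self, op: object) -> Callable[..., str]:
        # Assumption: an op instance resolves to the handler for its type.
        return self._impls[type(op)]


@dataclass(frozen=True)
class ArrayAggOp:  # stand-in for agg_ops.ArrayAggOp
    pass


@dataclass(frozen=True)
class StringAggOp:  # stand-in for agg_ops.StringAggOp
    sep: str = ","


REGISTRY = OpRegistry()


def _order_clause(order_by: Tuple[str, ...]) -> str:
    # Emit an ORDER BY clause only when ordering expressions were given.
    return f" ORDER BY {', '.join(order_by)}" if order_by else ""


@REGISTRY.register(ArrayAggOp)
def _(op: ArrayAggOp, column: str, *, order_by: Tuple[str, ...]) -> str:
    return f"ARRAY_AGG({column} IGNORE NULLS{_order_clause(order_by)})"


@REGISTRY.register(StringAggOp)
def _(op: StringAggOp, column: str, *, order_by: Tuple[str, ...]) -> str:
    agg = f"STRING_AGG({column}, '{op.sep}'{_order_clause(order_by)})"
    # Map a NULL aggregate result to the empty string.
    return f"COALESCE({agg}, '')"


def compile_op(op: object, column: str, *, order_by: Tuple[str, ...]) -> str:
    return REGISTRY[op](op, column, order_by=order_by)
```

Making `order_by` keyword-only and a tuple (rather than a list with a mutable `[]` default, as in the removed lines) keeps call sites explicit and avoids sharing one default list across calls.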

bigframes/version.py

Lines changed: 2 additions & 2 deletions
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-__version__ = "2.24.0"
+__version__ = "2.25.0"
 
 # {x-release-please-start-date}
-__release_date__ = "2025-10-07"
+__release_date__ = "2025-10-13"
 # {x-release-please-end}

docs/templates/toc.yml

Lines changed: 2 additions & 1 deletion
@@ -219,7 +219,8 @@
   - name: BigQuery built-in functions
     uid: bigframes.bigquery
   - name: BigQuery AI Functions
-    uid: bigframes.bigquery.ai
+    uid: bigframes.bigquery._operations.ai
+    status: beta
   name: bigframes.bigquery
 - items:
   - name: GeoSeries

tests/system/large/blob/test_function.py

Lines changed: 10 additions & 0 deletions
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import logging
 import os
 import traceback
 from typing import Generator
@@ -434,6 +435,15 @@ def test_blob_transcribe(
         actual_text = actual[0]["content"]
     else:
         actual_text = actual[0]
+
+    if pd.isna(actual_text) or actual_text == "":
+        # Ensure the tests are robust to flakes in the model, which isn't
+        # particularly useful information for the bigframes team.
+        logging.warning(
+            f"blob_transcribe() model {model_name} verbose={verbose} failure"
+        )
+        return
+
     actual_len = len(actual_text)
 
     relative_length_tolerance = 0.2
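
Review note: the added guard returns early when the model yields no text, so a model flake does not fail the test. A rough standalone sketch of that idea follows. The helper names `is_missing` and `length_within_tolerance` are hypothetical, `math.isnan` stands in for `pd.isna` to keep the sketch stdlib-only, and the length comparison itself is an assumption, since the hunk only shows `relative_length_tolerance = 0.2` and not how it is applied.

```python
import logging
import math
from typing import Optional


def is_missing(text: object) -> bool:
    # Stdlib stand-in for the `pd.isna(actual_text) or actual_text == ""` check.
    if text is None or text == "":
        return True
    return isinstance(text, float) and math.isnan(text)


def length_within_tolerance(
    actual_text: object, expected_text: str, tolerance: float = 0.2
) -> Optional[bool]:
    """Return None when the model produced nothing, else whether lengths agree."""
    if is_missing(actual_text):
        # Log and bail out instead of failing, mirroring the early return.
        logging.warning("model returned no text; skipping length check")
        return None
    # Assumed comparison: relative difference in length against the expected text.
    expected_len = len(expected_text)
    return abs(len(str(actual_text)) - expected_len) <= tolerance * expected_len
```

One trade-off worth noting in review: a plain `return` makes a flaked run report as passed, whereas `pytest.skip(...)` would surface it in the test summary.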
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+WITH `bfcte_0` AS (
+  SELECT
+    `int64_col` AS `bfcol_0`
+  FROM `bigframes-dev`.`sqlglot_test`.`scalar_types`
+), `bfcte_1` AS (
+  SELECT
+    ARRAY_AGG(`bfcol_0` IGNORE NULLS ORDER BY `bfcol_0` IS NULL ASC, `bfcol_0` ASC) AS `bfcol_1`
+  FROM `bfcte_0`
+)
+SELECT
+  `bfcol_1` AS `int64_col`
+FROM `bfcte_1`
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+WITH `bfcte_0` AS (
+  SELECT
+    `string_col` AS `bfcol_0`
+  FROM `bigframes-dev`.`sqlglot_test`.`scalar_types`
+), `bfcte_1` AS (
+  SELECT
+    COALESCE(STRING_AGG(`bfcol_0`, ','
+      ORDER BY
+        `bfcol_0` IS NULL ASC,
+        `bfcol_0` ASC), '') AS `bfcol_1`
+  FROM `bfcte_0`
+)
+SELECT
+  `bfcol_1` AS `string_col`
+FROM `bfcte_1`
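
Review note: the snapshot orders by `bfcol_0 IS NULL ASC, bfcol_0 ASC`, which places non-null values first, and wraps `STRING_AGG` in `COALESCE(..., '')` so that an input with no non-null rows yields an empty string rather than NULL. A rough pure-Python analogue of those semantics is sketched below; `string_agg` is a hypothetical helper name, `None` stands in for SQL NULL, and Python string ordering is not guaranteed to match BigQuery collation.

```python
from typing import Optional, Sequence


def string_agg(values: Sequence[Optional[str]], sep: str = ",") -> str:
    # Sort key mirrors `col IS NULL ASC, col ASC`:
    # False (non-null) sorts before True (null), then by value.
    ordered = sorted(values, key=lambda v: (v is None, "" if v is None else v))
    # STRING_AGG skips NULL inputs.
    non_null = [v for v in ordered if v is not None]
    # STRING_AGG yields NULL when there is nothing to aggregate ...
    joined = sep.join(non_null) if non_null else None
    # ... and COALESCE(..., '') turns that NULL into an empty string.
    return joined if joined is not None else ""
```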
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import typing
+
+import pytest
+
+from bigframes.core import agg_expressions as agg_exprs
+from bigframes.core import array_value, identifiers, nodes, ordering
+from bigframes.operations import aggregations as agg_ops
+import bigframes.pandas as bpd
+
+pytest.importorskip("pytest_snapshot")
+
+
+def _apply_ordered_unary_agg_ops(
+    obj: bpd.DataFrame,
+    ops_list: typing.Sequence[agg_exprs.UnaryAggregation],
+    new_names: typing.Sequence[str],
+    ordering_args: typing.Sequence[str],
+) -> str:
+    ordering_exprs = tuple(ordering.ascending_over(arg) for arg in ordering_args)
+    aggs = [(op, identifiers.ColumnId(name)) for op, name in zip(ops_list, new_names)]
+
+    agg_node = nodes.AggregateNode(
+        obj._block.expr.node,
+        aggregations=tuple(aggs),
+        by_column_ids=(),
+        order_by=ordering_exprs,
+    )
+    result = array_value.ArrayValue(agg_node)
+
+    sql = result.session._executor.to_sql(result, enable_cache=False)
+    return sql
+
+
+def test_array_agg(scalar_types_df: bpd.DataFrame, snapshot):
+    # TODO: Verify "NULL LAST" syntax issue on Python < 3.12
+    if sys.version_info < (3, 12):
+        pytest.skip(
+            "Skipping test due to inconsistent SQL formatting on Python < 3.12.",
+        )
+
+    col_name = "int64_col"
+    bf_df = scalar_types_df[[col_name]]
+    agg_expr = agg_ops.ArrayAggOp().as_expr(col_name)
+    sql = _apply_ordered_unary_agg_ops(
+        bf_df, [agg_expr], [col_name], ordering_args=[col_name]
+    )
+
+    snapshot.assert_match(sql, "out.sql")
+
+
+def test_string_agg(scalar_types_df: bpd.DataFrame, snapshot):
+    # TODO: Verify "NULL LAST" syntax issue on Python < 3.12
+    if sys.version_info < (3, 12):
+        pytest.skip(
+            "Skipping test due to inconsistent SQL formatting on Python < 3.12.",
+        )
+
+    col_name = "string_col"
+    bf_df = scalar_types_df[[col_name]]
+    agg_expr = agg_ops.StringAggOp(sep=",").as_expr(col_name)
+    sql = _apply_ordered_unary_agg_ops(
+        bf_df, [agg_expr], [col_name], ordering_args=[col_name]
+    )
+
+    snapshot.assert_match(sql, "out.sql")

third_party/bigframes_vendored/version.py

Lines changed: 2 additions & 2 deletions
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-__version__ = "2.24.0"
+__version__ = "2.25.0"
 
 # {x-release-please-start-date}
-__release_date__ = "2025-10-07"
+__release_date__ = "2025-10-13"
 # {x-release-please-end}
