Commit b5e35fa

Merge remote-tracking branch 'github/main' into new_execute_result
2 parents eb5cb76 + 4c98c95 commit b5e35fa

File tree

139 files changed

+3472
-2526
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -62,3 +62,4 @@ system_tests/local_test_setup
 # Make sure a generated file isn't accidentally committed.
 pylintrc
 pylintrc.test
+dummy.pkl

CHANGELOG.md

Lines changed: 45 additions & 1 deletion
@@ -4,6 +4,50 @@
44

55
[1]: https://pypi.org/project/bigframes/#history
66

7+
## [2.27.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.26.0...v2.27.0) (2025-10-24)
8+
9+
10+
### Features
11+
12+
* Add __abs__ to dataframe ([#2186](https://github.com/googleapis/python-bigquery-dataframes/issues/2186)) ([c331dfe](https://github.com/googleapis/python-bigquery-dataframes/commit/c331dfed59174962fbdc8ace175dd00fcc3d5d50))
13+
* Add df.groupby().corr()/cov() support ([#2190](https://github.com/googleapis/python-bigquery-dataframes/issues/2190)) ([ccd7c07](https://github.com/googleapis/python-bigquery-dataframes/commit/ccd7c0774a65d09e6cf31d2b62d0bc64bd7c4248))
14+
* Add str accessor to index ([#2179](https://github.com/googleapis/python-bigquery-dataframes/issues/2179)) ([cd87ce0](https://github.com/googleapis/python-bigquery-dataframes/commit/cd87ce0d504747f44d1b5a55f869a2e0fca6df17))
15+
* Add support for `np.isnan` and `np.isfinite` ufuncs ([#2188](https://github.com/googleapis/python-bigquery-dataframes/issues/2188)) ([68723bc](https://github.com/googleapis/python-bigquery-dataframes/commit/68723bc1f08013e43a8b11752f908bf8fd6d51f5))
16+
* Include local data bytes in the dry run report when available ([#2185](https://github.com/googleapis/python-bigquery-dataframes/issues/2185)) ([ee2c40c](https://github.com/googleapis/python-bigquery-dataframes/commit/ee2c40c6789535e259fb6a9774831d6913d16212))
17+
* Support len() on Groupby objects ([#2183](https://github.com/googleapis/python-bigquery-dataframes/issues/2183)) ([4191821](https://github.com/googleapis/python-bigquery-dataframes/commit/4191821b0976281a96c8965336ef51f061b0c481))
18+
* Support pa.json_(pa.string()) in struct/list if available ([#2180](https://github.com/googleapis/python-bigquery-dataframes/issues/2180)) ([5ec3cc0](https://github.com/googleapis/python-bigquery-dataframes/commit/5ec3cc0298c7a6195d5bd12a08d996e7df57fc5f))
19+
20+
21+
### Documentation
22+
23+
* Update AI operators deprecation notice ([#2182](https://github.com/googleapis/python-bigquery-dataframes/issues/2182)) ([2c50310](https://github.com/googleapis/python-bigquery-dataframes/commit/2c503107e17c59232b14b0d7bc40c350bb087d6f))
24+
25+
## [2.26.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.25.0...v2.26.0) (2025-10-17)
26+
27+
28+
### ⚠ BREAKING CHANGES
29+
30+
* turn Series.struct.dtypes into a property to match pandas (https://github.com/googleapis/python-bigquery-dataframes/pull/2169)
31+
32+
### Features
33+
34+
* Add df.sort_index(axis=1) ([#2173](https://github.com/googleapis/python-bigquery-dataframes/issues/2173)) ([ebf95e3](https://github.com/googleapis/python-bigquery-dataframes/commit/ebf95e3ef77822650f2e190df7b868011174d412))
35+
* Enhanced multimodal error handling with verbose mode for blob image functions ([#2024](https://github.com/googleapis/python-bigquery-dataframes/issues/2024)) ([f9e28fe](https://github.com/googleapis/python-bigquery-dataframes/commit/f9e28fe3f883cc4d486178fe241bc8b76473700f))
36+
* Implement cos, sin, and log operations for polars compiler ([#2170](https://github.com/googleapis/python-bigquery-dataframes/issues/2170)) ([5613e44](https://github.com/googleapis/python-bigquery-dataframes/commit/5613e4454f198691209ec28e58ce652104ac2de4))
37+
* Make `all` and `any` compatible with integer columns on Polars session ([#2154](https://github.com/googleapis/python-bigquery-dataframes/issues/2154)) ([6353d6e](https://github.com/googleapis/python-bigquery-dataframes/commit/6353d6ecad5139551ef68376c08f8749dd440014))
38+
39+
40+
### Bug Fixes
41+
42+
* `blob.display()` shows <NA> for null rows ([#2158](https://github.com/googleapis/python-bigquery-dataframes/issues/2158)) ([ddb4df0](https://github.com/googleapis/python-bigquery-dataframes/commit/ddb4df0dd991bef051e2a365c5cacf502803014d))
43+
* Turn Series.struct.dtypes into a property to match pandas (https://github.com/googleapis/python-bigquery-dataframes/pull/2169) ([62f7e9f](https://github.com/googleapis/python-bigquery-dataframes/commit/62f7e9f38f26b6eb549219a4cbf2c9b9023c9c35))
44+
45+
46+
### Documentation
47+
48+
* Clarify that only NULL values are handled by fillna/isna, not NaN ([#2176](https://github.com/googleapis/python-bigquery-dataframes/issues/2176)) ([8f27e73](https://github.com/googleapis/python-bigquery-dataframes/commit/8f27e737fc78a182238090025d09479fac90b326))
49+
* Remove import bigframes.pandas as bpd boilerplate from many samples ([#2147](https://github.com/googleapis/python-bigquery-dataframes/issues/2147)) ([1a01ab9](https://github.com/googleapis/python-bigquery-dataframes/commit/1a01ab97f103361f489f37b0af8c4b4d7806707c))
50+
751
## [2.25.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.24.0...v2.25.0) (2025-10-13)
852

953

@@ -463,7 +507,7 @@
463507

464508
* Address `read_csv` with both `index_col` and `use_cols` behavior inconsistency with pandas ([#1785](https://github.com/googleapis/python-bigquery-dataframes/issues/1785)) ([ba7c313](https://github.com/googleapis/python-bigquery-dataframes/commit/ba7c313c8d308e3ff3f736b60978cb7a51715209))
465509
* Allow KMeans model init parameter as k-means++ alias ([#1790](https://github.com/googleapis/python-bigquery-dataframes/issues/1790)) ([0b59cf1](https://github.com/googleapis/python-bigquery-dataframes/commit/0b59cf1008613770fa1433c6da395e755c86fe22))
466-
* Replace function now can handle bpd.NA value. ([#1786](https://github.com/googleapis/python-bigquery-dataframes/issues/1786)) ([7269512](https://github.com/googleapis/python-bigquery-dataframes/commit/7269512a28eb42029447d5380c764353278a74e1))
510+
* Replace function now can handle pd.NA value. ([#1786](https://github.com/googleapis/python-bigquery-dataframes/issues/1786)) ([7269512](https://github.com/googleapis/python-bigquery-dataframes/commit/7269512a28eb42029447d5380c764353278a74e1))
467511

468512

469513
### Documentation

bigframes/bigquery/_operations/ai.py

Lines changed: 90 additions & 9 deletions
@@ -19,12 +19,15 @@
1919
from __future__ import annotations
2020

2121
import json
22-
from typing import Any, List, Literal, Mapping, Tuple, Union
22+
from typing import Any, Iterable, List, Literal, Mapping, Tuple, Union
2323

2424
import pandas as pd
2525

26-
from bigframes import clients, dtypes, series, session
26+
from bigframes import clients, dataframe, dtypes
27+
from bigframes import pandas as bpd
28+
from bigframes import series, session
2729
from bigframes.core import convert, log_adapter
30+
from bigframes.ml import core as ml_core
2831
from bigframes.operations import ai_ops, output_schemas
2932

3033
PROMPT_TYPE = Union[
@@ -53,7 +56,6 @@ def generate(
5356
5457
>>> import bigframes.pandas as bpd
5558
>>> import bigframes.bigquery as bbq
56-
>>> bpd.options.display.progress_bar = None
5759
>>> country = bpd.Series(["Japan", "Canada"])
5860
>>> bbq.ai.generate(("What's the capital city of ", country, " one word only"))
5961
0 {'result': 'Tokyo\\n', 'full_response': '{"cand...
@@ -155,7 +157,6 @@ def generate_bool(
155157
156158
>>> import bigframes.pandas as bpd
157159
>>> import bigframes.bigquery as bbq
158-
>>> bpd.options.display.progress_bar = None
159160
>>> df = bpd.DataFrame({
160161
... "col_1": ["apple", "bear", "pear"],
161162
... "col_2": ["fruit", "animal", "animal"]
@@ -240,7 +241,6 @@ def generate_int(
240241
241242
>>> import bigframes.pandas as bpd
242243
>>> import bigframes.bigquery as bbq
243-
>>> bpd.options.display.progress_bar = None
244244
>>> animal = bpd.Series(["Kangaroo", "Rabbit", "Spider"])
245245
>>> bbq.ai.generate_int(("How many legs does a ", animal, " have?"))
246246
0 {'result': 2, 'full_response': '{"candidates":...
@@ -322,7 +322,6 @@ def generate_double(
322322
323323
>>> import bigframes.pandas as bpd
324324
>>> import bigframes.bigquery as bbq
325-
>>> bpd.options.display.progress_bar = None
326325
>>> animal = bpd.Series(["Kangaroo", "Rabbit", "Spider"])
327326
>>> bbq.ai.generate_double(("How many legs does a ", animal, " have?"))
328327
0 {'result': 2.0, 'full_response': '{"candidates...
@@ -402,7 +401,6 @@ def if_(
402401
403402
>>> import bigframes.pandas as bpd
404403
>>> import bigframes.bigquery as bbq
405-
>>> bpd.options.display.progress_bar = None
406404
>>> us_state = bpd.Series(["Massachusetts", "Illinois", "Hawaii"])
407405
>>> bbq.ai.if_((us_state, " has a city called Springfield"))
408406
0 True
@@ -459,7 +457,6 @@ def classify(
459457
460458
>>> import bigframes.pandas as bpd
461459
>>> import bigframes.bigquery as bbq
462-
>>> bpd.options.display.progress_bar = None
463460
>>> df = bpd.DataFrame({'creature': ['Cat', 'Salmon']})
464461
>>> df['type'] = bbq.ai.classify(df['creature'], ['Mammal', 'Fish'])
465462
>>> df
@@ -517,7 +514,6 @@ def score(
517514
518515
>>> import bigframes.pandas as bpd
519516
>>> import bigframes.bigquery as bbq
520-
>>> bpd.options.display.progress_bar = None
521517
>>> animal = bpd.Series(["Tiger", "Rabbit", "Blue Whale"])
522518
>>> bbq.ai.score(("Rank the relative weights of ", animal, " on the scale from 1 to 3")) # doctest: +SKIP
523519
0 2.0
@@ -555,6 +551,91 @@ def score(
555551
return series_list[0]._apply_nary_op(operator, series_list[1:])
556552

557553

554+
@log_adapter.method_logger(custom_base_name="bigquery_ai")
555+
def forecast(
556+
df: dataframe.DataFrame | pd.DataFrame,
557+
*,
558+
data_col: str,
559+
timestamp_col: str,
560+
model: str = "TimesFM 2.0",
561+
id_cols: Iterable[str] | None = None,
562+
horizon: int = 10,
563+
confidence_level: float = 0.95,
564+
context_window: int | None = None,
565+
) -> dataframe.DataFrame:
566+
"""
567+
Forecast time series at a future horizon using Google Research's open source TimesFM (https://github.com/google-research/timesfm) model.
568+
569+
.. note::
570+
571+
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the
572+
Service Specific Terms(https://cloud.google.com/terms/service-terms#1). Pre-GA products and features are available "as is"
573+
and might have limited support. For more information, see the launch stage descriptions
574+
(https://cloud.google.com/products#product-launch-stages).
575+
576+
Args:
577+
df (DataFrame):
578+
The dataframe that contains the data that you want to forecast. It can be either a BigFrames DataFrame or
579+
a pandas DataFrame. If it's a pandas DataFrame, the global BigQuery session will be used to load the data.
580+
data_col (str):
581+
A str value that specifies the name of the data column. The data column contains the data to forecast.
582+
The data column must use one of the following data types: INT64, NUMERIC and FLOAT64
583+
timestamp_col (str):
584+
A str value that specifies the name of the time points column.
585+
The time points column provides the time points used to generate the forecast.
586+
The time points column must use one of the following data types: TIMESTAMP, DATE and DATETIME
587+
model (str, default "TimesFM 2.0"):
588+
A str value that specifies the name of the model. TimesFM 2.0 is the only supported value, and is the default value.
589+
id_cols (Iterable[str], optional):
590+
An iterable of str values that specifies the names of one or more ID columns. Each ID identifies a unique time series to forecast.
591+
Specify one or more values for this argument in order to forecast multiple time series using a single query.
592+
The columns that you specify must use one of the following data types: STRING, INT64, ARRAY<STRING> and ARRAY<INT64>
593+
horizon (int, default 10):
594+
An int value that specifies the number of time points to forecast. The default value is 10. The valid input range is [1, 10,000].
595+
confidence_level (float, default 0.95):
596+
A FLOAT64 value that specifies the percentage of the future values that fall in the prediction interval.
597+
The default value is 0.95. The valid input range is [0, 1).
598+
context_window (int, optional):
599+
An int value that specifies the context window length used by BigQuery ML's built-in TimesFM model.
600+
The context window length determines how many of the most recent data points from the input time series are used by the model.
601+
If you don't specify a value, the AI.FORECAST function automatically chooses the smallest possible context window length to use
602+
that is still large enough to cover the number of time series data points in your input data.
603+
604+
Returns:
605+
DataFrame:
606+
The forecast dataframe matches that of the BigQuery AI.FORECAST function.
607+
See: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-forecast
608+
609+
Raises:
610+
ValueError: when any column ID does not exist in the dataframe.
611+
"""
612+
613+
if isinstance(df, pd.DataFrame):
614+
# Load the pandas DataFrame with global session
615+
df = bpd.read_pandas(df)
616+
617+
columns = [timestamp_col, data_col]
618+
if id_cols:
619+
columns += id_cols
620+
for column in columns:
621+
if column not in df.columns:
622+
raise ValueError(f"Column `{column}` not found")
623+
624+
options: dict[str, Union[int, float, str, Iterable[str]]] = {
625+
"data_col": data_col,
626+
"timestamp_col": timestamp_col,
627+
"model": model,
628+
"horizon": horizon,
629+
"confidence_level": confidence_level,
630+
}
631+
if id_cols:
632+
options["id_cols"] = id_cols
633+
if context_window:
634+
options["context_window"] = context_window
635+
636+
return ml_core.BaseBqml(df._session).ai_forecast(input_data=df, options=options)
637+
638+
558639
def _separate_context_and_series(
559640
prompt: PROMPT_TYPE,
560641
) -> Tuple[List[str | None], List[series.Series]]:
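
The argument handling in the new `forecast` function can be sketched in isolation, without a BigQuery session. The helper below, `build_forecast_options`, is hypothetical and not part of bigframes. It mirrors the column check and option assembly shown in the diff, with one deliberate difference: `id_cols` is copied into a list up front so that a one-shot iterable such as a generator is not consumed by the validation loop before it reaches the options.

```python
from typing import Any, Dict, Iterable, Optional


def build_forecast_options(
    available_columns: Iterable[str],
    *,
    data_col: str,
    timestamp_col: str,
    model: str = "TimesFM 2.0",
    id_cols: Optional[Iterable[str]] = None,
    horizon: int = 10,
    confidence_level: float = 0.95,
    context_window: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the validation and option-building steps."""
    available = set(available_columns)
    # Assumption: materialize id_cols once so generators survive validation.
    id_list = list(id_cols) if id_cols is not None else []

    # Same check as the diff: every referenced column must exist.
    for column in [timestamp_col, data_col, *id_list]:
        if column not in available:
            raise ValueError(f"Column `{column}` not found")

    options: Dict[str, Any] = {
        "data_col": data_col,
        "timestamp_col": timestamp_col,
        "model": model,
        "horizon": horizon,
        "confidence_level": confidence_level,
    }
    # Optional arguments are only forwarded when they were supplied.
    if id_list:
        options["id_cols"] = id_list
    if context_window:
        options["context_window"] = context_window
    return options


if __name__ == "__main__":
    print(
        build_forecast_options(
            ["ts", "sales", "store"],
            data_col="sales",
            timestamp_col="ts",
            id_cols=["store"],
            horizon=30,
        )
    )
```

In the actual change, the resulting options are passed to `ml_core.BaseBqml(df._session).ai_forecast(...)`; that call is left out here because it needs a live session.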

bigframes/bigquery/_operations/approx_agg.py

Lines changed: 0 additions & 1 deletion
@@ -40,7 +40,6 @@ def approx_top_count(
4040
4141
>>> import bigframes.pandas as bpd
4242
>>> import bigframes.bigquery as bbq
43-
>>> bpd.options.display.progress_bar = None
4443
>>> s = bpd.Series(["apple", "apple", "pear", "pear", "pear", "banana"])
4544
>>> bbq.approx_top_count(s, number=2)
4645
[{'value': 'pear', 'count': 3}, {'value': 'apple', 'count': 2}]
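
For comparison, the output shape in the docstring example above can be reproduced locally with an exact count. This is a sketch using `collections.Counter`, not the approximate aggregation that BigQuery performs, and `exact_top_count` is a hypothetical name chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Union


def exact_top_count(
    values: Iterable[str], number: int
) -> List[Dict[str, Union[str, int]]]:
    """Return the `number` most frequent values as value/count records."""
    # most_common() sorts by descending count; the exact counts here stand in
    # for the approximate ones an APPROX_TOP_COUNT-style aggregate would give.
    return [
        {"value": value, "count": count}
        for value, count in Counter(values).most_common(number)
    ]


if __name__ == "__main__":
    fruit = ["apple", "apple", "pear", "pear", "pear", "banana"]
    print(exact_top_count(fruit, number=2))
    # [{'value': 'pear', 'count': 3}, {'value': 'apple', 'count': 2}]
```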

bigframes/bigquery/_operations/array.py

Lines changed: 0 additions & 5 deletions
@@ -40,7 +40,6 @@ def array_length(series: series.Series) -> series.Series:
4040
4141
>>> import bigframes.pandas as bpd
4242
>>> import bigframes.bigquery as bbq
43-
>>> bpd.options.display.progress_bar = None
4443
4544
>>> s = bpd.Series([[1, 2, 8, 3], [], [3, 4]])
4645
>>> bbq.array_length(s)
@@ -78,8 +77,6 @@ def array_agg(
7877
7978
>>> import bigframes.pandas as bpd
8079
>>> import bigframes.bigquery as bbq
81-
>>> import numpy as np
82-
>>> bpd.options.display.progress_bar = None
8380
8481
For a SeriesGroupBy object:
8582
@@ -128,8 +125,6 @@ def array_to_string(series: series.Series, delimiter: str) -> series.Series:
128125
129126
>>> import bigframes.pandas as bpd
130127
>>> import bigframes.bigquery as bbq
131-
>>> import numpy as np
132-
>>> bpd.options.display.progress_bar = None
133128
134129
>>> s = bpd.Series([["H", "i", "!"], ["Hello", "World"], np.nan, [], ["Hi"]])
135130
>>> bbq.array_to_string(s, delimiter=", ")

bigframes/bigquery/_operations/datetime.py

Lines changed: 0 additions & 6 deletions
@@ -21,10 +21,8 @@ def unix_seconds(input: series.Series) -> series.Series:
2121
2222
**Examples:**
2323
24-
>>> import pandas as pd
2524
>>> import bigframes.pandas as bpd
2625
>>> import bigframes.bigquery as bbq
27-
>>> bpd.options.display.progress_bar = None
2826
2927
>>> s = bpd.Series([pd.Timestamp("1970-01-02", tz="UTC"), pd.Timestamp("1970-01-03", tz="UTC")])
3028
>>> bbq.unix_seconds(s)
@@ -48,10 +46,8 @@ def unix_millis(input: series.Series) -> series.Series:
4846
4947
**Examples:**
5048
51-
>>> import pandas as pd
5249
>>> import bigframes.pandas as bpd
5350
>>> import bigframes.bigquery as bbq
54-
>>> bpd.options.display.progress_bar = None
5551
5652
>>> s = bpd.Series([pd.Timestamp("1970-01-02", tz="UTC"), pd.Timestamp("1970-01-03", tz="UTC")])
5753
>>> bbq.unix_millis(s)
@@ -75,10 +71,8 @@ def unix_micros(input: series.Series) -> series.Series:
7571
7672
**Examples:**
7773
78-
>>> import pandas as pd
7974
>>> import bigframes.pandas as bpd
8075
>>> import bigframes.bigquery as bbq
81-
>>> bpd.options.display.progress_bar = None
8276
8377
>>> s = bpd.Series([pd.Timestamp("1970-01-02", tz="UTC"), pd.Timestamp("1970-01-03", tz="UTC")])
8478
>>> bbq.unix_micros(s)
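
The sample timestamps used in the three docstrings above are one and two days after the Unix epoch, so the expected values are easy to check by hand. The sketch below does that arithmetic with the standard library only; `epoch_units` is a hypothetical helper, and integer division of `timedelta` values is used to avoid floating-point rounding.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_units(ts: datetime, unit: timedelta) -> int:
    """Whole number of `unit` intervals elapsed between the epoch and `ts`."""
    return (ts - EPOCH) // unit


samples = [
    datetime(1970, 1, 2, tzinfo=timezone.utc),
    datetime(1970, 1, 3, tzinfo=timezone.utc),
]

seconds = [epoch_units(ts, timedelta(seconds=1)) for ts in samples]
millis = [epoch_units(ts, timedelta(milliseconds=1)) for ts in samples]
micros = [epoch_units(ts, timedelta(microseconds=1)) for ts in samples]

if __name__ == "__main__":
    print(seconds)  # [86400, 172800]
    print(millis)   # [86400000, 172800000]
    print(micros)   # [86400000000, 172800000000]
```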

bigframes/bigquery/_operations/geo.py

Lines changed: 0 additions & 9 deletions
@@ -53,7 +53,6 @@ def st_area(
5353
>>> import bigframes.pandas as bpd
5454
>>> import bigframes.bigquery as bbq
5555
>>> from shapely.geometry import Polygon, LineString, Point
56-
>>> bpd.options.display.progress_bar = None
5756
5857
>>> series = bigframes.geopandas.GeoSeries(
5958
... [
@@ -125,7 +124,6 @@ def st_buffer(
125124
>>> import bigframes.pandas as bpd
126125
>>> import bigframes.bigquery as bbq
127126
>>> from shapely.geometry import Point
128-
>>> bpd.options.display.progress_bar = None
129127
130128
>>> series = bigframes.geopandas.GeoSeries(
131129
... [
@@ -195,7 +193,6 @@ def st_centroid(
195193
>>> import bigframes.pandas as bpd
196194
>>> import bigframes.bigquery as bbq
197195
>>> from shapely.geometry import Polygon, LineString, Point
198-
>>> bpd.options.display.progress_bar = None
199196
200197
>>> series = bigframes.geopandas.GeoSeries(
201198
... [
@@ -250,7 +247,6 @@ def st_convexhull(
250247
>>> import bigframes.pandas as bpd
251248
>>> import bigframes.bigquery as bbq
252249
>>> from shapely.geometry import Polygon, LineString, Point
253-
>>> bpd.options.display.progress_bar = None
254250
255251
>>> series = bigframes.geopandas.GeoSeries(
256252
... [
@@ -312,7 +308,6 @@ def st_difference(
312308
>>> import bigframes.bigquery as bbq
313309
>>> import bigframes.geopandas
314310
>>> from shapely.geometry import Polygon, LineString, Point
315-
>>> bpd.options.display.progress_bar = None
316311
317312
We can check two GeoSeries against each other, row by row:
318313
@@ -407,7 +402,6 @@ def st_distance(
407402
>>> import bigframes.bigquery as bbq
408403
>>> import bigframes.geopandas
409404
>>> from shapely.geometry import Polygon, LineString, Point
410-
>>> bpd.options.display.progress_bar = None
411405
412406
We can check two GeoSeries against each other, row by row.
413407
@@ -489,7 +483,6 @@ def st_intersection(
489483
>>> import bigframes.bigquery as bbq
490484
>>> import bigframes.geopandas
491485
>>> from shapely.geometry import Polygon, LineString, Point
492-
>>> bpd.options.display.progress_bar = None
493486
494487
We can check two GeoSeries against each other, row by row.
495488
@@ -583,7 +576,6 @@ def st_isclosed(
583576
>>> import bigframes.bigquery as bbq
584577
585578
>>> from shapely.geometry import Point, LineString, Polygon
586-
>>> bpd.options.display.progress_bar = None
587579
588580
>>> series = bigframes.geopandas.GeoSeries(
589581
... [
@@ -650,7 +642,6 @@ def st_length(
650642
>>> import bigframes.bigquery as bbq
651643
652644
>>> from shapely.geometry import Polygon, LineString, Point, GeometryCollection
653-
>>> bpd.options.display.progress_bar = None
654645
655646
>>> series = bigframes.geopandas.GeoSeries(
656647
... [
