Commit 87dd1c4

Merge branch 'main' into migrate-datetime-to-integer-label-op
2 parents 787e47c + 41630b5 commit 87dd1c4

158 files changed: +11189 -1799 lines changed

.github/workflows/js-tests.yml

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+name: js-tests
+on:
+  pull_request:
+    branches:
+      - main
+  push:
+    branches:
+      - main
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Install modules
+        working-directory: ./tests/js
+        run: npm install
+      - name: Run tests
+        working-directory: ./tests/js
+        run: npm test

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@ coverage.xml
 
 # System test environment variables.
 system_tests/local_test_setup
+tests/js/node_modules/
 
 # Make sure a generated file isn't accidentally committed.
 pylintrc

.librarian/state.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:c8612d3fffb3f6a32353b2d1abd16b61e87811866f7ec9d65b59b02eb452a620
 libraries:
   - id: bigframes
-    version: 2.28.0
+    version: 2.29.1
     apis: []
     source_roots:
       - .

README.rst

Lines changed: 18 additions & 40 deletions
@@ -6,15 +6,25 @@ BigQuery DataFrames (BigFrames)
 |GA| |pypi| |versions|
 
 BigQuery DataFrames (also known as BigFrames) provides a Pythonic DataFrame
-and machine learning (ML) API powered by the BigQuery engine.
+and machine learning (ML) API powered by the BigQuery engine. It provides modules
+for many use cases, including:
 
-* `bigframes.pandas` provides a pandas API for analytics. Many workloads can be
+* `bigframes.pandas <https://dataframes.bigquery.dev/reference/api/bigframes.pandas.html>`_
+  is a pandas API for analytics. Many workloads can be
   migrated from pandas to bigframes by just changing a few imports.
-* ``bigframes.ml`` provides a scikit-learn-like API for ML.
+* `bigframes.ml <https://dataframes.bigquery.dev/reference/index.html#ml-apis>`_
+  is a scikit-learn-like API for ML.
+* `bigframes.bigquery.ai <https://dataframes.bigquery.dev/reference/api/bigframes.bigquery.ai.html>`_
+  are a collection of powerful AI methods, powered by Gemini.
 
-BigQuery DataFrames is an open-source package.
+BigQuery DataFrames is an `open-source package <https://github.com/googleapis/python-bigquery-dataframes>`_.
 
-**Version 2.0 introduces breaking changes for improved security and performance. See below for details.**
+.. |GA| image:: https://img.shields.io/badge/support-GA-gold.svg
+   :target: https://github.com/googleapis/google-cloud-python/blob/main/README.rst#general-availability
+.. |pypi| image:: https://img.shields.io/pypi/v/bigframes.svg
+   :target: https://pypi.org/project/bigframes/
+.. |versions| image:: https://img.shields.io/pypi/pyversions/bigframes.svg
+   :target: https://pypi.org/project/bigframes/
 
 Getting started with BigQuery DataFrames
 ----------------------------------------
@@ -38,7 +48,8 @@ To use BigFrames in your local development environment,
 
     import bigframes.pandas as bpd
 
-    bpd.options.bigquery.project = your_gcp_project_id
+    bpd.options.bigquery.project = your_gcp_project_id # Optional in BQ Studio.
+    bpd.options.bigquery.ordering_mode = "partial" # Recommended for performance.
     df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")
     print(
         df.groupby("name")
@@ -48,49 +59,16 @@ To use BigFrames in your local development environment,
         .to_pandas()
     )
 
-
 Documentation
 -------------
 
 To learn more about BigQuery DataFrames, visit these pages
 
 * `Introduction to BigQuery DataFrames (BigFrames) <https://cloud.google.com/bigquery/docs/bigquery-dataframes-introduction>`_
 * `Sample notebooks <https://github.com/googleapis/python-bigquery-dataframes/tree/main/notebooks>`_
-* `API reference <https://cloud.google.com/python/docs/reference/bigframes/latest/summary_overview>`_
+* `API reference <https://dataframes.bigquery.dev/>`_
 * `Source code (GitHub) <https://github.com/googleapis/python-bigquery-dataframes>`_
 
-⚠️ Warning: Breaking Changes in BigQuery DataFrames v2.0
---------------------------------------------------------
-
-Version 2.0 introduces breaking changes for improved security and performance. Key default behaviors have changed, including
-
-* **Large Results (>10GB):** The default value for ``allow_large_results`` has changed to ``False``.
-  Methods like ``to_pandas()`` will now fail if the query result's compressed data size exceeds 10GB,
-  unless large results are explicitly permitted.
-* **Remote Function Security:** The library no longer automatically lets the Compute Engine default service
-  account become the identity of the Cloud Run functions. If that is desired, it has to be indicated by passing
-  ``cloud_function_service_account="default"``. And network ingress now defaults to ``"internal-only"``.
-* **@remote_function Argument Passing:** Arguments other than ``input_types``, ``output_type``, and ``dataset``
-  to ``remote_function`` must now be passed using keyword syntax, as positional arguments are no longer supported.
-* **@udf Argument Passing:** Arguments ``dataset`` and ``name`` to ``udf`` are now mandatory.
-* **Endpoint Connections:** Automatic fallback to locational endpoints in certain regions is removed.
-* **LLM Updates (Gemini Integration):** Integrations now default to the ``gemini-2.0-flash-001`` model.
-  PaLM2 support has been removed; please migrate any existing PaLM2 usage to Gemini. **Note:** The current default
-  model will be removed in Version 3.0.
-
-**Important:** If you are not ready to adapt to these changes, please pin your dependency to a version less than 2.0
-(e.g., ``bigframes==1.42.0``) to avoid disruption.
-
-To learn about these changes and how to migrate to version 2.0, see the
-`updated introduction guide <https://cloud.google.com/bigquery/docs/bigquery-dataframes-introduction>`_.
-
-.. |GA| image:: https://img.shields.io/badge/support-GA-gold.svg
-   :target: https://github.com/googleapis/google-cloud-python/blob/main/README.rst#general-availability
-.. |pypi| image:: https://img.shields.io/pypi/v/bigframes.svg
-   :target: https://pypi.org/project/bigframes/
-.. |versions| image:: https://img.shields.io/pypi/pyversions/bigframes.svg
-   :target: https://pypi.org/project/bigframes/
-
 License
 -------
 
bigframes/_config/__init__.py

Lines changed: 14 additions & 165 deletions
@@ -17,175 +17,24 @@
 DataFrames from this package.
 """
 
-from __future__ import annotations
-
-import copy
-from dataclasses import dataclass, field
-import threading
-from typing import Optional
-
-import bigframes_vendored.pandas._config.config as pandas_config
-
-import bigframes._config.bigquery_options as bigquery_options
-import bigframes._config.compute_options as compute_options
-import bigframes._config.display_options as display_options
-import bigframes._config.experiment_options as experiment_options
-import bigframes._config.sampling_options as sampling_options
-
-
-@dataclass
-class ThreadLocalConfig(threading.local):
-    # If unset, global settings will be used
-    bigquery_options: Optional[bigquery_options.BigQueryOptions] = None
-    # Note: use default factory instead of default instance so each thread initializes to default values
-    display_options: display_options.DisplayOptions = field(
-        default_factory=display_options.DisplayOptions
-    )
-    sampling_options: sampling_options.SamplingOptions = field(
-        default_factory=sampling_options.SamplingOptions
-    )
-    compute_options: compute_options.ComputeOptions = field(
-        default_factory=compute_options.ComputeOptions
-    )
-    experiment_options: experiment_options.ExperimentOptions = field(
-        default_factory=experiment_options.ExperimentOptions
-    )
-
-
-class Options:
-    """Global options affecting BigQuery DataFrames behavior."""
-
-    def __init__(self):
-        self.reset()
-
-    def reset(self) -> Options:
-        """Reset the option settings to defaults.
-
-        Returns:
-            bigframes._config.Options: Options object with default values.
-        """
-        self._local = ThreadLocalConfig()
-
-        # BigQuery options are special because they can only be set once per
-        # session, so we need an indicator as to whether we are using the
-        # thread-local session or the global session.
-        self._bigquery_options = bigquery_options.BigQueryOptions()
-        return self
-
-    def _init_bigquery_thread_local(self):
-        """Initialize thread-local options, based on current global options."""
-
-        # Already thread-local, so don't reset any options that have been set
-        # already. No locks needed since this only modifies thread-local
-        # variables.
-        if self._local.bigquery_options is not None:
-            return
-
-        self._local.bigquery_options = copy.deepcopy(self._bigquery_options)
-        self._local.bigquery_options._session_started = False
-
-    @property
-    def bigquery(self) -> bigquery_options.BigQueryOptions:
-        """Options to use with the BigQuery engine.
-
-        Returns:
-            bigframes._config.bigquery_options.BigQueryOptions:
-                Options for BigQuery engine.
-        """
-        if self._local.bigquery_options is not None:
-            # The only way we can get here is if someone called
-            # _init_bigquery_thread_local.
-            return self._local.bigquery_options
-
-        return self._bigquery_options
-
-    @property
-    def display(self) -> display_options.DisplayOptions:
-        """Options controlling object representation.
-
-        Returns:
-            bigframes._config.display_options.DisplayOptions:
-                Options for controlling object representation.
-        """
-        return self._local.display_options
-
-    @property
-    def sampling(self) -> sampling_options.SamplingOptions:
-        """Options controlling downsampling when downloading data
-        to memory.
-
-        The data can be downloaded into memory explicitly
-        (e.g., to_pandas, to_numpy, values) or implicitly (e.g.,
-        matplotlib plotting). This option can be overridden by
-        parameters in specific functions.
-
-        Returns:
-            bigframes._config.sampling_options.SamplingOptions:
-                Options for controlling downsampling.
-        """
-        return self._local.sampling_options
-
-    @property
-    def compute(self) -> compute_options.ComputeOptions:
-        """Thread-local options controlling object computation.
-
-        Returns:
-            bigframes._config.compute_options.ComputeOptions:
-                Thread-local options for controlling object computation
-        """
-        return self._local.compute_options
-
-    @property
-    def experiments(self) -> experiment_options.ExperimentOptions:
-        """Options controlling experiments
-
-        Returns:
-            bigframes._config.experiment_options.ExperimentOptions:
-                Thread-local options for controlling experiments
-        """
-        return self._local.experiment_options
-
-    @property
-    def is_bigquery_thread_local(self) -> bool:
-        """Indicator that we're using a thread-local session.
-
-        A thread-local session can be started by using
-        `with bigframes.option_context("bigquery.some_option", "some-value"):`.
-
-        Returns:
-            bool:
-                A boolean value, where a value is True if a thread-local session
-                is in use; otherwise False.
-        """
-        return self._local.bigquery_options is not None
-
-    @property
-    def _allow_large_results(self) -> bool:
-        """The effective 'allow_large_results' setting.
-
-        This value is `self.compute.allow_large_results` if set (not `None`),
-        otherwise it defaults to `self.bigquery.allow_large_results`.
-
-        Returns:
-            bool:
-                Whether large query results are permitted.
-                - `True`: The BigQuery result size limit (e.g., 10 GB) is removed.
-                - `False`: Results are restricted to this limit (potentially faster).
-                BigQuery will raise an error if this limit is exceeded.
-        """
-        if self.compute.allow_large_results is None:
-            return self.bigquery.allow_large_results
-        return self.compute.allow_large_results
-
-
-options = Options()
-"""Global options for default session."""
-
-option_context = pandas_config.option_context
+from bigframes._config.bigquery_options import BigQueryOptions
+from bigframes._config.compute_options import ComputeOptions
+from bigframes._config.display_options import DisplayOptions
+from bigframes._config.experiment_options import ExperimentOptions
+from bigframes._config.global_options import option_context, Options
+import bigframes._config.global_options as global_options
+from bigframes._config.sampling_options import SamplingOptions
 
+options = global_options.options
+"""Global options for the default session."""
 
 __all__ = (
     "Options",
     "options",
     "option_context",
+    "BigQueryOptions",
+    "ComputeOptions",
+    "DisplayOptions",
+    "ExperimentOptions",
+    "SamplingOptions",
 )
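
Reviewer note on the code removed from bigframes/_config/__init__.py: the `ThreadLocalConfig` dataclass and the `_allow_large_results` property combine two ideas that are easy to miss in the diff, namely per-thread defaults via `default_factory` on a `threading.local` subclass, and a thread-local override that falls back to a global value when it is `None`. The following is a minimal standalone sketch of that pattern, not the bigframes implementation. The option classes are simplified, hypothetical stand-ins, and the public `allow_large_results` property name is an assumption made for illustration.

```python
from dataclasses import dataclass, field
import threading
from typing import Dict, Optional


@dataclass
class ComputeOptions:
    # Hypothetical stand-in; None means "defer to the global setting".
    allow_large_results: Optional[bool] = None


@dataclass
class BigQueryOptions:
    # Hypothetical stand-in for the global, session-level options.
    allow_large_results: bool = False


@dataclass
class ThreadLocalConfig(threading.local):
    # default_factory runs again in each thread that touches this object,
    # so every thread starts from its own fresh ComputeOptions.
    compute_options: ComputeOptions = field(default_factory=ComputeOptions)


class Options:
    def __init__(self) -> None:
        self._local = ThreadLocalConfig()
        self._bigquery_options = BigQueryOptions()

    @property
    def compute(self) -> ComputeOptions:
        return self._local.compute_options

    @property
    def bigquery(self) -> BigQueryOptions:
        return self._bigquery_options

    @property
    def allow_large_results(self) -> bool:
        # The thread-local override wins; otherwise use the global value.
        if self.compute.allow_large_results is None:
            return self.bigquery.allow_large_results
        return self.compute.allow_large_results


options = Options()
options.compute.allow_large_results = True  # set in the main thread only

results: Dict[str, bool] = {}


def worker() -> None:
    # A new thread sees default compute options, so the global value applies.
    results["worker"] = options.allow_large_results


thread = threading.Thread(target=worker)
thread.start()
thread.join()
results["main"] = options.allow_large_results

print(results)
```

In this sketch the main thread reports `True` because of its local override, while the worker thread reports `False` because its compute options were freshly initialized and the lookup fell through to the global default.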
