Commit ae435ba

sander-goos authored and susodapop committed
Use date_as_object and timestamp_as_object in pandas conversion pysql client
Use date_as_object and timestamp_as_object options in the to_pandas conversion from Arrow to prevent errors with out-of-bound values.

See also: https://issues.apache.org/jira/browse/ARROW-5359

Add test cases to TimestampTestsMixin (would be nice if we could replace those with unit tests)
1 parent 10ab6f8 commit ae435ba

1 file changed: +2 -8 lines changed

cmdexec/clients/python/src/databricks/sql/client.py

Lines changed: 2 additions & 8 deletions
@@ -537,14 +537,8 @@ def _convert_arrow_table(self, table):

         # Need to rename columns, as the to_pandas function cannot handle duplicate column names
         table_renamed = table.rename_columns([str(c) for c in range(table.num_columns)])
-        df = table_renamed.to_pandas(types_mapper=dtype_mapping.get)
-
-        for (i, col) in enumerate(df.columns):
-            # Check for 0 because .dt doesn't work on empty series
-            if self.description[i][1] == 'timestamp' and len(df) > 0:
-                # We store the dtype as object so we don't use the pandas datetime dtype but
-                # a native datetime.datetime
-                df[col] = pandas.Series(df[col].dt.to_pydatetime(), dtype='object')
+        df = table_renamed.to_pandas(
+            types_mapper=dtype_mapping.get, date_as_object=True, timestamp_as_object=True)

         res = df.to_numpy(na_value=None)
         return [ResultRow(*v) for v in res]
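
The out-of-bound values mentioned in the commit message arise because a nanosecond-resolution timestamp is stored as a signed 64-bit integer, which only covers roughly the years 1677 to 2262. Keeping values as native datetime.datetime objects avoids that limit. The sketch below illustrates the range check using only the standard library. It is not the client's code, and the helper names (nanos_since_epoch, fits_in_int64_nanos) are hypothetical, chosen for illustration.

```python
from datetime import datetime, timezone

# Range of a signed 64-bit integer, the storage behind a
# nanosecond-resolution timestamp.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def nanos_since_epoch(value: datetime) -> int:
    """Return whole nanoseconds between the Unix epoch and a UTC datetime."""
    delta = value - EPOCH
    # Integer arithmetic throughout, so no float rounding is involved.
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000


def fits_in_int64_nanos(value: datetime) -> bool:
    """True if the value is representable as int64 nanoseconds since the epoch."""
    return INT64_MIN <= nanos_since_epoch(value) <= INT64_MAX


# Hypothetical sample values: one ordinary timestamp, one far-future timestamp.
in_range = datetime(2020, 1, 1, tzinfo=timezone.utc)
out_of_range = datetime(9999, 12, 31, tzinfo=timezone.utc)

print(fits_in_int64_nanos(in_range))
print(fits_in_int64_nanos(out_of_range))
```

A value such as 9999-12-31 fails this check, which is the kind of input that a nanosecond datetime dtype cannot hold but a plain Python object column can.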
