-
Notifications
You must be signed in to change notification settings - Fork 36
FIX: Cursor.describe invalid data #355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes issue #352 by implementing two main improvements: (1) correcting cursor.description to return SQL type codes (integers) instead of Python type objects, ensuring DB-API 2.0 specification compliance, and (2) adding support for SQL Server spatial data types (geography, geometry, hierarchyid) by handling the SQL_SS_UDT type code (-151).
Key changes:
- Fixed
cursor.description[i][1]to return SQL type integer codes (e.g., 4, -9) instead of Python types (int, str) per DB-API 2.0 spec - Added SQL_SS_UDT (-151) support for SQL Server User-Defined Types including spatial data types
- Updated output converter lookup to use SQL type codes consistently throughout the codebase
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| mssql_python/cursor.py | Changed cursor.description to return SQL type codes instead of Python types; added _column_metadata storage; updated _build_converter_map to use SQL type codes; added mappings for UDT, XML, DATETIME2, SMALLDATETIME types |
| mssql_python/constants.py | Added SQL_SS_UDT = -151 constant for SQL Server User-Defined Types |
| mssql_python/pybind/ddbc_bindings.cpp | Added C++ constants for SQL_SS_UDT, SQL_DATETIME2, SQL_SMALLDATETIME; implemented UDT handling in SQLGetData_wrap, FetchBatchData, FetchMany_wrap, and FetchAll_wrap for LOB streaming |
| tests/test_004_cursor.py | Added lob_wvarchar_column to test schema; updated test_cursor_description to expect SQL type codes; added comprehensive geography type test suite (14 new tests); separated LOB and non-LOB fetch tests; fixed output converter test for UTF-16LE encoding |
| tests/test_003_connection.py | Simplified converter integration tests to use SQL type constants directly instead of dynamic type detection |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
3f52dd7 to
646ef04
Compare
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/connection.pyLines 978-986 Lines 997-1005 mssql_python/cursor.pyLines 130-138 Lines 148-156 Lines 1447-1455 Lines 1472-1480 Lines 2525-2533 📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.4%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 83.8%
mssql_python.__init__.py: 84.9%🔗 Quick Links
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
mssql_python/mssql_python.pyi:206
- The stub now types
Cursor.description[*][1]asSQLTypeCode, but runtime code still setsdescriptionto fallback tuples containing Python types (e.g.,tables()/columns()fallback_description usesstr,int, etc. inmssql_python/cursor.py). This makes the public.pyioverly strict and inaccurate for metadata result sets. Consider widening the stub toUnion[SQLTypeCode, int, type](orAny) for thetype_codeposition, and do the same for theRowconstructordescriptionparameter if it’s meant to model the same structure.
# description is a sequence of 7-item tuples:
# (name, type_code, display_size, internal_size, precision, scale, null_ok)
# type_code is SQLTypeCode which compares equal to both SQL integers and Python types
description: Optional[
List[
Tuple[
str,
SQLTypeCode,
Optional[int],
Optional[int],
Optional[int],
Optional[int],
Optional[bool],
]
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- test_004_cursor.py: Assert int(desc[1][1]) == -151 for geography type code - ddbc_bindings.cpp: Log message now includes geometry/geography/hierarchyid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (3)
mssql_python/mssql_python.pyi:170
- The stubs now require Row(..., description=...) to use SQLTypeCode for the type_code field, but Cursor._prepare_metadata_result_set can fall back to hard-coded descriptions that use built-in Python types (e.g., str/int) when DescribeCol fails. To match runtime behavior, consider widening this type to allow both SQLTypeCode and type/int (or using Any/Union).
cursor: "Cursor",
description: List[
Tuple[
str,
SQLTypeCode,
Optional[int],
Optional[int],
Optional[int],
Optional[int],
Optional[bool],
]
],
mssql_python/mssql_python.pyi:208
- Cursor.description is typed as SQLTypeCode in the stubs, but Cursor._prepare_metadata_result_set can assign a fallback_description containing built-in Python types (str/int) when metadata retrieval fails. To avoid incorrect type checking for those result sets, consider widening the stub type_code element to Union[SQLTypeCode, type, int] (or similar).
# DB-API 2.0 Required Attributes
# description is a sequence of 7-item tuples:
# (name, type_code, display_size, internal_size, precision, scale, null_ok)
# type_code is SQLTypeCode which compares equal to both SQL integers and Python types
description: Optional[
List[
Tuple[
str,
SQLTypeCode,
Optional[int],
Optional[int],
Optional[int],
Optional[int],
Optional[bool],
]
]
]
mssql_python/cursor.py:1111
- _build_converter_map() currently falls back to the SQL_WVARCHAR converter whenever no type-specific converter is registered, regardless of the column’s actual SQL type. With the optimized converter path, this can (a) unintentionally apply a text converter to non-text columns (e.g., VARBINARY/INT), and/or (b) trigger exceptions that get silently swallowed, hurting performance and masking converter issues. Consider only applying the WVARCHAR fallback for text SQL types (CHAR/VARCHAR/WCHAR/WVARCHAR/LONG variants), and leaving other columns’ converter as None.
for col_meta in self._column_metadata:
# Use the raw SQL type code from metadata, not the mapped Python type
sql_type = col_meta["DataType"]
converter = self.connection.get_output_converter(sql_type)
# If no converter found for the SQL type, try the WVARCHAR converter as a fallback
if converter is None:
from mssql_python.constants import ConstantsDDBC
converter = self.connection.get_output_converter(ConstantsDDBC.SQL_WVARCHAR.value)
converter_map.append(converter)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- cursor.py: Clear _column_metadata wherever description is set to None
(prevents stale metadata from previous queries)
- test_004_cursor.py: Add is_alive() check after thread join to detect hung threads
- test_004_cursor.py: Use encode('utf-16-le') instead of hard-coded bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Add #ifndef guards for SQL_DATETIME2 and SQL_SMALLDATETIME to prevent macro redefinition errors with ODBC headers (-Werror on GCC/Clang) - Add missing charEncoding argument to FetchLobColumnData call for SQL_SS_UDT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
mssql_python/cursor.py:1109
- _build_converter_map() imports ConstantsDDBC inside the per-column loop. This can be avoided by using the already-imported ddbc_sql_const (or importing once outside the loop) to reduce repeated import overhead.
# If no converter found for the SQL type, try the WVARCHAR converter as a fallback
if converter is None:
from mssql_python.constants import ConstantsDDBC
converter = self.connection.get_output_converter(ConstantsDDBC.SQL_WVARCHAR.value)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
…e_code The fallback_description paths in _prepare_metadata_result_set can return plain Python types, so the stub should reflect this to avoid incorrect type-checker failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- cursor.py: Use Optional[type] annotation for SQLTypeCode.__init__ python_type - test_004_cursor.py: Make threads daemon to prevent pytest from hanging if threads don't finish within timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Work Item / Issue Reference
Summary
This PR fixes
cursor.descriptionto properly return SQL type codes per DB-API 2.0 specification, while maintaining backwards compatibility with libraries like pandas that compare against Python types.Key Changes
New
SQLTypeCodeclass - A dual-compatible type code that compares equal to both:desc[1] == -9) for DB-API 2.0 compliancedesc[1] == str) for pandas/library compatibilitySQL Server spatial type support - Added handling for:
geography(e.g., GPS coordinates, regions)geometry(e.g., shapes, planar coordinates)hierarchyid(e.g., org charts, tree structures)ODBC 3.x date/time type codes - Added support for:
SQL_TYPE_DATE(91)SQL_TYPE_TIME(92)SQL_TYPE_TIMESTAMP(93)SQL_TYPE_TIMESTAMP_WITH_TIMEZONE(95)Hash/equality contract fix - Made
SQLTypeCodeunhashable since__eq__compares to multiple types with different hash valuesFiles Changed
mssql_python/cursor.pySQLTypeCodeclass, updated_initialize_description()mssql_python/constants.pySQL_SS_UDT,SQL_SS_TIME2,SQL_SS_XMLconstantsmssql_python/pybind/ddbc_bindings.cppmssql_python/__init__.pySQLTypeCodemssql_python/mssql_python.pyiSQLTypeCodetests/test_002_types.pySQLTypeCodeunit teststests/test_004_cursor.pyUsage Examples
SQLTypeCode - Dual Compatibility
Spatial Types
Testing
_column_metadataSQLTypeCodeunit testsCopilot Review Comments Addressed
_type_map__hash__ = None(unhashable)geo_colBreaking Changes
None. The
SQLTypeCodeclass maintains full backwards compatibility:cursor.description[i][1] == strstill works ✅cursor.description[i][1] == -9also works ✅