Skip to content

Conversation

@dlevy-msft-sql
Copy link
Contributor

@dlevy-msft-sql dlevy-msft-sql commented Dec 1, 2025

Work Item / Issue Reference

GitHub Issue: #352


Summary

This PR fixes cursor.description to properly return SQL type codes per DB-API 2.0 specification, while maintaining backwards compatibility with libraries like pandas that compare against Python types.

Key Changes

  1. New SQLTypeCode class - A dual-compatible type code that compares equal to both:

    • SQL type integers (e.g., desc[1] == -9) for DB-API 2.0 compliance
    • Python types (e.g., desc[1] == str) for pandas/library compatibility
  2. SQL Server spatial type support - Added handling for:

    • geography (e.g., GPS coordinates, regions)
    • geometry (e.g., shapes, planar coordinates)
    • hierarchyid (e.g., org charts, tree structures)
  3. ODBC 3.x date/time type codes - Added support for:

    • SQL_TYPE_DATE (91)
    • SQL_TYPE_TIME (92)
    • SQL_TYPE_TIMESTAMP (93)
    • SQL_TYPE_TIMESTAMP_WITH_TIMEZONE (95)
  4. Hash/equality contract fix - Made SQLTypeCode unhashable since __eq__ compares to multiple types with different hash values


Files Changed

File Changes
mssql_python/cursor.py Added SQLTypeCode class, updated _initialize_description()
mssql_python/constants.py Added SQL_SS_UDT, SQL_SS_TIME2, SQL_SS_XML constants
mssql_python/pybind/ddbc_bindings.cpp Added C++ handling for spatial types
mssql_python/__init__.py Exported SQLTypeCode
mssql_python/mssql_python.pyi Added type stubs for SQLTypeCode
tests/test_002_types.py Added SQLTypeCode unit tests
tests/test_004_cursor.py Added spatial type tests, thread safety tests

Usage Examples

SQLTypeCode - Dual Compatibility

cursor.execute("SELECT id, name FROM users")
desc = cursor.description

# Style 1: Compare with Python types (pandas compatibility)
if desc[0][1] == int:
    print("Integer column")

# Style 2: Compare with SQL type codes (DB-API 2.0)
if desc[0][1] == 4:  # SQL_INTEGER
    print("Integer column")

# Get raw SQL type code
type_code = int(desc[0][1])  # Returns 4 for SQL_INTEGER

Spatial Types

# Geography (GPS coordinates)
cursor.execute("""
    INSERT INTO locations (geo) 
    VALUES (geography::STGeomFromText('POINT(-122.349 47.651)', 4326))
""")

# Fetch as binary
cursor.execute("SELECT geo FROM locations")
geo_bytes = cursor.fetchone()[0]  # Returns bytes

# Or fetch as WKT text
cursor.execute("SELECT geo.STAsText() FROM locations")
wkt = cursor.fetchone()[0]  # Returns "POINT (-122.349 47.651)"

Testing

  • All existing tests pass
  • Added 40+ new tests for spatial types (geography, geometry, hierarchyid)
  • Added thread safety tests for _column_metadata
  • Added SQLTypeCode unit tests
  • Coverage: 98% diff coverage, 76% overall

Copilot Review Comments Addressed

Comment Resolution
Missing ODBC 3.x date/time codes Added 91, 92, 93, 95 to _type_map
Hash/eq contract violation Set __hash__ = None (unhashable)
Geography executemany test incomplete Test now inserts and verifies geo_col

Breaking Changes

None. The SQLTypeCode class maintains full backwards compatibility:

  • cursor.description[i][1] == str still works ✅
  • cursor.description[i][1] == -9 also works ✅

Copilot AI review requested due to automatic review settings December 1, 2025 17:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes issue #352 by implementing two main improvements: (1) correcting cursor.description to return SQL type codes (integers) instead of Python type objects, ensuring DB-API 2.0 specification compliance, and (2) adding support for SQL Server spatial data types (geography, geometry, hierarchyid) by handling the SQL_SS_UDT type code (-151).

Key changes:

  • Fixed cursor.description[i][1] to return SQL type integer codes (e.g., 4, -9) instead of Python types (int, str) per DB-API 2.0 spec
  • Added SQL_SS_UDT (-151) support for SQL Server User-Defined Types including spatial data types
  • Updated output converter lookup to use SQL type codes consistently throughout the codebase

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
mssql_python/cursor.py Changed cursor.description to return SQL type codes instead of Python types; added _column_metadata storage; updated _build_converter_map to use SQL type codes; added mappings for UDT, XML, DATETIME2, SMALLDATETIME types
mssql_python/constants.py Added SQL_SS_UDT = -151 constant for SQL Server User-Defined Types
mssql_python/pybind/ddbc_bindings.cpp Added C++ constants for SQL_SS_UDT, SQL_DATETIME2, SQL_SMALLDATETIME; implemented UDT handling in SQLGetData_wrap, FetchBatchData, FetchMany_wrap, and FetchAll_wrap for LOB streaming
tests/test_004_cursor.py Added lob_wvarchar_column to test schema; updated test_cursor_description to expect SQL type codes; added comprehensive geography type test suite (14 new tests); separated LOB and non-LOB fetch tests; fixed output converter test for UTF-16LE encoding
tests/test_003_connection.py Simplified converter integration tests to use SQL type constants directly instead of dynamic type detection

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@dlevy-msft-sql dlevy-msft-sql added bug Something isn't working pr-size: medium Moderate update size labels Dec 1, 2025
@github-actions
Copy link

github-actions bot commented Jan 5, 2026

📊 Code Coverage Report

🔥 Diff Coverage

90%


🎯 Overall Coverage

76%


📈 Total Lines Covered: 5529 out of 7199
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/init.py (100%)
  • mssql_python/connection.py (66.7%): Missing lines 982,1001
  • mssql_python/constants.py (100%)
  • mssql_python/cursor.py (87.8%): Missing lines 134,152,1451,1476,2529
  • mssql_python/pybind/ddbc_bindings.cpp (100%)

Summary

  • Total: 72 lines
  • Missing: 7 lines
  • Coverage: 90%

mssql_python/connection.py

Lines 978-986


Lines 997-1005


mssql_python/cursor.py

Lines 130-138


Lines 148-156


Lines 1447-1455


Lines 1472-1480


Lines 2525-2533



📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.4%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 83.8%
mssql_python.__init__.py: 84.9%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

@dlevy-msft-sql dlevy-msft-sql added pr-size: small Minimal code update pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Jan 16, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

mssql_python/mssql_python.pyi:206

  • The stub now types Cursor.description[*][1] as SQLTypeCode, but runtime code still sets description to fallback tuples containing Python types (e.g., tables()/columns() fallback_description uses str, int, etc. in mssql_python/cursor.py). This makes the public .pyi overly strict and inaccurate for metadata result sets. Consider widening the stub to Union[SQLTypeCode, int, type] (or Any) for the type_code position, and do the same for the Row constructor description parameter if it’s meant to model the same structure.
    # description is a sequence of 7-item tuples:
    # (name, type_code, display_size, internal_size, precision, scale, null_ok)
    # type_code is SQLTypeCode which compares equal to both SQL integers and Python types
    description: Optional[
        List[
            Tuple[
                str,
                SQLTypeCode,
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[bool],
            ]

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- test_004_cursor.py: Assert int(desc[1][1]) == -151 for geography type code
- ddbc_bindings.cpp: Log message now includes geometry/geography/hierarchyid
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (3)

mssql_python/mssql_python.pyi:170

  • The stubs now require Row(..., description=...) to use SQLTypeCode for the type_code field, but Cursor._prepare_metadata_result_set can fall back to hard-coded descriptions that use built-in Python types (e.g., str/int) when DescribeCol fails. To match runtime behavior, consider widening this type to allow both SQLTypeCode and type/int (or using Any/Union).
        cursor: "Cursor",
        description: List[
            Tuple[
                str,
                SQLTypeCode,
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[bool],
            ]
        ],

mssql_python/mssql_python.pyi:208

  • Cursor.description is typed as SQLTypeCode in the stubs, but Cursor._prepare_metadata_result_set can assign a fallback_description containing built-in Python types (str/int) when metadata retrieval fails. To avoid incorrect type checking for those result sets, consider widening the stub type_code element to Union[SQLTypeCode, type, int] (or similar).
    # DB-API 2.0 Required Attributes
    # description is a sequence of 7-item tuples:
    # (name, type_code, display_size, internal_size, precision, scale, null_ok)
    # type_code is SQLTypeCode which compares equal to both SQL integers and Python types
    description: Optional[
        List[
            Tuple[
                str,
                SQLTypeCode,
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[int],
                Optional[bool],
            ]
        ]
    ]

mssql_python/cursor.py:1111

  • _build_converter_map() currently falls back to the SQL_WVARCHAR converter whenever no type-specific converter is registered, regardless of the column’s actual SQL type. With the optimized converter path, this can (a) unintentionally apply a text converter to non-text columns (e.g., VARBINARY/INT), and/or (b) trigger exceptions that get silently swallowed, hurting performance and masking converter issues. Consider only applying the WVARCHAR fallback for text SQL types (CHAR/VARCHAR/WCHAR/WVARCHAR/LONG variants), and leaving other columns’ converter as None.
        for col_meta in self._column_metadata:
            # Use the raw SQL type code from metadata, not the mapped Python type
            sql_type = col_meta["DataType"]
            converter = self.connection.get_output_converter(sql_type)
            # If no converter found for the SQL type, try the WVARCHAR converter as a fallback
            if converter is None:
                from mssql_python.constants import ConstantsDDBC

                converter = self.connection.get_output_converter(ConstantsDDBC.SQL_WVARCHAR.value)

            converter_map.append(converter)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- cursor.py: Clear _column_metadata wherever description is set to None
  (prevents stale metadata from previous queries)
- test_004_cursor.py: Add is_alive() check after thread join to detect hung threads
- test_004_cursor.py: Use encode('utf-16-le') instead of hard-coded bytes
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Add #ifndef guards for SQL_DATETIME2 and SQL_SMALLDATETIME to prevent
  macro redefinition errors with ODBC headers (-Werror on GCC/Clang)
- Add missing charEncoding argument to FetchLobColumnData call for SQL_SS_UDT
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

mssql_python/cursor.py:1109

  • _build_converter_map() imports ConstantsDDBC inside the per-column loop. This can be avoided by using the already-imported ddbc_sql_const (or importing once outside the loop) to reduce repeated import overhead.
            # If no converter found for the SQL type, try the WVARCHAR converter as a fallback
            if converter is None:
                from mssql_python.constants import ConstantsDDBC

                converter = self.connection.get_output_converter(ConstantsDDBC.SQL_WVARCHAR.value)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…e_code

The fallback_description paths in _prepare_metadata_result_set can return
plain Python types, so the stub should reflect this to avoid incorrect
type-checker failures.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- cursor.py: Use Optional[type] annotation for SQLTypeCode.__init__ python_type
- test_004_cursor.py: Make threads daemon to prevent pytest from hanging
  if threads don't finish within timeout
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working pr-size: medium Moderate update size pr-size: small Minimal code update

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cursor.description returning incorrect values

2 participants