Incorrect Timestamp gt behavior breaks watermark-based queries in Azure Table Storage #45046

@debadyuti0510

  • Package Name: azure-data-tables
  • Package Version: 12.7.0
  • Operating System: Linux Mint 22.2 (First encountered through Azure Databricks)
  • Python Version: 3.12.3

Describe the bug
Azure Table Storage returns entities when queried with a filter that requests timestamps strictly greater than the maximum timestamp previously returned by the service. In other words, a query of the form “Timestamp greater than the latest observed Timestamp” still yields rows whose timestamps are less than or equal to that value. This breaks expected ordering guarantees relied on by incremental and watermark-based workflows.

To Reproduce
Steps to reproduce the behavior:

  1. Insert four entities into an Azure table.
  2. Find the minimum and maximum metadata timestamps (min_ts and max_ts) from the list of entities.
  3. Call query_entities with query_filter=f"Timestamp gt datetime'{max_ts}'". It still returns the last record, although an empty result is expected.
  4. Similarly, querying with min_ts returns all four records instead of only the last three.

See the screenshots below for examples.
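
The steps above can be sketched roughly as follows. This is a minimal sketch, not the exact script used for the screenshots: the helper names `gt_filter` and `reproduce`, the partition and row key values, and the assumption that the table already exists are all illustrative. The filter string mirrors the one quoted in step 3.

```python
from datetime import datetime, timezone


def gt_filter(ts: datetime) -> str:
    """Build the filter string as quoted in the issue (hypothetical helper)."""
    return f"Timestamp gt datetime'{ts}'"


def reproduce(connection_string: str, table_name: str) -> list:
    """Insert four entities, then query for rows newer than the newest one.

    Assumes the table already exists and is otherwise empty.
    """
    # Imported lazily so the filter helper can be used without the SDK installed.
    from azure.data.tables import TableClient

    client = TableClient.from_connection_string(
        connection_string, table_name=table_name
    )

    # Step 1: insert four entities (key values are illustrative).
    for i in range(4):
        client.create_entity({"PartitionKey": "pk", "RowKey": str(i), "Value": i})

    # Step 2: read back the service-assigned timestamps from entity metadata.
    entities = list(client.list_entities())
    max_ts = max(e.metadata["timestamp"] for e in entities)

    # Step 3: a strict "greater than" on max_ts is expected to return nothing.
    return list(client.query_entities(query_filter=gt_filter(max_ts)))
```

Calling `reproduce(...)` against a standard storage account is expected to return an empty list; per this report it returns the most recent entity instead.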

Expected behavior
When an explicit filter is passed with Timestamp gt datetime'{max_ts}', no records should be returned.
In practice, the gt operator behaves like a ge operator.
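
The expectation can be stated as a small check: every row returned by a strict gt query should have a timestamp later than the watermark. The sketch below is illustrative only. The function name `not_strictly_after` and the sample rows are made up for this example and are not taken from the screenshots.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def not_strictly_after(
    rows: Iterable[T],
    watermark: datetime,
    get_ts: Callable[[T], datetime],
) -> List[T]:
    """Return rows whose timestamp is <= watermark.

    A strict "gt" filter should have excluded every row returned here.
    """
    return [row for row in rows if get_ts(row) <= watermark]


# Illustrative data only: one row exactly at the watermark, one after it.
watermark = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
rows = [
    {"RowKey": "3", "ts": watermark},
    {"RowKey": "4", "ts": watermark.replace(minute=1)},
]
offending = not_strictly_after(rows, watermark, lambda r: r["ts"])
```

With entities returned by azure-data-tables, `get_ts` could be `lambda e: e.metadata["timestamp"]`. A non-empty result from this check on the output of a gt query corresponds to the behavior reported here.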

Screenshots

[Image] [Image]

Additional context
This behavior is observed against Azure Table Storage (standard storage account), not Cosmos DB Table. The issue was first observed in an Azure Databricks environment and was later reproduced consistently on a local machine to rule out environment-specific factors.

Metadata

Assignees

No one assigned

    Labels

      • Service Attention: Workflow: This issue is responsible by Azure service team.
      • Storage: Storage Service (Queues, Blobs, Files)
      • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
      • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
      • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
