Poor performance for boundless reads #227

@AsgerPetersen

I would like to propose exposing a way to toggle boundless reading for two reasons:

  1. There are use cases where a feature falling outside the raster extent is an error. For example, in my job I am provided with countrywide rasters, and I collect statistics from them for buildings and roads. If a feature lies outside the raster extent, something is wrong with either the feature or the raster.

  2. Enabling boundless reading in rasterio seriously degrades performance in some cases. It looks like the dataset is opened once per feature and the block cache is effectively disabled when using boundless reading. This gist https://gist.github.com/AsgerPetersen/6f9c8120b85e462ccbc26191a2117b3a demonstrates a performance improvement of about 50x when boundless reading is disabled. On my real-world data the improvement is on the order of 200x.
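To illustrate the first point, here is a minimal sketch of the kind of check I have in mind: instead of padding a read that extends past the raster, fail loudly. The names `Bounds`, `is_within` and `check_feature` are hypothetical, made up for this example. They are not part of rasterio or of my commit, and the sketch only compares bounding boxes in the raster's coordinate system.

```python
from typing import NamedTuple


class Bounds(NamedTuple):
    """Axis-aligned bounding box in the raster's coordinate system (hypothetical helper)."""
    left: float
    bottom: float
    right: float
    top: float


def is_within(feature: Bounds, raster: Bounds) -> bool:
    """Return True if the feature's bounding box lies entirely inside the raster extent."""
    return (
        feature.left >= raster.left
        and feature.right <= raster.right
        and feature.bottom >= raster.bottom
        and feature.top <= raster.top
    )


def check_feature(feature: Bounds, raster: Bounds) -> None:
    """Raise instead of silently padding when a feature leaves the raster extent."""
    if not is_within(feature, raster):
        raise ValueError(
            f"feature bounds {tuple(feature)} exceed raster extent {tuple(raster)}"
        )


# Illustrative values only, not taken from real data.
raster_extent = Bounds(0.0, 0.0, 100.0, 100.0)
inside = Bounds(10.0, 10.0, 20.0, 20.0)
straddling = Bounds(90.0, 90.0, 110.0, 110.0)
```

With boundless reading turned off, `check_feature(straddling, raster_extent)` would raise a `ValueError`, which is the behaviour I want for my countrywide rasters.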

I implemented this for my own use here: AsgerPetersen@c375094.
