chain.grid_resampling_chain

Module for a Grid Resampling Chain

gridr.chain.grid_resampling_chain.GEOMETRY_RASTERIZE_KWARGS = {'alg': GridRasterizeAlg.RASTERIO_RASTERIZE}

Performs an additional validation of the grid source boundaries to ensure topological consistency.

This check computes the source boundaries from all valid grid data within the current computed region and verifies that the source boundaries extracted from the grid metrics align with the hull border. When using grid metrics only, we assume that points inside the source hull correspond to points within the target hull, which preserves topological integrity. If this assumption is violated, the read window may be too small, potentially causing a Rust panic when out-of-bounds indices are accessed.

This safety check helps prevent such runtime errors by proactively extending boundary conditions if required.

gridr.chain.grid_resampling_chain.basic_grid_resampling_array(interp, grid_arr, grid_arr_shape, grid_resolution, grid_nodata, grid_mask_arr, grid_mask_in_unmasked_value, array_src_ds, array_src_bands, array_src_mask_ds, array_src_mask_band, array_src_mask_validity_pair, array_src_geometry_origin, array_src_geometry_pair, oversampled_grid_win, margin, sma_out_buffer, out_win, nodata_out, boundary_condition, sma_out_mask_buffer, logger_msg_prefix, logger)

Resamples source data into a target oversampled window on a grid.

This method processes a 3D raster grid (grid_arr) that includes row and column coordinates. The grid_arr may represent a sub-region of a larger, non-oversampled grid. Additionally, since grid_arr might correspond to a uniquely allocated buffer used for multiple sub-regions, we cannot rely on its shape attribute to determine the dimensions of the sub-region. Therefore, the grid_arr_shape argument is required.
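The need for grid_arr_shape can be illustrated with a minimal NumPy sketch. The buffer size, the helper load_sub_region, and the demo coordinates are hypothetical and only serve to show why the buffer's own shape attribute does not describe the active sub-region; the row-then-column band order is assumed from the parameter description.

```python
import numpy as np

# Hypothetical buffer allocated once for the largest expected sub-region.
buffer = np.zeros((2, 6, 8), dtype=np.float64)


def load_sub_region(buf, rows, cols):
    """Fill the top-left (rows, cols) part of buf with demo coordinates.

    Band 0 holds row coordinates and band 1 holds column coordinates
    (assumed order). Returns the active shape to pass along as grid_arr_shape.
    """
    row_idx, col_idx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    buf[0, :rows, :cols] = row_idx
    buf[1, :rows, :cols] = col_idx
    return (rows, cols)


grid_arr_shape = load_sub_region(buffer, 3, 4)

# buffer.shape still reports the allocation size, not the active region,
# so the active region has to be sliced using grid_arr_shape.
active = buffer[:, : grid_arr_shape[0], : grid_arr_shape[1]]
```

Here buffer.shape stays (2, 6, 8) while the meaningful data occupies only the (3, 4) region described by grid_arr_shape.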

Parameters:
  • interp (Interpolator) – The interpolator.

  • grid_arr (numpy.ndarray) – A 3D array of shape (2, rows, cols), containing the raster grid’s row and column coordinates. It may represent a sub-region of a larger (non-oversampled) grid. The local origin (0, 0) corresponds to grid_arr[:, 0, 0].

  • grid_arr_shape (tuple of int) – Shape of the active sub-region in grid_arr, given as (rows, cols). Required because grid_arr may be a larger buffer reused across tiles or subregions.

  • grid_resolution (tuple of int) – Resolution of the coarse grid, typically in pixels or map units per pixel (e.g., (10, 10)).

  • grid_nodata (scalar or None) – The NoData value associated with grid_arr, marking invalid or missing data points.

  • grid_mask_arr (numpy.ndarray, optional) – Optional 2D uint8 or int8 mask array aligned with grid_arr (shape: (rows, cols)). Indicates valid (unmasked) and invalid (masked) data. Defaults to None.

  • grid_mask_in_unmasked_value (numpy.uint8) – Value in grid_mask_arr that represents a valid/unmasked data point.

  • array_src_ds (rasterio.io.DatasetReader) – The source dataset (e.g., a GDAL or Rasterio object) from which raster data will be read and resampled.

  • array_src_bands (int or list of int) – List of band indices to read from array_src_ds. If a single band is provided, it can be an integer.

  • array_src_mask_ds (rasterio.io.DatasetReader or None, optional) – Optional dataset representing the mask associated with array_src_ds. Defaults to None.

  • array_src_mask_band (int or None, optional) – Band index to read from array_src_mask_ds for the source mask. Defaults to None.

  • array_src_mask_validity_pair (tuple of int, optional) –

    A tuple containing two integers:

    • The first integer corresponds to the value to consider as valid in the mask array.

    • The second integer corresponds to the value to consider as invalid in the mask array.

    If the tuple differs from (Validity.VALID, Validity.INVALID), a replace operation is performed to make the mask compliant with the core resampling method.

  • array_src_geometry_origin (tuple of float or None, optional) – Specifies the origin convention for the array_src_geometry_pair definition. GridR uses a (0, 0) image coordinate system to address the first pixel of the array_src raster. This parameter lets you align the array_src_geometry_pair definition with GridR’s convention, ensuring proper spatial referencing. Internally, it is used only to adjust array_src_geometry_pair. Defaults to None.

  • array_src_geometry_pair (tuple of (GeometryType or None), optional) –

    A tuple containing two optional GeometryType elements:

    • The first element: Represents the valid geometries.

    • The second element: Represents the invalid geometries.

    If provided, these geometries are rasterized locally on the current array_src raster window. The generated mask is then merged with any additional raster mask supplied via the array_src_mask_ds dataset. The rasterization itself is delegated to GridR’s core build_mask method. Defaults to None.

  • oversampled_grid_win (numpy.ndarray) – Target window for resampling, defined in full-resolution coordinates, relative to the local origin of grid_arr.

  • margin (numpy.ndarray) – Pixel margin to apply when computing the minimal read window from array_src_ds, ensuring context for resampling (e.g., for kernels). Format: [[top_margin, bottom_margin], [left_margin, right_margin]].

  • sma_out_buffer (SharedMemoryArray or numpy.ndarray) – Output array (or shared memory buffer) where resampled values will be written.

  • out_win (numpy.ndarray) – Window within sma_out_buffer specifying where to write the output data. Format: [[row_start, row_end], [col_start, col_end]].

  • nodata_out (scalar or None) – NoData value to fill the output if the grid metrics are invalid or if no valid data points can be found.

  • boundary_condition (str or None) – Optional padding mode when required data for interpolation lies outside the source dataset domain. Available values are a subset of the numpy.pad method modes: ‘edge’, ‘reflect’, ‘symmetric’, or ‘wrap’. Uses a GridR-specific in-place padding implementation instead of numpy.pad to avoid unnecessary memory allocation.

  • sma_out_mask_buffer (SharedMemoryArray or numpy.ndarray or None, optional) – Optional output array (or shared memory buffer) where output mask will be written. Defaults to None.

  • logger_msg_prefix (str) – Prefix to prepend to all logger messages, useful for debugging and tracing within logs.

  • logger (logging.Logger) – Logger instance used for debug and informational messages.
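The boundary_condition modes listed above follow numpy.pad semantics. The sketch below uses numpy.pad itself purely to show what each mode produces on a small 1D example; it does not reproduce GridR's in-place padding implementation, and the sample values are arbitrary.

```python
import numpy as np

# Arbitrary sample row standing in for source pixels at the dataset edge.
row = np.array([1, 2, 3, 4])

# Pad two samples on each side with every mode accepted by boundary_condition.
padded = {
    mode: np.pad(row, 2, mode=mode)
    for mode in ("edge", "reflect", "symmetric", "wrap")
}

for mode, values in padded.items():
    print(f"{mode:>9}: {values.tolist()}")
# edge:      [1, 1, 1, 2, 3, 4, 4, 4]
# reflect:   [3, 2, 1, 2, 3, 4, 3, 2]
# symmetric: [2, 1, 1, 2, 3, 4, 4, 3]
# wrap:      [3, 4, 1, 2, 3, 4, 1, 2]
```

Note that 'reflect' mirrors around the edge sample without repeating it, whereas 'symmetric' repeats the edge sample.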

Notes

The goal is to generate data within a specific target window (oversampled_grid_win), which is defined in a full-resolution geometry. The coordinates of this target window are relative to the local origin of grid_arr.

To optimize the loading of only the necessary extent from the source image (array_src_ds), this method calls array_compute_resampling_grid_geometries on the minimal low-resolution grid window that completely contains the target full-resolution window. The provided margin parameter is also incorporated into this calculation to ensure sufficient data coverage.
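The idea of a minimal low-resolution window containing the full-resolution target window can be sketched as follows. This is an illustration under stated assumptions, not GridR's implementation: the helper minimal_coarse_window is hypothetical, window bounds are assumed to be inclusive, and coarse grid nodes are assumed to sit at full-resolution positions that are multiples of the resolution. The margin handling is not reproduced.

```python
import numpy as np


def minimal_coarse_window(oversampled_win, grid_resolution):
    """Return the smallest coarse-grid window covering a full-resolution window.

    oversampled_win: [[row_start, row_end], [col_start, col_end]], inclusive
                     bounds (assumed convention).
    grid_resolution: (row_resolution, col_resolution) oversampling factors.
    """
    win = np.asarray(oversampled_win, dtype=np.int64)
    res = np.asarray(grid_resolution, dtype=np.int64)
    start = win[:, 0] // res       # floor: last coarse node at or before the start
    stop = -(-win[:, 1] // res)    # ceiling: first coarse node at or after the end
    return np.stack([start, stop], axis=1)


coarse_win = minimal_coarse_window([[12, 37], [5, 20]], (10, 10))
print(coarse_win.tolist())  # [[1, 4], [0, 2]]
```

With a resolution of 10, full-resolution rows 12 to 37 lie between coarse nodes 1 and 4 (full-resolution positions 10 and 40), so every target pixel can be interpolated from the selected coarse nodes.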

The method writes the resampled output to a shared memory array (sma_out_buffer), using out_win to specify the target writing window within this buffer.

Optional masks for both the grid and the source array may be passed.

Masks can be passed as either int8 or uint8 arrays, but the values must be positive and within the uint8 range of [0-255]. If you provide an int8 mask, the method will internally shadow it with a uint8 view.
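The int8-to-uint8 shadowing can be pictured with numpy.ndarray.view, which reinterprets the same bytes without copying. This is a sketch of the general technique with arbitrary sample values; whether GridR uses exactly this call internally is an assumption.

```python
import numpy as np

# An int8 mask holding only non-negative values.
mask_i8 = np.array([[0, 1], [1, 0]], dtype=np.int8)

# Reinterpret the same memory as uint8: no copy, identical values.
mask_u8 = mask_i8.view(np.uint8)
print(mask_u8.dtype, np.shares_memory(mask_i8, mask_u8))  # uint8 True

# Negative int8 values would be reinterpreted as large uint8 values,
# which is why mask values must be non-negative.
negative_as_u8 = np.array([-1], dtype=np.int8).view(np.uint8)
print(negative_as_u8.tolist())  # [255]
```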

If the grid metrics are not valid (i.e., there was not sufficient valid data to determine the grid and source boundaries), the method fills the windowed output with the nodata_out value.
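The nodata fallback amounts to filling the out_win region of the output buffer. The sketch below uses a plain NumPy array in place of a SharedMemoryArray, a hypothetical helper fill_window_with_nodata, and assumes that the end indices in out_win are inclusive; adjust the slices if the actual convention is exclusive.

```python
import numpy as np


def fill_window_with_nodata(out_buffer, out_win, nodata_out):
    """Fill the out_win region of out_buffer (bands, rows, cols) with nodata_out.

    out_win is [[row_start, row_end], [col_start, col_end]] with inclusive
    end indices (assumed convention).
    """
    (r0, r1), (c0, c1) = out_win
    out_buffer[..., r0 : r1 + 1, c0 : c1 + 1] = nodata_out


# One band, 4 rows, 5 columns, pre-filled with ones for the demonstration.
out = np.ones((1, 4, 5), dtype=np.float32)
fill_window_with_nodata(out, np.array([[1, 2], [0, 3]]), -9999.0)
```

Only rows 1 to 2 and columns 0 to 3 are overwritten; the rest of the buffer is left untouched.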

gridr.chain.grid_resampling_chain.basic_grid_resampling_chain(grid_ds, grid_row_coords_band, grid_col_coords_band, grid_resolution, array_src_ds, array_src_bands, array_out_ds, interp, nodata_out, grid_col_ds=None, interp_kwargs=None, boundary_condition=None, win=None, grid_shift=None, array_src_mask_ds=None, array_src_mask_band=None, array_src_mask_validity_pair=None, mask_out_ds=None, grid_mask_in_ds=None, grid_mask_in_unmasked_value=None, grid_mask_in_band=None, array_src_geometry_origin=None, array_src_geometry_pair=None, io_strip_size=1000, io_strip_size_target=GridRIOMode.INPUT, ncpu=1, tile_shape=(1000, 1000), logger=None)

Performs a comprehensive grid-based resampling operation.

This function orchestrates the entire resampling process from input grid and source raster datasets to an output raster dataset, handling various masking, interpolation, and I/O strategies. It leverages the basic_grid_resampling_array core method for the actual array processing.

Parameters:
  • grid_ds (rasterio.io.DatasetReader) – Input dataset containing the grid coordinates. This dataset provides the destination geometry for resampling.

  • grid_row_coords_band (int) – Band index in grid_ds corresponding to the row coordinates of the grid.

  • grid_col_coords_band (int) – Band index in grid_ds (or grid_col_ds) corresponding to the column coordinates of the grid.

  • grid_resolution (tuple of int) – Resolution of the coarse grid, typically in pixels or map units per pixel (e.g., (10, 10)).

  • array_src_ds (rasterio.io.DatasetReader) – The source dataset from which raster data will be read and resampled.

  • array_src_bands (int or list of int) – Band index or list of band indices to read from array_src_ds.

  • array_out_ds (rasterio.io.DatasetWriter) – Output dataset where the resampled raster data will be written.

  • interp (InterpolatorIdentifier) –

    The interpolator identifier to use. It can be:

    • A string representing the interpolator name (e.g., “nearest”, “linear”, “cubic”, “bspline3”, “bspline11”, etc.).

    • A PyInterpolatorType enum value.

    • An instance of an interpolator class.

    See gridr.core.interp.interpolator for further details.

  • nodata_out (scalar) – NoData value to fill the output raster when no valid data points can be found for a given output pixel.

  • grid_col_ds (rasterio.io.DatasetReader or None, optional) – Optional separate dataset for grid column coordinates if they are not in grid_ds. Defaults to None.

  • interp_kwargs (Any, default None) – Optional keyword arguments passed to the get_interpolator function when creating the interpolator. They are used only if interp is given as a str or a PyInterpolatorType.

  • boundary_condition (str, default None) – Optional padding mode when required data for interpolation lies outside the source dataset domain. Available values are a subset of the numpy.pad method modes: ‘edge’, ‘reflect’, ‘symmetric’, or ‘wrap’. Uses a GridR-specific in-place padding implementation instead of numpy.pad to avoid unnecessary memory allocation.

  • win (numpy.ndarray, optional) – Optional output window of the grid_ds to process, defined as [[row_start, row_end], [col_start, col_end]]. This defines the region of interest for the resampling. If None, the full grid extent is considered. Defaults to None.

  • grid_shift (tuple of int or tuple of float, optional) – Optional shift vector applied to all grid coordinates, expressed in the source image coordinate system. The first component is applied to row coordinates and the second to column coordinates. This parameter allows adjustment of the pixel-center convention relative to the one used by GridR during resampling, for example to switch between the half-pixel convention and the whole-pixel convention used by GridR. Defaults to None.

  • array_src_mask_ds (rasterio.io.DatasetReader or None, optional) – Optional dataset representing the mask associated with array_src_ds. Defaults to None.

  • array_src_mask_band (int or None, optional) – Band index to read from array_src_mask_ds for the source mask. Defaults to None.

  • array_src_mask_validity_pair (tuple of int, optional) –

    A tuple containing two integers:

    • The first integer corresponds to the value to consider as valid in the mask array.

    • The second integer corresponds to the value to consider as invalid in the mask array.

    If the tuple differs from (Validity.VALID, Validity.INVALID), a replace operation is performed to make the mask compliant with the core resampling method.

  • mask_out_ds (rasterio.io.DatasetWriter or None, optional) – Output dataset where the resampled validity mask will be written. This mask indicates which output pixels contain valid resampled data. Defaults to None.

  • grid_mask_in_ds (rasterio.io.DatasetReader or None, optional) – Optional input dataset for the grid mask. This mask can define valid areas within the grid itself. Defaults to None.

  • grid_mask_in_unmasked_value (int or None, optional) – Value in grid_mask_in_ds that represents a valid/unmasked data point. Defaults to None.

  • grid_mask_in_band (int or None, optional) – Band index to read from grid_mask_in_ds for the input grid mask. Defaults to None.

  • array_src_geometry_origin (tuple of float or None, optional) – Specifies the origin convention for array_src_geometry_pair definition. GridR uses a (0, 0) image coordinate system to address the first pixel of the source raster. This parameter aligns the geometry definition with GridR’s convention. Defaults to None.

  • array_src_geometry_pair (tuple of (GeometryType or None), optional) –

    A tuple containing two optional GeometryType elements:

    • The first element: Represents the valid geometries.

    • The second element: Represents the invalid geometries.

    If provided, these geometries are rasterized locally on the current array_src raster window. The generated mask is then merged with any additional raster mask supplied via the array_src_mask_ds dataset. The rasterization itself is delegated to GridR’s core build_mask method. Defaults to None.

  • io_strip_size (int, optional) – The number of rows per chunk for I/O operations. This parameter optimizes memory usage and processing speed by dividing the input and output operations into manageable strips. Defaults to DEFAULT_IO_STRIP_SIZE.

  • io_strip_size_target (GridRIOMode, optional) – Defines how io_strip_size is applied, e.g., to input or output strips. Defaults to GridRIOMode.INPUT.

  • ncpu (int, optional) – Number of CPU cores to use for parallel processing. Defaults to DEFAULT_NCPU.

  • tile_shape (tuple of int or None, optional) – Shape (rows, cols) for internal processing tiles within strips, optimizing cache usage. Defaults to DEFAULT_TILE_SHAPE.

  • logger (logging.Logger or None, optional) – Logger instance for debugging and informational messages. If None, a default logger is initialized internally. Defaults to None.

Returns:

Returns 1 upon successful completion of the resampling process. A return value other than 1 indicates an error.

Return type:

int

Notes

This function manages the reading of input data in chunks (strips), calls the basic_grid_resampling_array method for processing each chunk, and then writes the results to the output datasets.
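The strip-wise processing can be pictured with a small generator that splits a row range into chunks of at most io_strip_size rows. This is a simplified sketch: the helper iter_strips is hypothetical, the inclusive end index is an assumed convention, and GridR's actual chunking (which also depends on io_strip_size_target and tile_shape) is not reproduced here.

```python
def iter_strips(n_rows, strip_size):
    """Yield (row_start, row_end) pairs covering n_rows rows.

    row_end is inclusive (assumed convention); the last strip may be shorter.
    """
    for start in range(0, n_rows, strip_size):
        yield start, min(start + strip_size, n_rows) - 1


strips = list(iter_strips(2500, 1000))
print(strips)  # [(0, 999), (1000, 1999), (2000, 2499)]
```

Each strip would then be read, resampled, and written before the next one is loaded, which bounds the memory needed at any one time.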

The method handles grid data from one or two separate datasets for row and column coordinates. It also incorporates masking capabilities for both the input grid and the source array, allowing for flexible data validity management.
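As an illustration of the mask validity handling, the replace operation triggered by array_src_mask_validity_pair can be sketched in NumPy. The target values below are placeholders chosen for this example; the real values of Validity.VALID and Validity.INVALID are defined by GridR. The helper remap_mask is hypothetical, and treating every value other than the declared valid one as invalid is an assumption of this sketch.

```python
import numpy as np

# Placeholder target convention for illustration only; the actual values of
# Validity.VALID and Validity.INVALID are defined by GridR.
TARGET_VALID, TARGET_INVALID = 1, 0


def remap_mask(mask, validity_pair):
    """Return a mask expressed in the target (valid, invalid) convention.

    validity_pair is (valid_value, invalid_value) as found in the input mask.
    Any value other than valid_value is treated as invalid (assumption).
    """
    if tuple(validity_pair) == (TARGET_VALID, TARGET_INVALID):
        return mask  # already compliant, nothing to replace
    out = np.full_like(mask, TARGET_INVALID)
    out[mask == validity_pair[0]] = TARGET_VALID
    return out


# Input mask where 0 means valid and 255 means invalid.
src_mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
remapped = remap_mask(src_mask, (0, 255))
print(remapped.tolist())  # [[1, 0], [0, 1]]
```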

The win parameter is crucial for defining the specific output region to be processed, enabling partial grid resampling without loading the entire dataset into memory.