eodal.core.band module

A band is a two-dimensional array that can be located via a spatial coordinate system. Each band thus has a name and an array of values, which are usually numeric.

The module relies on rasterio for all input and output operations and reads data from files (or URIs) through GDAL drivers.

eodal stores band data as numpy arrays. Masked arrays of the class ~numpy.ma.MaskedArray are also supported. For data sets too large to fit into the computer's RAM, zarr can be used.

Copyright (C) 2022 Lukas Valentin Graf

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

class eodal.core.band.Band(band_name: str, values: ndarray | MaskedArray | Array, geo_info: GeoInfo, band_alias: str | None = '', wavelength_info: WavelengthInfo | None = None, scale: int | float | None = 1.0, offset: int | float | None = 0.0, unit: str | None = '', nodata: int | float | None = None, is_tiled: int | bool | None = 0, area_or_point: str | None = 'Area', vector_features: GeoDataFrame | None = None)[source]

Bases: object

Class for storing, accessing and modifying a raster band

Attrib band_name:

the band name identifies the raster band (e.g., ‘B1’). It can be any character string

Attrib values:

the actual raster data as numpy.ndarray, numpy.ma.MaskedArray or zarr. The type depends on how the constructor is called.

Attrib geo_info:

GeoInfo object defining the spatial reference system, upper left corner and pixel size (spatial resolution)

Attrib band_alias:

optional band alias to use in addition to band_name. band_name and band_alias are interchangeable.

Attrib wavelength_info:

optional wavelength info about the band to allow for localizing the band data in the spectral domain (mostly required for data from optical imaging sensors).

Attrib scale:

scale (aka gain) parameter of the raster data.

Attrib offset:

offset parameter of the raster data.

Attrib unit:

optional (SI) physical unit of the band data (e.g., ‘meters’ for elevation data)

Attrib nodata:

numeric value indicating no-data. If not provided the nodata value is set to numpy.nan for floating point data, 0 and -999 for unsigned and signed integer data, respectively.

Attrib is_tiled:

boolean flag indicating if the raster data is sub-divided into tiles. False (zero) by default.

Attrib area_or_point:

Following GDAL standards, either Area (GDAL default) or Point. With Area, pixel coordinates refer to the upper left corner of the pixel, whereas Point indicates that pixel coordinates refer to the center of the pixel.

Attrib alias:

alias of the band name (if available)

Attrib bounds:

image bounds in cartographic projection

Attrib coordinates:

image coordinates in x and y direction

Attrib crs:

coordinate reference system as EPSG code

Attrib has_alias:

True if the band has a band_alias

Attrib is_zarr:

True if the band data is stored as zarr

Attrib is_ndarray:

True if the band data is stored as numpy.ndarray

Attrib is_masked_array:

True if the band data is stored as numpy.ma.MaskedArray

Attrib meta:

rasterio compatible representation of essential image metadata

Attrib transform:

Affine transform representation of the image geo-localisation

Attrib vector_features:

geopandas.GeoDataFrame with vector features used for reading the image (clipping or masking). Can be None if no features were used for reading.
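
The default rule described for nodata can be sketched as follows. This is a minimal illustration of the documented behaviour, not eodal's own code; the helper name default_nodata and the use of dtype names as strings are assumptions made for the example.

```python
import math


def default_nodata(dtype_name: str):
    """Return the documented default no-data value for a dtype name.

    Hypothetical helper: it follows the rule stated for ``nodata``
    (NaN for floats, 0 for unsigned integers, -999 for signed integers)
    and is not taken from the eodal source code.
    """
    if dtype_name.startswith("float"):
        return math.nan
    if dtype_name.startswith("uint"):
        return 0
    if dtype_name.startswith("int"):
        return -999
    raise ValueError(f"unsupported dtype: {dtype_name}")


for name in ("float32", "uint16", "int16"):
    print(name, default_nodata(name))
```

Passing nodata explicitly avoids relying on these defaults, which is advisable whenever the data source defines its own no-data value.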

__init__(band_name: str, values: ndarray | MaskedArray | Array, geo_info: GeoInfo, band_alias: str | None = '', wavelength_info: WavelengthInfo | None = None, scale: int | float | None = 1.0, offset: int | float | None = 0.0, unit: str | None = '', nodata: int | float | None = None, is_tiled: int | bool | None = 0, area_or_point: str | None = 'Area', vector_features: GeoDataFrame | None = None)[source]

Constructor to instantiate a new band object.

Parameters:
  • band_name – name of the band.

  • values – data of the band. Can be any numpy ndarray or MaskedArray as well as a zarr instance, as long as it is two-dimensional.

  • geo_info – ~eodal.core.band.GeoInfo instance to allow for localizing the band data in a spatial reference system

  • band_alias – optional alias name of the band

  • wavelength_info – optional ~eodal.core.band.WavelengthInfo instance denoting the spectral wavelength properties of the band. It is recommended to pass this parameter for optical sensor data.

  • scale – optional scale (aka gain) factor for the raster band data. Many floating point datasets are scaled by a large number to allow for storing data as integer arrays to save disk space. The scale factor should allow to scale the data back into its original value range. For instance, Sentinel-2 MSI data is stored as unsigned 16-bit integer arrays but actually contain reflectance factor values between 0 and 1. If not provided, scale is set to 1.

  • offset – optional offset for the raster band data. As for the gain factor the idea is to scale the original band data in such a way that it’s either possible to store the data in a certain data type or to avoid certain values. If not provided, offset is set to 0.

  • unit – optional (SI) physical unit of the band data (e.g., ‘meters’ for elevation data)

  • nodata – numeric value indicating no-data. If not provided the nodata value is set to numpy.nan for floating point data, 0 and -999 for unsigned and signed integer data, respectively.

  • is_tiled – boolean flag indicating if the raster data is sub-divided into tiles. False (zero) by default.

  • area_or_point – Following GDAL standards, either Area (GDAL default) or Point. With Area, pixel coordinates refer to the upper left corner of the pixel, whereas Point indicates that pixel coordinates refer to the center of the pixel.

  • vector_features – geopandas.GeoDataFrame with vector features used for reading the image (clipping or masking). Can be None if no features were used for reading (optional).
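
To illustrate the scale and offset parameters described above, the following sketch rescales integer values back into their physical range. It assumes the common convention value = scale * stored + offset and plain Python lists instead of arrays; the function name apply_scale_offset and the sample numbers are illustrative, and the order of operations used by eodal or a given data product may differ.

```python
import math


def apply_scale_offset(values, scale=1.0, offset=0.0, nodata=None):
    """Rescale stored values, leaving no-data entries untouched.

    Assumption: value = scale * stored + offset. Check which order of
    operations your data source and eodal version actually use.
    """
    return [
        v if (nodata is not None and v == nodata) else scale * v + offset
        for v in values
    ]


# Illustrative integer values in the style of Sentinel-2 reflectance data,
# with 0 treated as no-data in this example.
stored = [0, 2500, 10000]
reflectance = apply_scale_offset(stored, scale=0.0001, nodata=0)
print(reflectance)
```

The no-data entry is returned unchanged, while the remaining values end up in the 0 to 1 reflectance range.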

property alias: str | None

Alias of the band name (if available)

property bounds: box

Spatial bounding box of the band

clip(clipping_bounds: Path | GeoDataFrame | GeoSeries | Tuple[float, float, float, float] | Polygon | MultiPolygon, full_bounding_box_only: bool | None = False, inplace: bool | None = False)[source]

Clip a band object to a geometry or the bounding box of one or more geometries. By default, pixel values outside the geometry are masked. The spatial extent of the returned Band instance is always cropped to the bounding box of the geometry/geometries.

Note

When passing a GeoDataFrame with more than one feature, the single feature geometries are dissolved into a single one!

Parameters:
  • clipping_bounds – spatial bounds to clip the Band to. Can be either a vector file, a shapely Polygon or MultiPolygon, a GeoDataFrame, GeoSeries or a coordinate tuple with (xmin, ymin, xmax, ymax). Vector files and GeoDataFrame are reprojected into the bands’ coordinate system if required, while the coordinate tuple and shapely geometry MUST be provided in the CRS of the band.

  • full_bounding_box_only – if False (default), clips to the bounding box of the geometry and masks values outside the actual geometry boundaries. To obtain all values within the bounding box, set to True. New in version 0.1.1.

  • inplace – if False (default) returns a copy of the Band instance with the changes applied. If True overwrites the values in the current instance.

Returns:

clipped band instance.

property coordinates: Dict[str, ndarray]

x-y spatial band coordinates

copy()[source]

Returns a copy of the current Band instance

property crs: CRS

Coordinate Reference System of the band

classmethod from_rasterio(fpath_raster: Path | Dict, band_idx: int | None = 1, band_name_src: str | None = '', band_name_dst: str | None = 'B1', vector_features: Path | GeoDataFrame | None = None, full_bounding_box_only: bool | None = False, epsg_code: int | None = None, **kwargs)[source]

Creates a new Band instance from any raster dataset understood by rasterio. Reads exactly one band from the input dataset!

Note

To read only a spatial subset of the raster band data, pass vector_features, which can be one to N (multi)polygon features. For Point features refer to the read_pixels method.

Parameters:
  • fpath_raster – file-path to the raster file from which to read a band or

  • band_idx – band index of the raster band to read (starting with 1). If not provided the first band will be always read. Ignored if band_name_src is provided.

  • band_name_src – instead of providing a band index to read (band_idx) a band name can be passed. If provided band_idx is ignored.

  • band_name_dst – name of the raster band in the resulting Band instance. If not provided the default value (‘B1’) is used. Whenever the band name is known it is recommended to use a meaningful band name!

  • vector_features – GeoDataFrame or file with vector features in a format understood by fiona with one or more vector features of type Polygon or MultiPolygon. Unless full_bounding_box_only is set to True, all pixels not covered by the provided vector features are masked out. Otherwise the spatial bounding box encompassing all vector features is read as a spatial subset of the input raster band. If the coordinate system of the vector features differs from that of the raster data source, the vector features are projected into the CRS of the raster band before extraction.

  • full_bounding_box_only – if False (default), pixels not covered by the vector features are masked out using a numpy MaskedArray internally. If True, does not mask pixels within the spatial bounding box of the vector_features.

  • epsg_code – custom EPSG code of the raster dataset in case the raster has no internally-described EPSG code or no EPSG code at all.

  • kwargs – further key-word arguments to pass to ~eodal.core.band.Band.

Returns:

new Band instance from a rasterio dataset.

classmethod from_vector(vector_features: Path | GeoDataFrame, geo_info: GeoInfo, band_name_src: str | None = None, band_name_dst: str | None = 'B1', nodata_dst: int | float | None = 0, snap_bounds: Polygon | None = None, dtype_src: str | None = 'float32', **kwargs)[source]

Creates a new Band instance from a GeoDataFrame or a file with vector features in a format understood by fiona with geometries of type Point, Polygon or MultiPolygon using a single user-defined attribute (column in the data frame). The spatial reference system of the resulting band will be the same as for the input vector data.

Parameters:
  • vector_features – file-path to a vector file or GeoDataFrame from which to convert a column to raster. Please note that the column must have a numerical data type.

  • geo_info – ~eodal.core.band.GeoInfo instance to allow for localizing the band data in a spatial reference system

  • band_name_src – name of the attribute in the vector features’ attribute table to convert to a new Band instance. If left empty generates a binary raster with 1 for cells overlapping the vector geometries and zero elsewhere.

  • band_name_dst – name of the resulting Band instance. “B1” by default.

  • nodata_dst – nodata value in the resulting band data to fill raster grid cells having no value assigned from the input vector features. If not provided the nodata value is set to 0 (rasterio default)

  • dtype_src – data type of the resulting raster array. Per default “float32” is used.

  • kwargs – additional key-word arguments to pass to ~eodal.core.Band

Returns:

new Band instance from a vector features source

get_attributes(**kwargs) Dict[str, Any][source]

Returns raster data attributes in a rasterio-compatible way

Parameters:

kwargs – key-word arguments to insert into the raster attributes

Returns:

dictionary compatible with rasterio attributes

get_meta(driver: str | None = 'gTiff', **kwargs) Dict[str, Any][source]

Returns a rasterio compatible dictionary with raster dataset metadata.

Parameters:
  • driver – name of the rasterio driver. gTiff (GeoTiff) by default

  • kwargs – additional keyword arguments to append to metadata dictionary or to overwrite defaults such as the “compress” attribute.

Returns:

rasterio compatible metadata dictionary to be used for writing new raster datasets

get_pixels(vector_features: Path | GeoDataFrame)[source]

Extracts pixel values from the raster values of a Band instance.

The extracted band array values are stored in a new column in the returned vector_features GeoDataFrame named like the name of the band.

If you do not want to read the entire raster data first consider using ~eodal.core.Band.read_pixels instead.

Note

Masked pixels are set to the band’s nodata value.

Parameters:

vector_features – file-path or GeoDataFrame to features defining the pixels to read from the Band raster values. The geometries can be of type Point, Polygon or MultiPolygon. In the latter two cases the centroids are used to extract pixel values, whereas for point features the closest raster grid cell is selected.

property has_alias: bool

Checks if a color name can be used for aliasing

hist(ax: Axes | None = None, ylabel: str | None = None, xlabel: str | None = None, fontsize: int | None = 12, **kwargs) Figure[source]

Plots the raster histogram using matplotlib

Parameters:
  • nbins – optional number of histogram bins

  • ax – optional matplotlib.axes object to plot onto

  • ylabel – optional y axis label

  • xlabel – optional x axis label

  • fontsize – fontsize to use for axes labels, plot title and colorbar label. 12 pts by default.

property is_masked_array: bool

Checks if the band values are a numpy masked array

property is_ndarray: bool

Checks if the band values are a numpy ndarray

property is_zarr: bool

Checks if the band values are a zarr array

mask(mask: ndarray, inplace: bool | None = False)[source]

Mask out pixels based on a boolean array.

Note

If the band is already masked, the new mask updates the existing one. I.e., pixels already masked before remain masked.

Parameters:
  • mask – numpy.ndarray of dtype boolean to use as mask. The mask must match the shape of the raster data.

  • inplace – if False (default) returns a copy of the Band instance with the changes applied. If True overwrites the values in the current instance.

Returns:

Band instance if inplace is False, None otherwise.
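
The note on updating an existing mask amounts to an element-wise logical OR. The sketch below shows this with nested Python lists standing in for boolean arrays; update_mask is a hypothetical helper and not part of eodal.

```python
def update_mask(existing, new):
    """Combine two boolean masks of equal shape with an element-wise OR.

    A pixel stays masked if it was masked before or is masked by the
    new mask. Nested lists stand in for two-dimensional boolean arrays.
    """
    same_rows = len(existing) == len(new)
    if not same_rows or any(len(a) != len(b) for a, b in zip(existing, new)):
        raise ValueError("the mask must match the shape of the raster data")
    return [
        [a or b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(existing, new)
    ]


existing = [[True, False], [False, False]]
new = [[False, False], [False, True]]
combined = update_mask(existing, new)
print(combined)
```

With numpy arrays the same update is usually written as a logical OR of the two boolean arrays.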

property meta: Dict[str, Any]

Provides a rasterio compatible dictionary with raster metadata

property ncols: int

Number of columns of the band

property nrows: int

Number of rows of the band

plot(colormap: str | None = 'viridis', discrete_values: bool | None = False, user_defined_colors: ListedColormap | None = None, user_defined_ticks: List[str | int | float] | None = None, colorbar_label: str | None = None, vmin: int | float | None = None, vmax: int | float | None = None, fontsize: int | None = 12, ax: Axes | None = None) Figure[source]

Plots the raster values using matplotlib

Parameters:
  • colormap – String identifying one of matplotlib’s colormaps. The default will plot the band using the viridis colormap.

  • discrete_values – if False (default), assumes that the band has continuous values (i.e., ordinary spectral data). If True, assumes that the data only takes a limited set of discrete values (e.g., in case of a classification or mask layer).

  • user_defined_colors – possibility to pass a custom, i.e., user-created color map object not part of the standard matplotlib color maps. If passed, the colormap argument is ignored.

  • user_defined_ticks – list of ticks to overwrite matplotlib derived defaults (optional).

  • colorbar_label – optional text label to set to the colorbar.

  • vmin – lower value to use for ~matplotlib.pyplot.imshow(). If None it is set to the lower 5% percentile of the data to plot.

  • vmax – upper value to use for ~matplotlib.pyplot.imshow(). If None it is set to the upper 95% percentile of the data to plot.

  • fontsize – fontsize to use for axes labels, plot title and colorbar label. 12 pts by default.

  • ax – optional matplotlib.axes object to plot onto

Returns:

matplotlib figure object with the band data plotted as map

classmethod read_pixels(fpath_raster: Path, vector_features: Path | GeoDataFrame, band_idx: int | None = 1, band_name_src: str | None = '', band_name_dst: str | None = 'B1') GeoDataFrame[source]

Reads single pixel values from a raster dataset into a GeoDataFrame

Note

The pixels to read are defined by a GeoDataFrame or file with vector features understood by fiona. If the geometry type is not Point the centroids will be used for extracting the closest grid cell value.

Parameters:
  • fpath_raster – file-path to the raster dataset from which to extract pixel values

  • vector_features – file-path or GeoDataFrame to features defining the pixels to read from a raster dataset. The geometries can be of type Point, Polygon or MultiPolygon. In the latter two cases the centroids are used to extract pixel values, whereas for point features the closest raster grid cell is selected.

  • band_idx – band index of the raster band to read (starting with 1). If not provided the first band will be always read. Ignored if band_name_src is provided.

  • band_name_src – instead of providing a band index to read (band_idx) a band name can be passed. If provided band_idx is ignored. NOTE: This works only if the raster dataset has band names set in its descriptions (often not the case)!

  • band_name_dst – name of the raster band in the resulting GeoDataFrame (i.e., column name)

Returns:

GeoDataFrame with extracted pixel values. If the vector features defining the sampling points are not within the spatial extent of the raster dataset the pixel values are set to nodata (inferred from the raster source)
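
How a point is matched to a grid cell, and why points outside the raster extent yield nodata, can be sketched as follows. The example assumes a north-up raster without rotation whose pixel size in y direction is negative; sample_pixel and its arguments are hypothetical and only illustrate the lookup, not how eodal or rasterio implement it.

```python
import math


def sample_pixel(grid, ulx, uly, pixres_x, pixres_y, x, y, nodata):
    """Return the value of the grid cell containing (x, y), or nodata.

    Assumptions: (ulx, uly) is the upper left pixel corner, the raster
    is not rotated, and pixres_y is negative (y decreases downwards).
    """
    col = math.floor((x - ulx) / pixres_x)
    row = math.floor((y - uly) / pixres_y)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    # The point lies outside the raster extent.
    return nodata


grid = [[1, 2], [3, 4]]  # 2 x 2 raster with 10 m pixels
inside = sample_pixel(grid, 0.0, 20.0, 10.0, -10.0, x=15.0, y=5.0, nodata=-999)
outside = sample_pixel(grid, 0.0, 20.0, 10.0, -10.0, x=25.0, y=5.0, nodata=-999)
print(inside, outside)
```

For Polygon or MultiPolygon features the same lookup would be applied to the centroid coordinates.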

reduce(method: List[str | Callable[[...], Number]] | None = ['min', 'mean', 'std', 'max', 'count'], by: Path | GeoDataFrame | Polygon | str | None = None, keep_nans: bool | None = False) List[Dict[str, int | float]][source]

Reduces the raster data to scalar values by calling rasterstats.

The reduction can be done on the whole band or by using vector features.

Important

NaNs in the data are handled by rasterstats internally. Therefore, passing numpy nan-functions (e.g., nanmedian) is NOT necessary and users are discouraged from doing so as passing nanmedian will ignore existing masks.

Parameters:
  • method – list of numpy function names and/or custom function prototypes to use for reducing raster data. Please see also the official rasterstats docs (https://pythonhosted.org/rasterstats/manual.html#user-defined-statistics) about how to pass custom functions.

  • by – define optional vector features by which to reduce the band. By passing ‘self’ the method uses the features with which the band was read, otherwise specify a file-path to vector features or provide a GeoDataFrame.

  • keep_nans

    New in version 0.2.0.

    whether to keep or discard results that were nan. This could happen if a feature does not overlap the raster.

Returns:

list of dictionaries with scalar results per feature including their geometry and further attributes
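
The following sketch shows what reducing a set of pixel values to the default statistics means for a single feature. It uses only the Python standard library rather than rasterstats, drops NaN and None entries before computing the statistics, and uses the population standard deviation; reduce_values is a hypothetical helper, and the exact NaN and mask handling inside rasterstats may differ.

```python
import math
import statistics


def reduce_values(values, methods=("min", "mean", "std", "max", "count")):
    """Reduce pixel values to scalar statistics, ignoring NaN and None.

    Assumption: "std" is the population standard deviation. The input
    must contain at least one valid value.
    """
    valid = [v for v in values if v is not None and not math.isnan(v)]
    funcs = {
        "min": min,
        "mean": statistics.fmean,
        "std": statistics.pstdev,
        "max": max,
        "count": len,
    }
    return {name: funcs[name](valid) for name in methods}


stats = reduce_values([1.0, 2.0, 3.0, math.nan])
print(stats)
```

When reducing by vector features, one such dictionary would be produced per feature.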

rename(name: str, alias: bool | None = False, autoupdate_alias: bool | None = True) None[source]

Sets a new band name or alias

Parameters:
  • name – new band name or alias

  • alias – if False (default) renames the actual band name, otherwise changes the alias

  • autoupdate_alias – if True (default) the band alias is set to the same value as the band name if alias==False

reproject(target_crs: int | CRS, interpolation_method: int | None = Resampling.nearest, inplace: bool | None = False, **kwargs)[source]

Projects the raster data into a different spatial coordinate system

Parameters:
  • target_crs – EPSG code of the target spatial coordinate system the raster data should be projected to

  • dst_transform – optional Affine transformation of the raster data in the target spatial coordinate system

  • interpolation_method – interpolation method to use for interpolating grid cells after reprojection. Default is nearest neighbor interpolation.

  • inplace – if False (default) returns a copy of the Band instance with the changes applied. If True overwrites the values in the current instance.

  • kwargs – optional keyword arguments to pass to rasterio.warp.reproject.

Returns:

Band instance if inplace is False, None otherwise.

resample(target_resolution: int | float, interpolation_method: int | None = 6, target_shape: Tuple[int, int] | None = None, inplace: bool | None = False)[source]

Changes the raster grid cell (pixel) size. Nodata pixels are not used for resampling.

Parameters:
  • target_resolution – spatial resolution (grid cell size) in units of the spatial reference system. Applies to x and y direction.

  • interpolation_method – opencv interpolation method. Per default nearest neighbor interpolation is used (~cv2.INTER_NEAREST_EXACT). See the ~cv2 documentation for a list of available methods.

  • target_shape – shape of the output in terms of number of rows and columns. If None (default) the target_shape parameter is inferred from the band data. If you want to make sure the output is aligned with another raster band (co-registered) provide this parameter.

  • inplace – if False (default) returns a copy of the Band instance with the changes applied. If True overwrites the values in the current instance.

Returns:

Band instance if inplace is False, None otherwise.
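
One plausible way to infer an output shape from a target resolution is to keep the spatial extent fixed and divide it by the new cell size. The sketch below illustrates that idea only; infer_target_shape is a hypothetical helper, the rounding rule (rounding up) is an assumption, and eodal may derive the shape differently. Passing target_shape explicitly removes this ambiguity.

```python
import math


def infer_target_shape(nrows, ncols, pixres, target_resolution):
    """Estimate (rows, columns) after changing the grid cell size.

    Assumptions: square pixels, unchanged spatial extent, and partial
    cells at the edge rounded up to a full cell.
    """
    height = nrows * pixres  # extent in y direction, in CRS units
    width = ncols * pixres   # extent in x direction, in CRS units
    return (
        math.ceil(height / target_resolution),
        math.ceil(width / target_resolution),
    )


shape_20m = infer_target_shape(100, 200, pixres=10.0, target_resolution=20.0)
shape_60m = infer_target_shape(100, 200, pixres=10.0, target_resolution=60.0)
print(shape_20m, shape_60m)
```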

scale_data(inplace: bool | None = False, pixel_values_to_ignore: List[int | float] | None = [])[source]

Applies scale and offset factors to the data.

New in version 0.2.3: No-data values are ignored when applying scale and offset.

Parameters:
  • inplace – if False (default) returns a copy of the Band instance with the changes applied. If True overwrites the values in the current instance.

  • pixel_values_to_ignore – optional list of pixel values to ignore, i.e., where scaling has no effect. From version 0.2.3 onwards, no-data values are always ignored.

Returns:

Band instance if inplace is False, None otherwise.

to_dataframe() GeoDataFrame[source]

Returns a GeoDataFrame from the raster band data

Returns:

GeoDataFrame of raster values in the spatial coordinate system of the raster band data. The geometry type is always Point.

to_rasterio(fpath_raster: Path, **kwargs) None[source]

Writes the band data to a raster dataset using rasterio.

Parameters:
  • fpath_raster – file-path to the raster dataset to create. The rasterio driver is identified by the file-name extension. In case jp2 is selected, loss-less compression is carried out.

  • kwargs – additional keyword arguments to append to metadata dictionary used by rasterio to write datasets

to_xarray(attributes: Dict[str, Any] = {}, **kwargs) DataArray[source]

Returns an xarray.DataArray from the raster band data (dime

Note

To ensure consistency with xarray pixel coordinates are shifted from the upper left pixel corner to the center.

Parameters:
  • attributes – additional raster attributes to update or add

  • kwargs – additional key-word arguments to pass to ~xarray.Dataset

Returns:

xarray.DataArray with x and y coordinates. Raster attributes are preserved.
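
The coordinate shift mentioned in the note is a half-pixel offset in both directions. The sketch below computes pixel-center coordinates from the upper left corner; corner_to_center is a hypothetical helper, and it assumes a north-up raster whose pixel size in y direction is negative.

```python
def corner_to_center(ulx, uly, pixres_x, pixres_y, ncols, nrows):
    """Return pixel-center coordinates for every column and row.

    Assumptions: (ulx, uly) is the upper left corner of the upper left
    pixel, and pixres_y is negative because y decreases downwards.
    """
    xs = [ulx + (col + 0.5) * pixres_x for col in range(ncols)]
    ys = [uly + (row + 0.5) * pixres_y for row in range(nrows)]
    return xs, ys


xs, ys = corner_to_center(0.0, 20.0, 10.0, -10.0, ncols=2, nrows=2)
print(xs, ys)
```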

property transform: Affine

Affine transformation of the band

class eodal.core.band.BandOperator[source]

Bases: Operator

Band operator supporting basic algebraic operations on Band objects

classmethod calc(a, other: Number | ndarray, operator: str, inplace: bool | None = False, band_name: str | None = None, right_sided: bool | None = False) None | ndarray[source]

executes a custom algebraic operator on Band objects

Parameters:
  • a – Band object with values (non-empty)

  • other – scalar, Band or two-dimensional numpy.array to use on the right-hand side of the operator. If a numpy.array is passed the array must have the same x and y dimensions as the current Band data.

  • operator – symbolic representation of the operator (e.g., ‘+’ for addition)

  • inplace – returns a new Band object if False (default) otherwise overwrites the current Band data

  • band_name – optional name of the resulting Band object if inplace is False.

  • right_sided – optional flag indicating that the order of a and other has to be switched. False by default. Set to True if the order of the arguments matters, i.e., for right-hand sided expressions in case of subtraction, division and power.

Returns:

numpy.ndarray if inplace is False, None otherwise
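
Why right_sided matters can be seen from a small operator dispatch on scalars. The mapping of symbols to functions below is an assumption for illustration (the set of symbols accepted by calc is not listed here), and apply_operator is a hypothetical helper rather than eodal's implementation.

```python
import operator

# Assumed symbol table for this sketch.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}


def apply_operator(a, other, symbol, right_sided=False):
    """Apply a binary operator, swapping the operands if right_sided."""
    func = OPERATORS[symbol]
    return func(other, a) if right_sided else func(a, other)


print(apply_operator(2, 10, "-"))                    # a - other
print(apply_operator(2, 10, "-", right_sided=True))  # other - a
```

For commutative operators such as addition and multiplication the flag has no effect on the result.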

class eodal.core.band.GeoInfo(epsg: int | CRS, ulx: int | float, uly: int | float, pixres_x: int | float, pixres_y: int | float)[source]

Bases: object

Class for storing geo-localization information required to reference a raster band object in a spatial coordinate system. At its core this class contains all the attributes necessary to define an Affine transformation.

Attrib epsg:

EPSG code of the spatial reference system the raster data is projected to.

Attrib ulx:

upper left x coordinate of the raster band in the spatial reference system defined by the EPSG code. We assume GDAL defaults, therefore the coordinate should refer to the upper left pixel corner.

Attrib uly:

upper left y coordinate of the raster band in the spatial reference system defined by the EPSG code. We assume GDAL defaults, therefore the coordinate should refer to the upper left pixel corner.

Attrib pixres_x:

pixel size (aka spatial resolution) in x direction. The unit is defined by the spatial coordinate system given by the EPSG code.

Attrib pixres_y:

pixel size (aka spatial resolution) in y direction. The unit is defined by the spatial coordinate system given by the EPSG code.

__init__(epsg: int | CRS, ulx: int | float, uly: int | float, pixres_x: int | float, pixres_y: int | float)[source]

Class constructor to get a new GeoInfo instance.

>>> geo_info = GeoInfo(4326, 11., 48., 0.02, 0.02)
>>> affine = geo_info.as_affine()

Parameters:
  • epsg – EPSG code identifying the spatial reference system (e.g., 4326 for WGS84).

  • ulx – upper left x coordinate in units of the spatial reference system. Should refer to the upper left pixel corner.

  • uly – upper left y coordinate in units of the spatial reference system. Should refer to the upper left pixel corner.

  • pixres_x – pixel grid cell size in x direction in units of the spatial reference system.

  • pixres_y – pixel grid cell size in y direction in units of the spatial reference system.

as_affine() Affine[source]

Returns a rasterio.Affine compatible affine transformation

Returns:

GeoInfo instance as rasterio.Affine
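
An affine transformation maps pixel indices to coordinates in the spatial reference system. The sketch below applies the six coefficients (a, b, c, d, e, f) by hand, following the coefficient order used by the affine package. It assumes a non-rotated raster for which a and e hold the pixel sizes and c and f the upper left corner; pixel_to_world and the coordinate values are illustrative.

```python
def pixel_to_world(coeffs, col, row):
    """Map a (col, row) pixel index to (x, y) using affine coefficients.

    coeffs is (a, b, c, d, e, f) with
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    a, b, c, d, e, f = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Illustrative values: 10 m pixels, north-up, no rotation (b = d = 0).
coeffs = (10.0, 0.0, 500000.0, 0.0, -10.0, 5200000.0)
print(pixel_to_world(coeffs, 0, 0))  # upper left corner of the raster
print(pixel_to_world(coeffs, 2, 3))  # corner of the pixel in column 2, row 3
```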

classmethod from_affine(affine: Affine, epsg: int)[source]

Returns a GeoInfo instance from a rasterio.Affine object

Parameters:
  • affine – rasterio.Affine object

  • epsg – EPSG code identifying the spatial coordinate system

Returns:

new GeoInfo instance

class eodal.core.band.WavelengthInfo(central_wavelength: int | float, wavelength_unit: str, band_width: int | float | None = 0.0)[source]

Bases: object

Class for storing information about the spectral wavelength of a raster band. Many optical sensors record data in spectral channels with a central wavelength and spectral band width.

Attrib central_wavelength:

central spectral wavelength.

Attrib band_width:

spectral band width. This is defined as the difference between the upper and lower spectral wavelength a sensor is recording in a spectral channel.

Attrib wavelength_unit:

physical unit in which central_wavelength and band_width are recorded. Usually ‘nm’ (nano-meters) or ‘um’ (micro-meters)

__init__(central_wavelength: int | float, wavelength_unit: str, band_width: int | float | None = 0.0)[source]

Constructor to derive a new WavelengthInfo instance for a (spectral) raster band.

Parameters:
  • central_wavelength – central wavelength of the band

  • wavelength_unit – physical unit in which the wavelength is provided

  • band_width – width of the spectral band (optional). If not provided assumes a width of zero wavelength units.
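
Since band_width is defined as the difference between the upper and lower wavelength of a channel, the channel limits can be derived from the central wavelength. The sketch below assumes that the band is symmetric around its central wavelength, which need not hold for real sensors; wavelength_bounds is a hypothetical helper and the numbers are illustrative.

```python
def wavelength_bounds(central_wavelength, band_width=0.0):
    """Return (lower, upper) wavelength limits of a spectral channel.

    Assumption: the channel is symmetric around its central wavelength.
    """
    half_width = band_width / 2
    return central_wavelength - half_width, central_wavelength + half_width


# Illustrative red channel: 665 nm central wavelength, 30 nm band width.
lower, upper = wavelength_bounds(665.0, 30.0)
print(lower, upper)
```

With the default band width of zero, both limits equal the central wavelength.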