CAB-LAB API Reference

Data Cube read-only access:

from cablab import Cube
from datetime import datetime

# Open an existing cube, then read two variables for a time range at a single latitude/longitude.
cube = Cube.open('./cablab-cube-v05')
data = cube.data.get(['LAI', 'Precip'], [datetime(2001, 6, 1), datetime(2012, 1, 1)], 53.2, 12.8)

Data Cube creation/update:

from cablab import Cube, CubeConfig

# MyVar1SourceProvider and MyVar2SourceProvider stand for user-defined CubeSourceProvider implementations.
cube = Cube.create('./my-cablab-cube', CubeConfig(spatial_res=0.05))
cube.update(MyVar1SourceProvider(cube.config, './my-cube-sources/var1'))
cube.update(MyVar2SourceProvider(cube.config, './my-cube-sources/var2'))

class cablab.BaseCubeSourceProvider(cube_config)

A partial implementation of the CubeSourceProvider interface that computes its output image data using weighted averages. The weights are computed according to the overlap of source time ranges and a requested target time range.

compute_variable_images(period_start, period_end)

For each source time range that overlaps the given target time range, compute a weight from the size of the overlap. Pass these weights as a mapping from source index to weight to compute_variable_images_from_sources(index_to_weight) and return the result.

Returns: A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as given by the CubeConfig. Return None if no such variables exist for the given target time range.
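The weighting step described above can be sketched as follows. This is a minimal illustration, not the library's code: the function name overlap_weights is hypothetical, and the weight formula (overlapping duration divided by the target period duration) is an assumption, since the reference only states that weights follow from the overlap.

```python
from datetime import datetime


def overlap_weights(source_time_ranges, period_start, period_end):
    """Map source index -> weight for each source range overlapping the target period.

    Assumption: weight = overlapping duration / target period duration.
    """
    target_seconds = (period_end - period_start).total_seconds()
    index_to_weight = {}
    for index, (src_start, src_end) in enumerate(source_time_ranges):
        # The overlap is the intersection of the two half-open time ranges.
        overlap_start = max(src_start, period_start)
        overlap_end = min(src_end, period_end)
        overlap_seconds = (overlap_end - overlap_start).total_seconds()
        if overlap_seconds > 0:
            index_to_weight[index] = overlap_seconds / target_seconds
    return index_to_weight


source_time_ranges = [
    (datetime(2001, 1, 1), datetime(2001, 1, 5)),
    (datetime(2001, 1, 5), datetime(2001, 1, 13)),
    (datetime(2001, 1, 13), datetime(2001, 1, 21)),
]
weights = overlap_weights(source_time_ranges, datetime(2001, 1, 1), datetime(2001, 1, 9))
print(weights)
```

Only the first two source ranges intersect the 8-day target period, so the third index does not appear in the mapping.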
compute_variable_images_from_sources(index_to_weight)

Compute the target images for all variables from the sources, using the given mapping of time indices to weights.

The time indices in index_to_weight are guaranteed to point into the time ranges list returned by get_source_time_ranges().

The weight values in index_to_weight are float values computed from the overlap of source time ranges with a requested target time range.

Parameters: index_to_weight – A dictionary mapping time indices –> weight values.
Returns: A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as specified by the cube’s layout configuration CubeConfig. Return None if no such variables exist for the given target time range.
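A subclass might combine its source images along these lines. The sketch below is an assumption-laden illustration: weighted_average_image is a hypothetical helper, images are plain nested lists rather than numpy arrays so that the example needs no third-party packages, and the weights are normalised by their sum, which the reference does not prescribe.

```python
def weighted_average_image(source_images, index_to_weight):
    """Blend the source images selected by index_to_weight into one target image.

    Assumption: weights are normalised by their sum before averaging.
    Returns None when there is nothing to blend.
    """
    total_weight = sum(index_to_weight.values())
    if total_weight == 0:
        return None
    first = source_images[next(iter(index_to_weight))]
    height, width = len(first), len(first[0])
    target = [[0.0] * width for _ in range(height)]
    for index, weight in index_to_weight.items():
        image = source_images[index]
        for y in range(height):
            for x in range(width):
                target[y][x] += image[y][x] * weight / total_weight
    return target


# Two tiny 2 x 2 source images, indexed like the source time ranges list.
source_images = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
images = {'LAI': weighted_average_image(source_images, {0: 0.5, 1: 0.5})}
print(images)
```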
get_source_time_ranges() → list

Return a sorted list of all time ranges of every source file. Items in this list must be 2-element tuples of datetime instances. The list should be pre-computed in the prepare() method.

get_temporal_coverage() → (datetime.datetime, datetime.datetime)

Return the temporal coverage derived from the value returned by get_source_time_ranges().
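One plausible way to derive the coverage from the time ranges list is shown below. The helper name temporal_coverage and the handling of an empty list are assumptions made for illustration.

```python
from datetime import datetime


def temporal_coverage(source_time_ranges):
    """Return (earliest start, latest end) of a list of (start, end) datetime tuples."""
    if not source_time_ranges:
        return None
    ranges = sorted(source_time_ranges)
    start = ranges[0][0]
    # Take the maximum end explicitly in case ranges overlap or nest.
    end = max(time_range[1] for time_range in ranges)
    return start, end


coverage = temporal_coverage([
    (datetime(2001, 1, 9), datetime(2001, 1, 17)),
    (datetime(2001, 1, 1), datetime(2001, 1, 9)),
])
print(coverage)
```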

log(message)

Log a message.

Parameters: message – The message to be logged.

class cablab.Cube(base_dir, config)

Represents a data cube. Use the static open() or create() methods to obtain data cube objects.

base_dir

The cube’s base directory.

close()

Closes the data cube.

closed

Checks if the cube has been closed.

config

The cube’s configuration. See CubeConfig class.

static create(base_dir, config=CubeConfig(spatial_res=0.250000, grid_x0=0, grid_y0=0, grid_width=1440, grid_height=720, temporal_res=8, ref_time=datetime.datetime(2001, 1, 1, 0, 0)))

Create a new data cube. Use the Cube.update(provider) method to add data to the cube via a source data provider.

Parameters:
  • base_dir – The data cube’s base directory. Must not exist.
  • config – The data cube’s static information.
Returns:

A cube instance.

data

The cube’s data. See CubeData class.

info()

Return a human-readable information string about this data cube (markdown formatted).

static open(base_dir)

Open an existing data cube. Use the Cube.update(provider) method to add data to the cube via a source data provider.

Parameters: base_dir – The data cube’s base directory, which must already exist.
Returns: A cube instance.

update(provider)

Updates the data cube with source data from the given source provider.

Parameters: provider – An instance of a class implementing the CubeSourceProvider interface.

class cablab.CubeConfig(spatial_res=0.25, grid_x0=0, grid_y0=0, grid_width=1440, grid_height=720, temporal_res=8, calendar='gregorian', ref_time=datetime.datetime(2001, 1, 1, 0, 0), start_time=datetime.datetime(2001, 1, 1, 0, 0), end_time=datetime.datetime(2011, 1, 1, 0, 0), variables=None, file_format='NETCDF4_CLASSIC', compression=False, model_version='0.1')

A data cube’s static configuration information.

Parameters:
  • spatial_res – The spatial image resolution in degrees.
  • grid_x0 – The fixed grid X offset (longitude direction).
  • grid_y0 – The fixed grid Y offset (latitude direction).
  • grid_width – The fixed grid width in pixels (longitude direction).
  • grid_height – The fixed grid height in pixels (latitude direction).
  • temporal_res – The temporal resolution in days.
  • ref_time – A datetime value which defines the units in which time values are given, namely ‘days since ref_time’.
  • start_time – The inclusive start time of the first image of any variable in the cube given as datetime value. None means unlimited.
  • end_time – The exclusive end time of the last image of any variable in the cube given as datetime value. None means unlimited.
  • variables – A list of variable names to be included in the cube.
  • file_format – The file format used. Must be one of ‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’ or ‘NETCDF3_64BIT’.
  • compression – Whether the data should be compressed.

date2num(date)

Return the number of days for the given date as a number in the time units given by the time_units property.

Parameters: date – The date as a datetime.datetime value.
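The conversion can be pictured as a plain day count from the reference time. The sketch below uses only the standard library; the free functions time_units and date2num and the exact format of the units string are assumptions for illustration, and the library may delegate this work to a calendar-aware routine instead.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400.0


def time_units(ref_time):
    """Build a units string of the form 'days since <ref_time>' (format is an assumption)."""
    return 'days since ' + ref_time.strftime('%Y-%m-%d %H:%M:%S')


def date2num(date, ref_time):
    """Return the (possibly fractional) number of days between ref_time and date."""
    return (date - ref_time).total_seconds() / SECONDS_PER_DAY


ref_time = datetime(2001, 1, 1)
print(time_units(ref_time))
print(date2num(datetime(2001, 1, 9), ref_time))
print(date2num(datetime(2001, 1, 1, 12), ref_time))
```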
easting

The longitude position of the upper-left-most corner of the upper-left-most grid cell given by (grid_x0, grid_y0).

geo_bounds

The geographical boundary given as ((LL-lon, LL-lat), (UR-lon, UR-lat)).

static load(path)

Load a CubeConfig from a text file.

Parameters: path – The file’s path name.
Returns: A new CubeConfig instance.

northing

The latitude position of the upper-left-most corner of the upper-left-most grid cell given by (grid_x0, grid_y0).
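The relationship between the grid parameters and the geographic properties can be sketched as below. The formulas are assumptions, not taken from the library: the grid origin is taken to be the upper-left corner of a global grid at longitude -180 and latitude 90, with easting read as the western edge (a longitude) and northing as the northern edge (a latitude).

```python
def easting(spatial_res, grid_x0):
    """Longitude of the western edge of the grid (assumes a global origin at -180)."""
    return -180.0 + grid_x0 * spatial_res


def northing(spatial_res, grid_y0):
    """Latitude of the northern edge of the grid (assumes a global origin at +90)."""
    return 90.0 - grid_y0 * spatial_res


def geo_bounds(spatial_res, grid_x0, grid_y0, grid_width, grid_height):
    """Return ((LL-lon, LL-lat), (UR-lon, UR-lat)) for the configured grid."""
    west = easting(spatial_res, grid_x0)
    north = northing(spatial_res, grid_y0)
    east = west + grid_width * spatial_res
    south = north - grid_height * spatial_res
    return (west, south), (east, north)


# Default CubeConfig values: 0.25 degree cells on a 1440 x 720 grid.
bounds = geo_bounds(0.25, 0, 0, 1440, 720)
print(bounds)
```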

num_periods_per_year

Return the number of target periods per year.
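As a rough illustration, the period count could follow from the temporal resolution like this. Both the 365-day year and the rounding up of the last, shorter period are assumptions.

```python
import math


def num_periods_per_year(temporal_res):
    """Number of temporal_res-day periods needed to cover a 365-day year (assumed)."""
    return math.ceil(365.0 / temporal_res)


periods = num_periods_per_year(8)
print(periods)
```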

store(path)

Store a CubeConfig in a text file.

Parameters: path – The file’s path name.

time_units

Return the time units used by the data cube as a string using the format ‘days since ref_time’.

class cablab.CubeData(cube)

Represents the cube’s read-only data.

Parameters: cube – A Cube object.

close()

Closes this CubeData by closing all open datasets.

get(variable=None, time=None, latitude=None, longitude=None)

Get the cube’s data.

Parameters:
  • variable – a variable index or name, or an iterable of these (var1, var2, ...)
  • time – a single datetime.datetime object or a 2-element iterable (time_start, time_end)
  • latitude – a single latitude value or a 2-element iterable (latitude_start, latitude_end)
  • longitude – a single longitude value or a 2-element iterable (longitude_start, longitude_end)
Returns:

a dictionary mapping variable names –> data arrays of dimension (time, latitude, longitude)
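To make the "single value or 2-element iterable" convention concrete, the sketch below normalises such arguments and maps them to grid index ranges. Everything here is hypothetical: the helper names, the inclusive index ranges, and the assumption of a global grid whose rows count southwards from latitude 90 and whose columns count eastwards from longitude -180.

```python
import math


def to_range(value):
    """Normalise a single value or a 2-element iterable to a (start, end) tuple."""
    try:
        start, end = value
    except TypeError:
        # A scalar cannot be unpacked, so it selects a single position.
        start = end = value
    return start, end


def lat_lon_to_index_ranges(latitude, longitude, spatial_res=0.25, grid_x0=0, grid_y0=0):
    """Return ((row_start, row_end), (col_start, col_end)) as inclusive grid indices."""
    lat_1, lat_2 = to_range(latitude)
    lon_1, lon_2 = to_range(longitude)
    # Rows count from the northern edge, so the larger latitude gives the smaller row.
    row_start = math.floor((90.0 - max(lat_1, lat_2)) / spatial_res) - grid_y0
    row_end = math.floor((90.0 - min(lat_1, lat_2)) / spatial_res) - grid_y0
    col_start = math.floor((min(lon_1, lon_2) + 180.0) / spatial_res) - grid_x0
    col_end = math.floor((max(lon_1, lon_2) + 180.0) / spatial_res) - grid_x0
    return (row_start, row_end), (col_start, col_end)


point = lat_lon_to_index_ranges(53.2, 12.8)
box = lat_lon_to_index_ranges((50.0, 55.0), (10.0, 15.0))
print(point, box)
```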

get_variable(var_index)

Get a cube variable. Same as, e.g. cube.data['Ozone'].

Parameters: var_index – The variable name or index according to the list returned by the variables property.
Returns: a data-access object representing the variable with the dimensions (time, latitude, longitude).

shape

Return the shape of the data cube.

variable_names

Return a dictionary of variable names to indices.

class cablab.CubeSourceProvider(cube_config)

An abstract interface for objects representing data source providers for the data cube. Cube source providers are passed to the Cube.update() method.

Parameters: cube_config – Specifies the fixed layout and conventions used for the cube.

close()

Called by the cube’s update() method after all images have been retrieved and the provider is no longer used.

compute_variable_images(period_start, period_end) → dict

Return a variable name to variable image mapping of all provided variables. Each image is a numpy array with the shape (height, width) derived from the get_spatial_coverage() method.

The images must be computed (by aggregation or interpolation or copy) from the source data in the given time period period_start <= source_data_time < period_end and taking into account other data cube configuration settings.

The method is called by a Cube instance’s update() method for all possible time periods in the time range given by the get_temporal_coverage() method. The times given are adjusted w.r.t. the cube’s reference time and temporal resolution.

Parameters:
  • period_start – The period start time as a datetime.datetime instance
  • period_end – The period end time as a datetime.datetime instance
Returns:

A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as given by the CubeConfig. Return None if no such variables exist for the given target time range.
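The sequence of target periods passed to this method can be pictured with the sketch below. It is a simplification under stated assumptions: iter_target_periods is a hypothetical helper, periods are taken to repeat every temporal_res days counted from ref_time, and any special handling of year boundaries is ignored.

```python
from datetime import datetime, timedelta


def iter_target_periods(coverage_start, coverage_end, ref_time, temporal_res):
    """Yield (period_start, period_end) pairs aligned to ref_time + k * temporal_res days."""
    step = timedelta(days=temporal_res)
    # Snap the first period start down to the period grid anchored at ref_time.
    steps_before = (coverage_start - ref_time) // step
    period_start = ref_time + steps_before * step
    while period_start < coverage_end:
        yield period_start, period_start + step
        period_start += step


periods = list(iter_target_periods(
    datetime(2001, 1, 3), datetime(2001, 1, 20),
    ref_time=datetime(2001, 1, 1), temporal_res=8))
for period_start, period_end in periods:
    print(period_start, period_end)
```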

cube_config

The data cube’s configuration.

get_spatial_coverage() → tuple

Return the spatial coverage as a rectangle represented by a tuple of integers (x, y, width, height) in the cube’s image coordinates.

Returns: A tuple of integers (x, y, width, height) in the cube’s image coordinates.
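The following sketch shows which grid cells such a rectangle addresses. The crop helper and the nested-list grid are illustrative assumptions; x is read as the column offset and y as the row offset of the rectangle's upper-left cell.

```python
def crop(grid, coverage):
    """Return the sub-grid addressed by an (x, y, width, height) rectangle."""
    x, y, width, height = coverage
    return [row[x:x + width] for row in grid[y:y + height]]


# A 4-row by 6-column grid whose cell values encode their position as 10 * row + column.
grid = [[10 * row + col for col in range(6)] for row in range(4)]
sub_grid = crop(grid, (1, 2, 3, 2))
print(sub_grid)
```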
get_temporal_coverage() → tuple

Return the start and end time of the available source data.

Returns: A tuple of datetime.datetime instances (start_time, end_time).

get_variable_descriptors() → dict

Return a variable name to variable descriptor mapping of all provided variables. Each descriptor is a dictionary of variable attribute names to their values. The attributes data_type (a numpy data type) and fill_value are mandatory.

Returns: A dictionary of variable names to attribute dictionaries.
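A descriptor mapping and a check for the two mandatory attributes might look like the sketch below. The helper missing_attributes, the 'units' attribute, and the fill value -9999.0 are hypothetical, and the string 'float32' stands in for a numpy data type so that the example runs without numpy.

```python
MANDATORY_ATTRIBUTES = ('data_type', 'fill_value')


def missing_attributes(descriptors):
    """Return {variable name: [missing mandatory attribute names]} for incomplete descriptors."""
    problems = {}
    for name, attributes in descriptors.items():
        missing = [key for key in MANDATORY_ATTRIBUTES if key not in attributes]
        if missing:
            problems[name] = missing
    return problems


descriptors = {
    'LAI': {'data_type': 'float32', 'fill_value': -9999.0, 'units': '1'},
    'Precip': {'data_type': 'float32'},  # fill_value deliberately left out
}
problems = missing_attributes(descriptors)
print(problems)
```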
name

The provider’s name.

prepare()

Called by a Cube instance’s update() method before any other provider methods are called. Provider instances should prepare themselves w.r.t. the given cube configuration cube_config.
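The call order described for prepare(), compute_variable_images() and close() can be illustrated with a stand-in provider. This is a sketch only: RecordingProvider and run_update are hypothetical, integers replace datetime values for brevity, and whether the real update() guards close() with try/finally is an assumption.

```python
class RecordingProvider:
    """Hypothetical stand-in for a source provider that records the order of calls."""

    def __init__(self):
        self.calls = []

    def prepare(self):
        self.calls.append('prepare')

    def get_temporal_coverage(self):
        self.calls.append('get_temporal_coverage')
        return 0, 3  # integers stand in for datetime values

    def compute_variable_images(self, period_start, period_end):
        self.calls.append('compute_variable_images')
        return None  # no variables available for this period

    def close(self):
        self.calls.append('close')


def run_update(provider):
    """Drive a provider through one update: prepare first, close last."""
    provider.prepare()
    try:
        start, end = provider.get_temporal_coverage()
        for period_start in range(start, end):
            provider.compute_variable_images(period_start, period_start + 1)
    finally:
        provider.close()


provider = RecordingProvider()
run_update(provider)
print(provider.calls)
```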