CAB-LAB API Reference
Data Cube read-only access:
from cablab import Cube
from datetime import datetime
cube = Cube.open('./cablab-cube-v05')
data = cube.data.get(['LAI', 'Precip'], [datetime(2001, 6, 1), datetime(2012, 1, 1)], 53.2, 12.8)
Data Cube creation/update:
from cablab import Cube, CubeConfig
from datetime import datetime
cube = Cube.create('./my-cablab-cube', CubeConfig(spatial_res=0.05))
cube.update(MyVar1SourceProvider(cube.config, './my-cube-sources/var1'))
cube.update(MyVar2SourceProvider(cube.config, './my-cube-sources/var2'))

class cablab.BaseCubeSourceProvider(cube_config)
A partial implementation of the CubeSourceProvider interface that computes its output image data using weighted averages. The weights are computed from the overlap between each source time range and the requested target time range.

compute_variable_images(period_start, period_end)
For each source time range that overlaps the given target time range, compute a weight according to the overlapping range. Pass these weights as a mapping of source index to weight to compute_variable_images_from_sources(index_to_weight) and return the result.
Returns: A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as given by the CubeConfig. Return None if no such variables exist for the given target time range.
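The overlap weighting described above can be sketched without the library. This is a minimal illustration, not the cablab implementation: the helper name overlap_weights is hypothetical, and normalizing each overlap by the length of the target period is an assumption, since the reference does not state how the weights are scaled.

```python
from datetime import datetime


def overlap_weights(source_time_ranges, period_start, period_end):
    """Map source index -> fraction of the target period covered by that source.

    Hypothetical helper; dividing by the target period length is an assumption.
    """
    period_seconds = (period_end - period_start).total_seconds()
    index_to_weight = {}
    for index, (source_start, source_end) in enumerate(source_time_ranges):
        # The overlap is the intersection of the source and target intervals.
        overlap_start = max(source_start, period_start)
        overlap_end = min(source_end, period_end)
        overlap_seconds = (overlap_end - overlap_start).total_seconds()
        if overlap_seconds > 0:
            index_to_weight[index] = overlap_seconds / period_seconds
    return index_to_weight


# Three 5-day sources and one 8-day target period starting at the first source.
source_time_ranges = [
    (datetime(2001, 1, 1), datetime(2001, 1, 6)),
    (datetime(2001, 1, 6), datetime(2001, 1, 11)),
    (datetime(2001, 1, 11), datetime(2001, 1, 16)),
]
weights = overlap_weights(source_time_ranges, datetime(2001, 1, 1), datetime(2001, 1, 9))
print(weights)
```

Sources that do not intersect the target period get no entry at all, which matches the statement that only overlapping source time ranges receive a weight.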

compute_variable_images_from_sources(index_to_weight)
Compute the target images for all variables from the sources, using the given mapping of time indices to weights.
The time indices in index_to_weight are guaranteed to point into the list of time ranges returned by get_source_time_ranges().
The weight values in index_to_weight are float values computed from the overlap of source time ranges with a requested target time range.
Parameters: index_to_weight – A dictionary mapping time indices –> weight values.
Returns: A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as specified by the cube’s layout configuration CubeConfig. Return None if no such variables exist for the given target time range.
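A subclass would typically turn index_to_weight into one image by averaging the source images. The following is only a sketch under stated assumptions: weighted_average_image is a hypothetical helper, plain nested lists stand in for numpy arrays to keep the example dependency-free, and dividing by the sum of the weights is an assumption rather than documented behaviour.

```python
def weighted_average_image(source_images, index_to_weight):
    """Average the source images selected by index_to_weight.

    Hypothetical helper. Returns None when no source contributes,
    mirroring the documented "Return None" convention.
    """
    if not index_to_weight:
        return None
    total_weight = sum(index_to_weight.values())
    height = len(source_images[0])
    width = len(source_images[0][0])
    result = [[0.0] * width for _ in range(height)]
    for index, weight in index_to_weight.items():
        image = source_images[index]
        for y in range(height):
            for x in range(width):
                # Normalize so that the weights effectively sum to one.
                result[y][x] += image[y][x] * weight / total_weight
    return result


# Two 2x2 source images; the first one covers most of the target period.
source_images = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
averaged = weighted_average_image(source_images, {0: 0.75, 1: 0.25})
print(averaged)
```

With real numpy arrays the inner loops collapse to a single weighted sum over the selected images.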

get_source_time_ranges() → list
Return a sorted list of the time ranges of all source files. Items in this list must be 2-element tuples of datetime instances. The list should be pre-computed in the prepare() method.

class cablab.Cube(base_dir, config)
Represents a data cube. Use the static open() or create() methods to obtain data cube objects.

base_dir
The cube’s base directory.

closed
Checks whether the cube has been closed.

config
The cube’s configuration. See the CubeConfig class.

static create(base_dir, config=CubeConfig(spatial_res=0.250000, grid_x0=0, grid_y0=0, grid_width=1440, grid_height=720, temporal_res=8, ref_time=datetime.datetime(2001, 1, 1, 0, 0)))
Create a new data cube. Use the Cube.update(provider) method to add data to the cube via a source data provider.
Parameters:
- base_dir – The data cube’s base directory. Must not exist.
- config – The data cube’s static information.
Returns: A cube instance.

data
The cube’s data. See the CubeData class.

info()
Return a human-readable information string about this data cube (markdown formatted).

class cablab.CubeConfig(spatial_res=0.25, grid_x0=0, grid_y0=0, grid_width=1440, grid_height=720, temporal_res=8, calendar='gregorian', ref_time=datetime.datetime(2001, 1, 1, 0, 0), start_time=datetime.datetime(2001, 1, 1, 0, 0), end_time=datetime.datetime(2011, 1, 1, 0, 0), variables=None, file_format='NETCDF4_CLASSIC', compression=False, model_version='0.1')
A data cube’s static configuration information.
Parameters:
- spatial_res – The spatial image resolution in degrees.
- grid_x0 – The fixed grid X offset (longitude direction).
- grid_y0 – The fixed grid Y offset (latitude direction).
- grid_width – The fixed grid width in pixels (longitude direction).
- grid_height – The fixed grid height in pixels (latitude direction).
- temporal_res – The temporal resolution in days.
- ref_time – A datetime value which defines the units in which time values are given, namely ‘days since ref_time’.
- start_time – The inclusive start time of the first image of any variable in the cube, given as a datetime value. None means unlimited.
- end_time – The exclusive end time of the last image of any variable in the cube, given as a datetime value. None means unlimited.
- variables – A list of variable names to be included in the cube.
- file_format – The file format used. Must be one of ‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’ or ‘NETCDF3_64BIT’.
- compression – Whether the data should be compressed.

date2num(date)
Return the number of days for the given date as a number in the time units given by the time_units property.
Parameters: date – The date as a datetime.datetime value.
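To make the ‘days since ref_time’ convention concrete, here is a small sketch using only the standard library. The functions days_since and time_units_string are hypothetical stand-ins, not the library’s code, and the exact date formatting of the units string is an assumption.

```python
from datetime import datetime


def days_since(ref_time, date):
    """Number of (possibly fractional) days between ref_time and date."""
    return (date - ref_time).total_seconds() / 86400.0


def time_units_string(ref_time):
    """Build a 'days since ...' units string; the date format is an assumption."""
    return 'days since ' + ref_time.strftime('%Y-%m-%d %H:%M')


ref_time = datetime(2001, 1, 1)
whole_days = days_since(ref_time, datetime(2001, 1, 9))
half_day = days_since(ref_time, datetime(2001, 1, 1, 12))
units = time_units_string(ref_time)
print(whole_days, half_day, units)
```

With the default ref_time of 2001-01-01 and a temporal_res of 8 days, the second target period would start at day 8 in these units.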

easting
The longitude position of the upper-left-most corner of the upper-left-most grid cell given by (grid_x0, grid_y0).

geo_bounds
The geographical boundary given as ((LL-lon, LL-lat), (UR-lon, UR-lat)).

static load(path)
Load a CubeConfig from a text file.
Parameters: path – The file’s path name.
Returns: A new CubeConfig instance.

northing
The latitude position of the upper-left-most corner of the upper-left-most grid cell given by (grid_x0, grid_y0).
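The relationship between the grid parameters and the easting, northing and geo_bounds properties can be sketched as follows. This treats the X offset as a longitude and the Y offset as a latitude, as the grid_x0 and grid_y0 parameter descriptions indicate. It also assumes a global grid whose pixel (0, 0) has its upper-left corner at longitude -180 and latitude 90; that origin is an assumption and is not stated in this reference. The function names mirror the property names for readability only.

```python
def easting(grid_x0, spatial_res):
    """Longitude of the upper-left corner of the grid (assumed origin: -180)."""
    return -180.0 + grid_x0 * spatial_res


def northing(grid_y0, spatial_res):
    """Latitude of the upper-left corner of the grid (assumed origin: +90)."""
    return 90.0 - grid_y0 * spatial_res


def geo_bounds(grid_x0, grid_y0, grid_width, grid_height, spatial_res):
    """Return ((LL-lon, LL-lat), (UR-lon, UR-lat)) for the grid."""
    west = easting(grid_x0, spatial_res)
    north = northing(grid_y0, spatial_res)
    east = west + grid_width * spatial_res
    south = north - grid_height * spatial_res
    return ((west, south), (east, north))


# Default CubeConfig values from the signature: a 1440 x 720 grid at 0.25 degrees.
bounds = geo_bounds(grid_x0=0, grid_y0=0, grid_width=1440, grid_height=720, spatial_res=0.25)
print(bounds)
```

Under these assumptions the default configuration covers the whole globe.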

num_periods_per_year
Return the number of target periods per year.

time_units
Return the time units used by the data cube as a string in the format ‘days since ref_time’.

class cablab.CubeData(cube)
Represents the cube’s read-only data.
Parameters: cube – A Cube object.

get(variable=None, time=None, latitude=None, longitude=None)
Get the cube’s data.
Parameters:
- variable – a variable index or name, or an iterable yielding several of these (var1, var2, ...)
- time – a single datetime.datetime object or a 2-element iterable (time_start, time_end)
- latitude – a single latitude value or a 2-element iterable (latitude_start, latitude_end)
- longitude – a single longitude value or a 2-element iterable (longitude_start, longitude_end)
Returns: a dictionary mapping variable names –> data arrays of dimension (time, latitude, longitude)
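The time, latitude and longitude arguments each accept either a single value or a 2-element iterable. One way to handle that convention is to normalize every argument to a (start, end) pair first. The helper as_range below is hypothetical and only illustrates the calling convention. Treating None as "no constraint" is an assumption based on the None defaults in the signature.

```python
from datetime import datetime


def as_range(value):
    """Normalize a single value or a 2-element iterable to a (start, end) tuple.

    Hypothetical helper; None is passed through and presumably means
    "do not constrain this dimension".
    """
    if value is None:
        return None
    try:
        start, end = value
    except TypeError:
        # Scalars such as floats or datetimes cannot be unpacked.
        return (value, value)
    return (start, end)


point = as_range(53.2)
interval = as_range([50.0, 55.0])
period = as_range((datetime(2001, 6, 1), datetime(2012, 1, 1)))
print(point, interval, period)
```

An iterable with the wrong number of elements raises ValueError from the unpacking, which is a reasonable failure mode for a malformed range.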

get_variable(var_index)
Get a cube variable. Same as, e.g., cube.data['Ozone'].
Parameters: var_index – The variable name or index according to the list returned by the variables property.
Returns: a data-access object representing the variable with the dimensions (time, latitude, longitude).

shape
Return the shape of the data cube.

variable_names
Return a dictionary of variable names to indices.

class cablab.CubeSourceProvider(cube_config)
An abstract interface for objects representing data source providers for the data cube. Cube source providers are passed to the Cube.update() method.
Parameters: cube_config – Specifies the fixed layout and conventions used for the cube.

close()
Called by the cube’s update() method after all images have been retrieved and the provider is no longer used.

compute_variable_images(period_start, period_end) → dict
Return a mapping of variable name to variable image for all provided variables. Each image is a numpy array with the shape (height, width) derived from the get_spatial_coverage() method.
The images must be computed (by aggregation, interpolation or copy) from the source data in the given time period, period_start <= source_data_time < period_end, taking into account the other data cube configuration settings.
The method is called by a Cube instance’s update() method for all possible time periods in the time range given by the get_temporal_coverage() method. The times given are adjusted with respect to the cube’s reference time and temporal resolution.
Parameters:
- period_start – The period start time as a datetime.datetime instance
- period_end – The period end time as a datetime.datetime instance
Returns: A dictionary variable name –> image. Each image must be a numpy array-like object of shape (grid_height, grid_width) as given by the CubeConfig. Return None if no such variables exist for the given target time range.

cube_config
The data cube’s configuration.

get_spatial_coverage() → tuple
Return the spatial coverage as a rectangle represented by a tuple of integers (x, y, width, height) in the cube’s image coordinates.
Returns: A tuple of integers (x, y, width, height) in the cube’s image coordinates.

get_temporal_coverage() → tuple
Return the start and end time of the available source data.
Returns: A tuple of datetime.datetime instances (start_time, end_time).

get_variable_descriptors() → dict
Return a mapping of variable name to variable descriptor for all provided variables. Each descriptor is a dictionary of variable attribute names to their values. The attributes data_type (a numpy data type) and fill_value are mandatory.
Returns: A dictionary of variable names to attribute dictionaries.

name
The provider’s name.
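
To show how the pieces of the provider interface fit together, here is a toy provider written against a stand-in base class. Everything in it is illustrative: ProviderInterface and ConstantProvider are hypothetical names, the stand-in does not derive from cablab.CubeSourceProvider, a plain dictionary replaces CubeConfig, nested lists replace numpy arrays, and the string 'float32' replaces a numpy data type.

```python
from datetime import datetime


class ProviderInterface:
    """Hypothetical stand-in that mirrors the documented provider methods."""

    def __init__(self, cube_config):
        self.cube_config = cube_config


class ConstantProvider(ProviderInterface):
    """Toy provider serving one variable filled with a constant value."""

    def __init__(self, cube_config, value):
        super().__init__(cube_config)
        self.value = value

    @property
    def name(self):
        return 'constant'

    def get_temporal_coverage(self):
        # (start_time, end_time) of the available source data.
        return (datetime(2001, 1, 1), datetime(2002, 1, 1))

    def get_spatial_coverage(self):
        # (x, y, width, height) in the cube's image coordinates.
        return (0, 0, self.cube_config['grid_width'], self.cube_config['grid_height'])

    def get_variable_descriptors(self):
        # data_type and fill_value are the mandatory attributes.
        return {'Constant': {'data_type': 'float32', 'fill_value': -9999.0}}

    def compute_variable_images(self, period_start, period_end):
        start, end = self.get_temporal_coverage()
        if period_end <= start or period_start >= end:
            return None  # no data for this target period
        _, _, width, height = self.get_spatial_coverage()
        image = [[self.value] * width for _ in range(height)]
        return {'Constant': image}

    def close(self):
        pass  # nothing to release in this toy example


provider = ConstantProvider({'grid_width': 4, 'grid_height': 2}, 1.5)
images = provider.compute_variable_images(datetime(2001, 1, 1), datetime(2001, 1, 9))
outside = provider.compute_variable_images(datetime(2003, 1, 1), datetime(2003, 1, 9))
print(provider.name, images, outside)
```

A real provider would subclass the library’s CubeSourceProvider (or BaseCubeSourceProvider) and be passed to Cube.update(), as in the creation example at the top of this page.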