Welcome to phconvert’s documentation!

Version: 0.9+43.g3a86e58 (release notes)

phconvert is a Python 2 & 3 library that helps write valid Photon-HDF5 files. This document contains the API documentation for phconvert.

The phconvert library contains two main modules: hdf5 and loader. The former contains functions to save and validate Photon-HDF5 files; the latter contains functions to load other formats to be converted to Photon-HDF5.

The phconvert repository contains a set of notebooks to convert existing formats to Photon-HDF5 or to write Photon-HDF5 files from scratch.

In particular, see the notebook Writing Photon-HDF5 files (read online) for an example of writing Photon-HDF5 files from scratch.

Finally, the phconvert repository contains a JSON specification of the Photon-HDF5 format, which lists all the valid field names with their data types and descriptions.

Contents:

Module hdf5

The module hdf5 defines functions to save and validate Photon-HDF5 files. The two main functions in this module are save_photon_hdf5() and assert_valid_photon_hdf5().

This module also provides functions to save a free-form dict to HDF5 (dict_to_group()) and to read an HDF5 group into a dict (dict_from_group()). Finally, there are utility functions to easily print HDF5 nodes and attributes (print_children(), print_attrs()).

For more info see: Writing Photon-HDF5 files.

List of functions

Main functions to save and validate Photon-HDF5 files.

phconvert.hdf5.save_photon_hdf5(data_dict, h5_fname=None, h5file=None, user_descr=None, overwrite=False, compression={'complevel': 6, 'complib': 'zlib'}, close=True, validate=True, warnings=True, skip_measurement_specs=False, require_setup=True, debug=False)

Saves the dict data_dict in the Photon-HDF5 format.

This function receives the data to be saved in the data_dict argument. The data needs to have the hierarchical structure of a Photon-HDF5 file. For this purpose, we use a standard Python dictionary: each key is a Photon-HDF5 field name and each value contains data (e.g. an array, a string, etc.) or another dictionary (in which case it represents an HDF5 sub-group). Similarly, sub-dictionaries contain data or other dictionaries, as needed to represent the hierarchy of Photon-HDF5 files.

Features of this function:

  • Checks that all field names are valid Photon-HDF5 field names.
  • Checks that all field types match the Photon-HDF5 specs (scalar, array, or string).
  • Automatically populates the identity group with the filename, software, version and file creation date.
  • Automatically populates the provenance group with info on the original data file (if it can be found on disk): creation and modification date, path.
  • Computes the field acquisition_duration when not provided (single-spot data only).

Minimal fields required to create a Photon-HDF5 file (a usage sketch follows this list):

  • /description (string)
  • /photon_data/timestamps (array)
  • /photon_data/timestamps_specs/timestamps_unit (scalar float)
  • /setup/num_pixels (int): number of detectors
  • /setup/num_spots (int): number of excitation/detection spots
  • /setup/num_spectral_ch (int): number of detection spectral bands
  • /setup/num_polarization_ch (int): number of detected polarization states
  • /setup/num_split_ch (int): number of beam split channels
  • /setup/modulated_excitation (bool): True if there is any form of intensity or polarization modulation or interleaved excitation (PIE or nsALEX). This field became obsolete in version 0.5 and is maintained only for compatibility.
  • /setup/excitation_alternated (array of bool): New in version 0.5. Values are True if the respective excitation source is intensity-modulated. In us-ALEX both sources are alternated, while in PAX measurements only one source is alternated.
  • /setup/lifetime (bool): True if dataset contains TCSPC data.
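
For example, a minimal sketch of building such a dict and saving it (all values below are illustrative placeholders, not data from a real measurement):

import numpy as np
import phconvert as phc

# Illustrative timestamps; real ones would come from an acquisition file.
timestamps = np.array([0, 120, 541, 1502], dtype='int64')

data = dict(
    description='A minimal Photon-HDF5 file (illustrative data).',
    photon_data=dict(
        timestamps=timestamps,
        timestamps_specs=dict(timestamps_unit=10e-9),  # 10 ns clock period
    ),
    setup=dict(
        num_pixels=1,
        num_spots=1,
        num_spectral_ch=1,
        num_polarization_ch=1,
        num_split_ch=1,
        modulated_excitation=False,
        excitation_alternated=(False,),
        lifetime=False,
    ),
)

phc.hdf5.save_photon_hdf5(data, h5_fname='minimal.h5', overwrite=True)

Warnings may be printed for missing optional fields, but a file containing only the fields above is valid.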

See also Writing Photon-HDF5 files.

As a side effect, data_dict is modified by adding the key ‘_data_file’, containing a reference to the pytables file.

Parameters:
  • data_dict (dict) – the dictionary containing the photon data. The keys must be strings matching valid Photon-HDF5 paths. The values must be scalars, arrays, strings or other dicts.
  • h5_fname (string or None) – file name for the output Photon-HDF5 file. If None and h5file is also None, the file name is taken from data_dict['_filename'] with extension changed to ‘.hdf5’.
  • h5file (pytables.File or None) – an already open and writable HDF5 file to use as container. This argument can be used to complete an HDF5 file already containing some arrays, or to update an already existing Photon-HDF5 file in-place. For more info see note below.
  • user_descr (dict or None) – dictionary of descriptions (strings) for user-defined fields. The keys must be strings representing the full HDF5 path of each field. The values must be binary (i.e. encoded) strings restricted to the ASCII set.
  • overwrite (bool) – if True, a pre-existing HDF5 file with the same name is overwritten. If False, the new file is saved with the suffix “_new_copy” appended (and, if a “_new_copy” file is already present, it is overwritten).
  • compression (dict) – a dictionary containing the compression type and level. Passed to pytables tables.Filters().
  • close (bool) – If True (default) the HDF5 file is closed before returning. If False the file is left open.
  • validate (bool) – if True, after saving perform a validation step raising an error if the specs are not followed.
  • warnings (bool) – if True, print warnings for important optional fields that are missing. If False, don’t print warnings.
  • skip_measurement_specs (bool) – if True don’t print any warning for missing measurement_specs group.
  • require_setup (bool) – if True, raises an error if some mandatory fields in /setup are missing. If False, allows missing setup fields (or missing setup altogether). Use False when saving only detectors’ dark counts.
  • debug (bool) – if True prints additional debug information.

For description and specs of the Photon-HDF5 format see: http://photon-hdf5.readthedocs.org/

Note

The argument h5file accepts an already open HDF5 file for storage. This allows completing a partially written file (for example, one containing only photon_data arrays) or correcting an already complete Photon-HDF5 file. When using h5file, you need to pass a full data_dict structure as usual. If you don’t want to update an array, put in data_dict a reference to the existing pytables array (instead of a numpy array). Fields containing numpy arrays will be overwritten. Fields containing pytables Arrays (including CArray or EArray) will be left unmodified. In either case, the TITLE attribute is always updated.
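
A rough sketch of this in-place update workflow (the file name is a placeholder, and we assume that dict_from_group() with read=False returns references to the existing pytables arrays):

import tables
import phconvert as phc

h5file = tables.open_file('measurement.h5', mode='a')  # placeholder name

# Assumption: read=False keeps pytables-array references, so
# save_photon_hdf5() leaves those arrays unmodified on disk.
data = phc.hdf5.dict_from_group(h5file.root, read=False)
data['setup']['num_pixels'] = 2  # e.g. correct a wrong metadata value

phc.hdf5.save_photon_hdf5(data, h5file=h5file)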

phconvert.hdf5.assert_valid_photon_hdf5(datafile, warnings=True, verbose=False, strict_description=True, require_setup=True, skip_measurement_specs=False)

Asserts that datafile follows the Photon-HDF5 specs.

If the input datafile does not follow the specifications, it raises the Invalid_PhotonHDF5 exception, with a message indicating the cause of the error.
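
For example, a minimal validation sketch (‘measurement.h5’ is a placeholder file name):

import phconvert as phc

try:
    phc.hdf5.assert_valid_photon_hdf5('measurement.h5')
    print('The file follows the Photon-HDF5 specs.')
except phc.hdf5.Invalid_PhotonHDF5 as err:
    print('Invalid Photon-HDF5 file:', err)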

This function checks that:

  • all fields are valid Photon-HDF5 names
  • all fields have valid descriptions
  • all mandatory fields are present
  • if /setup/lifetime is True (i.e. 1), checks that nanotimes and nanotimes_specs are present
Parameters:
  • datafile (string or tables.File) – input data file to be validated
  • warnings (bool) – if True, print warnings for important optional fields that are missing. If False, don’t print warnings.
  • verbose (bool) – if True print details about the performed tests.
  • strict_description (bool) – if True consider a non-conforming description (TITLE) a specs violation.
  • require_setup (bool) – if True, raises an error if some mandatory fields in /setup are missing. If False, allows missing setup fields (or missing setup altogether).
  • skip_measurement_specs (bool) – if True don’t print any warning for missing measurement_specs group.

Utility functions

Utility functions to work with HDF5 files in pytables.

phconvert.hdf5.print_children(group)

Print all the sub-groups and leaf-node children of group.

Parameters: group (pytables group) – the group to be printed.
phconvert.hdf5.print_attrs(node, which='user')

Print the HDF5 attributes of node.

Parameters:
  • node (pytables node) – node whose attributes will be printed. Can be either a group or a leaf-node.
  • which (string) – Valid values are ‘user’ for user-defined attributes, ‘sys’ for pytables-specific attributes and ‘all’ to print both groups of attributes. Default ‘user’.
phconvert.hdf5.dict_from_group(group, read=True)

Return a dict with the content of a PyTables group.

phconvert.hdf5.dict_to_group(group, dictionary)

Save dictionary into HDF5 format in group.
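
A small round-trip sketch of these two functions (file name and dict content are made up):

import tables
import phconvert as phc

metadata = {'sample': 'dsDNA', 'buffer': {'name': 'TE50', 'NaCl_mM': 50}}

with tables.open_file('notes.h5', mode='w') as h5file:
    group = h5file.create_group('/', 'user_metadata')
    phc.hdf5.dict_to_group(group, metadata)
    roundtrip = phc.hdf5.dict_from_group(group)  # same content as metadata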

Module loader

This module contains functions to load each supported data format. Each loader function loads data from a third-party format into a python dictionary which has the structure of a Photon-HDF5 file. These dictionaries can be passed to phconvert.hdf5.save_photon_hdf5() to save the data in Photon-HDF5 format.

The loader module contains high-level functions which “fill” the dictionary with the appropriate arrays. The actual decoding of the input binary files is performed by low-level functions in other modules (smreader.py, pqreader.py, bhreader.py). When trying to decode a new file format, these modules can provide useful examples.
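
The typical usage pattern is the same for all loaders; for example (file name and channel parameters are placeholders):

import phconvert as phc

d = phc.loader.nsalex_pq('measurement.ptu', donor=0, acceptor=1)
d['description'] = 'ns-ALEX measurement'  # add/adjust mandatory fields
phc.hdf5.save_photon_hdf5(d, h5_fname='measurement.h5')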

phconvert.loader.nsalex_bh(filename_spc, donor=4, acceptor=6, alex_period_donor=(10, 1500), alex_period_acceptor=(2000, 3500), excitation_wavelengths=(5.32e-07, 6.35e-07), detection_wavelengths=(5.8e-07, 6.8e-07), allow_missing_set=False, tcspc_num_bins=None, tcspc_unit=None)

Load a .spc file and (optionally) a .set file for ns-ALEX and return 2 dicts.

The first dictionary can be passed to the phconvert.hdf5.save_photon_hdf5() function to save the data in Photon-HDF5 format.

Returns: the first dict contains the main photon data (timestamps, detectors, nanotimes, …); the second contains the raw data from the .set file (it can be saved in a user group in Photon-HDF5).
Return type: two dictionaries
phconvert.loader.nsalex_ht3(filename, donor=0, acceptor=1, alex_period_donor=(150, 1500), alex_period_acceptor=(1540, 3050), excitation_wavelengths=(5.23e-07, 6.28e-07), detection_wavelengths=(5.8e-07, 6.8e-07))

Load a .ht3 file containing ns-ALEX data and return a dict.

WARNING: This function is deprecated. Please use nsalex_pq() instead.

phconvert.loader.nsalex_pq(filename, donor=0, acceptor=1, alex_period_donor=(150, 1500), alex_period_acceptor=(1540, 3050), excitation_wavelengths=(5.23e-07, 6.28e-07), detection_wavelengths=(5.8e-07, 6.8e-07))

Load PicoQuant PTU, HT3 or PT3 files containing ns-ALEX data.

This function returns a dictionary that can be passed to phconvert.hdf5.save_photon_hdf5() to save a Photon-HDF5 file.

phconvert.loader.nsalex_pt3(filename, donor=0, acceptor=1, alex_period_donor=(150, 1500), alex_period_acceptor=(1540, 3050), excitation_wavelengths=(5.23e-07, 6.28e-07), detection_wavelengths=(5.8e-07, 6.8e-07))

Load a .pt3 file containing ns-ALEX data and return a dict.

WARNING: This function is deprecated. Please use nsalex_pq() instead.

phconvert.loader.nsalex_t3r(filename, donor=0, acceptor=1, alex_period_donor=(150, 1500), alex_period_acceptor=(1540, 3050), excitation_wavelengths=(5.23e-07, 6.28e-07), detection_wavelengths=(5.8e-07, 6.8e-07))

Load a .t3r file containing ns-ALEX data and return a dict.

This dictionary can be passed to the phconvert.hdf5.save_photon_hdf5() function to save the data in Photon-HDF5 format.

phconvert.loader.usalex_sm(filename, donor=0, acceptor=1, alex_period=4000, alex_offset=750, alex_period_donor=(2850, 580), alex_period_acceptor=(930, 2580), excitation_wavelengths=(5.32e-07, 6.35e-07), detection_wavelengths=(5.8e-07, 6.8e-07), software='LabVIEW Data Acquisition usALEX')

Load a .sm us-ALEX file and return a dictionary.

This dictionary can be passed to the phconvert.hdf5.save_photon_hdf5() function to save the data in Photon-HDF5 format.

Module pqreader

This module contains functions to load and decode files from PicoQuant hardware.

The main functions to decode PicoQuant files (PTU, HT3, PT3, T3R) are, respectively, load_ptu(), load_ht3(), load_pt3() and load_t3r().

These functions return the arrays timestamps (also called macro-time or timetag), detectors (or channel), nanotimes (also called micro-time or TCSPC time) and an additional metadata dict.

Other lower level functions are:

  • ptu_reader() to load metadata and raw t3 records from PTU files
  • ht3_reader() to load metadata and raw t3 records from HT3 files
  • pt3_reader() to load metadata and raw t3 records from PT3 files
  • process_t3records() to decode the t3 records and return timestamps (after overflow correction), detectors and TCSPC nanotimes.
  • process_t3records_t3rfile() to decode the t3 records for t3r files.
  • process_t2records() to decode the t2 records and return timestamps (after overflow correction) and detectors.

The functions performing overflow/rollover correction can take advantage of numba, if installed, to significantly speed up the processing.

List of functions

High-level functions to load and decode several PicoQuant file formats:

phconvert.pqreader.load_ptu(filename, ovcfunc=None)

Load data from a PicoQuant .ptu file.

Parameters:
  • filename (string) – the path of the PTU file to be loaded.
  • ovcfunc (function or None) – function to use for overflow/rollover correction of timestamps. If None, it defaults to the fastest available implementation for the current machine.
Returns:

A tuple of timestamps, detectors, nanotimes (integer arrays) and a dictionary with metadata containing the keys ‘timestamps_unit’, ‘nanotimes_unit’, ‘acquisition_duration’ and ‘tags’. The data in the PTU file header is returned as a dictionary of “tags”. Each item in the dictionary has ‘idx’, ‘type’, ‘value’ and ‘offset’ keys. Some tags also have a ‘data’ key. Use _ptu_print_tags() to print the tags as an easy-to-read table.
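
For example (‘data.ptu’ stands in for a real PTU file):

import phconvert as phc

timestamps, detectors, nanotimes, meta = phc.pqreader.load_ptu('data.ptu')
print('Timestamps unit: %g s' % meta['timestamps_unit'])
print('Acquisition duration: %g s' % meta['acquisition_duration'])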

phconvert.pqreader.load_ht3(filename, ovcfunc=None)

Load data from a PicoQuant .ht3 file.

Parameters:
  • filename (string) – the path of the HT3 file to be loaded.
  • ovcfunc (function or None) – function to use for overflow/rollover correction of timestamps. If None, it defaults to the fastest available implementation for the current machine.
Returns:

A tuple of timestamps, detectors, nanotimes (integer arrays) and a dictionary with metadata containing at least the keys ‘timestamps_unit’ and ‘nanotimes_unit’.

phconvert.pqreader.load_pt3(filename, ovcfunc=None)

Load data from a PicoQuant .pt3 file.

Parameters:
  • filename (string) – the path of the PT3 file to be loaded.
  • ovcfunc (function or None) – function to use for overflow/rollover correction of timestamps. If None, it defaults to the fastest available implementation for the current machine.
Returns:

A tuple of timestamps, detectors, nanotimes (integer arrays) and a dictionary with metadata containing at least the keys ‘timestamps_unit’ and ‘nanotimes_unit’.

phconvert.pqreader.load_phu(filename)

Load data from a PicoQuant .phu file.

Parameters: filename (string) – the path of the PHU file to be loaded.
Returns: A tuple of histograms, histogram resolution, and tags. The latter is a dictionary of tags contained in the file header. Each item in the dictionary has ‘idx’, ‘type’, ‘value’ and ‘offset’ keys. Some tags also have a ‘data’ key. Use _ptu_print_tags() to print the tags as an easy-to-read table.

Low-level functions

These functions are the building blocks for loading and decoding the different file formats:

phconvert.pqreader.ptu_reader(filename)

Read the header and the raw t3 or t2 records from a PTU file.

phconvert.pqreader.ht3_reader(filename)

Load raw t3 records and metadata from an HT3 file.

phconvert.pqreader.pt3_reader(filename)

Load raw t3 records and metadata from a PT3 file.

phconvert.pqreader.process_t3records(t3records, time_bit=10, dtime_bit=15, ch_bit=6, special_bit=True, ovcfunc=None)

Extract the different fields from the raw t3records array.

The input array of t3records is an array of “records” (a C struct). Each record packs all the information of one detected photon. This function decodes the different fields and returns 3 arrays containing the timestamps (i.e. the macro-time or number of sync cycles, few-ns resolution), the nanotimes (i.e. the micro-time or TCSPC time, ps resolution) and the detectors.

t3records have these fields (in little-endian order):

| Optional special bit | detectors | nanotimes | timestamps |
  MSB                                                   LSB

Bit allocation of these fields, starting from the MSB:

  • special bit: 1 bit if special_bit = True (default), else no special bit.
  • channel: 6 bits by default (argument ch_bit); the detector number or a special marker.
  • nanotimes: 15 bits by default (argument dtime_bit); the nanotime (TCSPC time).
  • timestamps: 10 bits by default (argument time_bit); the timestamp (macro-time).

Timestamps: The returned timestamps are overflow-corrected, and therefore should be monotonically increasing. Each overflow event is marked by a special detector (or a special bit) and this information is used for the correction. These overflow “events” are not removed from the returned arrays, resulting in spurious detectors. This choice has been made for safety (you can always go and check where there was an overflow) and for efficiency (removing a few elements requires allocating a new array, which is potentially expensive for big data files). Under normal usage the additional detectors take negligible space and can be safely ignored.
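
As a plain-numpy sketch, the field extraction described above amounts to the following (default bit layout, overflow correction omitted, record value made up):

import numpy as np

t3records = np.array([0x02AB45CD], dtype='uint32')  # made-up record

time_bit, dtime_bit = 10, 15
timestamps_raw = (t3records & ((1 << time_bit) - 1)).astype('int64')
nanotimes = ((t3records >> time_bit) & ((1 << dtime_bit) - 1)).astype('uint16')
# The remaining high bits hold the 6-bit channel, plus the special bit as MSB.
detectors = (t3records >> (time_bit + dtime_bit)).astype('uint8')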

Parameters:
  • t3records (array) – raw array of t3records as saved in the PicoQuant file.
  • time_bit (int) – number of bits in the t3record used for timestamps (or macro-time).
  • dtime_bit (int) – number of bits in the t3record used for the nanotime (TCSPC time or micro-time).
  • ch_bit (int) – number of bits in the t3record used for the detector number.
  • special_bit (bool) – if True the t3record contains a special bit for overflow correction. This special bit will become the MSB in the returned detectors array. If False, it assumes no special bit in the t3record.
  • ovcfunc (function or None) – function to perform overflow correction of timestamps. If None, use the default function: the numba-accelerated version if numba is installed, otherwise a function using plain numpy.
Returns:

A 3-element tuple containing the following 1D arrays (all of the same length):

  • timestamps (array of int64): the macro-time (or number of sync cycles) of each photon after overflow correction. Units are specified in the file header.
  • nanotimes (array of uint16): the micro-time (TCSPC time), i.e. the time lag between the photon detection and the previous laser sync. Units (i.e. the bin width) are specified in the file header.
  • detectors (array of uint8): the detector number. When special_bit = True the highest bit in detectors is the special bit.

phconvert.pqreader.process_t2records(t2records, time_bit=25, ch_bit=6, special_bit=True, ovcfunc=None)

Extract the different fields from the raw t2records array.

The input array of t2records is an array of “records” (a C struct). Each record packs all the information of one detected photon. This function decodes the different fields and returns 2 arrays containing the timestamps (also called macro-time or timetag) and the detectors (or channel).

t2records have these fields (in little-endian order):

| Optional special bit | detectors |  timestamps |
  MSB                                        LSB
  • special bit: 1 bit if special_bit = True (default), else no special bit.
  • channel: 6 bits by default (argument ch_bit); the detector number or a special marker.
  • timestamps: 25 bits by default (argument time_bit); the timestamp (macro-time).

The returned timestamps are overflow-corrected, and therefore should be monotonically increasing. Each overflow event is marked by a special detector (or a special bit) and this information is used for the correction. These overflow “events” are not removed from the returned arrays, resulting in spurious detectors. This choice has been made for safety (you can always go and check where there was an overflow) and for efficiency (removing a few elements requires allocating a new array, which is potentially expensive for big data files). Under normal usage the additional detectors take negligible space and can be safely ignored.
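
The correction itself amounts to adding one full timestamp range for every overflow marker seen so far. A plain-numpy sketch of the idea (names are illustrative, not phconvert’s internal API):

import numpy as np

def correct_overflow(raw_timestamps, detectors, overflow_ch, wrap):
    # Count the overflow markers seen up to each record and add one
    # full timestamp range ('wrap') per overflow to the raw values.
    n_overflows = np.cumsum(detectors == overflow_ch, dtype='int64')
    return raw_timestamps.astype('int64') + n_overflows * wrap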

Parameters:
  • t2records (array) – raw array of t2records as saved in the PicoQuant file.
  • time_bit (int) – number of bits in the t2record used for timestamps (macro-time).
  • ch_bit (int) – number of bits in the t2record used for the detector number.
  • special_bit (bool) – if True the t2record contains a special bit for overflow correction or external markers. This special bit will become the MSB in the returned detectors array. If False, it assumes no special bit in the t2record.
  • ovcfunc (function or None) – function to perform overflow correction of timestamps. If None, use the default function: the numba-accelerated version if numba is installed, otherwise a function using plain numpy.
Returns:

A 2-element tuple containing the following 1D arrays (all of the same length):

  • timestamps (array of int64): the macro-time (or number of sync cycles) of each photon after overflow correction. Units are specified in the file header.
  • detectors (array of uint8): the detector number. When special_bit = True the highest bit in detectors is the special bit.

Module bhreader

This module contains functions to load and decode files from Becker & Hickl hardware.

The high-level functions in this module are:

  • load_spc() which loads and decodes the photon data from SPC files.
  • load_set() which returns a dictionary of metadata from SET files.

Becker & Hickl SPC Format

The structure of the SPC format is described here.

SPC-600/630

SPC-600/630 files have 48-bit (6-byte) records in little-endian (<) format. The first 6 bytes of the file are a header containing the timestamps_unit (in 0.1 ns units) in the two central bytes (i.e. bytes 2 and 3). In the following drawing each char represents 2 bits:

bit: 64        48                          0
     0000 0000 XXXX XXXX XXXX XXXX XXXX XXXX
               '-------' '--' '--'   '-----'
field names:       a      c    b        d

     0000 0000 XXXX XXXX XXXX XXXX XXXX XXXX
               '-------' '--' '--' '-------'
numpy dtype:       a      c    b    field0

macrotime = [ b  ] [     a     ]  (24 bit)
detector  = [ c  ]                (8 bit)
nanotime  = [  d  ]               (12 bit)

overflow bit: 13, bit_mask = 2^(13-1) = 4096

SPC-134/144/154/830

SPC-134/144/154/830 files have 32-bit (4-byte) records in little-endian (<) format. The first 4 bytes of the file are a header containing the timestamps_unit (in 0.1 ns units) in the first two bytes. In the following drawing each char represents 2 bits:

bit:                     32                0
                         XXXX XXXX XXXX XXXX
                         '''-----' '''-----'
field names:             a    b    c    d

                         XXXX XXXX XXXX XXXX
                         '-------' '-------'
numpy dtype:              field1    field0

macrotime = [ d ]       (12 bit)
detector  = [ c ]       (4 bit)
nanotime  = [ b ]       (12 bit)
aux       = [ a ]       (4 bit)

aux = [invalid, overflow, gap, mark]

If overflow == 1 and invalid == 1 --> number of overflows = [ b ][ c ][ d ]
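
A numpy sketch of the field extraction for this 32-bit layout (record value made up; load_spc() additionally handles the header, overflow accumulation and units):

import numpy as np

records = np.array([0x12345678], dtype='uint32')  # made-up record

macrotime = records & 0xFFF           # d: 12 bit
detector  = (records >> 12) & 0xF     # c: 4 bit
nanotime  = (records >> 16) & 0xFFF   # b: 12 bit
aux       = (records >> 28) & 0xF     # a: invalid, overflow, gap, mark flags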

List of functions

High-level functions to load and decode Becker & Hickl SPC/SET pair of files:

phconvert.bhreader.load_spc(fname, spc_model='SPC-630')

Load data from Becker & Hickl SPC files.

Parameters:
  • fname (string) – path of the SPC file to be loaded.
  • spc_model (string) – name of the board model. Valid values are ‘SPC-630’, ‘SPC-134’, ‘SPC-144’, ‘SPC-154’ and ‘SPC-830’.
Returns: 3 numpy arrays (timestamps, detector, nanotime) and a float (timestamps_unit).
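
For example (placeholder file name):

import phconvert as phc

timestamps, detector, nanotime, timestamps_unit = phc.bhreader.load_spc(
    'data.spc', spc_model='SPC-630')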
phconvert.bhreader.load_set(fname_set)

Return a dict with data from the Becker & Hickl .SET file.

Low-level functions

These functions are the building blocks for decoding Becker & Hickl files:

phconvert.bhreader.bh_set_identification(fname_set)

Return a dict containing the IDENTIFICATION section of .SET files.

Both keys and values are native strings (byte strings on py2 and unicode strings on py3).

phconvert.bhreader.bh_set_sys_params(fname_set)

Return a dict containing the SYS_PARAMS section of .SET files.

The keys are native strings (byte strings on py2 and unicode strings on py3) while the values are numeric types or byte strings.

phconvert.bhreader.bh_decode(s)

Replace code strings from .SET files with human readable label strings.

phconvert.bhreader.bh_print_sys_params(sys_params)

Print a summary of the Becker & Hickl system parameters (.SET file).

Indices and tables