Welcome to maidenhair’s documentation!

maidenhair


A plugin-based data loading and manipulation library.

Installation

Use pip like:

$ pip install maidenhair

Usage

Assume that there are three kinds of samples and each sample has five independent experimental results. All filenames are written in the following format:

sample-type<type number>.<experiment number>.txt

And the files are saved in a data directory like:

+- data
    |
    +- sample-type1.001.txt
    +- sample-type1.002.txt
    +- sample-type1.003.txt
    +- sample-type1.004.txt
    +- sample-type1.005.txt
    +- sample-type2.001.txt
    +- sample-type2.002.txt
    +- sample-type2.003.txt
    +- sample-type2.004.txt
    +- sample-type2.005.txt
    +- sample-type3.001.txt
    +- sample-type3.002.txt
    +- sample-type3.003.txt
    +- sample-type3.004.txt
    +- sample-type3.005.txt

Then, the code for plotting the data will be:

>>> import matplotlib.pyplot as plt
>>> import maidenhair
>>> import maidenhair.statistics
>>> dataset = []
>>> dataset += maidenhair.load('data/sample-type1.*.txt', unite=True)
>>> dataset += maidenhair.load('data/sample-type2.*.txt', unite=True)
>>> dataset += maidenhair.load('data/sample-type3.*.txt', unite=True)
>>> nameset = ['Type1', 'Type2', 'Type3']
>>> for name, (x, y) in zip(nameset, dataset):
...     xa = maidenhair.statistics.average(x)
...     ya = maidenhair.statistics.average(y)
...     ye = maidenhair.statistics.confidential_interval(y)
...     plt.errorbar(xa, ya, yerr=ye, label=name)
...
>>> plt.show()

API documents

maidenhair Package

compat Module

functions Module

maidenhair shortcut function module

maidenhair.functions.load(pathname, using=None, unite=False, basecolumn=0, relative=False, baseline=None, parser=None, loader=None, with_filename=False, recursive=False, natsort=True, **kwargs)[source]

Load data from file matched with given glob pattern.

The return value will be a list of data unless unite is True. If unite is True, all data will be united into a single data entry.

Parameters:

pathname : string or list

A glob pattern or a list of glob patterns used to load data.

using : integer list or slice instance, optional

A list of indexes or a slice instance used to slice data columns.

unite : boolean, optional

If True, the dataset will be united into a single numpy array. See Usage for more detail.

basecolumn : integer, optional

An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset. It only takes effect when unite is True.

relative : boolean, optional

Make the dataset relative to the first data entry by using the maidenhair.filters.relative.relative() function.

baseline : function, None, optional

A function which takes data columns and returns regulated data columns. It is useful for regulating the baseline of each data entry in the dataset.

parser : instance, string, None, optional

An instance or registered name of a parser class. If not specified, the default parser set with maidenhair.functions.set_default_parser() will be used instead.

loader : instance, string, None, optional

An instance or registered name of a loader class. If not specified, the default loader set with maidenhair.functions.set_default_loader() will be used instead.

with_filename : boolean, optional

If True, the returned dataset will contain the filename in the first column. It cannot be used with unite=True.

recursive : boolean, optional

Recursively search for files matching the pattern in the directory.

natsort : boolean

Naturally sort found files.

Returns:

list :

A list of numpy arrays

Examples

Assume there are five independent experimental data files for each of three sample types, i.e. fifteen files in total. Each data file has two directions (X and Y) and 100 data points. The filenames are formatted as <type number>.<experiment number>.txt and saved in the tests/fixtures directory.

Then the loading code will be

>>> import maidenhair
>>> dataset = []
>>> dataset += maidenhair.load('tests/fixtures/1.*.txt',
...                             unite=True, using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/2.*.txt',
...                             unite=True, using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/3.*.txt',
...                             unite=True, using=(0, 1))
>>> len(dataset)            # number of samples
3
>>> len(dataset[0])         # number of axes (X and Y)
2
>>> len(dataset[0][0])      # number of data points
100
>>> len(dataset[0][0][0])   # number of columns
5

Without using unite=True, the dataset will be

>>> import numpy as np
>>> import maidenhair
>>> dataset = []
>>> dataset += maidenhair.load('tests/fixtures/1.*.txt', using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/2.*.txt', using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/3.*.txt', using=(0, 1))
>>> len(dataset)            # number of samples
15
>>> len(dataset[0])         # number of axes (X and Y)
2
>>> len(dataset[0][0])      # number of data points
100
>>> isinstance(dataset[0][0][0], np.float64)
True
maidenhair.functions.set_default_parser(parser)[source]

Set the default parser instance

Parameters:

parser : instance or string

An instance or registered name of a parser class. The specified parser instance will be used when the user does not specify a parser in the maidenhair.functions.load() function.

See also

maidenhair.utils.plugins.Registry.register()
Register new class
maidenhair.functions.get_default_parser()[source]

Get default parser instance

Returns:

instance :

An instance of parser class

See also

maidenhair.utils.plugins.Registry.register()
Register new class
maidenhair.functions.set_default_loader(loader)[source]

Set the default loader instance

Parameters:

loader : instance or string

An instance or registered name of a loader class. The specified loader instance will be used when the user does not specify a loader in the maidenhair.functions.load() function.

See also

maidenhair.utils.plugins.Registry.register()
Register new class
maidenhair.functions.get_default_loader()[source]

Get default loader instance

Returns:

instance :

An instance of loader class

See also

maidenhair.utils.plugins.Registry.register()
Register new class

Subpackages

classification Package
classify Module
maidenhair.classification.classify.classify_dataset(dataset, fn)[source]

Classify dataset via fn

Parameters:

dataset : list

A list of data

fn : function

A function which receives data and returns a classification string. If it is None, a function which returns the first item of the data will be used (see the with_filename parameter of the maidenhair.load() function).

Returns:

dict :

A classified dataset
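
Conceptually, the function groups data entries under the key string that fn returns. The following is a minimal standalone sketch of that grouping idea in plain Python, not the library code; the function name classify_sketch and the filenames are illustrative assumptions.

```python
# Sketch of the grouping idea behind classify_dataset: collect each data
# item under the key returned by `fn`. The real library's data layout
# may differ; the filenames below are only illustrative.
def classify_sketch(dataset, fn):
    classified = {}
    for data in dataset:
        key = fn(data)
        classified.setdefault(key, []).append(data)
    return classified

# Group by the text before the last underscore, mimicking the spirit of
# default_classify_function (each item mimics a [<filename>] data entry).
dataset = [['a_001.txt'], ['a_002.txt'], ['b_001.txt']]
grouped = classify_sketch(dataset, lambda d: d[0].rsplit('_', 1)[0])
print(sorted(grouped))          # ['a', 'b']
print(len(grouped['a']))        # 2
```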

maidenhair.classification.classify.default_classify_function(data)[source]

A default classify_function which receives data and returns the filename without the characters after the last underscore

>>> from maidenhair.classification.classify import default_classify_function
>>> # [<filename>] is mimicking `data`
>>> default_classify_function(['./foo/foo_bar_hoge.piyo'])
'./foo/foo_bar.piyo'
>>> default_classify_function(['./foo/foo_bar.piyo'])
'./foo/foo.piyo'
>>> default_classify_function(['./foo/foo.piyo'])
'./foo/foo.piyo'
>>> default_classify_function(['./foo/foo'])
'./foo/foo'
unite Module
maidenhair.classification.unite.default_unite_function(data)[source]

A default unite_function which receives data and returns the filename without its middle extensions

>>> from maidenhair.classification.unite import default_unite_function
>>> # [<filename>] is mimicking `data`
>>> default_unite_function(['./foo/foo.bar.hoge.piyo'])
'./foo/foo.piyo'
>>> default_unite_function(['./foo/foo.piyo'])
'./foo/foo.piyo'
>>> default_unite_function(['./foo/foo'])
'./foo/foo'
maidenhair.classification.unite.unite_dataset(dataset, basecolumn, fn=None)[source]

Unite dataset via fn

Parameters:

dataset : list

A list of data

basecolumn : int

An index of the column which will be respected when uniting the dataset

fn : function

A function which receives data and returns a classification string. If it is None, a function which returns the first item of the data will be used (see the with_filename parameter of the maidenhair.load() function).

Returns:

list :

A united dataset

filters Package
baseline Module

Baseline regulation filter module

maidenhair.filters.baseline.baseline(dataset, column=1, fn=None, fail_silently=True)[source]

Subtract the baseline from the dataset

Parameters:

dataset : list of lists of numpy arrays

A list of lists of numpy arrays

column : integer

An index of the column which will be processed

fn : function

A function which takes data and returns a baseline. If it is None, the first value of the data will be used for the subtraction.

fail_silently : boolean

If True, do not raise an exception when no data exists

Returns:

ndarray :

A list of lists of numpy arrays

Examples

>>> import numpy as np
>>> from maidenhair.filters.baseline import baseline
>>> dataset = []
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 4, 5])])
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 5, 7])])
>>> dataset.append([np.array([0, 1, 2]), np.array([100, 103, 106])])
>>> expected = [
...     [np.array([0, 1, 2]), np.array([0, 1, 2])],
...     [np.array([0, 1, 2]), np.array([0, 2, 4])],
...     [np.array([0, 1, 2]), np.array([0, 3, 6])],
... ]
>>> proceed = baseline(dataset)
>>> np.array_equal(proceed, expected)
True
relative Module

Relative filter module

maidenhair.filters.relative.relative(dataset, ori=0, column=1, fail_silently=True)[source]

Convert the dataset to values relative to the data indicated by ori

Parameters:

dataset : list of lists of numpy arrays

A list of lists of numpy arrays

ori : integer or numpy array, optional

An index of the data used as the relative origin, or a numpy array

column : integer, optional

An index of base column to calculate the relative value

fail_silently : boolean

If True, do not raise an exception when no data exists

Returns:

ndarray :

A list of lists of numpy arrays

Examples

>>> import numpy as np
>>> from maidenhair.filters.relative import relative
>>> dataset = []
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 4, 5])])
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 5, 7])])
>>> dataset.append([np.array([0, 1, 2]), np.array([100, 103, 106])])
>>> expected = [
...     [np.array([0, 1, 2]), np.array([0, 50, 100])],
...     [np.array([0, 1, 2]), np.array([0, 100, 200])],
...     [np.array([0, 1, 2]), np.array([4850, 5000, 5150])],
... ]
>>> proceed = relative(dataset)
>>> np.array_equal(proceed, expected)
True
loaders Package
base Module

An abstract loader class

class maidenhair.loaders.base.BaseLoader(using=None, parser=None)[source]

Bases: object

An abstract loader class

Methods

glob(pathname[, using, unite, basecolumn, ...]) Load data from file matched with given glob pattern.
load(filename[, using, parser]) Load data from file using a specified parser.
glob(pathname, using=None, unite=False, basecolumn=0, parser=None, with_filename=False, recursive=False, natsort=True, **kwargs)[source]

Load data from file matched with given glob pattern.

The return value will be a list of data unless unite is True. If unite is True, the whole dataset will be united into a single data entry.

Parameters:

pathname : string

A glob pattern

using : list of integer, slice instance, or None, optional

A list of indexes or a slice instance used to slice the data into columns. If not specified, the using given to the constructor will be used instead.

unite : boolean, optional

If True, the dataset will be united into a single numpy array. See Usage for more detail.

basecolumn : integer, optional

An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset. It only takes effect when unite is True.

parser : instance, optional

An instance or registered name of a parser class. If not specified, the parser given to the constructor will be used instead.

with_filename : boolean, optional

If True, the returned dataset will contain the filename in the first column. It cannot be used with unite=True.

recursive : boolean, optional

Recursively search for files matching the pattern in the directory.

natsort : boolean

Naturally sort found files.

Returns:

ndarray :

A list of numpy arrays

load(filename, using=None, parser=None, **kwargs)[source]

Load data from file using a specified parser.

The return value will be separated or sliced into a list of columns.

Parameters:

filename : string

A data file path

using : list of integer, slice instance, or None, optional

A list of indexes or a slice instance used to slice the data into columns. If not specified, the using given to the constructor will be used instead.

parser : instance or None, optional

An instance or registered name of a parser class. If not specified, the parser given to the constructor will be used instead.

Returns:

ndarray :

A list of numpy arrays

maidenhair.loaders.base.slice_columns(x, using=None)[source]

Slice a numpy array to make columns

Parameters:

x : ndarray

A numpy array instance

using : list of integer or slice instance or None, optional

A list of indexes or a slice instance

Returns:

ndarray :

A list of sliced numpy array columns
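
For illustration, slicing a 2-D data array into per-column arrays can be sketched with plain numpy. This is a sketch of the described behavior, not the library helper itself; the name slice_columns_sketch and the exact handling of slice instances are assumptions.

```python
import numpy as np

# Sketch: split a 2-D array into a list of 1-D column arrays,
# optionally restricted to the indexes (or slice) given in `using`.
def slice_columns_sketch(x, using=None):
    if using is None:
        return [x[:, i] for i in range(x.shape[1])]
    if isinstance(using, slice):
        return [x[:, i] for i in range(*using.indices(x.shape[1]))]
    return [x[:, i] for i in using]

x = np.array([[1, 10, 100],
              [2, 20, 200],
              [3, 30, 300]])
columns = slice_columns_sketch(x, using=(0, 2))
# columns[0] -> array([1, 2, 3]); columns[1] -> array([100, 200, 300])
```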

maidenhair.loaders.base.unite_dataset(dataset, basecolumn=0)[source]

Unite a dataset into a single data entry

Parameters:

dataset : list of ndarray

A list of data, each a column list of numpy arrays

basecolumn : integer, optional

An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset

Returns:

list of numpy arrays :

A column list of numpy arrays

plain Module

A plain text loader

class maidenhair.loaders.plain.PlainLoader(using=None, parser=None)[source]

Bases: maidenhair.loaders.base.BaseLoader

A simple loader class

Methods

glob(pathname[, using, unite, basecolumn, ...]) Load data from file matched with given glob pattern.
load(filename[, using, parser]) Load data from file using a specified parser.
parsers Package
base Module

A base data parser module

class maidenhair.parsers.base.BaseParser[source]

Bases: object

An abstract data parser class

Methods

load(filename, **kwargs) Parse a file specified with the filename and return a numpy array
parse(iterable, **kwargs) Parse an iterable to a numpy array.
load(filename, **kwargs)[source]

Parse a file specified with the filename and return a numpy array

Parameters:

filename : string

A path of a file

Returns:

ndarray :

An instance of numpy array

parse(iterable, **kwargs)[source]

Parse an iterable to a numpy array

Warning

Subclasses must override this method

Parameters:

iterable : iterable

An iterable instance to parse

Returns:

ndarray :

An instance of numpy array

plain Module

A plain parser class

class maidenhair.parsers.plain.PlainParser[source]

Bases: maidenhair.parsers.base.BaseParser

A plain text parser class based on numpy.loadtxt method

Methods

load(filename, **kwargs) Parse a file specified with the filename and return a numpy array
parse(iterable, **kwargs) Parse a whitespace-separated iterable to a numpy array.
parse(iterable, **kwargs)[source]

Parse a whitespace-separated iterable to a numpy array. It is based on the numpy.loadtxt method.

Parameters:

iterable : iterable

An iterable instance to parse

Returns:

ndarray :

An instance of numpy array

statistics Package

Statistics shortcut functions

maidenhair.statistics.average(x)[source]

Return a numpy array of column averages. It has no effect if the array is one-dimensional.

Parameters:

x : ndarray

A numpy array instance

Returns:

ndarray :

A 1 x n numpy array instance of column average

Examples

>>> import numpy as np
>>> from maidenhair.statistics import average
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(average(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(average(a), [1, 2, 3])
True
maidenhair.statistics.confidential_interval(x, alpha=0.98)[source]

Return a numpy array of column confidence intervals

Parameters:

x : ndarray

A numpy array instance

alpha : float

Alpha value of the confidence interval

Returns:

ndarray :

A 1 x n numpy array indicating the difference from each sample average point to the corresponding confidence-interval point
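
As a rough illustration of this "difference from the sample average to the confidence-interval point", the half-width of a column-wise confidence interval can be sketched with the normal approximation. This is an assumption-laden sketch, not the library's computation (which presumably uses a Student's t distribution, so its values will differ); the name ci_half_width_sketch and the hard-coded z value are illustrative.

```python
import numpy as np

# Sketch: column-wise confidence-interval half-width via the normal
# approximation. z is hard-coded for alpha=0.98 two-sided (z ~ 2.326);
# the library's exact formula is assumed, not reproduced.
def ci_half_width_sketch(x, z=2.326):
    n = x.shape[0]
    sem = x.std(axis=0, ddof=1) / np.sqrt(n)   # standard error of the mean
    return z * sem

x = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
half = ci_half_width_sketch(x)
# half[1] is twice half[0], since the second column is twice the first
```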

maidenhair.statistics.mean(x)[source]

Return a numpy array of column means. It has no effect if the array is one-dimensional.

Parameters:

x : ndarray

A numpy array instance

Returns:

ndarray :

A 1 x n numpy array instance of column mean

Examples

>>> import numpy as np
>>> from maidenhair.statistics import mean
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(mean(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(mean(a), [1, 2, 3])
True
maidenhair.statistics.median(x)[source]

Return a numpy array of column medians. It has no effect if the array is one-dimensional.

Parameters:

x : ndarray

A numpy array instance

Returns:

ndarray :

A 1 x n numpy array instance of column median

Examples

>>> import numpy as np
>>> from maidenhair.statistics import median
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(median(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(median(a), [1, 2, 3])
True
maidenhair.statistics.simple_moving_average(x, n=10)[source]

Calculate simple moving average

Parameters:

x : ndarray

A numpy array

n : integer

The number of sample points used to compute the average

Returns:

ndarray :

A 1 x n numpy array instance
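
The simple moving average itself can be sketched in plain numpy. This is a conceptual illustration, not the library's implementation; in particular, the 'valid' edge handling (keeping only full windows) is an assumption.

```python
import numpy as np

# Sketch of an n-point simple moving average via convolution with a
# uniform kernel. mode='valid' keeps only windows fully inside the data.
def sma_sketch(x, n=10):
    return np.convolve(x, np.ones(n) / n, mode='valid')

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(sma_sketch(x, n=3))   # [2. 3. 4.]
```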

maidenhair.statistics.simple_moving_matrix(x, n=10)[source]

Create simple moving matrix.

Parameters:

x : ndarray

A numpy array

n : integer

The number of sample points used to compute the average

Returns:

ndarray :

An n x n numpy array which is useful for calculating the confidence interval of a simple moving average
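
Conceptually, a moving matrix stacks each length-n window of the data as a row, so averaging the rows reproduces the simple moving average and the per-row spread can feed a confidence-interval estimate. One way to sketch this with numpy's sliding_window_view (the real function's exact shape and orientation are assumptions here):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Sketch: each length-n sliding window of x becomes one row of the
# matrix. Row means equal the simple moving average of x.
def moving_matrix_sketch(x, n=3):
    return sliding_window_view(x, n)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
m = moving_matrix_sketch(x, n=3)
print(m.shape)          # (3, 3)
print(m.mean(axis=1))   # [2. 3. 4.]
```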

maidenhair.statistics.standard_deviation(x)[source]

Return a numpy array of column standard deviation

Parameters:

x : ndarray

A numpy array instance

Returns:

ndarray :

A 1 x n numpy array instance of column standard deviation

Examples

>>> import numpy as np
>>> from maidenhair.statistics import standard_deviation
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.testing.assert_array_almost_equal(
...     standard_deviation(a),
...     [0.816496, 0.816496, 0.816496])
>>> a = np.array([1, 2, 3])
>>> np.testing.assert_array_almost_equal(
...     standard_deviation(a),
...     0.816496)
maidenhair.statistics.variance(x)[source]

Return a numpy array of column variance

Parameters:

x : ndarray

A numpy array instance

Returns:

ndarray :

A 1 x n numpy array instance of column variance

Examples

>>> import numpy as np
>>> from maidenhair.statistics import variance
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.testing.assert_array_almost_equal(
...     variance(a),
...     [0.666666, 0.666666, 0.666666])
>>> a = np.array([1, 2, 3])
>>> np.testing.assert_array_almost_equal(
...     variance(a),
...     0.666666)
utils Package
environment Module

Get special directory path

maidenhair.utils.environment.get_system_root_directory()[source]

Get system root directory (application installed root directory)

Returns:

string :

A full path

maidenhair.utils.environment.get_system_plugins_directory()[source]

Get system plugin directory (plugin directory for system wide use)

Returns:

string :

A full path

maidenhair.utils.environment.get_user_root_directory()[source]

Get user root directory (application user configuration root directory)

Returns:

string :

A full path

maidenhair.utils.environment.get_user_plugins_directory()[source]

Get user plugin directory (plugin directory for the particular user)

Returns:

string :

A full path

peakset Module
maidenhair.utils.peakset.find_peakset(dataset, basecolumn=-1, method='', where=None)[source]

Find peakset from the dataset

Parameters:

dataset : list

A list of data

basecolumn : int

An index of column for finding peaks

method : str

A method name of numpy for finding peaks

where : function

A function which receives data and returns a numpy indexing list

Returns:

list :

A list of peaks for each axis

plugins Module

A plugin registry module

class maidenhair.utils.plugins.Registry[source]

Bases: object

A registry class which stores plugin objects

Methods

find(name[, namespace]) Find plugin object
load_plugins([plugin_dirs, quiet]) Load plugins in sys.path and plugin_dirs
register(name, obj[, namespace]) Register obj as name in namespace
ENTRY_POINT = 'maidenhair.plugins'
find(name, namespace=None)[source]

Find plugin object

Parameters:

name : string

A name of the object entry or full namespace

namespace : string, optional

A period separated namespace. E.g. foo.bar.hogehoge

Returns:

instance :

An instance found

Raises:

KeyError :

If the named instance has not been registered

Examples

>>> registry = Registry()
>>> registry.register('hello', 'goodbye')
>>> registry.register('foo', 'bar', 'hoge.hoge.hoge')
>>> registry.register('foobar', 'foobar', 'hoge.hoge')
>>> registry.find('hello') == 'goodbye'
True
>>> registry.find('foo', 'hoge.hoge.hoge') == 'bar'
True
>>> registry.find('hoge.hoge.foobar') == 'foobar'
True
load_plugins(plugin_dirs=None, quiet=True)[source]

Load plugins in sys.path and plugin_dirs

Parameters:

plugin_dirs : list or tuple of string, optional

A list or tuple of plugin directory path

quiet : bool, optional

If True, suppress error messages

register(name, obj, namespace=None)[source]

Register obj as name in namespace

Parameters:

name : string

A name of the object entry

obj : instance

A python object which will be registered

namespace : string, optional

A period separated namespace. E.g. foo.bar.hogehoge

Examples

>>> registry = Registry()
>>> registry.register('hello', 'goodbye')
>>> registry.raw.hello == 'goodbye'
True
>>> registry.register('foo', 'bar', 'hoge.hoge.hoge')
>>> isinstance(registry.raw.hoge, Bunch)
True
>>> isinstance(registry.raw.hoge.hoge, Bunch)
True
>>> isinstance(registry.raw.hoge.hoge.hoge, Bunch)
True
>>> registry.raw.hoge.hoge.hoge.foo == 'bar'
True
>>> registry.register('hoge.hoge.foobar', 'foobar')
>>> registry.raw.hoge.hoge.hoge.foo == 'bar'
True
>>> registry.raw.hoge.hoge.foobar == 'foobar'
True
rglob Module

Recursive glob module

This glob is a recursive version of glob module.

maidenhair.utils.rglob.iglob(pathname)[source]

Return an iterator which yields the same values as glob() without actually storing them all simultaneously.

Parameters:

pathname : string

A glob pattern string which will be used for finding files

Returns:

iterator :

An iterator instance which will yield full path name

maidenhair.utils.rglob.glob(pathname)[source]

Return a possibly-empty list of path names that match pathname. It is a recursive version of glob.glob().

Parameters:

pathname : string

A glob pattern string which will be used for finding files

Returns:

list :

A list of full path names
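
For comparison, modern Python's standard glob module can already search recursively when given a '**' pattern with recursive=True. A small stdlib sketch of the same behavior (the temporary directory layout is only illustrative):

```python
import glob
import os
import tempfile

# Sketch: recursive globbing with the stdlib. '**' matches zero or more
# directory levels when recursive=True, so files at the root and in
# subdirectories are both found.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for name in ('a.txt', os.path.join('sub', 'b.txt')):
        open(os.path.join(root, name), 'w').close()
    found = glob.glob(os.path.join(root, '**', '*.txt'), recursive=True)
    print(len(found))   # 2 -- a.txt and sub/b.txt
```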
