Welcome to maidenhair’s documentation!¶
maidenhair¶
A plugin-based data loading and manipulation library.
Usage¶
Assume that there are three kinds of samples and each sample has 5 independent experimental results. All filenames follow the format:
sample-type<type number>.<experiment number>.txt
And the files are saved in a data directory like:
+- data
   +- sample-type1.001.txt
   +- sample-type1.002.txt
   +- sample-type1.003.txt
   +- sample-type1.004.txt
   +- sample-type1.005.txt
   +- sample-type2.001.txt
   +- sample-type2.002.txt
   +- sample-type2.003.txt
   +- sample-type2.004.txt
   +- sample-type2.005.txt
   +- sample-type3.001.txt
   +- sample-type3.002.txt
   +- sample-type3.003.txt
   +- sample-type3.004.txt
   +- sample-type3.005.txt
Then, the code for plotting the data will be:
>>> import matplotlib.pyplot as plt
>>> import maidenhair
>>> import maidenhair.statistics
>>> dataset = []
>>> dataset += maidenhair.load('data/sample-type1.*.txt', unite=True)
>>> dataset += maidenhair.load('data/sample-type2.*.txt', unite=True)
>>> dataset += maidenhair.load('data/sample-type3.*.txt', unite=True)
>>> nameset = ['Type1', 'Type2', 'Type3']
>>> for name, (x, y) in zip(nameset, dataset):
...     xa = maidenhair.statistics.average(x)
...     ya = maidenhair.statistics.average(y)
...     ye = maidenhair.statistics.confidential_interval(y)
...     plt.errorbar(xa, ya, yerr=ye, label=name)
...
>>> plt.show()
API documents¶
maidenhair Package¶
maidenhair Package¶
compat Module¶
functions Module¶
maidenhair shortcut function module
- maidenhair.functions.load(pathname, using=None, unite=False, basecolumn=0, relative=False, baseline=None, parser=None, loader=None, with_filename=False, recursive=False, natsort=True, **kwargs)[source]¶
Load data from file matched with given glob pattern.
The return value will be a list of data unless unite is True; if unite is True, all data will be united into a single data item.
Parameters: pathname : string or list
A glob pattern or a list of glob patterns which will be used to load data.
using : integer list or slice instance, optional
A list of indexes or a slice instance which will be used to slice data columns.
unite : boolean, optional
If it is True then dataset will be united into a single numpy array. See usage for more detail.
basecolumn : integer, optional
An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset. It only has an effect when unite is True.
relative : boolean, optional
Make the dataset relative to the first data item by using the maidenhair.filters.relative.relative() function.
baseline : function, None, optional
A function which takes data columns and returns regulated data columns. It is useful for regulating the baseline of each data item in the dataset.
parser : instance, string, None, optional
An instance or registered name of parser class. If it is not specified, default parser specified with maidenhair.functions.set_default_parser() will be used instead.
loader : instance, string, None, optional
An instance or registered name of loader class. If it is not specified, default loader specified with maidenhair.functions.set_default_loader() will be used instead.
with_filename : boolean, optional
If it is True, the returned dataset will contain the filename in the first column. It cannot be used with unite=True.
recursive : boolean, optional
Recursively find pattern in the directory
natsort : boolean
Naturally sort found files.
Returns: list :
A list of numpy arrays
Examples
Assume that there are five independent experimental data files for each of three sample types, i.e. fifteen files in total. Each data file has two directions (X and Y) and 100 data points. The filenames are formatted as <type number>.<experimental number>.txt and saved in the tests/fixtures directory.
Then the loading code will be
>>> import maidenhair
>>> dataset = []
>>> dataset += maidenhair.load('tests/fixtures/1.*.txt',
...                            unite=True, using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/2.*.txt',
...                            unite=True, using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/3.*.txt',
...                            unite=True, using=(0, 1))
>>> len(dataset)            # number of samples
3
>>> len(dataset[0])         # number of axes (X and Y)
2
>>> len(dataset[0][0])      # number of data points
100
>>> len(dataset[0][0][0])   # number of columns
5
Without using unite=True, the dataset will be
>>> import numpy as np
>>> import maidenhair
>>> dataset = []
>>> dataset += maidenhair.load('tests/fixtures/1.*.txt', using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/2.*.txt', using=(0, 1))
>>> dataset += maidenhair.load('tests/fixtures/3.*.txt', using=(0, 1))
>>> len(dataset)            # number of samples
15
>>> len(dataset[0])         # number of axes (X and Y)
2
>>> len(dataset[0][0])      # number of data points
100
>>> isinstance(dataset[0][0][0], np.float64)
True
- maidenhair.functions.set_default_parser(parser)[source]¶
Set the default parser instance
Parameters: parser : instance or string
An instance or registered name of a parser class. The specified parser instance will be used when the user does not specify a parser in the maidenhair.functions.load() function.
See also
- maidenhair.utils.plugins.Registry.register()
- Register new class
- maidenhair.functions.get_default_parser()[source]¶
Get default parser instance
Returns: instance :
An instance of parser class
See also
- maidenhair.utils.plugins.Registry.register()
- Register new class
- maidenhair.functions.set_default_loader(loader)[source]¶
Set the default loader instance
Parameters: loader : instance or string
An instance or registered name of a loader class. The specified loader instance will be used when the user does not specify a loader in the maidenhair.functions.load() function.
See also
- maidenhair.utils.plugins.Registry.register()
- Register new class
- maidenhair.functions.get_default_loader()[source]¶
Get default loader instance
Returns: instance :
An instance of loader class
See also
- maidenhair.utils.plugins.Registry.register()
- Register new class
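The set/get pair above follows a common module-level default pattern. The following is a minimal sketch of that pattern in plain Python, not maidenhair's actual source; TinyParser is a hypothetical stand-in for a registered parser class:

```python
# A minimal sketch (not maidenhair's actual implementation) of the
# module-level default pattern that set_default_parser()/get_default_parser()
# describe: the module keeps one default instance, and load() falls back to
# it whenever no parser is passed explicitly.
class TinyParser:
    def parse(self, text):
        # Split whitespace-separated text into rows of string cells.
        return [line.split() for line in text.splitlines()]

_default_parser = None

def set_default_parser(parser):
    global _default_parser
    _default_parser = parser

def get_default_parser():
    return _default_parser

set_default_parser(TinyParser())
rows = get_default_parser().parse("1 2\n3 4")
```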
Subpackages¶
classification Package¶
classify Module¶
- maidenhair.classification.classify.classify_dataset(dataset, fn)[source]¶
Classify dataset via fn
Parameters: dataset : list
A list of data
fn : function
A function which receives data and returns a classification string. If it is None, a function which returns the first item of the data will be used (see the with_filename parameter of the maidenhair.load() function).
Returns: dict :
A classified dataset
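The grouping contract described above can be sketched in plain Python. This mimics the documented behaviour, not the library's actual code; the records below are hypothetical [filename, values] pairs such as with_filename=True would produce:

```python
# A sketch of classify_dataset's documented contract: each data item is
# grouped into a dict keyed by fn(data).
def classify_dataset(dataset, fn):
    classified = {}
    for data in dataset:
        classified.setdefault(fn(data), []).append(data)
    return classified

# Hypothetical sample records shaped like [filename, values].
records = [
    ['sample-type1.001.txt', [1, 2]],
    ['sample-type1.002.txt', [3, 4]],
    ['sample-type2.001.txt', [5, 6]],
]
# Classify by the sample type encoded before the first dot of the filename.
groups = classify_dataset(records, lambda data: data[0].split('.')[0])
```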
- maidenhair.classification.classify.default_classify_function(data)[source]¶
A default classify_function which receives data and returns the filename with the characters after the last underscore removed
>>> # [<filename>] is mimicking `data`
>>> default_classify_function(['./foo/foo_bar_hoge.piyo'])
'./foo/foo_bar.piyo'
>>> default_classify_function(['./foo/foo_bar.piyo'])
'./foo/foo.piyo'
>>> default_classify_function(['./foo/foo.piyo'])
'./foo/foo.piyo'
>>> default_classify_function(['./foo/foo'])
'./foo/foo'
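The transformation in the doctest can be reproduced with a few lines of standard-library code. This is a sketch that mirrors the documented behaviour (drop everything after the last underscore in the filename stem), not the library's actual source:

```python
import os

# Sketch of the documented behaviour: strip the characters after the last
# underscore in the filename stem, keeping the directory and extension.
def default_classify_function(data):
    root, ext = os.path.splitext(data[0])
    # rsplit('_', 1)[0] leaves the stem untouched when it has no underscore.
    return root.rsplit('_', 1)[0] + ext
```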
unite Module¶
- maidenhair.classification.unite.default_unite_function(data)[source]¶
A default unite_function which receives data and returns the filename without middle extensions
>>> # [<filename>] is mimicking `data`
>>> default_unite_function(['./foo/foo.bar.hoge.piyo'])
'./foo/foo.piyo'
>>> default_unite_function(['./foo/foo.piyo'])
'./foo/foo.piyo'
>>> default_unite_function(['./foo/foo'])
'./foo/foo'
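The "drop middle extensions" rule in the doctest can likewise be sketched with the standard library. This mirrors the documented behaviour (keep only the first stem and the final extension), not the library's actual source:

```python
import os

# Sketch of the documented behaviour: './foo/foo.bar.hoge.piyo' collapses
# to './foo/foo.piyo' by keeping only the first and last dot-separated part.
def default_unite_function(data):
    dirname, basename = os.path.split(data[0])
    parts = basename.split('.')
    if len(parts) > 1:
        basename = parts[0] + '.' + parts[-1]
    return os.path.join(dirname, basename)
```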
- maidenhair.classification.unite.unite_dataset(dataset, basecolumn, fn=None)[source]¶
Unite dataset via fn
Parameters: dataset : list
A list of data
basecolumn : int
An index of the column which will be respected when uniting the dataset
fn : function
A function which receives data and returns a classification string. If it is None, a function which returns the first item of the data will be used (see the with_filename parameter of the maidenhair.load() function).
Returns: list :
A united dataset
filters Package¶
filters Package¶
baseline Module¶
Baseline regulation filter module
- maidenhair.filters.baseline.baseline(dataset, column=1, fn=None, fail_silently=True)[source]¶
Subtract the baseline from the dataset
Parameters: dataset : list of numpy array list
A list of numpy array list
column : integer
An index of the column which will be processed
fn : function
A function which receives data and returns the baseline. If it is None, the first value of the data will be used for subtraction.
fail_silently : boolean
If True, do not raise exception if no data exists
Returns: ndarray :
A list of numpy array list
Examples
>>> import numpy as np
>>> from maidenhair.filters.baseline import baseline
>>> dataset = []
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 4, 5])])
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 5, 7])])
>>> dataset.append([np.array([0, 1, 2]), np.array([100, 103, 106])])
>>> expected = [
...     [np.array([0, 1, 2]), np.array([0, 1, 2])],
...     [np.array([0, 1, 2]), np.array([0, 2, 4])],
...     [np.array([0, 1, 2]), np.array([0, 3, 6])],
... ]
>>> proceed = baseline(dataset)
>>> np.array_equal(proceed, expected)
True
relative Module¶
Relative filter module
- maidenhair.filters.relative.relative(dataset, ori=0, column=1, fail_silently=True)[source]¶
Convert the dataset to values relative to the data indicated by ori
Parameters: dataset : list of numpy array list
A list of numpy array list
ori : integer or numpy array, optional
A relative original data index or numpy array
column : integer, optional
An index of base column to calculate the relative value
fail_silently : boolean
If True, do not raise exception if no data exists
Returns: ndarray :
A list of numpy array list
Examples
>>> import numpy as np
>>> from maidenhair.filters.relative import relative
>>> dataset = []
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 4, 5])])
>>> dataset.append([np.array([0, 1, 2]), np.array([3, 5, 7])])
>>> dataset.append([np.array([0, 1, 2]), np.array([100, 103, 106])])
>>> expected = [
...     [np.array([0, 1, 2]), np.array([0, 50, 100])],
...     [np.array([0, 1, 2]), np.array([0, 100, 200])],
...     [np.array([0, 1, 2]), np.array([4850, 5000, 5150])],
... ]
>>> proceed = relative(dataset)
>>> np.array_equal(proceed, expected)
True
loaders Package¶
base Module¶
An abstract loader class
- class maidenhair.loaders.base.BaseLoader(using=None, parser=None)[source]¶
Bases: object
An abstract loader class
Methods
glob(pathname[, using, unite, basecolumn, ...])
    Load data from files matched with a given glob pattern.
load(filename[, using, parser])
    Load data from a file using a specified parser.
- glob(pathname, using=None, unite=False, basecolumn=0, parser=None, with_filename=False, recursive=False, natsort=True, **kwargs)[source]¶
Load data from file matched with given glob pattern.
Return value will be a list of data unless unite is True. If unite is True, all dataset will be united into a single data.
Parameters: pathname : string
A glob pattern
using : list of integer, slice instance, or None, optional
A list of indexes or a slice instance used to slice the data into columns. If it is not specified, the using specified in the constructor will be used instead.
unite : boolean, optional
If it is True then dataset will be united into a single numpy array. See usage for more detail.
basecolumn : integer, optional
An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset. It only has an effect when unite is True.
parser : instance, optional
An instance or registered name of parser class. If it is not specified, parser specified in constructor will be used instead.
with_filename : boolean, optional
If it is True, the returned dataset will contain the filename in the first column. It cannot be used with unite=True.
recursive : boolean, optional
Recursively find pattern in the directory
natsort : boolean
Naturally sort found files.
Returns: ndarray :
A list of numpy arrays
- load(filename, using=None, parser=None, **kwargs)[source]¶
Load data from file using a specified parser.
The return value will be separated or sliced into a list of columns.
Parameters: filename : string
A data file path
using : list of integer, slice instance, or None, optional
A list of indexes or a slice instance used to slice the data into columns. If it is not specified, the using specified in the constructor will be used instead.
parser : instance or None, optional
An instance or registered name of parser class. If it is not specified, parser specified in constructor will be used instead.
Returns: ndarray :
A list of numpy array
- maidenhair.loaders.base.slice_columns(x, using=None)[source]¶
Slice a numpy array to make columns
Parameters: x : ndarray
A numpy array instance
using : list of integer or slice instance or None, optional
A list of indexes or a slice instance
Returns: ndarray :
A list of sliced numpy array columns
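The column slicing described above can be sketched in plain Python, with nested lists standing in for numpy arrays. This illustrates the documented contract (indexes or a slice select columns from row-oriented data), not the library's actual implementation:

```python
# Plain-Python sketch of slice_columns(): transpose row-oriented data into
# columns, then pick the columns named by `using` (a list of indexes or a
# slice).  Lists stand in for numpy arrays here.
def slice_columns(rows, using=None):
    columns = [list(col) for col in zip(*rows)]  # transpose rows -> columns
    if using is None:
        return columns
    if isinstance(using, slice):
        return columns[using]
    return [columns[i] for i in using]

table = [[0, 10, 100],
         [1, 11, 101],
         [2, 12, 102]]
# Keep only the first and third columns, e.g. an (X, Y) pair.
xy = slice_columns(table, using=(0, 2))
```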
- maidenhair.loaders.base.unite_dataset(dataset, basecolumn=0)[source]¶
Unite dataset into a single data
Parameters: dataset : list of ndarray
A list of data, each a column list of numpy arrays
basecolumn : integer, optional
An index of the base column. All data will be trimmed based on the order of this column when the number of samples differs among the dataset.
Returns: list of numpy array :
A column list of a numpy array
plain Module¶
A plain text loader
- class maidenhair.loaders.plain.PlainLoader(using=None, parser=None)[source]¶
Bases: maidenhair.loaders.base.BaseLoader
A simple loader class
Methods
glob(pathname[, using, unite, basecolumn, ...])
    Load data from files matched with a given glob pattern.
load(filename[, using, parser])
    Load data from a file using a specified parser.
parsers Package¶
base Module¶
A base data parser module
- class maidenhair.parsers.base.BaseParser[source]¶
Bases: object
An abstract data parser class
Methods
load(filename, **kwargs)
    Parse a file specified with the filename and return a numpy array.
parse(iterable, **kwargs)
    Parse an iterable to a numpy array.
plain Module¶
A plain parser class
- class maidenhair.parsers.plain.PlainParser[source]¶
Bases: maidenhair.parsers.base.BaseParser
A plain text parser class based on numpy.loadtxt method
Methods
load(filename, **kwargs)
    Parse a file specified with the filename and return a numpy array.
parse(iterable, **kwargs)
    Parse a whitespace-separated iterable to a numpy array.
statistics Package¶
statistics Package¶
Statistics shortcut functions
- maidenhair.statistics.average(x)[source]¶
Return a numpy array of column averages. It has no effect if the array is one-dimensional.
Parameters: x : ndarray
A numpy array instance
Returns: ndarray :
A 1 x n numpy array instance of column average
Examples
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(average(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(average(a), [1, 2, 3])
True
- maidenhair.statistics.confidential_interval(x, alpha=0.98)[source]¶
Return a numpy array of column confidence intervals
Parameters: x : ndarray
A numpy array instance
alpha : float
Alpha value of the confidence interval
Returns: ndarray :
A 1 x n numpy array which indicates, for each column, the difference from the sample average to the confidence interval bound
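The library's exact formula is not reproduced in these docs; as a rough illustration, a two-sided confidence half-width under a normal approximation is z * s / sqrt(n). The sketch below assumes alpha means the confidence level (e.g. 0.98, matching the default above); the actual implementation may instead use a t-distribution:

```python
import math
from statistics import NormalDist, stdev

# Normal-approximation sketch of a two-sided confidence half-width,
# z * s / sqrt(n).  This is illustrative only: maidenhair's own formula
# is not shown in the docs and may differ (e.g. a t-distribution).
def confidence_half_width(x, alpha=0.98):
    z = NormalDist().inv_cdf(1 - (1 - alpha) / 2)  # two-sided critical value
    return z * stdev(x) / math.sqrt(len(x))
```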
- maidenhair.statistics.mean(x)[source]¶
Return a numpy array of column means. It has no effect if the array is one-dimensional.
Parameters: x : ndarray
A numpy array instance
Returns: ndarray :
A 1 x n numpy array instance of column mean
Examples
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(mean(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(mean(a), [1, 2, 3])
True
- maidenhair.statistics.median(x)[source]¶
Return a numpy array of column medians. It has no effect if the array is one-dimensional.
Parameters: x : ndarray
A numpy array instance
Returns: ndarray :
A 1 x n numpy array instance of column median
Examples
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.array_equal(median(a), [2, 5, 8])
True
>>> a = np.array([1, 2, 3])
>>> np.array_equal(median(a), [1, 2, 3])
True
- maidenhair.statistics.simple_moving_average(x, n=10)[source]¶
Calculate simple moving average
Parameters: x : ndarray
A numpy array
n : integer
The number of sample points used to make average
Returns: ndarray :
A 1 x n numpy array instance
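The described behaviour can be sketched in plain Python: each output point is the mean of a length-n window of the input. This mirrors the documented contract, not the library's actual implementation:

```python
# Plain-Python sketch of a simple moving average: slide a length-n window
# over the input and average each window.  The output has len(x) - n + 1
# points.
def simple_moving_average(x, n=10):
    return [sum(x[i:i + n]) / n for i in range(len(x) - n + 1)]

smoothed = simple_moving_average([1, 2, 3, 4, 5], n=2)
```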
- maidenhair.statistics.simple_moving_matrix(x, n=10)[source]¶
Create simple moving matrix.
Parameters: x : ndarray
A numpy array
n : integer
The number of sample points used to make average
Returns: ndarray :
An n x n numpy array which is useful for calculating the confidence interval of a simple moving average
- maidenhair.statistics.standard_deviation(x)[source]¶
Return a numpy array of column standard deviation
Parameters: x : ndarray
A numpy array instance
Returns: ndarray :
A 1 x n numpy array instance of column standard deviation
Examples
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.testing.assert_array_almost_equal(
...     standard_deviation(a),
...     [0.816496, 0.816496, 0.816496])
>>> a = np.array([1, 2, 3])
>>> np.testing.assert_array_almost_equal(
...     standard_deviation(a),
...     0.816496)
- maidenhair.statistics.variance(x)[source]¶
Return a numpy array of column variance
Parameters: x : ndarray
A numpy array instance
Returns: ndarray :
A 1 x n numpy array instance of column variance
Examples
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> np.testing.assert_array_almost_equal(
...     variance(a),
...     [0.666666, 0.666666, 0.666666])
>>> a = np.array([1, 2, 3])
>>> np.testing.assert_array_almost_equal(
...     variance(a),
...     0.666666)
utils Package¶
utils Package¶
environment Module¶
Get special directory path
- maidenhair.utils.environment.get_system_root_directory()[source]¶
Get system root directory (application installed root directory)
Returns: string :
A full path
- maidenhair.utils.environment.get_system_plugins_directory()[source]¶
Get system plugin directory (plugin directory for system wide use)
Returns: string :
A full path
peakset Module¶
- maidenhair.utils.peakset.find_peakset(dataset, basecolumn=-1, method='', where=None)[source]¶
Find peakset from the dataset
Parameters: dataset : list
A list of data
basecolumn : int
An index of column for finding peaks
method : str
A method name of numpy for finding peaks
where : function
A function which receives data and returns a numpy indexing list
Returns: list :
A list of peaks of each axis (list)
plugins Module¶
A plugin registry module
- class maidenhair.utils.plugins.Registry[source]¶
Bases: object
A registry class which store plugin objects
Methods
find(name[, namespace])
    Find a plugin object.
load_plugins([plugin_dirs, quiet])
    Load plugins in sys.path and plugin_dirs.
register(name, obj[, namespace])
    Register obj as name in namespace.
- ENTRY_POINT = 'maidenhair.plugins'¶
- find(name, namespace=None)[source]¶
Find plugin object
Parameters: name : string
A name of the object entry or full namespace
namespace : string, optional
A period separated namespace. E.g. foo.bar.hogehoge
Returns: instance :
An instance found
Raises: KeyError :
If the named instance has not been registered
Examples
>>> registry = Registry()
>>> registry.register('hello', 'goodbye')
>>> registry.register('foo', 'bar', 'hoge.hoge.hoge')
>>> registry.register('foobar', 'foobar', 'hoge.hoge')
>>> registry.find('hello') == 'goodbye'
True
>>> registry.find('foo', 'hoge.hoge.hoge') == 'bar'
True
>>> registry.find('hoge.hoge.foobar') == 'foobar'
True
- load_plugins(plugin_dirs=None, quiet=True)[source]¶
Load plugins in sys.path and plugin_dirs
Parameters: plugin_dirs : list or tuple of string, optional
A list or tuple of plugin directory path
quiet : bool, optional
If True, error messages raised while loading plugins are suppressed.
- register(name, obj, namespace=None)[source]¶
Register obj as name in namespace
Parameters: name : string
A name of the object entry
obj : instance
A python object which will be registered
namespace : string, optional
A period separated namespace. E.g. foo.bar.hogehoge
Examples
>>> registry = Registry()
>>> registry.register('hello', 'goodbye')
>>> registry.raw.hello == 'goodbye'
True
>>> registry.register('foo', 'bar', 'hoge.hoge.hoge')
>>> isinstance(registry.raw.hoge, Bunch)
True
>>> isinstance(registry.raw.hoge.hoge, Bunch)
True
>>> isinstance(registry.raw.hoge.hoge.hoge, Bunch)
True
>>> registry.raw.hoge.hoge.hoge.foo == 'bar'
True
>>> registry.register('hoge.hoge.foobar', 'foobar')
>>> registry.raw.hoge.hoge.hoge.foo == 'bar'
True
>>> registry.raw.hoge.hoge.foobar == 'foobar'
True
rglob Module¶
Recursive glob module
This glob is a recursive version of glob module.
- maidenhair.utils.rglob.iglob(pathname)[source]¶
Return an iterator which yields the same values as glob() without actually storing them all simultaneously.
Parameters: pathname : string
A glob pattern string which will be used for finding files
Returns: iterator :
An iterator instance which will yield full path name
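The effect of a recursive glob can be sketched with the standard library by walking the tree and matching basenames. This mimics what a recursive iglob provides; it is not maidenhair's implementation, and the (root, pattern) signature here is an illustrative simplification of the single-pattern API above:

```python
import fnmatch
import os
import tempfile

# Sketch of recursive globbing: walk the directory tree and yield every
# path whose basename matches the pattern, lazily, like an iterator.
def iglob(root, pattern):
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)

# Usage: build a small tree and find all *.txt files anywhere inside it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for path in ('a.txt', os.path.join('sub', 'b.txt'), 'c.dat'):
        open(os.path.join(root, path), 'w').close()
    found = sorted(os.path.basename(p) for p in iglob(root, '*.txt'))
```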