Bento documentation contents¶
Overview¶
Bento aims to simplify the packaging of python software, from both the user's and the developer's point of view. Bento packages are described by a bento.info file, which is parsed by the different build tools to do the actual work. Currently, the main user interface to bento is bentomaker, a command line tool to build, install and query bento packages.
There are currently two ways to create bento packages: by writing a bento.info file from scratch, or by converting an existing setup.py.
Simple example¶
These examples assume you already have a usable bentomaker in your PATH, either through a bento installation or by using the one-file bentomaker bundle. If you can execute:
bentomaker help
successfully, you should be able to go on.
From scratch¶
Bento packages are created from a bento.info file, which describes metadata as well as package content in a mostly declarative manner.
For a simple python package hello consisting of two files:
hello/__init__.py
hello/hello.py
a simple bento.info may be written as follows:
Name: hello
Version: 1.0
Library:
    Packages:
        hello
The file contains some metadata, like package name and version. Its syntax is indentation-based, like python, except that only spaces are allowed (a tab character causes an error when used at the beginning of a line).
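Because a stray tab is easy to miss in an editor, it can help to check a bento.info file before handing it to bentomaker. The following is a minimal sketch of such a check; the helper name find_leading_tabs is hypothetical and is not part of bento.

```python
def find_leading_tabs(text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The indentation is everything before the first non-whitespace character.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


good = "Name: hello\nLibrary:\n    Packages:\n        hello\n"
bad = "Name: hello\nLibrary:\n\tPackages: hello\n"
print(find_leading_tabs(good))
print(find_leading_tabs(bad))
```

An empty list means that no line starts with a tab; any other result gives the line numbers to fix.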
Building and installing¶
You use bentomaker to build and install bento packages. Its interface is similar to autotools:
bentomaker configure --prefix=somedirectory
bentomaker install
If you are fine with default configuration values, you can install in one step:
bentomaker install
bentomaker will automatically determine which commands need to be re-run. You can check where bento installs files with the --list-files option (in which case bento does not install anything):
bentomaker install --list-files
Bentomaker contains a basic help facility, which lists the existing commands, etc.:
bentomaker help commands # list commands
From existing setup.py (conversion from distutils-based projects)¶
Bentomaker has an experimental convert command to convert an existing setup.py:
bentomaker convert
If successful, it will write a bento.info file whose content is derived from your setup.py. The convert command is inherently fragile, because it has to hook into distutils/setuptools internals. Nevertheless, it has been used successfully to convert packages such as Sphinx or Jinja.
Installing bento¶
setuptools-based installation (deprecated)¶
Bento has a setup.py file, and can be installed as any other conventional python software:
python setup.py install --user # for python >= 2.6
python setup.py install # otherwise
bento-based installer¶
Bento is now able to install itself. First, you need to create the bentomaker script:
python bootstrap.py
This will create a script (or an exe on windows) which can be used to install bento. Once created, bento is installed as a regular bento package:
./bentomaker configure
./bentomaker build
./bentomaker install
# Or an egg
./bentomaker build_egg
# Or a windows installer
./bentomaker build_wininst
Tutorial¶
This tutorial will guide you through the basics to package your python code with bento. Note that for an existing project using setup.py-based packaging, you should look at the convert command so that you don’t have to start from scratch.
Packaging a python module¶
First, let’s assume you have a simple piece of software, fubar, consisting of a single python module hello.py:
hello.py
A simple bento.info file would look as follows:
Name: fubar
Author: John Doe
Summary: a simple module
Library:
    Modules: hello
The indentation must be done through spaces (tabs are considered syntax errors). The bento.info is located just next to your hello.py:
hello.py
bento.info
That’s it, you have your first bento package!
Bentomaker¶
Currently, the only way to interact with bento is bentomaker, a command-line interface to bento. It is used to build, install and test packages from the command line:
bentomaker install
This will automatically run the configure and build commands for you. You can run them explicitly if you want to customize the installation, e.g.:
bentomaker configure --prefix=/blabla
bentomaker install
You can also build eggs, source tarballs and windows installers (windows only for now):
bentomaker sdist
bentomaker build_egg
bentomaker build_wininst
You can access the list of available commands with the help command:
bentomaker help commands
Adding packages¶
Adding a package (a directory with an __init__.py file) is simple as well. Assuming the following source tree:
hello.py
foo/__init__.py
foo/bar.py
You simply write:
Library:
    Packages: foo
Multiple packages are specified through a comma-separated list, and respect indentation:
Library:
    Packages: foo, bar
or:
Library:
    Packages:
        foo, bar
or:
Library:
    Packages:
        foo,
        bar
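The three spellings above describe the same list of packages. As a rough illustration (this is not bento's actual parser, and split_packages is a hypothetical helper), a comma-separated field value can be normalized like this:

```python
def split_packages(value):
    """Split a comma-separated field value that may span several lines."""
    # Whitespace around each name, including line breaks, is not significant.
    return [name.strip() for name in value.split(",") if name.strip()]


one_line = "foo, bar"
indented = "\n        foo, bar"
one_per_line = "\n        foo,\n        bar"
for value in (one_line, indented, one_per_line):
    print(split_packages(value))
```

Each call prints the same two names, which is why the choice between the three layouts is purely a matter of readability.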
Adding data files¶
Besides packages and modules, you may want to add extra files, like configuration, manpages, documentation, etc... Those are called data files. Bento has a simple but powerful way to install arbitrary data in arbitrary locations.
Installed vs non-installed files¶
Bento makes the distinction between the two following categories:
- installed files (data files): those files are part of the installed package
- extra source files: those files are not installed, but part of the source distribution. They may be README, or additional files necessary to build the software.
An extra source file will only be included in the source tarballs, whereas data files are installed and needed to use the software.
Installed data files: DataFiles section¶
Say our fubar software has one manpage fubar.1:
fubar.1
We need to add the following to bento.info:
DataFiles: manpage
    TargetDir: $mandir
    Files: fubar.1
This will install the file fubar.1 into $mandir (as $mandir/fubar.1). $mandir is expanded by bento to a sensible default on every supported platform, and can be customized at configuration time through the --mandir option. You can of course hardcode the install directory, e.g.:
DataFiles: manpage
    TargetDir: /usr/share/man/man1
    Files: fubar.1
but this is generally not recommended as it is not portable and makes native packaging more difficult. Bento has a simple mechanism so that you can add your own paths.
Extra source files¶
Extra source files are added through the ExtraSourceFiles section:
ExtraSourceFiles:
    setup.py
    test/*.py
Adding extensions¶
Extensions (compiled python modules) are supported as well. If you have an extension _hello built from the file hellomodule.c, you just write:
Library:
    Extension: _hello
        Sources: hellomodule.c
Adding compiled libraries¶
Similarly, if you have a compiled library (a C library which is not importable from python):
Library:
    CompiledLibrary: foo
        Sources: foo.c
Note that there is only one Library section, i.e. a package with both extensions and compiled libraries would look like:
Library:
    Extension: _hello
        Sources: hellomodule.c
    CompiledLibrary: foo
        Sources: foo.c
and not like:
Library:
    Extension: _hello
        Sources: hellomodule.c
Library:
    CompiledLibrary: foo
        Sources: foo.c
Note that it is currently not possible to link an extension against such a compiled library purely from the bento.info file: you need to use the hook mechanism.
Adding executables¶
Many python software packages are libraries, and their only use is from a python interpreter. Nevertheless, it is relatively common to provide a full program, be it a GUI or a command line tool. Bento uses a feature similar to setuptools to help you create “entry points” which work on both unix and windows systems:
Executable: foomaker
    Module: foomakerlib.foomaker
    Function: main
This tells bento to create a script called foomaker (foomaker.exe on windows), which calls the main function from the foomakerlib.foomaker python module. Those scripts are automatically installed in $bindir (which translates to /usr/local/bin by default on unix, and C:\Python*\Scripts on windows; both values may be changed by the user at the configure stage through the --bindir option).
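Conceptually, such a script only has to import the module named in the Module field and call the function named in the Function field. The sketch below illustrates that idea; it is not the script bento generates, load_entry_point is a hypothetical helper, and a standard library function stands in for foomakerlib.foomaker so that the example runs anywhere.

```python
import importlib


def load_entry_point(module_name, function_name):
    """Import module_name and return the callable named function_name."""
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Stand-in for "Module: foomakerlib.foomaker" and "Function: main".
entry = load_entry_point("posixpath", "basename")
print(entry("/usr/local/bin/foomaker"))
```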
Guides¶
Specifying data files¶
Most packages have some files besides pure code: configuration, data files, documentation, etc... When those files need to be installed, you should use DataFiles sections. Each such section has two mandatory fields, to specify the target directory (where files are installed) and which files are specific to this section:
DataFiles: manpage
    TargetDir: /usr/man/man1
    Files: fubar/fubar1
This will install the file top_dir/fubar/fubar1 into /usr/man/man1/fubar/fubar1.
Flexible install scheme¶
Hardcoding the target directory as above is not flexible. The user may want to install manpages somewhere else. Bento defines a set of variable paths which are customizable from bentomaker, with platform-specific defaults. For manpages, the variable is mandir:
DataFiles: manpage
    TargetDir: $mandir/man1
    Files: fubar/fubar.1
Now, the installation path is customizable, e.g.:
bentomaker configure --mandir=/opt/man
will cause the target directory to translate to /opt/man/man1. Moreover, as the default value of mandir is defined relative to $prefix ($prefix/man on unix), modifying the prefix will also change how mandir is expanded at install time:
# $mandir is automatically expanded to /opt/man
bentomaker configure --prefix=/opt
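The way one path variable depends on another can be pictured with a small substitution loop. The following is only a sketch of the idea, not bento's implementation: the expand function is hypothetical, and the two defaults are the unix values quoted in this documentation.

```python
from string import Template

# Unix defaults as described in the text: mandir is defined relative to prefix.
DEFAULTS = {"prefix": "/usr/local", "mandir": "$prefix/man"}


def expand(text, overrides=None):
    """Expand path variables in text, applying configure-time overrides."""
    values = dict(DEFAULTS)
    values.update(overrides or {})
    # Substitute repeatedly, since a variable may itself refer to another one.
    for _ in range(len(values) + 1):
        if "$" not in text:
            return text
        text = Template(text).substitute(values)
    raise ValueError("circular path definition")


print(expand("$mandir/man1"))
print(expand("$mandir/man1", {"mandir": "/opt/man"}))
print(expand("$mandir/man1", {"prefix": "/opt"}))
```

Overriding mandir directly and overriding prefix both end up changing the expanded target directory, which mirrors the two configure invocations shown above.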
If you do not want to install files with their directory component, you need to use the SourceDir option:
DataFiles: manpage
    TargetDir: $mandir
    SourceDir: fubar
    Files: fubar.1
will install fubar/fubar.1 as $mandir/fubar.1 instead of $mandir/fubar/fubar.1.
Custom data paths¶
While the default list should cover most package needs, it is sometimes useful to define a custom path variable:
Path: foo
    Description: foo directory
    Default: $datadir/foo
Bentomaker will automatically add the --foodir option, and $foo will be expanded to the customized value (or $datadir/foo by default). The description will be used as a description in the help message.
Conditional customization¶
It is sometimes necessary to define platform-specific defaults for custom paths. This can be done as follows:
Path: foo
    Description: foo directory
    if os(darwin):
        Default: /Library/foo
    else:
        Default: $bindir/foo
See the Conditionals section of the bento.info format reference for the available conditional forms.
Retrieving data files at runtime¶
It is often necessary to retrieve data files from your python code. For example, you may have a configuration file which needs to be read at startup. The simplest way to do so is to use __file__ and refer to data files relative to the location of the python code. This is not very flexible, because it requires dealing with platform idiosyncrasies w.r.t. file location. Setuptools and its descendants have an alternative mechanism to retrieve resources at runtime, implemented in the pkg_resources module.
Bento uses a much simpler system, based on a simple python module generated at install time, containing all the relevant information. This file is not generated by default, and you need to define which file will contain all those variables with the ConfigPy field:
ConfigPy: foo/__bento_config.py
This tells bento to generate a module, and install it into foo/__bento_config.py. The path is always relative to site-packages (e.g. /usr/local/lib/python2.6/site-packages/foo/__bento_config.py by default on unix). The file looks as follows:
DOCDIR = "/usr/local/share/doc/config_py"
SHAREDSTATEDIR = "/usr/local/com"
...
so that you can import every path variable with its expanded value in your package:
from foo.__bento_config import DOCDIR, SHAREDSTATEDIR
As the generated python module is a simple python file made of name/value pairs, it is easy to modify if desired (for debugging, etc...), and understandable by any python programmer.
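To make the shape of such a file concrete, here is a small sketch that renders a dictionary of expanded paths as python source. It only illustrates the format described above; render_config_py is a hypothetical function, not the code bento uses to write the file.

```python
def render_config_py(paths):
    """Return the source of a module defining one upper-case constant per path."""
    lines = []
    for name in sorted(paths):
        # repr() yields a valid python string literal for the expanded path.
        lines.append("%s = %r" % (name.upper(), paths[name]))
    return "\n".join(lines) + "\n"


source = render_config_py({
    "docdir": "/usr/local/share/doc/config_py",
    "sharedstatedir": "/usr/local/com",
})
print(source)

# The generated text is ordinary python, so it can be executed or imported.
namespace = {}
exec(source, namespace)
print(namespace["DOCDIR"])
```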
If you need to support the case where the package has not been built yet, you can do as follows:
try:
    from foo.__bento_config import DOCDIR, SHAREDSTATEDIR
except ImportError:
    # Default values (so that the package may be imported/used without
    # being built)
    DOCDIR = ...
This is not done by default as it is not possible to know the right default value.
Example¶
Assuming the following bento file:
...
DataFiles: test_data
    SourceDir: data
    TargetDir: $pkgdatadir
    Files:
        foo.dat
ConfigPy: foo/__bento_config.py
you can access “foo.dat” as follows in your package:
import os
try:
    from foo.__bento_config import PKGDATADIR
except ImportError:
    PKGDATADIR = "data" # default value
data = os.path.join(PKGDATADIR, "foo.dat")
This will point to the right location regardless of the value of $pkgdatadir.
Recursive package description¶
If you have a package with a lot of python subpackages which require custom configurations, doing everything in one bento.info file is restrictive. Bento has a simple recursive feature so that one bento.info can refer to another bento.info:
...
Recurse: foo, bar
The Recurse field indicates to bento that it should look for bento.info in both foo/ and bar/ directories. At this time, those bento.info files support a strict subset of the top bento.info. For example, no metadata may be defined in sub-bento.info.
Simple example¶
Let’s assume that you have a software with the packages foo, foo.bar and foo.fubar. The simplest way to define this software would be:
...
Library:
    Packages: foo, foo.bar, foo.fubar
Alternatively, an equivalent description, using the recursive feature:
...
Recurse: foo
Library:
    Packages: foo
and the foo/bento.info:
...
Library:
    Packages: bar, fubar
The packages are defined relative to the directory where the sub-bento.info file is located. Obviously, in this case, it is overkill, but for complex, deeply nested packages (like scipy or twisted), this makes the bento.info more readable. It is especially useful when you use this with the hook file mechanism, where each sub-bento.info file can drive a part of the configure/build through command hooks and overrides. In that case, the hook file defined in a subdirectory only sees the libraries, modules, etc... defined in the corresponding bento.info by default (see hook section).
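The relationship between the two descriptions can be spelled out in a few lines of python. This is a sketch of the naming rule only, under the assumption that a package listed in a sub-directory's bento.info is prefixed with that directory; qualify_packages is a hypothetical helper and not a bento function.

```python
def qualify_packages(subdir, packages):
    """Prefix package names from a sub-directory bento.info with that directory."""
    prefix = subdir.strip("/").replace("/", ".")
    return ["%s.%s" % (prefix, name) for name in packages]


top_level = ["foo"]                                      # from the top bento.info
from_subdir = qualify_packages("foo", ["bar", "fubar"])  # from foo/bento.info
print(top_level + from_subdir)
```

The combined list is the same as the one written by hand in the non-recursive description.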
Reference¶
bento.info format reference¶
Introduction¶
The package description is a text file, by default named bento.info. Its syntax is indentation-based. It does not yet support comments, and is currently limited to ASCII.
A typical .info file is structured as follows:
- The package metadata (name, version, etc...)
- Optionally, it may contain additional user-customizable options such as paths or flags, whose exact value may be set at configure time.
- A Library section, which defines the package content (packages, modules, C extensions, etc...)
- Optionally, the .info file may contain one or several Executable sections, to describe programs expected to be run from the command line or from a GUI. This is where distutils scripts and setuptools console scripts are defined.
Each section consists of field:value pairs:
- Both fields and values are case-sensitive.
- Indentation has to be in spaces; tab characters are not supported for indentation. Besides this constraint, indentation follows python’s own rules (an arbitrary number of spaces for a given indentation level).
Package metadata¶
Bento supports most of the metadata defined in PEP 241 and PEP 314. For a simple package containing one module hello, the bento.info metadata definition would look like:
Name: hello
Version: 0.0.1
Summary: A one-line description of the distribution
Description:
    A longer, potentially multi-line string.
    As long as the indentation is maintained, the field is considered as
    continued.
Author: John Doe
AuthorEmail: john@doe.org
License: BSD
Different fields have different values: they generally consist of either a word (string sequence without a space), a line (a sequence of words without a newline) or multiple lines (Description field only).
Note:: while most metadata defined in the PEP-241 and PEP-314 are supported syntax-wise, their semantics are not always implemented already.
Note:: the bento lexer is ad-hoc and not well specified at this stage. It was conceived to handle values in the reStructuredText format, but doing so prevents desired flexibility of the bento.info format itself, or would be too complex to support. Before 1.0, bento.info format may change so that fields in reStructuredText need to be put in a separate file, e.g.:
DescriptionFromFile: README.rst
If you are reading this and actually know something about parsing and have a better idea to support inline reST, I am open to suggestions!
Name¶
Format:
Name: ASCII_TOKEN
Name of the software being packaged. Its value should contain only alpha-numeric characters.
Url¶
Author¶
Author email¶
Maintainer¶
Maintainer email¶
License¶
Description¶
Platforms¶
Classifiers¶
User-customizable flags¶
Library section¶
Executable section¶
Pure python packages¶
Assuming a package with the following layout:
hello/pkg1/__init__.py
hello/pkg1/...
hello/pkg2/__init__.py
hello/pkg2/...
hello/__init__.py
it would be declared as follows:
Name: hello
Version: 0.0.1
Library:
    Packages:
        hello.pkg1,
        hello.pkg2,
        hello
The following syntax is also allowed:
Library:
    Packages:
        hello.pkg1, hello.pkg2, hello
as well as:
Library:
    Packages: hello.pkg1, hello.pkg2, hello
Packages containing C extensions¶
For a simple extension hello._foo, built from sources src/foo.c and src/bar.c, the declaration is as follows:
Library:
    Extension: hello._foo
        Sources:
            src/foo.c,
            src/bar.c
Note: none of the other distutils Extension arguments (macro definitions, etc...) are supported yet.
Packages with data files¶
Adding data files in bento is easy. By data files, we mean any file other than C extension sources and python files. There are two kinds of data files in bento:
- Installed data files: those are installed somewhere on the user system at installation time (distutils package_data and data_files, numpy.distutils add_data_files and add_data_dir).
- Extra source files: those are only necessary to build the package, and are not installed. As such, they only need to be included in the source tarball (distutils MANIFEST[.in] mechanism, automatic inclusion from the VCS in setuptools, etc...)
Extra source files¶
Extra source files are simply declared in the section ExtraSourceFiles (outside any Library section):
ExtraSourceFiles:
    AUTHORS,
    CHANGES,
    EXAMPLES,
    LICENSE,
    Makefile,
    README,
    TODO,
    babel.cfg
Those will always be included in the tarball generated by bentomaker sdist. A limited form of globbing is allowed:
ExtraSourceFiles:
    doc/source/*.rst
    doc/source/chapter1/*.rst
that is, globbing on every file with the same extension is allowed. Any other form of globbing, in particular recursive globbing, is purposely not supported, to avoid cluttering the tarball by accident.
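The restriction can be read as: a pattern is either a plain file name, or a fixed directory followed by *.ext. The sketch below encodes that reading; the exact rules bento applies are not spelled out here, so treat is_supported_pattern as a hypothetical helper expressing one plausible interpretation.

```python
import posixpath

WILDCARDS = "*?["


def is_supported_pattern(pattern):
    """Accept plain file names and 'dir/*.ext' patterns, reject anything else."""
    directory, basename = posixpath.split(pattern)
    if any(char in directory for char in WILDCARDS):
        return False  # wildcards in the directory part, e.g. recursive globs
    if not any(char in basename for char in WILDCARDS):
        return True  # a literal file name
    stem, extension = posixpath.splitext(basename)
    # Only "every file with a given extension" is accepted.
    return (stem == "*" and extension != ""
            and not any(char in extension for char in WILDCARDS))


for pattern in ("README", "doc/source/*.rst", "doc/**/*.rst", "doc/chap*.rst"):
    print(pattern, is_supported_pattern(pattern))
```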
Installed data files¶
It is often needed to install data files within the rest of the package. Bento’s system is both simple and flexible enough so that any file in your sources can be installed anywhere. The most simple syntax for data files is as follows:
DataFiles:
    TargetDir: /etc
    Files:
        somefile.conf
This installs the file somefile.conf into /etc. Using hardcoded paths should be avoided, though. Bento allows you to use “dynamic” path instead. This scheme should be familiar to people who have used autotools:
DataFiles:
    TargetDir: $sysconfdir
    Files:
        somefile.conf
$sysconfdir is a path variable: bento defines several path variables (available on every platform), which may be customized at the configure stage. For example, on Unix, $sysconfdir is defined as $prefix/etc, and prefix is itself defined as /usr/local. If prefix is changed, sysconfdir will be changed accordingly. Of course, sysconfdir itself may be customized as well. This allows for very flexible installation layout, and every particular install scheme (distutils --user, self-contained as in GoboLinux or Mac OS X) may be implemented on top.
It is also possible to define your own path variables (see Path option section).
SourceDir field¶
By default, the installed name is the concatenation of the target directory and the values in Files, e.g.:
DataFiles:
    TargetDir: $includedir
    Files:
        foo/bar.h
will be installed as $includedir/foo/bar.h. If instead, you want to install foo/bar.h as $includedir/bar.h, you need to use the SourceDir field:
DataFiles:
    TargetDir: $includedir
    SourceDir: foo
    Files:
        bar.h
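The effect of the field on source and installed names can be summarized by two joins. The following sketch is only an illustration of the rule described above; installed_name is a hypothetical helper and path variables are left unexpanded.

```python
import posixpath


def installed_name(target_dir, filename, source_dir=""):
    """Return (source path, installed path) for one entry of a DataFiles section."""
    source = posixpath.join(source_dir, filename)
    target = posixpath.join(target_dir, filename)
    return source, target


# Without the field, the directory component is kept in the installed name.
print(installed_name("$includedir", "foo/bar.h"))
# With the source directory set to foo, the same file is installed flat.
print(installed_name("$includedir", "bar.h", source_dir="foo"))
```

In both cases the file read from the source tree is foo/bar.h; only the installed name differs.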
Named data files section¶
You can define as many DataFiles sections as you want, as long as you name them, i.e.:
DataFiles: man1
    TargetDir: $mandir/man1
    SourceDir: doc/man
    Files:
        *.1
DataFiles: man3
    TargetDir: $mandir/man3
    SourceDir: doc/man
    Files:
        *.3
is ok, but:
DataFiles:
    TargetDir: $mandir/man1
    SourceDir: doc/man
    Files:
        *.1
DataFiles:
    TargetDir: $mandir/man3
    SourceDir: doc/man
    Files:
        *.3
is not.
Available path variables¶
By default, bento defines the following path variables:
- prefix: install architecture-independent files
- eprefix: install architecture-dependent files
- bindir: user executables
- sbindir: system admin executables
- libexecdir: program executables
- sysconfdir: read-only single-machine data
- sharedstatedir: modifiable architecture-independent data
- localstatedir: modifiable single-machine data
- libdir: object code libraries
- includedir: C header files
- oldincludedir: C header files for non-gcc
- datarootdir: read-only arch.-independent data root
- datadir: read-only architecture-independent data
- infodir: info documentation
- localedir: locale-dependent data
- mandir: man documentation
- docdir: documentation root
- htmldir: html documentation
- dvidir: dvi documentation
- pdfdir: pdf documentation
- psdir: ps documentation
While some of those path semantics don’t make sense on some platforms such as windows, they are defined everywhere with defaults, to ensure a consistent interface across platforms. They are also defined to get a 1-to-1 correspondence with the autoconf conventions, which are familiar to most packagers on open source OS and system administrators.
Conditionals¶
It is not always possible to use the same package description for every platform. It may also be desirable to enable/disable some parts of a package depending on some option. For this reason, the .info file supports a limited form of conditionals. For example:
Library:
InstallRequires:
docutils,
sphinx
if os(windows):
pywin32
The following conditional forms are available:
- os(value): condition on the OS
- flag(value): user-defined flag, boolean
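To illustrate how these two forms could be evaluated, here is a small sketch. It is not bento's evaluator: the evaluate function is hypothetical, and the platform name is passed in explicitly because this documentation does not state which value bento compares os(value) against.

```python
import re

_CONDITION = re.compile(r"^(os|flag)\((\w+)\)$")


def evaluate(condition, platform_name, flags):
    """Evaluate 'os(name)' or 'flag(name)' against a platform name and flag values."""
    match = _CONDITION.match(condition.strip())
    if match is None:
        raise ValueError("unsupported condition: %r" % condition)
    kind, value = match.groups()
    if kind == "os":
        return value == platform_name
    # Flags are booleans; an undefined flag is treated as false in this sketch.
    return bool(flags.get(value, False))


print(evaluate("os(windows)", "windows", {}))
print(evaluate("os(windows)", "linux", {}))
print(evaluate("flag(debug)", "linux", {"debug": True}))
```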
Adding custom options¶
Path option¶
A new path option may be added:
Path: octavedir
    Description: octave directory
    Default: $datadir/octave
Bentomaker automatically adds an --octavedir option (with help taken from the description), and $octavedir may be used inside the .info file.
Flag option¶
A new flag option may be added:
Flag: debug
    Description: build debug
    Default: false
Bentomaker automatically adds a corresponding option for the debug flag (with help taken from the description), and the flag may be tested inside the .info file with the flag(debug) conditional.
Bentomaker, the command line interface to bento¶
Introduction¶
Bentomaker is a simple python package which uses bento API to configure, build and install packages. A simple install with bentomaker looks like this:
bentomaker configure --prefix=/home/david/local
bentomaker build
bentomaker install
Or more simply:
bentomaker configure --prefix=/home/david/local
bentomaker install
Bentomaker commands know which other commands they depend on; those dependencies are run automatically if necessary.
Bentomaker has a basic help facility:
bentomaker help
will list all available commands. Once the project is configured, every installation path and user customization is set up, and cannot be changed (except by reconfiguring the package, of course).
Available commands¶
configure¶
This command must be run before any build/install command. It is similar to the well-known configure script from autoconf. Every customizable option is available from the command help:
bentomaker configure -h
If the configure command is not run explicitly, it will automatically be run by any subsequent command.
build¶
This simply builds the package. For pure-python packages, it does almost nothing, except producing a build manifest. For packages with C extensions, the C extensions are built.
install¶
build_egg¶
This command builds an egg from the package description. It currently requires that configure and build commands have been run.
This is experimental - although I intend to produce eggs which are as backward compatible as possible with existing tools (in particular enstaller, and hopefully virtualenv and buildout), eggs are implementation defined, and depend a lot on distutils idiosyncrasies.
sdist¶
This simply produces a source tarball. Currently, only .tar.gz is supported.
convert¶
This converts a package built from distutils, setuptools or numpy.distutils:
bentomaker convert
If successful, it will produce a bento.info file.
This is experimental, and may not work. Also, it cannot convert every package accurately, as it is based on inspecting setup.py’s execution. Nevertheless, it can already convert simple but non-trivial packages such as sphinx fairly accurately.
Single-file distribution¶
Ultimately, deployment is about making your code available to your users: adding a dependency on bento in your package goes against it. To that end, the bento sources include a script which builds a single-file distribution of bento:
python tools/singledist.py
This creates a bentomaker (bentomaker.exe on windows) file which contains everything needed to configure, build and install software packaged with bento. You only need to include this file in your source tarball, and that’s it – no need to install anything.
How does this work?¶
The process is taken from the waf project: the single file is basically a simple python script which contains enough code to bootstrap itself, and a long ASCII-encoded string representing the full bento code compressed in bzip2 format.
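The principle of embedding compressed code as text inside a script can be demonstrated with the standard library alone. The sketch below is an illustration under assumptions: it uses base64 as the ASCII encoding, which is not necessarily the encoding used by bento or waf, and the pack and unpack names are hypothetical.

```python
import base64
import bz2


def pack(payload):
    """Compress bytes with bzip2 and encode the result as ASCII text."""
    return base64.b64encode(bz2.compress(payload)).decode("ascii")


def unpack(text):
    """Reverse pack(): decode the ASCII text and decompress it."""
    return bz2.decompress(base64.b64decode(text))


original = b"print('hello from the embedded code')\n" * 50
embedded = pack(original)   # a string that could be pasted into a script
restored = unpack(embedded)
print(len(original), len(embedded), restored == original)
```

A bootstrap script built this way only needs to unpack the string into a temporary location and import the result.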
Note:: as of today, most of the space is taken by windows executables. If you don’t support windows, you can strip down the size to around 120 kb:
python tools/singledist.py --noinclude-exe
Transition from existing python packaging infrastructure¶
Even if you are convinced that bento is more appropriate for your needs than current distutils-based tools, there is a significant hurdle to transition to a new infrastructure for your package. First, you need to convert your package, but you also potentially lose goodies such as putting your package on pypi, or being installable through tools such as pip or easy_install.
Ideally, such tools would become pluggable so that they can be made aware of new packaging formats, but in the meantime, the practical approach of bento is to “emulate” distutils just enough to make them work with the most useful bits of the current python packaging infrastructure, and to provide tools to convert existing setup.py to the bento format.
Converting distutils-based packages¶
The bentomaker command-line tool has a convert command which should be run at the top of your source tree (the directory containing your top setup.py). Because the convert command works by running the setup.py, you need to make sure you can run the setup.py. To convert your package, just do:
bentomaker convert
If successful, this will write a bento.info file whose content is derived from the analysis done by the convert command (it will not overwrite an existing one). It first tries to determine whether your setup.py uses setuptools or not, and then runs it with mocked distutils objects for the actual conversion. Since the convert command works by inserting various hooks into distutils internals, it is inherently fragile.
It will definitely not work in the following cases:
- you use the package_dir feature: bento does not support the feature at all.
- you have your own distutils extensions (setuptools and numpy.distutils are handled to some extent, though, and other common distutils extensions may be added as well).
It should support the following features:
- All the distutils metadata
- Some setuptools metadata (like require or console scripts)
- module, packages and extensions
- data files as specified in data_files
- source files in MANIFEST[.in]
Note:: because the convert command does not parse the setup.py, but runs it instead, it only handles the package description as defined by this one run of setup.py. For example, bentomaker convert cannot automatically handle the following setup.py:
import sys
from setuptools import setup
if sys.platform == "win32":
    requires = ["sphinx", "pywin32"]
else:
    requires = ["sphinx"]
setup(name="foo", install_requires=requires)
If run on windows, the generated bento.info will be:
Name: foo
Library:
    InstallRequires:
        pywin32,
        sphinx
and:
Name: foo
Library:
    InstallRequires:
        sphinx
otherwise.
Note:: bento syntax supports simple conditionals, so after conversion, you could modify the generated file as follows:
Name: foo
Library:
    InstallRequires:
        sphinx
    if os(win32):
        InstallRequires:
            pywin32
Adding bento-based setup.py for compatibility with pip, etc...¶
Although nothing fundamentally prevents bento from working under installers such as pip, pip currently does not know anything about bento. To help the transition, bento has a distutils compatibility layer. A setup.py as simple as:
import setuptools
from bento.distutils.monkey_patch import monkey_patch
monkey_patch()
from setuptools import setup
if __name__ == '__main__':
    setup()
will enable commands such as:
python setup.py install
python setup.py sdist
to work as expected, taking all the package information from bento.info file. Note that the monkey-patching done by bento.distutils on top of setuptools is explicit - solely importing bento.distutils will not monkey patch anything. A simpler, setuptools-style monkey patch is also possible:
import setuptools
from bento.distutils.monkey_patch import setup
if __name__ == '__main__':
    setup()
Note:: obviously, this mode will not enable all the features offered by bento. If it were possible, bento would not have been written in the first place. Nevertheless, the following commands should work relatively well as long as you don’t have hooks:
- sdist
- bdist_egg
- install
This should be enough for pip install foo or easy_install foo to work for a bento-based package.
How to contribute¶
Although bento is still in early stages, there are several ways to contribute to the project effectively if you are interested:
- Try the convert command on as many packages as possible, and report failures
- Report bento bugs
You may also look at the TODO list, and take a shot at the missing features. As bento design is still in flux, please discuss any non-trivial feature on the Mailing List: bento@librelist.com.
Design notes¶
This is not really readable at the moment, mostly some personal notes about current internal design.
Bento is currently split into two parts: a core API to parse the package description into a simple object API, and a commands library which gives a command line interface to bento.
The main design philosophy of bento is to clearly separate the different stages of packaging deployment, as we believe it is the only way to make a build tool extensible.
Commands “protocol”¶
The command line interface of bento currently supports 3 stages:
- configuration: is concerned with configuring user options (build/install customization).
- build: compile C extensions
- install: deploy the software into the system as configured at the first stage. Installers are considered installation as well for reasons explained later.
Although those stages are very similar to distutils/setuptools mechanism, the implementation is fundamentally different, because each stage is mostly independent from each other. No python object is directly shared between commands - the current bentomaker implementation implements each stage as a separate run. Once configured, every command has access to all options.
Build manifest and building installers¶
Bento uses a slightly unusual process to install the bits of your package. Instead of copying the files directly to the desired location, the install process is driven by a build manifest. This build manifest is produced by the build command. It contains a description of files per category as well as a few metadata. The syntax is based on JSON so that it can easily be parsed from any language and in most environments (local machine, browser, etc...).
Format internals¶
(This is likely to change in the future)
The JSON file contains 4 elements:
- meta: this contains the metadata (as defined in the relevant packaging PEP)
- install_paths: a dict of the configured paths
- file_sections: a list of so-called file sections
- executables: a list of executable sections
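A minimal sketch of what reading such a file could look like, assuming only the four top-level elements listed above. The values inside meta and install_paths are invented for illustration, and load_manifest is a hypothetical helper, not part of bento's API.

```python
import json

# Illustrative manifest: only the four top-level keys come from the text.
manifest_text = """
{
  "meta": {"name": "hello", "version": "1.0"},
  "install_paths": {"prefix": "/usr/local", "bindir": "$prefix/bin"},
  "file_sections": [],
  "executables": []
}
"""

EXPECTED_KEYS = {"meta", "install_paths", "file_sections", "executables"}


def load_manifest(text):
    """Parse a manifest and check that the four elements are present."""
    manifest = json.loads(text)
    missing = EXPECTED_KEYS - set(manifest)
    if missing:
        raise ValueError("missing manifest elements: %s" % sorted(missing))
    return manifest


manifest = load_manifest(manifest_text)
print(sorted(manifest))
```

Since the format is plain JSON, the same check could be written in any language with a JSON parser.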
File sections¶
A list of dictionaries. Each dictionary contains:
- category: the category name
- name: name of this section
- files: a list of (source, target) pairs
- source_dir: os.path.join(source_dir, source) gives an absolute path for each source file
- target_dir: os.path.join(target_dir, target) gives an absolute path for each target file
Note that both source_dir and target_dir can refer to path variables as defined in the install_paths section. This makes it possible to “retarget” a build tree to different tree configurations, as required by different package formats.
Example:
{
    "category": "executables",
    "files": [
        [
            "bentomaker-2.7",
            "bentomaker"
        ]
    ],
    "name": "bentomaker",
    "source_dir": "$_srcrootdir/scripts-2.7",
    "target_dir": "$bindir"
}
This is interpreted as installing the file $_srcrootdir/scripts-2.7/bentomaker-2.7 into $bindir/bentomaker.
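The interpretation above can be sketched with the standard string.Template class, which uses the same $name syntax. This is an assumption about how the substitution might be done, not bento's actual resolver. The resolve and install_pairs helpers and the values given to the path variables are hypothetical, and posixpath.join stands in for os.path.join so that the result does not depend on the platform.

```python
import posixpath
from string import Template


def resolve(value, paths, max_depth=10):
    """Expand $variables repeatedly, since a path may refer to another path."""
    for _ in range(max_depth):
        expanded = Template(value).safe_substitute(paths)
        if expanded == value:
            break
        value = expanded
    return value


def install_pairs(section, paths):
    """Return the (source, target) file paths described by a file section."""
    source_dir = resolve(section["source_dir"], paths)
    target_dir = resolve(section["target_dir"], paths)
    return [(posixpath.join(source_dir, source), posixpath.join(target_dir, target))
            for source, target in section["files"]]


section = {
    "category": "executables",
    "files": [["bentomaker-2.7", "bentomaker"]],
    "name": "bentomaker",
    "source_dir": "$_srcrootdir/scripts-2.7",
    "target_dir": "$bindir",
}

# Hypothetical path values; a real manifest takes them from install_paths.
paths = {"_srcrootdir": "build", "prefix": "/usr/local", "bindir": "$prefix/bin"}
print(install_pairs(section, paths))

# Retargeting: same section, different prefix.
print(install_pairs(section, dict(paths, prefix="/opt/bento")))
```

Only the path dictionary changes between the two calls, which is what lets one build tree feed several package layouts.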
Advantages¶
The built bits and the build manifest are enough to install the software to an arbitrary location, so that the install process does not need to know anything about the build process. Conversely, as long as you can produce a build manifest, you can use the installation commands as is.
Besides installation, the manifest is also used to produce installers. Currently, Windows installers (both .exe and .msi), eggs and mpkg are supported, and adding new types of installers should be easier than with distutils. If you look at the source code of the build_wininst and build_egg commands, they are simple, and most of the “magic” happens in the build manifest. In particular, the build manifest still refers to installed bits relative to abstract paths, and those paths are resolved when building the installers.
Installers conversion¶
The build manifest is intended to be included in each produced installer, to allow conversion between the various formats. The goal is to have idempotent conversions (e.g. converting an egg to wininst and then converting it back to an egg produces the exact same egg).
We also intend to use the build manifest for the upcoming “nest” service, which will contain a database of installed software.
FAQ¶
Why create a new tool ?¶
Because scientific code depends so much on compiled languages (C and Fortran), the scipy community had to significantly extend distutils. These extensions proved more and more difficult to maintain, and became the source of numerous user complaints. In the last decade, several attempts at refactoring distutils and our extensions have been made, but none succeeded.
Bento was born out of this experience. We also believe that current solutions based on distutils suffer from a lot of NIH (not-invented-here syndrome), and ignore lessons learned in packaging in most other systems. Bento aims at shamelessly copying what works in other systems (CPAN, CRAN, JSAN, HackageDB).
It should be noted that while bento currently focuses first on improving the situation for the scipy community, it is in no way specific to it. Some features, like the flexible installation scheme and simple data files handling, are potentially useful for anyone.
What are the goals of bento ?¶
The main goal of bento is to separate the concerns of building, packaging and package description, so that it can be easily reused within custom build frameworks (make, waf, scons, etc...). A simple build system is also provided so that simple packages do not need to deal with anything besides bento.
Bento aims at being part of a grander vision for scientific computing, to make something like CPAN or CRAN available to python users. By being simpler and more explicit, it is hoped that bento will make the development of a science-specific PyPI easier.
Why not extend existing tools (distutils, etc...) ?¶
There is a general consensus at least in the scientific python community that distutils is deeply flawed:
- The design by commands does not make much sense. In distutils, each command has its own set of options, and getting the options from other commands is difficult, if not impossible. For example, the install paths are only known once the install command's finalize_options has been run, but knowing the install prefix at build time is often useful.
- There is no developer documentation, and what constitutes the public API is not documented either. Consequently, every non-trivial distutils extension relies on internal details, and as such is fragile.
- Extending by inheritance does not work well: when two modules A and B extend distutils, it becomes difficult for B to reuse A (for example, dealing with setuptools in numpy.distutils extensions has been a constant source of bugs).
- Customizing compilation flags, and more generally some tools involved in compilation is too complicated. For example, adding a new tool in the build chain requires rewriting the build command, which is aggravated by the previous issue. We believe fixing this would end up in rewriting the whole thing.
- Improving distutils to handle dependencies automatically (rebuild only the necessary .c files) is difficult because of the way distutils is designed (build split across different commands, which may be re-executed).
- The codebase quality is horrible. Subclasses don’t share the same interface, numerous attributes are conditionally added on the fly depending on options, etc...
Overall, there is little to save in the current codebase. At least all of the command and ccompiler code must go away, and that’s already 2/3 of distutils code. Given the relatively small size of distutils code, the only asset is its “API”, but fixing what’s wrong with distutils precisely means breaking the API. As such, a new tool written from scratch, but taking inspiration from existing tools elsewhere, is much more likely to be an actual improvement.
One should note that numpy’s extensions to distutils are pretty big: numpy.distutils itself is as big as distutils in terms of code size, and is the biggest user of the distutils API as far as I know. Hence, we are well aware of the cost of a total break from distutils.
What about distutils2 ?¶
We believe that most efforts in distutils2 are peripheral to our core issues as described above, and won’t improve the situation for the scipy community.
Starting from the distutils codebase is not very appealing, as most of it would need to be scrapped (at least the whole command and compiler business needs to be completely rewritten). Distutils2/packaging-related PEPs pushed by the distutils2 team will be implemented on a case-by-case basis (some of them are obsolete as far as bento is concerned, in the sense that they are already implemented, if only in intent).
Moreover, as bento is designed from the ground up to be split into mostly independent parts, it is possible to reuse its code in other projects. No effort will be made to tie some features to bento to force people to use it. If bento ends up being an experiment into useful new APIs integrated into distutils2, bento would be considered successful. If our vision ends up being wrong or unreachable, some of the code should be useful nonetheless.
Isn’t it too difficult to support building extensions on every platform ?¶
People often assume that distutils has a lot of platform-specific knowledge, in particular to build C extensions. With a few exceptions (mostly on non-Unix platforms), most of this knowledge actually comes from autoconf through the sysconfig module.
Any non-superficial modification of the C compilation part of distutils will also require reworking the platform-specific knowledge anyway.
What about existing projects using distutils ?¶
Bentomaker, the command line interface to bento, contains an experimental command to convert existing setup.py to bento format.
It is also possible to write a setup.py which “fakes distutils” while using bento for its implementation. This allows a bento-based package to be installable from easy_install or pip.
Is bento based on existing tools ?¶
The main inspirations for bento’s current design are taken from:
- Cabal, the packaging tool for Haskell: the bento file format is mainly an adaptation of Cabal to python.
- Autoconf, for the flexible install scheme, automake’s way of declaring extra distribution files (data files).
- RPM, for the spec file format.
- Setuptools: exe-based script generation on Windows, egg format
Who are the authors of bento ?¶
Currently, I (David Cournapeau) am the main author of bento. I am a core contributor to Numpy and Scipy, and have been the main maintainer of Numpy distutils extensions for more than two years. I am also an occasional contributor to scons (a make replacement in python), and a Debian packager.
- Other contributors:
- Stefan Van der Walt: initial implementation of the bento.info parser
- Philip J. Eby: for answering most of my questions about setuptools/eggs design
- A lot of inspiration came from waf, a great make replacement in python:
- Single file distribution
- Yaku, bento’s internal build system, is a dumbed-down waf clone
What are the main features of bento compared to its competitors ?¶
- Bento has the following main features:
- Full static metadata description for simple packages
- Arbitrary extensibility through python scripts
- Reliable build and installation: no more stale files when installing, out-of-date source files and dependencies automatically detected for C extensions
- Optional recursive package description for complex packages
- Pluggable build backend: waf, distutils and a custom one are currently implemented. One could think about adding support for gyp, make, scons, etc...
- Robust command dependencies from a dependencies descriptor: no more monkey-patching nonsense to insert a new command between two existing subcommands
- The following features are being implemented as well:
- New packaging format which can be translated to any existing one if wanted (egg, wininst, msi, etc...). The format is optimized for installation
- Reliable uninstallation
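To give an idea of what deriving command order from a dependencies descriptor can mean, here is a minimal sketch that uses the standard graphlib module (Python 3.9 and later). The command names and the dictionary layout are hypothetical and this is not bento's actual descriptor format. Each command simply lists the commands that must run before it.

```python
from graphlib import TopologicalSorter


def command_order(dependencies):
    """Return the commands in an order that respects their dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


# Each command maps to the set of commands that must run before it.
dependencies = {
    "configure": set(),
    "build": {"configure"},
    "install": {"build"},
}
print(command_order(dependencies))

# Inserting a new command between build and install only requires
# declaring its dependencies; no existing command is patched.
dependencies["build_docs"] = {"build"}
dependencies["install"] = {"build", "build_docs"}
print(command_order(dependencies))
```

The second call places build_docs between build and install purely from the declared relations.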
Does bento support virtualenv ?¶
Depending on your definition of support, yes. If you run inside a virtualenv, the following:
bentomaker configure
bentomaker install
will install the package inside the virtual environment (i.e. the same default as when the setup.py uses setuptools). If you customize the prefix at the configure stage, the virtual environment will of course not be taken into account:
bentomaker configure --prefix=/usr/local
bentomaker install
Why shouldn’t I use bento ?¶
While I believe bento to be significantly better than other existing solutions, bento has some significant disadvantages as well that you need to be aware of:
- Still mostly a one-man show. However, once bento reaches a satisfying level, it will likely be used as a replacement to distutils for numpy and scipy, and hopefully beyond
- Weak documentation: hopefully, this is getting better.
- Mediocre code quality: I focused on the general architecture and low coupling, which are the main issues I had with distutils, but at a lower level, a lot of code leaves much to be desired (style inconsistencies, etc...).
Is bento API stable ?¶
As suggested by the current version number, no. As long as you only use the bento.info file (no hook), you should be pretty safe: I don’t expect the bento.info file to change in any significant backward-incompatible way.
However, the API to be used inside hook files leaves a lot to be desired, and will change in backward incompatible ways before the first alpha. The good side is that you can complain about the API and get it fixed until then.
TODO¶
Note: this is quite obsolete.
TODO:
- add 2to3 command
- think about integration with sphinx for doc
- test command support
- specify hook mechanism
- add proper egg support
- namespace packages: how to deal with them (file description and runtime support) ?
- port stdeb to bento
- handle reliable install/uninstall
- fix messy lexer/parser code
- Not well thought out yet:
- supporting everything that pkg_resources does (namespace package), except multiple-version installs.
Syntax and features of the package description file¶
The parser and lexer need to be seriously cleaned up.
Missing features:
- Format-Versioning
- Options declaration besides boolean ?
- Unicode support
Install-Reinstall-Rebuild-Clean problem¶
Reliable install/reinstall¶
InstalledPkgInfo should be enough to install/uninstall things, so including it in installers should be sufficient to get all the data, although it may not be very efficient.
Fundamental problem: bento vs native packages. Possible solutions:
- 1 create a new local site-packages specific to bento, and only use bento-enabled packages for dependencies:
- advantages: reliable, relatively simple
- disadvantages: invasive, requires all dependencies to be under bento (in particular numpy/scipy/matplotlib)
- 2 try to cope with existing, already installed packages:
- advantages: no barrier to entry, gradual migration
- disadvantages: how to do it ?
Scipi¶
PyPI does not work for the scientific community, so we need to replace it with our own stack. The goal is something like CRAN:
- publish a package from sdist with a cabal-like file to scipi
- the package would be automatically checked for metadata consistency and built (including documentation)
- if the package builds correctly, the package will be available on the given platform(s)
- scipi would have a simple web interface à la CRAN
Technical issues:
- Simple server for published files (mirrored through rsync). Ideally, pure HTTP-based file serving is enough
- Simple web API to get metadata + files
- Look at HackageDB in detail