What is RelStorage?

RelStorage is a storage implementation for ZODB that stores pickles in a relational database. PostgreSQL 9.0 and above, MySQL 5.0.32+ / 5.1.34+, and Oracle 10g and 11g are currently supported. RelStorage replaced the PGStorage project.

Features

  • It is a drop-in replacement for FileStorage and ZEO.
  • There is a simple way to convert FileStorage to RelStorage and back again. You can also convert a RelStorage instance to a different relational database.
  • Designed for high volume sites: multiple ZODB instances can share the same database. This is similar to ZEO, but RelStorage does not require ZEO.
  • According to some tests, RelStorage handles high concurrency better than the standard combination of ZEO and FileStorage.
  • Whereas FileStorage takes longer to start as the database grows due to an in-memory index of all objects, RelStorage starts quickly regardless of database size.
  • Supports undo, packing, and filesystem-based ZODB blobs.
  • Both history-preserving and history-free storage are available.
  • Capable of failover to replicated SQL databases.
  • zodbconvert utility to copy/transform databases.
  • zodbpack utility to pack databases.
  • zodburi support.
  • Free, open source (ZPL 2.1)

See the rest of the documentation for more information.

RelStorage’s Documentation

Contents:

Installation

RelStorage 2.0 is supported on Python 2.7, Python 3.4, 3.5 and 3.6, and PyPy2 5.4.1 or later.

You can install RelStorage using pip:

pip install RelStorage

If you use a recent version of pip to install RelStorage on a supported platform (OS X, Windows or “manylinx”), you can get a pre-built binary wheel. If you install from source or on a different platform, you will need to have a functioning C compiler.

RelStorage requires a modern version of ZODB and ZEO; it is tested with ZODB 4.4 and 5.x and ZEO 4.3 and 5.x. If you need to use an older version of ZODB/ZEO, install RelStorage 1.6. Likewise, if you need Python 2.6 support, install RelStorage 1.6 (note that 1.6 does not run on Python 3 or PyPy).

Database Adapter

You also need the Python database adapter that corresponds with your database.

Tip

The easiest way to get the recommended and tested database adapter for your platform and database is to install the corresponding RelStorage extra; RelStorage has extras for all three databases that install the recommended driver on all platforms:

pip install "RelStorage[mysql]"
pip install "RelStorage[postgresql]"
pip install "RelStorage[oracle]"

On CPython2, install psycopg2 2.6.1+, mysqlclient 1.3.7+, or cx_Oracle 5.2+ (but use caution with 5.2.1+); PyMySQL 0.7, MySQL Connector/Python 2.1.5 and umysql are also known to work as is pg8000.

For CPython3, install psycopg2, mysqlclient or cx_Oracle; PyMySQL, MySQL Connector/Python and pg8000 are also known to work.

On PyPy, install psycopg2cffi 2.7.4+ or PyMySQL 0.6.6+ (PyPy will generally work with psycopg2 and mysqlclient, but it will be much slower; in contrast, pg8000 performs nearly as well. cx_Oracle is untested on PyPy).

Here’s a table of known (tested) working adapters; adapters in bold are the recommended adapter.

Platform MySQL PostgreSQL Oracle
CPython2
  1. mysqlclient
  2. PyMySQL
  3. umysqldb
  4. MySQL Connector
  1. psycopg2
  2. pg8000
cx_Oracle
CPython3
  1. mysqlclient
  2. PyMySQL
  3. MySQL Connector
  1. psycopg2
  2. pg8000
cx_Oracle
PyPy
  1. PyMySQL
  2. MySQL Connector
  1. psycopg2cffi
  2. pg8000
 

mysqlclient, MySQL Connector/Python (without its C extension), pg8000 and umysql are compatible (cooperative) with gevent.

For additional details and warnings, see the “driver” section for each database in Database-Specific Adapter Options.

Memcache Integration

If you want to use Memcache servers as an external shared cache for RelStorage clients, you’ll need to install either pylibmc (C based, requires Memcache development libraries and CPython) or python-memcached (pure-Python, works on CPython and PyPy, compatible with gevent).

Once RelStorage is installed, it’s time to configure the database you’ll be using.

Configuring Your Database

You need to configure a database (schema) and user account for RelStorage. RelStorage will populate the database with its schema the first time it connects. Once you have the database configured, you can configure your application to use RelStorage.

Note

If you’ll be developing on RelStorage itself, see how to set up databases to run tests.

PostgreSQL

If you installed PostgreSQL from a binary package, you probably have a user account named postgres. Since PostgreSQL respects the name of the logged-in user by default, switch to the postgres account to create the RelStorage user and database. Even root does not have the PostgreSQL privileges that the postgres account has. For example:

$ sudo su - postgres
$ createuser --pwprompt zodbuser
$ createdb -O zodbuser zodb

Alternately, you can use the psql PostgreSQL client and issue SQL statements to create users and databases. For example:

$ psql -U postgres -c "CREATE USER zodbuser WITH PASSWORD 'relstoragetest';"
$ psql -U postgres -c "CREATE DATABASE zodb OWNER zodbuser;"

New PostgreSQL accounts often require modifications to pg_hba.conf, which contains host-based access control rules. The location of pg_hba.conf varies, but /etc/postgresql/8.4/main/pg_hba.conf is common. PostgreSQL processes the rules in order, so add new rules before the default rules rather than after. Here is a sample rule that allows only local connections by zodbuser to the zodb database:

local  zodb  zodbuser  md5

PostgreSQL re-reads pg_hba.conf when you ask it to reload its configuration file:

/etc/init.d/postgresql reload
MySQL

Use the mysql utility to create the database and user account. Note that the -p option is usually required. You must use the -p option if the account you are accessing requires a password, but you should not use the -p option if the account you are accessing does not require a password. If you do not provide the -p option, yet the account requires a password, the mysql utility will not prompt for a password and will fail to authenticate.

Most users can start the mysql utility with the following shell command, using any login account:

$ mysql -u root -p

Here are some sample SQL statements for creating the user and database:

CREATE USER 'zodbuser'@'localhost' IDENTIFIED BY 'mypassword';
CREATE DATABASE zodb;
GRANT ALL ON zodb.* TO 'zodbuser'@'localhost';
FLUSH PRIVILEGES;

See the RelStorage option blob-chunk-size for information on configuring the server’s max_allowed_packet value for optimal performance.

Oracle

Initial setup will require SYS privileges. Using Oracle 10g XE, you can start a SYS session with the following shell commands:

$ su - oracle
$ sqlplus / as sysdba

You need to create a database user and grant execute privileges on the DBMS_LOCK package to that user. Here are some sample SQL statements for creating the database user and granting the required permissions:

CREATE USER zodb IDENTIFIED BY mypassword;
GRANT CONNECT, RESOURCE, CREATE TABLE, CREATE SEQUENCE TO zodb;
GRANT EXECUTE ON DBMS_LOCK TO zodb;

Configuring Your Application

Once RelStorage is installed together with the appropriate database adapter, and you have created a user and database to use with RelStorage, it’s time to configure the application to use RelStorage. This means telling RelStorage about the database to use, how to connect to it, and specifying any additional options.

Note

This is just a quick-start guide. There is a full list of supported options in a separate document, as well as a list of database-specific options.

Configuring Plone

Plone uses the plone.recipe.zope2instance Buildout recipe to generate zope.conf, so the easiest way to configure RelStorage in a Plone site is to set the rel-storage parameter in buildout.cfg. The rel-storage parameter contains options separated by newlines, with these values:

  • type: any database type supported (postgresql, mysql, or oracle)
  • RelStorage options like cache-servers and poll-interval
  • Adapter-specific options

An example:

rel-storage =
    type mysql
    db plone
    user plone
    passwd PASSWORD

You’ll also need to make sure that the correct version of RelStorage and its database drivers are installed (typically by adding them to the [eggs] section in the buildout.cfg).

Note

For a detailed walk through of installing historic versions of RelStorage in historic versions of Plone 3, see this blog post. It’s important to note that this information is not directly applicable to newer versions (Plone 4 does not use fake eggs, and the version of ZODB used by Plone 4, 3.9.5 and above, does not need patched). The comments section may contain further hints for newer versions.

Configuring using ZConfig

The most common way to configure RelStorage will involve writing a configuration file using the ZConfig syntax. You will write a <relstorage> element containing the general RelStorage options, and containing one database-specific element (<postgresql>, <mysql> or <oracle>). (Where in the file the <relstorage> element goes is specific to the framework or application you’re using and will be covered next.)

In all cases, you’ll need to add %import relstorage to the top-level of the file to let ZConfig know about the RelStorage specific elements.

Examples for PostgreSQL:

<relstorage>
  <postgresql>
    # The dsn is optional, as are each of the parameters in the dsn.
    dsn dbname='zodb' user='username' host='localhost' password='pass'
  </postgresql>
</relstorage>

MySQL:

<relstorage>
  <mysql>
    # Most of the options provided by MySQLdb are available.
    # See component.xml.
    db zodb
  </mysql>
</relstorage>

And Oracle (10g XE in this example):

 <relstorage>
   <oracle>
     user username
     password pass
     dsn XE
   </oracle>
</relstorage>

To add ZODB blob support, provide a blob-dir option that specifies where to store the blobs. For example:

<relstorage>
  blob-dir ./blobs
  <postgresql>
    dsn dbname='zodb' user='username' host='localhost' password='pass'
  </postgresql>
</relstorage>
Configuring Zope 2

To integrate RelStorage in Zope 2, specify a RelStorage backend in etc/zope.conf. Remove the main mount point replace it with the <relstorage> element. For example (using PostgreSQL):

%import relstorage
<zodb_db main>
  mount-point /
  <relstorage>
    <postgresql>
      dsn dbname='zodb' user='username' host='localhost' password='pass'
    </postgresql>
  </relstorage>
</zodb_db>
Configuring repoze.zodbconn

To use RelStorage with repoze.zodbconn, a package that makes ZODB available to WSGI applications, create a configuration file with contents similar to the following:

%import relstorage
<zodb main>
  <relstorage>
    <mysql>
      db zodb
    </mysql>
  </relstorage>
  cache-size 100000
</zodb>

repoze.zodbconn expects a ZODB URI. Use a URI of the form zconfig://path/to/configuration#main where path/to/configuration is the complete path to the configuration file, and main is the name given to the <zodb> element.

Manually Opening a Database

You can also manually open a ZODB database in Python code. Once you have a configuration file as outlined above, you can use the ZODB.config.databaseFromURL API to get a ZODB database:

path = "path/to/configuration"
import ZODB.config
db = ZODB.config.databaseFromURL(path)
conn = db.open()
...

RelStorage Options

Specify these options in zope.conf, as parameters for the relstorage.storage.RelStorage constructor, or as attributes of a relstorage.options.Options instance. In the latter two cases, use underscores instead of dashes in the option names.

General Settings
name

The name of the storage.

Defaults to a descriptive name that includes the adapter connection parameters, except the database password.

read-only

If true, only reads may be executed against the storage.

The default is false.

keep-history

If this option is set to true (the default), the adapter will create and use a history-preserving database schema (like FileStorage). A history-preserving schema supports ZODB-level undo, but also grows more quickly and requires extensive packing on a regular basis.

If this option is set to false, the adapter will create and use a history-free database schema. Undo will not be supported, but the database will not grow as quickly. The database will still require regular garbage collection (which is accessible through the database pack mechanism.)

Warning

This option must not change once the database schema has been installed, because the schemas for history-preserving and history-free storage are different. RelStorage will refuse to initialize if it detects this value has been altered.

If you want to convert between a history-preserving and a history-free database, use the zodbconvert utility to copy to a new database.

commit-lock-timeout

During commit, RelStorage acquires a database-wide lock. This option specifies how long to wait for the lock before failing the attempt to commit. The default is 30 seconds.

The MySQL and Oracle adapters support this option. The PostgreSQL adapter supports this when the PostgreSQL server is at least version 9.3; otherwise it is ignored.

Changed in version 2.0.0b1: Add support for lock timeouts on PostgreSQL 9.3 and above. Previously no PostgreSQL version supported timeouts.

commit-lock-id

During commit, RelStorage acquires a database-wide lock. This option specifies the lock ID.

This option currently applies only to the Oracle adapter.

create-schema

Normally, RelStorage will create or update the database schema on start-up.

Set this option to false if you need to connect to a RelStorage database without automatic creation or updates.

Blobs
blob-dir

If supplied, the storage will provide ZODB blob support; this option specifies the name of the directory to hold blob data. The directory will be created if it does not exist.

If no value (or an empty value) is provided, then no blob support will be provided.

shared-blob-dir

If true (the default), the blob directory is assumed to be shared among all clients using NFS or similar; blob data will be stored only on the filesystem and not in the database. If false, blob data is stored in the relational database and the blob directory holds a cache of blobs.

When this option is false, the blob directory should not be shared among clients.

blob-cache-size

Maximum size of the blob cache, in bytes. If empty (the default), the cache size isn’t checked and the blob directory will grow without bounds. This should be either empty or significantly larger than the largest blob you store. At least 1 gigabyte is recommended for typical databases. More is recommended if you store large files such as videos, CD/DVD images, or virtual machine images.

This option allows suffixes such as “mb” or “gb”. This option is ignored if shared-blob-dir is true.

blob-cache-size-check

Blob cache check size as percent of blob-cache-size: “10” means “10%”. The blob cache size will be checked when this many bytes have been loaded into the cache. Defaults to 10% of the blob cache size.

This option is ignored if shared-blob-dir is true.

blob-chunk-size

When ZODB blobs are stored in MySQL, RelStorage breaks them into chunks to minimize the impact on RAM. This option specifies the chunk size for new blobs. If RAM is available and the network connection to the database server is fast, a larger value can be more efficient because it will result in fewer roundtrips to the server.

Caution

On MySQL, this value cannot exceed the server’s max_allowed_packet setting. If blob chunks are larger than that, it won’t be possible to upload them. If blob chunks are uploaded and then that value is later reduced, it won’t be possible to download blobs that exceed that value.

The driver may also influence this. For example, umysqldb needs a 1.3MB max_allowed_packet to send multiple 1MB chunks.

On PostgreSQL and Oracle, this value is used as the memory buffer size for blob upload and download operations.

The default is 1048576 (1 megabyte). This option allows suffixes such as “mb” or “gb”.

This option has no effect if shared-blob-dir is true (because blobs are not stored on the server).

Replication
replica-conf

If this option is provided, it specifies a text file that contains a list of database replicas the adapter can choose from. For MySQL and PostgreSQL, put in the replica file a list of host:port or host values, one per line. For Oracle, put in a list of DSN values. Blank lines and lines starting with # are ignored.

The adapter prefers the first replica specified in the file. If the first is not available, the adapter automatically tries the rest of the replicas, in order. If the file changes, the adapter will drop existing SQL database connections and make new connections when ZODB starts a new transaction.

ro-replica-conf

Like the replica-conf option, but the referenced text file provides a list of database replicas to use only for read-only load connections. This allows RelStorage to load objects from read-only database replicas, while using read-write replicas for all other database interactions.

If this option is not provided, load connections will fall back to the replica pool specified by replica-conf. If ro-replica-conf is provided but replica-conf is not, RelStorage will use replicas for load connections but not for other database interactions.

Note that if read-only replicas are asynchronous, the next interaction after a write operation might not be up to date. When that happens, RelStorage will log a “backward time travel” warning and clear the ZODB cache to prevent consistency errors. This will likely result in temporarily reduced performance as the ZODB cache is repopulated.

New in version 1.6.0.

replica-timeout

If this option has a nonzero value, when the adapter selects a replica other than the primary replica, the adapter will try to revert to the primary replica after the specified timeout (in seconds).

The default is 600, meaning 10 minutes.

revert-when-stale

Specifies what to do when a database connection is stale. This is especially applicable to asynchronously replicated databases: RelStorage could switch to a replica that is not yet up to date.

When revert-when-stale is false (the default) and the database connection is stale, RelStorage will raise a ReadConflictError if the application tries to read or write anything. The application should react to the ReadConflictError by retrying the transaction after a delay (possibly multiple times.) Once the database catches up, a subsequent transaction will see the update and the ReadConflictError will not occur again.

When revert-when-stale is true and the database connection is stale, RelStorage will log a warning, clear the affected ZODB connection cache (to prevent consistency errors), and let the application continue with database state from an earlier transaction. This behavior is intended to be useful for highly available, read-only ZODB clients. Enabling this option on ZODB clients that read and write the database is likely to cause confusion for users whose changes seem to be temporarily reverted.

New in version 1.6.0.

GC and Packing
pack-gc

If pack-gc is false, pack operations do not perform garbage collection. Garbage collection is enabled by default.

If garbage collection is disabled, pack operations keep at least one revision of every object. With garbage collection disabled, the pack code does not need to follow object references, making packing conceivably much faster. However, some of that benefit may be lost due to an ever increasing number of unused objects.

Disabling garbage collection is also a hack that ensures inter-database references never break.

pack-prepack-only

If pack-prepack-only is true, pack operations perform a full analysis of what to pack, but no data is actually removed. After a pre-pack, the pack_object, pack_state, and pack_state_tid tables are filled with the list of object states and objects that would have been removed.

If pack-gc is true, the object_ref table will also be fully populated. The object_ref table can be queried to discover references between stored objects.

pack-skip-prepack

If pack-skip-prepack is true, the pre-pack phase is skipped and it is assumed the pack_object, pack_state and pack_state_tid tables have been filled already. Thus packing will only affect records already targeted for packing by a previous pre-pack analysis run.

Use this option together with pack-prepack-only to split packing into distinct phases, where each phase can be run during different timeslots, or where a pre-pack analysis is run on a copy of the database to alleviate a production database load.

pack-batch-timeout

Packing occurs in batches of transactions; this specifies the timeout in seconds for each batch. Note that some database configurations have unpredictable I/O performance and might stall much longer than the timeout.

The default timeout is 1.0 seconds.

pack-commit-busy-delay

Before each pack batch, the commit lock is requested. If the lock is already held by for a regular commit, packing is paused for a short while. This option specifies how long the pack process should be paused before attempting to get the commit lock again.

The default delay is 5.0 seconds.

Database Caching

In addition to the ZODB Connection object caches, RelStorage uses pickle caches to reduce the number of queries to the database server. (This is similar to ZEO.) Caches can be both local to a process (within its memory) and remote (and shared between many processes). These options affect all caching operations.

cache-prefix

The prefix for all keys in the cache; also used as part of persistent cache names. All clients using a database should use the same cache-prefix. Defaults to the database name. (For example, in PostgreSQL, the database name is determined by executing SELECT current_database().) Set this if you have multiple databases with the same name.

Changed in version 1.6.0b1: Start defaulting to the database name.

Local Caching

RelStorage caches pickled objects in memory, similar to a ZEO cache. The “local” cache is shared between all threads in a process. An adequately sized local cache is important for the highest possible performance. Using a suitably-sized local cache, especially with persistent data files, can make a substantial performance difference, even if the write volume is relatively high.

For help understanding and tuning the cache behaviour, see Client Cache Tracing.

cache-local-mb

This option configures the approximate maximum amount of memory the cache should consume, in megabytes. It defaults to 10.

Set to 0 to disable the in-memory cache. (This is not recommended.)

cache-local-object-max

This option configures the maximum size of an object’s pickle (in bytes) that can qualify for the “local” cache. The size is measured after compression. Larger objects can still qualify for memcache.

The default is 16384 (1 << 14) bytes.

Changed in version 2.0b2: Measure the size after compression instead of before.

cache-local-compression

This option configures compression within the “local” cache. This option names a Python module that provides two functions, compress() and decompress(). Supported values include zlib, bz2, and none (no compression).

The default is zlib.

If you use the compressing storage wrapper zc.zlibstorage, this option automatically does nothing. With other compressing storage wrappers this should be set to none.

New in version 1.6.

cache-delta-size-limit

This is an advanced option. RelStorage uses a system of checkpoints to improve the cache hit rate. This option configures how many new objects should be stored before creating a new checkpoint.

For write heavy workloads, increasing this can be very beneficial. The cost is about 300K of memory for every 10000 on CPython.

The default is 20000 on CPython, 10000 on PyPy.

Changed in version 2.0b7: Double the default size from 10000 to 20000 on CPython. The use of LLBTree for the internal data structure means we use much less memory than we did before.

Persistent Local Caching

Like ZEO, RelStorage can store its local cache to disk for a quicker application warmup period.

New in version 2.0b2: This is a new, experimental feature. While there should be no problems enabling it, the exact details of its function are subject to change in the future based on feedback and experience.

cache-local-dir

The path to a directory where the local cache will be saved when the database is closed. On startup, RelStorage will look in this directory for cache files to load into memory.

This option can dramatically reduce the amount of time it takes for your application to warm up after a restart, especially if there were relatively few writes in the meantime. Some synthetic benchmarks show an 8-10x improvement after a restart.

This is similar to the ZEO persistent cache, but adds no overhead to normal transactions.

This directory will be populated with a number of files (up to cache-local-dir-count files), written each time a RelStorage instance is closed. If multiple RelStorage processes are working against the same database (for example, the workers of gunicorn), then they will each write and read files in this directory. On startup, the files will be combined to get the “warmest” possible cache.

Each file could potentially be up to the size of cache-local-dir-write-max-size (but see cache-local-dir-compress) The time taken to load the cache file (which only occurs when RelStorage is first opened) and the time taken to save the cache file (which only occurs when the database is closed) are proportional to the total size of the cache; thus, a cache that is too large (holding many unused entries) will slow down startup/shutdown for no benefit. However, the timing is probably not a practical concern compared to the disk usage; on one system, a 300MB uncompressed cache file can be saved in 3-4 seconds and read in 2-3 seconds.

This directory can be shared among storages connected to different databases, so long as they all have a distinct cache-prefix.

If this directory does not exist, we will attempt to create it on startup. It may be possible to share this directory across machines, but that has not been tested.

Tip

If the database (ZODB.DB object) is not properly closed, then the cache files will not be written.

cache-local-dir-count

How many files that cache-local-dir will be allowed to contain before files start getting reused. Set this equal to the number of workers that will be sharing the directory.

The default is 20.

cache-local-dir-compress

Whether to compress the persistent cache files on disk. The default is false because individual entries are usually already compressed and the tested workloads do not show a space benefit from the compression; in addition, compression can slow the reading and writing process by 2 to 3 times or more (and hence slow down opening the storage).

New in version 2.0b5.

cache-local-dir-read-count

The maximum number of files to read to populate the cache on startup.

By default, RelStorage will read all the appropriate files (so up to cache-local-dir-count files), from newest to oldest, collecting the distinct entries out of them, until the cache is fully populated (cache-local-mb) or there are no more files. This ensures that after startup, all workers have the most fully populated cache. This strategy works well if the workers have a good distribution of work (relatively few overlapping items) and the cache size is relatively small; after startup they will all be equally warm without spending too much startup time.

However, if the workers all do nearly the same work (so most items in the cache files are the same) and the cache sizes are very large, then the benefits of reading each subsequent file diminish (because it has very few if any new entries to add, and reading them all takes a lot of time). In that case, set this value to 1 to only read the first (“best”) cache file.

For situations in between, choose a number in between.

New in version 2.0b5.

cache-local-dir-write-max-size

The approximate maximum size of each individual cache file on disk. When not specified (the default), the maximum file size will be the same as cache-local-mb.

This is an approximate number because there is some overhead associated with the storage format that varies based on the number of entries in the cache.

RelStorage will write to disk, from most important to least important, the entries in the cache until all the entries are written or this limit is reached. If you use a size smaller than cache-local-mb, however, you may miss important entries that are only used at application startup.

New in version 2.0b7.

Remote Caching

RelStorage can use Memcached servers as a secondary, semi-persistent database cache. They are most useful if the ration of writes to reads is relatively low (because they add overhead to each write operation). They can also be useful if the database server is behind a high-latency connection or otherwise responds slowly.

cache-servers

Specifies a list of memcached servers. Using memcached with RelStorage improves the speed of frequent object accesses while slightly reducing the speed of other operations.

Provide a list of host:port pairs, separated by whitespace. “127.0.0.1:11211” is a common setting. Some memcached modules, such as pylibmc, allow you to specify a path to a Unix socket instead of a host:port pair.

The default is to disable memcached integration.

New in version 1.1rc1.

cache-module-name

Specifies which Python memcache module to use. The default is “relstorage.pylibmc_wrapper”, which requires pylibmc. An alternative module is “memcache”, a pure Python module. If you use the memcache module, use at least version 1.47.

This option has no effect unless cache-servers is set.

Database-Specific Adapter Options

Each adapter supports one common option:

driver

The name of the driver to use for this database. Defaults to “auto”, meaning to choose the best driver from among the possibilities. Most of the time this option should be omitted and RelStorage will choose the best option.

This is handy to set when an environment might have multiple drivers installed, some of which might be non-optimal. For example, on PyPy, PyMySQL is generally faster than MySQLdb, but both can be installed (in the form of mysqlclient for the latter).

This is also convenient when using zodbshootout to compare driver speeds.

If you specify a driver that is not installed, an error will be raised.

Each adapter will document the available driver names.

New in version 2.0b2.

PostgreSQL Adapter Options

RelStorage 2.1 performs best with PostgreSQL 9.5 or above.

The PostgreSQL adapter accepts:

driver

The possible options are:

psycopg2
A C-based driver that requires the PostgreSQL development libraries. Optimal on CPython, but not compatible with gevent.
psycopg2cffi
A C-based driver that requires the PostgreSQL development libraries. Optimal on PyPy and almost indistinguishable from psycopg2 on CPython. Not compatible with gevent.
pg8000

A pure-Python driver suitable for use with gevent. Works on all supported platforms.

Note

pg8000 requires PostgreSQL 9.4 or above for BLOB support.

dsn

Specifies the data source name for connecting to PostgreSQL. A PostgreSQL DSN is a list of parameters separated with whitespace. A typical DSN looks like:

dbname='zodb' user='username' host='localhost' password='pass'

If dsn is omitted, the adapter will connect to a local database with no password. Both the user and database name will match the name of the owner of the current process.

MySQL Adapter Options

The MySQL adapter accepts most parameters supported by the mysqlclient library (the maintained version of MySQL-python), including:

driver

The possible options are:

MySQLdb
A C-based driver that requires the MySQL client development libraries.. This is best provided by the PyPI distribution mysqlclient. (It can also be provided by the legacy MySQL-python distribution, but only on CPython 2; this distribution is no longer tested.) These drivers are not compatible with gevent.
PyMySQL
A pure-Python driver provided by the distribution of the same name. It works with CPython 2 and 3 and PyPy (where it is preferred). It is compatible with gevent.
umysqldb

A C-based driver that builds on PyMySQL. It is compatible with gevent, but only works on CPython 2. It does not require the MySQL client development libraries but uses a project called umysql to communicate with the server using only sockets.

Note

Make sure the server has a max_allowed_packet setting no larger than 16MB. Also make sure that RelStorage’s blob-chunk-size is less than 16MB as well.

Note

This fork of umysqldb is recommended. The full-buffer branch of this ultramysql fork is also recommended if you encounter strange MySQL packet errors.

MySQL Connector/Python

This is the official client provided by Oracle. It generally cannot be installed from PyPI or by pip if you want the optional C extension. It has an optional C extension that must be built manually. The C extension (which requires the MySQL client development libraries) performs about as well as mysqlclient, but the pure-python version somewhat slower than PyMySQL. However, it supports more advanced options for failover and high availability.

When using this name, RelStorage will use the C extension if available, otherwise it will use the Python version.

Binary packages are distributed by Oracle for many platforms and include the necessary native libraries and C extension.

New in version 2.1a1.

C MySQL Connector/Python
The same as above, but RelStorage will only use the C extension. This is not compatible with gevent.
Py MySQL Connector/Python
Like the above, but RelStorage will use the pure-Python version only. This is compatible with gevent.
host
string, host to connect
user
string, user to connect as
passwd
string, password to use
db
string, database to use
port
integer, TCP/IP port to connect to
unix_socket
string, location of unix_socket (UNIX-ish only)
conv
mapping, maps MySQL FIELD_TYPE.* to Python functions which convert a string to the appropriate Python type
connect_timeout
number of seconds to wait before the connection attempt fails.
compress
if set, gzip compression is enabled
named_pipe
if set, connect to server via named pipe (Windows only)
init_command
command which is run once the connection is created
read_default_file
see the MySQL documentation for mysql_options()
read_default_group
see the MySQL documentation for mysql_options()
client_flag
client flags from MySQLdb.constants.CLIENT
load_infile
int, non-zero enables LOAD LOCAL INFILE, zero disables
Oracle Adapter Options

The Oracle adapter accepts:

driver

Other than “auto” the only supported value is “cx_Oracle”.

Caution

If you use cx_Oracle 5.2.1 or 5.3 (in general, any version >= 5.2 but < 6.0) you must be sure that it is compiled against a version of the Oracle client that is compatible with the Oracle database to which you will be connecting.

Specifically, if you will be connecting to Oracle database 11 or earlier, you must not compile against client version 12. (Compiling against an older client and connecting to a newer database is fine.) If you use a client that is too new, RelStorage will fail to commit with the error DatabaseError: ORA-03115: unsupported network datatype or representation.

For more details, see issue #172.

user
The Oracle account name
password
The Oracle account password
dsn
The Oracle data source name. The Oracle client library will normally expect to find the DSN in /etc/oratab.

zodbconvert

RelStorage comes with a script named zodbconvert that converts databases between formats. Use it to convert a FileStorage instance to RelStorage and back, or to convert between different kinds of RelStorage instances, or to convert other kinds of storages that support the storage iterator protocol.

When converting between two history-preserving databases (note that FileStorage uses a history-preserving format), zodbconvert preserves all objects and transactions, meaning you can still use the ZODB undo feature after the conversion, and you can convert back using the same process in reverse. When converting from a history-free database to either a history-free database or a history-preserving database, zodbconvert retains all data, but the converted transactions will not be undoable. When converting from a history-preserving storage to a history-free storage, zodbconvert drops all historical information during the conversion.

How to use zodbconvert

Create a ZConfig style configuration file that specifies two storages, one named “source”, the other “destination”. The configuration file format is very much like zope.conf. Then run zodbconvert, providing the name of the configuration file as a parameter.

The utility does not modify the source storage. Before copying the data, the utility verifies the destination storage is completely empty. If the destination storage is not empty, the utility aborts without making any changes to the destination, unless the --incremental option is used (in which case the destination must be a previously copied version of the source).

Here is a sample zodbconvert configuration file:

<filestorage source>
  path /zope/var/Data.fs
</filestorage>

<relstorage destination>
  <mysql>
    db zodb
  </mysql>
</relstorage>

This configuration file specifies that the utility should copy all of the transactions from Data.fs to a MySQL database called “zodb”. If you want to reverse the conversion, exchange the names “source” and “destination”. All storage types and storage options available in zope.conf are also available in this configuration file.

Options for zodbconvert
usage: zodbconvert [-h] [--dry-run] [--clear] [--incremental] config_file

ZODB storage conversion utility.

positional arguments:
  config_file

optional arguments:
  -h, --help     show this help message and exit
  --dry-run      Attempt to open both storages, then explain what would be
                 done.
  --clear        Clear the contents of the destination storage before copying.
                 Only works if the destination is a RelStorage. WARNING: use
                 this only if you are certain the destination has no useful
                 data.
  --incremental  Assume the destination contains a partial copy of the source
                 and resume copying from the last transaction. WARNING: no
                 effort is made to verify that the destination holds the same
                 transaction data before this point! Use at your own risk.

zodbpack

RelStorage also comes with a script named zodbpack that packs any ZODB storage that allows concurrent connections (including RelStorage and ZEO, but not including FileStorage). Use zodbpack in cron scripts. Pass the script the name of a configuration file that lists the storages to pack, in ZConfig format. An example configuration file:

<relstorage>
  pack-gc true
  <mysql>
    db zodb
  </mysql>
</relstorage>
Options for zodbpack
--days or -d
Specifies how many days of historical data to keep. Defaults to 0, meaning no history is kept. This is meaningful even for history-free storages, since unreferenced objects are not removed from the database until the specified number of days have passed.
--prepack
Instructs the storage to only run the pre-pack phase of the pack but not actually delete anything. This is equivalent to specifying pack-prepack-only true in the storage options.
--use-prepack-state
Instructs the storage to only run the deletion (packing) phase, skipping the pre-pack analysis phase. This is equivalent to specifying pack-skip-prepack true in the storage options.
usage: zodbpack [-h] [-d DAYS] [--prepack] [--use-prepack-state] config_file

ZODB storage packing utility.

positional arguments:
  config_file

optional arguments:
  -h, --help            show this help message and exit
  -d DAYS, --days DAYS  Days of history to keep (default 0)
  --prepack             Perform only the pre-pack preparation stage of a pack.
                        (Only works with some storage types)
  --use-prepack-state   Skip the preparation stage and go straight to packing.
                        Requires that a pre-pack has been run, or that packing
                        was aborted before it was completed.

Use With zodburi

This package also enable the use of the postgres://, mysql:// and oracle:// URI schemes for zodburi. For more information about zodburi, please refer to its documentation. This section contains information specific to the these schemes.

URI schemes

The postgres:// , mysql:// and oracle:// URI schemes can be passed as zodbconn.uri to create a RelStorage PostgresSQL, MySQL or Oracle database factory. The uri should contain the user, the password, the host, the port and the db name e.g.:

postgres://someuser:somepass@somehost:5432/somedb?connection_cache_size=20000
mysql://someuser:somepass@somehost:5432/somedb?connection_cache_size=20000

Because oracle connection information are most often given as dsn, the oracle uri should not contain the same information as the other, but only the dsn

oracle://?dsn="HERE GOES THE DSN"
Query String Arguments

The URI scheme also accepts query string arguments. The query string arguments honored by this scheme are as follows.

Usual zodburi arguments

Arguments that are usual with zodburi are also available here (see http://docs.pylonsproject.org/projects/zodburi/en/latest/) :

demostorage
boolean (if true, wrap RelStorage in a DemoStorage)
database_name
string
connection_cache_size
integer (default 10000)
connection_pool_size
integer (default 7)
Postgres specific
connection_timeout
integer
ssl_mode
string
Mysql specific
connection_timeout
integer
client_flag
integer
load_infile
integer
compress
boolean
named_pipe
boolean
unix_socket
string
init_command
string
read_default_file
string
read_default_group
string
Oracle specific
twophase
integer
user
string
password
string
dsn
string
Example

An example that combines a path with a query string:

postgres://someuser:somepass@somehost:5432/somedb?connection_cache_size=20000

Client Cache Tracing

Note

This document contains information based on the original. View that document for more information.

An important question for RelStorage users is: how large should the RelStorage local cache be? RelStorage 2 (as of RelStorage 2.0b3) has a new feature that lets you collect a trace of cache activity and tools to analyze this trace, enabling you to make an informed decision about the cache size.

Don’t confuse the RelStorage local cache with the ZODB object cache. The RelStorage local cache is only used when an object is not in the ZODB object cache; the RelStorage local cache avoids roundtrips to the database server.

Enabling Cache Tracing

To enable cache tracing, you must use a persistent cache (specify a cache-local-dir name), and set the environment variable ZEO_CACHE_TRACE to a non-empty value. The path to the trace file is derived from the path to the persistent cache directory and the process’s PID. If the file doesn’t exist, RelStorage will try to create it. If the file does exist, it’s opened for appending (previous trace information is not overwritten). If there are problems with the file, a warning message is logged. To start or stop tracing, the RelStorage instance must be closed and re-opened (usually this means stopping and restarting the entire process).

Tip

If you are using only one client process per cache-local-dir, set ZEO_CACHE_TRACE to single. This will allow process restarts to be recorded in the trace file. Otherwise (if you are using multiple processes sharing the same cache directory or use a value different than single, each process will use its own trace file.) Note that using a value of single with multiple processes is undefined.

The trace file can grow pretty quickly; on a moderately loaded server, we observed it growing by 7 MB per hour. The file consists of binary records, each 34 bytes long if 8-byte oids are in use. No sensitive data are logged: data record sizes (but not data records), and binary object and transaction ids are logged, but no object pickles, object types or names, user names, transaction comments, access paths, or machine information (such as machine name or IP address) is logged.

Analyzing a Cache Trace

The cache_stats.py command-line tool (python -m ZEO.scripts.cache_stats) is the first-line tool to analyze a cache trace. Its default output consists of two parts: a one-line summary of essential statistics for each segment of 15 minutes, interspersed with lines indicating client restarts, followed by a more detailed summary of overall statistics.

The most important statistic is the “hit rate”, a percentage indicating how many requests to load an object could be satisfied from the cache. Hit rates around 70% are good. 90% is excellent. If you see a hit rate under 60% you can probably improve the cache performance (and hence your application server’s performance) by increasing the local cache size. This is normally configured using key cache-local-mb in the relstorage section of your configuration file. The default cache size is 10 MB, which is small.

Note

The trace file records all cache activity. If you are using Memcached servers (via the cache-servers setting), hits they produce will also be recorded. To isolate just the effects of the local cache, disable cache-servers. On the other hand, comparing trace files from instances with Memcached enabled and it disabled can help show if the additional overhead produces enough extra hits to be worthwhile.

The cache_stats.py tool (python -m ZEO.scripts.cache_stats) shows its command line syntax when invoked without arguments. The tracefile argument can be a gzipped file if it has a .gz extension. It will be read from stdin (assuming uncompressed data) if the tracefile argument is ‘-‘.

usage: cache_stats.py [--verbose | --quiet] [--sizes] [--no-stats]
                      [--load-histogram] [--check] [--interval INTERVAL]
                      tracefile

Trace file statistics analyzer

positional arguments:
  tracefile             The trace to read; may be gzipped

optional arguments:
  --verbose, -v         Be verbose; print each record
  --quiet, -q           Reduce output; don't print summaries
  --sizes, -s           print histogram of object sizes
  --no-stats, -S        don't print statistics
  --load-histogram, -h  print histogram of object load frequencies
  --check, -X           enable heuristic checking for misaligned records: oids
                        > 2**32 will be rejected; this requires the tracefile
                        to be seekable
  --interval INTERVAL, -i INTERVAL
                        summarizing interval in minutes (default 15; max 60)
Simulating Different Cache Sizes

Based on a cache trace file, you can make a prediction of how well the cache might do with a different cache size. The cache_simul.py (python -m ZEO.scripts.cache_simul) tool runs a simulation of the ZEO client cache implementation based upon the events read from a trace file. A new simulation is started each time the trace file records a client restart event; if a trace file contains more than one restart event, a separate line is printed for each simulation, and a line with overall statistics is added at the end.

usage: cache_simul.py [-h] [--size CACHELIMIT] [--interval INTERVAL]
                      [--rearrange REARRANGE]
                      tracefile

Cache simulation. Note: - The simulation isn't perfect. - The simulation will
be far off if the trace file was created starting with a non-empty cache

positional arguments:
  tracefile             The trace to read; may be gzipped

optional arguments:
  -h, --help            show this help message and exit
  --size CACHELIMIT, -s CACHELIMIT
                        cache size in MB (default 20MB)
  --interval INTERVAL, -i INTERVAL
                        summarizing interval in minutes (default 15; max 60)
  --rearrange REARRANGE, -r REARRANGE
                        rearrange factor

Example, assuming the trace file is in /tmp/cachetrace.log:

$ python -m ZEO.scripts.cache_simul.py -s 4 /tmp/cachetrace.log
CircularCacheSimulation, cache size 4,194,304 bytes
  START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
Jul 22 22:22     39:09  3218856  1429329  24046  41517   44.4%   40776    99.8

This shows that with a 4 MB cache size, the cache hit rate is 44.4%, the percentage 1429329 (number of cache hits) is of 3218856 (number of load requests). The cache simulated 40776 evictions, to make room for new object states. At the end, 99.8% of the bytes reserved for the cache file were in use to hold object state (the remaining 0.2% consists of “holes”, bytes freed by object eviction and not yet reused to hold another object’s state).

Let’s try this again with an 8 MB cache:

$ python -m ZEO.scripts.cache_simul.py -s 8 /tmp/cachetrace.log
CircularCacheSimulation, cache size 8,388,608 bytes
  START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
Jul 22 22:22     39:09  3218856  2182722  31315  41517   67.8%   40016   100.0

That’s a huge improvement in hit rate, which isn’t surprising since these are very small cache sizes. The default cache size is 20 MB, which is still on the small side:

$ python -m ZEO.scripts.cache_simul.py /tmp/cachetrace.log
CircularCacheSimulation, cache size 20,971,520 bytes
  START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
Jul 22 22:22     39:09  3218856  2982589  37922  41517   92.7%   37761    99.9

Again a very nice improvement in hit rate, and there’s not a lot of room left for improvement. Let’s try 100 MB:

$ python -m ZEO.scripts.cache_simul.py -s 100 /tmp/cachetrace.log
CircularCacheSimulation, cache size 104,857,600 bytes
  START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
Jul 22 22:22     39:09  3218856  3218741  39572  41517  100.0%   22778   100.0

It’s very unusual to see a hit rate so high. The application here frequently modified a very large BTree, so given enough cache space to hold the entire BTree it rarely needed to ask the ZEO server for data: this application reused the same objects over and over.

More typical is that a substantial number of objects will be referenced only once. Whenever an object turns out to be loaded only once, it’s a pure loss for the cache: the first (and only) load is a cache miss; storing the object evicts other objects, possibly causing more cache misses; and the object is never loaded again. If, for example, a third of the objects are loaded only once, it’s quite possible for the theoretical maximum hit rate to be 67%, no matter how large the cache.

The cache_simul.py script also contains code to simulate different cache strategies. Since none of these are implemented, and only the default cache strategy’s code has been updated to be aware of MVCC, these are not further documented here.

Simulation Limitations

The cache simulation is an approximation, and actual hit rate may be higher or lower than the simulated result. These are some factors that inhibit exact simulation:

  • Important The simulation tools were designed for ZEO and may not be accurate for RelStorage due to the different mechanisms of invalidation employed. Still, they may give a useful idea.

    Because of the invalidation differences, some trace files might cause the script to produce an AssertionError. These can typically be ignored and commented out in the script itself to proceed.

  • Each time a load of an object O in the trace file was a cache hit, but the simulated cache has evicted O, the simulated cache has no way to repair its knowledge about O. This is more frequent when simulating caches smaller than the cache used to produce the trace file. When a real cache suffers a cache miss, it asks the database server for the needed information about O, and saves O in the local cache. The simulated cache doesn’t have a database server to ask, and O continues to be absent in the simulated cache. Further requests for O will continue to be simulated cache misses, although in a real cache they’ll likely be cache hits. On the other hand, the simulated cache doesn’t need to evict any objects to make room for O, so it may enjoy further cache hits on objects a real cache would have evicted.

FAQs

Q: How can I help improve RelStorage?

A: The best way to help is to test and to provide database-specific expertise. Ask questions about RelStorage on the zodb-dev mailing list.

Q: Can I perform SQL queries on the data in the database?

A: No. Like FileStorage and DirectoryStorage, RelStorage stores the data as pickles, making it hard for anything but ZODB to interpret the data. An earlier project called Ape attempted to store data in a truly relational way, but it turned out that Ape worked too much against ZODB principles and therefore could not be made reliable enough for production use. RelStorage, on the other hand, is much closer to an ordinary ZODB storage, and is therefore more appropriate for production use.

Q: How does RelStorage performance compare with FileStorage?

A: According to benchmarks, RelStorage with PostgreSQL is often faster than FileStorage, especially under high concurrency. See Performance for more.

Q: Why should I choose RelStorage?

A: Because RelStorage is a fairly small layer that builds on world-class databases. These databases have proven reliability and scalability, along with numerous support options.

Q: Can RelStorage replace ZRS (Zope Replication Services)?

A: Yes, RelStorage inherits the replication capabilities of PostgreSQL, MySQL, and Oracle.

Q: How do I set up an environment to run the RelStorage tests?

Q: Why do I get DatabaseError: ORA-03115: unsupported network datatype or representation when using Oracle?

See the “driver” section of Oracle Adapter Options for more information.

Performance

This document captures various performance metrics, comparing RelStorage to other ZODB storages at given points in time. The numbers given here cannot be compared to each other outside their own test.

RelStorage 2.1a1 vs RelStorage 2.0

This section compares the performance improvements in RelStorage 2.1a1 with RelStorage 2.0. (Hot and warm results are omitted because they had essentially no change in these tests.)

All RelStorage schemas were history free and did not use memcache. ZODB was version 5.1.1, MySQL was version 5.7.17, and PostgreSQL was version 9.6.1. All databases and cache settings were at the default. The database drivers were mysqlclient-1.3.9 and psycopg2-2.6.2.

The test suite was zodbshootout 0.6 running on CPython 2.7.13 on a 2.5Ghz Intel Core i7 (MacBookPro11,5) under macOS 10.12.2.

First, the default test scenario for zodbshootout (1000 objects per transaction, each with a size of 128), running in two concurrent processes. We can see substantial gains for PostgreSQL on all tests (30-40%), while MySQL shows modest gains for adding and reading objects (10%).

_images/perf-rs21v20-c2.png

That test is useful for assessing raw throughput, but it is not very representative of most real-world uses. Studies of some production databases show that most transactions consist of 100 or fewer objects that are often around 256 bytes in size. (Naturally these numbers can vary quite a bit depending on application.) Also, many applications use in-process concurrency, whether threads or gevent.

This test reflects that, using 6 threads each working on 100 256-byte objects. We can again see substantial gains for PostgreSQL on adding and updating objects (20% and 60%, respectively), and modest gains for MySQL on both those tasks (10% and 7%, respectively).

_images/perf-rs21v20-c6-n100-s256.png
RelStorage 2.0

All RelStorage schemas were history free and did not use a memcache instance. ZEO was version 5.0.4 and used a FileStorage backend from ZODB 5.1.1 MySQL was version 5.5.53 and PostgreSQL was version 9.6.1. The database drivers were mysqlclient-1.3.9 and psycopg2 2.6.2. All RelStorage cache settings were at their default values (MySQL configuration had been somewhat modified from its defaults but PostgreSQL was at its defaults).

The test suite was zodbshootout 0.6 with a concurrency level of 2 (and all other options the default), running on CPython 2.7.12 on a 2.5Ghz Intel Core i7 (MacBookPro11,5) under macOS 10.12.2.

Transaction PostgreSQL MySQL ZEO
Add 1000 Objects 19534 23184 8152
Update 1000 Objects 17140 23790 7122
Read 1000 Warm Objects 14232 16649 4796
Read 1000 Cold Objects 15595 18070 4860
Read 1000 Hot Objects 86326 88422 82600
_images/perf-rs20-no-mem.png
Memcache

This is exactly the same configuration as above, but shows the effect of introducing a local memcache instance. The “warm” test performs better, but the other tests perform worse.

Transaction PostgreSQL MySQL ZEO
Add 1000 Objects 14789 17378 8220
Update 1000 Objects 13834 17711 7021
Read 1000 Warm Objects 27394 27978 4662
Read 1000 Cold Objects 7621 8415 4344
Read 1000 Hot Objects 86165 83438 76540
_images/perf-20.png
PyPy

This is the same as the second configuration, except we’re using PyPy 5.6.0 for everything (including running the ZEO server). The database drivers were PyMySQL-0.7.9 and psycopg2cffi-2.7.5.

Transaction PostgreSQL MySQL ZEO
Add 1000 Objects 15235 13849 4681
Update 1000 Objects 18522 25939 4853
Read 1000 Warm Object 15138 964041 23843
Read 1000 Cold Object 11974 9876 3360
Read 1000 Hot Objects 99241 81685 25552
_images/perf-rs20pypy.png

Under PyPy, the benchmark was run in shared threads mode (--threads shared) instead of separate processes, to allow PyPy’s JIT to warm up. Under CPython, shared threads for concurrency two generally reduces performance (with the exception of one test), but under PyPy it substantially improves performance.

For comparison, here’s the shared thread results for CPython:

Transaction PostgreSQL MySQL ZEO
Add 1000 Objects 13409 14081 4852
Update 1000 Objects 11541 14310 4496
Read 1000 Warm Objects 42646 43699 123079
Read 1000 Cold Objects 10253 7712 3497
Read 1000 Hot Objects 24973 26129 25049
RelStorage 1.6.1 vs RelStorage 2.0

These tests are run using the same configuration (databases, database drivers, CPython version and same machine) as above, to show any performance differences between RelStorage 2.0 and RelStorage 1.6.1 (running with ZODB 4.4.4 and ZEO 4.3.1).

_images/perf-rs20-rs16.png
RelStorage 1.4.0b1

All RelStorage schemas were history free and used a memcache instance. ZEO was version 3.9.3 and used a FileStorage backend.

The test suite was zodbshootout 0.2 with a concurrency level of 2, running on Python 2.6 on a 2.1Ghz Intel Core 2 Duo (T8100)

Transaction PostgreSQL MySQL ZEO
Add 1000 Objects 9177 12711 5044
Update 1000 Objects 10040 9431 4017
Read 1000 Warm Objects 20897 22310 1924
Read 1000 Cold Objects 6591 5853 1932
Read 1000 Hot Objects 36847 37950 36545
_images/perf-rs14.png

Migrating to a new version of RelStorage

Sometimes RelStorage needs a schema modification along with a software upgrade. Hopefully, this will not often be necessary.

Migrating to RelStorage version 1.5

All databases need a schema migration for this release. This release adds a state_size column to the object_state table, making it possible to query the size of objects without loading the state. The new column is intended for gathering statistics.

Please note that if you are using the history-free schema, you need to first migrate to RelStorage 1.4.2 by following the instructions in Migrating to RelStorage version 1.4.2.

PostgreSQL
  1. Migrate the object_state table:

    BEGIN;
    ALTER TABLE object_state ADD COLUMN state_size BIGINT;
    UPDATE object_state SET state_size = COALESCE(LENGTH(state), 0);
    ALTER TABLE object_state ALTER COLUMN state_size SET NOT NULL;
    COMMIT;
    
  2. The “plpgsql” language is now required and must be enabled in your database. Depending on your version of PostgreSQL, you may have to execute the following psql command as the database superuser:

    CREATE LANGUAGE plpgsql;
    
  3. If you used a beta version of RelStorage 1.5.0, you need to migrate your blob_chunk table schema:

    CREATE OR REPLACE FUNCTION blob_write(data bytea) RETURNS oid
    AS $blob_write$
        DECLARE
            new_oid OID;
            fd INTEGER;
            bytes INTEGER;
        BEGIN
            new_oid := lo_create(0);
            fd := lo_open(new_oid, 131072);
            bytes := lowrite(fd, data);
            IF (bytes != LENGTH(data)) THEN
                RAISE EXCEPTION 'Not all data copied to blob';
            END IF;
            PERFORM lo_close(fd);
            RETURN new_oid;
        END;
    $blob_write$ LANGUAGE plpgsql;
    BEGIN;
    ALTER TABLE blob_chunk RENAME COLUMN chunk TO oldbytea;
    ALTER TABLE blob_chunk ADD COLUMN chunk OID;
    UPDATE blob_chunk SET chunk = blob_write(oldbytea);
    ALTER TABLE blob_chunk
        ALTER COLUMN chunk SET NOT NULL,
        DROP COLUMN oldbytea;
    COMMIT;
    

If the conversion succeeded, the psql prompt will respond with “COMMIT”. If something went wrong, psql will respond with “ROLLBACK”.

3A. The script in step 3 does not reclaim the space occupied by the oldbytea column. If there is a large amount of data in the blob_chunk table, you may want to re-initialize the whole table by moving the data to a temporary table and then copying it back:

BEGIN;
ALTER TABLE blob_chunk DISABLE TRIGGER USER;
CREATE TEMP TABLE blob_chunk_copy (LIKE blob_chunk) ON COMMIT DROP;
INSERT INTO blob_chunk_copy SELECT * FROM blob_chunk;
TRUNCATE blob_chunk;
INSERT INTO blob_chunk SELECT * FROM blob_chunk_copy;
ALTER TABLE blob_chunk ENABLE TRIGGER USER;
COMMIT;
MySQL history-preserving

Execute:

ALTER TABLE object_state ADD COLUMN state_size BIGINT AFTER md5;
UPDATE object_state SET state_size = COALESCE(LENGTH(state), 0);
ALTER TABLE object_state MODIFY state_size BIGINT NOT NULL AFTER md5;
MySQL history-free

Execute:

ALTER TABLE object_state ADD COLUMN state_size BIGINT AFTER tid;
UPDATE object_state SET state_size = COALESCE(LENGTH(state), 0);
ALTER TABLE object_state MODIFY state_size BIGINT NOT NULL AFTER tid;
Oracle

Execute:

ALTER TABLE object_state ADD state_size NUMBER(20);
UPDATE object_state SET state_size = COALESCE(LENGTH(state), 0);
ALTER TABLE object_state MODIFY state_size NOT NULL;
Migrating to RelStorage version 1.4.2

If you are using a history-free storage, you need to drop and re-create the object_refs_added table. It contains only temporary state used during packing, so it is safe to drop and create the table at any time except while packing.

Do not make these changes to history-preserving databases.

PostgreSQL:

DROP TABLE object_refs_added;
CREATE TABLE object_refs_added (
    zoid        BIGINT NOT NULL PRIMARY KEY,
    tid         BIGINT NOT NULL
);

MySQL:

DROP TABLE object_refs_added;
CREATE TABLE object_refs_added (
    zoid        BIGINT NOT NULL PRIMARY KEY,
    tid         BIGINT NOT NULL
) ENGINE = MyISAM;

Oracle:

DROP TABLE object_refs_added;
CREATE TABLE object_refs_added (
    zoid        NUMBER(20) NOT NULL PRIMARY KEY,
    tid         NUMBER(20) NOT NULL
);
Migrating to RelStorage version 1.4

Before following these directions, first upgrade to the schema of RelStorage version 1.1.2 by following the directions in Migrating from RelStorage version 1.1.1 to version 1.1.2.

Only Oracle needs a change for this release. The Oracle adapter now requires the EXECUTE permission on the DBMS_LOCK package. In the example below, “zodb” is the name of the account:

GRANT EXECUTE ON DBMS_LOCK TO zodb;

Also, the Oracle adapter in RelStorage no longer uses the commit_lock table, so you can drop it. It contains no data.

DROP TABLE commit_lock;
Migrating from RelStorage version 1.1.1 to version 1.1.2

Before following these directions, first upgrade to the schema of RelStorage version 1.1.1 by following the directions in Migrating from RelStorage version 1.1 to version 1.1.1.

Only Oracle needs a schema update for this release:

DROP TABLE temp_pack_visit;
CREATE GLOBAL TEMPORARY TABLE temp_pack_visit (
    zoid        NUMBER(20) NOT NULL PRIMARY KEY,
    keep_tid    NUMBER(20)
);
Migrating from RelStorage version 1.1 to version 1.1.1

Before following these directions, first upgrade to the schema of RelStorage version 1.1 by following the directions in Migrating from RelStorage version 1.0 or 1.0.1 to version 1.1.

PostgreSQL:

DROP TABLE pack_object;
CREATE TABLE pack_object (
    zoid        BIGINT NOT NULL PRIMARY KEY,
    keep        BOOLEAN NOT NULL,
    keep_tid    BIGINT NOT NULL,
    visited     BOOLEAN NOT NULL DEFAULT FALSE
);
CREATE INDEX pack_object_keep_false ON pack_object (zoid)
    WHERE keep = false;
CREATE INDEX pack_object_keep_true ON pack_object (visited)
    WHERE keep = true;

MySQL:

DROP TABLE pack_object;
CREATE TABLE pack_object (
    zoid        BIGINT NOT NULL PRIMARY KEY,
    keep        BOOLEAN NOT NULL,
    keep_tid    BIGINT NOT NULL,
    visited     BOOLEAN NOT NULL DEFAULT FALSE
) ENGINE = MyISAM;
CREATE INDEX pack_object_keep_zoid ON pack_object (keep, zoid);

Oracle:

DROP TABLE pack_object;
CREATE TABLE pack_object (
    zoid        NUMBER(20) NOT NULL PRIMARY KEY,
    keep        CHAR NOT NULL CHECK (keep IN ('N', 'Y')),
    keep_tid    NUMBER(20) NOT NULL,
    visited     CHAR DEFAULT 'N' NOT NULL CHECK (visited IN ('N', 'Y'))
);
CREATE INDEX pack_object_keep_zoid ON pack_object (keep, zoid);
Migrating from RelStorage version 1.0 or 1.0.1 to version 1.1

PostgreSQL:

CREATE INDEX object_state_prev_tid ON object_state (prev_tid);

DROP INDEX pack_object_keep_zoid;
CREATE INDEX pack_object_keep_false ON pack_object (zoid)
    WHERE keep = false;
CREATE INDEX pack_object_keep_true ON pack_object (zoid, keep_tid)
    WHERE keep = true;

ALTER TABLE transaction ADD COLUMN empty BOOLEAN NOT NULL DEFAULT FALSE;

CREATE INDEX current_object_tid ON current_object (tid);

ALTER TABLE object_ref ADD PRIMARY KEY (tid, zoid, to_zoid);
DROP INDEX object_ref_from;
DROP INDEX object_ref_tid;
DROP INDEX object_ref_to;

CREATE TABLE pack_state (
    tid         BIGINT NOT NULL,
    zoid        BIGINT NOT NULL,
    PRIMARY KEY (tid, zoid)
);

CREATE TABLE pack_state_tid (
    tid         BIGINT NOT NULL PRIMARY KEY
);

Users of PostgreSQL 8.2 and above should also drop the pack_lock table since it has been replaced with an advisory lock:

DROP TABLE pack_lock;

Users of PostgreSQL 8.1 and below still need the pack_lock table. If you have deleted it, please create it again with the following statement:

CREATE TABLE pack_lock ();

MySQL:

CREATE INDEX object_state_prev_tid ON object_state (prev_tid);

ALTER TABLE transaction ADD COLUMN empty BOOLEAN NOT NULL DEFAULT FALSE;

CREATE INDEX current_object_tid ON current_object (tid);

ALTER TABLE object_ref ADD PRIMARY KEY (tid, zoid, to_zoid);
ALTER TABLE object_ref DROP INDEX object_ref_from;
ALTER TABLE object_ref DROP INDEX object_ref_tid;
ALTER TABLE object_ref DROP INDEX object_ref_to;

CREATE TABLE pack_state (
    tid         BIGINT NOT NULL,
    zoid        BIGINT NOT NULL,
    PRIMARY KEY (tid, zoid)
) ENGINE = MyISAM;

CREATE TABLE pack_state_tid (
    tid         BIGINT NOT NULL PRIMARY KEY
) ENGINE = MyISAM;

Oracle:

CREATE INDEX object_state_prev_tid ON object_state (prev_tid);

ALTER TABLE transaction ADD empty CHAR DEFAULT 'N' CHECK (empty IN ('N', 'Y'));

CREATE INDEX current_object_tid ON current_object (tid);

ALTER TABLE object_ref ADD PRIMARY KEY (tid, zoid, to_zoid);
DROP INDEX object_ref_from;
DROP INDEX object_ref_tid;
DROP INDEX object_ref_to;

CREATE TABLE pack_state (
    tid         NUMBER(20) NOT NULL,
    zoid        NUMBER(20) NOT NULL,
    PRIMARY KEY (tid, zoid)
);

CREATE TABLE pack_state_tid (
    tid         NUMBER(20) NOT NULL PRIMARY KEY
);
1.0 Beta Migration

Use one of the following scripts to migrate from RelStorage 1.0 beta to RelStorage 1.0. Alter the scripts to match the Python default encoding. For example, if ‘import sys; print sys.getdefaultencoding()’ says the encoding is “iso-8859-1”, change all occurrences of ‘UTF-8’ or ‘UTF8’ to ‘ISO-8859-1’.

PostgreSQL 8.3 (using the psql command):

ALTER TABLE transaction
    ALTER username TYPE BYTEA USING (convert_to(username, 'UTF-8')),
    ALTER description TYPE BYTEA USING (convert_to(description, 'UTF-8'));

PostgreSQL 8.2 and below (using the psql command):

ALTER TABLE transaction
    ALTER username TYPE BYTEA USING
        (decode(replace(convert(username, 'UTF-8'), '\\', '\\\\'), 'escape')),
    ALTER description TYPE BYTEA USING
        (decode(replace(convert(description, 'UTF-8'), '\\', '\\\\'), 'escape'));

MySQL (using the mysql command):

ALTER TABLE transaction
    MODIFY username BLOB NOT NULL,
    MODIFY description BLOB NOT NULL;

Oracle (using the sqlplus command):

ALTER TABLE transaction ADD (
    new_username    RAW(500),
    new_description RAW(2000),
    new_extension   RAW(2000));

UPDATE transaction
    SET new_username = UTL_I18N.STRING_TO_RAW(username, 'UTF8'),
        new_description = UTL_I18N.STRING_TO_RAW(description, 'UTF8'),
        new_extension = extension;

ALTER TABLE transaction DROP (username, description, extension);
ALTER TABLE transaction RENAME COLUMN new_username TO username;
ALTER TABLE transaction RENAME COLUMN new_description TO description;
ALTER TABLE transaction RENAME COLUMN new_extension TO extension;
Migration From PGStorage to RelStorage

PostgreSQL:

-- Migration from PGStorage to RelStorage

-- Do all the work in a transaction
BEGIN;

-- Remove the commit_order information (RelStorage has a better solution).
DROP SEQUENCE commit_seq;
ALTER TABLE transaction DROP commit_order;

-- Make the special transaction 0 match RelStorage
UPDATE transaction SET username='system',
  description='special transaction for object creation'
  WHERE tid = 0;

-- Add the MD5 column and some more constraints.
ALTER TABLE object_state
  ADD CONSTRAINT object_state_tid_check CHECK (tid > 0),
  ADD CONSTRAINT object_state_prev_tid_fkey FOREIGN KEY (prev_tid)
      REFERENCES transaction,
  ADD COLUMN md5 CHAR(32);
UPDATE object_state SET md5=md5(state) WHERE state IS NOT NULL;

-- Replace the temporary tables used for packing.
DROP TABLE pack_operation;
DROP TABLE pack_transaction;
DROP TABLE pack_keep;
DROP TABLE pack_garbage;
CREATE TABLE pack_lock ();
CREATE TABLE object_ref (
    zoid        BIGINT NOT NULL,
    tid         BIGINT NOT NULL,
    to_zoid     BIGINT NOT NULL
);
CREATE INDEX object_ref_from ON object_ref (zoid);
CREATE INDEX object_ref_tid ON object_ref (tid);
CREATE INDEX object_ref_to ON object_ref (to_zoid);
CREATE TABLE object_refs_added (
    tid         BIGINT NOT NULL PRIMARY KEY
);
CREATE TABLE pack_object (
    zoid        BIGINT NOT NULL PRIMARY KEY,
    keep        BOOLEAN NOT NULL,
    keep_tid    BIGINT
);
CREATE INDEX pack_object_keep_zoid ON pack_object (keep, zoid);

-- Now commit everything
COMMIT;

Developing RelStorage

Hacking on RelStorage should generally be done in a virtual environment. Buildout may also be used.

Running Tests

RelStorage ships with an extensive test suite. It can be run from a local checkout using tox:

tox -e py27-mysql

There are environments for each database and each Python version.

You can also run tests manually. Most test file contain a __main__ block that can be used with Python’s -m:

python -m relstorage.tests.test_zodbconvert

There is one file that will run all tests:

python -m relstorage.tests.alltests

If not database drivers are installed, it will only run the unit tests. Otherwise, tests for all installed drivers will be attempted; this can be bypassed with --no-db or limited to particular databases with --only-[mysql|pgsql|oracle].

Databases

Many of RelStorage’s test require access to a database. You will have to have the appropriate database adapter (driver) installed to run these tests; tests for which you don’t have an adapter installed will be skipped.

For example, to get started testing RelStorage against MySQL in a virtual environment you could write:

pip install -e ".[mysql]"

from the root of the checkout. This sets up an editable installation of relstorage complete with the correct MySQL driver.

Before actually running the tests, you need to create a test user account and several databases. Use or adapt the SQL statements below to create the databases. (You can also see example scripts that are used to set up the continuous integration test environment in the .travis directory.)

PostgreSQL

Execute the following using the psql command:

psql -U postgres -c "CREATE USER relstoragetest WITH PASSWORD 'relstoragetest';"
psql -U postgres -c "CREATE DATABASE relstoragetest OWNER relstoragetest;"
psql -U postgres -c "CREATE DATABASE relstoragetest2 OWNER relstoragetest;"
psql -U postgres -c "CREATE DATABASE relstoragetest_hf OWNER relstoragetest;"
psql -U postgres -c "CREATE DATABASE relstoragetest2_hf OWNER relstoragetest;"

Also, add the following lines to the top of pg_hba.conf (if you put them at the bottom, they may be overridden by other parameters):

local   relstoragetest     relstoragetest   md5
local   relstoragetest2    relstoragetest   md5
local   relstoragetest_hf  relstoragetest   md5
local   relstoragetest2_hf relstoragetest   md5
host    relstoragetest     relstoragetest   127.0.0.1/32 md5
host    relstoragetest_hf  relstoragetest   127.0.0.1/32 md5

PostgreSQL specific tests can be run by the testposgresql module:

python -m relstorage.tests.testpostgresql

If the environment variable RS_PG_SMALL_BLOB is set when running the tests, certain blob tests will use a much smaller size, making the test run much faster.

MySQL

Execute the following using the mysql command:

mysql -uroot -e "CREATE USER 'relstoragetest'@'localhost' IDENTIFIED BY 'relstoragetest';"
mysql -uroot -e "CREATE DATABASE relstoragetest;"
mysql -uroot -e "GRANT ALL ON relstoragetest.* TO 'relstoragetest'@'localhost';"
mysql -uroot -e "CREATE DATABASE relstoragetest2;"
mysql -uroot -e "GRANT ALL ON relstoragetest2.* TO 'relstoragetest'@'localhost';"
mysql -uroot -e "CREATE DATABASE relstoragetest_hf;"
mysql -uroot -e "GRANT ALL ON relstoragetest_hf.* TO 'relstoragetest'@'localhost';"
mysql -uroot -e "CREATE DATABASE relstoragetest2_hf;"
mysql -uroot -e "GRANT ALL ON relstoragetest2_hf.* TO 'relstoragetest'@'localhost';"
mysql -uroot -e "FLUSH PRIVILEGES;"

MySQL specific tests can be run by the testmysql module:

python -m relstorage.tests.testmysql

Note

For some MySQL tests to pass (check16MObject), it may be necessary to increase the server’s max_allowed_packet setting. See the MySQL Documentation for more information.

Oracle

Initial setup will require SYS privileges. Using Oracle 10g XE, you can start a SYS session with the following shell commands:

$ su - oracle
$ sqlplus / as sysdba

Using sqlplus with SYS privileges, execute the following:

CREATE USER relstoragetest IDENTIFIED BY relstoragetest;
GRANT CONNECT, RESOURCE, CREATE TABLE, CREATE SEQUENCE TO relstoragetest;
GRANT EXECUTE ON DBMS_LOCK TO relstoragetest;
CREATE USER relstoragetest2 IDENTIFIED BY relstoragetest;
GRANT CONNECT, RESOURCE, CREATE TABLE, CREATE SEQUENCE TO relstoragetest2;
GRANT EXECUTE ON DBMS_LOCK TO relstoragetest2;
CREATE USER relstoragetest_hf IDENTIFIED BY relstoragetest;
GRANT CONNECT, RESOURCE, CREATE TABLE, CREATE SEQUENCE TO relstoragetest_hf;
GRANT EXECUTE ON DBMS_LOCK TO relstoragetest_hf;
CREATE USER relstoragetest2_hf IDENTIFIED BY relstoragetest;
GRANT CONNECT, RESOURCE, CREATE TABLE, CREATE SEQUENCE TO relstoragetest2_hf;
GRANT EXECUTE ON DBMS_LOCK TO relstoragetest2_hf;

You may need to grant tablespace privileges if you get “no privileges on tablespace” errors:

grant unlimited tablespace to relstoragetest;
grant unlimited tablespace to relstoragetest2;
grant unlimited tablespace to relstoragetest_hf;
grant unlimited tablespace to relstoragetest2_hf;

Oracle specific tests can be run by the testoracle module:

python -m relstorage.tests.testoracle

When running the tests, you can use the environment variable ORACLE_TEST_DSN to override the data source name, which defaults to “XE” (for Oracle 10g XE). For example, using Oracle’s Developer Days Virtual Box VM with an IP of 192.168.1.131, you might set ORACLE_TEST_DSN to 192.168.1.131/orcl. (And you would connect as sysdba with sqlplus 'sys/oracle@192.168.1.131/orcl' as sysdba.)

If the environment variable RS_ORCL_SMALL_BLOB is set when running the tests, certain blob tests will use a much smaller size, making the test run much faster.

Docs:
http://www.oracle.com/pls/db102/homepage
Excellent setup instructions:
http://www.davidpashley.com/articles/oracle-install.html

Work around session limit (fixes ORA-12520):

ALTER SYSTEM SET PROCESSES=150 SCOPE=SPFILE;
ALTER SYSTEM SET SESSIONS=150 SCOPE=SPFILE;
(then restart Oracle)

Manually rollback an in-dispute transaction:

select local_tran_id, state from DBA_2PC_PENDING;
rollback force '$local_tran_id';
Caching with checkpoints

The caching strategy (both local in StorageCache and in memcache) includes checkpoints. Checkpoint management is a bit complex, but important for achieving a decent cache hit rate.

Checkpoints are 64 bit integer transaction IDs. Invariant: checkpoint0 is greater than or equal to checkpoint1, meaning that if they are different, checkpoint0 is the most recent.

Cache key “$prefix:checkpoints” holds the current checkpoints (0 and 1). If the cache key is missing, set it to the current tid, which means checkpoint1 and checkpoint0 are at the same point.

Each StorageCache instance holds a Python map of {oid: tid} changes after checkpoint0. This map is called delta_after0. The map will not be shared because each instance updates the map at different times.

The (oid, tid) list retrieved from polling is sufficient for updating delta_after0 directly, unless checkpoint0 has moved since the last poll. Note that delta_after0 could have a tid more recent than the data provided by polling, due to conflict resolution. The combination should use the latest tid from each map.

Also hold a map of {oid: tid} changes after checkpoint1 and before or at checkpoint0. It is called delta_after1. This map is immutable, so it would be nice to share it between threads.

When looking up an object in the cache, try to get:

  1. The state at delta_after0.
  2. The state at checkpoint0.
  3. The state at delta_after1.
  4. The state at checkpoint1.
  5. The state from the database.

If the retrieved state is older than checkpoint0, but it was not retrieved from checkpoint0, cache it at checkpoint0. Thus if we get data from delta_after1 or checkpoint1, we should copy it to checkpoint0.

The current time is ignored; we only care about transaction timestamps. In a sense, time is frozen until the next transaction commit. This should have a side effect of making databases that don’t change often extremely cacheable.

After polling, check the number of objects now held in delta_after0. If it is beyond a threshold (perhaps 10k), suggest that future polls use new checkpoints. Update “$prefix:checkpoints”.

Checkpoint values stay constant within a transaction. Even if the transaction takes hours and its data is stale, it should keep trying to retrieve from the tids specified in delta_after(0|1) and checkpoint(0|1); it can go ahead and cache what it retrieves. Who knows, there might be yet another long running transaction that could use the cached data.

If we load objects without polling, don’t use the cache.

While polling, it is possible for checkpoint0 to be greater than the latest transaction ID just polled, since other transactions might be adding data very quickly. If that happens, the instance should ignore the checkpoint update, with the expectation that the new checkpoint will be visible after the next update.

What Happens In A Pack

Let us begin by examining this diagram of a database being packed.

In this model, a row represents an object (with an unchanging OID but a changing state) and a column represents a transaction (aka revision).

Legend:

>
Modified in transaction, refers to no other objects
>n
Modified in transaction, has a reference to object n
Red
Will be packed
Yellow
History will terminate (not go back farther than) this revision
Brown
Packed previously
White
Not packed
Grey
Pseudo-transaction
_images/pack_policy.png
Interpretation

What does this mean for each object?

1
Revisions 14, 13 and 12 are kept, while revision 10 is dropped. The prev_tid of revision 12 is set to 0. Until the current revision, this object referred to object 9.
2
Both revisions (13 and 11) are kept. This object is keeping object 3 alive. Although nothing refers to this object, packing does not remove objects with revisions beyond pack_tid.
3
Since a current revision of object 2 refers to this object, revision 12 is kept, but its history is cut short (revision 11 is dropped and the prev_tid for revision 12 is set to 0).
5
Nothing refers to this object, so the OID is completely removed from the database.
7
Since a current revision of object 3 refers to this object, revision 11 is kept.
9
Revision 13 of object 1, which refers to this object, will not be packed, so this OID must be kept even though no current revision refers to it. This object is keeping alive transaction 10, which was packed earlier.
0
This is the root object since its OID is 0. The root object is never removed from the database, but its history will be cut short: revision 10 will be removed and the prev_tid of revision 11 will be set to 0.
Notes
  • Packing removes both columns and rows from the table, with exceptions.
  • We always pack a list of transactions ranging from just after tid 0 (which is a pseudo-transaction) to pack_tid, which is the last transaction committed before an application-specified time.
  • The list of columns to remove is 0 < tid <= pack_tid.
  • The list of rows to remove is in the temporary table pack_object, where keep_tid is NULL.
  • The list of rows to cut short (i.e. leave only the current state and otherwise pack) is also in the temporary table pack_object, where keep_tid is not null.
  • The keep_tid value of the pack_object table, when set, is copied from current_object.
  • Some of the transactions may have been packed already. Previously packed transactions hang around until they contain no object states.
  • Packing does not change current_object except when it removes objects from the database.
  • After packing, some of the packed transactions may be kept because object states need them, but they will not appear in the list of undoable transactions because the packed bit is set.
  • In the model above, all of the transactions are kept (with the packed bit set), but many object states are removed. If there were a purely red or purely brown column, it would be removed completely.

Changes

2.1a3 (unreleased)
  • Nothing changed yet.
2.1a2 (2017-04-15)
  • Implemented the storage afterCompletion method, which allows RelStorage storages to be notified of transaction endings for transactions that don’t call the two-phase commit API. This allows resources to be used more efficiently because it prevents RDBMS transactions from being held open.

    Fixes: issue #147 (At least for ZODB 5.2.)

  • Oracle: Fix two queries that got broken due to the performance work in 2.1a1.

  • MySQL: Workaround a rare issue that could lead to a TypeError when getting new OIDs. See issue #173.

  • The len of a RelStorage instance now correctly reflects the approximate number of objects in the database. Previously it returned a hardcoded 0. See issue #178.

  • MySQL: Writing blobs to the database is much faster and scales much better as more blobs are stored. The query has been rewritten to use existing primary key indexes, whereas before it used a table scan due to deficiencies in the MySQL query optimizer. Thanks to Josh Zuech and enfold-josh. See issue #175.

2.1a1 (2017-02-01)
  • 3.6.0 final release is tested on CI servers.
  • Substantial performance improvements for PostgreSQL, both on reading and writing. Reading objects can be 20-40% faster. Writing objects can be 15-25% faster (the most benefit will be seen by history-free databases on PostgreSQL 9.5 and above). MySQL may have a (much) smaller improvement too, especially for small transactions. This was done through the use of prepared statements for the most important queries and the new ‘ON CONFLICT UPDATE’ syntax. See pull request #157 and issue #156.
  • The umysqldb driver no longer attempts to automatically reconnect on a closed cursor exception. That fails now that prepared statements are in use. Instead, it translates the internal exception to one that the higher layers of RelStorage recognize as requiring reconnection at consistent times (transaction boundaries).
  • Add initial support for the MySQL Connector/Python driver. See issue #155.
  • Backport ZODB #140 to older versions of ZODB. This improves write performance, especially in multi-threaded scenarios, by up to 10%. See pull request #160.
  • MySQL temporary tables now use the InnoDB engine instead of MyISAM. See pull request #162.
2.0.0 (2016-12-23)
  • MySQL and Postgres now use the same optimized methods to get the latest TID at transaction commit time as they do at poll time. This is similar to issue #89.
  • MySQL now releases the commit lock (if acquired) during pre-pack with GC of a history-free storage at the same time as PostgreSQL and Oracle did (much earlier). Reported and initial fix provided in pull request #9 by jplouis.
2.0.0rc1 (2016-12-12)
  • Writing persistent cache files has been changed to reduce the risk of stale temporary files remaining. Also, files are kept open for a shorter period of time and removed in a way that should work better on Windows.
  • RelStorage is now tested on Windows for MySQL and PostgreSQL thanks to AppVeyor.
  • Add support for Python 3.6.
2.0.0b9 (2016-11-29)
  • The MySQL adapter will now produce a more informative error if it gets an unexpected result taking the commit lock. Reported by Josh Zuech.
  • Compatibility with transaction 2.0 on older versions of ZODB (prior to the unreleased version that handles encoding meta data for us), newer versions of ZODB (that do the encoding), while maintaining compatibility with transaction 1.x. In particular, the history method consistently returns bytes for username and description.
  • In very rare cases, persistent cache files could result in a corrupt cache state in memory after loading them, resulting in AttributeErrors until the cache files were removed and the instance restarted. Reported in issue #140 by Carlos Sanchez.
2.0.0b8 (2016-10-02)
  • List CFFI in setup_requires for buildout users.
2.0.0b7 (2016-10-01)
  • Add the ability to limit the persistent cache files size. Thanks to Josh Zuech for the suggestion, which led to the next change.

  • Move the RelStorage shared cache to a windowed-LFU with segmented LRU instead of a pure LRU model. This can be a nearly optimal caching strategy for many workloads. The caching code itself is also faster in all tested cases.

    It’s especially helpful when using persistent cache files together with a file size limit, as we can now ensure we write out the most frequently useful data to the file instead of just the newest.

    For more information see issue #127 and pull request #128. Thanks to Ben Manes for assistance talking through issues related to the cache strategy.

    For write-heavy workloads, you may want to increase cache_delta_size_limit.

    The internal implementation details of the cache have been completely changed. Only the StorageCache class remains unchanged (though that’s also an implementation class). CFFI is now required, and support for PyPy versions older than 2.6.1 has been dropped.

  • On CPython, use LLBTrees for the cache delta maps. This allows using a larger, more effective size while reducing memory usage. Fixes issue #130.

  • Persistent cache files use the latest TID in the cache as the file’s modification time. This allows a more accurate choice of which file to read at startup. Fixes issue #126.

  • Fix packing of history-preserving Oracle databases. Reported in issue #135 by Peter Jacobs.

2.0.0b6 (2016-09-08)
  • Use setuptools.find_packages and include_package_data to ensure wheels have all the files necessary. This corrects an issue with the 2.0.0b5 release on PyPI. See issue #121 by Carlos Sanchez.
2.0.0b5 (2016-08-24)
  • Supporting new databases should be simpler due to a code restructuring. Note that many internal implementation classes have moved or been renamed.

  • The umysqldb support handles query transformations more efficiently.

  • umysqldb now raises a more informative error when the server sends too large a packet.

    Note

    If you receive “Socket receive buffer full” errors, you are likely experiencing this issue in ultramysql and will need a patched version, such as the one provided in this pull request.

  • The local persistent cache file format has been changed to improve reading and writing speed. Old files will be cleaned up automatically. Users of the default settings could see improvements of up to 3x or more on reading and writing.

  • Compression of local persistent cache files has been disabled by default (but there is still an option to turn it back on). Operational experience showed that it didn’t actually save that much disk space, while substantially slowing down the reading and writing process (2-4x).

  • Add an option, cache-local-dir-read-count to limit the maximum number of persistent local cache files will be used to populate a storages’s cache. This can be useful to reduce startup time if cache files are large and workers have mostly similar caches.

2.0.0b4 (2016-07-17)
  • Add experimental support for umysqldb as a MySQL driver for Python 2.7. This is a gevent-compatible driver implemented in C for speed. Note that it may not be able to store large objects (it has been observed to fail for a 16M object—it hardcodes a max_allowed_packet of exactly 16MB for read and write buffers), and has been observed to have some other stability issues.
2.0.0b3 (2016-07-16)
  • Add support for ZODB 5. RelStorage continues to run on ZODB 4 >= 4.4.2.
  • Add support for tooling to help understand RelStorage cache behaviour. This can help tune cache sizes and the choice to use Memcached or not. See issue #106 and pull request #108.
  • Fix a threading issue with certain database drivers.
2.0.0b2 (2016-07-08)
Breaking Changes
  • Support for cx_Oracle versions older than 5.0 has been dropped. 5.0 was released in 2008.
  • Support for PostgreSQL 8.1 and earlier has been dropped. 8.2 is likely to still work, but 9.0 or above is recommended. 8.2 was released in 2006 and is no longer supported by upstream. The oldest version still supported by upstream is 9.1, released in 2011.
Platform Support
  • Using ZODB >= 4.4.2 (but not 5.0) is recommended to avoid deprecation warnings due to the introduction of a new storage protocol. The next major release of RelStorage will require ZODB 4.4.2 or above and should work with ZODB 5.0.
  • Change the recommended and tested MySQL client for Python 2.7 away from the unmaintained MySQL-python to the maintained mysqlclient (the same one used by Python 3).
  • PyMySQL now works and is tested on Python 3.
  • A pure-Python PostgreSQL driver, pg8000, now works and is tested on all platforms. This is a gevent-compatible driver. Note that it requires a PostgreSQL 9.4 server or above for BLOB support.
  • Support explicitly specifying the database driver to use. This can be important when there is a large performance difference between drivers, and more than one might be installed. (Also, RelStorage no longer has the side-effect of registering PyMySQL as MySQLdb and psycopg2cffi as psycopg2.) See issue #86.
Bug Fixes
  • Memcache connections are explicitly released instead of waiting for GC to do it for us. This is especially important with PyPy and/or python-memcached. See issue #80.
  • The poll-interval option is now ignored and polling is performed when the ZODB Connection requests it (at transaction boundaries). Experience with delayed polling has shown it typically to do more harm than good, including introducing additional possibilities for error and leading to database performance issues. It is expected that most sites won’t notice any performance difference. A larger discussion can be found in issue #87.
Performance
  • Support a persistent on-disk cache. This can greatly speed up application warmup after a restart (such as when deploying new code). Some synthetic benchmarks show an 8-10x improvement. See issue #92 for a discussion, and see the options cache-local-dir and cache-local-dir-count.
  • Instances of RelStorage no longer use threading locks by default and hence are not thread safe. A ZODB Connection is documented as not being thread-safe and must be used only by a single thread at a time. Because RelStorage natively implements MVCC, each Connection has a unique storage object. It follows that the storage object is used only by a single thread. Using locks just adds unneeded overhead to the common case. If this is a breaking change for you, please open an issue. See pull request #91.
  • MySQL uses (what should be) a slightly more efficient poll query. See issue #89.
  • The in-memory cache allows for higher levels of concurrent operation via finer-grained locks. For example, compression and decompression are no longer done while holding a lock.
  • The in-memory cache now uses a better approximation of a LRU algorithm with less overhead, so more data should fit in the same size cache. (For best performance, CFFI should be installed; a warning is generated if that is not the case.)
  • The in-memory cache is now smart enough not to store compressed objects that grow during compression, and it uses the same compression markers as zc.zlibstorage to avoid double-compression. It can also gracefully handle changes to the compression format in persistent files.
2.0.0b1 (2016-06-28)
Breaking Changes
  • Update the ZODB dependency from ZODB3 3.7.0 to ZODB 4.3.1. Support for ZODB older than 3.10 has been removed; ZODB 3.10 may work, but only ZODB 4.3 is tested.
  • Remove support for Python 2.6 and below. Python 2.7 is now required.
Platform Support
  • Add support for PyPy on MySQL and PostgreSQL using PyMySQL and psycopg2cffi respectively. PyPy can be substantially faster than CPython in some scenarios; see pull request #23.
  • Add initial support for Python 3.4+ for MySQL (using mysqlclient), PostgreSQL, and Oracle.
Bug Fixes
  • Fixed loadBefore of a deleted/undone object to correctly raise a POSKeyError instead of returning an empty state. (Revealed by updated tests for FileStorage in ZODB 4.3.1.)
  • Oracle: Packing should no longer produce LOB errors. This partially reverts the speedups in 1.6.0b2. Reported in issue #30 by Peter Jacobs.
  • RelStorage.registerDB() and RelStorage.new_instance() now work with storage wrappers like zc.zlibstorage. See issue #70 and issue #71.
Included Utilities
  • zodbconvert: The --incremental option is supported with a FileStorage (or any storage that implements IStorage.lastTransaction()) as a destination, not just RelStorages.
  • zodbconvert: The --incremental option works correctly with a RelStorage as a destination. See pull request #22. With contributions by Sylvain Viollon, Mauro Amico, and Peter Jacobs. Originally reported by Jan-Wijbrand Kolman.
  • PostgreSQL: zodbconvert --clear should be much faster when the destination is a PostgreSQL schema containing lots of data. NOTE: There can be no other open RelStorage connections to the destination, or any PostgreSQL connection in general that might be holding locks on the RelStorage tables, or zodbconvert will block indefinitely waiting for the locks to be released. Partial fix for issue #16 reported by Chris McDonough.
  • zodbconvert and zodbpack use argparse instead of optparse for command line handling.
Performance
  • MySQL: Use the “binary” character set to avoid producing “Invalid utf8 character string” warnings. See issue #57.
  • Conflict resolution uses the locally cached state instead of re-reading it from the database (they are guaranteed to be the same). See issue #38.
  • Conflict resolution reads all conflicts from the database in one query, instead of querying for each individual conflict. See issue #39.
  • PostgreSQL no longer encodes and decodes object state in Base64 during database communication thanks to database driver improvements. This should reduce network overhead and CPU usage for both the RelStorage client and the database server. psycopg2 2.4.1 or above is required; 2.6.1 or above is recommended. (Or psycopg2cffi 2.7.4.)
  • PostgreSQL 9.3: Support commit-lock-timeout. Contributed in pull request #20 by Sean Upton.
Other Enhancements
  • Raise a specific exception when acquiring the commit lock (UnableToAcquireCommitLockError) or pack lock (UnableToAcquirePackUndoLockError) fails. See pull request #18.
  • RelStorage.lastTransaction() is more consistent with FileStorage and ClientStorage, returning a useful value in more cases.
  • Oracle: Add support for getting the database size. Contributed in pull request #21 by Mauro Amico.
  • Support ZODB.interfaces.IExternalGC for history-free databases, allowing multi-database garbage collection with zc.zodbdgc. See issue #47.
Project Details
  • Travis CI is now used to run RelStorage tests against MySQL and PostgreSQL on every push and pull request. CPython 2 and 3 and PyPy are all tested with the recommended database drivers.
  • Documentation has been reorganized and moved to readthedocs.
  • Updated the buildout configuration to just run relstorage tests and to select which databases to use at build time.
1.6.1 (2016-08-30)
  • Tests: Basic integration testing is done on Travis CI. Thanks to Mauro Amico.
  • RelStorage.lastTransaction() is more consistent with FileStorage and ClientStorage, returning a useful value in more cases.
  • zodbconvert: The --incremental option is supported with a FileStorage (or any storage that implements IStorage.lastTransaction()) as a destination, not just RelStorages.
  • zodbconvert: The --incremental option is supported with a RelStorage as a destination. See pull request #22. With contributions by Sylvain Viollon, Mauro Amico, and Peter Jacobs. Originally reported by Jan-Wijbrand Kolman.
  • Oracle: Packing should no longer produce LOB errors. This partially reverts the speedups in 1.6.0b2. Reported in issue #30 by Peter Jacobs.
1.6.0 (2016-06-09)
  • Tests: Use the standard library doctest module for compatibility with newer zope.testing releases.
1.6.0b3 (2014-12-08)
  • Packing: Significantly reduced the RAM consumed by graph traversal during the pre_pack phase. (Tried several methods; encoded 64 bit IISets turned out to be the most optimal.)
1.6.0b2 (2014-10-03)
  • Packing: Used cursor.fetchmany() to make packing more efficient.
1.6.0b1 (2014-09-04)
  • The local cache is now more configurable and uses zlib compression by default.
  • Added support for zodburi, which means you can open a storage using “postgres:”, “mysql:”, or “oracle:” URIs.
  • Packing: Reduced RAM consumption while packing by using IIBTree.Set instead of built-in set objects.
  • MySQL 5.5: The test suite was freezing in checkBackwardTimeTravel. Fixed.
  • Added performance metrics using the perfmetrics package.
  • zodbconvert: Add an –incremental option to the zodbconvert script, letting you convert additional transactions at a later date, or update a non-live copy of your database, copying over missing transactions.
  • Replication: Added the ro-replica-conf option, which tells RelStorage to use a read-only database replica for load connections. This makes it easy for RelStorage clients to take advantage of read-only database replicas.
  • Replication: When the database connection is stale (such as when RelStorage switches to an asynchronous replica that is not yet up to date), RelStorage will now raise ReadConflictError by default. Ideally, the application will react to the error by transparently retrying the transaction, while the database gets up to date. A subsequent transaction will no longer be stale.
  • Replication: Added the revert-when-stale option. When this option is true and the database connection is stale, RelStorage reverts the ZODB connection to the stale state rather than raise ReadConflictError. This option is intended for highly available, read-only ZODB clients. This option would probably confuse users of read-write ZODB clients, whose changes would sometimes seem to be temporarily reverted.
  • Caching: Use the database name as the cache-prefix by default. This will hopefully help people who accidentally use a single memcached for multiple databases.
  • Fixed compatibility with persistent 4.0.5 and above.
1.5.1 (2011-11-12)
  • Packing: Lowered garbage collection object reference finding log level to debug; this stage takes mere seconds, even in large sites, but could produce 10s of thousands of lines of log output.
  • RelStorage was opening a test database connection (and was leaving it idle in a transaction with recent ZODB versions that support IMVCCStorage.) RelStorage no longer opens that test connection.
  • zodbconvert: Avoid holding a list of all transactions in memory.
  • Just after installing the database schema, verify the schema was created correctly. This affects MySQL in particular.
1.5.0 (2011-06-30)
  • PostgreSQL: Fixed another minor compatibility issue with PostgreSQL 9.0. Packing raised an error when the client used old an version of libpq.

  • Delete empty transactions in batches of 1000 rows instead of all in one go, to prevent holding the transaction lock for longer than absolutely necessary.

  • Oracle: Fix object reference downloading performance for large RelStorage databases during the garbage collection phase of a pack.

  • Oracle, PostgreSQL: Switch to storing ZODB blob in chunks up to 4GB (the maximum supported by cx_Oracle) or 2GB (PostgreSQL maximum blob size) to maximize blob reading and writing performance.

    The PostgreSQL blob_chunk schema changed to support this, see notes/migrate-to-1.5.txt to update existing databases.

  • zodbconvert: When copying a database containing blobs, ensure the source blob file exists long enough to copy it.

1.5.0b2 (2011-03-02)
  • Better packing based on experience with large databases. Thanks to Martijn Pieters!

    • Added more feedback to the packing process. It’ll now report during batch commit how much of the total work has been completed, but at most every .1% of the total number of transactions or objects to process.
    • Renamed the –dry-run option to –prepack and added a –use-prepack-state to zodbpack. With these 2 options the pre-pack and pack phases can be run separately, allowing re-use of the pre-pack analysis data or even delegating the pre-pack phase off to a separate server.
    • Replaced the packing duty cycle with a nowait locking strategy. The pack operation will now request the commit lock but pauses if it is already taken. It releases the lock after every batch (defaulting to 1 second processing). This makes the packing process faster while at the same time yielding to regular ZODB commits when busy.
    • Do not hold the commit lock during pack cleanup while deleting rows from the object reference tables; these tables are pack-specific and regular ZODB commits never touch these.
  • Added an option to control schema creation / updating on startup. Setting the create-schema option to false will let you use RelStorage without a schema update.

  • Fixed compatibility with PostgreSQL 9.0, which is capable of returning a new ‘hex’ type to the client. Some builds of psycopg2 return garbage or raise an error when they see the new type. The fix was to encode more SQL query responses using base 64.

  • With the new shared-blob-dir option set to false, it was possible for a thread to read a partially downloaded blob. Fixed. Thanks for the report from Maurits van Rees.

  • Support for “shared-blob-dir false” now requires ZODB 3.9 or better. The code in the ZODB 3.8 version of ZODB.blob is not compatible with BlobCacheLayout, leading to blob filename collisions.

1.5.0b1 (2011-02-05)
  • Added a state_size column to object_state, making it possible to query the size of objects without loading the state. The new column is intended for gathering statistics. A schema migration is required.
  • Added more logging during zodbconvert to show that something is happening and give an indication of how far along the process is.
  • Fixed a missing import in the blob cache cleanup code.
  • Added a –dry-run option to zodbpack.
  • Replaced the graph traversal portion of the pack code with a more efficient implementation using Python sets (instead of SQL). The new code is much faster for packing databases with deeply nested objects.
1.5.0a1 (2010-10-21)
  • Added an option to store ZODB blobs in the database. The option is called “shared-blob-dir” and it behaves very much like the ZEO option of the same name. Blobs stored in the database are broken into chunks to reduce the impact on RAM.
  • Require setuptools or distribute. Plain distutils is not sufficient.
1.4.2 (2011-02-04)
  • Fixed compatibility with ZODB 3.10. As reported by JĂźrgen Herrmann, there was a problem with conflict errors. The RelStorage patch of the sync() method now works with ZODB 3.10.
  • Fixed a bug in packing history-free databases. If changes were made to the database during the pack, the pack code could delete too many objects. Thanks to Chris Withers for writing test code that revealed the bug. A schema migration is required for history-free databases; see notes/migration-to-1.4.txt.
  • Enabled logging to stderr in zodbpack.
1.4.1 (2010-10-21)
  • Oracle: always connect in threaded mode. Without threaded mode, clients of Oracle 11g sometimes segfault.
1.4.0 (2010-09-30)
  • Made compatible with ZODB 3.10.0b7.
  • Enabled ketama and compression in pylibmc_wrapper. Both options are better for clusters. [Helge Tesdal]
  • Oracle: Use a more optimal query for POSKeyError logging. [Helge Tesdal]
  • Fixed a NameError that occurred when getting the history of an object where transaction extended info was set. [Helge Tesdal]
1.4.0c4 (2010-09-17)
  • Worked around an Oracle RAC bug: apparently, in a RAC environment, the read-only transaction mode does not isolate transactions in the manner specified by the documentation, so Oracle users now have to use serializable isolation like everyone else. It’s slower but more reliable.
  • Use the client time instead of the database server time as a factor in the transaction ID. RelStorage was using the database server time to reduce the need for synchronized clocks, but in practice, that policy broke tests and did not really avoid the need to synchronize clocks. Also, the effect of unsynchronized clocks is predictable and manageable: you’ll get bunches of transactions with sequential timestamps.
  • If the database returns an object from the future (which should never happen), generate a ReadConflictError, hopefully giving the application a chance to recover. The most likely causes of this are a broken database and threading bugs.
1.4.0c3 (2010-07-31)
  • Always update the RelStorage cache when opening a database connection for loading, even when no ZODB Connection is using the storage. Otherwise, code that used the storage interface directly could cause the cache to fall out of sync; the effects would be seen in the next ZODB.Connection.
  • Added a ZODB monkey patch that passes the “force” parameter to the sync method. This should help the poll-interval option do its job better.
1.4.0c2 (2010-07-28)
  • Fixed a subtle bug in the cache code that could lead to an AssertionError indicating a cache inconsistency. The inconsistency was caused by after_poll(), which was ignoring the randomness of the order of the list of recent changes, leading it to sometimes put the wrong transfer ID in the “delta_after” dicts. Also expanded the AssertionError with debugging info, since cache inconsistency can still be caused by database misconfiguration and mismatched client versions.
  • Oracle: updated the migration notes. The relstorage_util package is not needed after all.
1.4.0c1 (2010-06-19)
  • History-preserving storages now replace objects on restore instead of just inserting them. This should solve problems people were having with the zodbconvert utility.

  • Oracle: call the DBMS_LOCK.REQUEST function directly instead of using a small package named relstorage_util. The relstorage_util package was designed as a secure way to access the DBMS_LOCK package, but the package turned out to be confusing to DBAs and provided no real security advantage. People who have already deployed RelStorage 1.4.x on Oracle need to do the following:

    GRANT EXECUTE ON DBMS_LOCK TO <zodb_user>;

    You can also drop the relstorage_util package. Keep the relstorage_op package.

  • Made compatible with ZODB 3.10.

  • MySQL: specify the transaction isolation mode for every connection, since the default is apparently not necessarily “read committed” anymore.

1.4.0b3 (2010-02-02)
  • Auto-reconnect in new_oid().
1.4.0b2 (2010-01-30)
  • Include all test subpackages in setup.py.
  • Raise an error if MySQL reverts to MyISAM rather than using the InnoDB storage engine.
1.4.0b1 (2009-11-17)
  • Added the keep-history option. Set it to false to keep no history. (Packing is still required for garbage collection and blob deletion.)
  • Added the replica-conf and replica-timeout options. Set replica-conf to a filename containing the location of database replicas. Changes to the file take effect at transaction boundaries.
  • Expanded the option documentation in README.txt.
  • Revised the way RelStorage uses memcached. Minimized the number of trips to both the cache server and the database.
  • Added an in-process pickle cache that serves a function similar to the ZEO cache.
  • Added a wrapper module for pylibmc.
  • Store operations now use multi-insert and multi-delete SQL statements to reduce the effect of network latency.
  • Renamed relstorage.py to storage.py to overcome import issues. Also moved the Options class to options.py.
  • Updated the patch for ZODB 3.7 and 3.8 to fix an issue with blobs and subtransactions.
  • Divided the implementation of database adapters into many small objects, making the adapter code more modular. Added interfaces that describe the duties of each part.
  • Oracle: Sped up restore operations by sending short blobs inline.
  • Oracle: Use a timeout on commit locks. This requires installation of a small PL/SQL package that can access DBMS_LOCK. See README.txt.
  • Oracle: Used PL/SQL bulk insert operations to improve write performance.
  • PostgreSQL: use the documented ALTER SEQUENCE RESTART WITH statement instead of ALTER SEQUENCE START WITH.
  • Moved MD5 sum computation to the adapters so they can choose not to use MD5.
  • Changed loadSerial to load from the store connection only if the load connection can not provide the object requested.
  • Stopped wrapping database disconnect exceptions. Now the code catches and handles them directly.
  • Use the store connection rather than the load connection for OID allocation.
  • Detect and handle backward time travel, which can happen after failover to an out-of-date asynchronous slave database. For simplicity, invalidate the whole ZODB cache when this happens.
  • Replaced the speed test script with a separately distributed package, zodbshootout.
  • Added the zodbpack script.
1.3.0b1 (2009-09-04)
  • Added support for a blob directory. No BlobStorage wrapper is needed. Cluster nodes will need to use a shared filesystem such as NFS or SMB/CIFS.
  • Added the blob-dir parameter to the ZConfig schema and README.txt.
1.2.0 (2009-09-04)
  • In Oracle, trim transaction descriptions longer than 2000 bytes.
  • When opening the database for the first time, don’t issue a warning about the inevitable POSKeyError on the root OID.
  • If RelStorage tries to unpickle a corrupt object state during packing, it will now report the oid and tid in the log.
1.2.0b2 (2009-05-05)
  • RelStorage now implements IMVCCStorage, making it compatible with ZODB 3.9.0b1 and above.
  • Removed two-phase commit support from the PostgreSQL adapter. The feature turned out to be unnecessary.
  • Added MySQL 5.1.34 and above to the list of supportable databases.
  • Fixed minor test failures under Windows. Windows is now a supportable platform.

Information about older releases can be found here.

Historical Changes

1.1.3 (2009-02-04)
  • In rare circumstances, ZODB can legitimately commit an object twice in a single transaction. Fixed RelStorage to accept that.
  • Auto reconnect to Oracle sometimes did not work because cx_Oracle was raising a different kind of exception than expected. Fixed.
  • Included LICENSE.txt in the source distribution.
1.1.2 (2009-01-27)
  • When both cache-servers and poll-interval are set, we now poll the cache for changes on every request. This makes it possible to use a high poll-interval to reduce the database polling burden, yet every client can see changes immediately.
  • Added the pack-dry-run option, which causes pack operations to only populate the pack tables with the list of objects and states to pack, but not actually pack.
  • Refined the pack algorithm. It was not removing as many object states as it should have. As a bonus, there is now a method of adapters called fill_object_refs(), which could be useful for debugging. It ensures the object_ref table is fully populated.
  • Began using zc.buildout for development.
  • Increased automated test coverage.
  • Fixed KeyError reporting to not trip over a related KeyError while logging.
1.1.1 (2008-12-27)
  • Worked around MySQL performance bugs in packing. Used temporary tables and another column in the pack_object table. The other databases may benefit from the optimization as well.
  • Applied an optimization using setinputsizes() to the Oracle code, bringing write speed back up to where it was in version 1.0.
1.1 (2008-12-19)
  • Normalized poll-invalidation patches as Solaris’ patch command would not accept the current format. The patches now apply with: patch -d lib/python/ZODB -p0 < poll-invalidation-1-zodb-3-X-X.patch
  • In MySQL, Use DROP TABLE IF EXISTS instead of TRUNCATE to clear ‘temp_store’ because:
    • TRUNCATE has one page of caveats in the MySQL documentation.
    • TEMPORARY TABLEs have half a page of caveats when it comes to replication.
    • The end result is that ‘temp_store’ may not exist on the replication slave at the exact same time(s) it exists on the master.
  • Implemented the database size query in MySQL, based on a patch from Kazuhiko Shiozaki. Thanks!
  • Optimized Oracle object retrieval by causing BLOBs to be sent inline when possible, based on a patch by Helge Tesdal. By default, the optimization is activated automatically when cx_Oracle 5 is used.
  • Updated the storage iterator code to be compatible with ZODB 3.9. The RelStorage tests now pass with the shane-poll-invalidations branch of ZODB 3.9.
  • Added a translation of README.txt to Brazilian Portuguese by Rogerio Ferreira. Thanks!
1.1c1
  • Added optional memcache integration. This is useful when the connection to the relational database has high latency.
  • Made it possible to set the pack and memcache options in zope.conf.
  • Log more info when a KeyError occurs within RelStorage.
1.1b2
  • Made the MySQL locks database-specific rather than server-wide. This is important for multi-database configurations.
  • In the PostgreSQL adapter, made the pack lock fall back to table locking rather than advisory locks for PostgreSQL 8.1.
  • Changed a query for following object references (used during packing) to work around a MySQL performance bug. Thanks to Anton Stonor for discovering this.
1.1b1
  • Fixed the use of setup.py without setuptools. Thanks to Chris Withers.
  • Fixed type coercion of the transaction extension field. This fixes an issue with converting databases. Thanks to Kevin Smith for discovering this.
  • Added logging to the pack code to help diagnose performance issues.
  • Additions to the object_ref table are now periodically committed during pre_pack so that the work is not lost if pre_pack fails.
  • Modified the pack code to pack one transaction at a time and release the commit lock frequently. This should help large pack operations.
  • Fixed buildout-based installation of the zodbconvert script. Thanks to Jim Fulton.
1.0.1 (2008-03-11)
  • The speedtest script failed if run on a test database that has no tables. Now the script creates the tables if needed. Thanks to Flavio Coelho for discovering this.
  • Reworked the auto-reconnect logic so that applications never see temporary database disconnects if possible. Thanks to Rigel Di Scala for pointing out this issue.
  • Improved the log messages explaining database connection failures.
  • Moved poll_invalidations to the common adapter base class, reducing the amount of code to maintain.
1.0 (2008-02-29)
  • Added a utility for converting between storages called zodbconvert.
1.0c1
  • The previous fix for non-ASCII characters was incorrect. Now transaction metadata is stored as raw bytes. A schema migration is required; see notes/migrate-1.0-beta.txt.
  • Integrated setuptools and made an egg.
1.0 beta
  • Renamed to reflect expanding database support.
  • Added support for Oracle 10g.
  • Major overhaul with many scalability and reliability improvements, particularly in the area of packing.
  • Moved to svn.zope.org and switched to ZPL 2.1.
  • Made two-phase commit optional in both Oracle and PostgreSQL. They both use commit_lock in such a way that the commit is not likely to fail in the second phase.
  • Switched most database transaction isolation levels from serializable to read committed. It turns out that commit_lock already provides the serializability guarantees we need, so it is safe to take advantage of the potential speed gains. The one major exception is the load connection, which requires an unchanging view of the database.
  • Stored objects are now buffered in a database table rather than a file.
  • Stopped using the LISTEN and NOTIFY statements in PostgreSQL since they are not strictly transactional in the sense we require.
  • Started using a prepared statement in PostgreSQL for getting the newest transaction ID quickly.
  • Removed the code in the Oracle adapter for retrying connection attempts. (It is better to just reconfigure Oracle.)
  • Added support for MySQL 5.0.
  • Added the poll_interval option. It reduces the frequency of database polls, but it also increases the potential for conflict errors on servers with high write volume.
  • Implemented the storage iterator protocol, making it possible to copy transactions to and from FileStorage and other RelStorage instances.
  • Fixed a bug that caused OIDs to be reused after importing transactions. Added a corresponding test.
  • Made it possible to disable garbage collection during packing. Exposed the option in zope.conf.
  • Valery Suhomlinov discovered a problem with non-ASCII data in transaction metadata. The problem has been fixed for all supported databases.

PGStorage history

0.4
  • Began using the PostgreSQL LISTEN and NOTIFY statements as a shortcut for invalidation polling.
  • Removed the commit_order code. The commit_order idea was intended to allow concurrent commits, but that idea is a little too ambitious while other more important ideas are being tested. Something like it may come later.
  • Improved connection management: only one database connection is held continuously open per storage instance.
  • Reconnect to the database automatically.
  • Removed test mode.
  • Switched from using a ZODB.Connection subclass to a ZODB patch. The Connection class changes in subtle ways too often to subclass reliably; a patch is much safer.
  • PostgreSQL 8.1 is now a dependency because PGStorage uses two phase commit.
  • Fixed an undo bug. Symptom: attempting to examine the undo log revealed broken pickles. Cause: the extension field was not being wrapped in psycopg2.Binary upon insert. Solution: used psycopg2.Binary. Unfortunately, this doesn’t fix existing transactions people have committed. If anyone has any data to keep, fixing the old transactions should be easy.
  • Moved from a private CVS repository to Sourceforge. See http://pgstorage.sourceforge.net . Also switched to the MIT license.
  • David Pratt added a basic getSize() implementation so that the Zope management interface displays an estimate of the size of the database.
  • Turned PGStorage into a top-level package. Python generally makes top-level packages easier to install.
0.3
  • Made compatible with Zope 3, although an undo bug apparently remains.
0.2
  • Fixed concurrent commits, which were generating deadlocks. Fixed by adding a special table, “commit_lock”, which is used for synchronizing increments of commit_seq (but only at final commit.) If you are upgrading from version 0.1, you need to change your database using the ‘psql’ prompt:

    create table commit_lock ();

  • Added speed tests and an OpenDocument spreadsheet comparing FileStorage / ZEO with PGStorage. PGStorage wins at reading objects and writing a lot of small transactions, while FileStorage / ZEO wins at writing big transactions. Interestingly, they tie when writing a RAM disk.

Indices and tables

Development

RelStorage is hosted at GitHub:

Continuous integration

A test suite is run for every push and pull request submitted. Travis CI is used to test on Linux, and AppVeyor runs the builds on Windows.

https://travis-ci.org/zodb/relstorage.svg?branch=master https://ci.appveyor.com/api/projects/status/pccddlgujdoqvl83?svg=true Documentation Status

Builds on Travis CI automatically submit updates to coveralls.io to monitor test coverage.

https://coveralls.io/repos/zodb/relstorage/badge.svg?branch=master&service=github

Likewise, builds on Travis CI will automatically submit updates to landscape.io to monitor code health (adherence to PEP8, absence of common code smells, etc).

Code Health

Project URLs

Indices and tables