Strainer: Fast Functional Serializers¶
Strainer is a different take on object serialization and validation in Python.
It utilizes a functional style over classes.
A Strainer Example
import datetime
from strainer import (serializer, field, child,
                      formatters, validators,
                      ValidationException)

artist_serializer = serializer(
    field('name', validators=[validators.required()])
)

album_schema = serializer(
    field('title', validators=[validators.required()]),
    field('release_date',
          validators=[validators.required(), validators.datetime()],
          formatters=[formatters.format_datetime()]),
    child('artist', serializer=artist_serializer, validators=[validators.required()])
)

class Artist(object):
    def __init__(self, name):
        self.name = name

class Album(object):
    def __init__(self, title, release_date, artist):
        self.title = title
        self.release_date = release_date
        self.artist = artist

bowie = Artist(name='David Bowie')
album = Album(
    artist=bowie,
    title='Hunky Dory',
    release_date=datetime.datetime(1971, 12, 17)
)
Given that, we can now serialize, deserialize, and validate data:
>>> album_schema.serialize(album)
{'artist': {'name': 'David Bowie'},
'release_date': '1971-12-17T00:00:00',
'title': 'Hunky Dory'}
>>> album_schema.deserialize(album_schema.serialize(album))
{'artist': {'name': 'David Bowie'},
'release_date': datetime.datetime(1971, 12, 17, 0, 0, tzinfo=<iso8601.Utc>),
'title': 'Hunky Dory'}
>>> input = album_schema.serialize(album)
>>> del input['artist']
>>> album_schema.deserialize(input)
ValidationException: {'artist': ['This field is required']}
This example has been borrowed from Marshmallow.
Introduction to Strainer¶
Strainer was built with RESTful APIs in mind. Here is an informal overview of how to use Strainer in that domain.
The goal of this document is to give you enough technical specifics to understand how Strainer works, but it isn’t intended to be a tutorial or reference. Once you have your bearings, dive into the more technical parts of the documentation.
Background¶
Strainer was built to serialize rich Python objects into simple data structures. You might use Strainer with an object-relational mapper like Django’s ORM or SQLAlchemy. So, first we are going to define some models that we will use for the rest of the introduction.
We are going to cover some aspects of creating an API that will track RSS feeds and their items. Here are two simple models that could represent RSS feeds and their items.
class Feed(object):
    def __init__(self, feed, name, items):
        self.feed = feed
        self.name = name
        self.items = items

class FeedItem(object):
    def __init__(self, title, pub_date):
        self.title = title
        self.pub_date = pub_date
We have the models, but now we want to create a JSON API for them. We will need to serialize our models, which are rich Python objects, into simple dicts so that we can convert them into JSON. The first step is to create the serializer.
Create A Feed Serializer¶
To start, we will create serializers for each model. The job of a serializer is to take a rich Python object and boil it down to a simple Python dict that can be easily converted into JSON. Given the Feed model we just created, a serializer might look like this.
from strainer import serializer, field, formatters, validators
feed_serializer = serializer(
    field('feed', validators=[validators.required()]),
    field('name', validators=[validators.required()]),
)
This serializer will map the feed and name attributes into a simple Python dict. Now we can nest the item serializer into the feed serializer. Here’s how.
from strainer import serializer, field, many, formatters, validators
feed_item_serializer = serializer(
    field('title', validators=[validators.required()]),
    field('pub_date', validators=[validators.required(), validators.datetime()],
          formatters=[formatters.format_datetime()]),
)

feed_serializer = serializer(
    field('feed', validators=[validators.required()]),
    field('name', validators=[validators.required()]),
    many('items', serializer=feed_item_serializer),
)
Using A Feed Serializer¶
We can now use the serializer. First we instantiate some models, and then we serialize them into dicts.
>>> import datetime
>>> feed_items = [FeedItem('A Title', datetime.datetime(2016, 11, 10, 10, 15))]
>>> feed_items += [FeedItem('Another Title', datetime.datetime(2016, 11, 10, 10, 20))]
>>> feed = Feed('http://example.org/feed.xml', 'A Blog', feed_items)
>>> feed_serializer.serialize(feed)
{'feed': 'http://example.org/feed.xml',
'items': [{'pub_date': '2016-11-10T10:15:00', 'title': 'A Title'},
{'pub_date': '2016-11-10T10:20:00', 'title': 'Another Title'}],
'name': 'A Blog'}
At this point, if we had a REST API, we could convert this simple data structure into JSON and return it as the response body.
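As a rough illustration of that last step, here is a minimal sketch that uses only the standard library json module. The dict is copied by hand from the output above, and the names serialized_feed, response_body, and round_tripped are made up for this example.

```python
import json

# The dict below mirrors the serialized feed shown above.
serialized_feed = {
    'feed': 'http://example.org/feed.xml',
    'items': [
        {'pub_date': '2016-11-10T10:15:00', 'title': 'A Title'},
        {'pub_date': '2016-11-10T10:20:00', 'title': 'Another Title'},
    ],
    'name': 'A Blog',
}

# sort_keys gives a stable key order, which is convenient in tests.
response_body = json.dumps(serialized_feed, sort_keys=True)

# Decoding the body gives back an equal dict, so nothing was lost.
round_tripped = json.loads(response_body)
```

Because the serialized output contains only strings, lists, and dicts, json.dumps needs no custom encoder here.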
Validation¶
This is a great start to building a JSON API, but now we want to reverse the process and accept JSON. When we accept input from the outside, we first need to validate that it is well-formed before we begin to work with it.
Since we have already described our data, including what makes it valid, we can use our existing serializer, just in reverse. So, let’s say we are going to create a feed item. We can do the following:
feed_item = {
    'title': 'A Title',
    'pub_date': '2016-11-10T10:15:00',
}
print(feed_item_serializer.deserialize(feed_item))
# {'pub_date': datetime.datetime(2016, 11, 10, 10, 15, tzinfo=<iso8601.Utc>), 'title': 'A Title'}
At this point, we could take that deserialized input and instantiate a FeedItem object. If we were using an ORM, we could then persist that object to the database.
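A minimal sketch of that instantiation step follows. The deserialized dict is written out by hand as a stand-in for the real return value, and datetime.timezone.utc stands in for the iso8601.Utc object shown in the output above; both are assumptions made for this example.

```python
import datetime

class FeedItem(object):
    def __init__(self, title, pub_date):
        self.title = title
        self.pub_date = pub_date

# Stand-in for the dict returned by feed_item_serializer.deserialize(...).
deserialized = {
    'title': 'A Title',
    'pub_date': datetime.datetime(2016, 11, 10, 10, 15,
                                  tzinfo=datetime.timezone.utc),
}

# The keys match the constructor's argument names, so the dict can be
# unpacked straight into the model.
item = FeedItem(**deserialized)
```

Unpacking with ** only works while the keys and the constructor arguments stay in sync, so an explicit mapping may be safer for larger models.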
Error Reporting¶
Data will not always be valid, and when it isn’t we should be able to report those errors back to the user agent. So, we need a way to catch and present errors.
from strainer import ValidationException
feed_item = {
    'title': 'A Title',
}
try:
    feed_item_serializer.deserialize(feed_item)
except ValidationException as e:
    print(e.errors)
# {'pub_date': ['This field is required']}
Here, we catch any possible validation exceptions. When a ValidationException is raised, the exception has a property called errors. It holds the reasons why the input is invalid, in a format that is ready to be returned as an API response.
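To show what returning those errors might look like, here is a small sketch using only the standard library. The error_response helper, the 400 status code, and the 'errors' envelope key are all assumptions made for this example; Strainer itself does not prescribe a response format.

```python
import json

def error_response(errors):
    """Hypothetical helper: wrap validation errors in a (status, body) pair."""
    body = json.dumps({'errors': errors}, sort_keys=True)
    return 400, body

# Same shape as e.errors in the example above.
errors = {'pub_date': ['This field is required']}
status, body = error_response(errors)
```

Since the errors are already a plain dict of lists, they can be encoded as JSON without any extra conversion.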
Structures¶
Strainer exists to convert data structures composed of rich Python objects into simple data structures ready to be converted into something suitable for HTTP responses. It also exists to take those simple data structures back to rich Python types, and validate that the data is what it’s supposed to be.
The meat of that serialization is Strainer’s structures. They describe the entire process, from serialization to validation to deserialization.
The Basics of Structures¶
All structures return a Translator object. Translator objects have only two methods: .serialize turns rich Python objects into simple Python data structures, and .deserialize validates simple data structures and turns them into rich Python types.
You can compose complex serializers by combining a number of structures.
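To make the idea of composition concrete, here is a toy sketch of a pair-of-functions object and two builders on top of it. This is not Strainer’s implementation: Translator here is a local namedtuple, sketch_field and sketch_serializer are hypothetical names, and validation is left out entirely.

```python
from collections import namedtuple
import operator

# Hypothetical stand-in: a pair of functions, not Strainer's real class.
Translator = namedtuple('Translator', ['serialize', 'deserialize'])

def sketch_field(name):
    """Map attribute `name` of an object to key `name` of a dict, and back."""
    getter = operator.attrgetter(name)
    return Translator(
        serialize=lambda source: {name: getter(source)},
        deserialize=lambda data: {name: data[name]},
    )

def sketch_serializer(*translators):
    """Combine several translators by merging the dicts they produce."""
    def serialize(source):
        out = {}
        for translator in translators:
            out.update(translator.serialize(source))
        return out

    def deserialize(data):
        out = {}
        for translator in translators:
            out.update(translator.deserialize(data))
        return out

    return Translator(serialize, deserialize)

Point = namedtuple('Point', ['x', 'y'])
point_serializer = sketch_serializer(sketch_field('x'), sketch_field('y'))
as_dict = point_serializer.serialize(Point(1, 2))
```

Because the combined object has the same two methods as its parts, it can itself be nested inside another combination, which is the property the structures rely on.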
The Field¶
A field is the smallest structure. It maps one attribute, and one value. That value can be a list, but everything inside the list needs to be the same type.
A field shouldn’t be used by itself, but you can define a field by itself.
from strainer import field
a_field = field('a')
During serialization this field will map the attribute a from a Python object to the key a in a dict. During deserialization it will map a key a from the input to a key a in the output and validate that the value is correct.
Target Field¶
Sometimes the field name in the output isn’t the same as the attribute name in the input. So, you can pass a second, optional argument, target_field, to use a different name.
from strainer import field
a_field = field('a', target_field='z')
Now a_field will serialize the attribute a to the field z in the output, and during deserialization the reverse will happen. All structures have the target_field argument.
Validators¶
When deserializing a structure you can have a series of validators run. Validators serve two functions. The first is to convert incoming data into the correct form if possible, and the second is to validate that the incoming data is correct. Validators are always run when deserialization is called, and they are only run during deserialization. Validators are called in order.
from strainer import field, validators
a_field = field('a', validators=[validators.required(), validators.string(max_length=10)])
To read more about validators, see Validators.
Multiple Values¶
It is possible to declare a field as a list instead of a single value. If you do so, each value in the list will be validated as a single value. If any fail, the validation will fail.
from strainer import multiple_field, validators
a_field = multiple_field('a')
Custom Attribute Getter¶
The default method for getting attributes from objects is to use the operator.attrgetter function. You can pass in a different function.
This will attempt to fetch a key from a dict instead of using attrgetter.
from strainer import field
a_field = field('a', attr_getter=lambda x: x.get('a'))
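The difference between the two getters can be seen with the standard library alone. In this sketch, Thing is a made-up class and the variable names are for illustration only.

```python
import operator

class Thing(object):
    def __init__(self, a):
        self.a = a

# Default behaviour described above: read the attribute from an object.
get_attr = operator.attrgetter('a')

# Alternative getter for dict-like sources, as in the attr_getter example.
get_key = lambda x: x.get('a')

from_object = get_attr(Thing(1))
from_dict = get_key({'a': 2})
missing = get_key({})  # dict.get returns None when the key is absent
```

Note that attrgetter raises AttributeError when given a plain dict, which is why a dict source needs its own getter.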
Format A Value For Serialization¶
By default the value that is fetched from the attribute of the object is passed forward as-is, but you can format values for serialization by passing in a list of formatters.
from strainer import field, validators, formatters
a_field = field('a', validators=[validators.datetime()], formatters=[formatters.format_datetime()])
To read more about formatters, see Formatters.
The Dict Field¶
The dict_field is almost exactly like the field, except that it will attempt to get a key from a dict instead of an attribute from an object.
from strainer import dict_field
a_field = dict_field('a')
The Child¶
When creating a serializer, one will often need to model one object nested in another object. This is where the child structure comes in handy. It allows you to nest one serializer in another.
from strainer import serializer, field, child
c_serializer = serializer(
    field('c1'),
)

a_serializer = serializer(
    field('b'),
    child('c', serializer=c_serializer),
)
Target Field¶
Sometimes the field name in the output isn’t the same as the attribute name in the input. So, you can pass a second, optional argument, target_field, to use a different name.
from strainer import serializer, field, child

c_serializer = serializer(
    field('c1'),
)

a_serializer = serializer(
    field('b'),
    child('c', target_field='a', serializer=c_serializer),
)
Now a_serializer will serialize the attribute c to the field a in the output, and during deserialization the reverse will happen.
Validators¶
Just like the regular field, you can apply validators to a child structure. These validators run before the inner object itself is deserialized.
In this example you may want to require that the child object exists.
from strainer import serializer, field, child, validators

c_serializer = serializer(
    field('c1'),
)

a_serializer = serializer(
    field('b'),
    child('c', validators=[validators.required()], serializer=c_serializer),
)
The Many¶
The Many structure is like the Child structure: it allows you to nest objects. Many, though, allows you to nest an array of values instead of one. Like the child structure, you can also use validators.
from strainer import many, serializer, field, validators
c_serializer = serializer(
    field('c1'),
)

a_serializer = serializer(
    field('b'),
    many('c', validators=[validators.required()], serializer=c_serializer),
)
One thing to keep in mind is that the validators passed to many receive all the data in the target key. That way you can perform validation over the whole structure. For instance, you could limit the length of a list. The full validation will happen before the data is passed to the serializer.
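As a sketch of the list-length idea, here is a whole-list validator written without importing Strainer. SketchValidationError is a local stand-in for ValidationException and max_items is a hypothetical name; a real validator would raise the library’s own exception and would typically be built with export_validator.

```python
class SketchValidationError(Exception):
    """Local stand-in for strainer's ValidationException."""

def max_items(limit):
    """Return a validator that rejects lists longer than `limit`."""
    def _validator(value, context=None):
        if len(value) > limit:
            raise SketchValidationError('At most %d items are allowed' % limit)
        return value
    return _validator

at_most_two = max_items(2)

# A list within the limit is passed through unchanged.
accepted = at_most_two([{'c1': 1}, {'c1': 2}])

# A longer list is rejected before any item is looked at individually.
try:
    at_most_two([{'c1': 1}, {'c1': 2}, {'c1': 3}])
    rejected = False
except SketchValidationError:
    rejected = True
```

The validator receives the whole list at once, which is what makes a check like this possible at the many level rather than per item.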
The Serializer¶
A serializer is composed of any number of Translators, usually produced by other structures like field, child, and many. The serializer returns a translator object that can serialize and deserialize.
from strainer import serializer, field
a_serializer = serializer(
    field('a'),
    field('b'),
)
Validators¶
Validators convert incoming data into the correct format, and also raise exceptions if data is invalid.
Current Validators¶
integer¶
Will validate that a value is an integer.
>>> from strainer import validators
>>> int_validators = validators.integer()
>>> int_validators('1')
1
You can also optionally clamp an integer to bounds:
>>> from strainer import validators
>>> int_validators = validators.integer(bounds=(2, 10))
>>> int_validators('1')
2
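The clamping behaviour shown above can be approximated in a few lines of plain Python. This is only a sketch of the idea, not the library’s code: sketch_integer is a hypothetical name, and it lets int() raise ValueError on bad input where the real validator presumably reports a validation error instead.

```python
def sketch_integer(bounds=None):
    """Return a function that converts to int and clamps to (low, high)."""
    def _validator(value, context=None):
        number = int(value)
        if bounds is not None:
            low, high = bounds
            # Pull values below `low` up and values above `high` down.
            number = max(low, min(high, number))
        return number
    return _validator

clamp = sketch_integer(bounds=(2, 10))
below = clamp('1')
inside = clamp('5')
above = clamp('99')
```

Clamping changes out-of-range input silently, so it suits values like page sizes better than values where a wrong number should be reported back to the caller.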
string¶
Will validate that a value is a string
>>> from strainer import validators
>>> string_validators = validators.string()
>>> string_validators(1)
'1'
You can also apply a max_length. If the string is longer than the max_length, an exception will be raised.
>>> from strainer import validators
>>> string_validators = validators.string(max_length=100)
required¶
Will validate that a value exists and that it is not falsy. It will accept 0, but raise an exception on False, None, '', [], and {}.
boolean¶
Will coerce a value into either True or False. 0, False, None, '', '[]', and {} all count as False values; anything else is True.
datetime¶
This validator will attempt to parse an ISO 8601 string into a python datetime object.
The default timezone is UTC, but you can modify that by passing a default_tzinfo.
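The effect of a default timezone can be sketched with the standard library. The output earlier in this document shows iso8601.Utc, which suggests the library parses with the iso8601 package; the sketch below uses datetime.datetime.fromisoformat instead, and sketch_parse is a hypothetical name.

```python
import datetime

def sketch_parse(value, default_tzinfo=datetime.timezone.utc):
    """Parse an ISO 8601 string; attach default_tzinfo if it has no offset."""
    parsed = datetime.datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=default_tzinfo)
    return parsed

# No offset in the input, so the default timezone is applied.
naive_input = sketch_parse('2016-11-10T10:15:00')

# An explicit offset in the input is kept as-is.
offset_input = sketch_parse('2016-11-10T10:15:00+02:00')
```

The default only fills in missing timezone information; it never converts a value that already carries an offset.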
Custom Validators¶
A validator returns a function that will be used to validate a value during deserialization. You can use the export_validator function to create a custom validation function.
from strainer import validators, ValidationException
@validators.export_validator
def my_silly_validators(value, context=None):
    if value == 'An apple':
        raise ValidationException("An apple is not silly")
    return '%s is silly.' % (value)
Formatters¶
Formatters help fields prepare values for serialization. Most formatters accept a value and a context, and return a formatted value.
Current Formatters¶
format_datetime¶
This formatter will take a datetime or a date object and convert it into an ISO 8601 string representation.
>>> import datetime
>>> from strainer import formatters
>>> dt_formatter = formatters.format_datetime()
>>> dt_formatter(datetime.datetime(1984, 6, 11))
'1984-06-11T00:00:00'
Custom Formatters¶
A formatter returns a function that will be used to format a value before serialization. You could build a silly formatter like this.
def custom_formatter():
    def _my_formatter(value, context=None):
        return '%s is silly.' % (value)
    return _my_formatter

my_formatter = custom_formatter()
print(my_formatter('A clown'))
# A clown is silly.
In practice it’s probably better to use the export_formatter decorator. It’s a simple way to create a formatter.
from strainer import formatters
@formatters.export_formatter
def my_silly_formatter(value, context=None):
    return '%s is silly.' % (value)
It’s clear, and there is less nesting.
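To see why a decorator removes the nesting, here is a guess at the general shape such a decorator could take. This is an assumption, not the source of export_formatter: sketch_export is a hypothetical name, and the real decorator may also forward extra arguments.

```python
import functools

def sketch_export(func):
    """Hypothetical decorator: turn func(value, context) into a factory."""
    @functools.wraps(func)
    def factory():
        # Calling the decorated name hands back the formatting function.
        return func
    return factory

@sketch_export
def my_silly_formatter(value, context=None):
    return '%s is silly.' % (value)

# The decorated name is now called once to obtain the formatter.
formatter = my_silly_formatter()
result = formatter('A clown')
```

The outer function from the hand-written version moves into the decorator, so each new formatter only has to define the inner function.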
API¶
Structure¶
Use these structures to build up a serializer.
Every structure returns an object that has two methods. serialize returns objects ready to be encoded into JSON or other formats. deserialize will validate and return objects ready to be used internally, or it will raise a validation exception.
class strainer.structure.Translator(serialize, deserialize)
    Translator is an internal data structure that holds a reference to a serialize and deserialize function. All structures return a translator.
strainer.structure.child(source_field, target_field=None, serializer=None, validators=None, attr_getter=None, full_validators=None)
    A child is a nested serializer.
strainer.structure.dict_field(*args, **kwargs)
    dict_field is just like field except that it pulls attributes out of a dict, instead of off an object.
strainer.structure.field(source_field, target_field=None, validators=None, attr_getter=None, formatters=None)
    Constructs an individual field for a serializer. This is on the order of one key and one value.
    The field determines the mapping between keys internally and externally, as well as the proper validation at the level of the field.
    >>> from collections import namedtuple
    >>> Aonly = namedtuple('Aonly', 'a')
    >>> model = Aonly('b')
    >>> one_field = field('a')
    >>> one_field.deserialize(model)
    {'a': 'b'}
    Parameters:
    - source_field (str) – What attribute to get from a source object.
    - target_field (str) – What attribute to place the value on the target. Optional; if omitted, the target is equal to source_field.
    - validators (list) – A list of validators that will be applied during deserialization.
    - formatters (list) – A list of formatters that will be applied during serialization.
    - attr_getter (function) – Overrides the default method for getting the source_field off of an object.
strainer.structure.many(source_field, target_field=None, serializer=None, validators=None, attr_getter=None)
    Many allows you to nest a list of serializers.
strainer.structure.serializer(*fields)
    This function creates a serializer from a list of fields.
Validators¶
Validators are functions that validate data.
strainer.validators.boolean(*args, **kwargs)
    Converts a field into a boolean.
strainer.validators.datetime(*args, **kwargs)
    Validates that a field is an ISO 8601 string, and converts it to a datetime object.
strainer.validators.integer(*args, **kwargs)
    Converts a value to an integer, applying optional bounds.
strainer.validators.required(*args, **kwargs)
    Validates that a field exists in the input.
strainer.validators.string(*args, **kwargs)
    Converts a value into a string, optionally with a max length.