Ckan API client documentation¶
Clients¶
There are currently three clients for the Ckan API. Each provides a different level of abstraction and suits different needs:
- CkanLowlevelClient: just a wrapper around the API.
- High-level client: provides more abstraction around the CRUD methods.
- Syncing client: provides facilities for “syncing” a collection of objects into Ckan.
Low-level¶
This is the client providing the lowest level of abstraction.
-
class
ckan_api_client.low_level.
CkanLowlevelClient
(base_url, api_key=None)[source]¶ Ckan low-level client.
- Handles authentication and response validation
- Handles request body serialization and response body deserialization
- Raises HTTPError exceptions on failed HTTP requests
- Performs some checks on return values from the API
-
anonymous
¶ Property, returning a copy of this client, without an api_key set
-
request
(method, path, **kwargs)[source]¶ Wrapper around requests.request().
Extra functionality provided:
- Adds the Authorization header to requests.
- If data is an object, serializes it as JSON and adds the Content-type: application/json header.
- If the response didn’t contain an “ok” code, raises an HTTPError exception.
Parameters: - method – HTTP method to be used
- path – Path, relative to the Ckan root. For example: /api/3/action/package_list
- headers – HTTP headers to be added to the request
- data – Data to be sent in the request body
- kwargs – Extra keyword arguments will be passed directly to the requests.request() call.
Raises: ckan_api_client.exceptions.HTTPError – in case the HTTP request returned a non-ok status code
Returns: a requests response object
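The extra behaviour listed for request() can be pictured with a small standalone sketch. This is not the library's code: the helper name prepare_request and its return shape are hypothetical, and it assumes that the API key is sent verbatim in the Authorization header and that only dict / list bodies are JSON-encoded.

```python
import json


def prepare_request(api_key=None, data=None, headers=None):
    """Return (headers, body) roughly as the wrapper is described to build them.

    Hypothetical helper for illustration only; not part of ckan_api_client.
    """
    headers = dict(headers or {})
    if api_key is not None:
        # Assumption: the key is sent as-is in the Authorization header.
        headers["Authorization"] = api_key
    body = data
    if isinstance(data, (dict, list)):
        # Objects are serialized to JSON and flagged with the matching content type.
        body = json.dumps(data)
        headers["Content-type"] = "application/json"
    return headers, body


headers, body = prepare_request(api_key="my-key", data={"name": "example"})
```

An anonymous client would simply correspond to calling the helper without an api_key, in which case no Authorization header is added.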
-
get_dataset
(dataset_id)[source]¶ Get a dataset, using API v2
Parameters: dataset_id – ID of the requested dataset
Returns: a dict containing the data as returned from the API
Return type: dict
-
post_dataset
(dataset)[source]¶ POST a dataset, using API v2 (usually for creation)
Parameters: dataset (dict) – a dict containing data to be sent to Ckan. Should not already contain an id
Returns: a dict containing the data as returned from the API
Return type: dict
-
put_dataset
(dataset)[source]¶ PUT a dataset, using API v2 (usually for update)
Parameters: dataset (dict) – a dict containing data to be sent to Ckan. Must contain an id, which will be used to build the URL
Returns: a dict containing the updated dataset as returned from the API
Return type: dict
High-level¶
-
class
ckan_api_client.high_level.
CkanHighlevelClient
(base_url, api_key=None, fail_on_inconsistency=False)[source]¶ High-level client, handling CRUD of objects.
This class only returns / handles CkanObjects, to make sure we are handling consistent data (they have validators in place)
Parameters: - base_url – Base URL for the Ckan instance.
- api_key – API key to be used when accessing Ckan. This is required for writing.
- fail_on_inconsistency – Whether to fail on “inconsistencies” (mismatching updated objects). This is especially useful during development, in order to catch many problems with the client itself (or new bugs in Ckan..).
-
get_dataset
(id, allow_deleted=False)[source]¶ Get a specific dataset, by id
Note
Since the Ckan API uses both ids and names as keys, both get_dataset() and get_dataset_by_name() will perform the exact same request in the background.
The difference is only in the high-level handling: the function will check whether the expected id has the correct value, and raise an HTTPError(404, ..) otherwise.
Parameters: - id (str) – the dataset id
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
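The id check and the allow_deleted handling described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the client's implementation: check_dataset is a hypothetical helper, the dataset is a plain dict, and HTTPError is a local stand-in that only mirrors the status_code / message attributes documented for ckan_api_client.exceptions.HTTPError.

```python
class HTTPError(Exception):
    """Local stand-in exposing status_code and message, for illustration only."""

    def __init__(self, status_code, message):
        super().__init__(status_code, message)
        self.status_code = status_code
        self.message = message


def check_dataset(data, expected_id, allow_deleted=False):
    """Hypothetical post-fetch check; `data` is the dict returned by the API."""
    if data.get("id") != expected_id:
        # The API matched on the name rather than the id: report "not found".
        raise HTTPError(404, "Dataset id mismatch")
    if not allow_deleted and data.get("state") != "active":
        # Logically deleted objects are hidden unless explicitly requested.
        raise HTTPError(404, "Dataset is not active")
    return data
```

A by-name lookup would apply the same idea to the name key instead of id.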
-
get_dataset_by_name
(name, allow_deleted=False)[source]¶ Get a specific dataset, by name
Note
See note on
get_dataset()
Parameters: - name (str) – the dataset name
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
-
save_dataset
(dataset)[source]¶ If the dataset already has an id, call update_dataset(); otherwise, call create_dataset().
Returns: as returned by the called function.
Return type: CkanDataset
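The dispatch rule of save_dataset() is simple enough to sketch in a few lines. The function below is hypothetical: it works on plain dicts and takes the create / update callables as arguments, whereas the real client operates on CkanDataset objects and calls its own methods.

```python
def save(dataset, create, update):
    """Call `update` when the dataset already has an id, `create` otherwise."""
    if dataset.get("id") is not None:
        return update(dataset)
    return create(dataset)


# Record which branch was taken for a new and for an existing dataset.
calls = []
save({"name": "new"}, create=lambda d: calls.append("create"),
     update=lambda d: calls.append("update"))
save({"id": "abc", "name": "old"}, create=lambda d: calls.append("create"),
     update=lambda d: calls.append("update"))
```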
-
create_dataset
(dataset)[source]¶ Create a dataset
Return type: CkanDataset
-
update_dataset
(dataset)[source]¶ Update a dataset
Return type: CkanDataset
-
get_organization
(id, allow_deleted=False)[source]¶ Get organization, by id.
Note
See note on
get_dataset()
Parameters: - id (str) – the organization id
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
-
get_organization_by_name
(name, allow_deleted=False)[source]¶ Get organization by name.
Note
See note on
get_dataset()
Parameters: - name (str) – the organization name
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
-
create_organization
(organization)[source]¶ Create an organization
Return type: CkanOrganization
-
update_organization
(organization)[source]¶ Return type: CkanOrganization
-
get_group
(id, allow_deleted=False)[source]¶ Get group, by id.
Note
See note on
get_dataset()
Parameters: - id (str) – the group id
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
-
get_group_by_name
(name, allow_deleted=False)[source]¶ Get group by name.
Note
See note on
get_dataset()
Parameters: - name (str) – the group name
- allow_deleted – Whether to return even logically deleted objects. If set to False (the default), will raise an HTTPError(404, ..) if state != 'active'.
Return type:
Synchronization¶
-
class
ckan_api_client.syncing.
SynchronizationClient
(base_url, api_key=None, **kw)[source]¶ Synchronization client, providing functionality for importing collections of datasets into a Ckan instance.
Synchronization acts as follows:
- Ensure all the required organizations/groups are there; create a map between “source” ids and Ckan ids. Optionally update existing organizations/groups with new details.
- Find all the Ckan datasets matching the source_name
- Determine which datasets...
- ...need to be created
- ...need to be updated
- ...need to be deleted
- First, delete datasets to be deleted in order to free up names
- Then, create datasets that need to be created
- Lastly, update datasets using the configured merge strategy (see constructor arguments).
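The "determine which datasets need to be created, updated or deleted" step above amounts to comparing two sets of ids. The sketch below shows one way to do it; plan_sync is a hypothetical name, and the assumption is that both arguments hold source ids (incoming ones, and those already recorded in Ckan for the given source_name).

```python
def plan_sync(source_ids, existing_ids):
    """Split dataset ids into (to_create, to_update, to_delete) sets.

    source_ids: ids present in the incoming data
    existing_ids: source ids of datasets already in Ckan for this source
    """
    source_ids, existing_ids = set(source_ids), set(existing_ids)
    to_create = source_ids - existing_ids   # only in the incoming data
    to_update = source_ids & existing_ids   # present on both sides
    to_delete = existing_ids - source_ids   # no longer in the incoming data
    return to_create, to_update, to_delete


to_create, to_update, to_delete = plan_sync(["a", "b", "c"], ["b", "c", "d"])
```

Running the deletions first, as the list above states, frees up dataset names before new datasets are created.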
-
__init__
(base_url, api_key=None, **kw)[source]¶ Parameters: - base_url – Base URL of the Ckan instance, passed to high-level client
- api_key – API key to be used, passed to high-level client
- organization_merge_strategy –
One of:
- ‘create’ (default) if the organization doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the organization doesn’t exist, create it. Otherwise, update with new values.
- group_merge_strategy –
One of:
- ‘create’ (default) if the group doesn’t exist, create it. Otherwise, leave it alone.
- ‘update’ if the group doesn’t exist, create it. Otherwise, update with new values.
- dataset_preserve_names – if True (the default) will preserve old names of existing datasets.
- dataset_preserve_organization – if True (the default) will preserve old organizations of existing datasets.
- dataset_group_merge_strategy –
One of:
- ‘add’ add groups, keep old ones (default)
- ‘replace’ replace all existing groups
- ‘preserve’ leave groups alone
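To make the three dataset_group_merge_strategy values concrete, here is a minimal sketch of how they could combine an existing group list with an incoming one. The function merge_groups is hypothetical, and returning a sorted list is a choice made here only to keep the output deterministic.

```python
def merge_groups(old_groups, new_groups, strategy="add"):
    """Combine group lists according to the named strategy (illustrative only)."""
    if strategy == "add":
        # Keep the old groups and add the new ones.
        return sorted(set(old_groups) | set(new_groups))
    if strategy == "replace":
        # Discard the old groups entirely.
        return sorted(set(new_groups))
    if strategy == "preserve":
        # Ignore the incoming groups.
        return sorted(set(old_groups))
    raise ValueError("Unknown strategy: {0!r}".format(strategy))


merged = merge_groups(["a", "b"], ["b", "c"], "add")
```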
-
sync
(source_name, data)[source]¶ Synchronize data from a source into Ckan.
- datasets are matched by _harvest_source
- groups and organizations are matched by name
Parameters: - source_name – String identifying the source of the data. Used to build ids that will be used in further synchronizations.
- data – Data to be synchronized. Should be a dict (or dict-like) with top-level keys corresponding to the object type, mapping to dictionaries of {'id': <object>}.
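As an illustration of the expected shape of data, the dict below uses the dataset, organization and group top-level keys that generate_data() is documented to return. The ids and field values are made up, and the idea that datasets refer to organizations and groups through their source ids is an assumption, not something stated above.

```python
# Hypothetical payload: every id and value here is invented for the example.
data = {
    "organization": {
        "org-1": {"name": "org-abc123", "title": "Organization abc123"},
    },
    "group": {
        "grp-1": {"name": "grp-abc123", "title": "Group abc123"},
    },
    "dataset": {
        "ds-1": {
            "name": "dataset-abc123",
            "title": "Dataset abc123",
            "owner_org": "org-1",   # assumed to be a source id
            "groups": ["grp-1"],    # assumed to be source ids
        },
    },
}

# With a configured SynchronizationClient, the call would then be:
# client.sync("my-source", data)
```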
Modules¶
ckan_api_client.exceptions¶
Exceptions used throughout the package.
-
exception
ckan_api_client.exceptions.
HTTPError
(status_code, message, original=None)[source]¶ Bases:
exceptions.Exception
Exception representing an HTTP response error.
-
status_code
¶ HTTP status code
-
message
¶ Informative error message, if available
-
original
¶
-
-
exception
ckan_api_client.exceptions.
BadApiError
[source]¶ Bases:
exceptions.Exception
Exception used to mark bad behavior from the API
ckan_api_client.objects¶
Base objects¶
Classes to represent / validate Ckan objects.
-
class
ckan_api_client.objects.base.
BaseField
(default=<object object>, is_key=<object object>, required=False)[source]¶ Bases:
object
Pseudo-descriptor, accepting field names along with instance, to allow better retrieving data for the instance itself.
Warning
Beware that fields shouldn’t carry state of their own, apart from the state used for generic field configuration, as they are shared between instances.
-
default
= None¶
-
is_key
= False¶
-
get
(instance, name)[source]¶ Get the value for the field from the main instance, by looking at the first found in:
- the updated value
- the initial value
- the default value
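The lookup order above (updated value, then initial value, then default) can be sketched as follows. This is a simplified illustration: get_field_value is a hypothetical function, and the assumption that updated and initial values live in two separate dicts is ours, not a description of how BaseObject stores them.

```python
_MISSING = object()  # sentinel, so that None can be a legitimate stored value


def get_field_value(updated, initial, default, name):
    """Return the first value found for `name`: updated, initial, then default."""
    for source in (updated, initial):
        value = source.get(name, _MISSING)
        if value is not _MISSING:
            return value
    return default


updated = {"title": "New title"}
initial = {"title": "Old title", "name": "my-dataset"}
title = get_field_value(updated, initial, None, "title")
```

With this layout, deleting the updated value is enough to logically restore the initial one.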
-
validate
(instance, name, value)[source]¶ The validate method should return the (updated) value to be used as the field value, or raise an exception in case it is not acceptable at all.
-
delete
(instance, name)[source]¶ Delete the modified value for a field (logically restores the original one)
-
-
class
ckan_api_client.objects.base.
BaseObject
(values=None)[source]¶ Bases:
object
Base for the other objects, dispatching get/set/deletes to
BaseField
instances, if available.
Base fields¶
-
class
ckan_api_client.objects.fields.
StringField
(default=<object object>, is_key=<object object>, required=False)[source]¶ Bases:
ckan_api_client.objects.base.BaseField
-
default
= None¶
-
-
class
ckan_api_client.objects.fields.
ListField
(default=<object object>, is_key=<object object>, required=False)[source]¶ Bases:
ckan_api_client.objects.fields.MutableFieldMixin
,ckan_api_client.objects.base.BaseField
-
static
default
()¶
-
class
ckan_api_client.objects.fields.
DictField
(default=<object object>, is_key=<object object>, required=False)[source]¶ Bases:
ckan_api_client.objects.fields.MutableFieldMixin
,ckan_api_client.objects.base.BaseField
-
static
default
()¶
Ckan dataset / resource¶
-
class
ckan_api_client.objects.ckan_dataset.
ResourcesField
(default=<object object>, is_key=<object object>, required=False)[source]¶ Bases:
ckan_api_client.objects.fields.ListField
The ResourcesField should behave pretty much as a list field, but will keep track of changes, and make sure all elements are CkanResources.
-
class
ckan_api_client.objects.ckan_dataset.
CkanDataset
(values=None)[source]¶ Bases:
ckan_api_client.objects.base.BaseObject
-
id
= StringField(default=None, is_key=True, required=False)¶
-
name
= StringField(default=None, is_key=False, required=False)¶
-
title
= StringField(default=None, is_key=False, required=False)¶
-
license_id
= StringField(default='', is_key=False, required=False)¶
-
maintainer
= StringField(default='', is_key=False, required=False)¶
-
maintainer_email
= StringField(default='', is_key=False, required=False)¶
-
notes
= StringField(default='', is_key=False, required=False)¶
-
owner_org
= StringField(default='', is_key=False, required=False)¶
-
private
= BoolField(default=False, is_key=False, required=False)¶
-
state
= StringField(default='active', is_key=False, required=False)¶
-
type
= StringField(default='dataset', is_key=False, required=False)¶
-
url
= StringField(default='', is_key=False, required=False)¶
-
extras
= ExtrasField(default=<function <lambda>>, is_key=False, required=False)¶
-
groups
= GroupsField(default=<function <lambda>>, is_key=False, required=False)¶
-
resources
= ResourcesField(default=<function <lambda>>, is_key=False, required=False)¶
-
-
class
ckan_api_client.objects.ckan_dataset.
CkanResource
(values=None)[source]¶ Bases:
ckan_api_client.objects.base.BaseObject
-
id
= StringField(default=None, is_key=True, required=False)¶
-
description
= StringField(default='', is_key=False, required=False)¶
-
format
= StringField(default='', is_key=False, required=False)¶
-
mimetype
= StringField(default=None, is_key=False, required=False)¶
-
mimetype_inner
= StringField(default=None, is_key=False, required=False)¶
-
name
= StringField(default='', is_key=False, required=False)¶
-
resource_type
= StringField(default='', is_key=False, required=False)¶
-
size
= StringField(default=None, is_key=False, required=False)¶
-
url
= StringField(default='', is_key=False, required=False)¶
-
url_type
= StringField(default=None, is_key=False, required=False)¶
-
Ckan group¶
-
class
ckan_api_client.objects.ckan_group.
CkanGroup
(values=None)[source]¶ Bases:
ckan_api_client.objects.base.BaseObject
-
id
= StringField(default=None, is_key=True, required=False)¶
-
name
= StringField(default=None, is_key=False, required=False)¶
-
title
= StringField(default='', is_key=False, required=False)¶
-
approval_status
= StringField(default='approved', is_key=False, required=False)¶
-
description
= StringField(default='', is_key=False, required=False)¶
-
image_url
= StringField(default='', is_key=False, required=False)¶
-
is_organization
= BoolField(default=False, is_key=False, required=False)¶
-
state
= StringField(default='active', is_key=False, required=False)¶
-
type
= StringField(default='group', is_key=False, required=False)¶
-
extras
= ExtrasField(default=<function <lambda>>, is_key=False, required=False)¶
-
groups
= GroupsField(default=<function <lambda>>, is_key=False, required=False)¶
-
Ckan organization¶
-
class
ckan_api_client.objects.ckan_organization.
CkanOrganization
(values=None)[source]¶ Bases:
ckan_api_client.objects.base.BaseObject
-
id
= StringField(default=None, is_key=True, required=False)¶
-
name
= StringField(default=None, is_key=False, required=False)¶
-
title
= StringField(default='', is_key=False, required=False)¶
-
approval_status
= StringField(default='approved', is_key=False, required=False)¶
-
description
= StringField(default='', is_key=False, required=False)¶
-
image_url
= StringField(default='', is_key=False, required=False)¶
-
is_organization
= BoolField(default=True, is_key=False, required=False)¶
-
state
= StringField(default='active', is_key=False, required=False)¶
-
type
= StringField(default='organization', is_key=False, required=False)¶
-
extras
= ExtrasField(default=<function <lambda>>, is_key=False, required=False)¶
-
groups
= GroupsField(default=<function <lambda>>, is_key=False, required=False)¶
-
ckan_api_client.utils¶
-
class
ckan_api_client.utils.
IDPair
[source]¶ Bases:
ckan_api_client.utils.IDPair
A pair (named tuple) mapping a “source” id with the one used internally in Ckan.
This is mostly used in association with IDMap.
Keys: source_id, ckan_id
-
class
ckan_api_client.utils.
SuppressExceptionIf
(cond)[source]¶ Bases:
object
Context manager used to suppress exceptions if they match a given condition.
Usage example:
    is_404 = lambda x: isinstance(x, HTTPError) and x.status_code == 404
    with SuppressExceptionIf(is_404):
        client.request(...)
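For readers unfamiliar with how such a context manager works, the sketch below shows the general technique: __exit__ returns a true value to swallow the exception. SuppressIf is a hypothetical stand-in written for this example and is not the source of SuppressExceptionIf.

```python
class SuppressIf(object):
    """Suppress an exception raised in the block when cond(exception) is true."""

    def __init__(self, cond):
        self.cond = cond

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_value is None:
            return False  # nothing was raised, nothing to suppress
        # A true return value tells Python to swallow the exception.
        return bool(self.cond(exc_value))


reached = False
with SuppressIf(lambda exc: isinstance(exc, KeyError)):
    {}["missing"]  # raises KeyError, which the condition accepts
reached = True
```

Exceptions for which the condition returns a false value propagate as usual.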
-
class
ckan_api_client.utils.
IDMap
[source]¶ Bases:
object
Two-way hashmap to map source ids to ckan ids and the other way back.
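A two-way map of this kind is commonly built from two dictionaries kept in sync. The class below is a minimal sketch of that idea; the name TwoWayMap and its method names are hypothetical and do not reflect the actual IDMap interface.

```python
class TwoWayMap(object):
    """Keep source-id -> ckan-id and ckan-id -> source-id lookups in sync."""

    def __init__(self):
        self._to_ckan = {}
        self._to_source = {}

    def add(self, source_id, ckan_id):
        # Store the pair in both directions.
        self._to_ckan[source_id] = ckan_id
        self._to_source[ckan_id] = source_id

    def to_ckan(self, source_id):
        return self._to_ckan[source_id]

    def to_source(self, ckan_id):
        return self._to_source[ckan_id]


id_map = TwoWayMap()
id_map.add("source-1", "ckan-uuid-1")
```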
-
class
ckan_api_client.utils.
FrozenDict
(*a, **kw)[source]¶ Bases:
_abcoll.MutableMapping
Frozen dictionary. Acts as a read-only dictionary, preventing changes and returning frozen objects when asked for values.
-
class
ckan_api_client.utils.
FrozenSequence
(data)[source]¶ Bases:
_abcoll.Sequence
Base class for the FrozenList/FrozenTuple classes. Acts as a read-only sequence type, returning frozen versions of mutable objects.
-
class
ckan_api_client.utils.
FrozenList
(data)[source]¶ Bases:
ckan_api_client.utils.FrozenSequence
Immutable list-like.
-
class
ckan_api_client.utils.
FrozenTuple
(data)[source]¶ Bases:
ckan_api_client.utils.FrozenSequence
Immutable tuple-like.
Testing¶
All the testing is done via py.test. See the pages below on how to run and write tests.
Fixtures¶
Documentation of the available fixtures for tests.
Fixture functions¶
Utility objects¶
Utility functions¶
Functions used by fixtures.
Testing utilities¶
Data generation¶
-
ckan_api_client.tests.utils.generate.
generate_organization
()[source]¶ Generate a random organization object, with:
- name, random, example: "org-abc123"
- title, random, example: "Organization abc123"
- description, random
- image, url pointing to a random-generated pic
-
ckan_api_client.tests.utils.generate.
generate_group
()[source]¶ Generate a random group object, with:
- name, random, example: "grp-abc123"
- title, random, example: "Group abc123"
- description, random
- image, url pointing to a random-generated pic
-
ckan_api_client.tests.utils.generate.
generate_dataset
()[source]¶ Generate a dataset, populated with random data.
Fields:
- name – random string, in the form dataset-{random}
- title – random string, in the form Dataset {random}
- author – random-generated name
- author_email – random-generated email address
- license_id – random license id. One of cc-by, cc-zero, cc-by-sa or notspecified.
- maintainer – random-generated name
- maintainer_email – random-generated email address
- notes – random string, containing some markdown
- owner_org – set to None
- private – fixed to False
- tags – random list of tags (strings)
- type – fixed string: "dataset"
- url – random url of dataset on an “external source”
- extras – dictionary containing random key / value pairs
- groups – empty list
- resources – list of random resources
- relationships – empty list
Note
The owner_org and groups fields will be blank, as they must match existing groups / organizations, and we don’t have access to the database from here (nor is it in the scope of this function!)
-
ckan_api_client.tests.utils.generate.
generate_resource
()[source]¶ Generate a random resource, to be put in a dataset.
Fields:
- url – resource URL on an “external source”
- resource_type – one of api or file
- name – random-generated name
- format – a random format (eg: csv, json)
- description – random generated string
Generate amount random tags. Each tag is in the form tag-<random-int>.
Returns: a list of tag names
-
ckan_api_client.tests.utils.generate.
generate_extras
(amount)[source]¶ Generate a dict with amount random key/value pairs.
-
ckan_api_client.tests.utils.generate.
generate_data
(dataset_count=50, orgs_count=10, groups_count=15)[source]¶ Generate a bunch of random data. Will also associate datasets with random organizations / groups.
Returns: a dict with the dataset, organization and group keys; each of them a dict of {key: object}.
HTTP Utilities¶
Utilities for handling / checking HTTP responses
-
ckan_api_client.tests.utils.http.
check_response_ok
(response, status_code=200)[source]¶
Warning
Deprecated function. Use check_api_v3_response().
-
ckan_api_client.tests.utils.http.
check_response_error
(response, status_code)[source]¶
Warning
Deprecated function. Use check_api_v3_error().
-
ckan_api_client.tests.utils.http.
check_api_v3_response
(response, status_code=200)[source]¶ Make sure that response is a valid successful response from API v3.
- check http status code to be in the 200-299 range
- check http status code to match status_code
- check content-type to be application/json
- check charset to be utf-8
- check content body to be valid json
- make sure response object contains the success, result and help keys
- check that success is True
- check that error key is not in the response
Parameters: - response – a requests response
- status_code – http status code to be checked (default: 200)
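The checks listed above translate almost line by line into assertions. The sketch below is an approximation written for this page, not the test helper itself: check_v3_response is a hypothetical name, and it takes the status code, Content-Type header value and body text as plain arguments instead of a requests response, so that it runs without any network access.

```python
import json


def check_v3_response(status_code, content_type, body, expected_status=200):
    """Assert that the given pieces describe a successful API v3 response."""
    assert 200 <= status_code < 300
    assert status_code == expected_status
    # Split "application/json;charset=utf-8" into MIME type and parameters.
    mime, _, params = content_type.partition(";")
    assert mime.strip().lower() == "application/json"
    assert "charset=utf-8" in params.replace(" ", "").lower()
    data = json.loads(body)  # raises ValueError if the body is not valid JSON
    assert all(key in data for key in ("success", "result", "help"))
    assert data["success"] is True
    assert "error" not in data
    return data


payload = json.dumps({"success": True, "result": ["a", "b"], "help": "..."})
data = check_v3_response(200, "application/json;charset=utf-8", payload)
```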
String-related¶
String generation functions.
-
ckan_api_client.tests.utils.strings.
generate_password
(length=20)[source]¶ Generate random password of the given length.
Beware that the string will be generated as random data from urandom, and returned as a hexadecimal string of twice the length.
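The "twice the length" remark follows from hex encoding, where each random byte becomes two characters. A minimal sketch of that behaviour, using a hypothetical function name and assuming os.urandom plus binascii.hexlify as the building blocks:

```python
import binascii
import os


def random_hex(length=20):
    """Return `length` random bytes encoded as a hex string of 2 * length chars."""
    return binascii.hexlify(os.urandom(length)).decode("ascii")


password = random_hex(20)
```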
-
ckan_api_client.tests.utils.strings.
generate_random_alphanum
(length=10)[source]¶ Generate a random string, made of ascii letters + digits
-
ckan_api_client.tests.utils.strings.
gen_random_id
(length=10)[source]¶ Generate a random id, made of lowercase ascii letters + digits
-
ckan_api_client.tests.utils.strings.
gen_picture
(s, size=200)[source]¶ Generate URL to picture from some text hash