NIDM API

This is a tool to deploy a REST API to run queries on and visualize NIDM turtle objects. Queries can be found in the nidm-query, repo, along with a nidm viewer. This API is under development, and please submit issues and requests to the nidm-api repo.

Why do I want to use this?

You might want to use this tool if you have a NIDM data structure, meaning NIDM-results, NIDM-experiment, or NIDM-workflow, and you want to get information out of it but you don’t know a single thing about RDF files or the query language for them, which is called sparql. This tool will allow you to run pre-generated queries easily, and return results in a format that is easily parsable by modern web technologies (eg, json for javascript or python), and (coming soon) csv and tsv files.

Under Development

The tool currently implements returning basic json from a query against a ttl (turtle) file. The following will be developed:

  • interactive interfaces for creating new queries
  • returning interactive graphs (d3, neo4j)
  • functions to search / filter queries
  • returning more data types
  • validation of query data structures

Contents:

Installation

To install

pip install git+git://github.com/incf-nidash/nidm-api.git

Getting Started

You have three options for using the nidm-api: as a local REST API, a REST API (served to the world on a public server), or as modules (to integrate into your python applications).

Local Machine REST API

When you install the module, an executable, “nidm” is placed into your system or local bin (it will tell you the location upon installation). If this bin is added to your path, you can start the server:

nidm

Then open your browser to localhost:8088. The index view that you see is a complete list of validated queries, for example:

_images/apiQueryMeta.png

The available queries must pass through validation to be available (not yet implemented). The queries are organized by their uid, which is just the name of the json file that is found in the nidm-query repo. You can contribute to this repo if you want to make a new query, and tools will be developed for you to generate these data structures in a graphical interface. The “type” variable in the returned json is generated dynamically, and corresponds to the folder name in nidm-query repo. We currently support “results,” “experiment,” and “workflow,” as these are the different kinds of NIDM data structures that are being developed.

The first thing you might want to do is retrieve all the meta data for a single query. This means that we will look at the list in the photo above, and find the uid of one that we like. We can then ask to see a single query:

http://localhost:8088/api/7950f524-90e8-4d54-ad6d-7b22af2e895d
_images/singleQuery.png

You might then have a turtle file that you want to actually run a query on. For example, if you look in the nidm-api examples directory, we have provided a nidm.ttl file that corresponds to a NIDM-Results folder. We can run the query we just saw above to get coordinates:

http://localhost:8088/api/query/7950f524-90e8-4d54-ad6d-7b22af2e895d?ttl=/home/vanessa/Desktop/nidm.ttl

The only difference is that we’ve added the “query” to the URL, and specified the ttl file as a variable, indicated by the ”?” at the end of the URL. We will eventually give you more options to customize your query, for now that’s it. When you do the query, your result will be returned again in json:

_images/doQuery.png

Note that I gave the REST API a local path on my computer. We can also give it a URL and it will work.

http://localhost:8088/api/query/7950f524-90e8-4d54-ad6d-7b22af2e895d?ttl=https://rawgithub.com/incf-nidash/nidm-api/master/example/nidm.ttl

Boum.

If you screw something up, meaning that the query did not work for any reason (e.g., you gave it a wrong file, improperly formatted file, or the query logic has an error), it will tell you:

_images/screwUp.png

Note that the current (tiny) set of queries is currently not being validated, and they need work and contribution.

REST API on Server

You can use something like gunicorn to run the flask application on a server, for the world to use. More documentation on how to do this will come, as we currently do not have a server to host this. Please be aware that the debug mode in the Flask application is set to True, and you should `read about <http://flask.pocoo.org/docs/0.10/deploying/ >`_ the proper way to deploy a flask application before doing something that might compromise the security of your server.

Integration into Python

An example turtle file is provided in the “example” directory of the repo, and running a query on this file from within python is shown below:

#!/usr/bin/python

from nidm.query import get_query_directory, validate_queries, make_lookup, do_query

# Get updated queries, validate, and generate a lookup dict:
query_dir = get_query_directory()
query_json = validate_queries(query_dir)
query_dict = make_lookup(query_json,key_field="uid")

# Let's use the query to get coordinates
qid = "7950f524-90e8-4d54-ad6d-7b22af2e895d"

# Here is a ttl file that I want to query, nidm-results
ttl_file = "nidm.ttl"

result = do_query(ttl_file=ttl_file,query=query_dict[qid]["sparql"])

# The result is a pandas data frame. I can turn it into other things too
result = result.to_dict(orient="records")

nidm

nidm package

Subpackages
nidm.script package
Submodules
nidm.script.post_install module
Module contents
nidm.templates package
Module contents
Submodules
nidm.app module
class nidm.app.NIDMServer(*args, **kwargs)[source]

Bases: flask.app.Flask

class nidm.app.apiDoQuery[source]

Bases: flask_restful.Resource

return result of query on ttl file Paramters ========= qid: str

the uid associated with the query
ttl: str
the url of the turtle file
endpoint = 'apidoquery'
get(qid, output_format)[source]
mediatypes(resource_cls)
methods = ['GET']
class nidm.app.apiIndex[source]

Bases: flask_restful.Resource

Main view for REST API to display all available queries

endpoint = 'apiindex'
get()[source]
mediatypes(resource_cls)
methods = ['GET']
class nidm.app.apiQueryMeta[source]

Bases: flask_restful.Resource

return complete meta data for specific query

endpoint = 'apiquerymeta'
get(qid)[source]
mediatypes(resource_cls)
methods = ['GET']
nidm.app.generateQuery()[source]
nidm.app.newQuery()[source]
nidm.app.previewQuery()[source]
nidm.app.start(port=8088)[source]
nidm.experiment module
nidm.query module

query: part of the nidm-api general functions to work with query data structures for nidm-queries

class nidm.query.Queries(components=['experiment', 'results', 'workflow'])[source]
nidm.query.do_query(ttl_file, query, rdf_format='turtle', serialize_format='csv', output_df=True)[source]
nidm.query.download_queries(destination)[source]

Download queries repo to a destination Parameters ========== destination:

the full path to download the repo to
nidm.query.find_queries(query_folders, search_pattern='*.json')[source]

searches one or more folders for valid queries, meaning json files. In the case of multiple directories, will append the folder name as a variable to indicate the type Parameters ========== query_folders: list or str

one or more full paths to directories with json objects
search_pattern: str
pattern for glob to use to find query objects default is “*.json”
queries: list
a list of full paths to query object files
nidm.query.format_sparql(sparql_text)[source]

split sparql text into a list, and extract parameter options from select.

nidm.query.generate_query_template(output_dir=None, template_path=None, fields=None)[source]
output_dir: str
full path to output directory for json data structure. if none specified, will not save the data structure
template_path: str
path to json file to use as a template. Only should be specified if the user has reason to use a custom template default is the standard provided by nidm-api.
fields: dict (optional)
a dictionary with fields that correspond to template keys. if provided, template will be filled with keys. Possible values include
template: json (dict)
A python dictionary (json) that can be filled with new query information
nidm.query.get_query_directory(tmpdir=None)[source]

get_query_directory: Download queries repo to tmp directory Parameters ========== tmpdir: str

path to directory to download queries to
nidm.query.make_lookup(query_list, key_field)[source]

returns dict object to quickly look up query based on uid Parameters ========== query_list: list

a list of query (dict objects)
key_field: str
the key in the dictionary to base the lookup key
query_dict: dict
dict (json) with key as “key_field” from query_list
nidm.query.read_queries(query_paths)[source]

Read in a list of query (json) objects. Parameters ========== query_paths: list a list of full paths to query objects to read Returns ======= queries_: list

dict to be served as json describing queries available a “type” variable is added to indicate folder query was found in
nidm.query.save_query_template(template, output_dir)[source]

generate_query_template Parameters ========== output_dir: string path

full path to output directory for json data structure. the template filename is generated from the uid variable
success: boolean
True if save was successful, false otherwise
nidm.query.validate_queries(query_dir, queries=None, components=['sparql'])[source]

returns json object with query data structures, and a field ‘valid’ to describe if query was valid Parameters ========== queries: list

a list of full paths to json files, each a query
query_dir: str
full path to a nidm-query repo
components: folders to include corresponding to nidm
query language (currently only option is sparql)
queries: json
dict (json) with all read in queries available

from nidm-query, provided by API

nidm.results module
nidm.scripts module

script.py: part of nidmapi package Runtime executable

nidm.scripts.main()[source]
nidm.utils module

utils: part of the nidm-api general functions for the api

nidm.utils.clean_fields(mydict)[source]

Ensures that keys and values of dictionary are in utf-8 so rendering in javascript is clean. Paramters ========= mydict: dict

dictionary to clean
newdict: dict
dictionary with all fields encoded in utf-8
nidm.utils.copy_directory(src, dest)[source]

Copy an entire directory recursively

nidm.utils.find_directories(root, fullpath=True)[source]

Return directories at one level specified by user (not recursive)

nidm.utils.find_subdirectories(basepath)[source]

Return directories (and sub) starting from a base

nidm.utils.get_installdir()[source]

returns installation directory of nidm-api

nidm.utils.get_query_template()[source]

get_standard_template returns the full path to the standard template for queries

nidm.utils.get_template(template_file)[source]

get_template: read in and return a template file

nidm.utils.has_internet_connectivity()[source]

Checks for internet connectivity by way of trying to retrieve google IP address. Returns True/False

nidm.utils.is_type(var, types=[<type 'int'>, <type 'float'>, <type 'list'>])[source]

Check type

nidm.utils.load_json(json_path)[source]

returns a loaded json file Parameters ========== json_path: str

full path to json file to load
thejson: json (dict)
loaded json (dict)
nidm.utils.remove_unicode_dict(input_dict, encoding='utf-8')[source]

remove unicode keys and values from dict, encoding in utf8

nidm.utils.save_template(output_file, html_snippet)[source]
nidm.utils.set_permissions(path, permission=128)[source]
nidm.utils.sub_template(template, template_tag, substitution)[source]

make a substitution for a template_tag in a template

nidm.workflow module
Module contents

Development

How do I contribute a new query?

We have a dynamic web interface that will allow you to generate, preview (and eventually test) a new data structure for adding to nidm-query. This can also be done, using the same functions, programatically. You can then add it to the repo by submitting a pull request to add it. A pull request affords group discussion, and we will eventually have continuous integration that will run tests on your new query. We recommend that you use the generation functions to ensure accuracy in the format and fields of your data structure.

Web Query Generator

To generate a query with the interactive web interface, first start up the nidm application

nidm

Then open your browser to localhost:8088/query/new. You will be presented with fields to fill in for your query:

_images/queryField.png
Fields
  • title: This should be a single sentence that describes what your query does, what object model it is associated with, etc.
  • creator: We ask for your ORCID ID to be filled into the creator spot. If you don’t have this, you can put a name or email. We are currently not validating this, so you can really do whatever you like. We will (hopefully) decide on a standard.
  • keywords: should be keywords to describe your query.
  • component: name of the nidm component the query belongs to - one of [PROV, Results, Experiment, Workflow, Dataset Description, General]
  • format: the type of sparql query - one of [SELECT, CONSTRUCT, ASK, UPDATE]
  • model: name of the object model being queried - one of [fMRI Results, Freesurfer, General]
  • description: This should be a text description of your query, please give details. The title and description will eventually be provided in a static web interface served with the repo for people to search and find queries they want to use.

When you click next, the next page is where you should copy paste your sparql:

_images/querySparql.png

If you are not ready to generate your file, you can click “Preview” for a new tab to open with the query. The parameters that the user is allowed to ask for will be extracted from your “select” line, indicated by a word preceded with ? (e.g., ?hello). Do not worry about capitalization.

_images/queryPreview.png

When you are ready to save your query, click “Create!”

_images/queryDownload1.png

A file will be downloaded to your computer. You should drop this file into the appropriate directory in your nidm-query. repo and submit a PR to add it to the nidm-api. We will eventually have tests for the queries, and an interactive web interface hosted with the nidm-api to explore the queries available (before downloading the nidm-api).

_images/queryDownload2.png

You can then download to your local machine:

How do I develop the API?

You will want to fork the repo, clone the fork, and then run the flask application directly (so that it updates with changes to your code):

git clone https://github.com/[username]/nidm-api
cd nidm-api
python setup.py install --user
python nidm/app.py

How does this work?

Flask is a web framework (in python) that makes it ridiculously easy to do stuff with python in the browser. You can conceptually think of it like Django released wild and free from its mom’s minivan. If you look at app.py, you will see a bunch of functions that return different views, and each view is associated with a particular url (with variables) that the user might want. You should first familiarize yourself with flask before trying to develop with it.

Queries vs. API

The queries are kept separate from the api itself, in the nidm-query repo. We did this because the world of writing sparql, and developing a web framework / API to serve the queries, are two separate things. A developer writing queries should be able to submit a PR to add a single file to the nidm-queries repo without needing to know about the nidm-api infrastructure. A developer working on the API shouldn’t need to worry about the sparql side of things.

Application Logic

The basic application logic is as follows:

  • The user installs the application with pip. This installs the python modules to the user’s site-packages, but it also adds an executable, “nidm” to the users bin. This exectuable can be run to start the server instantly.
  • Upon the creation of the server, the nidm-queries repo is downloaded to a temporary directory. This ensures that queries are up to date at the start of the server. If you are using the functions from within your application, you can download the repo to a location of your choice and specify the location in your application.
  • The queries are json (ld) files. This just means they have a key called @context with some kind of stuff that semantic web folk understand. We are showing them as standard .json files because the .jsonld extension is not widely known, and could be confusing.
  • The tool reads in all queries, and presents valid queries to the user at the base url of the server, localhost:8088. (Note that validation is not currently implemented). The user can select a query of choice based on the unique id, the “uid” variable in the json structure presented at localhost:8088.
  • The user can then look at the details for a query by way of locahost:8088/api/[qid], or perform a query on a ttl file with localhost:8088/api/query/[qid]?ttl=[ttl_file]. The [ttl_file] can be a local path, or a URL. This is the extent of the tool thus far, it is still under development.

Serving an API and web interfaces

The url /api/[more-stuff-here] is linked up to serve a RESTful API, however the beauty of flask is that we can configure other URLs to do other interesting things. For example, /create might bring up an interactive web interface to write inputs to generate a new query object. /api/visual may be configured to return an interactive d3 or neo4j version of some part of the graph extracted from your ttl file. Having python and the infinite amount of web visualization technology at our fingertips makes the options really unlimited.

Indices and tables