LWR¶
This project is a Python server application that allows a Galaxy server to run jobs on remote systems (including Windows) without requiring a shared mounted file system. Unlike traditional Galaxy job runners, input files, scripts, and config files are transferred to the remote system, the job is executed there, and the results are downloaded back to the Galaxy server.
Full documentation for the project can be found on Read The Docs.
Configuring Galaxy¶
Galaxy job runners are configured in Galaxy's job_conf.xml file. Some small examples of how to configure this can be found here, but be sure to check out job_conf.xml.sample_advanced in your Galaxy code base or on Bitbucket for complete information.
Downloading LWR¶
The LWR server application is distributed as a Python project and can be obtained via mercurial from bitbucket.org using the following command:
hg clone http://bitbucket.org/jmchilton/lwr
LWR Dependencies¶
Several Python packages must be installed to run the LWR server. These can either be installed into a Python virtualenv or into your system-wide Python environment using easy_install. Instructions for both are outlined below. Additionally, if DRMAA is going to be used to communicate with a cluster, this dependency must be installed as well - again, see the note below.
virtualenv¶
The script setup_venv.sh distributed with the LWR server is a short-cut for *nix machines to set up a Python environment (including the installation of virtualenv). Full details for an installation suitable for *nix are as follows. These instructions can work for Windows as well, but generally the easy_install instructions below are more robust for Windows environments.
Install virtualenv (if not available):
pip install virtualenv
Create a new Python environment:
virtualenv .venv
Activate environment (varies by OS).
From a Linux or MacOS terminal:
. .venv/bin/activate
From a Windows terminal:
.venv\Scripts\activate
Install required dependencies into this virtual environment:
pip install -r requirements.txt
easy_install¶
Install Python setuptools for your platform; more details on how to do this can be found here. The easy_install command-line application will be installed as part of setuptools. Use the following command to install the needed packages via easy_install:
easy_install paste wsgiutils PasteScript PasteDeploy webob six psutil
DRMAA¶
If your LWR instance is going to communicate with a cluster via DRMAA then, in addition to the above dependencies, a DRMAA library will need to be installed and the Python dependency drmaa installed as well:
. .venv/bin/activate; pip install drmaa
or:
easy_install drmaa
Running the LWR Server Application¶
*nix Instructions¶
The LWR can be started and stopped via the run.sh script distributed with the LWR:
./run.sh --daemon
./run.sh --stop-daemon
These commands will start and stop the WSGI web server in daemon mode. In this mode, logs are written to paster.log.
If uWSGI, circus, and/or chaussette are available, more sophisticated web servers can be launched via this run.sh command. See the script for more details.
Alternative Cross Platform Instructions (Windows and *nix)¶
The paster command-line application will be installed as part of the previous dependency installation process. This application can be used to start and stop a paste web server running the LWR. The server may be run as a daemon via the command:
paster serve server.ini --daemon
When running as daemon, the server may be stopped with the following command:
paster serve server.ini --stop-daemon
If you setup a virtual environment for the LWR you will need to activate this before executing these commands.
Configuring the LWR Server Application¶
Rename the server.ini.sample file distributed with the LWR to server.ini, and edit the values therein to configure the server application. Default values are specified for all configuration options and will work if the LWR is running on the same host as Galaxy. However, the parameter "host" must be specified for remote submissions to the LWR server to run properly. The server.ini file contains documentation for many configuration parameters you may want to modify.
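For example, to accept remote submissions you might bind the web server to an externally visible interface - a minimal sketch following the layout of server.ini.sample (the host and port values are illustrative):
[server:main]
## Bind to a public interface so a remote Galaxy can submit jobs.
host = 0.0.0.0
port = 8913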
Some advanced configuration topics are discussed below.
Security¶
Out of the box the LWR essentially allows anyone with network access to the LWR server to execute arbitrary code and to read and write any files the web server can access. Hence, in most settings, steps should be taken to secure the LWR server.
LWR Web Server¶
The LWR web server can be configured to use SSL and to require the client (i.e. Galaxy) to pass along a private token authorizing use.
pyOpenSSL is required to configure an LWR web server to serve content via HTTPS/SSL. This dependency can be difficult to install and seems to be getting more difficult. Under Linux you will want to ensure the dependencies needed to compile pyOpenSSL are available - for instance, in a fresh Ubuntu image you will likely need:
sudo apt-get install libffi-dev python-dev libssl-dev
Then pyOpenSSL can be installed with the following command (be sure to source your virtualenv if setup above):
pip install pyOpenSSL
Under Windows only older versions of pyOpenSSL are installable via pre-compiled binaries (i.e. using easy_install), so it might be good to use non-standard sources such as eGenix.
Once installed, you will need to set the option ssl_pem in server.ini. This parameter should reference an OpenSSL certificate file for use by the Python paste server. This parameter can be set to * to automatically generate such a certificate. Such a certificate can be manually generated by the following method:
$ openssl genrsa 1024 > host.key
$ chmod 400 host.key
$ openssl req -new -x509 -nodes -sha1 -days 365 \
-key host.key > host.cert
$ cat host.cert host.key > host.pem
$ chmod 400 host.pem
More information can be found in the paste httpserver documentation.
Finally, in order to force Galaxy to authorize itself, you will want to specify a private token - by simply setting private_key to some long random string in server.ini.
Once SSL has been enabled and a private token configured, Galaxy job destinations should include a private_token parameter to authenticate these jobs.
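Putting this together, the relevant server.ini entries might look like the following sketch (the certificate path and token value are illustrative):
## Certificate for HTTPS (generated as above; * auto-generates one).
ssl_pem = host.pem
## Require clients (i.e. Galaxy) to present this token with each request.
private_key = 123456789changeme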
LWR Message Queue¶
If the LWR is processing Galaxy requests via a message queue instead of a web server, the underlying security mechanisms of the message queue should be used to secure LWR communication - configuring SSL with the LWR and a private_token as described above is not required.
This will likely consist of setting some combination of amqp_connect_ssl_ca_certs, amqp_connect_ssl_keyfile, amqp_connect_ssl_certfile, and amqp_connect_ssl_cert_reqs in the LWR's server.ini file. See server.ini.sample for more details and the Kombu documentation for even more information.
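Such a configuration might look like the following sketch (all paths and values here are illustrative - consult server.ini.sample for the authoritative option list):
amqp_connect_ssl_ca_certs = /etc/ssl/certs/mq_ca.pem
amqp_connect_ssl_keyfile = /etc/ssl/private/lwr_client.key
amqp_connect_ssl_certfile = /etc/ssl/certs/lwr_client.cert
amqp_connect_ssl_cert_reqs = cert_required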
Customizing the LWR Environment¶
In more sophisticated deployments, the LWR's environment will need to be tweaked - for instance, to define a DRMAA_LIBRARY_PATH environment variable for the drmaa Python module, or to define the location of Galaxy (via GALAXY_HOME) if certain Galaxy tools require it or if Galaxy metadata is being set by the LWR. The recommended way to do this is to copy local_env.sh.sample to local_env.sh and customize it.
This file of deployment-specific environment tweaks will be sourced by run.sh if it exists, as well as by other LWR scripts in more advanced usage scenarios.
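For example, a minimal local_env.sh might look like the following sketch (both paths are deployment specific and purely illustrative):
## Deployment-specific environment tweaks - sourced by run.sh and other LWR scripts.
export DRMAA_LIBRARY_PATH=/usr/lib/libdrmaa.so.1
export GALAXY_HOME=/path/to/galaxy-dist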
Job Managers (Queues)¶
By default the LWR will maintain its own queue of jobs. While ideal for simple deployments such as those targeting a single Windows instance, if the LWR is going to be used on more sophisticated clusters, it can be configured to maintain multiple such queues with different properties or to delegate to external job queues (via DRMAA, qsub/qstat CLI commands, or Condor).
For more information on configuring external job managers, see the job managers documentation.
Warning: If you are using DRMAA, be sure to define DRMAA_LIBRARY_PATH in the local_env.sh file described above.
Galaxy Tools¶
Some Galaxy tool wrappers require a copy of the Galaxy codebase itself to run. Such tools will not run under Windows, but on *nix hosts the LWR can be configured to add the required Galaxy code to a job's PYTHON_PATH by setting the GALAXY_HOME environment variable in the LWR's local_env.sh file (described above).
Caching (Experimental)¶
The LWR and its clients can be configured to cache job input files. For some workflows this can result in a significant decrease in data transfer and greater throughput. On the LWR side, the property file_cache_dir in server.ini must be set. See Galaxy's job_conf.xml for information on configuring the client.
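For example, a single setting in server.ini enables the cache (the directory shown is illustrative):
file_cache_dir = /data/lwr_cache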
More discussion on this can be found in this galaxy-dev mailing list thread and future plans and progress can be tracked on this Trello card.
Message Queue (Experimental)¶
Galaxy and the LWR can be configured to communicate via a message queue instead of an LWR web server. In this mode, the LWR will download files from and upload files to Galaxy instead of the inverse - this may be very advantageous if the LWR needs to be deployed behind a firewall or if the Galaxy server is already set up (via a proxy web server) for large file transfers.
To bind the LWR server to a message queue, one needs to first ensure the kombu Python dependency is installed (pip install kombu). Once this is available, simply set the message_queue_url property in server.ini to the correct URL of your configured AMQP endpoint.
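For example, for a RabbitMQ broker running on a host named mqserver (these illustrative credentials match the example job_conf.xml later in this document):
message_queue_url = amqp://rabbituser:rabb8pa8sw0d@mqserver:5672//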
Configuring your AMQP-compatible message queue is beyond the scope of this document - see RabbitMQ, for instance, for more details (other MQs should work as well).
Testing¶
A simple sanity test can be run against a running LWR server by executing the following command (replace the URL with that of your running LWR application):
python run_client_tests.py --url=http://localhost:8913
Development¶
This project is distributed with unit and integration tests (many of which will not run under Windows). The following command will install the needed Python components to run these tests:
pip install -r dev-requirements.txt
The following command will then run these tests:
nosetests
The following command will then produce a coverage report corresponding to these tests and place it in the coverage_html_report subdirectory of this project:
coverage html
Job Managers¶
By default the LWR will maintain its own queue of jobs. Alternatively, the LWR can be configured to maintain multiple such queues with different properties or to delegate to external job queues (via DRMAA, qsub/qstat CLI commands, or Condor).
To change the default configuration, rename the file job_managers.ini.sample distributed with the LWR to job_managers.ini, modify it to reflect your desired configuration, and finally uncomment the line #job_managers_config = job_managers.ini in server.ini.
Likely the cleanest way to interface with an external queueing system is going to be DRMAA. In this case, one should likely copy local_env.sh.sample to local_env.sh and update it to set DRMAA_LIBRARY_PATH to point to the correct libdrmaa.so file. Also, the Python drmaa module must be installed (see more information about the drmaa dependency at https://lwr.readthedocs.org/#job-managers).
Sample Configuration¶
## Default job manager is queued and runs 1 concurrent job.
[manager:_default_]
type = queued_python
max_concurrent_jobs=1
## Create a named queue (example) and run as many concurrent jobs as the
## server has cores. The Galaxy LWR url should have /managers/example
## appended to it to use a named manager such as this.
#[manager:example]
#type=queued_python
#max_concurrent_jobs=*
## DRMAA backed manager (vanilla).
## Be sure the drmaa Python module is installed and DRMAA_LIBRARY_PATH points
## to a valid DRMAA shared library file. You may also need to adjust
## LD_LIBRARY_PATH.
#[manager:_default_]
#type=queued_drmaa
#native_specification=-P bignodes -R y -pe threads 8
## Condor backed manager.
#[manager:_default_]
#type=queued_condor
## Optionally, additional condor submission parameters can be
## set as follows:
#submit_universe=vanilla
#submit_request_memory=32
#submit_requirements=OpSys == "LINUX" && Arch =="INTEL"
#submit_rank=Memory >= 64
## These would set universe, request_memory, requirements, and rank
## in the condor submission file to the specified values. For
## more information on condor submission files see the following link:
## http://research.cs.wisc.edu/htcondor/quick-start.html.
## CLI Manager Locally
## Manage jobs via command-line execution of qsub, qdel, and qstat.
#[manager:_default_]
#type=queued_cli
#job_plugin=Torque
## CLI Manager via Remote Shell
## Manage jobs via qsub, qdel, qstat on remote host `queuemanager` as
## Unix user `queueuser`.
#[manager:_default_]
#type=queued_cli
#job_plugin=Torque
#shell_plugin=SecureShell
#shell_hostname=queuemanager
#shell_username=queueuser
## DRMAA (via external users) manager.
## This variant of the DRMAA manager will run jobs as the supplied user.
#[manager:_default_]
#type=queued_external_drmaa
#production=true
#chown_working_directory_script=scripts/chown_working_directory.bash
#drmaa_kill_script=scripts/drmaa_kill.bash
#drmaa_launch_script=scripts/drmaa_launch.bash
## NOT YET IMPLEMENTED. PBS backed manager.
#[manager:_default_]
#type=queued_pbs
## Disable server-side LWR queuing (suitable for older-style LWR use
## when queues were maintained in Galaxy). Deprecated; will be removed
## at some point soon.
#[manager:_default_]
#type=unqueued
## MQ-Options:
## If using a message queue the LWR will actively monitor status of jobs
## in order to issue status update messages. The following options are
## then available to any managers.
## Minimum seconds between polling intervals (increase to reduce resources
## consumed by the LWR).
#min_polling_interval = 0.5
Running Jobs As External User¶
TODO: Fill out this section with information from this thread: http://dev.list.galaxyproject.org/Managing-Data-Locality-tp4662438.html
Galaxy Configuration¶
Examples¶
The most complete and updated documentation for configuring Galaxy job destinations is Galaxy's job_conf.xml.sample_advanced file (check it out on Bitbucket). These examples just provide a different LWR-centric perspective on some of the documentation in that file.
Simple Windows LWR Web Server¶
The following Galaxy job_conf.xml assumes you have deployed a simple LWR web server to the Windows host windowshost.example.com on the default port (8913) with a private_key (defined in server.ini) of 123456789changeme. Most Galaxy jobs will just use Galaxy's local job runner, but msconvert and proteinpilot will be sent to the LWR server on windowshost.example.com. Sophisticated tool dependency resolution is not available for Windows-based LWR servers, so ensure the underlying applications are on the LWR host's path.
<?xml version="1.0"?>
<job_conf>
<plugins>
<plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
<plugin id="lwr" type="runner" load="galaxy.jobs.runners.lwr:LwrJobRunner"/>
</plugins>
<handlers>
<handler id="main"/>
</handlers>
<destinations default="local">
<destination id="local" runner="local"/>
<destination id="win_lwr" runner="lwr">
<param id="url">https://windowshost.examle.com:8913/</param>
<param id="private_token">123456789changeme</param>
</destination>
</destinations>
<tools>
<tool id="msconvert" destination="win_lwr" />
<tool id="proteinpilot" destination="win_lwr" />
</tools>
</job_conf>
Targeting a Linux Cluster (LWR Web Server)¶
The following Galaxy job_conf.xml assumes you have a very typical Galaxy setup - there is a local, smaller cluster that mounts all of Galaxy's data (so no need for the LWR) and a bigger shared resource that cannot mount Galaxy's files, requiring the use of the LWR. This variant routes some larger assembly jobs to the remote cluster - namely the trinity and abyss tools. Be sure the underlying applications required by the trinity and abyss tools are on the LWR path (or set tool_dependency_dir in server.ini and set up Galaxy env.sh-style package definitions for these applications).
<?xml version="1.0"?>
<job_conf>
<plugins>
<plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
<plugin id="lwr" type="runner" load="galaxy.jobs.runners.lwr:LwrJobRunner"/>
</plugins>
<handlers>
<handler id="main"/>
</handlers>
<destinations default="drmaa">
<destination id="local_cluster" runner="drmaa">
<param id="native_specification">-P littlenodes -R y -pe threads 4</param>
</destination>
<destination id="remote_cluster" runner="lwr">
<param id="url">http://remotelogin:8913/</param>
<param id="submit_native_specification">-P bignodes -R y -pe threads 16</param>
<!-- Look for trinity package at remote location - define tool_dependency_dir
in the LWR server.ini file.
-->
<param id="dependency_resolution">remote</params>
<!-- Use more correct parameter generation for *nix. Needs testing on Windows
servers before this becomes default. -->
<param id="rewrite_parameters">True</params>
</destination>
</destinations>
<tools>
<tool id="trinity" destination="remote_cluster" />
<tool id="abyss" destination="remote_cluster" />
</tools>
</job_conf>
For this configuration, on the LWR side be sure to set DRMAA_LIBRARY_PATH in local_env.sh, install the Python drmaa module, and configure a DRMAA job manager (an example job_managers.ini follows).
[manager:_default_]
type=queued_drmaa
Targeting a Linux Cluster (LWR over Message Queue)¶
For LWR instances sitting behind a firewall, a web server may be impossible. If the same LWR configuration discussed above is additionally configured with a message_queue_url of amqp://rabbituser:rabb8pa8sw0d@mqserver:5672// in server.ini, the following Galaxy configuration will cause this message queue to be used for communication. This is also likely better for large file transfers since typically your production Galaxy server will be sitting behind a high-performance proxy while the LWR will not.
<?xml version="1.0"?>
<job_conf>
<plugins>
<plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
<plugin id="lwr" type="runner" load="galaxy.jobs.runners.lwr:LwrJobRunner">
<!-- Must tell LWR where to send files. -->
<param id="galaxy_url">https://galaxyserver</param>
<!-- Message Queue Connection (should match message_queue_url in LWR's server.ini)
-->
<param id="url">amqp://rabbituser:rabb8pa8sw0d@mqserver:5672//</param>
</plugin>
</plugins>
<handlers>
<handler id="main"/>
</handlers>
<destinations default="drmaa">
<destination id="local_cluster" runner="drmaa">
<param id="native_specification">-P littlenodes -R y -pe threads 4</param>
</destination>
<destination id="remote_cluster" runner="lwr">
<!-- Tell Galaxy where files are being store on remote system, no
web server it can simply ask for this information.
-->
<param id="jobs_directory">/path/to/remote/lwr/lwr_staging/</param>
<!-- Invert file transfers - have LWR initiate downloads during preprocessing
and uploads during postprocessing. -->
<param id="default_file_action">remote_transfer</param>
<!-- Remaining parameters same as previous example -->
<param id="submit_native_specification">-P bignodes -R y -pe threads 16</param>
<param id="dependency_resolution">remote</params>
<param id="rewrite_parameters">True</params>
</destination>
</destinations>
<tools>
<tool id="trinity" destination="remote_cluster" />
<tool id="abyss" destination="remote_cluster" />
</tools>
</job_conf>
Targeting Apache Mesos (Prototype)¶
See commit message for initial work on this and this post on galaxy-dev.
Etc...¶
There are many more options for configuring which paths get staged/unstaged and how, how Galaxy metadata is generated, running jobs as the real user, defining multiple job managers on the LWR side, etc. If you ever have any questions please don't hesitate to ask John Chilton (jmchilton@gmail.com).
File Actions¶
Most of the parameters settable in Galaxy's job configuration file job_conf.xml are straightforward - but specifying how Galaxy and the LWR stage various files may benefit from more explanation.
As demonstrated in the above examples, default_file_action describes how inputs, outputs, etc. are staged. The default, transfer, has Galaxy initiate HTTP transfers. This makes little sense in the context of message queues, so in that case it should be overridden and set to remote_transfer, which causes the LWR to initiate the file transfers. Additional options are available, including none, copy, and remote_copy.
In addition to this default, paths may be overridden based on various patterns to allow optimization of file transfers in real production infrastructures where various systems mount different file stores, and mount the same file stores with different paths on different systems.
To do this, the LWR destination in job_conf.xml may specify a parameter named file_action_config. This needs to be a config file path (if relative, relative to Galaxy's root) like lwr_actions.yaml (it can be YAML or JSON - but older Galaxy releases only supported JSON).
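For example, the remote destination from the earlier examples could reference such a file with (the file name is illustrative):
<param id="file_action_config">lwr_actions.yaml</param>
The referenced file then describes per-path actions, as in the following sample: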
paths:
  # Use transfer (or remote_transfer) if only Galaxy mounts a directory.
  - path: /galaxy/files/store/1
    action: transfer
  # Use copy (or remote_copy) if the remote LWR server also mounts the directory
  # but the actual compute servers do not.
  - path: /galaxy/files/store/2
    action: copy
  # If Galaxy, the LWR, and the compute nodes all mount the same directory,
  # staging can be disabled altogether for given paths.
  - path: /galaxy/files/store/3
    action: none
  # Following block demonstrates specifying paths by globs as well as rewriting
  # unstructured data in .loc files.
  - path: /mnt/indices/**/bwa/**/*.fa
    match_type: glob
    path_types: unstructured # Set to *any* to apply to defaults & unstructured paths.
    action: transfer
    depth: 1 # Stage whole directory with job and not just the file.
  # Following block demonstrates rewriting paths without staging. Useful for
  # instance if Galaxy's data indices are mounted on both servers but with
  # different paths.
  - path: /galaxy/data
    path_types: unstructured
    action: rewrite
    source_directory: /galaxy/data
    destination_directory: /work/galaxy/data
Configuring a Public LWR Server¶
An LWR server can be pointed at a Galaxy toolbox XML file and opened to the world. By default, an LWR is allowed to run anything Galaxy (or another client) sends it. The toolbox and referenced tool files are used to restrict what the LWR will run.
This can sort of be thought of as web services defined by Galaxy tool files - with all the advantages (dead simple configuration for clients, ability to hide details related to data and computation) and disadvantages (lack of reproducibility if the LWR server goes away, potential lack of transparency).
Securing a Public LWR¶
The following options should be set in server.ini
to configure a
public LWR server.
assign_ids=uuid - By default the LWR will just use the ids supplied by client Galaxy instances. Setting this to uuid will result in each job being assigned a UUID, ensuring different clients will not and cannot interfere with each other.
tool_config_files=/path/to/tools.xml - As noted above, this is used to restrict what tools clients can run. All tools on public LWR servers should have validators for commands (and optionally for configfiles) defined. The syntax for these elements can be found in the ValidatorTest test case.
Writing Secure Tools¶
Validating in this fashion is complicated and potentially error prone, so it is advisable to keep command lines as simple as possible. configfiles and reorganizing parameter handling in wrapper scripts can assist in this.
Consider the following simple example:
tool.xml:
<tool>
<command interpreter="python">wrapper.py --input1 'Text' --input2 'Text2' --input3 4.5</command>
...
wrapper.py:
from optparse import OptionParser

def main():
    parser = OptionParser()
    parser.add_option("--input1")
    parser.add_option("--input2")
    parser.add_option("--input3")
    (options, args) = parser.parse_args()

if __name__ == "__main__":
    main()
Even this simple example is easier to validate and secure if it is reworked like so:
tool.xml:
<tool>
<configfiles>
<configfile name="args">--input1 'Text' --input2 'Text2' --input3 4.5</configfile>
</configfiles>
<command interpreter="python">wrapper.py $args</command>
...
wrapper.py:
import shlex
import sys
from optparse import OptionParser

def main():
    args_config = sys.argv[1]
    args_string = open(args_config, "r").read()
    parser = OptionParser()
    parser.add_option("--input1")
    parser.add_option("--input2")
    parser.add_option("--input3")
    (options, args) = parser.parse_args(shlex.split(args_string))

if __name__ == "__main__":
    main()
Server Code¶
lwr.managers Module¶
Job Managers
lwr.managers.base Module¶
Base Classes and Infrastructure Supporting Concrete Manager Implementations.
lwr.managers.base.base_drmaa Module¶
lwr.managers.base.directory Module¶
lwr.managers.base.external Module¶
class lwr.managers.base.external.ExternalBaseManager(name, app, **kwds)
    Bases: lwr.managers.base.directory.DirectoryBaseManager
    Base class for managers that interact with external distributed resource managers.
class lwr.managers.base.BaseManager(name, app, **kwds)
    Bases: lwr.managers.ManagerInterface
    job_directory(job_id)
class lwr.managers.base.JobDirectory(staging_directory, job_id, lock_manager=None)
    Bases: lwr.lwr_client.job_directory.RemoteJobDirectory
lwr.managers.base.get_mapped_file(directory, remote_path, allow_nested_files=False, local_path_module=posixpath, mkdir=True)
    >>> import ntpath
    >>> get_mapped_file(r'C:\lwr\staging\101', 'dataset_1_files/moo/cow', allow_nested_files=True, local_path_module=ntpath, mkdir=False)
    'C:\\lwr\\staging\\101\\dataset_1_files\\moo\\cow'
    >>> get_mapped_file(r'C:\lwr\staging\101', 'dataset_1_files/moo/cow', allow_nested_files=False, local_path_module=ntpath)
    'C:\\lwr\\staging\\101\\cow'
    >>> get_mapped_file(r'C:\lwr\staging\101', '../cow', allow_nested_files=True, local_path_module=ntpath, mkdir=False)
    Traceback (most recent call last):
    Exception: Attempt to read or write file outside an authorized directory.
lwr.managers.queued Module¶
class lwr.managers.queued.QueueManager(name, app, **kwds)
    Bases: lwr.managers.unqueued.Manager
    A job manager that queues up jobs directly (i.e. does not use external queuing software such as PBS, SGE, etc.).
    manager_type = 'queued_python'
lwr.managers.queued_drmaa Module¶
lwr.managers.queued_external_drmaa Module¶
class lwr.managers.queued_external_drmaa.ExternalDrmaaQueueManager(name, app, **kwds)
    Bases: lwr.managers.base.base_drmaa.BaseDrmaaManager
    DRMAA backed queue manager.
    manager_type = 'queued_external_drmaa'
lwr.managers.queued_condor Module¶
class lwr.managers.queued_condor.CondorQueueManager(name, app, **kwds)
    Bases: lwr.managers.base.external.ExternalBaseManager
    Job manager backend that plugs into Condor.
    manager_type = 'queued_condor'
lwr.managers.queued_pbs Module¶
class lwr.managers.queued_pbs.PbsQueueManager(name, app, **kwds)
    Bases: lwr.managers.base.BaseManager
    Placeholder for a PBS-python backed queue manager. Not yet implemented; for many situations where this would be used, the DRMAA or CLI+Torque managers may be better choices or at least stop gaps.
    manager_type = 'queued_pbs'
lwr.managers.unqueued Module¶
class lwr.managers.unqueued.Manager(name, app, **kwds)
    Bases: lwr.managers.base.directory.DirectoryBaseManager
    A simple job manager that just directly runs jobs as given (no queueing). Preserved for compatibility with older versions of LWR client code where Galaxy is used to maintain the queue (like Galaxy's local job runner).
    manager_type = 'unqueued'
lwr.managers.stateful Module¶
class lwr.managers.stateful.ActiveJobs(manager)
    Bases: object
    Keeps track of active jobs (those that are not yet "complete"). The current implementation is file based, but could easily be made database-based instead.
    TODO: Keep active jobs in memory after initial load so the disk does not need to be hit repeatedly to recover this information.
class lwr.managers.stateful.ManagerMonitor(stateful_manager)
    Bases: object
    Monitors active jobs of a StatefulManagerProxy.
lwr.managers.status Module¶
lwr.managers.util Module¶
This module and its submodules contain utilities for running external processes and interfacing with job managers. This module should contain functionality shared between Galaxy and the LWR.
lwr.managers.staging Module¶
This module contains the code that allows the LWR to stage files during preprocessing (currently this means downloading or copying files) and then unstage or send results back to the client during postprocessing.
class lwr.managers.ManagerInterface
    Bases: object
    Defines the interface to various job managers.
    clean(job_id)
        Delete the job directory and clean up resources associated with the job with id job_id.
    get_status(job_id)
        Return the status of the job as a string; currently supported statuses include 'cancelled', 'running', 'queued', and 'complete'.
    job_directory(job_id)
        Return a JobDirectory abstraction describing the state of the job working directory.
    launch(job_id, command_line, submit_params={}, dependencies_description=None, env=[])
        Called to indicate that the client is ready for this job with the specified job id and command line to be executed (i.e. run or queue this job depending on the implementation).
    return_code(job_id)
        Return an integer indicating the return code of the specified execution or LWR_UNKNOWN_RETURN_CODE.
    setup_job(input_job_id, tool_id, tool_version)
        Set up a job directory for the specified input (Galaxy) job id, tool id, and tool version.
lwr.daemon Module¶
class lwr.daemon.LwrConfigBuilder(args=None, **kwds)
    Bases: object
    Generate paste-like configuration from supplied command-line arguments.
class lwr.daemon.LwrManagerConfigBuilder(args=None, **kwds)
    Bases: lwr.daemon.LwrConfigBuilder
lwr.scripts Module¶
This module contains entry points into various LWR scripts. Corresponding shell scripts that set up the deployment-specific environments and then delegate to these Python scripts can be found in LWR_ROOT/scripts.
lwr.scripts.drmaa_kill Module¶
lwr.scripts.drmaa_launch Module¶
lwr.scripts.lwr_submit Module¶
lwr.scripts.mesos_executor Module¶
lwr.scripts.mesos_framework Module¶
lwr.web Module¶
The code explicitly related to the LWR web server can be found in this module and its submodules.
lwr.web.framework Module¶
Tiny framework used to power the LWR application; nothing in here is specific to running or staging jobs. Mostly deals with routing web traffic and parsing parameters.
class lwr.web.framework.Controller(response_type='OK')
    Bases: object
    Wraps Python functions into controller methods.
lwr.web.routes Module¶
class lwr.web.routes.LwrController(**kwargs)
    Bases: lwr.web.framework.Controller
lwr.web.wsgi Module¶
class lwr.web.wsgi.LwrWebApp(lwr_app)
    Bases: lwr.web.framework.RoutingApp
    Web application for the LWR web server.
lwr.messaging Module¶
This module contains the server-side only code for interfacing with message queues. Code shared between client and server can be found in submodules of lwr.lwr_client.
lwr.mesos Module¶
This module and its submodules contain code for interfacing with the Apache Mesos framework.
lwr.mesos.framework Module¶
lwr.app Module¶
Deprecated module for the WSGI app factory. LWR servers should transition to lwr.web.wsgi:app_factory.
lwr.manager_endpoint_util Module¶
Composite actions over managers shared between the HTTP endpoint (routes.py) and the message queue.
lwr.manager_factory Module¶
lwr.tools Module¶
Tools
lwr.tools.toolbox Module¶
class lwr.tools.toolbox.InputsValidator(command_validator, config_validators)
    Bases: object
class lwr.tools.toolbox.SimpleToolConfig(tool_el, tool_path)
    Bases: lwr.tools.toolbox.ToolConfig
    Abstract description of a Galaxy tool loaded from a toolbox with the tool tag not containing a guid, i.e. one not from the toolshed.
class lwr.tools.toolbox.ToolBox(path_string)
    Bases: object
    Abstraction over a tool config file largely modelled after Galaxy's shed_tool_conf.xml. Hopefully over time this toolbox schema will be a direct superset of Galaxy's with extensions to support simple, non-toolshed based tool setups.
class lwr.tools.toolbox.ToolConfig
    Bases: object
    Abstract description of a Galaxy tool.
    inputs_validator
class lwr.tools.toolbox.ToolShedToolConfig(tool_el, tool_path)
    Bases: lwr.tools.toolbox.SimpleToolConfig
    Abstract description of a Galaxy tool loaded from a toolbox with the tool tag containing a guid, i.e. one from the toolshed.
    <tool file="../shed_tools/gvk.bx.psu.edu/repos/test/column_maker/f06aa1bf1e8a/column_maker/column_maker.xml" guid="gvk.bx.psu.edu:9009/repos/test/column_maker/Add_a_column1/1.1.0">
        <tool_shed>gvk.bx.psu.edu:9009</tool_shed>
        <repository_name>column_maker</repository_name>
        <repository_owner>test</repository_owner>
        <installed_changeset_revision>f06aa1bf1e8a</installed_changeset_revision>
        <id>gvk.bx.psu.edu:9009/repos/test/column_maker/Add_a_column1/1.1.0</id>
        <version>1.1.0</version>
    </tool>
lwr.tools.authorization Module¶
This module contains the authorization classes (each deriving from object). The default authorization allows anything - by default the LWR is assumed to be secured using a firewall or a private_token. A work-in-progress alternative implements tool-based white-listing of what jobs can run and what those jobs can do.
lwr.tools.validator Module¶
Client Code¶
lwr.lwr_client Module¶
lwr_client¶
This module contains logic for interfacing with an external LWR server.
Configuring Galaxy¶
Galaxy job runners are configured in Galaxy's job_conf.xml file. See job_conf.xml.sample_advanced in your Galaxy code base or on Bitbucket for information on how to configure Galaxy to interact with the LWR.
Galaxy also supports an older, less rich configuration of job runners directly in its main universe_wsgi.ini file. The following section describes how to configure Galaxy to communicate with the LWR in this legacy mode.
Legacy¶
A Galaxy tool can be configured to be executed remotely via the LWR by adding a line to the universe_wsgi.ini file under the galaxy:tool_runners section with the format:
<tool_id> = lwr://http://<lwr_host>:<lwr_port>
As an example, if a host named remotehost is running the LWR server application on port 8913, then the tool with id test_tool can be configured to run remotely on remotehost by adding the following line to universe_wsgi.ini:
test_tool = lwr://http://remotehost:8913
Remember this must be added after the [galaxy:tool_runners] header in the universe_wsgi.ini file.