Welcome to the documentation for sarge, a wrapper for subprocess
which aims to make life easier for anyone who needs to interact with external
applications from their Python code.
Please note: this documentation is work in progress.
This is the default timeout which will be used by Capture
instances when you don’t specify one in the Capture constructor.
This is currently set to 0.02 seconds.
This function is a convenience wrapper which constructs a
Pipeline instance from the passed parameters, and then invokes
run() and close() on that instance.
input (Text, bytes or a file-like object containing bytes (not text).) – Input data to be passed to the command(s). If text is passed,
it’s converted to bytes using the default encoding. The
bytes are converted to a file-like object (a
BytesIO instance). If a value such as a file-like
object, integer file descriptor or special value like
subprocess.PIPE is passed, it is passed through
unchanged to subprocess.Popen.
kwargs – Any keyword parameters which you might want to pass to the
wrapped Pipeline instance.
This function is a convenience wrapper which does the same as run()
while capturing the stdout of the subprocess(es). This captured output
is available through the stdout attribute of the return value from
this function.
This function is a convenience wrapper which does the same as
capture_stdout() but also returns the text captured. Use this when
you know the output is not voluminous, so it doesn’t matter that it’s
buffered in memory.
This function is a convenience wrapper which does the same as run()
while capturing the stderr of the subprocess(es). This captured output
is available through the stderr attribute of the return value from
this function.
This function is a convenience wrapper which does the same as
capture_stderr() but also returns the text captured. Use this when
you know the output is not voluminous, so it doesn’t matter that it’s
buffered in memory.
This function is a convenience wrapper which does the same as run()
while capturing the stdout and the stderr of the subprocess(es).
This captured output is available through the stdout and
stderr attributes of the return value from this function.
This function is a convenience wrapper which does the same as
capture_both() but also returns the text captured. Use this when
you know the output is not voluminous, so it doesn’t matter that it’s
buffered in memory.
Format a shell command with format placeholders and variables to fill
those placeholders.
Note: you must specify positional parameters explicitly, i.e. as {0}, {1}
instead of {}, {}. Requiring the formatter to maintain its own counter can
lead to thread safety issues unless a thread local is used to maintain
the counter. It’s not that hard to specify the values explicitly
yourself :-)
Parameters:
fmt (str, or unicode on 2.x) – The shell command as a format string. Note that you will need
to double up braces you want in the result, i.e. { -> {{ and
} -> }}, due to the way str.format() works.
args – Positional arguments for use with fmt.
kwargs – Keyword arguments for use with fmt.
Returns:
The formatted shell command, which should be safe for use in
shells from the point of view of shell injection.
input (Text, bytes or a file-like object containing bytes.) – Input data to be passed to the command. If text is
passed, it’s converted to bytes using the default
encoding. The bytes are converted to a file-like object (a
BytesIO instance). The contents of the
file-like object are written to the stdin
stream of the sub-process.
async (bool) – If True, the command is run asynchronously – that is
to say, wait() is not called on the underlying
Popen instance.
This represents a set of commands which need to be run as a unit.
Parameters:
source (str) – The source text with the command(s) to run.
posix (bool) – Whether the source will be parsed using Posix conventions.
kwargs – Any keyword parameters you would pass to
subprocess.Popen, other than stdin (for which,
you need to use the input parameter of the
run() method instead). You can pass
Capture instances for stdout and stderr
keyword arguments, which will cause those streams to be
captured to those instances.
async – The same as for the Command.run() method. Note that
parts of the pipeline may specify synchronous or
asynchronous running – this flag refers to the pipeline
as a whole.
A class which allows an output stream from a sub-process to be captured.
Parameters:
timeout (float) – The default timeout, in seconds. Note that you can
override this in particular calls to read input. If
None is specified, the value of the module attribute
default_capture_timeout is used instead.
buffer_size (int) – The buffer size to use when reading from the underlying
streams. If not specified or specified as zero, a 4K
buffer is used. For interactive applications, use a value
of 1.
size (int) – The number of bytes to read. If not specified, the intent is
to read the stream until it is exhausted.
block (bool) – Whether to block waiting for input to be available.
timeout (float) – How long to wait for input. If None,
use the default timeout that this instance was
initialised with. If the result is None, wait
indefinitely.
This looks for a pattern in the captured output stream. If found, it
returns immediately; otherwise, it will block until the timeout expires,
waiting for a match as bytes from the captured stream continue to be read.
Parameters:
string_or_pattern – A string or pattern representing a regular
expression to match. Note that this needs to
be a bytestring pattern if you pass a pattern
in; if you pass in text, it is converted to
bytes using the utf-8 codec and then to
a pattern used for matching (using search).
If you pass in a pattern, you may want to
ensure that its flags include re.MULTILINE
so that you can make use of ^ and $ in
matching line boundaries. Note that on Windows,
you may need to use \r?$ to match ends of
lines, as $ matches Unix newlines (LF) and
not Windows newlines (CRLF).
timeout – If not specified, the module’s default_expect_timeout
is used.
Returns:
A regular expression match instance, if a match was found
within the specified timeout, or None if no match was
found.
close(stop_threads=False):
Close the capture object. By default, this waits for the threads which
read the captured streams to terminate (which may not happen unless the
child process is killed, and the streams read to exhaustion). To ensure
that the threads are stopped immediately, specify True for the
stop_threads parameter, which asks the threads to terminate
immediately. This may lead to losing data from the captured streams
which has not yet been read.
This is a subclass of subprocess.Popen which is provided mainly
to allow a process’ stdout to be mapped to its stderr. The
standard library version allows you to specify stderr=STDOUT to
indicate that the standard error stream of the sub-process be the same as
its standard output stream. However, there’s no facility in the standard
library to do stdout=STDERR – but it is provided in this subclass.
In fact, the two streams can be swapped by doing stdout=STDERR, stderr=STDOUT in a call. The STDERR value is defined in sarge
as an integer constant which is understood by sarge (much as
STDOUT is an integer constant which is understood by subprocess).
The sarge parser looks for commands which are separated by ; and &:
echo foo; echo bar & echo baz
which means to run echo foo, wait for its completion,
and then run echo bar and then echo baz without waiting for echo bar to complete.
Commands can also be combined with && and || into conditional commands,
of the form:
a && b
or:
c || d
Here, command b is executed only if a returns success (i.e. a
return code of 0), whereas d is only executed if c returns failure,
i.e. a return code other than 0. Of course, in practice all of a, b,
c and d could have arguments, not shown above for simplicity’s sake.
Each operand on either side of && or || could also consist of a
pipeline – a set of commands connected such that the output streams of one
feed into the input stream of another. For example:
echo foo | cat
or:
command-a |& command-b
where the use of | indicates that the standard output of echo foo is
piped to the input of cat, whereas the standard error of command-a is
piped to the input of command-b.
In general, file descriptors other than 1 and 2 are not allowed,
as the functionality needed to provide them (dup2) is not properly
supported on Windows. However, an esoteric special case is recognised:
echo foo | tee stdout.log 3>&1 1>&2 2>&3 | tee stderr.log > /dev/null
This redirection construct will put foo in both stdout.log and stderr.log. The effect of this construct is to swap the standard output
and standard error streams, using file descriptor 3 as a temporary as in the
code analogue for swapping variables a and b using temporary variable
c:
c = a; a = b; b = c
This is recognised by sarge and used to swap the two streams,
though it doesn’t literally use file descriptor 3,
instead using a cross-platform mechanism to fulfill the requirement.
You can see this post for a longer explanation of
this somewhat esoteric usage of redirection.
A Capture consists of a queue, some output streams from sub-processes,
and some threads to read from those streams into the queue. One thread is
created for each stream, and the thread exits when its stream has been
completely read. When you read from a Capture instance using methods
like read(), readline() and
readlines(), you are effectively reading from the queue.
Each of the read(), readline() and
readlines() methods has optional block and timeout
keyword arguments. These default to True and None respectively,
which means block indefinitely until there’s some data – the standard
behaviour for file-like objects. However, these can be overridden internally
in a couple of ways:
The Capture constructor takes an optional timeout keyword
argument. This defaults to None, but if specified, that’s the timeout used
by the readXXX methods unless you specify values in the method calls.
If None is specified in the constructor, the module attribute
default_capture_timeout is used, which is currently set to 0.02
seconds. If you need to change this default, you can do so before any
Capture instances are created (or just provide an alternative default
in every Capture creation).
If all streams feeding into the capture have been completely read,
then block is always set to False.
There shouldn’t be any special implications of handling large amounts of
data, other than buffering, buffer sizes and memory usage (which you would
have to think about anyway). Here’s an example of piping a 20MB file into a
capture across several process boundaries:
A new constant, STDERR, is defined by sarge. If you specify
stdout=STDERR, this means that you want the child process stdout to
be the same as its stderr. This is analogous to the core functionality in
subprocess.Popen where you can specify stderr=STDOUT to have the
child process stderr be the same as its stdout. The use of this
constant also allows you to swap the child’s stdout and stderr,
which can be useful in some cases.
This functionality works through a class sarge.Popen which subclasses
subprocess.Popen and overrides the internal _get_handles method to
work the necessary magic – which is to duplicate, close and swap handles as
needed.
The shell_quote() function works as follows. Firstly,
an empty string is converted to ''. Next, a check is made to see if the
string has already been quoted (i.e. it begins and ends with the '
character), and if so, it is returned enclosed in " and with any contained
" characters escaped with a backslash. Otherwise, it’s bracketed with the
' character and every internal instance of ' is replaced with
'"'"'.
This is inspired by Nick Coghlan’s shell_command project. An internal ShellFormatter
class is derived from string.Formatter and overrides the
string.Formatter.convert_field() method to provide quoting for placeholder
values. This formatter is simpler than Nick’s in that it forces you to
explicitly provide the indices of positional arguments: You have to use e.g.
'cp {0} {1}' instead of 'cp {} {}'. This avoids the need to keep an
internal counter in the formatter, which would make its implementation
non-thread-safe without additional work.
where WORD and NUM are terminal tokens with the meanings you would expect.
The parser constructs a parse tree, which is used internally by the
Pipeline class to manage the running of the pipeline.
The standard library’s shlex module contains a class which is used for
lexical scanning. Since the shlex.shlex class is not able to provide
the needed functionality, sarge includes a module, shlext,
which defines a subclass, shell_shlex, which provides the necessary
functionality. This is not part of the public API of sarge, though it has
been submitted as an
enhancement on the Python
issue tracker.
Sometimes, you can get deadlocks even though you think you’ve taken
sufficient measures to avoid them. To help identify where deadlocks are
occurring, the sarge source distribution includes a module,
stack_tracer, which is based on MIT-licensed code by László Nagy in an
ActiveState recipe. To see how
it’s invoked, you can look at the sarge test harness test_sarge.py –
this is set to invoke the tracer if the TRACE_THREADS variable is set (which
it is, by default). If the unit tests hang on your system, then the
threads-X.Y.log file will show where the deadlock is (just look and see what
all the threads are waiting for).
At the moment, if a Capture is used, it will read from its sub-process
output streams into a queue, which can then be read by your code. If you don’t
read from the Capture in a timely fashion, a lot of data could
potentially be buffered in memory – the same thing that happens when you use
subprocess.Popen.communicate(). Some means of
“turning the tap off” might be added – i.e. pausing the reader threads so that the capturing
threads stop reading from the sub-process streams. This will, of course, cause
those sub-processes to block on their I/O, so at some point the tap would need
to be turned back on. However, such a facility would afford better
sub-process control in some scenarios.
Tutorial — Sarge 0.1.1 documentation
sarge is a pure-Python library. You should be able to install it using:
pip install sarge
for installing sarge into a virtualenv or other directory where you have
write permissions. On Posix platforms, you may need to invoke using sudo
if you need to install sarge in a protected location such as your system
Python’s site-packages directory.
A full test suite is included with sarge. To run it, you’ll need to unpack
a source tarball and run python setup.py test in the top-level directory
of the unpack location. You can of course also run python setup.py install
to install from the source tarball (perhaps invoking with sudo if you need
to install to a protected location).
In the simplest cases, sarge doesn’t provide any major advantage over
subprocess:
>>> from sarge import run
>>> run('echo "Hello, world!"')
Hello, world!
<sarge.Pipeline object at 0x1057110>
The echo command got run, as expected, and printed its output on the
console. In addition, a Pipeline object got returned. Don’t worry too much
about what this is for now – it’s more useful when more complex combinations
of commands are run.
By comparison, the analogous case with subprocess would be:
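A sketch of that case (stdlib only; note the quotes survive into the output because no shell interprets them):

```python
from subprocess import call

# Without shell=True, the command must be split into an argument list;
# call() returns the child's exit code.
rc = call('echo "Hello, world!"'.split())
```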
We had to call split() on the command (or we could have passed
shell=True), and as well as running the command, the call() method
returned the exit code of the subprocess. To get the same effect with sarge
you have to do:
You get two return codes, one for each command. The same information is
available from sarge, in one place – the Pipeline instance that’s
returned from a run() call:
The returncodes property of a Pipeline instance returns a
list of the return codes of all the commands that were run,
whereas the returncode property just returns the last element of
this list. The Pipeline class defines a number of useful properties
- see the reference for full details.
By default, sarge does not run commands via the shell. This means that
wildcard characters in user input do not have potentially dangerous
consequences:
>>> run('ls *.py')
ls: cannot access *.py: No such file or directory
<sarge.Pipeline object at 0x20f3dd0>
There might be circumstances where you need to use shell=True,
in which case you should consider formatting your commands with placeholders
and quoting any variable parts that you get from external sources (such as
user input). Which brings us on to ...
Formatting commands with placeholders for safe usage¶
If you need to merge commands with external inputs (e.g. user inputs) and you
want to prevent shell injection attacks, you can use the shell_format()
function. This takes a format string, positional and keyword arguments and
uses the new formatting (str.format()) to produce the result:
You can pass a string, bytes or a file-like object of bytes. If it’s a string
or bytes, what you pass in is converted to a file-like object of bytes,
which is sent to the child process’ stdin stream in a separate thread.
You can also pass in special values like subprocess.PIPE – these are
passed to the subprocess layer as-is.
You can use && and || to chain commands conditionally using
short-circuit Boolean semantics. For example:
>>> from sarge import run
>>> run('false && echo foo')
<sarge.Pipeline object at 0xb8dd50>
Here, echo foo wasn’t called, because the false command evaluates to
False in the shell sense (by returning an exit code other than zero).
Conversely:
>>> run('false || echo foo')
foo
<sarge.Pipeline object at 0xa11d50>
Here, foo is output because we used the || condition; because the left-
hand operand evaluates to False, the right-hand operand is evaluated (i.e.
run, in this context). Similarly, using the true command:
>>> run('true && echo foo')
foo
<sarge.Pipeline object at 0xb8dd50>
>>> run('true || echo foo')
<sarge.Pipeline object at 0xa11d50>
To capture output for commands, just pass a Capture instance for the
relevant stream:
>>> from sarge import run, Capture
>>> p = run('echo foo; echo bar | cat', stdout=Capture())
>>> p.stdout.text
u'foo\nbar\n'
The Capture instance acts like a stream you can read from: it has
read(), readline() and readlines()
methods which you can call just like on any file-like object,
except that they offer additional options through block and timeout
keyword parameters.
As in the above example, you can use the bytes or text property of a
Capture instance to read all the bytes or text captured. The latter
just decodes the former using UTF-8 (the default encoding isn’t used,
because on Python 2.x, the default encoding isn’t UTF-8 – it’s ASCII).
There are some convenience functions – capture_stdout(),
capture_stderr() and capture_both() – which work just like
run() but capture the relevant streams to Capture instances,
which can be accessed using the appropriate attribute on the
Pipeline instance returned from the functions.
A Capture instance can capture output from one or
more sub-process streams, and will create a thread for each such stream so
that it can read all sub-process output without causing the sub-processes to
block on their output I/O. However, if you use a Capture,
you should be prepared either to consume what it’s read from the
sub-processes, or else be prepared for it all to be buffered in memory (which
may be problematic if the sub-processes generate a lot of output).
You can iterate over Capture instances. By default you will get
successive lines from the captured data, as bytes; if you want text,
you can wrap with io.TextIOWrapper. Here’s an example using Python
3.2:
Sometimes you need to interact with a child process in an interactive manner.
To illustrate how to do this, consider the following simple program,
named receiver, which will be used as the child process:
#!/usr/bin/env python
import sys

def main(args=None):
    while True:
        user_input = sys.stdin.readline().strip()
        if not user_input:
            break
        s = 'Hi, %s!\n' % user_input
        sys.stdout.write(s)
        sys.stdout.flush()  # need this when run as a subprocess

if __name__ == '__main__':
    sys.exit(main())
This just reads lines from the input and echoes them back as a greeting. If
we run it interactively:
$ ./receiver
Fred
Hi, Fred!
Jim
Hi, Jim!
Sheila
Hi, Sheila!
The program exits on seeing an empty line.
We can now show how to interact with this program from a parent process:
The p.returncode didn’t print anything, indicating that the return code
was None. This means that although the child process has exited,
it’s still a zombie because we haven’t “reaped” it by making a call to
wait(). Once that’s done, the zombie disappears and we get the
return code.
From the point of view of buffering, note that two elements are needed for
the above example to work:
We specify buffer_size=1 in the Capture constructor. Without this,
data would only be read into the Capture’s queue after an I/O completes –
which would depend on how many bytes the Capture reads at a time. You can
also pass a buffer_size=-1 to indicate that you want to use line-
buffering, i.e. read a line at a time from the child process. (This may only
work as expected if the child process flushes its output buffers after every
line.)
We make a flush call in the receiver script, to ensure that the pipe
is flushed to the capture queue. You could avoid the flush call in the
above example if you used python -u receiver as the command (which runs
the script unbuffered).
This example illustrates that in order for this sort of interaction to work,
you need cooperation from the child process. If the child process has large
output buffers and doesn’t flush them, you could be kept waiting for input
until the buffers fill up or a flush occurs.
If a third party package you’re trying to interact with gives you buffering
problems, you may or may not have luck (on Posix, at least) using the
unbuffer utility from the expect-dev package (do a Web search to find
it). This invokes a program directing its output to a pseudo-tty device which
gives line buffering behaviour. This doesn’t always work, though :-(
Looking for specific patterns in child process output¶
You can look for specific patterns in the output of a child process, by using
the expect() method of the Capture class. This takes a
string, bytestring or regular expression pattern object and a timeout, and
either returns a regular expression match object (if a match was found in the
specified timeout) or None (if no match was found in the specified
timeout). If you pass in a bytestring, it will be converted to a regular
expression pattern. If you pass in text, it will be encoded to bytes using the
utf-8 codec and then to a regular expression pattern. This pattern will be
used to look for a match (using search). If you pass in a regular
expression pattern, make sure it is meant for bytes rather than text (to avoid
TypeError on Python 3.x). You may also find it useful to specify
re.MULTILINE in the pattern flags, so that you can match using ^ and
$ at line boundaries. Note that on Windows, you may need to use \r?$
to match ends of lines, as $ matches Unix newlines (LF) and not Windows
newlines (CRLF).
New in version 0.1.1: The expect method was added.
To illustrate usage of Capture.expect(), consider the program
lister.py (which is provided as part of the source distribution, as it’s
used in the tests). This prints line1, line2 etc. indefinitely with
a configurable delay, flushing its output stream after each line. We can
capture the output from a run of lister.py, ensuring that we use
line-buffering in the parent process:
Some programs don’t work through their stdin/stdout/stderr
streams, instead opting to work directly with their controlling terminal. In
such cases, you can’t work with these programs using sarge; you need to use
a pseudo-terminal approach, such as is provided by (for example)
pexpect. Sarge works within the limits
of the subprocess module, which means sticking to stdin, stdout
and stderr as ordinary streams or pipes (but not pseudo-terminals).
Examples of programs which work directly through their controlling terminal
are ftp and ssh - the password prompts for these programs are
generally always printed to the controlling terminal rather than stdout or
stderr.
In the subprocess.Popen constructor, the env keyword argument, if
supplied, is expected to be the complete environment passed to the child
process. This can lead to problems on Windows, where if you don’t pass the
SYSTEMROOT environment variable, things can break. With sarge, it’s
assumed that anything you pass in env is added to the contents of
os.environ. This is almost always what you want – after all,
in a Posix shell, the environment is generally inherited with certain
additions for a specific command invocation.
Note
On Python 2.x on Windows, environment keys and values must be of
type str - Unicode values will cause a TypeError. Be careful of
this if you use from__future__importunicode_literals. For example,
the test harness for sarge uses Unicode literals on 2.x,
necessitating the use of different logic for 2.x and 3.x:
if PY3:
    env = {'FOO': 'BAR'}
else:
    # Python 2.x wants native strings, at least on Windows
    env = {b'FOO': b'BAR'}
You can set the working directory for a Command or Pipeline
using the cwd keyword argument to the constructor, which is passed through
to the subprocess when it’s created. Likewise, you can use the other keyword
arguments which are accepted by the subprocess.Popen constructor.
All data between your process and sub-processes is communicated as bytes. Any
text passed as input to run() or a run() method will be
converted to bytes using UTF-8 (the default encoding isn’t used, because on
Python 2.x, the default encoding isn’t UTF-8 – it’s ASCII).
As sarge requires Python 2.6 or later, you can use from __future__ import unicode_literals and byte literals like b'foo' so that your code
looks and behaves the same under Python 2.x and Python 3.x. (See the note on
using native string keys and values in Environments.)
As mentioned above, Capture instances return bytes, but you can wrap
with io.TextIOWrapper if you want text.
The Capture and Pipeline classes can be used as context
managers:
>>> with Capture() as out:
...     with Pipeline('cat; echo bar | cat', stdout=out) as p:
...         p.run(input='foo\n')
...
<sarge.Pipeline object at 0x7f3320e94310>
>>> out.read().split()
['foo', 'bar']
Synchronous and asynchronous execution of commands¶
By default, commands passed to run() run synchronously,
i.e. all commands run to completion before the call returns. However, you can
pass async=True to run, in which case the call returns a Pipeline
instance before all the commands in it have run. You will need to call
wait() or close() on this instance when you
are ready to synchronise with it; this is needed so that the sub processes
can be properly disposed of (otherwise, you will leave zombie processes
hanging around, which show up, for example, as <defunct> on Linux systems
when you run ps -ef). Here’s an example:
Here, foo is printed to the terminal by the last cat command, but all
the sub-processes are zombies. (The run function returned immediately,
so the interpreter got to issue the >>> prompt before the foo output
was printed.)
If you run commands asynchronously by using & in a command pipeline, then a
thread is spawned to run each such command asynchronously. Remember that thread
scheduling behaviour can be unexpected – things may not always run in the order
you expect. For example, the command line:
echo foo & echo bar & echo baz
should run all of the echo commands concurrently as far as possible,
but you can’t be sure of the exact sequence in which these commands complete –
it may vary from machine to machine and even from one run to the next. This has
nothing to do with sarge – there are no guarantees with just plain Bash,
either.
On Posix, subprocess uses os.fork() to create the child process,
and you may see dire warnings on the Internet about mixing threads, processes
and fork(). It is a heady mix, to be sure: you need to understand what’s
going on in order to avoid nasty surprises. If you run into any such, it may be
hard to get help because others can’t reproduce the problems. However, that’s
no reason to shy away from providing the functionality altogether. Such issues
do not occur on Windows, for example: because Windows doesn’t have a
fork() system call, child processes are created in a different way which
doesn’t give rise to the issues which sometimes crop up in a Posix environment.
For an exposition of the sort of things which might bite you if you are using
locks, threading and fork() on Posix, see this post.
If you want to interact with external programs from your Python applications,
Sarge is a library which is intended to make your life easier than using the
subprocess module in Python’s standard library.
Sarge is, of course, short for sergeant – and like any good non-commissioned
officer, sarge works to issue commands on your behalf and to inform you
about the results of running those commands.
The acronym lovers among you might be amused to learn that sarge can also
stand for “Subprocess Allegedly Rewards Good Encapsulation” :-)
Here’s a taster (example suggested by Kenneth Reitz’s Envoy documentation):
>>> from sarge import capture_stdout
>>> p = capture_stdout('fortune|cowthink')
>>> p.returncode
0
>>> p.commands
[Command('fortune'), Command('cowthink')]
>>> p.returncodes
[0, 0]
>>> print(p.stdout.text)
 ____________________________________
( The last thing one knows in        )
( constructing a work is what to put )
( first.                             )
(                                    )
( -- Blaise Pascal                   )
 ------------------------------------
        o   ^__^
         o  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
The capture_stdout() function is a convenient form of an underlying
function, run(). You can also use conditionals:
The conditional logic is being done by sarge and not the shell – which means
you can use the identical code on Windows. Here’s an example of some more
involved use of pipes, which also works identically on Posix and Windows:
The subprocess module in the standard library contains some very
powerful functionality. It encapsulates the nitty-gritty details of subprocess
creation and communication on Posix and Windows platforms, and presents the
application programmer with a uniform interface to the OS-level facilities.
However, subprocess does not do much more than this,
and is difficult to use in some scenarios. For example:
You want to use command pipelines, but using subprocess out of the box
often leads to deadlocks because pipe buffers get filled up.
You want to use bash-style pipe syntax on Windows,
but Windows shells don’t support some of the syntax you want to use,
like &&, ||, |& and so on.
You want to process output from commands in a flexible way,
and communicate() is not flexible enough for your
needs – for example, you need to process output a line at a time.
You want to avoid shell injection problems by
having the ability to quote your command arguments safely.
subprocess allows you to let stderr be the same as stdout,
but not the other way around – and you need to do that.
A simple run command which allows a rich subset of Bash-style shell
command syntax, but parsed and run by sarge so that you can run on Windows
without cygwin.
The ability to format shell commands with placeholders,
such that variables are quoted to prevent shell injection attacks:
>>> fromsargeimportshell_format>>> shell_format('ls {0}','*.py')"ls '*.py'">>> shell_format('cat {0}','a file name with spaces')"cat 'a file name with spaces'"
The ability to capture output streams without requiring you to program your
own threads. You just use a Capture object and then you can read
from it as and when you want:
Here, the sleep commands ensure that the asynchronous echo calls
occur in the order foo (no delay), baz (after a delay of one second)
and bar (after a delay of two seconds); the capturing works as expected.
Sarge is intended to be used on any Python version >= 2.6 and is tested on
Python versions 2.6, 2.7, 3.1, 3.2 and 3.3 on Linux, Windows, and Mac OS X (not
all versions are tested on all platforms, but all are expected to work correctly).
The project has reached alpha status in its development: there is a test
suite and it has been exercised on Windows, Ubuntu and Mac OS X. However,
because of the timing sensitivity of the functionality, testing needs to be
performed on as wide a range of hardware and platforms as possible.
The source repository for the project is on BitBucket:
You can leave feedback by raising a new issue on the issue
tracker
(BitBucket registration not necessary, but recommended).
Note
For testing under Windows, you need to install the GnuWin32
coreutils
package, and copy the relevant executables (currently libiconv2.dll,
libintl3.dll, cat.exe, echo.exe, tee.exe, false.exe,
true.exe, sleep.exe and touch.exe) to the directory from which
you run the test harness (test_sarge.py).
Although every attempt will be made to keep API changes to the absolute minimum,
it should be borne in mind that the software is in its very early stages. For
example, the asynchronous feature (where commands are run in separate threads
when you specify & in a command pipeline) can be considered experimental,
and there may be changes in this area. However, you aren’t forced to use this
feature, and sarge should be useful without it.