API Reference

This section describes the functions and classes in sarge's public API.

Attributes

default_capture_timeout

This is the default timeout which will be used by Capture instances when you don’t specify one in the Capture constructor. This is currently set to 0.02 seconds.

Functions

run(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which constructs a Pipeline instance from the passed parameters, and then invokes run() and close() on that instance.

Parameters:
  • command (str) – The command(s) to run.
  • input (Text, bytes or a file-like object containing bytes (not text).) – Input data to be passed to the command(s). If text is passed, it’s converted to bytes using the default encoding. The bytes are converted to a file-like object (a BytesIO instance). If a value such as a file-like object, integer file descriptor or special value like subprocess.PIPE is passed, it is passed through unchanged to subprocess.Popen.
  • kwargs – Any keyword parameters which you might want to pass to the wrapped Pipeline instance. Apart from the input and async_ keyword arguments described above, other keyword arguments are passed to the wrapped Pipeline instance, and thence to subprocess.Popen via a Command instance. Note that the env kwarg is treated differently to how it is in Popen: it is treated as a set of additional environment variables to be added to the values in os.environ, unless replace_env is also specified as true, in which case the value of env becomes the entire child process environment.

    Under Windows, you might find it useful to pass the keyword argument posix=True, which will cause command to be parsed using POSIX conventions. This makes it easier to pass parameters with spaces in them.

Returns:

The created Pipeline instance.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.
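The treatment of env and replace_env described above can be sketched with the standard library alone. The helper name build_child_env and its base parameter are hypothetical and not part of sarge; the sketch only illustrates the documented rule and is not sarge's implementation.

```python
import os


def build_child_env(env=None, replace_env=False, base=None):
    """Return the environment a child would see under the rule described above.

    Hypothetical helper for illustration only; not sarge's code.
    """
    # Start from a copy of the parent environment (os.environ by default).
    base = dict(os.environ) if base is None else dict(base)
    if env is None:
        return base
    if replace_env:
        # env becomes the entire child process environment.
        return dict(env)
    # Otherwise env only adds to (or overrides) the inherited values.
    base.update(env)
    return base


parent = {"PATH": "/usr/bin", "HOME": "/home/user"}
merged = build_child_env({"FOO": "bar"}, base=parent)
replaced = build_child_env({"FOO": "bar"}, replace_env=True, base=parent)
```

Here merged contains the parent's variables plus FOO, while replaced contains FOO only.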

capture_stdout(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as run() while capturing the stdout of the subprocess(es). This captured output is available through the stdout attribute of the return value from this function.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

As for run().

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

get_stdout(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as capture_stdout() but also returns the text captured. Use this when you know the output is not voluminous, so it doesn’t matter that it’s buffered in memory.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

The captured text.

New in version 0.1.1.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

capture_stderr(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as run() while capturing the stderr of the subprocess(es). This captured output is available through the stderr attribute of the return value from this function.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

As for run().

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

get_stderr(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as capture_stderr() but also returns the text captured. Use this when you know the output is not voluminous, so it doesn’t matter that it’s buffered in memory.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

The captured text.

New in version 0.1.1.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

capture_both(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as run() while capturing the stdout and the stderr of the subprocess(es). This captured output is available through the stdout and stderr attributes of the return value from this function.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

As for run().

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

get_both(command, input=None, async_=False, **kwargs)

This function is a convenience wrapper which does the same as capture_both() but also returns the text captured. Use this when you know the output is not voluminous, so it doesn’t matter that it’s buffered in memory.

Parameters:
  • command – As for run().
  • input – As for run().
  • kwargs – As for run().
Returns:

The captured text as a 2-element tuple, with the stdout text in the first element and the stderr text in the second.

New in version 0.1.1.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

shell_quote(s)

Quote text so that it is safe for POSIX command shells.

For example, “*.py” would be converted to “’*.py’”. If the text is considered safe it is returned unquoted.

Parameters:
  • s (str, or unicode on 2.x) – The value to quote.
Returns:

A safe version of the input, from the point of view of POSIX command shells.
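
For comparison, the standard library's shlex.quote follows the same general idea of quoting only when needed. The sketch below uses shlex.quote rather than shell_quote, so treat it as an illustration of the behaviour and not as a statement about sarge's exact output.

```python
import shlex

# A glob character is unsafe for a POSIX shell, so the value gets quoted.
quoted = shlex.quote("*.py")

# Letters, digits and a few punctuation characters are safe and pass through.
unquoted = shlex.quote("setup.py")

# An embedded single quote is escaped so the result is still one shell word.
tricky = shlex.quote("it's")
round_trip = shlex.split(tricky)
```

Splitting the quoted value with shlex.split gives back the original text, which is a convenient way to check that quoting preserved it.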

shell_format(fmt, *args, **kwargs)

Format a shell command with format placeholders and variables to fill those placeholders.

Note: you must specify positional parameters explicitly, i.e. as {0}, {1} instead of {}, {}. Requiring the formatter to maintain its own counter can lead to thread safety issues unless a thread local is used to maintain the counter. It’s not that hard to specify the values explicitly yourself :-)

Parameters:
  • fmt (str, or unicode on 2.x) – The shell command as a format string. Note that you will need to double up braces you want in the result, i.e. { -> {{ and } -> }}, due to the way str.format() works.
  • args – Positional arguments for use with fmt.
  • kwargs – Keyword arguments for use with fmt.
Returns:

The formatted shell command, which should be safe for use in shells from the point of view of shell injection.

Return type:

The type of fmt.
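
The two notes above (explicit positional fields and doubled braces) can be illustrated with str.format and shlex.quote. The function format_shell is a hypothetical stand-in written for this example; sarge's shell_format may differ in its details.

```python
import shlex


def format_shell(fmt, *args, **kwargs):
    """Fill explicitly numbered or named placeholders with shell-quoted values.

    Hypothetical stand-in for illustration; not sarge's implementation.
    """
    quoted_args = [shlex.quote(str(a)) for a in args]
    quoted_kwargs = {k: shlex.quote(str(v)) for k, v in kwargs.items()}
    return fmt.format(*quoted_args, **quoted_kwargs)


# Explicit positional fields: {0} and {1} rather than {} and {}.
cmd1 = format_shell("cp {0} {1}", "my file.txt", "/tmp/backup")

# Literal braces wanted in the result are doubled in the format string.
cmd2 = format_shell("awk '{{print $1}}' {name}", name="data; echo oops")
```

In cmd2 the value containing a semicolon ends up as a single quoted word, so the shell would not treat it as a second command.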

Classes

class Command(args, **kwargs)

This represents a single command to be spawned as a subprocess.

Parameters:
  • args (str if shell=True, or an array of str) – The command to run.
  • kwargs – Any keyword parameters you might pass to Popen, other than stdin (for that, see the input argument of run()).

Attributes

process

The subprocess.Popen instance for the subprocess, once it has been created. It is initially None.

returncode

The subprocess returncode, when that is available. It is initially None.

exception

Any exception which occurred when trying to create the subprocess. Note that once a subprocess has been created, any exceptions in the subprocess can only be communicated via the returncode - this value is only for exceptions during subprocess creation.

New in version 0.1.8.

Methods

run(input=None, async_=False)

Run the command.

Parameters:
  • input (Text, bytes or a file-like object containing bytes.) – Input data to be passed to the command. If text is passed, it’s converted to bytes using the default encoding. The bytes are converted to a file-like object (a BytesIO instance). The contents of the file-like object are written to the stdin stream of the sub-process.
  • async_ (bool) – If True, the command is run asynchronously – that is to say, wait() is not called on the underlying Popen instance.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

wait(timeout=None)

Wait for the command’s underlying sub-process to complete, with a specified timeout. If the timeout is reached, a subprocess.TimeoutExpired exception is raised. The timeout is ignored in versions of Python < 3.3.

Changed in version 0.1.6: The timeout parameter was added.

terminate()

Terminate the command’s underlying sub-process by calling subprocess.Popen.terminate() on it.

New in version 0.1.1.

kill()

Kill the command’s underlying sub-process by calling subprocess.Popen.kill() on it.

New in version 0.1.1.

poll()

Poll the command’s underlying sub-process by calling subprocess.Popen.poll() on it. Returns the result of that call.

New in version 0.1.1.

class Pipeline(source, posix=True, **kwargs)

This represents a set of commands which need to be run as a unit.

Parameters:
  • source (str) – The source text with the command(s) to run.
  • posix (bool) – Whether the source will be parsed using POSIX conventions.
  • kwargs – Any keyword parameters you would pass to subprocess.Popen, other than stdin (for which, you need to use the input parameter of the run() method instead). You can pass Capture instances for stdout and stderr keyword arguments, which will cause those streams to be captured to those instances.

Attributes

returncodes

A list of the return codes of all sub-processes which were actually run. This will internally poll the commands in the pipeline to find the latest known return codes.

returncode

The return code of the last sub-process which was actually run.

commands

The Command instances which were actually created.

exceptions

A list of any exceptions raised while creating subprocesses. This should be of use in diagnosing problems with commands (e.g. typos, or executables correctly spelled but not found on the system path).

New in version 0.1.8.

Methods

run(input=None, async_=False)

Run the pipeline.

Parameters:
  • input – The same as for the Command.run() method.
  • async_ – The same as for the Command.run() method. Note that parts of the pipeline may specify synchronous or asynchronous running – this flag refers to the pipeline as a whole.

Changed in version 0.1.5: The async keyword parameter was changed to async_, as async is a keyword in Python 3.7 and later.

wait(timeout=None)

Wait for all command sub-processes to finish, with an optional timeout. If the timeout is reached, a subprocess.TimeoutExpired exception is raised. The timeout is ignored in versions of Python < 3.3. If applied, it is applied to each of the pipeline’s commands in turn, which means that the effective timeout might be cumulative.

Changed in version 0.1.6: The timeout parameter was added.
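
The cumulative effect of a per-command timeout can be sketched with plain subprocess.Popen objects. The helper wait_each is hypothetical and is not sarge's code; it only shows why waiting on each process in turn with the same timeout can take up to the number of processes multiplied by that timeout.

```python
import subprocess
import sys


def wait_each(procs, timeout=None):
    """Wait on each process in turn, giving every wait() the same timeout.

    In the worst case the total time spent is len(procs) * timeout.
    Hypothetical helper for illustration only.
    """
    for proc in procs:
        proc.wait(timeout)  # raises subprocess.TimeoutExpired on expiry
    return [proc.returncode for proc in procs]


# Two short-lived children, started via the current interpreter for portability.
procs = [
    subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(0)"]),
    subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"]),
]
codes = wait_each(procs, timeout=30)

# A child that outlives its timeout triggers TimeoutExpired.
slow = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(5)"])
try:
    slow.wait(timeout=0.1)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
finally:
    slow.kill()  # clean up the still-running child
    slow.wait()
```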

poll_last()

Check if the last command in the pipeline has terminated, and return its exit code, if available, or else None. Note that poll_all() should be called when you want to ensure that all commands in a pipeline have completed.

poll_all()

Check if all commands to run have terminated. Return a list of exit codes, where available, or None values where return codes are not yet available.

close()

Wait for all command sub-processes to finish, and close all opened streams.

class Capture(timeout=None, buffer_size=0)

A class which allows an output stream from a sub-process to be captured.

Parameters:
  • timeout (float) – The default timeout, in seconds. Note that you can override this in particular calls to read input. If None is specified, the value of the module attribute default_capture_timeout is used instead.
  • buffer_size (int) – The buffer size to use when reading from the underlying streams. If not specified or specified as zero, a 4K buffer is used. For interactive applications, use a value of 1.

Methods

read(size=-1, block=True, timeout=None)

Like the read method of any file-like object.

Parameters:
  • size (int) – The number of bytes to read. If not specified, the intent is to read the stream until it is exhausted.
  • block (bool) – Whether to block waiting for input to be available.
  • timeout (float) – How long to wait for input. If None, use the default timeout that this instance was initialised with. If the result is None, wait indefinitely.

readline(size=-1, block=True, timeout=None)

Like the readline method of any file-like object.

Parameters:
  • size – As for the read() method.
  • block – As for the read() method.
  • timeout – As for the read() method.

readlines(sizehint=-1, block=True, timeout=None)

Like the readlines method of any file-like object.

Parameters:
  • sizehint – As for the read() method’s size.
  • block – As for the read() method.
  • timeout – As for the read() method.

expect(string_or_pattern, timeout=None)

This looks for a pattern in the captured output stream. If found, it returns immediately; otherwise, it will block until the timeout expires, waiting for a match as bytes from the captured stream continue to be read.

Parameters:
  • string_or_pattern – A string or pattern representing a regular expression to match. Note that this needs to be a bytestring pattern if you pass a pattern in; if you pass in text, it is converted to bytes using the utf-8 codec and then to a pattern used for matching (using search). If you pass in a pattern, you may want to ensure that its flags include re.MULTILINE so that you can make use of ^ and $ in matching line boundaries. Note that on Windows, you may need to use \r?$ to match ends of lines, as $ matches Unix newlines (LF) and not Windows newlines (CRLF).
  • timeout – If not specified, the module’s default_expect_timeout is used.
Returns:

A regular expression match instance, if a match was found within the specified timeout, or None if no match was found.
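
The pattern advice above can be tried out with the re module on a bytes buffer, without involving a subprocess. The helper as_bytes_pattern is hypothetical; it mirrors the text-to-bytes conversion described for string_or_pattern and is not sarge's code.

```python
import re


def as_bytes_pattern(string_or_pattern):
    """Convert text to a compiled bytes pattern; pass compiled patterns through.

    Hypothetical helper for illustration only.
    """
    if isinstance(string_or_pattern, str):
        string_or_pattern = string_or_pattern.encode("utf-8")
    if isinstance(string_or_pattern, bytes):
        return re.compile(string_or_pattern)
    return string_or_pattern


# Captured output with Windows (CRLF) line endings.
captured = b"line one\r\nline two\r\nline three\r\n"

unix_style = re.compile(rb"^line two$", re.MULTILINE)
windows_safe = re.compile(rb"^line two\r?$", re.MULTILINE)

no_match = unix_style.search(captured)    # $ does not match before \r
match = windows_safe.search(captured)     # \r? absorbs the carriage return
text_match = as_bytes_pattern("line t[a-z]+").search(captured)
```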

close(stop_threads=False)

Close the capture object. By default, this waits for the threads which read the captured streams to terminate (which may not happen unless the child process is killed, and the streams read to exhaustion). To ensure that the threads are stopped immediately, specify True for the stop_threads parameter, which asks the threads to terminate immediately. This may lead to losing data from the captured streams which has not yet been read.

class Popen

This is a subclass of subprocess.Popen which is provided mainly to allow a process’ stdout to be mapped to its stderr. The standard library version allows you to specify stderr=STDOUT to indicate that the standard error stream of the sub-process be the same as its standard output stream. However, there’s no facility in the standard library to do stdout=STDERR – but it is provided in this subclass.

In fact, the two streams can be swapped by doing stdout=STDERR, stderr=STDOUT in a call. The STDERR value is defined in sarge as an integer constant which is understood by sarge (much as STDOUT is an integer constant which is understood by subprocess).
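
As a reminder of the standard library half of this, the sketch below shows stderr=STDOUT with subprocess alone. It does not use sarge's Popen subclass or its STDERR constant, so it illustrates only the direction that the standard library already supports.

```python
import subprocess
import sys

# The child writes only to its standard error stream.
child = [sys.executable, "-c", "import sys; sys.stderr.write('oops')"]

# stderr=STDOUT sends the child's stderr wherever its stdout goes (here, a pipe).
result = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
```

The text written to stderr arrives on result.stdout, and result.stderr is None because no separate pipe was created for it.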

Shell syntax understood by sarge

Shell commands are parsed by sarge using a simple parser.

Command syntax

The sarge parser looks for commands which are separated by ; and &:

echo foo; echo bar & echo baz

which means to run echo foo, wait for its completion, and then run echo bar and then echo baz without waiting for echo bar to complete.

The commands which are separated by & and ; are conditional commands, of the form:

a && b

or:

c || d

Here, command b is executed only if a returns success (i.e. a return code of 0), whereas d is only executed if c returns failure, i.e. a return code other than 0. Of course, in practice all of a, b, c and d could have arguments, not shown above for simplicity’s sake.
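
The conditional rules above can be expressed directly in terms of return codes. The helpers py, and_then and or_else below are hypothetical names used for illustration; they are not how sarge runs conditionals, and they use the current Python interpreter as a portable stand-in for real commands.

```python
import subprocess
import sys


def py(code):
    """Build an argv that runs a Python snippet (stand-in for a real command)."""
    return [sys.executable, "-c", code]


def and_then(a, b):
    """Semantics of `a && b`: run b only if a returns 0."""
    codes = [subprocess.call(a)]
    if codes[0] == 0:
        codes.append(subprocess.call(b))
    return codes  # return codes of the commands that actually ran


def or_else(c, d):
    """Semantics of `c || d`: run d only if c returns non-zero."""
    codes = [subprocess.call(c)]
    if codes[0] != 0:
        codes.append(subprocess.call(d))
    return codes


ok = py("raise SystemExit(0)")
fail = py("raise SystemExit(1)")

and_ok = and_then(ok, ok)      # both commands run
and_fail = and_then(fail, ok)  # second command is skipped
or_ok = or_else(ok, ok)        # second command is skipped
or_fail = or_else(fail, ok)    # both commands run
```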

Each operand on either side of && or || could also consist of a pipeline – a set of commands connected such that the output streams of one feed into the input stream of another. For example:

echo foo | cat

or:

command-a |& command-b

where the use of | indicates that the standard output of echo foo is piped to the input of cat, whereas the use of |& indicates that the standard error of command-a is piped to the input of command-b.
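
For readers who want to see what a | connection amounts to, here is a sketch of a two-stage pipeline built by hand with subprocess. It is a standard library analogue and not sarge's implementation; the two Python one-liners are stand-ins for real commands such as echo and cat.

```python
import subprocess
import sys

# First stage: writes "foo" to its standard output.
producer = subprocess.Popen(
    [sys.executable, "-c", "print('foo')"],
    stdout=subprocess.PIPE,
)

# Second stage: reads its standard input from the first stage's output.
consumer = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)

# Close our copy of the pipe so the producer notices if the consumer exits early.
producer.stdout.close()
output, _ = consumer.communicate()
producer.wait()
```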

Redirections

The sarge parser also understands redirections such as are shown in the following examples:

command arg-1 arg-2 > stdout.txt
command arg-1 arg-2 2> stderr.txt
command arg-1 arg-2 2>&1
command arg-1 arg-2 >&2

In general, file descriptors other than 1 and 2 are not allowed, as the functionality needed to provide them (dup2) is not properly supported on Windows. However, an esoteric special case is recognised:

echo foo | tee stdout.log 3>&1 1>&2 2>&3 | tee stderr.log > /dev/null

This redirection construct will put foo in both stdout.log and stderr.log. The effect of this construct is to swap the standard output and standard error streams, using file descriptor 3 as a temporary as in the code analogue for swapping variables a and b using temporary variable c:

c = a
a = b
b = c

This is recognised by sarge and used to swap the two streams, though it doesn’t literally use file descriptor 3, instead using a cross-platform mechanism to fulfill the requirement.

You can see this post for a longer explanation of this somewhat esoteric usage of redirection.

Next steps

You might find it helpful to look at the mailing list.