Along with connecting to servers across the internet, Twisted can also connect to local processes with much the same API. The API is described in more detail in the documentation of:
twisted.internet.interfaces.IReactorProcess
twisted.internet.interfaces.IProcessTransport
twisted.internet.protocol.ProcessProtocol
Processes are run through the reactor, using reactor.spawnProcess(). Pipes are created to the child process and added to the reactor core so that the application will not block while sending data into or pulling data out of the new process. reactor.spawnProcess() requires two arguments, processProtocol and executable, and optionally takes six more: args, env, path, uid, gid, and usePTY.
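    import os
    from twisted.internet import reactor

    # executable, program, arg1, arg2, path, uid, gid and usePTY are placeholders
    processProtocol = MyProcessProtocol()
    reactor.spawnProcess(processProtocol, executable,
                         args=[program, arg1, arg2],
                         env={'HOME': os.environ['HOME']},
                         path=path, uid=uid, gid=gid, usePTY=usePTY)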
processProtocol should be an instance of a subclass of twisted.internet.protocol.ProcessProtocol. The interface is described below.
executable is the full path of the program to run. It will be connected to processProtocol.
args is a list of command line arguments to be passed to the process. args[0] should be the name of the process.
env is a dictionary containing the environment to pass through to the process.
path is the directory to run the process in. The child will switch to the given directory just before starting the new program. The default is to stay in the current directory.
uid and gid are the user ID and group ID to run the subprocess as. Of course, changing identities will be more likely to succeed if you start as root.
usePTY specifies whether the child process should be run with a pty, or if it should just get a pair of pipes. Interactive programs (where you don't know when they may read or write) need to be run with ptys.

args and env have empty default values, but many programs depend upon them to be set correctly. At the very least, args[0] should probably be the same as executable. If you just provide os.environ for env, the child program will inherit the environment from the current process, which is usually the civilized thing to do (unless you want to explicitly clean the environment as a security precaution).
reactor.spawnProcess() returns an instance that implements twisted.internet.interfaces.IProcessTransport.
The ProcessProtocol you pass to spawnProcess is your interaction with the process. It has a very similar signature to a regular Protocol, but it has several extra methods to deal with events specific to a process. In our example, we will interface with 'wc' to create a word count of user-given text. First, we'll start by importing the required modules, and writing the initialization for our ProcessProtocol.
    from twisted.internet import protocol

    class WCProcessProtocol(protocol.ProcessProtocol):

        def __init__(self, text):
            self.text = text
When the ProcessProtocol is connected to the process, its connectionMade method is called. In our protocol, we will write our text to the standard input of our process and then close standard input, to let the process know we are done writing to it.
        def connectionMade(self):
            self.transport.write(self.text)
            self.transport.closeStdin()
At this point, the process has received the data, and it's time for us to read the results. Instead of being received in dataReceived, data from standard output is received in outReceived. This is to distinguish it from data on standard error.
        def outReceived(self, data):
            # wc prints three equal-width fields: lines, words, characters
            fieldLength = len(data) // 3
            lines = int(data[:fieldLength])
            words = int(data[fieldLength:fieldLength*2])
            chars = int(data[fieldLength*2:])
            self.transport.loseConnection()
            self.receiveCounts(lines, words, chars)
Now the protocol has parsed the output and ended the connection to the process. It then sends the results on to the final method, receiveCounts. This is for users of the class to override, so as to do other things with the data. For our demonstration, we will just print the results.
        def receiveCounts(self, lines, words, chars):
            print('Received counts from wc.')
            print('Lines:', lines)
            print('Words:', words)
            print('Characters:', chars)
We're done! To use our WCProcessProtocol, we create an instance, and pass it to spawnProcess.
    from twisted.internet import reactor

    # the text is a byte string because it is written straight to the pipe
    wcProcess = WCProcessProtocol(b"accessing protocols through Twisted is fun!\n")
    reactor.spawnProcess(wcProcess, 'wc', ['wc'])
    reactor.run()
These are the methods that you can usefully override in your subclass of ProcessProtocol:

.connectionMade: This is called when the program is started, and makes a good place to write data into the stdin pipe (using self.transport.write()).
.outReceived(data): This is called with data that was received from the process' stdout pipe. Pipes tend to provide data in larger chunks than sockets (one kilobyte is a common buffer size), so you may not experience the random dribs and drabs behavior typical of network sockets, but regardless you should be prepared to deal if you don't get all your data in a single call. To do it properly, outReceived ought to simply accumulate the data and put off doing anything with it until the process has finished.
.errReceived(data): This is called with data from the process' stderr pipe. It behaves just like outReceived.
.inConnectionLost: This is called when the reactor notices that the process' stdin pipe has closed. Programs don't typically close their own stdin, so this will probably get called when your ProcessProtocol has shut down the write side with self.transport.loseConnection().
.outConnectionLost: This is called when the program closes its stdout pipe. This usually happens when the program terminates.
.errConnectionLost: Same as outConnectionLost, but for stderr instead of stdout.
.processEnded(status): This is called when the child process has been reaped, and receives information about the process' exit status. The status is passed in the form of a Failure instance, created with a .value that either holds a ProcessDone object if the process terminated normally (it died of natural causes instead of receiving a signal, and if the exit code was 0), or a ProcessTerminated object (with an .exitCode attribute) if something went wrong. This scheme may seem a bit weird, but I trust that it proves useful when dealing with exceptions that occur in asynchronous code. This will always be called after inConnectionLost, outConnectionLost, and errConnectionLost are called.
The base-class definitions of these functions are all no-ops. This will result in all stdout and stderr being thrown away. Note that it is important for data you don't care about to be thrown away: if the pipe were not read, the child process would eventually block as it tried to write to a full pipe.
The following are the basic ways to control the child process:

self.transport.write(data): Stuff some data in the stdin pipe. Note that this write method will queue any data that can't be written immediately. Writing will resume in the future when the pipe becomes writable again.
self.transport.closeStdin: Close the stdin pipe. Programs which act as filters (reading from stdin, modifying the data, writing to stdout) usually take this as a sign that they should finish their job and terminate. For these programs, it is important to close stdin when you're done with it, otherwise the child process will never quit.
self.transport.closeStdout: Not usually called, since you're putting the process into a state where any attempt to write to stdout will cause a SIGPIPE error. This isn't a nice thing to do to the poor process.
self.transport.closeStderr: Not usually called, same reason as closeStdout.
self.transport.loseConnection: Close all three pipes.
os.kill(self.transport.pid, signal.SIGKILL): Kill the child process. This will eventually result in processEnded being called.

Here is an example that is rather verbose about exactly when all the methods are called. It writes a number of lines into the wc program and then parses the output.
The exact output of this program depends upon the relative timing of some un-synchronized events. In particular, the program may observe the child process close its stderr pipe before or after it reads data from the stdout pipe. One possible transcript would look like this:
    % ./process.py
    connectionMade!
    inConnectionLost! stdin is closed! (we probably did it)
    errConnectionLost! The child closed their stderr.
    outReceived! with 24 bytes!
    outConnectionLost! The child closed their stdout!
    I saw 40 lines
    processEnded, status 0
    quitting
    Main loop terminated.
    %
Frequently, one just needs a simple way to get all the output from a program. For those cases, the twisted.internet.utils.getProcessOutput function can be used. Here is a simple example:
If you need to get just the final exit code, the twisted.internet.utils.getProcessValue function is useful. Here is an example: