Executing Other (Non-Python)

You would have to type something like the following at the command line:

$ python /usr/local/lib/python2x/CGIHTTPServer.py
Serving HTTP on 0.0.0.0 port 8000 ...

That is a long line to type, and if it is a third-party module, you would have to dig into site-packages to find exactly where it is located, etc. Can we run a module from the command line without the full pathname and let Python’s import mechanism do the legwork for us?

The answer is yes. We can use the Python -c command-line switch:

$ python -c "import CGIHTTPServer; CGIHTTPServer.test()"

This option allows you to specify a Python statement you wish to run. So it does work, but the problem is that the module’s __name__ is not '__main__'; it is whatever module name you are using. (You can refer back to Section 3.4.1 for a review of __name__ if you need to.) The bottom line is that the interpreter has loaded your module by import and not as a script. Because of this, all of the code under if __name__ == '__main__' will not execute, so you have to do it manually, as we did above by calling the test() function of the module.

So what we really want is the best of both worlds—being able to execute a module in your library but as a script and not as an imported module. That is the main motivation behind the -m option. Now you can run a script like this:

$ python -m CGIHTTPServer

That is quite an improvement. Still, the feature was not as fully complete as some would have liked. So in Python 2.5, the -m switch was given even more capability. Starting with 2.5, you can use the same option to run modules inside packages or modules that need special loading, such as those inside ZIP files, a feature added in 2.3 (see Section 12.5.7 on page 396).

Python 2.4 only lets you execute standard library modules. So running special modules like PyChecker (Python’s “lint”), the debugger (pdb), or any of the profilers (note that these are modules that load and run other modules) was not solved with the initial -m solution but is fixed in 2.5.
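The difference between importing a module with -c and running it with -m can be seen in a small experiment. The following is only a sketch, and it is written for a current Python 3 interpreter rather than the 2.x releases discussed here. The module name demo_mod and the helper run_both_ways() are hypothetical and exist purely for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_both_ways():
    """Return what a tiny module sees as __name__ under -c (import) and -m (script)."""
    with tempfile.TemporaryDirectory() as tmp:
        # demo_mod is a made-up module that just reports its own __name__.
        with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
            f.write("print(__name__)\n")
        # -c: the module is imported, so __name__ is the module's own name.
        imported = subprocess.check_output(
            [sys.executable, "-c", "import demo_mod"], cwd=tmp)
        # -m: the import system locates the module, but it runs as a script.
        as_script = subprocess.check_output(
            [sys.executable, "-m", "demo_mod"], cwd=tmp)
    return imported.decode().strip(), as_script.decode().strip()


if __name__ == "__main__":
    print(run_both_ways())
```

The first value is the name seen when the module is imported; the second is the name seen when the same file is executed through -m.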

14.5 Executing Other (Non-Python) Programs

We can also execute non-Python programs from within Python. These include binary executables, other shell scripts, etc. All that is required is a valid execution environment, i.e., permissions for file access and execution must be granted, shell scripts must be able to access their interpreter (Perl, bash, etc.), binaries must be accessible (and be of the local machine’s architecture).

Finally, the programmer must bear in mind whether our Python script is required to communicate with the other program that is to be executed.

Some programs require input, others return output as well as an error code upon completion (or both). Depending on the circumstances, Python provides a variety of ways to execute non-Python programs. All of the functions discussed in this section can be found in the os module. We provide a summary for you in Table 14.6 (where appropriate, we annotate those that are available only for certain platforms) as an introduction to the remainder of this section.

Table 14.6 os Module Functions for External Program Execution (U = Unix only, W = Windows only)

os Module Function                  Description
system(cmd)                         Execute program cmd given as a string, wait for program completion, and return the exit code (on Windows, the exit code is always 0)
fork() (U)                          Create a child process that runs in parallel to the parent process [usually used with exec*()]; returns twice, once for the parent and once for the child
execl(file, arg0, arg1,...)         Execute file with argument list arg0, arg1, etc.
execv(file, arglist)                Same as execl() except with argument vector (list or tuple) arglist
execle(file, arg0, arg1,... env)    Same as execl() but also providing environment variable dictionary env
execve(file, arglist, env)          Same as execle() except with argument vector arglist
execlp(cmd, arg0, arg1,...)         Same as execl() but search for full file pathname of cmd in user search path
execvp(cmd, arglist)                Same as execlp() except with argument vector arglist

As we get closer to the operating system layer of software, you will notice that the consistency of executing programs, even Python scripts, across platforms starts to get a little dicey. We mentioned above that the functions described in this section are in the os module. Truth is, there are multiple os modules. For example, the one for Unix-based systems (i.e., Linux, MacOS X, Solaris, BSD, etc.) is the posix module. The one for Windows is nt (regardless of which version of Windows you are running; DOS users get the dos module), and the one for old MacOS is the mac module. Do not worry, Python will load the correct module when you call import os. You should never need to import a specific operating system module directly.

Table 14.6 os Module Functions for External Program Execution (U = Unix only, W = Windows only) (continued)

os Module Function                           Description
execlpe(cmd, arg0, arg1,... env)             Same as execlp() but also providing environment variable dictionary env
execvpe(cmd, arglist, env)                   Same as execvp() but also providing environment variable dictionary env
spawn*(mode, file, args[, env]) (note a)     spawn*() family executes path in a new process given args as arguments and possibly an environment variable dictionary env; mode is a magic number indicating various modes of operation
wait() (U)                                   Wait for child process to complete [usually used with fork() and exec*()]
waitpid(pid, options)                        Wait for specific child process to complete [usually used with fork() and exec*()]
popen(cmd, mode='r', buffering=-1)           Execute cmd string, returning a file-like object as a communication handle to the running program, defaulting to read mode and default system buffering
startfile(path) (note b) (W)                 Execute path with its associated application

a. spawn*() functions named similarly to exec*() (both families have eight members); spawnv() and spawnve() new in Python 1.5.2 and the other six spawn*() functions new in Python 1.6; also spawnlp(), spawnlpe(), spawnvp() and spawnvpe() are Unix-only.

b. New in Python 2.0.
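Because some of the functions in Table 14.6 exist only on certain platforms, a script can check what the loaded os module actually offers before relying on it. The following is a minimal sketch; the helper name platform_summary() is hypothetical.

```python
import os


def platform_summary():
    """Report which platform flavor of os was loaded and what it provides."""
    return {
        # 'posix' on Unix-like systems, 'nt' on Windows
        "os.name": os.name,
        # fork() is provided only on Unix-like systems
        "has_fork": hasattr(os, "fork"),
        # startfile() is provided only on Windows
        "has_startfile": hasattr(os, "startfile"),
    }


if __name__ == "__main__":
    print(platform_summary())
```

Testing with hasattr() rather than comparing os.name keeps the check tied to the specific function the code needs.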

Before we take a look at each of these module functions, we want to point out for those of you using Python 2.4 and newer, there is a subprocess module that pretty much can substitute for all of these functions. We will show you later on in this chapter how to use some of these functions, then at the end give the equivalent using the subprocess.Popen class and subprocess.call() function.

14.5.1 os.system()

The first function on our list is system(), a rather simplistic function that takes a system command as a string name and executes it. Python execution is suspended while the command is being executed. When execution has completed, the exit status will be given as the return value from system() and Python execution resumes.

system() preserves the current standard files, including standard output, meaning that executing any program or command displaying output will be passed on to standard output. Be cautious here because certain applications such as common gateway interface (CGI) programs will cause Web browser errors if output other than valid Hypertext Markup Language (HTML) strings are sent back to the client via standard output. system() is generally used with commands producing no output, some of which include programs to compress or convert files, mount disks to the system, or any other command to perform a specific task that indicates success or failure via its exit status rather than communicating via input and/or output. The convention adopted is an exit status of 0 indicating success and non-zero for some sort of failure.

For the purpose of providing an example, we will execute two commands that do have program output from the interactive interpreter so that you can observe how system() works.

>>> import os
>>> result = os.system('cat /etc/motd')
Have a lot of fun...
>>> result
0
>>> result = os.system('uname -a')
Linux solo 2.2.13 #1 Mon Nov 8 15:08:22 CET 1999 i586 unknown
>>> result
0

You will notice the output of both commands as well as the exit status of their execution, which we saved in the result variable. Here is an example executing a DOS command:

>>> import os
>>> result = os.system('dir')
 Volume in drive C has no label
 Volume Serial Number is 43D1-6C8A
 Directory of C:\WINDOWS\TEMP

.              <DIR>        01-08-98  8:39a .
..             <DIR>        01-08-98  8:39a ..
         0 file(s)              0 bytes
         2 dir(s)     572,588,032 bytes free
>>> result
0
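The examples above all succeed, so the return value is 0 either way. For a failing command, be aware that on Unix the value returned by system() is the status in the encoded form used by wait(), with the program's exit code in the high byte, so it usually needs decoding. The following is a sketch under that assumption, written for Python 3; the helper name run_and_get_exit_code() is hypothetical, and the commands simply re-run the Python interpreter itself.

```python
import os
import sys


def run_and_get_exit_code(command):
    """Run command through os.system() and return the program's exit code."""
    status = os.system(command)
    if os.name == "posix":
        # On Unix the raw value is a wait()-style status, not the exit code.
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        return None  # the command was terminated by a signal
    return status    # on Windows the value is used as-is


if __name__ == "__main__":
    failing = '"%s" -c "import sys; sys.exit(3)"' % sys.executable
    print(run_and_get_exit_code(failing))
```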

14.5.2 os.popen()

The popen() function is a combination of a file object and the system() function. It works in the same way as system() does, but in addition, it has the ability to establish a one-way connection to that program and then to access it like a file. If the program requires input, then you would call popen() with a mode of 'w' to “write” to that command. The data that you send to the program will then be received through its standard input.

Likewise, a mode of 'r' will allow you to spawn a command, then as it writes to standard output, you can read that through your file-like handle using the familiar read*() methods of a file object. And just like for files, you will be a good citizen and close() the connection when you are finished.

In one of the system() examples we used above, we called the Unix uname program to give us some information about the machine and operating system we are using. That command produced a line of output that went directly to the screen. If we wanted to read that string into a variable and perform internal manipulation or store that string to a log file, we could, using popen(). In fact, the code would look like the following:

>>> import os

>>> f = os.popen('uname -a')

>>> data = f.readline()

>>> f.close()

>>> print data,

Linux solo 2.2.13 #1 Mon Nov 8 15:08:22 CET 1999 i586 unknown

As you can see, popen() returns a file-like object; also notice that readline(), as always, preserves the NEWLINE character found at the end of a line of input text.
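Both directions of popen() can be wrapped in small helpers. This is a minimal sketch written for Python 3, where os.popen() is still available; the helper names read_output() and send_input() are hypothetical, and the commands used in the usage lines just re-run the Python interpreter.

```python
import os
import sys


def read_output(command):
    """Run command and return everything it writes to standard output."""
    f = os.popen(command)          # mode defaults to 'r'
    try:
        return f.read()
    finally:
        f.close()                  # always release the connection


def send_input(command, text):
    """Run command and feed text to its standard input."""
    f = os.popen(command, "w")
    f.write(text)
    return f.close()               # None when the command exits with status 0


if __name__ == "__main__":
    print(read_output('"%s" -c "print(6 * 7)"' % sys.executable))
    send_input('"%s" -c "import sys; sys.stdin.read()"' % sys.executable, "hello\n")
```

Note that the connection is one-way: a handle opened with 'r' cannot be written to, and one opened with 'w' cannot be read.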

14.5.3 os.fork(), os.exec*(), os.wait*()

Without a detailed introduction to operating systems theory, we present a light introduction to processes in this section. fork() takes your single executing flow of control known as a process and creates a “fork in the road,” if you will. The interesting thing is that your system takes both forks, meaning that you will have two concurrent and parallel running programs (running the same code no less because both processes resume at the next line of code immediately succeeding the fork() call).

The original process that called fork() is called the parent process, and the new process created as a result of the call is known as the child process. When fork() returns in the child process, its return value is always zero; when it returns in the parent process, its return value is always the process identifier (aka process ID, or PID) of the child process (so the parent can keep tabs on all its children). These return values are the only way to tell the two processes apart, too!

We mentioned that both processes will resume immediately after the call to fork(). Because the code is the same, we are looking at identical execution if no other action is taken at this time. This is usually not the intention. The main purpose for creating another process is to run another program, so we need to take divergent action as soon as parent and child return. As we stated above, the PIDs differ, so this is how we tell them apart.

The following snippet of code will look familiar to those who have experience managing processes. However, if you are new, it may be difficult to see how it works at first, but once you get it, you get it.

ret = os.fork()        # spawn 2 processes, both return
if ret == 0:           # child returns with PID of 0
    child_suite        # child code
else:                  # parent returns with child's PID
    parent_suite       # parent code

The call to fork() is made in the first line of code. Now both child and parent processes exist running simultaneously. The child process has its own copy of the virtual memory address space and contains an exact replica of the parent’s address space. Yes, both processes are nearly identical. Recall that fork() returns twice, meaning that both the parent and the child return. You might ask, how can you tell them apart if they both return? When the parent returns, it comes back with the PID of the child process. When the child returns, it has a return value of 0. This is how we can differentiate the two processes.
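To make the two return values visible, the child can report what it received from fork() back to the parent. The following is a Unix-only sketch written for Python 3; the helper name fork_return_values() and the use of a pipe to carry the child's value are illustrative choices, not part of the original example.

```python
import os


def fork_return_values():
    """Return (value fork() gave the child, value fork() gave the parent)."""
    read_end, write_end = os.pipe()
    ret = os.fork()
    if ret == 0:
        # Child: report the value we saw, then leave without running
        # any more of the parent's code.
        os.close(read_end)
        os.write(write_end, str(ret).encode())
        os._exit(0)
    # Parent: ret is the child's PID.
    os.close(write_end)
    child_value = os.read(read_end, 16).decode()
    os.close(read_end)
    os.waitpid(ret, 0)             # clean up after the finished child
    return child_value, ret


if __name__ == "__main__":
    print(fork_return_values())
```

The child calls os._exit() rather than returning so that only the parent continues past the if block.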

Using an if-else statement, we can direct code for the child to execute (i.e., the if clause) as well as the parent (the else clause). The code for the child is where we can make a call to any of the exec*() functions to run a completely different program or some function in the same program (as long as both child and parent take divergent paths of execution). The general convention is to let the children do all the dirty work while the parent either waits patiently for the child to complete its task or continues execution and checks later to see if the child finished properly.

All of the exec*() functions load a file or command and execute it with an argument list (either individually given or as part of an argument list). If applicable, an environment variable dictionary can be provided for the command. These variables are generally made available to programs to provide a more accurate description of the user’s current execution environment. Some of the more well-known variables include the user name, search path, current shell, terminal type, localized language, machine type, operating system name, etc.

All versions of exec*() will replace the Python interpreter running in the current (child) process with the given file as the program to execute now. Unlike system(), there is no return to Python (since Python was replaced). An exception will be raised if exec*() fails because the program cannot execute for some reason.
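Here is a sketch of how an environment dictionary and a failed exec*() might be handled together. It is Unix-only and written for Python 3. The helper name run_with_env(), the variable DEMO_CODE, and the exit code 127 used to signal a failed exec are all assumptions made for this illustration; waitpid(), used by the parent, is described later in this section.

```python
import os
import sys


def run_with_env(argv, extra_env):
    """Run argv in a child process with extra environment variables."""
    env = dict(os.environ)         # start from the current environment
    env.update(extra_env)
    pid = os.fork()
    if pid == 0:
        try:
            # Replaces the child's Python interpreter; does not return
            # on success.
            os.execvpe(argv[0], argv, env)
        except OSError:
            # exec failed (e.g., program not found); 127 is an arbitrary
            # marker chosen for this sketch.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    child = [sys.executable, "-c",
             "import os, sys; sys.exit(int(os.environ['DEMO_CODE']))"]
    print(run_with_env(child, {"DEMO_CODE": "9"}))
```

Note that argv[0] is passed both as the program to locate and as the first element of the argument list.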

The following code starts up a cute little game called “xbill” in the child process while the parent continues running the Python interpreter. Because the child process never returns, we do not have to worry about any code for the child after calling exec*(). Note that the command is also a required first argument of the argument list.

import os

ret = os.fork()
if ret == 0:                         # child code
    os.execvp('xbill', ['xbill'])
else:                                # parent code
    os.wait()

In this code, you also find a call to wait(). When child processes have completed, they need their parents to clean up after them. This task, known as “reaping a child,” can be accomplished with the wait*() functions. Immediately following a fork(), a parent can wait for the child to complete and do the clean-up then and there. A parent can also continue processing and reap the child later, also using one of the wait*() functions.

Regardless of which method a parent chooses, it must be performed. When a child has finished execution but has not been reaped yet, it enters a limbo state and becomes known as a zombie process. It is a good idea to minimize the number of zombie processes in your system because a child in this state still occupies an entry in the system’s process table (holding its PID and exit status), which is not released until the child has been reaped by the parent.

A call to wait() suspends execution (i.e., waits) until a child process (any child process) has completed, terminating either normally or via a signal. wait() will then reap the child, releasing any resources. If the child has already completed, then wait() just performs the reaping procedure.

waitpid() performs the same functionality as wait() but takes two additional arguments: pid, the process identifier of a specific child process to wait for, and options (normally zero or a set of optional flags logically OR’d together).
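The option of continuing to work and reaping the child later can be sketched with the os.WNOHANG flag, which makes waitpid() return immediately instead of blocking. This is a Unix-only illustration written for Python 3; the helper name poll_until_reaped() and the timing values are arbitrary choices for the example.

```python
import os
import time


def poll_until_reaped(child_seconds=0.5, poll_interval=0.05):
    """Fork a short-lived child and poll for its completion."""
    pid = os.fork()
    if pid == 0:
        time.sleep(child_seconds)  # child: pretend to do some work
        os._exit(0)
    polls = 0
    while True:
        # With WNOHANG, waitpid() returns (0, 0) while the child is
        # still running.
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            break                  # child finished and has been reaped
        polls += 1                 # parent is free to do other work here
        time.sleep(poll_interval)
    return polls, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(poll_until_reaped())
```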

14.5.4 os.spawn*()

The spawn*() family of functions is similar to fork() and exec*() in that they execute a command in a new process; however, you do not need to call two separate functions to create a new process and cause it to execute a command. You only need to make one call with the spawn*() family. With its simplicity, you give up the ability to “track” the execution of the parent and child processes; its model is more similar to that of starting a function in a thread. Another difference is that you have to know the magic mode parameter to pass to spawn*().

On some operating systems (especially embedded real-time operating systems [RTOSs]), spawn*() is much faster than fork(). (Those where this is not the case usually use copy-on-write tricks.) Refer to the Python Library Reference Manual for more details (see the Process Management section of the manual on the os module) on the spawn*() functions. Various members of the spawn*() family were added to Python between 1.5 and 1.6 (inclusive).
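Two commonly used values for the mode parameter are os.P_WAIT, which blocks until the new process finishes and returns its exit code, and os.P_NOWAIT, which returns at once. The sketch below is written for Python 3; the helper names are hypothetical, and the second helper assumes Unix, where the value returned under P_NOWAIT is a PID that can be passed to waitpid().

```python
import os
import sys


def child_argv(exit_code):
    """Build an argument list that re-runs Python and exits with exit_code."""
    return [sys.executable, "-c", "import sys; sys.exit(%d)" % exit_code]


def spawn_and_wait(exit_code):
    # P_WAIT: one call creates the process, runs the program, and waits.
    return os.spawnv(os.P_WAIT, sys.executable, child_argv(exit_code))


def spawn_in_background(exit_code):
    # P_NOWAIT: returns immediately; on Unix the result is the child's PID.
    pid = os.spawnv(os.P_NOWAIT, sys.executable, child_argv(exit_code))
    _, status = os.waitpid(pid, 0)   # the child must still be reaped
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(spawn_and_wait(4), spawn_in_background(6))
```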

14.5.5 subprocess Module

After Python 2.3 came out, work was begun on a module named popen5. The naming continued the tradition of all the previous popen*() functions that came before, but rather than continuing this ominous trend, the module was eventually named subprocess, with a class named Popen that has functionality to centralize most of the process-oriented functions we have discussed so far in this chapter. There is also a convenience function named call() that can easily slide into where os.system() lives. The subprocess module made its debut in Python 2.4. Below is an example of what it can do:

Replacing os.system()

Linux Example:

>>> from subprocess import call
>>> res = call(('cat', '/etc/motd'))
Linux starship 2.4.18-1-686 #4 Sat Nov 29 10:18:26 EST 2003 i686 GNU/Linux
>>> res
0

Win32 Example:

>>> res = call(('dir', r'c:\windows\temp'), shell=True)
 Volume in drive C has no label.
 Volume Serial Number is F4C9-1C38

 Directory of c:\windows\temp

03/11/2006  02:08 AM    <DIR>          .
03/11/2006  02:08 AM    <DIR>          ..
02/21/2006  08:45 PM               851 install.log
02/21/2006  07:02 PM               444 tmp.txt
               2 File(s)          1,295 bytes
               3 Dir(s)  55,001,104,384 bytes free

Replacing os.popen()

The syntax for creating an instance of Popen is only slightly more complex than calling the os.popen() function:

>>> from subprocess import Popen, PIPE

>>> f = Popen(('uname', '-a'), stdout=PIPE).stdout

>>> data = f.readline()

>>> f.close()

>>> print data,

Linux starship 2.4.18-1-686 #4 Sat Nov 29 10:18:26 EST 2003 i686 GNU/Linux
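The two replacements can be combined into small helpers. This is a minimal sketch written for Python 3, where the captured output is a bytes object; the helper names exit_code_of() and output_of() are hypothetical. It uses Popen's communicate() method, which reads all of the output and then waits for the child, in place of reading from the stdout attribute directly.

```python
import sys
from subprocess import PIPE, Popen, call


def exit_code_of(argv):
    """Like os.system(), but takes an argument list and returns the exit code."""
    return call(argv)


def output_of(argv):
    """Like os.popen(...).read(), but also reports the exit code."""
    proc = Popen(argv, stdout=PIPE)
    out, _ = proc.communicate()    # read everything, then wait for the child
    return proc.returncode, out


if __name__ == "__main__":
    print(exit_code_of([sys.executable, "-c", "import sys; sys.exit(3)"]))
    print(output_of([sys.executable, "-c", "print('hi')"]))
```

Because the command is given as a sequence of arguments rather than a single string, no shell is involved unless shell=True is requested.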
