In Chapter 2.3 we demonstrate simple manual parsing of command-line arguments. However, the recommended way to handle command-line arguments is to use standardized rules for specifying the arguments and standardized modules for parsing them. As soon as you start using Python’s getopt and optparse modules for parsing the command line, you will probably never write manual parsing code again. The basic usage of these modules is explained right after a short introduction to different kinds of command-line options.
Short and Long Options. Originally, Unix tools used only short options, like -h and -d. Later, GNU software also supported long options, like --help and --directory, which are easier to understand but also require more typing.
The GNU standard is to use a double hyphen in long options, but there are many programs that use long options with only a single hyphen, as in -help and -directory. Software with command-line interfaces often supports both short options with a single hyphen and corresponding long options with a double hyphen.
An option can be followed by a value or not. For example, -d src assigns the value src to the -d option, whereas -h (for a help message) is an option without any associated value. Long options with values can take the form --directory src or --directory=src. Long options can also be abbreviated, e.g., --dir src is sufficient if --dir matches one and only one long option.
Short options can be combined, and there is no need for a space between the option and the value. For example, -hdsrc is the same as -h -d src.
The Getopt Module. Python’s getopt module has a function getopt for parsing the command line. A typical use is
import getopt, sys
options, args = getopt.getopt(sys.argv[1:],
    'hd:i', ['help', 'directory=', 'confirm'])
The first argument is the list of command-line arguments to be parsed, normally sys.argv[1:], i.e., everything on the command line except the name of the script.
Short options are specified in the second function parameter by listing the letters in all short options. The colon signifies that -d takes an argument.
Long options are collected in a list, and the options that take an argument have an equal sign (=) appended to the option name.
A 2-tuple (options, args) is returned, where options is a list of the encountered option-value pairs, e.g.,
[('-d', 'mydir/sub'), ('--confirm', '')]
The args variable holds all the command-line arguments that were not recognized as proper options. An unregistered option leads to an exception of type getopt.GetoptError.
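As a minimal sketch of the call above, the snippet below parses hard-coded argument lists instead of sys.argv[1:] and shows how an unregistered option surfaces as getopt.GetoptError. The wrapper function try_parse, the argument lists, and the --verbose option are made up for illustration.

```python
import getopt

def try_parse(argv):
    """Parse argv with the option specification used in the text."""
    try:
        return getopt.getopt(argv, 'hd:i', ['help', 'directory=', 'confirm'])
    except getopt.GetoptError as e:
        # unregistered option, or a missing value for -d/--directory
        print('bad command line: %s' % e)
        return None

# registered options only: a tuple (options, args) is returned
result = try_parse(['-d', 'mydir/sub', '--confirm', 'file1.c'])

# --verbose is not registered: GetoptError is caught, None is returned
failed = try_parse(['--verbose', 'file1.c'])
```

Catching the exception in one place makes it easy to print a usage message and stop, rather than letting the script die with a traceback.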
A typical way of extracting information from the options list is illustrated next:
for option, value in options:
    if option in ('-h', '--help'):
        print(usage); sys.exit(0)  # 0: this exit is no error
    elif option in ('-d', '--directory'):
        directory = value
    elif option in ('-i', '--confirm'):
        confirm = True
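The loop can be wrapped in a function to make it easy to try out. The following is only a sketch: the function name parse_cmdline, the usage string with the script name move.py, and the default values are assumptions made for illustration and are not taken from the script discussed in the text.

```python
import getopt, sys

# hypothetical usage message for a hypothetical script name
usage = 'Usage: move.py [-h] [-i] [-d directory] file1 file2 ...'

def parse_cmdline(argv):
    """Return (directory, confirm, filenames) extracted from argv."""
    directory = None   # assumed default: no destination given
    confirm = False    # assumed default: do not ask for confirmation
    options, args = getopt.getopt(
        argv, 'hd:i', ['help', 'directory=', 'confirm'])
    for option, value in options:
        if option in ('-h', '--help'):
            print(usage); sys.exit(0)
        elif option in ('-d', '--directory'):
            directory = value
        elif option in ('-i', '--confirm'):
            confirm = True
    return directory, confirm, args

# an explicit list stands in for sys.argv[1:]
directory, confirm, files = parse_cmdline(
    ['-i', '--directory=/tmp', 'src1.c', 'src2.c'])
```

Giving every setting a default before the loop ensures that the variables exist even when the corresponding option is absent from the command line.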
Suppose we have a script for moving files to a destination directory. The script takes the options as defined in the getopt.getopt call above. The rest of the arguments on the command line are taken to be filenames. Let us exemplify various ways of setting this script’s options. With the command-line arguments
-hid /tmp src1.c src2.c src3.c
we get the options and args lists as
[('-h', ''), ('-i', ''), ('-d', '/tmp')]
['src1.c', 'src2.c', 'src3.c']
Equivalent sets of command-line arguments are
--help -d /tmp --confirm src1.c src2.c src3.c
--help --directory /tmp --confirm src1.c src2.c src3.c
--help --directory=/tmp --confirm src1.c src2.c src3.c
The last line implies an options list
[('--help', ''), ('--directory', '/tmp'), ('--confirm', '')]
It is also possible to specify only a subset of the options:
-i file1.c
This results in options as [('-i', '')] and args as ['file1.c'].
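The equivalence of the different spellings can be checked directly. The sketch below feeds several hard-coded command lines to getopt.getopt and normalizes each result to a dictionary. The helper name settings and the table canonical are illustrative assumptions; the last two command lines exercise a value glued to a short option and unique abbreviations of long options.

```python
import getopt

# map every spelling of an option to one canonical name (illustrative)
canonical = {'-h': 'help', '--help': 'help',
             '-d': 'directory', '--directory': 'directory',
             '-i': 'confirm', '--confirm': 'confirm'}

def settings(argv):
    """Return ({canonical option name: value}, remaining arguments)."""
    options, args = getopt.getopt(
        argv, 'hd:i', ['help', 'directory=', 'confirm'])
    return dict([(canonical[o], v) for o, v in options]), args

variants = [
    '-hid /tmp src1.c src2.c src3.c',
    '--help -d /tmp --confirm src1.c src2.c src3.c',
    '--help --directory=/tmp --confirm src1.c src2.c src3.c',
    '-hd/tmp -i src1.c src2.c src3.c',              # value glued to -d
    '--he --dir /tmp --conf src1.c src2.c src3.c',  # unique abbreviations
]
results = [settings(v.split()) for v in variants]
```

All entries in results compare equal, since getopt reports an abbreviated long option under its full registered name.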
The Optparse Module. The optparse module is a more flexible and advanced option parser than getopt. The usage is well described in the Python Library Reference. The previous example can easily be coded using optparse:
import sys
from optparse import OptionParser
parser = OptionParser()
# help message is automatically provided
parser.add_option('-d', '--directory', dest='directory',
                  help='destination directory')
parser.add_option('-i', '--confirm', dest='confirm',
                  action='store_true', default=False,
                  help='confirm each move')
options, args = parser.parse_args(sys.argv[1:])
Each option is registered by add_option, which takes the short and long option as the first two arguments, followed by a number of optional keyword arguments.
The dest keyword is used to specify a destination, i.e., an attribute in the object options returned from parse_args. In our example, options.directory will contain '/tmp' if we have --directory /tmp or -d /tmp on the command line. The help keyword is used to provide a help message. This message is written to standard output together with the corresponding option if we have the flag -h or option --help on the command line. This means that the help functionality is a built-in feature of optparse, so we do not need to explicitly register a help option as we did when using getopt. The option -i or --confirm does not take an associated value and acts as a boolean parameter. This is specified by the action='store_true' argument. When -i or --confirm is encountered, options.confirm is set to True. Its default value is False, as specified by the default keyword.
Providing -h or --help on the command line of our demo script triggers the help message
options:
  -h, --help            show this help message and exit
  -dDIRECTORY, --directory=DIRECTORY
                        destination directory
  -i, --confirm         confirm each move
The command-line arguments
--directory /tmp src1.c src2.c src3.c
result in args as ['src1.c', 'src2.c', 'src3.c'], options.directory equals '/tmp', and options.confirm equals False.
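These values can be reproduced without a command line by handing parse_args an explicit list. The sketch below repeats the option registration so that it runs on its own; the argument lists are made up for illustration. It also shows two details that are easy to overlook: an option flag may appear between the filenames, and an option without a default keyword ends up as None when it is absent.

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option('-d', '--directory', dest='directory',
                  help='destination directory')
parser.add_option('-i', '--confirm', dest='confirm',
                  action='store_true', default=False,
                  help='confirm each move')

# an explicit list stands in for sys.argv[1:]
options, args = parser.parse_args(
    ['--directory', '/tmp', 'src1.c', 'src2.c', 'src3.c'])
print('%s %s %s' % (options.directory, options.confirm, args))

# short forms, with the -i flag placed between the filenames
options2, args2 = parser.parse_args(['-d', '/tmp', 'src1.c', '-i', 'src2.c'])

# no options at all: directory has no default and is therefore None
options3, args3 = parser.parse_args([])
```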
The script src/py/examples/cmlparsing.py contains the examples above in a running script.
Both optparse and getopt can be used with only short or only long options: simply supply empty objects (e.g., an empty string or an empty list) for the undesired option type.
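Restricting the example to one option type might look as follows. This is a sketch with hard-coded argument lists; for optparse it simply registers a single option string per option, which is an alternative to passing an empty object.

```python
import getopt
from optparse import OptionParser

argv = ['--directory', '/tmp', '--confirm', 'src1.c']

# getopt: an empty string means that no short options are registered
options, args = getopt.getopt(argv, '', ['help', 'directory=', 'confirm'])

# optparse: register only the long form of each option
parser = OptionParser()
parser.add_option('--directory', dest='directory')
parser.add_option('--confirm', dest='confirm',
                  action='store_true', default=False)
opt, rest = parser.parse_args(argv)

# getopt with only short options: pass an empty list of long options
short_only, files = getopt.getopt(['-i', '-d', '/tmp', 'src1.c'], 'hd:i', [])
```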
Remark. The getopt and optparse modules raise an exception if an unregistered option is met. This is inconvenient if different parts of a program handle different parts of the command-line arguments, because each parsing call will then specify and process only a subset of the possible options on the command line.
With optparse we may subclass OptionParser and reimplement the function error as an empty function:
class OptionParserNoError(OptionParser):
    def error(self, msg):
        return
The new class OptionParserNoError will not complain if it encounters unregistered options.
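A small experiment illustrates the effect. In this sketch the parser registers only -d/--directory, while the hard-coded argument list also contains --verbose, an option name made up for illustration.

```python
from optparse import OptionParser

class OptionParserNoError(OptionParser):
    def error(self, msg):
        return   # ignore the problem instead of printing it and exiting

# this part of the program only knows about -d/--directory
parser = OptionParserNoError()
parser.add_option('-d', '--directory', dest='directory')

# --verbose is not registered here (a made-up option for illustration)
options, args = parser.parse_args(['-d', '/tmp', '--verbose', 'src1.c'])
print(options.directory)
```

With the standard OptionParser the same call would print an error message and terminate the script. Note that, in the standard implementation, parsing appears to stop at the first unregistered option, so it seems safest to place the options a given parser knows about before the unknown ones; this is an observation about the implementation and should be verified for the Python version at hand.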
If all options have values, the cmldict function developed in Exercise 8.2 represents a simple alternative to the getopt and optparse modules. The cmldict function may be called in many places in a program and may process only a subset of the total set of legal options in each call.