In Lecture 16, we wrote a function that counted the number of occurrences of a letter in a string. A more general version of this problem is to form a histogram of the letters in the string, that is, how many times each letter appears. Such a histogram might be useful for compressing a text file. Because different letters appear with different frequencies, we can compress a file by using shorter codes for common letters and longer codes for letters that appear less frequently.
>>> letter_counts = {}
>>> for letter in "Mississippi":
...     letter_counts[letter] = letter_counts.get(letter, 0) + 1
...
>>> letter_counts
{'M': 1, 's': 4, 'p': 2, 'i': 4}
We start with an empty dictionary. For each letter in the string, we use the get method to find the current count (zero if the letter has not been seen before) and then increment it. At the end, the dictionary contains pairs of letters and their frequencies. It might be more appealing to display the histogram in alphabetical order. We can do that with the items and sort methods:
>>> letter_items = letter_counts.items()
>>> letter_items.sort()
>>> print letter_items
[('M', 1), ('i', 4), ('p', 2), ('s', 4)]
The sort method is one of several that can be applied to lists; others include append, extend, and reverse. Consult the Python documentation for details.
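The counting and sorting steps above can be combined into a single function. The following is a minimal sketch rather than part of the lecture code: the name letter_histogram is our own, and it uses the built-in sorted function instead of calling sort on the result of items, on the assumption that you may also want to run it under Python 3, where items no longer returns a list.

```python
def letter_histogram(text):
    """Return a list of (letter, count) pairs sorted by letter."""
    letter_counts = {}
    for letter in text:
        # get returns 0 the first time a letter is seen
        letter_counts[letter] = letter_counts.get(letter, 0) + 1
    # sorted builds a new list of pairs, ordered by letter
    return sorted(letter_counts.items())


histogram = letter_histogram("Mississippi")
for letter, count in histogram:
    print("%s %d" % (letter, count))
```

As in the interactive session, 'M' comes first because upper-case letters sort before lower-case ones.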
21.6
Glossary
dictionary: A collection of key-value pairs that maps from keys to values. The keys can be any immutable type, and the values can be any type.
key: A value that is used to look up an entry in a dictionary. Keys must be of an immutable type (e.g. integer, float, string, tuple).
key-value pair: One of the items in a dictionary.
invoke: To call a method.
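The glossary entries for key and key-value pair can be illustrated with a short experiment. This is only a sketch: the dictionary name grid and its contents are invented for the illustration. It shows that strings, integers and tuples are accepted as keys, while a list, being mutable, is rejected.

```python
# Keys must be immutable: strings, numbers and tuples are all acceptable.
grid = {}
grid[(0, 3)] = "treasure"      # a tuple key
grid["origin"] = (0, 0)        # a string key; the value can be any type
grid[42] = [1, 2, 3]           # an integer key with a (mutable) list value

# A list is mutable, so using one as a key raises TypeError.
try:
    grid[[0, 3]] = "treasure"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(list_key_allowed)
# Each key-value pair is one item of the dictionary.
print(len(grid))
```

The failed assignment leaves the dictionary unchanged, so it still holds the three key-value pairs added before it.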
21.7
Laboratory exercises
1. Write a program that reads in a string and prints a table of the letters that occur in the string, in alphabetical order, together with the number of times each letter occurs. Case should be ignored. A sample run of the program would look like this:
Enter a string: > ThiS is String with Upper and lower case Letters.
a 2
c 1
d 1
e 5
g 1
h 2
i 4
l 2
n 2
o 1
p 2
r 4
s 5
t 5
u 1
w 2
2. Write another program that reads from a file and produces output similar to that in question 1.
3. Write a function called matrix_to_sparse that takes a matrix represented as a nested list, as specified in Section 19.6, as input, and returns a sparse matrix represented as a dictionary.
4. Write a function called sparse_to_matrix that takes a sparse matrix represented as a dictionary, and returns a matrix represented as a nested list as in Section 19.6.
5. In Exercise 3 in Lecture 19 you wrote a function to multiply two matrices that were represented as nested lists. Write another function called matrix_mult_sparse that takes two sparse matrices, represented using dictionaries, and returns a sparse matrix which is the product of the two input matrices.
Lecture 22
System programming
22.1
The sys module and argv
The sys module contains functions and variables which provide access to the environment in which the Python interpreter runs. The following example shows the values of a few of these variables on one of our systems:
>>> import sys
>>> sys.platform
'darwin'
>>> sys.path
['/',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python25.zip',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-darwin',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-mac',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload',
 '/Library/Python/2.5/site-packages',
 '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/PyObjC']
>>> sys.version
'2.5.1 (r251:54863, Jan 13 2009, 10:26:13) \n[GCC 4.0.1 (Apple Inc. build 5465)]'
Starting Jython on vex produces different values for the same variables:
>>> import sys
>>> sys.platform
'java1.6.0_04'
>>> sys.path
['', '.', '/usr/share/jython/Lib', '__classpath__']
>>> sys.version
'2.2b1'
The results will, of course, be different on your machine.
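To compare your own system with the two shown above, the same variables can be gathered in one place. The following is a small sketch; the function name describe_interpreter and the dictionary keys are our own choices, not part of the sys module. No output is shown because it depends entirely on the machine and interpreter you use.

```python
import sys


def describe_interpreter():
    """Collect a few of the sys values discussed above in a dictionary."""
    return {
        # e.g. 'darwin' on the first system shown above
        "platform": sys.platform,
        # first word of the version string, such as '2.5.1'
        "version": sys.version.split()[0],
        # number of places searched when a module is imported
        "path_entries": len(sys.path),
    }


info = describe_interpreter()
for name in sorted(info):
    print("%s: %s" % (name, info[name]))
```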
The argv variable holds a list of strings read in from the command line when a Python script is run. These strings are the command-line arguments, as the following short program shows:
#
# demo_argv.py
#
import sys
print sys.argv
Running this program from the Unix command prompt demonstrates how sys.argv works:
$ python demo_argv.py this and that 1 2 3
['demo_argv.py', 'this', 'and', 'that', '1', '2', '3']
$
argv is a list of strings. Notice that the first element is the name of the program. Arguments are separated by white space, and are split into a list in the same way that string.split operates. If you want an argument with white space in it, use quotes:
$ python demo_argv.py "this and" that "1 2" 3
['demo_argv.py', 'this and', 'that', '1 2', '3']
$
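The effect of the quotes can be explored without leaving Python. The sketch below compares the string split method with shlex.split from the standard library, which follows shell-like quoting rules. It is only an illustration: the real splitting is done by the shell before Python starts, and shlex approximates that behaviour rather than reproducing it exactly.

```python
import shlex

command_line = 'demo_argv.py "this and" that "1 2" 3'

# Plain split breaks on every run of white space and keeps the quote characters.
plain = command_line.split()

# shlex.split treats a quoted section as a single argument and drops the quotes.
shell_like = shlex.split(command_line)

print(plain)
print(shell_like)
```

The second list matches the argv shown above, while the first has seven elements, with the quote characters still attached to some of them.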
With argv we can write useful programs that take their input directly from the command line. For example, here is a program that finds the sum of a series of numbers:
#
# sum.py
#
from sys import argv

nums = argv[1:]

for index, value in enumerate(nums):
    nums[index] = float(value)

print sum(nums)
In this program we use the from <module> import <attribute> style of importing, so argv is brought into the module’s main namespace. We can now run the program from the command prompt like this:
$ python sum.py 3 4 5 11
23.0
$ python sum.py 3.5 5 11 100
119.5
You are asked to write similar programs as exercises.
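As a starting point for those exercises, here is one possible variation on sum.py, offered as a sketch rather than a model answer. The function name sum_arguments is our own invention. It takes the list of argument strings as a parameter, so it can be tried from the interpreter as well as from the command line, and it uses a list comprehension in place of the enumerate loop.

```python
def sum_arguments(arguments):
    """Convert each command-line string to a float and return the total."""
    numbers = [float(value) for value in arguments]
    return sum(numbers)


# In a script we would pass sys.argv[1:]; a literal list stands in for it here.
print(sum_arguments(["3.5", "5", "11", "100"]))
```

Because every argument is converted with float, the result is a floating point number even when all of the arguments are whole numbers.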