So, what do we do when we want to group values together, but know we're frequently going to need to access them individually? Well, we could use an empty object, as discussed in the previous section (but that is rarely useful unless we anticipate adding behavior later), or we could use a dictionary (most useful if we don't know exactly how many values, or which specific ones, will be stored), as we'll cover in the next section.
If, however, we do not need to add behavior to the object, and we know in advance what attributes we need to store, we can use a named tuple. Named tuples are tuples with attitude. Named tuples are objects without behavior. Named tuples are a great way to group data together, especially read-only data.
Constructing a named tuple takes a bit more work than a normal tuple. First we have to import namedtuple, as it is not in the namespace by default. Then we describe the named tuple by giving it a name and outlining its attributes. This returns a class-like object that we can instantiate with the required values as many times as we want:
from collections import namedtuple
Stock = namedtuple("Stock", "symbol current high low")
stock = Stock("GOOG", 613.30, high=625.86, low=610.50)
The namedtuple constructor takes two required arguments. The first is an identifier for the named tuple. The second is a string of space-separated attribute names: the first attribute, a space, the second attribute, another space, and so on. The result is a new class that can be called just like any other class to create instances. The call must supply exactly one value per attribute; the values can be passed in order or as keyword arguments, but every attribute must be specified. We can create as many instances of this class as we like, with different values for each.
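The argument rules described above can be checked with a short sketch. It repeats the Stock definition from the text so that it runs on its own; the names by_position, by_keyword, and missing_args_error are illustrative only.

```python
from collections import namedtuple

# Same definition as in the text: a class name plus space-separated field names.
Stock = namedtuple("Stock", "symbol current high low")

# Values may be passed by position, by keyword, or as a mix of both.
by_position = Stock("GOOG", 613.30, 625.86, 610.50)
by_keyword = Stock(symbol="GOOG", current=613.30, high=625.86, low=610.50)
print(by_position == by_keyword)  # True

# Leaving out an attribute is an error, as with any missing function argument.
missing_args_error = None
try:
    Stock("GOOG", 613.30)
except TypeError as error:
    missing_args_error = error
    print("rejected:", error)
```

Both instances compare equal because a named tuple is still a tuple underneath; how the values were passed makes no difference once it has been built.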
The resulting named tuple can then be packed and unpacked like a normal tuple, but we can also access individual attributes on it as if it were a class:
>>> stock.high
625.86
>>> symbol, current, high, low = stock
>>> current
613.3
Remember that constructing named tuples is a two-step process. First, use collections.namedtuple to create a class, and then create instances of that class.
Named tuples are perfect for many "data only" representations, but they are not ideal for all situations. Like tuples and strings, named tuples are immutable, so we cannot modify an attribute once it has been set. For example, the current value of our stock has gone down since we started this discussion, but we can't set the new value:
>>> stock.current = 609.27
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
If we need to be able to change stored data, a dictionary may be what we need instead.
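For an occasional change, there is one more option that this section does not otherwise cover: named tuples provide a _replace method. It does not modify the tuple. It builds a new one with the requested fields changed, so it is a workaround rather than real mutability. A minimal sketch, reusing the Stock definition from above (the name updated is illustrative):

```python
from collections import namedtuple

Stock = namedtuple("Stock", "symbol current high low")
stock = Stock("GOOG", 613.30, high=625.86, low=610.50)

# _replace returns a new Stock; the original instance is left untouched.
updated = stock._replace(current=609.27)
print(stock.current)    # 613.3
print(updated.current)  # 609.27
```

If values change often, copying the tuple on every update becomes awkward, which is where a dictionary tends to be the better fit.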
Dictionaries
Dictionaries are incredibly useful objects that allow us to map objects directly to other objects. An empty object with attributes added to it is a sort of dictionary; the names of the properties map to the property values. This is actually closer to the truth than it sounds; internally, objects normally represent attributes as a dictionary, where the values are properties or methods on the objects. Even the attributes on a module are stored, internally, in a dictionary.
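The claim that attributes live in a dictionary can be observed directly with the built-in vars() function, which returns an object's attribute dictionary. The following is a small sketch; the empty Stock class here is a stand-in defined only for this illustration.

```python
import collections


class Stock:
    """An empty class; attributes are attached after instantiation."""


stock = Stock()
stock.symbol = "GOOG"
stock.current = 613.30

# vars() returns the instance's attribute dictionary (its __dict__).
print(vars(stock))  # {'symbol': 'GOOG', 'current': 613.3}

# Module attributes are stored in a dictionary as well.
print("namedtuple" in vars(collections))  # True
```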
Dictionaries are extremely efficient at looking up a value, given a specific lookup object that maps to that value. They are the natural choice whenever you want to find one object based on another object. The object that is being stored is called the value; the object that is being used as an index is called the key. We've already seen dictionary syntax in some of our previous examples, but for completeness, we'll go over it again. Dictionaries can be created either using the dict() constructor, or using the {} syntax shortcut. In practice the latter format is almost always used. We can pre-populate a dictionary by separating each key from its value using a colon, and separating the key/value pairs using commas.
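As a quick sketch of the two construction styles, the following builds the same dictionary three ways. The variable names and the single-price values are illustrative only.

```python
# Literal syntax: keys and values separated by colons, pairs by commas.
literal = {"GOOG": 613.30, "MSFT": 30.25}

# dict() accepts an iterable of (key, value) pairs...
from_pairs = dict([("GOOG", 613.30), ("MSFT", 30.25)])

# ...or keyword arguments, which only works for keys that are valid identifiers.
from_keywords = dict(GOOG=613.30, MSFT=30.25)

print(literal == from_pairs == from_keywords)  # True
```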
For example, in a stock application, we would most often want to look up prices by the stock symbol. We can create a dictionary that uses stock symbols as keys, and tuples of current, high, and low as values like this:
stocks = {"GOOG": (613.30, 625.86, 610.50), "MSFT": (30.25, 30.70, 30.19)}
As we've seen in previous examples, we can then look up values in the dictionary by requesting a key inside square brackets. If the key is not in the dictionary, it will raise an exception:
>>> stocks["GOOG"]
(613.3, 625.86, 610.5)
>>> stocks["RIM"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'RIM'
We can, of course, catch the KeyError and handle it. But we have other options.
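Catching the exception might look like the following sketch. The helper name current_price and its choice to return None for unknown symbols are assumptions made for illustration.

```python
stocks = {"GOOG": (613.30, 625.86, 610.50), "MSFT": (30.25, 30.70, 30.19)}


def current_price(symbol):
    """Return the current price for symbol, or None if it is unknown."""
    try:
        # The first item of each tuple is the current price.
        return stocks[symbol][0]
    except KeyError:
        return None


print(current_price("GOOG"))  # 613.3
print(current_price("RIM"))   # None
```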
Remember, dictionaries are objects, even if their primary purpose is to hold other objects. As such, they have several behaviors associated with them. One of the most useful of these methods is the get method; it accepts a key as the first parameter and an optional default value to return if the key doesn't exist:
>>> print(stocks.get("RIM"))
None
>>> stocks.get("RIM", "NOT FOUND")
'NOT FOUND'
For even more control, we can use the setdefault method. If the key is in the dictionary, this method behaves just like get; it returns the value for that key. Otherwise, if the key is not in the dictionary, it will not only return the default value we supply in the method call (just like get does), it will also set the key to the same value. Another way to think of it is that setdefault sets a value in the dictionary only if that value has not previously been set. Then it returns the value in the dictionary, either the one that was already there, or the newly provided default value:
>>> stocks.setdefault("GOOG", "INVALID")
(613.3, 625.86, 610.5)
>>> stocks.setdefault("RIM", (67.38, 68.48, 67.28))
(67.38, 68.48, 67.28)
>>> stocks["RIM"]
(67.38, 68.48, 67.28)
The GOOG stock was already in the dictionary, so when we tried to setdefault it to an invalid value, it just returned the value already in the dictionary. RIM was not in the dictionary, so setdefault returned the default value and set the new value in the dictionary for us. We then check that the new stock is, indeed, in the dictionary.
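One place where this "set only if missing, then return" behavior tends to be handy is grouping items under a key. The sketch below groups ticker symbols by their first letter; the symbol list and the name by_initial are made up for illustration.

```python
symbols = ["GOOG", "MSFT", "GE", "MMM", "GM"]

by_initial = {}
for symbol in symbols:
    # The first time a letter is seen, setdefault stores an empty list for it.
    # Either way it returns the list in the dictionary, which we append to.
    by_initial.setdefault(symbol[0], []).append(symbol)

print(by_initial["G"])  # ['GOOG', 'GE', 'GM']
print(by_initial["M"])  # ['MSFT', 'MMM']
```

Without setdefault, the loop would need an explicit check for whether the letter is already a key before appending.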
Three other very useful dictionary methods are keys(), values(), and items(). The first two return view objects over all the keys and all the values in the dictionary. We can iterate over these in for loops, or pass them to list() if we need an actual list of the keys or values.
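A short sketch of processing keys and values follows. In current Python 3 the objects returned by keys() and values() are dictionary views: they can be looped over and measured with len(), but not indexed, and they reflect later changes to the dictionary. The variable names are illustrative.

```python
stocks = {"GOOG": (613.30, 625.86, 610.50), "MSFT": (30.25, 30.70, 30.19)}

symbols = stocks.keys()
for symbol in symbols:
    print(symbol)

# Copy into a list when list behavior (such as indexing) is needed.
symbol_list = list(symbols)

# Tuple unpacking works on each value while iterating.
highs = [high for (current, high, low) in stocks.values()]

# A view tracks the dictionary, while the list is a snapshot.
stocks["RIM"] = (67.38, 68.48, 67.28)
print(len(symbols), len(symbol_list))  # 3 2
```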
The items() method is probably the most useful; it returns a view of (key, value) tuples, one for every item in the dictionary. This works great with tuple unpacking in a for loop to loop over associated keys and values. This example does just that to print each stock in the dictionary with its current value:
>>> for stock, values in stocks.items():
...     print("{} last value is {}".format(stock, values[0]))
...
GOOG last value is 613.3
RIM last value is 67.38
MSFT last value is 30.25
Each key/value tuple is unpacked into two variables named stock and values (we could use any variable names we wanted, but these both seem appropriate) and then printed in a formatted string.
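Notice that in this output the stocks do not show up in the order in which they were inserted. Hashing, the efficient algorithm that makes key lookup so fast, does not arrange keys in any sorted order, and older versions of Python did not keep track of insertion order either. Since Python 3.7, dictionaries are guaranteed to iterate in insertion order, so a current interpreter would print MSFT before RIM. Either way, dictionaries are not sorted by key.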
So, there are numerous ways to retrieve data from a dictionary once it has been instantiated; we can use square brackets as index syntax, the get method, the setdefault method, or iterate over the result of the items method, among others.
Finally, as you probably already know, we can set a value in a dictionary using the same indexing syntax we use to retrieve a value:
>>> stocks["GOOG"] = (597.63, 610.00, 596.28)
>>> stocks['GOOG']
(597.63, 610.0, 596.28)
Google's price is lower today, so I've updated the tuple value in the dictionary. We can use this index syntax to set a value for any key, regardless of whether the key is in the dictionary. If it is in the dictionary, the old value will be replaced with the new one; otherwise, a new key/value pair will be created.
We've been using strings as dictionary keys so far, but we aren't limited to string keys. It is common to use strings as keys, especially when we're storing data in a dictionary to gather it together (instead of using an object with named properties). But we can also use tuples, numbers, or even objects we've defined ourselves as dictionary keys. We can even use different types of keys in a single dictionary:
random_keys = {}
random_keys["astring"] = "somestring"
random_keys[5] = "aninteger"
random_keys[25.2] = "floats work too"
random_keys[("abc", 123)] = "so do tuples"

class AnObject:
    def __init__(self, avalue):
        self.avalue = avalue

my_object = AnObject(14)
random_keys[my_object] = "We can even store objects"
my_object.avalue = 12
try:
    random_keys[[1,2,3]] = "we can't store lists though"
except TypeError:
    # Lists are unhashable, so using one as a key raises TypeError.
    print("unable to store list\n")

for key, value in random_keys.items():
    print("{} has value {}".format(key, value))
This code shows several different types of keys we can supply to a dictionary. It also shows one type of object that cannot be used. We've already used lists extensively, and we'll be seeing many more details of them in the next section. Because lists can change at any time (by adding or removing items, for example), they cannot hash to a specific value.
Objects that are hashable basically have a defined algorithm that converts the object into an integer value for rapid lookup. This hash is what is actually used to look up values in a dictionary. Strings map to integers based on the characters in the string, while tuples combine hashes of the items inside the tuple, for example. Any two objects that are somehow considered equal (like strings with the same characters or tuples with the same values) should have the same hash value, and the hash value for an object should never ever change. Lists, however, can have their contents changed, which would change their hash value (two lists should only be equal if their contents are the same). Because of this, they can't be used as dictionary keys. For the same reason, dictionaries cannot be used as keys into other dictionaries. Our AnObject instance, by contrast, stays usable as a key even after we change its avalue attribute, because by default an object of a user-defined class is hashed by its identity rather than by its contents.
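These hashing rules can be probed with the built-in hash() function. The sketch below is illustrative rather than exhaustive; the names rejected, before, and my_object are chosen for this example, and it relies on the default behavior of user-defined classes, which hash by identity unless they define their own __eq__ or __hash__.

```python
# Equal objects hash equally, whether they are strings or tuples.
print(hash("GOOG") == hash("GO" + "OG"))         # True
print(hash(("abc", 123)) == hash(("abc", 123)))  # True

# Mutable built-in containers refuse to be hashed at all.
rejected = []
for unhashable in ([1, 2, 3], {"a": 1}):
    try:
        hash(unhashable)
    except TypeError as error:
        rejected.append(type(unhashable).__name__)
        print(type(unhashable).__name__, "->", error)


# By default, an instance of a user-defined class hashes by identity,
# so changing an attribute does not change its hash.
class AnObject:
    def __init__(self, avalue):
        self.avalue = avalue


my_object = AnObject(14)
before = hash(my_object)
my_object.avalue = 12
print(before == hash(my_object))  # True
```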
In contrast, there are no limits on the types of objects that can be used as dictionary values. We can use a string key that maps to a list value, for example, or we can have a nested dictionary as a value in another dictionary.
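As a small sketch of such nesting, the dictionary below maps each symbol to another dictionary, which in turn holds a list. The structure, the name portfolio, and the sample data are invented for illustration.

```python
# Outer dictionary: symbol -> inner dictionary of details.
portfolio = {
    "GOOG": {"prices": [613.30, 609.27, 597.63], "exchange": "NASDAQ"},
    "MSFT": {"prices": [30.25], "exchange": "NASDAQ"},
}

# Values are ordinary objects, so a list stored as a value can still be changed.
portfolio["MSFT"]["prices"].append(30.70)

print(portfolio["GOOG"]["prices"][-1])   # 597.63
print(len(portfolio["MSFT"]["prices"]))  # 2
```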