
Dataclasses offer a great deal of functionality that can help you modify the default behavior.

First and foremost, you can provide each of your declared attributes with a default value. Doing so makes them optional when you create a new instance. For example, say you want the default book price to be $20. You can say:

    @dataclass
    class Book(object):
        title : str
        author : str
        price : float = 20

Notice how the syntax reflects the Python 3 syntax for function parameters that have both type annotation and a default value. Just as is the case with function parameter defaults, dataclass attributes with defaults must come after those without defaults.
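Here is a small sketch of both points: the default being used when no price is passed, and the ordering rule being enforced when the class is created. The book title, the author and the BadBook class are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str
    price: float = 20  # optional: used when no price is passed


# price falls back to the default when omitted
cheap = Book("Dune", "Frank Herbert")
# an explicit value overrides the default
pricey = Book("Dune", "Frank Herbert", 35.5)

# A field without a default may not follow one that has a default.
# The error is raised when the class is created, not when it is used.
try:
    @dataclass
    class BadBook:  # hypothetical class, for illustration only
        title: str
        price: float = 20
        author: str  # no default, but follows a defaulted field
except TypeError as e:
    ordering_error = str(e)
```

The TypeError message names the offending field, much as the error for a badly ordered function signature would.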

Rather than declaring a value for a default, you actually can pass a function that is executed (without any arguments) each time a new object is created.

To do this, and to take advantage of a number of other features having to do with dataclass attributes, you must use the field function (from the dataclasses module), which lets you tailor the way the attribute is defined and used.

If you pass a function to the default_factory parameter, that function will be invoked each time a new instance is created without a specified value for that attribute. This is very similar to the way that the defaultdict class works, except that it can be specified for each attribute.

For example, you can give each new book a default random price between $20 and $100 in the following way:

AT THE FORGE

    import random
    from dataclasses import dataclass, field

    def random_price():
        return random.randint(20,100)

    @dataclass
    class Book(object):
        title : str
        author : str
        price : float = field(default_factory=random_price)
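The same mechanism is also handy for mutable defaults, such as lists. The following sketch assumes a hypothetical tags attribute, which is not part of the Book class shown above; because list is called afresh for every new instance, no two books share the same list.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    author: str
    # list() is called with no arguments for every new Book
    tags: list = field(default_factory=list)  # hypothetical attribute


b1 = Book("Title One", "Author One")
b2 = Book("Title Two", "Author Two")

# Changing one book's tags leaves the other book untouched
b1.tags.append("fiction")
```

After this runs, b1.tags contains "fiction" while b2.tags is still empty.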

Note that you cannot both set default_factory and a default value; the whole point is that default_factory lets you run a function and, thus, provides the value dynamically, when the new instance is created.
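If you try to set both anyway, field refuses right away. A minimal sketch, in which the lambda is just a stand-in factory:

```python
from dataclasses import field

try:
    # Passing both a fixed default and a factory is rejected
    field(default=20, default_factory=lambda: 20)
except ValueError as e:
    both_error = str(e)
```

The exception is raised by the call to field itself, before any class is defined.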

The main thing that the __init__ method in a Python object does is add attributes to the new instance. Indeed, I’d argue that the majority of __init__ methods I’ve written through the years do little more than assign the parameters to instance attributes.

For such objects, the default behavior of dataclasses works just fine.

But in some cases, you’ll want to do more than just assign values. Perhaps you want to set up values that aren’t dependent on parameters. Perhaps you want to take the parameters and adjust them in some way. Or perhaps you want to do something bigger, such as open a file or make a network connection.

Of course, the whole point of a dataclass is that it takes care of writing __init__ for you. And thus, if you want to do more than just assign the parameters to attributes, you can’t do so, at least not in __init__. I mean, you could define __init__, but the whole point of a dataclass is that it does so for you.

For cases like this, dataclasses have another method at their disposal, called __post_init__. If you define __post_init__, it will run after the dataclass-defined __init__. So, you’re assured that the attributes have been set, allowing you to adjust or add to them, as necessary.
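As a rough sketch of how that might look, the following version of Book tidies up its parameters and adds a derived attribute. The slug attribute and the cleanup rules are my own inventions for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    author: str
    price: float = 20
    # init=False keeps this attribute out of the generated __init__
    slug: str = field(init=False)  # hypothetical derived attribute

    def __post_init__(self):
        # The generated __init__ has already assigned the fields,
        # so they can be adjusted or extended here.
        self.title = self.title.strip()
        self.price = round(float(self.price), 2)
        self.slug = self.title.lower().replace(" ", "-")


b = Book("  Python Tricks ", "Some Author", 19.999)
```

Here, b.title ends up as "Python Tricks", b.price as 20.0 and b.slug as "python-tricks", even though slug was never passed to the constructor.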

Here’s another case that dataclasses handle. Normally, instances of user-created classes are hashable. But in the case of dataclasses, by default they aren’t, because the generated __eq__ method compares instances by value. This means you can’t use such instances as keys in dictionaries or as elements in sets.

You can get around this by declaring your class to be “frozen”, making its instances immutable. In other words, an instance of a frozen dataclass is created and then never changes, similar to a named tuple. You can do this by giving a True value to the dataclass decorator’s frozen parameter:

    >>> @dataclass(frozen=True)
    ... class Foo(object):
    ...     x : int
    ...
    >>> f1 = Foo(10)
    >>> f1.x = 100
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.7/dataclasses.py", line 448, in _frozen_setattr
        raise FrozenInstanceError(f'cannot assign to field {name!r}')
    dataclasses.FrozenInstanceError: cannot assign to field 'x'

Moreover, now you can run hash on the variable:

    >>> hash(f1)
    3430012387537
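To see the difference side by side, here is a short sketch with two hypothetical classes, Mutable and Frozen, that are identical except for the frozen parameter:

```python
from dataclasses import dataclass


@dataclass
class Mutable:
    x: int


@dataclass(frozen=True)
class Frozen:
    x: int


try:
    hash(Mutable(10))
    mutable_hashable = True
except TypeError:
    # A default dataclass defines __eq__ and sets __hash__ to None
    mutable_hashable = False

# Equal frozen instances hash alike, so duplicates collapse in a set
points = {Frozen(10), Frozen(10), Frozen(20)}

# Frozen instances also work as dictionary keys
labels = {Frozen(10): "ten"}
```

The set ends up with two elements rather than three, and looking up Frozen(10) in the dictionary finds the entry, because equal instances produce equal hashes.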

There are a number of other optional pieces of functionality in dataclasses as well, from indicating how your objects will be compared to choosing which fields will be printed and the like. It’s impressive to see just how much thought has gone into the creation of dataclasses. I wouldn’t be surprised if in the next few years, most Python classes will be defined as dataclasses, along with whatever customization and additions the user requests.
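As one illustration of those options, the following sketch orders books by price alone and hides one field from the printed representation. The isbn attribute and the sample values are assumptions of mine, not part of the earlier examples.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Book:
    # compare=False leaves a field out of ==, < and the other comparisons
    title: str = field(compare=False)
    author: str = field(compare=False)
    price: float = 20
    # repr=False leaves a field out of the printed representation
    isbn: str = field(default="", repr=False, compare=False)  # hypothetical


a = Book("A", "X", 10, isbn="111")
b = Book("B", "Y", 30, isbn="222")

# With order=True, min, max and sorted work, comparing by price only
cheapest = min(a, b)
```

Because price is the only field taking part in comparisons, two books with the same price compare as equal, whatever their titles. Whether that is what you want depends on the application.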

Conclusion

Python’s classes always have suffered from some repetition, and dataclasses aim to fix that problem. But dataclasses go beyond macros to provide a toolkit that a large number of Python developers can and should use to improve the readability of their code. The fact that dataclasses integrate so nicely into other modern Python tools and code, such as MyPy, tells me that they’re going to become the standard way to create and work with classes in Python very quickly.

Resources

Dataclasses are described most fully in the PEP (Python Enhancement Proposal) 557. If Python 3.7 isn’t out by the time you read this article, you can go to https://python.org and download a beta copy. Although you shouldn’t use it in production, you definitely should feel comfortable trying it out and using it for personal projects. ◾

Send comments or feedback via http://www.linuxjournal.com/contact or email [email protected].
