
Data structures

In document The Hacker Guide to Python (Page 184-193)

Performances and optimizations

10.1 Data structures

Most computer problems can be solved in an elegant and simple manner, provided that you use the right data structures – and Python provides many data structures to choose from.

Often, there is a temptation to code your own custom data structures – this is invariably a vain, useless, doomed idea. Python almost always has better data structures and code to offer – learn to use them.

For example, everybody uses dict, but how many times have you seen code like this:

def get_fruits(basket, fruit):
    # A variation is to use "if fruit in basket:"
    try:
        return basket[fruit]
    except KeyError:
        return set()

It’s much easier to use the get method already provided by the dict structure:

def get_fruits(basket, fruit):
    return basket.get(fruit, set())
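As a quick check of how get behaves, here is a minimal sketch, assuming Python 3; the basket contents are made up for illustration. It shows that asking for a missing key returns the default without adding that key to the dict.

```python
# Hypothetical data, for illustration only.
basket = {'apple': {'granny smith', 'gala'}}

def get_fruits(basket, fruit):
    # Returns the stored set, or a fresh empty set for unknown fruits.
    return basket.get(fruit, set())

known = get_fruits(basket, 'apple')
missing = get_fruits(basket, 'kiwi')

print(known == {'granny smith', 'gala'})  # True
print(missing)                            # set()
print('kiwi' in basket)                   # False: get() never inserts the key
```

Keep in mind that the default argument, here set(), is built on every call, whether or not it ends up being returned.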

It’s not uncommon for people to use basic Python data structures without being aware of all the methods they provide. This is also true for sets – for example:

def has_invalid_fields(fields):
    for field in fields:
        if field not in ['foo', 'bar']:
            return True
    return False

This can be written without a loop:

def has_invalid_fields(fields):
    return bool(set(fields) - set(['foo', 'bar']))

The set data structure has methods which can solve many problems that would otherwise need to be addressed by writing nested for/if blocks.
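To give an idea of what is available, the sketch below (assuming Python 3) applies a few set operations to the same kind of field validation. The ALLOWED constant and the field names are hypothetical.

```python
ALLOWED = {'foo', 'bar'}  # hypothetical set of valid field names

fields = ['foo', 'baz', 'bar', 'qux']

invalid = set(fields) - ALLOWED      # difference: names that are not allowed
common = set(fields) & ALLOWED       # intersection: names that are allowed
all_valid = set(fields) <= ALLOWED   # subset test, same as issubset()

print(sorted(invalid))   # ['baz', 'qux']
print(sorted(common))    # ['bar', 'foo']
print(all_valid)         # False
```

Each of these one-liners replaces a loop with a membership test in its body.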

There are also more advanced data structures that can greatly reduce the burden of code maintenance. For example, take a look at the following code:

def add_animal_in_family(species, animal, family):
    if family not in species:
        species[family] = set()
    species[family].add(animal)

species = {}
add_animal_in_family(species, 'cat', 'felidae')

Sure, this code is perfectly valid, but how many times will your program require a variation of the above? Tens? Hundreds?

Python provides the collections.defaultdict structure, which solves the problem in an elegant way.

import collections

def add_animal_in_family(species, animal, family):
    species[family].add(animal)

species = collections.defaultdict(set)
add_animal_in_family(species, 'cat', 'felidae')

Each time that you try to access a non-existent item from your dict, the defaultdict will use the function that was passed as argument to its constructor to build a new value, instead of raising a KeyError. In this case, the set function is used to build a new set each time we need it.
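The behaviour described above can be observed directly. This is a minimal sketch, assuming Python 3, with made-up family and animal names; it also shows a caveat worth knowing about.

```python
import collections

species = collections.defaultdict(set)

# A membership test does not call the factory, so nothing is created here.
print('felidae' in species)        # False

# Indexing a missing key calls set() and stores the result under that key.
species['felidae'].add('cat')
species['felidae'].add('lion')
species['canidae'].add('wolf')

print(sorted(species['felidae']))  # ['cat', 'lion']
print(len(species))                # 2

# Caveat: merely reading a missing key also inserts it.
species['ursidae']
print(len(species))                # 3
```

If you only want to look a key up without creating it, use the in operator or the get method, neither of which triggers the factory.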

By the way, the collections module offers a few useful data structures that can solve other kinds of problems, such as OrderedDict or Counter.
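For a taste of these two, here is a short sketch, assuming Python 3 (move_to_end is not available in Python 2); the data is made up for illustration.

```python
import collections

# Counter tallies hashable items.
sightings = ['cat', 'dog', 'cat', 'fox', 'cat', 'dog']
counts = collections.Counter(sightings)
print(counts['cat'])           # 3
print(counts.most_common(2))   # [('cat', 3), ('dog', 2)]
print(counts['owl'])           # 0: missing items count as zero

# OrderedDict remembers insertion order and can reorder entries in place.
queue = collections.OrderedDict()
queue['first'] = 1
queue['second'] = 2
queue['third'] = 3
queue.move_to_end('first')
print(list(queue))             # ['second', 'third', 'first']
```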

It’s really important to look for the right data structure in Python, as the correct choice will save you time and lessen code maintenance.

10.2 Profiling

Python provides a few tools to profile your program. The standard one is cProfile, and it is easy enough to use.

Example. Using the cProfile module

$ python -m cProfile myscript.py
         343 function calls (342 primitive calls) in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(_getframe)
        1    0.000    0.000    0.000    0.000 :0(len)
      104    0.000    0.000    0.000    0.000 :0(setattr)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.000    0.000    0.000    0.000 :0(startswith)
      2/1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 StringIO.py:30(<module>)
        1    0.000    0.000    0.000    0.000 StringIO.py:42(StringIO)

The results list indicates the number of times each function was called, and the time spent on its execution. You can use the -s option to sort by other fields; e.g. -s time will sort by internal time.
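The same report can be produced from inside a program with the cProfile and pstats modules. This is a minimal sketch, assuming Python 3; the busy function is a hypothetical workload standing in for myscript.py.

```python
import cProfile
import io
import pstats

def busy(n):
    # Hypothetical workload standing in for myscript.py.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
result = busy(10000)
profiler.disable()

# Sort by internal time, the programmatic counterpart of "-s time".
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('tottime').print_stats(5)   # show at most five entries
report = buffer.getvalue()
print(report)
```

Profiling only the section you care about this way keeps interpreter start-up and import noise out of the report.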

If you’ve coded in C, as I did years ago, you probably already know the fantastic Valgrind tool, which – among other things – is able to provide profiling data for C programs. The data that it provides can then be visualized by another great tool named KCacheGrind.

You’ll be happy to know that the profiling information generated by cProfile can easily be converted to a call tree that can be read by KCacheGrind. The cProfile module has a -o option that allows you to save the profiling data, and pyprof2calltree can convert from one format to the other.

Example. Using KCacheGrind to visualize Python profiling data

$ python -m cProfile -o myscript.cprof myscript.py

$ pyprof2calltree -k -i myscript.cprof

Figure: KCacheGrind example

This provides a lot of information that will allow you to determine what part of your program might be consuming too many resources.
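When a graphical viewer is not at hand, the saved profiling data can also be inspected from Python itself. The following is a minimal sketch, assuming Python 3; the work function and the temporary file location are hypothetical, and dump_stats writes the same kind of file that the -o option produces.

```python
import cProfile
import os
import pstats
import tempfile

def work():
    # Hypothetical workload, for illustration only.
    return sorted(str(i) for i in range(1000))

profiler = cProfile.Profile()
profiler.runcall(work)

# Save the raw data, as "-o myscript.cprof" would, then load it again.
path = os.path.join(tempfile.mkdtemp(), 'myscript.cprof')
profiler.dump_stats(path)

stats = pstats.Stats(path)
stats.strip_dirs().sort_stats('cumulative').print_stats(3)
```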

While this clearly works well for a macroscopic view of your program, it sometimes helps to have a microscopic view of some part of the code. In such a context, I find it better to rely on the dis module to find out what’s going on behind the scenes.

The dis module is a disassembler of Python byte code. It’s simple enough to use:

>>> def x():
...     return 42
...
>>> import dis
>>> dis.dis(x)
  2           0 LOAD_CONST               1 (42)
              3 RETURN_VALUE

The dis.dis function disassembles the function that you passed as a parameter, and prints the list of bytecode instructions that are run by the function. It can be useful to understand what’s really behind each line of code that you write, in order to be able to properly optimize your code.
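The instructions can also be collected programmatically rather than printed. This is a minimal sketch, assuming Python 3.4 or later, where dis.get_instructions is available. The exact opcode names depend on the interpreter version, so no expected output is shown.

```python
import dis

def x():
    return 42

# dis.get_instructions yields one Instruction per bytecode operation.
opnames = [instruction.opname for instruction in dis.get_instructions(x)]
print(opnames)
```

Having the opcode names in a list makes it easy to compare two implementations, or to count the instructions executed inside a loop.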

The following code defines two functions, each of which does the same thing – concatenates three letters:

abc = ('a', 'b', 'c')

def concat_a_1():
    for letter in abc:
        abc[0] + letter

def concat_a_2():
    a = abc[0]
    for letter in abc:
        a + letter

Both appear to do exactly the same thing, but if we disassemble them, we’ll see that the generated bytecode is a bit different:

>>> dis.dis(concat_a_1)
  2           0 SETUP_LOOP              26 (to 29)
              3 LOAD_GLOBAL              0 (abc)
              6 GET_ITER
        >>    7 FOR_ITER                18 (to 28)
             10 STORE_FAST               0 (letter)

  3          13 LOAD_GLOBAL              0 (abc)
             16 LOAD_CONST               1 (0)
             19 BINARY_SUBSCR
             20 LOAD_FAST                0 (letter)
             23 BINARY_ADD
             24 POP_TOP
             25 JUMP_ABSOLUTE            7
        >>   28 POP_BLOCK
        >>   29 LOAD_CONST               0 (None)
             32 RETURN_VALUE
>>> dis.dis(concat_a_2)
  2           0 LOAD_GLOBAL              0 (abc)
              3 LOAD_CONST               1 (0)
              6 BINARY_SUBSCR
              7 STORE_FAST               0 (a)

  3          10 SETUP_LOOP              22 (to 35)
             13 LOAD_GLOBAL              0 (abc)
             16 GET_ITER
        >>   17 FOR_ITER                14 (to 34)
             20 STORE_FAST               1 (letter)

  4          23 LOAD_FAST                0 (a)
             26 LOAD_FAST                1 (letter)
             29 BINARY_ADD
             30 POP_TOP
             31 JUMP_ABSOLUTE           17
        >>   34 POP_BLOCK
        >>   35 LOAD_CONST               0 (None)
             38 RETURN_VALUE

As you can see, in the second version we store abc[0] in a temporary variable before running the loop. This makes the bytecode executed inside the loop a little smaller, as we avoid having to do the abc[0] lookup for each iteration. Measured using timeit, the second version is faster than the first one; it takes a whole microsecond less to execute! Obviously this microsecond is not worth the optimization unless you call this function millions of times – but this is the kind of insight that the dis module can provide.
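A measurement of this kind can be reproduced with a few lines. This is a minimal sketch, assuming Python 3; the number of runs is an arbitrary choice, and the figures will differ from one machine and interpreter version to another, so none are quoted here.

```python
import timeit

abc = ('a', 'b', 'c')

def concat_a_1():
    for letter in abc:
        abc[0] + letter

def concat_a_2():
    a = abc[0]
    for letter in abc:
        a + letter

# Take the best of several runs to reduce noise from other processes.
runs = 100000
t1 = min(timeit.repeat(concat_a_1, number=runs, repeat=5))
t2 = min(timeit.repeat(concat_a_2, number=runs, repeat=5))
print('concat_a_1: %.3f us per call' % (t1 / runs * 1e6))
print('concat_a_2: %.3f us per call' % (t2 / runs * 1e6))
```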

Whether you should need to rely on such "tricks" as storing the value outside the loop is debatable – ultimately, it should be the compiler’s work to optimize this kind of thing. On the other hand, as the language is heavily dynamic, it’s difficult for the compiler to be sure that optimization wouldn’t result in negative side effects. So be careful when writing your code!
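To see why the compiler cannot safely do this on its own, consider the contrived sketch below (assuming Python 3; all function names are hypothetical). The global is rebound while the loop is running, so looking abc[0] up once before the loop and looking it up on every iteration give different results.

```python
abc = ('a', 'b', 'c')

def swap_global():
    # Rebinds the module-level name while a loop elsewhere is still running.
    global abc
    abc = ('z', 'b', 'c')

def concat_lookup_each_time():
    results = []
    for letter in ('a', 'b', 'c'):
        results.append(abc[0] + letter)   # abc[0] is looked up on every pass
        swap_global()
    return results

def concat_hoisted():
    results = []
    first = abc[0]                        # looked up once, before the loop
    for letter in ('a', 'b', 'c'):
        results.append(first + letter)
        swap_global()
    return results

looked_up = concat_lookup_each_time()
abc = ('a', 'b', 'c')                     # reset the global between the two runs
hoisted = concat_hoisted()
print(looked_up)   # ['aa', 'zb', 'zc']
print(hoisted)     # ['aa', 'ab', 'ac']
```

Only the programmer knows whether such a rebinding can happen, which is why the decision to store the value outside the loop is left to you.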

Another bad habit I’ve often encountered when reviewing code is defining functions inside functions for no reason. This has a cost, as the function is going to be redefined over and over, every time the enclosing function is called.

Example. A function defined in a function, disassembled

>>> import dis
>>> def x():
...     return 42
...
>>> dis.dis(x)
  2           0 LOAD_CONST               1 (42)
              3 RETURN_VALUE
>>> def x():
...     def y():
...         return 42
...     return y()
...
>>> dis.dis(x)
  2           0 LOAD_CONST               1 (<code object y at 0x100ce7e30, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 STORE_FAST               0 (y)

  4           9 LOAD_FAST                0 (y)
             12 CALL_FUNCTION            0
             15 RETURN_VALUE

We can see here that it is needlessly complicated, calling MAKE_FUNCTION, STORE_FAST, LOAD_FAST and CALL_FUNCTION instead of just LOAD_CONST. That requires many more opcodes for no good reason – and function calling in Python is already inefficient.
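The cost can be observed without reading bytecode. This is a minimal sketch, assuming Python 3; all function names are hypothetical. It shows that a new function object is built on every call of the enclosing function, and times both variants. Timings vary by machine, so none are quoted.

```python
import timeit

def y():
    return 42

def x_module_level():
    # Calls a function that was created once, when the module was loaded.
    return y()

def x_nested():
    # Builds a brand-new function object on every call before calling it.
    def inner():
        return 42
    return inner()

def make_inner():
    def inner():
        return 42
    return inner

# Each call yields a distinct function object.
print(make_inner() is make_inner())   # False

runs = 100000
print('module-level: %.3f s' % timeit.timeit(x_module_level, number=runs))
print('nested:       %.3f s' % timeit.timeit(x_nested, number=runs))
```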

The only case in which it is required to define a function within a function is when building a function closure, and this is a perfectly identified use case in Python’s opcodes.

Example. Disassembling a closure

>>> def x():
...     a = 42
...     def y():
...         return a
...     return y()
...
>>> dis.dis(x)
  2           0 LOAD_CONST               1 (42)
              3 STORE_DEREF              0 (a)

  3           6 LOAD_CLOSURE             0 (a)
              9 BUILD_TUPLE              1
             12 LOAD_CONST               2 (<code object y at 0x100d139b0, file "<stdin>", line 3>)
             15 MAKE_CLOSURE             0
             18 STORE_FAST               0 (y)

  5          21 LOAD_FAST                0 (y)
             24 CALL_FUNCTION            0
             27 RETURN_VALUE
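A closure earns its keep when the inner function has to remember state from the enclosing call. This is a minimal sketch, assuming Python 3 (nonlocal does not exist in Python 2); make_counter is a hypothetical example, not taken from the listing above.

```python
def make_counter(start):
    count = start

    def increment():
        # 'count' lives in a cell shared with the enclosing call.
        nonlocal count
        count += 1
        return count

    return increment

counter = make_counter(41)
print(counter())                              # 42
print(counter.__code__.co_freevars)           # ('count',)
print(counter.__closure__[0].cell_contents)   # 42
```

The __closure__ attribute holds the cells that the closure opcodes in the disassembly above are responsible for wiring up.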
