Class functions - Python Programming for Biology

Class functions are functions (subroutines) that are defined within the construction of a class, so that the function’s capabilities are available to any object made according to that class specification. They are defined in the same way as ordinary functions, albeit indented within the class code block, but there is one extra twist: the function is aware of the object to which it belongs. Class functions are accessed from the variable representing the object via ‘dot’ syntax, linking the object’s name to the function name. For example, using a function we define below, for a Molecule object we might do:

name = myMolecule.getName()

Here the getName() function knows which Molecule object to use when fetching the name, without any additional information. When a class function is written, the first argument in its definition is special and represents the object that the function would be called from. In this way you have a handle on the object that you can use in the function code. As with all arguments to functions, you can call this argument for the object itself anything you want. However, the universal and unwavering convention in Python is to call it ‘self’, and nothing else, ever. If you call it anything else, and a Python programmer spots it, don’t be surprised if annoyance results. Here we define a class and include a class function within:

class Molecule:

    def getName(self):
        # function implementation
        pass

Because the class function takes self as the first argument, the function can be used (called) any time you have made an object using the class.

Notice that when you call the function (not forgetting the brackets, see above) you do not include the self argument inside the brackets, since Python automatically knows that it refers to the object that is calling the function. This oddity can initially cause confusion, but a convenient way to think about it is that the object, in this case myMolecule, substitutes for self inside the class and inside the actual function. Indeed, the self argument really is set to that object when the program is run.
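The way self is filled in can be seen in a minimal runnable sketch. The text leaves the body of getName() unspecified, so the constructor and the name attribute used here are assumptions made purely for illustration, as is the example name.

```python
class Molecule:

    def __init__(self, name):
        # Hypothetical constructor (an assumption for this sketch):
        # it stores the name on the object itself
        self.name = name

    def getName(self):
        # self is the object that the function was called from
        return self.name


myMolecule = Molecule('haemoglobin')

# Nothing goes inside the brackets: Python sets self to myMolecule
name = myMolecule.getName()
print(name)
```

The definition lists one argument, self, yet the call passes none; the object to the left of the dot takes its place.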

There is an alternative way to call class functions, which explicitly passes in the object. This method turns out to be useful in certain circumstances when we are dealing with superclasses and subclasses. So instead of the above, you could use the name of the class and pass the object as an argument, which perhaps makes it easier to see how self is filled:

name = Molecule.getName(molecule)

Here getName() is the general function definition in the Molecule class, rather than one bound to a particular instance of an object. Hence, the specific object to be operated on must be explicitly passed in to the function call, because otherwise the object, in this case molecule, is not known. Note, however, that this method still uses the ‘dot’ syntax.

The first object.function() way of calling reads better, and is shorter, than the second way. So, unless you definitely need to use the second version, which can happen when subclasses come into play, the first version is preferred.

From the context of the implementation, inside the code that constructs the class, you write function calls with self as the object, which of course is filled in when a real object instance is made. For example, suppose we have a function that provides the name of the molecule with the first letter capitalised. To implement this function we first get hold of the molecule’s name, without capitalisation, using a call to the other function, before we do the job of changing the text:

class Molecule:

    def getName(self):
        # function implementation
        pass

    def getCapitalisedName(self):
        name = self.getName()
        return name.capitalize()

Notice that in getCapitalisedName() we are assuming that name is set to a Python string and not None. Otherwise the call would fail with an AttributeError exception, because the capitalize() function is only guaranteed to be present for string objects. You could alternatively protect against this by not calling the capitalisation function unless the name has actually been set. The test below skips None (and an empty string), although it does not by itself prove that the value is a string:

class Molecule:

    def getName(self):
        # function implementation
        pass

    def getCapitalisedName(self):
        name = self.getName()
        if name:
            return name.capitalize()
        else:
            return name
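To see the difference the check makes, the guarded version can be run with and without a name. This is only a sketch: the constructor, the name attribute and the example name are assumptions, since the text does not show how getName() obtains its value.

```python
class Molecule:

    def __init__(self, name=None):
        # Hypothetical constructor: the name defaults to None
        self.name = name

    def getName(self):
        return self.name

    def getCapitalisedName(self):
        name = self.getName()
        if name:
            return name.capitalize()
        else:
            return name


named = Molecule('lysozyme')
unnamed = Molecule()

print(named.getCapitalisedName())    # Lysozyme
print(unnamed.getCapitalisedName())  # None

# Without the check, capitalising a missing name fails
try:
    unnamed.getName().capitalize()
except AttributeError as err:
    print('AttributeError:', err)
```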

The order of the function definitions inside the class implementation code does not matter, so here getCapitalisedName() could have been listed before getName(). Be warned, though, that if by mistake you define a function with the same name more than once inside a class, the last definition replaces any earlier one.
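Both points can be checked with a small sketch. The return values are placeholders chosen only so that the two definitions of getName() can be told apart.

```python
class Molecule:

    # Listed before getName(), which it calls. This works because
    # getName is only looked up when the function is actually run.
    def getCapitalisedName(self):
        return self.getName().capitalize()

    def getName(self):
        return 'first definition'

    # Same name again, by mistake: this definition replaces the one above
    def getName(self):
        return 'second definition'


molecule = Molecule()
print(molecule.getName())             # second definition
print(molecule.getCapitalisedName())  # Second definition
```

Python raises no error for the repeated definition, which is why this kind of mistake can go unnoticed.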

Moving on to consider subclasses, which build upon some other class definition, we could include new functions:

class Protein(Molecule):

    def getSequence(self):
        # function implementation
        pass

    def getAminoAcids(self):
        # function implementation
        pass

You would call these functions in the expected way, for example:

sequence = protein.getSequence()

However, it is especially notable that there is no need to repeat the code for getName() or getCapitalisedName(), which are already defined in the Molecule superclass; this is a major point of using class inheritance. The Protein class automatically inherits these functions from the other class on which it is based, so, for example, one can do:

name = protein.getName()

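Note that inheritance works in one direction only: the superclass does not gain the functions defined in the subclass. Accordingly, getSequence() cannot be used for an object made from the Molecule definition, but only for an object made from the Protein definition.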
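The inheritance behaviour described above can be sketched as follows. The constructors, the stored attributes and the example values are all assumptions made for illustration, and getAminoAcids() is left out for brevity. The Protein constructor also shows the explicit class-style call, Molecule.__init__(self, name), which is one situation where that form is useful with subclasses.

```python
class Molecule:

    def __init__(self, name):
        # Hypothetical constructor, assumed for this sketch
        self.name = name

    def getName(self):
        return self.name

    def getCapitalisedName(self):
        name = self.getName()
        if name:
            return name.capitalize()
        else:
            return name


class Protein(Molecule):

    def __init__(self, name, sequence):
        # Explicit class-style call: the object is passed in as self
        Molecule.__init__(self, name)
        self.sequence = sequence

    def getSequence(self):
        return self.sequence


molecule = Molecule('water')
protein = Protein('example protein', 'MKT')  # made-up values

print(protein.getName())             # inherited from Molecule
print(protein.getCapitalisedName())  # also inherited
print(protein.getSequence())         # defined only in Protein
print(Molecule.getName(protein))     # explicit form gives the same name

# The superclass does not gain the subclass function
print(hasattr(molecule, 'getSequence'))  # False
```

Calling molecule.getSequence() directly would raise an AttributeError, because that function exists only in the Protein class.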

Source: Python Programming for Biology, Tim J. Stevens (pages 120-123)