
Why Was Python Created in the First Place?

In document Python for Bioinformatics (Page 34-41)

Here is an account by Guido van Rossum, Python’s author, of what motivated him to “invent” a new computer language:

“I was working in the Amoeba distributed operating system group at CWI.

We needed a better way to do system administration than by writing either C programs or Bourne shell scripts, since Amoeba had its own system call interface which wasn’t easily accessible from the Bourne shell. My experience with error handling in Amoeba made me acutely aware of the importance of exceptions as a programming language feature.

It occurred to me that a scripting language with a syntax like ABC but with access to the Amoeba system calls would fill the need. I realized that it would be foolish to write an Amoeba-specific language, so I decided that I needed a language that was generally extensible.

During the 1989 Christmas holidays, I had a lot of time on my hand, so I decided to give it a try. During the next year, while still mostly working on it in my own time, Python was used in the Amoeba project with increasing success, and the feedback from colleagues made me add many early improvements.

In February 1991, after just over a year of development, I decided to post to USENET. The rest is in the Misc/HISTORY file.”

In January 2009, Guido started a blog devoted to Python history. It can be found at http://python-history.blogspot.com.

1.5.2 Comparing Python with Other Languages

You may be wondering why you should use Python rather than better-known languages such as C, Perl or Java. It is a good question. A programming language can be regarded as a tool, and choosing the best tool for the job makes a lot of sense.

Readability

Nonprofessional programmers tend to value a gentle learning curve as much as legible code (the two aspects are closely related).

A simple “hello world” program in Python looks like this:

print("Hello world!")

Compare it with the equivalent code in Java:

public class Hello {
    public static void main(String[] args) {
        System.out.printf("Hello world!");
    }
}

Now consider a code sample in C. The following program reads a file (input.txt) and copies its contents into another file (output.txt):

#include <stdio.h>

int main(int argc, char **argv) {
    FILE *in, *out;

The same program in Python is shorter and easier to read:

infile = open("input.txt")
outfile = open("output.txt", "w")
outfile.writelines(infile)
infile.close()
outfile.close()

A one-liner could also do the job:

open("output.txt", "w").writelines(open("input.txt"))
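Neither version above guarantees that the files are closed if an error occurs halfway through. A minimal sketch of the same copy using the with statement, which closes both files automatically, could look like this. The function name copy_text_file and the throwaway directory are assumptions made only for the demonstration:

```python
import os
import tempfile


def copy_text_file(src, dst):
    """Copy src to dst line by line; both files are closed automatically."""
    with open(src) as infile, open(dst, "w") as outfile:
        outfile.writelines(infile)


# Demonstration in a throwaway directory (the file names are only examples).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.txt")
dst = os.path.join(workdir, "output.txt")
with open(src, "w") as handle:
    handle.write("first line\nsecond line\n")

copy_text_file(src, dst)

with open(dst) as handle:
    copied = handle.read()
```

The extra lines buy safety rather than brevity: the files are closed even if writelines raises an exception.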

Let’s see a Perl program that calculates the average of a series of numbers:

sub avg(@_) {

The equivalent program in Python:

def avg(data):

The purpose of this Python program could be almost fully understood by just knowing English.
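To illustrate that point, a minimal averaging function might read as follows. The body is a sketch written for this illustration, not the original listing, and returning 0 for an empty sequence is an assumption:

```python
def avg(data):
    """Return the arithmetic mean of a sequence of numbers.

    Returning 0 for an empty sequence is an assumption made for this sketch.
    """
    if len(data) == 0:
        return 0
    # float() keeps the division exact under both Python 2 and Python 3.
    return sum(data) / float(len(data))


result = avg([2, 4, 9])
```

Read aloud, the body says almost exactly what it does: if the length of the data is zero, return zero; otherwise return the sum of the data divided by its length.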

Python is designed to be a highly readable language.5 The use of English keywords and the use of indentation to delimit code blocks and expose their internal logic both contribute to this end. It is possible to write hard-to-read code in Python, but it takes a deliberate effort to obfuscate the code.6

Speed

Another parameter to consider when choosing a programming language is code execution speed. In the early days of computer programming, computers were so slow that differences due to language implementation were very significant. A program could take a week to run in an interpreted language, while the same code in a compiled language could run in a day. The performance ratio between interpreted and compiled languages remains about the same, but it matters less: a program that once took a week to run now takes less than ten seconds, while the compiled version takes about one second. Although the difference seems important, it matters little once development time is taken into account.

This does not mean that execution speed can be ignored. A 10X speed difference can be crucial in some high-performance computing operations. Sometimes large improvements can be achieved by writing optimized code. If the code is written with speed in mind, it is possible to obtain results quite similar to those of a compiled language. When the programmer is not satisfied with the speed obtained with Python, it is possible to link to an external library written in another language (such as C or Fortran). This way, we can get the best of both worlds: the ease of Python programming with the speed of a compiled language.
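As a small sketch of the point about optimized code, the two functions below compute the same total, one with an explicit Python loop and one with the built-in sum(), which in CPython is implemented in C. The function names and the list size are arbitrary choices for illustration, and the timings printed will vary from machine to machine:

```python
import timeit


def total_with_loop(numbers):
    """Add the numbers one at a time in an explicit Python loop."""
    total = 0
    for number in numbers:
        total += number
    return total


def total_with_builtin(numbers):
    """Delegate the same work to the built-in sum()."""
    return sum(numbers)


numbers = list(range(100000))

# Run each version 20 times and report the total elapsed time in seconds.
loop_seconds = timeit.timeit(lambda: total_with_loop(numbers), number=20)
builtin_seconds = timeit.timeit(lambda: total_with_builtin(numbers), number=20)

print("explicit loop: %.4f s" % loop_seconds)
print("built-in sum:  %.4f s" % builtin_seconds)
```

Both functions return the same value; only the amount of work done by the interpreter differs, which is the kind of gain that optimized code or a compiled library can provide.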

1.5.3 How Is It Used?

Python has a wide range of applications. From cell phones to web servers, thousands of Python applications are installed in the most diverse fields.

Python code powers Wikipedia robots and the OLPC (One Laptop Per Child) project,7 and Python is a scripting language of the OpenOffice suite.8

5 Other languages are regarded as “write only,” since their code is very difficult to understand once it has been written.

6 A simple print("Hello World") program could be written, if you are so inclined, as print(''.join([chr((L>=65 and L<=122) and (((((L>=97) and (L-96) or (L-64))-1)+13)%26+((L>=97) and 97 or 65)) or L) for L in [ord(C) for C in 'Uryyb Jbeyq!']])) (py3.us/1).

7 http://wiki.laptop.org/go/OLPC_Python_Environment

8 http://wiki.services.openoffice.org/wiki/Python

Some languages are strong in one niche (like Perl and PHP for web applications, or Java for desktop programs), but Python can’t be typecast so easily.

With a single code base, Python desktop applications run with a native look and feel on multiple platforms. Well-known examples in this category include the BitTorrent p2p client/server; Emesene, an IM client for Windows Live Messenger; media players like Exaile and Tim Player; and even a CAD package, PythonCAD.

As a language for building web applications, Python can be found in Zooomr.com, a popular image sharing site, as well as in several other Web sites like Google, Yahoo and Nasa.gov. There is specialized software for building Web sites (called web frameworks) in Python, such as Django, Pylons, Zope and TurboGears. Tools for accessing web services are also available in Python (Yahoo Python Developer Center,9 Google Data API,10 Facebook API11).

Python also excels in small one-use scripts. Not all programs are meant to be publicly released; some are built just to solve a user’s problem. From system administration to data analysis, Python provides a wide range of tools to this end:

• Generic Operating System Services (os, io, time, curses)

• File and Directory Access (os.path, glob, tempfile, shutil)

• Data Compression and Archiving (zipfile, gzip, bz2)

• Interprocess Communication and Networking (subprocess, socket, ssl)

• Internet Data Handling (email, mimetools, rfc822)

• Internet Protocols (cgi, urllib, urlparse)

• String Services (string, re, codecs, unicodedata)
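As a sketch of the kind of one-use script these modules make possible, the example below combines glob, os.path, gzip, shutil and tempfile from the list above to compress every .txt file in a directory. The function name compress_text_files and the sample file names are invented for this illustration:

```python
import glob
import gzip
import os
import shutil
import tempfile


def compress_text_files(directory):
    """Gzip every .txt file in directory and return the new file names."""
    compressed = []
    for path in sorted(glob.glob(os.path.join(directory, "*.txt"))):
        target = path + ".gz"
        with open(path, "rb") as source, gzip.open(target, "wb") as archive:
            shutil.copyfileobj(source, archive)
        compressed.append(os.path.basename(target))
    return compressed


# Throwaway directory with two sample files (names are made up for the demo).
workdir = tempfile.mkdtemp()
for name in ("notes.txt", "results.txt"):
    with open(os.path.join(workdir, name), "wb") as handle:
        handle.write(b"sample data\n")

created = compress_text_files(workdir)

# Read one archive back to check that the round trip preserves the content.
with gzip.open(os.path.join(workdir, created[0]), "rb") as archive:
    restored = archive.read()

shutil.rmtree(workdir)  # clean up the demo directory
```

Nothing here needs to be installed separately; every module used ships with the standard Python distribution.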

Python is gaining users in the scientific community. There is a library (SciPy) that integrates several modules for linear algebra, signal processing, optimization, statistics, genetic algorithms, interpolation, ODE solvers, special functions, etc. Python supports parallel programming (if you have appropriate hardware) with pyMPI, as well as 2D/3D scientific data plotting.

Python is known to be used in wide and diverse fields like engineering, electronics, astronomy, biology, paleomagnetism, geography, and many more.

9 http://developer.yahoo.com/python

10 http://code.google.com/p/gdata-python-client

11 http://wiki.developers.facebook.com/index.php/PythonPyFacebookTutorial

1.5.4 Who Uses Python?

Python is used by many companies, from small and unknown shops to big players in their fields like Google, Yahoo, Disney, NASA, NYSE, and many more.

Google, for instance, has three “official languages” for deployment in production services: Java, C++ and Python. It has Web sites made in Python,12 stand-alone programs13 and even hosting solutions.14 As confirmation that Google takes Python seriously, in December 2005 the company hired Guido van Rossum, the creator of Python, who spends most of his time improving the language. Python may not be Google’s main language, but this shows that the company is a strong supporter of it.

Even Microsoft, a company not known for its support of open source programs, has developed a version of Python that runs on its “.Net” platform. This version is called IronPython.

Many well-known Linux distributions already use Python in their key tools. Red Hat’s Anaconda installer and Gentoo’s Portage package manager are two examples. Ubuntu Linux (the most successful Linux distribution at this time) “... prefers the community to contribute work in Python.” Python is so tightly integrated into Linux that some distributions won’t run without a working copy of Python.

1.5.5 Flavors of Python

Although in this book I refer to Python as one specific programming language, Python is actually a language definition. What we use for programming is a specific implementation. Since one implementation is used by most Python programmers (CPython, also known simply as Python), this distinction is often overlooked.

The most relevant Python implementations are CPython, PyPy,15 Stackless,16 Jython17 and IronPython. This book will focus on the standard Python version (CPython), but it is worth knowing about the different versions.

• CPython: The most widely used Python version, which is why the terms CPython and Python are used interchangeably. It is written mostly in C (with some modules written in Python) and is the version available from the official Python Web site (http://www.python.org).

12 See the “.py” at http://www.google.com/support/bin/topic.py?topic=352.

13 http://code.google.com/p/sitemap-generators

14 http://code.google.com/appengine

15 http://codespeak.net/pypy/dist/pypy/doc/home.html

16 http://www.stackless.com

17 http://www.jython.org/Project

• PyPy: A Python version made in Python. It was conceived to allow programmers to experiment with the language in a flexible way (to change Python code without knowing C). It is mostly an experimental platform.

• Stackless: Another experimental Python implementation. Unlike PyPy, it does not aim at flexibility; instead, it provides advanced features not available in the “standard” Python version. This is done in order to overcome some design decisions taken early in Python’s development history. Stackless allows custom-designed Python applications to scale better than their CPython counterparts. This implementation is being used in the EVE Online massively multi-player online game, Civilization IV, Second Life, and Twisted.

• Jython: A Python version written in Java. It runs in a JVM (Java Virtual Machine). One application of Jython is to add the Jython libraries to a Java system so that users can add functionality to the application. A very well-known 3D environment for learning programming (Alice18) uses Jython to let users program their own scripts.

• IronPython: Python version adapted by Microsoft to run on the “.Net” and “Mono” platforms. .Net is a technology that aims to compete with Java regarding “write once, run everywhere.” Another use of IronPython envisioned by Microsoft is as a scripting language that runs in the Web browser alongside Silverlight (a Flash-like Microsoft technology).
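Because the same source code can run under any of these implementations, a program can ask at run time which one is executing it. A minimal sketch using the standard platform module:

```python
import platform
import sys

# Name of the running implementation, for example "CPython" or "PyPy".
implementation = platform.python_implementation()

# Version of the Python language that this implementation provides.
version = platform.python_version()

print("Running %s %s" % (implementation, version))
print(sys.version)
```

Under the standard interpreter from python.org this reports CPython; the other implementations listed above report their own names.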

1.5.6 Special Python Bundles

Apart from Python implementations, there are some special adaptations of the original cPython that are packaged for specific purposes:

• Python(x,y): It is defined as a “free scientific and engineering development software for numerical computations, data analysis and data visualization based on Python programming language, Qt graphical user interfaces (and development framework) and Eclipse integrated development environment.” In other words, it is a bundle of several Python-related packages that eases their installation and use. The main advantage of Python(x,y) is that by installing just one program you end up with a complete development environment that includes Eclipse, IPython, C++, Fortran, extensive documentation, Numeric, SciPy, Mayavi, and others. It is available at http://www.pythonxy.com. At the time of writing, it was available only for Windows.19 The main drawback of this approach is that the resulting package is about 254 MB in size (or 150 MB without Eclipse).

18 Alice is available for free at http://www.alice.org.

19 With an “available soon...” for Linux on the download page.

• Enthought Python Distribution (EPD): Another “all-in-one” Python solution. It includes over 60 additional tools and libraries, like NumPy, SciPy, IPython, 2D and 3D visualization, database adapters, and other libraries. Everything is available as a single-click installer for Windows XP, Mac OS X (a universal binary for Intel 10.4 and above), and RedHat EL3 and EL4 (x86 and amd64). This bundle is suitable for scientific users, and it is made by the same people who made NumPy and SciPy. It is free for academic and nonprofit private-sector use, and available for an annual fee for commercial and governmental use. It is available at http://www.enthought.com/products/epd.php, and since it includes so many libraries, the resulting size is about 400 MB.

• PortablePython: A Python version capable of running without being installed. It can be used to carry a working programming environment on a pen drive or any other removable storage unit. Another application of PortablePython is to distribute Python programs to people who can’t or don’t want to install Python (as in some controlled corporate and academic environments).

