In this chapter I have described the development of the JWS Online system into a usable web-based repository of kinetic models. By combining freely available programming languages (Java and Python), commercial software (Mathematica and J/Link) and open source software (the Apache web server), it is now possible to take a model described ‘on paper’ and make it freely accessible via the internet on three international servers. Among other things, this means that many classical models that have often been referred to, but never actually run, can now be worked with and studied. It is also possible for an author to submit a paper containing a model to a journal and have the model accessible first to the paper’s reviewers and then, as soon as the paper is published, to the public. JWS Online has also been used as a concept for larger Silicon Cell projects¹⁵. Although these larger projects are still in the conceptual stages, the JWS Online system has already been used very successfully in the teaching of modelling at both undergraduate and postgraduate level.
At the time of writing, we are unaware of any freely available, online modelling system that provides services equivalent to those provided by JWS Online to the broader scientific community.
Although a classic cliché, ‘the proof of the pudding is in the eating’ still contains a measure of truth, in that some of a service’s utility depends on its usage. Since the inception of JWS Online in 2002, we have seen a steady increase in the number of users who regularly use the system. For example, an analysis of the log files of the Stellenbosch (South Africa) server shows that, over a period of a year, the system was used repeatedly by clients connecting from Asia, Europe, the USA and South America. This suggests that JWS Online is starting to achieve its goal of becoming a global Computational Systems Biology resource.
¹⁵ http://www.siliconcell.net/ refers to the Amsterdam JWS Online mirror site as: ‘Silicon Cell ready to use: the website with silicon cells that can be run over the web’
8 Conclusion
8.1 Future plans
8.1.1 PySCeS
Of course, no software application is ever complete, and PySCeS has a number of modules that are either in the planning or early development stages.
Systems Biology Markup Language
One such module currently being tested is an interface to the Systems Biology Markup Language (SBML) [121]. SBML provides a neutral formal language for exchanging models between various modelling programs. The current SBML implementation (Level 2) tries to cater for a wide range of cellular model types, such as metabolic networks, signal transduction cascades and gene regulatory networks. However, its flexibility also makes it rather cumbersome and difficult to use. A real danger is that applications may use SBML in a non-standard way, thereby generating models that cannot be interpreted by other applications. Nevertheless, being able to read and write SBML is invaluable in any application that aims to be of use to the systems biology community. SBML compatibility has therefore been given a high development priority in PySCeS. To be fully interoperable with other SBML applications, it is important that PySCeS be able to deal with multi-compartment models, which means that functionality for specifying and using compartment volumes must be incorporated in future.
PySCeS will soon provide a facility for saving models in either SBML Level 1 or Level 2 format, and for converting SBML model files to PySCeS input files. It will use the standard libSBML library, which is freely available from the SBML group and already has Python bindings.
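To illustrate why compartment volumes matter for a multi-compartment model, the following minimal sketch reads compartments and species from a small SBML Level 2 document and converts concentrations to amounts. It is not the PySCeS interface and does not use libSBML; it assumes only the Python standard library XML parser, and the model, the function name `read_compartments_and_species` and all identifiers in the document are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by SBML Level 2 Version 1 documents.
SBML_L2_NS = "http://www.sbml.org/sbml/level2"

# Hypothetical two-compartment model, written by hand for this example.
EXAMPLE_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol" size="1.0"/>
      <compartment id="mitochondrion" size="0.1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="ATP_c" compartment="cytosol" initialConcentration="2.5"/>
      <species id="ATP_m" compartment="mitochondrion" initialConcentration="4.0"/>
    </listOfSpecies>
  </model>
</sbml>
"""


def read_compartments_and_species(sbml_text):
    """Return (compartment sizes, species info) from an SBML Level 2 string."""
    ns = {"sbml": SBML_L2_NS}
    root = ET.fromstring(sbml_text.strip())

    # Map each compartment id to its size (volume).
    compartments = {
        c.get("id"): float(c.get("size"))
        for c in root.findall(".//sbml:compartment", ns)
    }

    # For each species, record its compartment and its amount,
    # where amount = concentration * compartment size.
    species = {}
    for s in root.findall(".//sbml:species", ns):
        compartment = s.get("compartment")
        concentration = float(s.get("initialConcentration"))
        species[s.get("id")] = {
            "compartment": compartment,
            "concentration": concentration,
            "amount": concentration * compartments[compartment],
        }
    return compartments, species


compartments, species = read_compartments_and_species(EXAMPLE_SBML)
for name, info in species.items():
    print(name, info["compartment"], info["amount"])
```

The two ATP pools have different concentrations and very different amounts, which is exactly the information a single-compartment model cannot represent. A production implementation would more plausibly rely on libSBML for validation and for the full range of SBML constructs.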
Stochastic modelling
At present PySCeS is limited to the simulation of deterministic systems. Recently, the modelling of signal transduction and gene regulatory networks that deal with small numbers of component molecules has become increasingly important. It is therefore vital for any modelling application to be able to perform stochastic simulations with algorithms such as those developed by Gillespie and others [190, 191].
Since PySCeS was developed in a research environment that concentrates mostly on deterministic modelling, incorporating the capability for stochastic simulation initially had a low priority. However, no modern cellular simulator is complete without the ability to model stochastic processes, and we plan to write PySCeS interfaces to the relevant algorithms.
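As an indication of what such an algorithm involves, the following is a minimal sketch of Gillespie's direct method applied to a reversible isomerisation A ⇌ B with a small number of molecules. It is not part of PySCeS; the function name `gillespie_direct`, the rate constants and the initial counts are assumptions chosen purely for illustration.

```python
import random


def gillespie_direct(initial_state, stoichiometry, propensities, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    initial_state: dict mapping species name to molecule count
    stoichiometry: list of dicts, one per reaction, species -> change
    propensities:  list of functions, one per reaction, state -> rate
    """
    t = 0.0
    state = dict(initial_state)
    trajectory = [(t, dict(state))]
    while True:
        a = [f(state) for f in propensities]
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        tau = rng.expovariate(a_total)
        if t + tau > t_end:
            break
        t += tau
        # Choose reaction j with probability a[j] / a_total.
        threshold = rng.random() * a_total
        cumulative = 0.0
        for j, a_j in enumerate(a):
            cumulative += a_j
            if threshold < cumulative:
                break
        # Apply the stoichiometric change of the chosen reaction.
        for species, change in stoichiometry[j].items():
            state[species] += change
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example: A <-> B with 100 molecules of A at time zero.
k_forward, k_reverse = 1.0, 0.5
stoichiometry = [{"A": -1, "B": +1}, {"A": +1, "B": -1}]
propensities = [
    lambda s: k_forward * s["A"],
    lambda s: k_reverse * s["B"],
]
trajectory = gillespie_direct(
    {"A": 100, "B": 0}, stoichiometry, propensities, t_end=5.0,
    rng=random.Random(1),
)
final_time, final_state = trajectory[-1]
print(len(trajectory) - 1, "reaction events; final state:", final_state)
```

Each run with a different seed gives a different trajectory, so in practice many runs are averaged or their distribution is examined. The cost grows with the number of reaction events, which is why this approach suits systems with small molecule numbers.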
Strategies for dealing with large models
As more and more of the cell’s components are identified and characterized and with the modelling of new types of cellular system (for example signal transduction networks) it is inevitable that models will grow rapidly in size and interconnectedness. This has a direct impact on the speed of an application; for example, matrix operations become significant bottlenecks in program execution. As far as PySCeS is concerned there are a few strategies that we plan to employ to deal with this problem at a computational level.
The first strategy would be to include support for sparse matrix data types and basic operations on sparse matrices. An interesting development in this regard is that the SciPy developers have begun to build sparse matrix support into SciPy and, when fully integrated, these routines will be available for use by PySCeS. A second strategy, which could be especially relevant for stoichiometric analysis, involves switching from the conventional LAPACK routines to modern iterative solvers and factorization algorithms [192]. A third approach involves parallelization of the core solver and linear algebra routines so that PySCeS is able to take advantage of a high performance, multiprocessor architecture such as a Beowulf cluster. This could also include a job scheduler that distributes parameter and optimization routines over multiple processors.
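The idea behind the first strategy can be sketched in a few lines. The example below stores only the non-zero entries of a stoichiometric matrix N and computes the rate of change of each species, N·v, by visiting those entries alone. It uses plain Python dictionaries rather than the SciPy sparse types, and the pathway, the flux values and the function names `sparse_from_dense` and `sparse_matvec` are hypothetical.

```python
def sparse_from_dense(rows):
    """Keep only the non-zero entries of a dense matrix as {(i, j): value}."""
    return {
        (i, j): value
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value != 0
    }


def sparse_matvec(entries, n_rows, vector):
    """Multiply a sparse matrix by a dense vector, touching non-zeros only."""
    result = [0.0] * n_rows
    for (i, j), value in entries.items():
        result[i] += value * vector[j]
    return result


# Hypothetical linear pathway: -> S1 -> S2 -> S3 ->
# Rows are species S1..S3, columns are reactions R1..R4.
dense_N = [
    [1, -1, 0, 0],
    [0, 1, -1, 0],
    [0, 0, 1, -1],
]
sparse_N = sparse_from_dense(dense_N)

# Rates of change for an unbalanced set of fluxes.
fluxes = [3.0, 2.0, 2.0, 1.0]
rates_of_change = sparse_matvec(sparse_N, len(dense_N), fluxes)
print(len(sparse_N), "non-zero entries out of", len(dense_N) * len(dense_N[0]))
print(rates_of_change)
```

In this small pathway half of the entries are zero; in a large network most reactions involve only a handful of species, so the fraction of non-zero entries becomes much smaller and the saving in storage and arithmetic grows accordingly.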
The wider use of PySCeS
PySCeS has now developed to the point where it fulfils its original design specifications of accessibility and flexibility, and includes the tools that were planned when it was first conceptualized. However, to move beyond being an in-house tool and become a widely used modelling program, it needs to be accessible to the systems biology community. To that end it is significant that PySCeS is the only interactive modelling application that runs on all the major operating systems, namely Windows, Linux and Mac OS X. It also has significant advantages in being based on a standardized library of algorithms (SciPy) and in that it can simply be included as a module in a larger application. This makes PySCeS an attractive point of departure for other projects; the design of PySCeS should allow it to integrate with other systems biology tools and frameworks, such as the Systems Biology Workbench (SBW) [23]. To make PySCeS as widely available as possible, it can be obtained freely from the major repository of open source software, namely SourceForge.
8.1.2 JWS Online
The JWS Online project already plays an important role in the systems biology community in that it is at present the only interactive repository of metabolic models. However, up until now we have concentrated on optimising its interactivity rather than its function as a database. To rectify this, we have started to make every model in the repository available for download either in SBML or as a PySCeS input file. This will strengthen JWS Online’s claim to be a repository of curated models.
On the technical side, JWS Online is being redeveloped to use webMathematica, which will greatly simplify the management and administration of the JWS Online system.