Modelling biological systems
3.1 The BRAINCIRC modelling environment
3.1.3 Markup languages for modelling
Markup languages are popular formats for exchanging information. The two markup languages most commonly used for biological modelling are the Systems Biology Markup Language (SBML) and CellML. SBML is an XML-based format for representing reaction networks [183, 184]. CellML [185] is a similar project, but focuses less on chemical reactions and more on reusing models in a modular way. Other XML-based formats exist for exchanging different types of information, including FieldML for representing mathematical fields such as the distribution of biochemical compounds, and the Simulation Experiment Description Markup Language (SED-ML) for encoding simulation details. Only SBML and CellML are discussed further here, since these are the most well established and the most relevant to the BRAINCIRC software.
Online databases exist to share models in both the SBML and CellML formats. The BioModels database [178] is the largest of these. The latest version (June 2013) contains 143 013 models; however, only 963 of these are published models, with the remainder generated from reaction pathways. The database accepts models in both SBML and CellML formats. The CellML model repository [186] is specifically for CellML models and (accessed August 2013) contains 458 models. Both repositories consist mainly of models concentrating on specific aspects of biochemistry, although they also contain larger and organ-scale models. The CellML model repository is divided into categories; there are 33 models in the ‘Neurobiology’ category. Only one of these, the model by Cloutier et al. [152], is a general model of brain metabolism. The ‘Metabolism’ category contains 78 models, including several of the models mentioned in Chapter 2 [152, 146, 166, 144, 170]. The BioModels database is not divided into categories in a similar way; however, a search for ‘brain’ returns 144 models, including, again, the model by Cloutier et al. [152] and also the Braincirc model by Banaji et al. [180] described in Section 2.2.
Many software tools exist for creating, analysing and using SBML and CellML models. The SBML website [187] contains a software matrix for comparing different tools, of which 254 are currently listed, including BRAINCIRC. The CellML website lists around 40 software tools, including two main multi-purpose modelling environments, OpenCell [188] and OpenCOR [189].
Given the popularity of these two XML formats, it is important that the BRAINCIRC environment is capable of importing and exporting models encoded in SBML and CellML. BRAINCIRC can import SBML models and convert them to the BRAINCIRC format; however, it has no export facilities. Therefore, an SBML exporter for models in the BRAINCIRC format has been developed. The details of this, and the differences between the two formats, are discussed in the following section. Adding similar capabilities for CellML is a future priority. There are tools that allow conversion between CellML and SBML in both directions. However, these currently have limited functionality, although they are likely to be improved in the future.
Comparison between BRAINCIRC format and SBML
The SBML exporter is written in Python using the libSBML Python API [190]. There are several differences in the way SBML and BRAINCIRC specify models that can render the import and export processes imperfect, the main problem being that chemical reactions are not preserved. The current export method translates all BRAINCIRC parameters and variables into SBML ‘parameters’. Reactions are translated into terms in the differential equations that govern the rate of change of these SBML parameters, rather than into SBML reactions. The reason for this lies in the different ways in which chemical reactions are represented in the two formats, which may stem from the fact that SBML is more oriented towards reaction networks than BRAINCIRC.
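The shape of the exported output described above can be sketched as follows. This is not the exporter's code (which uses libSBML); it is a minimal illustration built with the Python standard library only, and the helper names `apply_op`, `ci` and `rate_rule` are hypothetical. The element and attribute names (`parameter`, `rateRule`, `variable`, `constant`) are those of SBML.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def apply_op(op, *args):
    """Build a Content MathML <apply> node for operator `op` (hypothetical helper)."""
    node = ET.Element("apply")
    ET.SubElement(node, op)
    node.extend(args)
    return node


def ci(name):
    """Build a <ci> identifier node (hypothetical helper)."""
    node = ET.Element("ci")
    node.text = name
    return node


def rate_rule(variable, math_body):
    """Build an SBML <rateRule> stating d(variable)/dt = math_body."""
    rule = ET.Element("rateRule", variable=variable)
    math = ET.SubElement(rule, "math", xmlns=MATHML_NS)
    math.append(math_body)
    return rule


# Every quantity becomes a <parameter>; those that change in time are
# marked constant="false" so that a rate rule may target them.
parameters = ET.Element("listOfParameters")
for name, is_constant in [("k", True), ("S_1", False), ("S_2", False)]:
    ET.SubElement(parameters, "parameter", id=name,
                  constant="true" if is_constant else "false")

# A reaction S_1 -> S_2 with rate k*S_1 contributes -k*S_1 and +k*S_1
# to the two rate rules; no SBML <reaction> element is produced.
rules = ET.Element("listOfRules")
rules.append(rate_rule("S_1", apply_op("minus", apply_op("times", ci("k"), ci("S_1")))))
rules.append(rate_rule("S_2", apply_op("times", ci("k"), ci("S_1"))))

fragment = (ET.tostring(parameters, encoding="unicode")
            + ET.tostring(rules, encoding="unicode"))
```

The information that S_1 and S_2 are linked by a reaction is lost in this form: a reader of the SBML sees only two differential equations that happen to share a term.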
In SBML, quantities taking part in reactions must be defined as species. Each species must reside within a defined compartment. The rate of a chemical reaction must be given as a change in amount of species per unit time, rather than (as is more usual) as a change in concentration per unit time. This is so that reactions between species in different compartments, such as transport reactions, are handled correctly. To illustrate this, consider the reaction $R_1$ described by
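$$ S_1 \longrightarrow S_2 \tag{3.1} $$
where $S_1$ and $S_2$ are species in the same compartment. The rate of the reaction is given by
$$ \frac{d[S_2]}{dt} = -\frac{d[S_1]}{dt} = k[S_1] \tag{3.2} $$
where $k$ is a constant. In SBML, the reaction would be represented as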
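<reaction id="R_1">
  <listOfReactants> <speciesReference species="S_1"/> </listOfReactants>
  <listOfProducts> <speciesReference species="S_2"/> </listOfProducts>
  <kineticLaw>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <apply>
      <times /> <ci> k </ci> <ci> S_1 </ci> <ci> c_1 </ci>
    </apply> </math>
  </kineticLaw>
</reaction>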
where $c_1$ is the volume of the compartment containing $S_1$ and $S_2$. The kineticLaw element describes the rate of change of the amounts of $S_1$ and $S_2$ in moles ($A_{S_1}$ and $A_{S_2}$), and it is then assumed that
$$ \frac{dA_{S_2}}{dt} = -\frac{dA_{S_1}}{dt} = k c_1 [S_1] $$
which, since $A_{S_1} = c_1[S_1]$ and $A_{S_2} = c_1[S_2]$, is equivalent to Equation 3.2. If $S_2$ is instead defined to reside in a different compartment from $S_1$, with volume $c_2$, Equation 3.2 is no longer valid. The above SBML reaction definition does remain valid, however, because the rates are now considered to be
$$ \frac{d(c_2[S_2])}{dt} = -\frac{d(c_1[S_1])}{dt} = k c_1 [S_1]. \tag{3.5} $$
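As a rough numerical check of Equation 3.5, the sketch below integrates the amounts of the two species with a forward Euler step. The function name, step size and parameter values are illustrative assumptions and are not taken from SBML or BRAINCIRC.

```python
def simulate_transfer(k, c1, c2, s1_0, s2_0, dt=1e-3, steps=5000):
    """Forward Euler integration of S1 -> S2 using amount-based rates.

    s1_0 and s2_0 are initial concentrations; c1 and c2 are the volumes
    of the compartments containing S1 and S2 respectively.
    """
    a1 = c1 * s1_0  # amount of S1 (e.g. moles)
    a2 = c2 * s2_0  # amount of S2
    for _ in range(steps):
        flux = k * c1 * (a1 / c1)  # kinetic law k * c1 * [S1], amount per time
        a1 -= flux * dt
        a2 += flux * dt
    return a1 / c1, a2 / c2  # convert back to concentrations


s1, s2 = simulate_transfer(k=1.0, c1=1.0, c2=2.0, s1_0=1.0, s2_0=0.0)
# The total amount c1*[S1] + c2*[S2] is conserved; [S1] + [S2] is not.
total_amount = 1.0 * s1 + 2.0 * s2
```

Because the flux is expressed as an amount per unit time, whatever leaves one compartment arrives in the other, regardless of the two volumes. Writing the same reaction with Equation 3.2 would instead conserve the sum of concentrations, which is only correct when the volumes are equal.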
BRAINCIRC has no global concept of compartments, nor does it distinguish between a species and a non-chemical variable. Reaction rates are specified in the more usual way, relative to the concentrations of the chemicals involved. The reaction above, with the two substances in one compartment, would be represented in BRAINCIRC as
name: R_1
type: MA1
left: 1.0, S_1
right: 1.0, S_2
rates: k
and the rate terms in Equation 3.2 would be automatically generated when the model is compiled. If a reaction includes chemicals in more than one compartment, this must be made explicit in its definition by specifying, for each chemical involved, the volume of the compartment relative to the volume of the native compartment of the reaction. In the second case therefore, with the two substances in different compartments, the reaction definition must be changed to
name: R_1
type: MA1
left: 1.0, S_1
right: 1.0, S_2
rates: k
comps: 2
compsleft: 1.0
compsright: (c_2 / c_1)
in order for the correct rate equations to be generated.
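How rate terms might be generated from such a definition can be sketched as follows. This is not BRAINCIRC's code generator: the function `rate_terms` and the dictionary layout are hypothetical, and the sketch assumes that the reaction type above denotes irreversible mass-action kinetics and that each term is divided by the relative volume of the compartment of the chemical concerned.

```python
def rate_terms(reaction, values):
    """Return d[X]/dt contributions for one irreversible mass-action reaction.

    `reaction` mirrors the fields of the definition above; `values` maps
    names to numbers. Relative compartment volumes default to 1.0.
    """
    left = reaction["left"]    # list of (stoichiometry, name)
    right = reaction["right"]
    comps_left = reaction.get("compsleft", [1.0] * len(left))
    comps_right = reaction.get("compsright", [1.0] * len(right))

    # Mass-action rate per unit volume of the reaction's native compartment.
    rate = values[reaction["rates"]]
    for stoich, name in left:
        rate *= values[name] ** stoich

    terms = {}
    for (stoich, name), rel_vol in zip(left, comps_left):
        terms[name] = terms.get(name, 0.0) - stoich * rate / rel_vol
    for (stoich, name), rel_vol in zip(right, comps_right):
        terms[name] = terms.get(name, 0.0) + stoich * rate / rel_vol
    return terms


values = {"k": 0.5, "S_1": 2.0, "S_2": 0.0, "c_1": 1.0, "c_2": 4.0}
r1 = {
    "name": "R_1", "left": [(1.0, "S_1")], "right": [(1.0, "S_2")],
    "rates": "k", "compsleft": [1.0],
    "compsright": [values["c_2"] / values["c_1"]],
}
terms = rate_terms(r1, values)
```

With these illustrative values the concentration of S_2 rises at a quarter of the rate at which that of S_1 falls, which agrees with Equation 3.5 for a compartment four times larger. Omitting `compsright` would silently fall back to Equation 3.2, which is the kind of mistake the explicit specification invites.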
The BRAINCIRC method is simpler than the SBML method for models with only one compartment. It also gives greater flexibility, but in practice this flexibility is rarely taken advantage of and can lead to mistakes. In its current form, therefore, the method for specifying chemical reactions in BRAINCIRC is better thought of as a shortcut for writing differential equations than as a comprehensive specification of reactions. An improvement would be to adopt a system more similar to SBML, in which the compartment of each chemical is defined.
In addition, BRAINCIRC currently takes every variable involved in a chemical reaction to be a true variable, even if (as is commonly the case) some can be expressed in terms of others as temporary variables. Identifying sets of true variables and temporary variables would simplify the simulation process. This is not straightforward, however, because the stoichiometry may be expressed in terms of parameters whose values are not defined until runtime. A proper analysis of the reactions would therefore require symbolic mathematics.
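For the simpler case in which all stoichiometric coefficients are known numbers, the variables that can be expressed in terms of others follow from the conservation relations of the stoichiometric matrix. The sketch below finds these by exact Gaussian elimination; the function name is hypothetical, and the approach is a standard linear-algebra technique rather than anything implemented in BRAINCIRC. It does not address the parametric case described above.

```python
from fractions import Fraction


def conservation_relations(stoich):
    """Return vectors g with g . N = 0 for a stoichiometric matrix N.

    `stoich` is a list of rows, one per species, with one column per
    reaction. Each returned vector gives a linear combination of species
    whose value stays constant.
    """
    n_species = len(stoich)
    n_reactions = len(stoich[0])
    # Augment N with the identity so that row operations are recorded.
    rows = [
        [Fraction(x) for x in row]
        + [Fraction(int(i == j)) for j in range(n_species)]
        for i, row in enumerate(stoich)
    ]
    pivot_row = 0
    for col in range(n_reactions):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pivot = next(
            (r for r in range(pivot_row, n_species) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        for r in range(pivot_row + 1, n_species):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Rows whose reaction part has vanished encode conserved combinations.
    return [row[n_reactions:] for row in rows[pivot_row:]]


# S1 -> S2 in one compartment: N = [[-1], [1]], so S1 + S2 is conserved.
relations = conservation_relations([[-1], [1]])
```

Each relation allows one species to be treated as a temporary variable computed from the others, so the number of differential equations to integrate drops by the number of relations found.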
Another difficulty in translating BRAINCIRC models into SBML models is the feature of BRAINCIRC that allows quantities to be defined by explicitly written C functions.
Mathematical expressions in SBML are defined using a restricted subset of Content MathML. MathML is a W3C recommendation for describing mathematics [191]. It can be used to describe both the presentation of displayed mathematics and the content of mathematical expressions. Not every C function that can be used in BRAINCIRC can be translated into MathML. For example, BRAINCIRC parameter functions can read from and write to files. There is currently very limited support for translating these explicitly defined C functions into MathML (where this is possible), and this could be improved. However, the BrainPiglet model does not use these functions at all, and BrainSignals uses them only for very simple conditional statements, which can be translated by the exporter.
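A simple conditional of the kind mentioned above maps naturally onto the Content MathML `piecewise` element. The following sketch shows one way such a translation could look; it is not the exporter's implementation, the function names are hypothetical, and it handles only a single comparison between two tokens.

```python
import xml.etree.ElementTree as ET

# C comparison operators and the corresponding Content MathML elements.
RELATIONS = {">": "gt", "<": "lt", ">=": "geq", "<=": "leq", "==": "eq", "!=": "neq"}


def operand(token):
    """Wrap a token as <cn> if it parses as a number, otherwise as <ci>."""
    try:
        float(token)
        tag = "cn"
    except ValueError:
        tag = "ci"
    node = ET.Element(tag)
    node.text = token
    return node


def conditional_to_mathml(lhs, op, rhs, if_true, if_false):
    """Translate the C expression `(lhs op rhs) ? if_true : if_false`."""
    piecewise = ET.Element("piecewise")
    piece = ET.SubElement(piecewise, "piece")
    piece.append(operand(if_true))        # the value comes first in <piece>
    test = ET.SubElement(piece, "apply")  # followed by the condition
    ET.SubElement(test, RELATIONS[op])
    test.append(operand(lhs))
    test.append(operand(rhs))
    ET.SubElement(piecewise, "otherwise").append(operand(if_false))
    return ET.tostring(piecewise, encoding="unicode")


mathml = conditional_to_mathml("x", ">", "0", "a", "b")
```

Functions with side effects, such as file access, have no counterpart in this mapping, which is why only a subset of C functions can be exported.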
The final difference between SBML and BRAINCIRC that will be discussed is units. The use of units in SBML is optional. However, if a model involves quantities with units, describing them is useful for avoiding mistakes, and vital if the model is to be understood by others. BRAINCIRC does not have any methods of dimensional analysis, but does allow for optional descriptions of each model quantity, which can include units. The exporter has an option to include SBML units.
The libSBML library has functions to check the consistency of models, which can include checking that units are dimensionally consistent. To make it easier to achieve consistent dimensions, SBML allows a unit attribute to be added to numbers in mathematical expressions. This is useful for numbers in expressions which are not dimensionless, such as multiplying by 100 to convert between metres and centimetres. The alternative, to define these numbers as parameters, would in general overcomplicate the model. There is no equivalent to this in BRAINCIRC, and so these units must be added manually to an exported SBML model to create a model with consistent units. This was carried out for both the BrainSignals and BrainPiglet models, and the units were checked for consistency.
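Why a bare number can break unit consistency can be illustrated with a toy unit check. The representation below (a scale factor relative to SI and an exponent of the metre) and all names are assumptions made for this illustration; it is not how libSBML represents or checks units.

```python
from fractions import Fraction

# A unit is (factor relative to SI, exponent of the metre); one base
# dimension is enough for this illustration.
METRE = (Fraction(1), 1)
CENTIMETRE = (Fraction(1, 100), 1)
DIMENSIONLESS = (Fraction(1), 0)


def multiply(u, v):
    """Unit of a product of two quantities."""
    return (u[0] * v[0], u[1] + v[1])


def divide(u, v):
    """Unit of a quotient of two quantities."""
    return (u[0] / v[0], u[1] - v[1])


def consistent(lhs_unit, rhs_unit):
    """An assignment is consistent if both sides carry the same unit."""
    return lhs_unit == rhs_unit


# x_cm = 100 * x_m, with the literal 100 left dimensionless:
bare = consistent(CENTIMETRE, multiply(DIMENSIONLESS, METRE))

# The same expression with 100 annotated as centimetres per metre:
CM_PER_M = divide(CENTIMETRE, METRE)
annotated = consistent(CENTIMETRE, multiply(CM_PER_M, METRE))
```

In this toy check the unannotated expression fails and the annotated one passes, which mirrors the manual step of attaching units to conversion factors in the exported models.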