Usage - Installation and usage - Exploring design principles of cellular information processing


A.2 Installation and usage

A.2.3 Usage

Workspace creation

Depending on your specific application, BioJazz will require some customized configuration and scoring functions. Also, during a single design run, BioJazz will generate a large number of files. For this reason, the user must create a properly configured workspace which will contain the appropriate configuration files, scoring functions, and design files. To facilitate this, BioJazz can create the workspace for you and populate it with the required directories and with template files to get you started. To do this, run the following command:

    biojazz --command='create_workspace("bjazz")'

This will create the directory bjazz and various sub-directories, including config and custom. Your configuration files go in the config directory, while your custom scoring functions go in the custom directory. At this point, the user should become familiar with some of the template files that are provided, and try to run BioJazz.

The example file will try to design a network which contains a signalling cascade, and demonstrates how to use some of the functions available to the user.

    less config/ultrasensitive.cfg   # ultrasensitive configuration file
    less config/Ultrasensitive.pm    # ultrasensitive application-specific scoring function

Running BioJazz

After installing the required Perl modules, it is time to run BioJazz. The cluster_type and cluster_size arguments override the specification contained in the configuration file, and will launch both slave nodes of the cluster on your machine.

    biojazz --config=config/template.cfg --tag=first_try --cluster_type="LOCAL" --cluster_size=2

This will evolve the network for only a couple of generations. The tag argument is very important. In BioJazz, each design attempt is associated with a specific, user-specified tag. BioJazz will create a directory in your workspace containing all the results and other files generated during the optimization. This allows the user to attempt several optimizations simultaneously without fear of accidental loss of files. The name of the design's working directory is work_dir/tag. The work_dir parameter is specified in your configuration file (and has a value of template in this example). The results of the above run are contained in the directory ultrasensitive/first_try.
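The work_dir/tag naming rule can be illustrated with a short Python sketch. This is not part of BioJazz: the function name design_work_dir and the workspace argument are hypothetical, and the values simply reuse the directory names that appear in this section.

```python
from pathlib import Path


def design_work_dir(workspace: Path, work_dir: str, tag: str) -> Path:
    """Return <workspace>/<work_dir>/<tag>, the directory of one design attempt.

    Hypothetical helper for illustration only; BioJazz builds this path itself.
    """
    if not tag:
        # Every design attempt is identified by a user-specified tag.
        raise ValueError("a non-empty tag is required for every design attempt")
    return workspace / work_dir / tag


# Two attempts with different tags never share a results directory.
first = design_work_dir(Path("bjazz"), "ultrasensitive", "first_try")
second = design_work_dir(Path("bjazz"), "ultrasensitive", "second_try")
print(first.as_posix())
print(second.as_posix())
```

Because the tag is the last path component, choosing a new tag for each attempt is what keeps the results of concurrent optimizations apart.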

    [user@host bjazz]$ ls -la ultrasensitive/first_try/
    total 168
    drwx------ 5 user group 4096 2013-06-03 14:53 .
    drwx------ 3 user group 4096 2013-06-03 14:51 ..
    drwx------ 2 user group 4096 2013-06-03 14:53 matlab
    drwx------ 2 user group 4096 2013-06-03 14:53 obj
    drwx------ 1 user group 4096 2013-06-03 14:53 report
    drwx------ 1 user group 4096 2013-06-03 14:53 stat
    drwx------ 2 user group 4096 2013-06-03 14:51 source_2013-06-03-14:51:58

The obj directory contains all the genomes generated, in a machine-readable form. The matlab directory contains the models generated by ANC, and the Matlab scripts generated by Facile. The stat directory contains the output information of each genome in each generation in .csv files. The source* directory is a snapshot of the source code used for that run, such as your configuration and custom scoring files. Now you can try modifying the configuration file to use other available workstations and run BioJazz again.

Workspace directory structure

    bjazz                  # workspace home
      config               # configuration files
      custom               # application-specific modules and functions (incl. scoring function)
      test/custom          # recommended location for test results of custom modules
      test/modules         # BioJazz module test results
      ultrasensitive       # application-specific directory
        first_try          # results directory for run with TAG=08jun01
          matlab           # ANC genome models, eqn files, and matlab files
          obj              # genome objects in binary form
          report           # post-evolution analysis
          stat             # information about individual genome in each generation

Initial Generation

The initial generation can be either generated randomly or loaded from disk, as specified by the initial generation parameter of the configuration file. In the random case, the user can also specify the number of individuals to create (parameter inum_genomes) and the genome length (parameter – currently fixed at 5000). Loading from disk is useful to resume work on a partially completed design starting from the last generation created, or to load hand-crafted seed designs. The following shows some examples for each case:

    initial_genome = random                                      # random generation
    initial_genome = load test/modules/Ultrasensitive.obj        # load a hand-crafted network
    initial_genome = load ultrasensitive/test/obj/G427_I*.obj    # load all individuals of generation 427 of previous run
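To see which files a wildcard such as G427_I*.obj would pick up, the following Python sketch applies ordinary shell-style matching to a list of file names. It assumes the pattern is expanded with standard wildcard rules, which this section does not state explicitly; the file names and the helper select_generation are invented for illustration.

```python
from fnmatch import fnmatch

# Hypothetical contents of an obj directory (names invented for illustration).
names = ["G426_I00.obj", "G427_I00.obj", "G427_I01.obj", "G4270_I00.obj"]


def select_generation(file_names, generation):
    """Return the names matching G<generation>_I*.obj, sorted alphabetically."""
    pattern = f"G{generation}_I*.obj"
    return sorted(n for n in file_names if fnmatch(n, pattern))


selected = select_generation(names, 427)
print(selected)
```

The underscore after the generation number is what stops the pattern for generation 427 from also matching files of generation 4270.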

Regardless of how the initial generation is created, each network is stored under the following name in the working directory of the design:

    $DESIGN_WORK/obj/G***_I%%.obj

where *** is the generation number and %% is the individual number.
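When resuming a partially completed design, it helps to know which generation was written last. The sketch below parses file names of the form described above and reports the highest generation number. It is an illustration only, not BioJazz code: the helper names are hypothetical, and it assumes both numbers are plain decimal digits (zero-padded or not).

```python
import re

# G<generation>_I<individual>.obj, both fields assumed to be decimal digits.
GENOME_NAME = re.compile(r"^G(\d+)_I(\d+)\.obj$")


def parse_genome_name(name):
    """Return (generation, individual) as integers, or None if the name does not fit."""
    match = GENOME_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def latest_generation(names):
    """Return the highest generation number among the given file names, or None."""
    parsed = (parse_genome_name(n) for n in names)
    generations = [p[0] for p in parsed if p is not None]
    return max(generations) if generations else None


# Example with invented file names; unrelated files are ignored.
print(latest_generation(["G001_I00.obj", "G427_I01.obj", "notes.txt"]))
```

The generation number found this way could then be used in a load pattern for the initial generation of the next run.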
