
4.3 Supporting study documentation

4.3.1 Project management system

To unify the documentation across every step of the workflow, up to and including data analysis, a web-based tool was developed that offers access and support to every contributor involved in a metabolomics experiment and captures all the information necessary to document a study. As omics technologies are constantly evolving, the tool requires a robust and flexible data structure, as well as a modular design, to be adaptable and scalable: as the technologies mature, new information might need to be captured at any step of the workflow. To provide this flexibility, the structure of the project management system follows the same design pattern as PiMP, presented in chapter 3. The tool was primarily developed to support the documentation of metabolomics experiments based on the metabolomics workflow shown in Figure 4.1; it was then expanded to support the documentation and data capture of three other omics: genomics, transcriptomics and proteomics.

Data structure

The data structure needs to support two types of users: (i) the principal investigator (PI) or lab scientist in charge of the study (collaborator), and (ii) the metabolomics technologists, bioinformaticians or other scientists (staff) contributing to the workflow. The main difference between these two user types is that staff contributors use the tool to record information about the work being done, while the PI of the study uses the tool to access this information. The only action that needs to be supported for the PI is therefore access to the information recorded by the staff users and attached to their projects. The staff users need greater support, as they must be able to capture information at every step of the workflow. The data structure supports the two types of user through the table “User”, its field “is staff”, and the table “Collaborator”. A staff user has an entry in the User table with the field “is staff” set to True. A collaborator has an entry in both the User and Collaborator tables, with the field “is staff” set to False. This flexible design allows one user to be staff, collaborator or both.
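The role logic described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the system's actual Django models: the class layout, the function `roles` and the spelling `is_staff` for the “is staff” field are choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class User:
    """One row of the "User" table (only the fields discussed in the text)."""
    username: str
    is_staff: bool = False  # assumed spelling of the "is staff" field


@dataclass
class Collaborator:
    """One row of the "Collaborator" table, pointing at a User row."""
    user: User


def roles(user: User, collaborators: list[Collaborator]) -> set[str]:
    """Return the roles a user holds: "staff", "collaborator", or both."""
    found = set()
    if user.is_staff:
        found.add("staff")
    # A user is a collaborator when a Collaborator row references them.
    if any(c.user is user for c in collaborators):
        found.add("collaborator")
    return found


# Hypothetical users: one staff member, one collaborator, one holding both roles.
alice = User("alice", is_staff=True)
bob = User("bob")
carol = User("carol", is_staff=True)
collaborator_table = [Collaborator(bob), Collaborator(carol)]

for person in (alice, bob, carol):
    print(person.username, sorted(roles(person, collaborator_table)))
```

Because the two roles are stored independently (a flag on one table, a row in another), holding both roles requires no special case.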

The “Group” table in the authentication module is used to separate the staff users into the different omics; one user can be involved in one or more omics fields. This information can then be used for various purposes, such as determining which staff users can contribute to a specific omics project.
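As a rough illustration of how group membership could be used to find the staff able to contribute to a project, consider the sketch below. The names `StaffUser` and `eligible_staff`, and the use of lowercase omics names as group labels, are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StaffUser:
    """A staff user together with the names of the groups they belong to."""
    username: str
    groups: set[str] = field(default_factory=set)


def eligible_staff(staff: list[StaffUser], omics: str) -> list[str]:
    """Return the usernames (sorted) of staff belonging to the given omics group."""
    return sorted(u.username for u in staff if omics in u.groups)


# Hypothetical staff list; one user may belong to several omics groups.
staff = [
    StaffUser("ann", {"metabolomics"}),
    StaffUser("ben", {"metabolomics", "proteomics"}),
    StaffUser("cat", {"genomics"}),
]

print(eligible_staff(staff, "metabolomics"))
```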

The “Project” table is the main table that organises the information that needs to be captured for a study. The “Genomics”, “Proteomics” and “Metabolomics” tables inherit from the “Project” table; they therefore share the fields defined by the “Project” table, and also define their own fields in their respective tables. As shown in the diagram in Figure 4.2, a project is divided into a set of tasks, each task being assigned to a staff user. The “Task” table therefore contains all information about every task performed during a study, such as its status and date of completion. The note field can contain free text to give further details about a particular task, and the “Comment” table allows staff users to record any issues they encountered or comments they may have while performing a task.

Figure 4.2: Database structure of the project management system. The structure defines three modules, each of them used to store a different type of data. The authentication module stores users’ information and authentication details. The profile module stores users’ preferences. The project module stores and organises the information related to experiments. The inheritance design of the project table makes it extendable to any other omics.

Three omics are currently supported and shown in the data structure in Figure 4.2, but the inheritance design enables an easy extension of the structure to other omics. The metabolomics table supports LCMS and GCMS metabolomics experiments, allowing the capture of every component of the workflow, such as the instruments and columns used, the organism studied and the number of samples, as well as the storage of files related to the experiment, such as forms and reports. The proteomics table follows the same pattern as the metabolomics table, with different fields specific to proteomics experiments. Finally, the genomics table supports DNA and RNA sequencing experiments, allowing the capture of all information related to this omics technology.
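The inheritance design can be illustrated with plain Python dataclasses. This is a sketch under stated assumptions rather than the actual schema: the shared fields `title` and `principal_investigator`, all field names and defaults, the helper `own_fields` and the `Lipidomics` class are hypothetical, the last one being included only to show how a further omics could be added.

```python
from dataclasses import dataclass, fields


@dataclass
class Project:
    """Fields shared by every omics project (names are assumptions)."""
    title: str
    principal_investigator: str


@dataclass
class Metabolomics(Project):
    """Adds metabolomics-specific fields on top of the shared ones."""
    platform: str = "LCMS"  # "LCMS" or "GCMS"
    instrument: str = ""
    column: str = ""
    organism: str = ""
    number_of_samples: int = 0


@dataclass
class Genomics(Project):
    """Adds sequencing-specific fields."""
    sequencing_type: str = "DNA"  # "DNA" or "RNA"


@dataclass
class Lipidomics(Project):
    """Hypothetical new omics: extending the structure only needs a subclass."""
    extraction_method: str = ""


def own_fields(cls) -> list[str]:
    """Return the field names a subclass defines in addition to Project's."""
    shared = {f.name for f in fields(Project)}
    return [f.name for f in fields(cls) if f.name not in shared]


study = Metabolomics("Example study", "Dr Example", organism="E. coli", number_of_samples=24)
print(study.title, study.platform, own_fields(Metabolomics))
```

Adding an omics in this sketch leaves `Project` and the existing subclasses untouched, which is the property the inheritance design is meant to provide.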

As the management system was primarily developed to support LCMS metabolomics, the data captured by the tool is general enough to support other metabolomics laboratories. The genomics and proteomics parts of the system were, however, designed for Glasgow Polyomics platforms only and are too specific to be transferred to other omics laboratories.

Web-enabled tool

The aim of the project management system is to support all contributors to an omics study, and more specifically a metabolomics study, in documenting their experiments within a unified environment. Two main requirements were identified to address this objective: the tool and the captured data need to be readily available to all contributors, and users need to be guided during the documentation process. A web-enabled tool following the same MVT design as PiMP, presented in chapter 3, was chosen to address the first requirement. The second requirement was addressed by structuring the data capture task. The documentation process requires the capture of different types of information that can be separated into two groups: (i) static information and (ii) dynamic information. Static information is usually captured at a particular point in the project and is rarely, if ever, changed afterwards. Most static information, such as the organism studied, the number of samples or the instrument chosen to perform the experiment, is set at the beginning of the project, but other information, such as quality control reports on the acquired data, can be recorded later. Dynamic information, by contrast, evolves during the project; it relates to a specific step of the workflow involving an action, such as sample preparation, data acquisition or data analysis.

The two different types of data to capture are reflected in the data structure by simple fields in the project table for the static information, and the task table for dynamic information. The task, therefore, supports the capture of complex information such as time stamp, completion status or the staff member performing the task. Every step of the metabolomics or other omics workflow requiring an action is therefore translated into a task to store all the information related to it.
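The translation of workflow steps into tasks can be sketched as follows. The three steps are those named in the text; everything else (the `Task` fields, the functions `create_tasks`, `complete` and `progress`, and the idea of reporting progress as a fraction) is an assumption made for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Workflow steps requiring an action, as named in the text.
WORKFLOW_STEPS = ["sample preparation", "data acquisition", "data analysis"]


@dataclass
class Task:
    """Dynamic information attached to one workflow step."""
    step: str
    staff: Optional[str] = None              # staff member who performed the task
    completed_at: Optional[datetime] = None  # time stamp, set on completion

    @property
    def done(self) -> bool:
        return self.completed_at is not None


def create_tasks(steps: list[str]) -> list[Task]:
    """Create one pending task per workflow step."""
    return [Task(step=s) for s in steps]


def complete(task: Task, staff: str, when: Optional[datetime] = None) -> None:
    """Record who completed the task and when (defaults to the current time)."""
    task.staff = staff
    task.completed_at = when or datetime.now()


def progress(tasks: list[Task]) -> float:
    """Fraction of completed tasks, 0.0 for a project without tasks."""
    if not tasks:
        return 0.0
    return sum(t.done for t in tasks) / len(tasks)


tasks = create_tasks(WORKFLOW_STEPS)
complete(tasks[0], staff="ann")
print(f"{progress(tasks):.0%} of tasks completed")
```

Static information would remain in simple fields on the project itself, while each task record carries the time stamp, completion status and staff member for its step.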

The user interface is accessible through a web browser and is developed using the same web standards as PiMP. Figure 4.3 shows one page of the user interface.

Figure 4.3: Screenshot of the user interface of the management system. This picture shows the form to be filled in to create a new project. The navigation bar at the top allows the user to access the project list, the client list, their assigned tasks, and their account and preferences. Client users only have access to the “my projects” page, where they can view the details of their own projects and their progress. The management system user interface is developed using the same web technologies as PiMP (Django, HTML5, CSS3, JavaScript).

