From collaboration tool to semantic e-record: The evolving role of the Electronic Laboratory Notebook (ELN)
James D. Myers¹, Charles E. Arp², Tara Talbott³, and Michael Peterson³
¹ National Center for Supercomputing Applications
² Battelle
³ Pacific Northwest National Laboratory
The ELN
• Secure, shared Web/Java-based system
• Hierarchical Chapters/Pages/Notes
• Editors including file upload, sketch, text, equations, forms, screen capture, (instruments, Word, …)
• Interactive views of data
• Add/View/Search Notes
• Extensible via Editor/Viewer APIs
ELN in Nature
http://www.nature.com/news/2005/050704/full/436020a.html
A Scientific Content Repository
Vision
• Notebooks today are just one view of the scientific record
• Applications contribute data, metadata, and relationships directly
We should aspire to go beyond the current paper notebook paradigm and develop systems that can reintegrate notes, data, literature, derived
• From the DOE Data Management Workshops Report:
• “… the data management challenge for systems-oriented research is not simply about data volume. More critical is the fact that the data involved are produced by multiple techniques, at multiple locations, in different formats and then analyzed under differing assumptions and according to different theoretical models.”
• Local sample prep, use of a national beam line, comparison with an HPC model…
Scientific Annotation Middleware
• A layered middleware designed to manage data annotations and semantic relationships.
• Built on the Jakarta Slide content management system, which uses the WebDAV protocol for managing data and metadata.
• Metadata Services Layer
– Property Generation from binary/ASCII/XML files
– Dynamic Virtual Translations
– Server generated Properties and Relationships
• Semantic Services Layer
– RDF/GXL Pedigree Generation
• Notebook Services Layer
ELN as part of a comprehensive system with simple on-ramps
• Binary → XML → Properties
• Translation of Chemistry Data
• SAM-based Electronic Notebook
• CMCS Portal/Pedigree Browser
[Diagram labels: Fortran Application, ‘Local Disk’, DataGrid, DAV, DAV + JMS, ELN, ECCE]
Notebooks and Portals on
• Distributed Scientific Content Systems
• Lab/Community Information Management Systems
• Semantic Grids
Client-Server adaptations for Notebook functionality
• Encapsulation of the ELN’s client-server functionality
• Dynamic determination of server information
• Notebook retrieval and submission
• Notebook configuration
• Leveraging SAM
– Relationships/data provenance
– DASL based Search
– JMS services, email notifications
– Viewer API == SAM translations
– Instrument API == WebDAV
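As a rough illustration of the "DASL based Search" item above, the sketch below builds the XML body of a WebDAV SEARCH request using the basicsearch grammar that was later standardized in RFC 5323. It is a minimal sketch, not the ELN's or SAM's actual client code: the function name, the property namespace http://example.org/eln#, the property name "author", and the collection path are all hypothetical, and the draft DASL syntax supported by Jakarta Slide at the time may differ in detail.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
ET.register_namespace("D", DAV)


def dav(tag: str) -> str:
    """Return a tag name qualified with the WebDAV 'DAV:' namespace."""
    return f"{{{DAV}}}{tag}"


def build_basicsearch(scope_href: str, prop_ns: str, prop_name: str, value: str) -> bytes:
    """Build a basicsearch body that matches resources whose property equals value.

    Hypothetical helper: the property namespace and name are supplied by the
    caller and are not taken from the ELN or SAM.
    """
    request = ET.Element(dav("searchrequest"))
    search = ET.SubElement(request, dav("basicsearch"))

    # select: the properties the server should return for each hit
    select_prop = ET.SubElement(ET.SubElement(search, dav("select")), dav("prop"))
    ET.SubElement(select_prop, dav("displayname"))
    ET.SubElement(select_prop, f"{{{prop_ns}}}{prop_name}")

    # from: the collection to search, including everything beneath it
    scope = ET.SubElement(ET.SubElement(search, dav("from")), dav("scope"))
    ET.SubElement(scope, dav("href")).text = scope_href
    ET.SubElement(scope, dav("depth")).text = "infinity"

    # where: the named property must equal the literal value
    eq = ET.SubElement(ET.SubElement(search, dav("where")), dav("eq"))
    ET.SubElement(ET.SubElement(eq, dav("prop")), f"{{{prop_ns}}}{prop_name}")
    ET.SubElement(eq, dav("literal")).text = value

    return ET.tostring(request, encoding="utf-8", xml_declaration=True)


# Example: find all notes under a (hypothetical) notebook collection by author.
body = build_basicsearch("/notebooks/demo/", "http://example.org/eln#", "author", "jdoe")
print(body.decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP SEARCH request to the server. Because the query is plain XML over WebDAV, any client that can issue HTTP requests could run the same search, which fits the idea of reusing SAM services rather than adding notebook-specific search code.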
Digital Signatures @ PNNL
• Entrust used in a growing number of desktop/web applications:
– Secure Email
– Timecards
– Travel reporting
– …
XML Digital Signature Standard
• Implemented by Entrust
• Integrated into ELN
• Enables signature validation by other software
Total Records Information Management (TRIM) @ PNNL
• This electronic document and records management system manages the full lifecycle (creation to final disposition) of record and non-record information.
• Access Control
– The TRIM application is available to all PNNL staff. Although staff can install TRIM, access control and security measures prevent them from reaching the data stored in the database unless they have been granted specific access and permissions.
• TRIM Metadata
– As the document is registered, TRIM automatically applies metadata tags such as the date created, who created it, who is responsible for the document, a retention schedule, a file classification and other searchable fields.
• Current TRIM Usage
– At the beginning of CY04 there were over 250 users and 1.1 million documents in TRIM. This encompasses over 10 departments and more than 20 projects.
• Web Services
– PNNL and collaborators have developed web service interfaces for TRIM enabling other applications to directly archive information
ELN/TRIM integration
• SAM records notebook metadata containing necessary project info.
• SAM tracks changes to notebook and translation configuration.
• Export – XML Document(s) containing all notebook data/metadata as well as important server information.
• Many discussions about archiving model
– Current plan is to only archive notebook when it is complete and finalized.
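The export step above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual ELN/TRIM export format: the class names, the XML element and attribute names, and the server URL are all hypothetical. It only illustrates two points from the slide, that the export is an XML document carrying notebook data, metadata, and server information, and that a notebook is archived only once it is finalized.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    author: str
    created: str  # timestamp kept as text for simplicity
    content: str


@dataclass
class Page:
    title: str
    notes: List[Note] = field(default_factory=list)


@dataclass
class Chapter:
    title: str
    pages: List[Page] = field(default_factory=list)


@dataclass
class Notebook:
    title: str
    project: str      # project info needed by the records system
    finalized: bool   # only finalized notebooks are exported
    chapters: List[Chapter] = field(default_factory=list)


def export_notebook(nb: Notebook, server_url: str) -> str:
    """Serialize a finalized notebook to one XML document (hypothetical schema)."""
    if not nb.finalized:
        raise ValueError("notebook is not finalized; only complete notebooks are archived")

    root = ET.Element("notebook", title=nb.title, project=nb.project)
    # Record which server the notebook came from alongside its content.
    ET.SubElement(root, "server", url=server_url)

    # Walk the Chapters/Pages/Notes hierarchy and mirror it in the XML tree.
    for chapter in nb.chapters:
        chapter_el = ET.SubElement(root, "chapter", title=chapter.title)
        for page in chapter.pages:
            page_el = ET.SubElement(chapter_el, "page", title=page.title)
            for note in page.notes:
                note_el = ET.SubElement(page_el, "note", author=note.author, created=note.created)
                note_el.text = note.content

    return ET.tostring(root, encoding="unicode")


# Example with made-up content and a made-up server URL.
demo = Notebook(
    title="Demo notebook",
    project="project-1",
    finalized=True,
    chapters=[Chapter("Chapter 1", [Page("Page 1", [Note("jdoe", "2005-07-04T10:00:00", "Sample prepared.")])])],
)
print(export_notebook(demo, "https://eln.example.org"))
```

Refusing to export an unfinalized notebook keeps the archive decision in one place, so a records system receiving the document can treat it as a complete, fixed record.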
Records Mgmt interest @ PNNL/Battelle
• History of engagement in the Environmental Molecular Sciences Laboratory (EMSL)
• CENSA Experience
• Ongoing technical discussions with PNNL IT (operations) department
• Recent interest from Battelle Records Mgmt Office (via Charlie)
– E-notebook tech background research
– Industry surveys (CENSA LAGER, Atrium)
– Legal issues, mock trial? …
Summary
– What has been important to get to this point?
• Tech experience & mgmt experience are key
• Existing ELN use as a collaboration/productivity tool in EMSL/PNNL/world
• Issues abound but teamwork works (DOE SciDAC/NSF Cyberinfrastructure)
• Combination of cost savings/efficiency and better science arguments
– What’s still hard?
• Legal uncertainty
• Market uncertainty
• Broad definition of notebooks
• Coupling to specific science domains/specific databases/archives
• Lack of ‘basic research’ support
• Lack of standards (not necessarily e-notebook specific) and open source implementations
Resources/Acknowledgments
• Collaboratory.pnl.gov
• www.scidac.org/SAM/
• Eln.sourceforge.net
• Sam.sourceforge.net
• Re-Integrating The Research Record, James D. Myers, Alan R. Chappell, Matthew Elder, Al Geist, Jens Schwidder, Computing in Science and Engineering, May/June 2003
• SAM coauthors and colleagues
• U.S. Department of Energy
• Pacific Northwest National Laboratory
• NCSA/National Science Foundation