Title Page
IS1200
EMC Celerra FLR Retention Manager
User and Configuration Guide
Version 4.3.3
Last updated March, 2010
EMC Part Number 300-010-779
EMC SourceOne eDiscovery - Kazeon IS1200 EMC Celerra FLR Retention Manager User Guide
Version 4.3.3, 2010
Copyright © 2010 Kazeon Systems, Inc. All Rights Reserved.
This notice is intended as a precaution against inadvertent publication and does not imply any waiver of confidentiality. Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or information storage or retrieval systems, for any purpose without the express written permission of Kazeon Systems.
The software described in this document is furnished under a license agreement or nondisclosure agreement. It is against the law to copy the software onto any medium except as specifically allowed in the license or nondisclosure agreement.
Kazeon™ is the trademark of Kazeon Systems, Inc. All other trademarks and copyrights referred to are the property of their respective owners.
The text and drawings set forth in this document are the exclusive property of Kazeon Systems. Unless otherwise noted, all names of companies, products, street addresses, and persons contained in the scenarios are designed solely to document the use of Kazeon System products.
The Kazeon Information Server software is based in part on software licenses from the following: Outside In® Content Access © 1991-2010, Chicago, Inc.
The software is based in part on the work of the Independent JPEG Group.
Code from Inxight Software, Inc. Copyright © 1996-2010. All rights reserved. www.inxight.com.
Certain icons used by the Kazeon Web applications come from the Silk Icon set (http://www.famfamfam.com/lab/icons/silk/) licensed under the Creative Commons Attribution 2.5 license (http://creativecommons.org/licenses/by/2.5/).
Kazeon Systems, 1161 San Antonio Road, Mountain View, CA 94043
Table of Contents
Title Page
Preface
  Audience
  Related Documentation
  Customer Support
Chapter 1: Introduction
  About the IS1200
  Extending Server Functionality
  Celerra FLR Functionality
  The Celerra FLR Retention Manager
  Supported Celerra FLR Configurations
  Celerra FLR Retention Manager Implementation Overview
    Celerra FLR Retention Manager Implementation In New Installations
    Celerra FLR Retention Manager Implementation In Existing Installations
Chapter 2: Registering Celerra FLR Repositories
  Supported Configurations
  Celerra FLR Repository Registration Requirements
  Creating a Metadata Repository for a Celerra FLR Repository
  Registering Celerra FLR Repositories
    Registering Repositories Using Web-Admin
    Registering Repositories From the CLI
Chapter 3: Workflow Changes
  Celerra FLR Retention Capabilities
  Celerra FLR Legal Hold Capabilities
  Celerra FLR Collection Options
  Administration
  Searching
  Reporting
Glossary
Preface
The EMC SourceOne eDiscovery - Kazeon IS1200 EMC Celerra FLR Retention Manager User Guide describes how to configure the IS1200 to use the optional Celerra FLR Retention Manager module.
To use the Celerra FLR Retention Manager module, the appropriate optional add-on module license must first be purchased and installed.
Audience
This guide is intended for Administrators, Business Analysts, and Compliance Auditors.
Related Documentation
IS1200 Installation and Quick Start Guide - describes installing and configuring the IS1200.
IS1200 Web-Admin (ECS, FRM, or SA) User and Configuration Guide - describes how to use the web-based Administration Interface to set up and manage IS1200 clusters.
IS1200 Web Search Interface Guide - describes how to use the web-based Search Interface to perform basic and advanced search.
IS1200 Web Reports Interface Guide - describes how to use the web-based Reports Interface to perform basic and advanced reports.
IS1200 eDiscovery Case Manager Administrator’s Guide - for legal representatives, a primer of all the web-based Interfaces above for performing eDiscovery.
IS1200 Command Line Interface Reference Guide - describes the IS1200 Command Line Interface.
Customer Support
You can contact Kazeon with questions or comments at [email protected], or call toll-free from the US and Canada: +1.877.529.3668 (+1.877.kazeon8), or internationally: +1.650.641.8196.
Chapter 1: Introduction
This guide is provided as a companion to the IS1200 Web-Admin User and Configuration Guide, which should be read first because it contains most of the basic IS1200 server setup and maintenance information on which this guide builds.
About the IS1200
The IS1200 is an integrated hardware and software system that provides information management solutions enabling organizations to efficiently and cost-effectively classify, manage, and retrieve data. It provides consistent information visibility and control across distributed files, minimizes the risk of unmanaged files, integrates seamlessly with existing infrastructure, and scales to support billions of files for searching, reporting, backup search and recovery, and file migration and archiving.
Extending Server Functionality
The standard IS1200 uses clustering to offer a scalable solution for classifying, searching, reporting on, and applying Actionable Services to search and report results found on registered data repositories. Data repository types include NFS, CIFS, and many other vendor-specific servers such as Microsoft Exchange, and Microsoft SharePoint servers.
The IS1200’s standard functionality, and the types of servers it can access, can be expanded with add-on modules like the Celerra FLR Retention Manager. The Celerra FLR Retention Manager requires an additional license for the IS1200 and allows that IS1200 to register and manage Celerra FLR Servers.
Celerra FLR Functionality
The main objective of the EMC Celerra FLR (file-level retention) server is to protect files from deletion or modification until a specified retention date. FLR allows creating a permanent, unalterable set of files and directories, and ensures the integrity of the data they contain for a controllable retention period.
This server prevents users from deleting or modifying files that are locked and protected. This is especially useful in legal matters, where files must be preserved without modification for the entire course of a legal case, and for organizations like health service providers that must retain medical records for fixed periods determined by legal mandates.
The Celerra FLR is designed to provide file retention at both the self-regulated and government-regulated levels.
The Celerra FLR Retention Manager
While the standard IS1200 can register standard Celerra shares that have been exported as CIFS or NFS, the Celerra FLR Retention Manager allows the IS1200 to register Celerra FLR servers and work with their FLR-specific retention capabilities. Once a Celerra FLR server has been registered as a data repository, and classified, it can be searched, reported on, and have Actionable Services such as Retention applied to search and report results.
Adding the Celerra FLR Retention Manager also enables the IS1200 to provide additional Celerra FLR-specific search options, retention reports, and new Action options. The IS1200 can transfer files to Celerra FLR servers, and access those files to classify and search them.
Most importantly, the Celerra FLR Retention Manager allows the IS1200 to help manage all the files on Celerra FLR servers, including file settings such as retention times. Additionally, IS1200 standard retention reports such as “Expired Files”, “Retention by Aging”, and “UpForExpiry” allow identifying files that have passed their retention dates, and Actionable Services such as Lock, Retention, Copy, Move, and Delete allow report results (files) to have retention periods modified or extended, or allow expired files to be efficiently archived or deleted.
Supported Celerra FLR Configurations
The Celerra FLR Retention Manager supports EMC Celerra FLR versions 5.6.4.3 and above.
Celerra FLR Retention Manager Implementation Overview
If Celerra FLR Retention Manager capabilities are ordered with the original appliance purchase, the appropriate optional module license is automatically included in the server master license and is routinely installed along with the server license.
If the Celerra FLR Retention Manager is purchased after the original installation, the Celerra FLR Retention Manager license key must be added to the installation before Celerra FLR repositories can be registered and managed. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing optional module licenses for existing installations.
Celerra FLR Retention Manager Implementation In New Installations
The Celerra FLR Retention Manager software is automatically installed with all IS1200 installations (ECS, FRM, and SA); the Celerra FLR Retention Manager license key simply activates it.
To use a Celerra FLR Retention Manager purchased with a new installation, simply follow the standard hardware and software installation and configuration instructions in the IS1200 Installation and Quick Start Guide and then use the rest of this guide to configure the connectors.
Celerra FLR Retention Manager Implementation In Existing Installations
If the Celerra FLR Retention Manager license key is obtained after the original IS1200 installation, use the following general steps to implement Celerra FLR Retention Manager capabilities.
1. Use the current Web-Admin application to add the Celerra FLR Retention Manager license key to the IS1200, and then quit and relaunch the Web-Admin application. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing optional module licenses for existing installations.
2. Register your Celerra FLR servers as IS1200 data repositories.
3. Classify your Celerra FLR repositories.
Chapter 2: Registering Celerra FLR Repositories
Before Celerra FLR repositories can be classified, searched, and reported on, they must be registered with the IS1200 and have a metadata repository assigned.
Supported Configurations
The Celerra FLR Retention Manager supports all Celerra FLR versions 5.6.4.3 and above. The Connector supports registering up to sixteen Celerra FLR repositories per IS1200 node (the same IS1200 support supplied for all NFS or CIFS repositories). Each Celerra FLR repository can consist of multiple Centera nodes.
Celerra FLR Repository Registration Requirements
To register and access Celerra FLR repositories, the IS1200 needs the following:
• An appropriate metadata repository to associate with the repository when it is registered.
• If the Celerra FLR repository is exported as a CIFS share, an identity that has complete access to the repository to be registered.
Creating a Metadata Repository for a Celerra FLR Repository
Before registering a Celerra FLR repository, one or more metadata repositories must be registered to store the extracted metadata. For information on adding metadata repositories, see the Repository Registration and Management chapter of the IS1200 Web-Admin User and Configuration Guide. For optimal performance, use metadata repositories on NFS systems.
Dedicated IS1200 metadata repositories should be assigned to each Celerra FLR server. Because a Celerra FLR server typically hosts many terabytes of data, multiple metadata repositories may be needed to store the metadata for a single Celerra FLR server. Metadata repositories can use mixed NFS and CIFS protocols. If you find that the number of metadata repositories is inadequate, register more metadata repositories.
Note: Metadata repositories may not be shared between two registered Celerra FLR servers or between a Celerra FLR cluster and any other registered data repository.
Registering Celerra FLR Repositories
Celerra FLR repositories may be registered with the IS1200 using either the CLI or Web-Admin.
Registering Repositories Using Web-Admin
To register a Celerra FLR server as a data repository from Web-Admin, do the following:
1. In Web-Admin, select Repository View under Repositories in the left-navigation menu. The Repositories tab opens.
2. In the Repository tab tool-bar, click Add Repository. The Add Repository tab opens.
3. Select NFS or CIFS from the Repository Type drop-down menu, depending on how the Celerra FLR repository you want to register was shared. The dialog for the selected repository type appears.
Name. Enter a reference name (for the IS1200 to use when listing this Celerra FLR repository in menus, for instance when specifying classification targets). Data repository names must be unique.
Metadata File System. From the drop-down menu, select the metadata repository you created for the Celerra FLR repository, or let the IS1200 auto-select one.
Server. Enter the name of the NFS/CIFS file server hosting the repository to add. This may already be entered (and unchangeable) if registering a “discovered” data repository.
If you are registering an NFS repository:
Mount Path. Enter the mount path of the NFS repository on the host file server.
If you are registering a CIFS repository:
Share Name. Enter the share name, or mount point, of the CIFS repository to register.
Identity. Select a pre-defined user identity from the drop-down list to use when the IS1200 accesses this filer. If an appropriate identity is not available, click the Create Identity button to add one. For more information on identities, see The Identity Vault chapter of the IS1200 Web-Admin User and Configuration Guide.
Specify Use. Select one of the following:
- Source Repository. Register this repository as a source. This includes the reference name (specified above) in all dialogs where a repository can be chosen as a source, for example when doing a Collection in either Web-Admin or the eDiscovery Case Manager.
- Target Repository. Register this repository as a target. This includes the reference name (specified above) in all dialogs where a repository can be chosen as a target.
- Source and Target Repository. Register this repository as both a source and a target for dialogs.
Read Only. Check to indicate the filer being registered is Read Only to the IS1200. The option should be used if the repository (being registered) is exported or shared as Read Only. If this option is not set, the IS1200 assumes the filer being added is Read Write.
Repository Vendor. Check the EMC Celerra FLR checkbox.
WARNING! Once the EMC Celerra FLR checkbox is set, and the repository is registered, the option cannot be un-set! The only way to uncheck this option is to offline the repository, delete it, and re-add it without the option checked.
Storage Tier. Optionally, specify the storage tier where the data repository is located. The storage tier can be any number between 0 and 255. Default is 0.
Force add on errors. Select to force adding this device to the registered list in spite of errors.
5. Submit. Click to register the data repository.
Registering Repositories From the CLI
To register a Celerra FLR server from the Command Line Interface, use the following general Command Line Interface command:
add datafs <referenceName> mount <mountPoint> as <identity> attributesHere
Where:
<referenceName> is the name the IS1200 should use when displaying this repository in repository selection menus.
<mountPoint> is the mount path for the repository.
<identity> is the name of an identity (already stored in the IS1200 Identity Vault) to use to access the CIFS repository. <identity> is only required when adding CIFS repositories.
attributesHere is the keyword attributes followed by a comma-separated list of attributes appropriate to the repository.
Examples follow:
To add an NFS repository from a Celerra FLR server and make it a source repository:
add datafs celerra_nfs mount celerraflr_server:/nfs1 attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=no
To add an NFS repository from a Celerra FLR server and make it a source repository and a target repository:
add datafs celerra_nfs mount celerraflr_server:/nfs2 attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=yes
When adding CIFS repositories, an identity must be available containing credentials allowing complete access to the repository. Assume the identity celerraflr_identity is available for the following examples.
To add a CIFS repository from a Celerra FLR server and make it a source repository:
add datafs celerra_nfs mount //celerraflr_server/cifs1 as celerraflr_identity attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=no
To add a CIFS repository from a Celerra FLR server and make it a source repository and a target repository:
add datafs celerra_nfs mount //celerraflr_server/cifs2 as celerraflr_identity attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=yes
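When many repositories need to be registered, it can help to generate the command text with a small script rather than typing each line. The following Python sketch only assembles the command string following the general syntax described above; it does not connect to the IS1200. The function names build_add_datafs and celerra_flr_attributes are hypothetical helpers introduced for illustration, and the sketch assumes the attribute list is written with commas and no embedded spaces.

```python
def build_add_datafs(reference_name, mount_point, attributes, identity=None):
    """Assemble an `add datafs` command line from its parts (string only)."""
    parts = ["add", "datafs", reference_name, "mount", mount_point]
    # `as <identity>` is only required when adding CIFS repositories.
    if identity is not None:
        parts += ["as", identity]
    # Assumption: attributes are joined with commas and no embedded spaces.
    attr_list = ",".join(f"{key}={value}" for key, value in attributes.items())
    parts += ["attributes", attr_list]
    return " ".join(parts)


def celerra_flr_attributes(source=True, target=False):
    """Return the attribute set used in the examples in this section."""
    def yes_no(flag):
        return "yes" if flag else "no"

    return {
        "celerraflr": "yes",
        "fs_preserve_timestamp": "no",
        "source_repository": yes_no(source),
        "target_repository": yes_no(target),
    }


if __name__ == "__main__":
    # NFS repository registered as a source only.
    print(build_add_datafs("celerra_nfs", "celerraflr_server:/nfs1",
                           celerra_flr_attributes()))
    # CIFS repository registered as both source and target.
    print(build_add_datafs("celerra_nfs", "//celerraflr_server/cifs2",
                           celerra_flr_attributes(target=True),
                           identity="celerraflr_identity"))
```

The generated text can then be pasted into a CLI session. Because data repository names must be unique, a script that registers several repositories would need to supply a different reference name for each one.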
Chapter 3: Workflow Changes
This chapter describes how installing a Celerra FLR Retention Manager license on the IS1200 changes the standard workflow procedures.
Note: Before continuing, be sure a valid Celerra FLR Retention Manager license key is installed on all nodes of your IS1200 cluster. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing license keys if you have not already installed them.
Celerra FLR Retention Capabilities
When a Celerra FLR Retention Manager license is added to the IS1200, new retention options become available in all screens, pages, or tabs that move or copy files to a Celerra FLR repository. The new options generally look like the following:
The new options work as follows:
Retention Date Selection: Check this box to determine the retention date to use:
- Absolute Date/Time: Select this radio button to set an absolute retention date. Use the Time drop-down menu and the Calendar tool to determine the specific retention time and date.
- Relative Date: Select this radio button to set a relative retention date, that is, a date determined using the following formula:
  retention date = <soMany> <timeUnits> from <fileTimeAttribute>
  Use the fields to the right of the Relative Date radio button to set up the formula:
  • use the first empty field to set the <soMany> number
  • use the middle drop-down menu to set the <timeUnits>
  • use the left drop-down menu to set which <fileTimeAttribute> to base the date on.
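As a worked illustration of the relative-date formula, the following Python sketch computes a retention date from a file timestamp. It is not how the IS1200 performs the calculation internally; the unit names ("days", "weeks", "months", "years"), the month-end clamping rule, and the function names are assumptions made for this example.

```python
import calendar
from datetime import datetime, timedelta


def add_months(moment, months):
    """Shift a datetime by whole months, clamping to the last day of the month."""
    month_index = moment.month - 1 + months
    year = moment.year + month_index // 12
    month = month_index % 12 + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def relative_retention_date(file_time, so_many, time_units):
    """retention date = <soMany> <timeUnits> from <fileTimeAttribute>."""
    if time_units == "days":
        return file_time + timedelta(days=so_many)
    if time_units == "weeks":
        return file_time + timedelta(weeks=so_many)
    if time_units == "months":
        return add_months(file_time, so_many)
    if time_units == "years":
        return add_months(file_time, 12 * so_many)
    raise ValueError(f"unsupported time unit: {time_units}")


if __name__ == "__main__":
    # Hypothetical file modification time used as the <fileTimeAttribute>.
    modified = datetime(2010, 3, 15, 9, 30)
    print(relative_retention_date(modified, 7, "years"))
    print(relative_retention_date(modified, 30, "days"))
```

With these assumptions, a file last modified on 15 March 2010 and retained for 7 years would be retained until 15 March 2017.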
Retention Class Selection: This radio button is present whenever an EMC repository is selected, but is not available (grayed out) for Celerra FLR repositories.
Celerra FLR Legal Hold Capabilities
Only Legal Hold withOUT Enforcement is available on Celerra FLR servers.
More specifically, while a metadata tag for legal hold may be set for files on registered Celerra FLR repositories, the standard IS1200 Actionable Services legal hold option “Enforce Legal Hold at the repository level” is not available for Celerra FLR files. The IS1200 cannot change the file privileges on registered Celerra FLR repositories to prevent users from moving, changing, or deleting files, even those with the legal hold metadata tag set.
Celerra FLR Collection Options
Collections done from either Web-Admin or the eDiscovery Case Manager to Celerra FLR repositories include options for setting the retention options on the target. See “Celerra FLR Retention Capabilities” for details on setting these options.
Administration
Web-Admin can register, classify, and do Single-step Collections to Celerra FLR repositories.
Searching
A new metadata namespace called “Retention” contains the following new fields:
retentionlock, retentiondate, retentionsetdate, retentionsetuser, retentionreportdate
All these fields are viewable using the Show Metadata icon from search results, but only retentionsetuser is routinely indexed and can be searched for.
Actionable Services like Copy and Move that are applied to search results from Celerra FLR repositories contain new interfaces for setting retention options on their targets. A new Actionable Service tab called Retention allows extending retention settings. See “Celerra FLR Retention Capabilities” for details on setting these new retention options.
Reporting
Web-Reports contains a report category called Retention reports. Retention reports list expired, soon-to-expire, and locked files allowing administrators to manage these files with Actions that reset retention settings or delete expired files.
Retention reports are only for repositories, such as Celerra FLR repositories, that implement specific retention features for the files they contain. Retention reports run on filers without retention capabilities return empty reports.
Retention reports prefixed by "Snaplock" are designed for SnapLock repositories. Running these on Celerra FLR repositories will return empty reports or errors.
Actions like Copy or Move that are applied to report results from Celerra FLR repositories contain new interfaces for setting retention options on their targets. See “Celerra FLR Retention Capabilities” for details on setting these new retention options.
Glossary
Active Directory (AD)
A technology created by Microsoft that provides a variety of network services, including: LDAP-like directory services, Kerberos-based authentication, and DNS-based naming and other network information.
Actions, Actionable Services
Procedures such as copy, move, delete, and tagging that can be applied to search and report results. They allow the IS1200 to be an effective file management tool for registered repositories.
Access Control List (ACL)
A file system level data file that specifies how users or groups may access resources on a computer or network, like an application, file or printer, and the rights they have to it, for example read access, write access, and so forth.
Advanced Search
A search made from the Advanced Search link. Allows searching for extracted metadata by tag-value pairs, and allows multiple variable and boolean searches.
Assignment Rule
A classification rule that tags files with metadata and assigns files to policy groups.
Auditing
A service that allows the IS1200 to record all system events according to who did what, when, and the event result. This data is especially useful to Legal Service Providers when providing an audit trail for responsive data produced during eDiscovery.
Authorization Rule
A policy that filters search results to ensure that the assigned files can only be viewed by authorized users.
Authentication
The process of identifying users based on user name and password to ensure that only authorized users can access the Kazeon Information Server.
Basic Search
A search made from the Search page using only the Search field. Searches only the content found in the fullText field populated during classifications.
CAS Device
EMC’s Content Addressed Storage (CAS) devices are cluster-able archival devices that host archival business file content such as email, office productivity files (like word processing and spreadsheet files), images, and other file documents.
CASID
A unique IS1200 ID for each classified file that the system generates during basic classification.
Centera Server
The EMC Centera server is a networked storage system specifically designed to store and provide fast, easy access to fixed content (information in its final form). It is a CAS device providing long-term retention and assured integrity designed to store and manage data that require or have legally mandated retention periods, for example medical records and files relevant to legal matters.
Celerra Server
An EMC server designed to store and manage archival data. The Celerra File Level Retention (FLR) server also allows enforcing enterprise or governmental retention policies.
Classification Rule
Rules that the system implements during data classification to extract metadata, tag files, and assign files to policy groups. The two types of classification rules are extraction rules and assignment rules.
Classification Service
An IS1200 service that accesses registered repositories and extracts and records their metadata. Sometimes called a “crawl”.
Cluster
A set of appliance nodes working as a Kazeon Information Server unit. A cluster can contain a maximum of four nodes.
CSV, Comma Separated Values
A file type used to transfer data between applications such as databases and spreadsheets.
Command Line Interface (CLI)
The CLI is a traditional command line interface that allows direct communications with the IS1200 “backend” using the set of commands defined in the IS1200 Command Line Interface Reference Guide.
Common Internet File System (CIFS)
A protocol used by Microsoft to access computer systems and directories over the internet.
Container (file/object)
A file that contains other files, such as a ZIP, JAR, or PST file.
Custodian
A legal term used by Legal Service Providers (LSP) and other legal personnel to describe the owners or responsible parties for electronic documents pertinent (responsive) to a legal matter.
Data
A file of any type and size such as a short email, a word processor document, or a large spreadsheet.
Data Classification
The process during which the Kazeon Information Server reads data on the data file system. During basic classification, the Kazeon Information Server extracts file system metadata. During deep classification, it extracts custom metadata and assigns files to policy groups.
Datamap
A report that lists the electronic storage locations of all possible sources of relevant ESI. This can include standard file servers, groupware servers, and email servers (and their backup and archive systems), as well as custodians’ desktop and laptop computers.
Data-Mount
The NFS file system that is accessed by the Kazeon Information Server to parse data and extract metadata.
Data Server
The file server that exports an NFS or CIFS file system so that the Kazeon Information Server can classify data on the file system to create metadata.
Data-Share
The CIFS file system to be accessed by the Kazeon Information Server to extract metadata.
Data Repository
A networked file system registered with the IS1200 so it can be classified, searched, and reported on.
Data Verification
Builds on Auditing and is only available when system auditing is enabled. For job services like Actionable Services Copy or Move, Legal Hold Copy, and Single-Step Collections, Data Verification generates an audit trail proving that files were not altered during these actions. This is especially valuable in eDiscovery situations.
Deduplication
A process that identifies file duplicates based on their digest values. Digests are numerical values that are calculated based on file contents and are unique for all unique file objects. Digest values allow file objects to be compared very quickly. Digests are calculated differently for standard files, emails, and container objects. For standard files, a digest is computed for the entire file, much like a hash value. For email objects, the textual content of the file and certain specific addresses are combined and a digest value is calculated from that. Container objects are hashed both as a complete file and as individual sub-objects.
Deduplication may be applied to export processes (Actionable Services like Download, Export, and Copy) and to classification and collection processes to move only unique copies of files to a destination. When deduplication is used, a “manifest” list or report may be produced to list and identify all the duplicates that were ignored in the export process, allowing all those duplicates to be identified later.
If any object, or sub-object, cannot be properly parsed during classification, no digest is produced for that object, preventing duplicates of those objects from being identified.
Deduplication may also be applied to tagging processes to allow a single file copy to be reviewed, but apply metadata tags to all its copies.
Domino Server (Lotus)
A Lotus server providing groupware solutions and storage.
Documentum Server (EMC)
The EMC Documentum server manages business content including documents, photos, video, medical images, e-mail, Web pages, fixed content, XML-tagged documents, etc. The Documentum core is a repository that stores content securely under compliance rules and appears as a unified environment, even though content may reside on multiple servers and physical storage devices within a distributed environment.
eDiscovery
The process of reviewing electronic files to determine their relevance and responsiveness to a legal matter or case.
eDiscovery Case Manager
An IS1200 web-application that facilitates eDiscovery for Legal Service Providers.
Enterprise Vault
A Symantec networked repository for archived email.
Extended Attributes
User-defined keywords that are extracted during data classification.
Extraction Rule
Extracts user-defined keywords (custom metadata) to add to the metadata file.
Exchange Server (Microsoft)
A Microsoft server designed to store and manage email.
Federation
A defined group of member-clusters on a Federation server that can be managed, searched, and reported on as a group. Member-clusters are referred to as Federated clusters.
Federation Server
A single-node IS1200 server, with a Federation license, that allows consolidated searching and reporting of up to eight Federated member-clusters of its defined Federation.
Filer
A file server that exports its file systems using NFS or CIFS protocol.
fullText
The “content” portion of a file. This is the textual content of word processing files and the message content of emails. The extraction fulltext rule is used to save file content to metadata during classifications and saves up to 10 megabytes of content by default. This default may be changed, but changing it is not recommended. Fulltext extraction is required for the Previewer pane to work and to generate Concepts in the Results Grouping pane.
Groupware
Collaborative software designed to help people involved in common tasks achieve their goals. Incorporates services such as email, calendaring, text chat, wiki, web-sharing, document control, and advanced search.
Hash Values
Hash values are used to compare one file with another for duplicates. An extremely simplified description of hashing is that the numeric values of all bytes in a file are added into a grand total. The chance of two different files yielding the same result (hash value) is extremely small, so hash values can be used to identify duplicate files, or to compare files with the same name to decide whether they have been modified.
Computing a hash on an entire file is called a full-hash, and computing a hash on a portion of the file is called a partial-hash. A partial hash may also be used to increase classification speed, and hashing can be turned on or off to increase classification speed.
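The difference between a full hash and a partial hash can be illustrated with a short Python sketch. This is not the IS1200 implementation: the guide does not state which hash algorithm the IS1200 uses or which portion of a file a partial hash covers, so SHA-256 and "the first N bytes" are assumptions chosen for the example, as are the function names.

```python
import hashlib


def full_hash(path, chunk_size=1 << 20):
    """Hash the entire file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def partial_hash(path, prefix_bytes=64 * 1024):
    """Hash only the first `prefix_bytes` bytes (assumed partial-hash scheme)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read(prefix_bytes)).hexdigest()


def probably_duplicates(first_path, second_path):
    """Compare cheap partial hashes first, then confirm with full hashes."""
    if partial_hash(first_path) != partial_hash(second_path):
        return False
    return full_hash(first_path) == full_hash(second_path)
```

A partial hash reads less data and is therefore faster, but two files that differ only beyond the hashed portion produce the same partial hash, which is why the sketch confirms a match with a full hash.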
Identity Vault
An encrypted database of usernames and passwords the IS1200 uses to store the credentials used to access registered data repositories, send email notifications, and work with authentication services.
Kazeon EVAgent
An IS1200 service installed on the Enterprise Vault server.
Kaz-mount
The NFS file system that serves as the IS1200 metadata repository, on which the Kazeon Information Server stores metadata.
Kazeon Query Language
A programming language used in classification and assignment rules to identify files that should receive specified metadata tags.
Kaz-server
The file server where the metadata repository is located.
Kaz-share
The CIFS file system on which the Kazeon Information Server stores metadata.
Kaz Schema
Defines the set of metadata fields used to build a Search Index for registered data repositories (file systems).
Legal Hold
Files placed on legal hold are either copied to a secure secondary location where they can be preserved for later use, or are locked in their original locations against further change until a legal matter is resolved.
Legal Service Provider (LSP)
A lawyer or trained legal professional who provides legal services for a fee.
Logging rule
Logging rules audit user actions on files such as file access, creation, modification, and deletion.
Local
Refers to the local resources (usually the metadata repository) of the Federation server.
Member-cluster
Metadata
Data about data. Metadata is used to search for information and to create reports. Metadata can be file system metadata or custom metadata that the Kazeon Information Server extracts from files during classification. File system metadata, such as file type and file path, is extracted during basic classification. Custom metadata is generated during deep classification.
Metadata Repository
A registered repository the IS1200 uses exclusively to record the metadata extracted from registered data repositories during classification services.
Network File System (NFS)
A protocol used primarily by Unix-based computers for accessing computer systems and filers over a network.
Network Information System (NIS)
A network naming, administration, and authentication system for smaller networks that was developed by Sun Microsystems and is used primarily by Unix systems.
Node
A single Kazeon Information Server appliance.
PEA Files
A Pool Entry Authorization (PEA) file is generated by the Centera server administrator. A PEA file defines what applications and users can perform read, write, delete, query, copy, or hold operations for Centera objects.
Policy Group
Associates one or more authorization rules and logging rules with one or more files to protect information and audit user actions on files.
PST Files
Personal STorage files are generally used by email programs like Microsoft Outlook to store user email locally. PST files are also called “composite” files because they are packages meant to efficiently store a number of smaller related files. Another example of a composite file is a ZIP storage file.
Retention
The process of enforcing corporate or legal standards for how long certain kinds of files must be preserved for access. Examples of retained files include files responsive to legal matters and medical records.
Roles
Every IS1200 user has a role: admin, auditor, or end-user. If a legal license is installed, there are also legalAdmin, legalSupervisor, legalReviewer, and custodian roles. The role determines which parts of the IS1200 interface the user can see and how much of the search and report results the user can view.
Search Index
An IS1200 database that stores and indexes the file content metadata (including extended attributes and fullText) for standard and custom user-defined metadata produced by extraction rules during classifications.
SharePoint Server (Microsoft)
A Microsoft server in the groupware category.
SourceOne Server (EMC)
The EMC SourceOne server is a comprehensive, policy-based system that automatically collects, organizes, indexes and retains messages and associated attachments and stores them in designated archives connected to shared storage. SourceOne provides indexed searching that works with both EMC storage and other brands such as IBM or NetApp.
Tags
The names of metadata fields. Tags are always associated with a value. For example, the metadata tag “filename” for any given file is always followed by a value (a text string) containing the actual filename.
Web-Admin
An IS1200 web application that IT personnel use to administer the server itself and to work with the IS1200 when it helps administer other IT resources.
Web-Admin is the preferred interface for administering the server.
Web-Reports
An IS1200 web application that provides advanced reporting capabilities based on IS1200 metadata.
Web-Search
An IS1200 web application that provides basic, advanced, and specialized email searches against IS1200 metadata.
XML, eXtensible Markup Language file
A file type that uses the XML language to define and describe data that can be transferred between applications like databases and spreadsheets.
© 2011 - 2013 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries.