
EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.EMC.com

EMC® Kazeon-eDiscovery

Version 4.8.0

IS1200 EMC Celerra FLR Retention Manager

User and Configuration Guide


EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Adobe and Adobe PDF Library are trademarks or registered trademarks of Adobe Systems Inc. in the U.S. and other countries. All other trademarks used herein are the property of their respective owners.

The IS1200 software is based in part on software licenses from the following: Outside In® Content Access © 1991-2015, Chicago, Inc.

Open Source code from www.java2s.com called the itext.asian.jar available at: http://www.java2s.com/Code/Jar/GHI/itext-asian.jar.htm

Copyright 2009 - 12 Demo Source and Support. All rights reserved.

In part on the work of the Independent JPEG Group.

Code from Inxight Software, Inc. Copyright © 1996-2015. All rights reserved. www.inxight.com.

Certain icons used by the Kazeon Web applications come from the Silk Icon set (http://www.famfamfam.com/lab/icons/silk/), licensed under the Creative Commons Attribution 2.5 license (http://creativecommons.org/licenses/by/2.5/).

Contents

Figures

Tables

Preface

Chapter 1

Introduction

About the IS1200
Extending Server Functionality
Celerra FLR Functionality
The Celerra FLR Retention Manager
Supported Celerra FLR Configurations
Celerra FLR Retention Manager Implementation Overview
Celerra FLR Retention Manager Implementation in New Installations
Celerra FLR Retention Manager Implementation In Existing Installations

Chapter 2

Registering Celerra FLR Repositories

Supported Configurations
Celerra FLR Repository Registration Requirements
Creating a Metadata Repository for a Celerra FLR Repository
Registering Celerra FLR Repositories
Registering Repositories Using Web-Admin
Registering Repositories from the CLI

Chapter 3

Workflow Changes

Celerra FLR Retention Capabilities
Celerra FLR Legal Hold Capabilities
Celerra FLR Collection Options
Administration
Searching
Reporting

Glossary

Figures

1 The Web-Admin Repositories Page
2 The Web-Admin Add Repository Tab for NFS
3 The Web-Admin Add Repository Tab for CIFS
4 Retention Options for Celerra FLR Data Repositories

Tables

1 Revision History Details
2 Stop Words

Preface

As part of an effort to improve its product lines, EMC periodically releases revisions of its software and hardware. Therefore, some functions described in this document may not be supported by all versions of the software or hardware currently in use. The product release notes provide the most up-to-date information on product features.

Contact your EMC technical support professional if a product does not function properly or does not function as described in this document.

Note: This document was accurate at publication time. Go to EMC Online Support (https://support.emc.com) to ensure that you are using the latest version of this document.

Audience

This guide is intended for Administrators, Business Analysts, and Compliance Auditors who need to set up and configure an EMC Kazeon - eDiscovery server to work with EMC Celerra repositories, and then search and produce reports on those repositories.

Related Documentation

IS1200 Installation and Quickstart Guide - describes installing and configuring the IS1200 server software.

IS1200 Web-Admin User and Configuration Guide - describes using Web-Admin to set up and manage Kazeon clusters.

IS1200 Web-Search User Guide - describes using Web-Search to perform basic and advanced searches.

IS1200 Web-Reports User Guide

IS1200 eDiscovery Case Manager Administrators and Supervisors Guide - for legal representatives, a primer on all the web-based interfaces above for performing eDiscovery.

IS1200 Command Line Interface Reference Guide - describes the IS1200 Command Line Interface and all its commands.

Follow these steps to download IS1200 documents from the web:

1. Go to https://support.emc.com and click the SUPPORT BY PRODUCT option on the home page.

2. In the Find a Product field, enter Kazeon. From the product selection list, choose one of the sub-headers (such as Kazeon ECS) and click the Find button.

3. The Kazeon ECS window is displayed. Click the link for Documentation.

4. In the left-navigation menu, choose a version level to display the available documents.

Conventions used in this document

EMC uses the following conventions for special notices:

DANGER indicates a hazardous situation which, if not avoided, will result in death or serious injury.

WARNING indicates a hazardous situation which, if not avoided, could result in death or serious injury.

CAUTION, used with the safety alert symbol, indicates a hazardous situation which, if not avoided, could result in minor or moderate injury.

NOTICE is used to address practices not related to personal injury.

Note: A note presents information that is important, but not hazard-related.

IMPORTANT

An important notice contains information essential to software or hardware operation.

Typographical conventions

EMC uses the following type style conventions in this document.

Normal
Used in running (nonprocedural) text for:
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• Names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, functions, utilities
• URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, notifications

Bold
Used in running (nonprocedural) text for:
• Names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, man pages
Used in procedures for:
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• What user specifically selects, clicks, presses, or types

Italic
Used in all text (including procedures) for:
• Full titles of publications referenced in text
• Emphasis (for example a new term)
• Variables

Courier
Used for:
• System output, such as an error message or script
• URLs, complete paths, filenames, prompts, and syntax when shown outside of running text

Courier bold
Used for:
• Specific user input (such as commands)

Courier italic
Used in procedures for:
• Variables on command line
• User input variables

< > Angle brackets enclose parameter or variable values supplied by the user

[ ] Square brackets enclose optional values

| Vertical bar indicates alternate selections - the bar means “or”

{ } Braces indicate content that you must specify (that is, x or y or z)

Where to get help

EMC support, product, and licensing information can be obtained as follows.

Product information: For documentation, release notes, software updates, or for information about EMC products, licensing, and service, go to the EMC Online Support at:

https://support.emc.com

Technical Support: Go to EMC Online Support and click Service Center. You will see several options for contacting EMC Technical Support. Note that to open a service request, you must have a valid support agreement. Contact your EMC sales representative for details about obtaining a valid support agreement or with questions about your account.

Documentation Feedback

Your suggestions help us continue to improve the accuracy, organization, and overall quality of the user publications. Please send your comments or opinions on this document to:

[email protected]

Revision History

Table 1 Revision History Details

Revision Date     Description

September 2015    Updated the Deduplication section in “Glossary”.

August 2014       Added an update to the mixed mode support by Kazeon in “Creating a Metadata Repository for a Celerra FLR Repository” on page 9.

May 2014          • Updated information about support for mixed mode in “Creating a Metadata Repository for a Celerra FLR Repository” on page 9.
                  • Changed all instances of Exchange connector to FLR connector.

December 2013     Initial Publication

Chapter 1

Introduction

This guide is provided as a companion to the IS1200 Web-Admin User and Configuration Guide, which should be read first because it contains most of the basic IS1200 server setup and maintenance information on which this guide builds.

Topics include:

◆ About the IS1200

◆ Extending Server Functionality

◆ Celerra FLR Functionality

◆ The Celerra FLR Retention Manager

◆ Supported Celerra FLR Configurations

◆ Celerra FLR Retention Manager Implementation Overview

◆ Celerra FLR Retention Manager Implementation in New Installations

◆ Celerra FLR Retention Manager Implementation In Existing Installations

About the IS1200

The IS1200 is an integrated hardware and software system that provides information management solutions enabling organizations to efficiently and cost-effectively classify, manage, and retrieve data. It provides consistent information visibility and control across distributed files, minimizes the risk of unmanaged files, integrates seamlessly with existing infrastructure, and scales to support billions of files for searching, reporting, backup search and recovery, and file migration and archiving.

Extending Server Functionality

The standard IS1200 uses clustering to offer a scalable solution for classifying, searching, reporting on, and applying Actionable Services to search and report results found on registered data repositories. Data repository types include NFS, CIFS, and many other vendor-specific servers such as Microsoft Exchange and Microsoft SharePoint servers.

The IS1200’s standard functionality, and the types of servers it can access, can be expanded with add-on modules like the FLR Connector. The FLR Connector requires an additional license for the IS1200 and allows that IS1200 to register and manage Celerra FLR Servers.

Celerra FLR Functionality

The main objective of the EMC Celerra FLR (file-level retention) server is to protect files from deletion or modification until a specified retention date. FLR allows creating a permanent, unalterable set of files and directories, and ensures the integrity of the data they contain for a controllable retention period.

This server prevents users from deleting or modifying files that are locked and protected. This is highly useful during legal issues where files must be preserved without modification for the entire course of a legal case, and for organizations like health service providers that must retain medical records for fixed periods determined by legal mandates.

The Celerra FLR is designed to provide file retention at both the self-regulated and government-regulated levels.


The Celerra FLR Retention Manager

While the standard IS1200 can register standard Celerra shares that have been exported as CIFS or NFS, the FLR Connector allows the IS1200 to register Celerra FLR servers and work with their FLR-specific retention capabilities. Once a Celerra FLR server has been registered as a data repository, and classified, it can be searched, reported on, and have Actionable Services such as Retention applied to search and report results.

Adding the FLR Connector also enables the IS1200 to provide additional Celerra FLR-specific search options, retention reports, and new Action options. The IS1200 can transfer files to Celerra FLR servers, and access those files to classify and search them. Most importantly, the FLR Connector allows the IS1200 to help manage all the files on Celerra FLR servers, including file settings such as retention times. Additionally, IS1200 standard retention reports such as “Expired Files”, “Retention by Aging”, and “UpForExpiry” allow identifying files that have passed their retention dates, and Actionable Services such as Retention, Copy, Move, and Delete allow report results (files) to have retention periods modified or extended, or expired files to be efficiently archived or deleted.


Supported Celerra FLR Configurations

The FLR Connector supports EMC Celerra FLR versions 5.6.4.3 and above.


Celerra FLR Retention Manager Implementation Overview

If FLR Connector capabilities are ordered with the original appliance purchase, the appropriate optional module license is automatically included in the server master license and is routinely installed along with the server license.

If the FLR Connector is purchased after the original installation, the FLR Connector license key must be added to the installation before Celerra FLR repositories can be registered and managed. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing optional module licenses for existing installations.

Celerra FLR Retention Manager Implementation in New Installations

The FLR Connector software is automatically installed with all IS1200 installations (ECS and FI); the FLR Connector license key simply activates it.

To use the FLR Connector purchased with a new installation, follow the standard hardware and software installation and configuration instructions in the IS1200 Installation and Quickstart Guide and then use the rest of this guide to configure the connectors.

Celerra FLR Retention Manager Implementation In Existing Installations

If the FLR Connector license key is obtained after the original IS1200 installation, use the following general steps to implement FLR Connector capabilities.

1. Use the current Web-Admin application to add the FLR Connector license key to the IS1200, and then quit and relaunch the Web-Admin application. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing optional module licenses for existing installations.

2. Register your Celerra FLR servers as IS1200 data repositories.

3. Classify your Celerra FLR repositories.

4. Search, report, and apply Actions to your Celerra FLR servers as necessary.

Chapter 2

Registering Celerra FLR Repositories

This chapter discusses registering Celerra FLR network shares as EMC Kazeon - eDiscovery data repositories. Topics include:

◆ Supported Configurations

◆ Celerra FLR Repository Registration Requirements

◆ Creating a Metadata Repository for a Celerra FLR Repository

◆ Registering Celerra FLR Repositories

◆ Registering Repositories Using Web-Admin

◆ Registering Repositories from the CLI

Before Celerra FLR repositories can be classified, or searched and reported on, they must be registered with the IS1200 as data repositories and have a metadata repository assigned.

Supported Configurations

The Celerra FLR Retention Manager supports all Celerra FLR versions 5.6.4.3 and above. The Connector supports registering up to sixteen Celerra FLR repositories per IS1200 node (the same IS1200 support supplied for all NFS or CIFS repositories). Each Celerra FLR repository can consist of multiple Celerra servers.

Celerra FLR Repository Registration Requirements

To register and access Celerra repositories, the IS1200 needs the following:

◆ An appropriate metadata repository to associate with the repository when it is registered.

◆ If the Celerra FLR repository is exported as a CIFS share, an identity that has complete access to the repository to be registered.

Creating a Metadata Repository for a Celerra FLR Repository

Before registering a Celerra FLR repository, one or more metadata repositories must be registered to store the extracted metadata. For information on adding metadata repositories, see the Repository Registration and Management chapter of the IS1200 Web-Admin User and Configuration Guide. For optimal performance, use metadata repositories on NFS systems.

Dedicated IS1200 metadata repositories should be assigned to each Celerra FLR server. Because a Celerra FLR server typically hosts many terabytes of data, multiple metadata repositories may be needed to store the metadata for a single Celerra FLR server. If you find that the number of metadata repositories is inadequate, register more metadata repositories.

Kazeon supports only basic crawl on data repositories that use mixed mode. For the mixed-mode environment to work, the user mapping should already be configured. To perform this mapping, refer to the documentation for the respective storage system (for example, Isilon or Celerra).

If the user mapping is not performed correctly, you may not get the correct custodian or owner information of the object. For more information about mixed mode support, see the IS1200 4.8 Web-Admin User Guide.

Note: Metadata repositories may not be shared between two registered Celerra FLR servers or between a Celerra FLR cluster and any other registered data repository.


Registering Celerra FLR Repositories

Celerra FLR repositories may be registered with the IS1200 using either the CLI or Web-Admin.

Registering Repositories Using Web-Admin

To register a Celerra FLR server as a data repository from Web-Admin, do the following:

1. In Web-Admin, select Repository View under Repositories in the left-navigation menu. The Repositories tab opens:

Figure 1 The Web-Admin Repositories Page

3. Select NFS or CIFS from the Repository Type drop-down menu, depending on how the Celerra FLR repository you want to register was shared.

One of the following two tabs appears:


Figure 3 The Web-Admin Add Repository Tab for CIFS

4. Fill in the fields on the screen you see as follows:

Name. Enter a reference name for this repository. The IS1200 uses reference names, instead of the full repository filepaths, in all menus where a user must choose a repository, for example when choosing a data repository for a classification or collection. Reference names must be unique. Reference names are limited to 127 bytes (127 ASCII characters, or as few as 31 four-byte UTF-8 encoded characters). Reference names may include some special characters; see “Special Characters” on page 31 for details.

Metadata File System. From the drop-down menu, select the metadata repository you created for the Celerra cluster, or let the IS1200 auto select one.

Server. Enter the name of the NFS/CIFS file server hosting the repository to add. This may already be entered (and unchangeable) if registering a “discovered” data repository.

If you are registering an NFS repository:

Mount Path. Enter the mount path of the NFS repository on the host file server.


If you are registering a CIFS repository:

Share Name: Enter the share name, or mount point, of the CIFS repository to register.

Identity: Select a pre-defined user identity from the drop-down list to use when the IS1200 accesses this filer. If an appropriate identity is not available, click the Create Identity button to add one. For more information on identities, see The Identity Vault chapter of the IS1200 Web-Admin User and Configuration Guide.

Specify Use: Select one of the following:

Source. Register this repository as a source. This includes the reference name (specified above) in all dialogs where a repository can be chosen as a source, for example when doing a Collection in either Web-Admin or the eDiscovery Case Manager.

Target. Register this repository as a target. This includes the reference name (specified above) in all dialogs where a repository can be chosen as a target.

Source and Target. Register this repository as both a source and a target for dialogs.

Read Only. Check to indicate the filer being registered is Read Only to the IS1200. This option should be used if the repository being registered is exported or shared as Read Only. If this option is not set, the IS1200 assumes the filer being added is Read Write.

Repository Vendor. Check the EMC Celerra FLR checkbox.

Once the EMC Celerra FLR checkbox is set, and the repository is registered, the option cannot be un-set! The only way to uncheck this option is to offline the repository, delete it, and re-add it without the option checked.

Storage Tier. Optionally, specify the storage tier where the data repository is located. The storage tier can be any number between 0 and 255. Default is 0.

Force add on errors. Select to force adding this device to the registered list in spite of errors.

Preserve Access Time. This option is unavailable for Celerra FLR repositories.

5. Click Submit to register the data repository.
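The Name and Storage Tier limits described in step 4 can be checked before a form is submitted. The following Python sketch is illustrative only and is not part of the IS1200 software; the function names are hypothetical, and the rules for special characters are not covered because they are documented elsewhere.

```python
from typing import Optional, Set

MAX_NAME_BYTES = 127   # limit stated for reference names
MAX_STORAGE_TIER = 255  # storage tier range is 0 through 255


def reference_name_error(name: str, existing: Set[str]) -> Optional[str]:
    """Return a description of the problem, or None if the name passes."""
    size = len(name.encode("utf-8"))  # the limit is in bytes, not characters
    if size == 0:
        return "reference name is empty"
    if size > MAX_NAME_BYTES:
        return f"reference name is {size} bytes; the limit is {MAX_NAME_BYTES}"
    if name in existing:
        return "reference name is already in use"
    return None


def storage_tier_error(tier: int) -> Optional[str]:
    """Return a description of the problem, or None if the tier is in range."""
    if not 0 <= tier <= MAX_STORAGE_TIER:
        return f"storage tier {tier} is outside 0-{MAX_STORAGE_TIER}"
    return None
```

Because the limit is counted in bytes, a name made of four-byte UTF-8 characters reaches it after 31 characters, which matches the figure given for the Name field.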

Registering Repositories from the CLI

To register a Celerra FLR server from the Command Line Interface, use the following general command:

add datafs <referenceName> mount <mountPoint> as <identity> <attributesHere>

Where:

<referenceName> is the name the IS1200 should use when displaying this repository in repository selection menus.

<mountPoint> is the mount path for the repository.

<identity> is the name of an identity (already stored in the IS1200 Identity Vault) to use to access the CIFS repository. <identity> is only required when adding CIFS repositories.

<attributesHere> is the keyword attributes followed by a comma-separated list of attributes appropriate to the repository. Examples follow:

To add an NFS repository from a Celerra FLR server and make it a source repository:

add datafs celerra_nfs mount celerraflr_server:/nfs1 attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=no

To add an NFS repository from a Celerra FLR server and make it a source repository and a target repository:

add datafs celerra_nfs mount celerraflr_server:/nfs2 attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=yes

When adding CIFS repositories, an identity must be available containing credentials allowing complete access to the repository. Assume the identity celerraflr_identity is available for the following examples.

To add a CIFS repository from a Celerra FLR server and make it a source repository:

add datafs celerra_nfs mount //celerraflr_server/cifs1 as celerraflr_identity attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=no

To add a CIFS repository from a Celerra FLR server and make it a source repository and a target repository:

add datafs celerra_nfs mount //celerraflr_server/cifs2 as celerraflr_identity attributes celerraflr=yes,fs_preserve_timestamp=no,source_repository=yes,target_repository=yes
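When many repositories must be registered, the command text can be generated rather than typed. The following Python sketch only assembles a string in the general form shown above; it does not connect to an IS1200 or run anything. The function name is hypothetical, the attribute names are copied from the examples in this section, and target_repository is assumed to be the intended spelling of the last attribute.

```python
from typing import Optional


def build_add_datafs(reference_name: str, mount_point: str, *,
                     identity: Optional[str] = None,
                     source: bool = True, target: bool = False) -> str:
    """Compose an 'add datafs' command line for a Celerra FLR repository."""
    def yes_no(flag: bool) -> str:
        return "yes" if flag else "no"

    # Attribute list as used in the examples of this section.
    attributes = ",".join([
        "celerraflr=yes",
        "fs_preserve_timestamp=no",
        f"source_repository={yes_no(source)}",
        f"target_repository={yes_no(target)}",
    ])
    parts = ["add", "datafs", reference_name, "mount", mount_point]
    if identity is not None:
        # The 'as <identity>' clause is only needed for CIFS repositories.
        parts += ["as", identity]
    parts += ["attributes", attributes]
    return " ".join(parts)


# NFS share registered as a source only
print(build_add_datafs("celerra_nfs", "celerraflr_server:/nfs1"))
# CIFS share registered as both source and target
print(build_add_datafs("celerra_nfs", "//celerraflr_server/cifs2",
                       identity="celerraflr_identity", target=True))
```

Since reference names must be unique, a script that registers several shares would pass a different reference name for each one.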

Chapter 3

Workflow Changes

This chapter describes how installing a Celerra FLR Retention Manager license on the IS1200 changes the standard workflow procedures.

Note: Before continuing, be sure a valid Celerra FLR Retention Manager license key is installed on all nodes of your IS1200 cluster. See the Installing License Keys chapter of the IS1200 Web-Admin User and Configuration Guide for details on obtaining and installing a license key if you have not already installed the key.

Topics include:

◆ Celerra FLR Retention Capabilities

◆ Celerra FLR Legal Hold Capabilities

◆ Celerra FLR Collection Options

◆ Administration

◆ Searching

◆ Reporting

Celerra FLR Retention Capabilities

When a Celerra FLR Retention Manager license is added to the IS1200, new retention options become available in all screens, pages, or tabs that move or copy files to a Celerra FLR repository. The new options generally look like the following:

Figure 4 Retention Options for Celerra FLR Data Repositories

The new options work as follows:

Retention Date Selection: Check this box to determine the retention date to use:

Absolute Date/Time: Select this radio button to set an absolute retention date. Use the Time drop-down menu and the Calendar tool to determine the specific retention time and date.

Relative Date: Select this radio button to set a relative retention date, that is, a date determined using the following formula:

retention date = <soMany> <timeUnits> from <fileTimeAttribute>

Use the fields to the right of the Relative Date radio button to set up the formula:

• use the first empty field to set the <soMany> number

• use the middle drop-down menu to set the <timeUnits>

• use the left drop-down menu to set which <fileTimeAttribute> to base the date on.

Retention Class Selection: This radio button is present whenever an EMC repository is selected, but is not available (grayed out) for Celerra FLR repositories.
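The Relative Date formula can be worked through with a short calculation. The Python sketch below is an illustration under stated assumptions, not the IS1200 implementation: the unit names (days, weeks, years) are hypothetical because this guide does not list the choices in the <timeUnits> menu, and the file time passed in stands for whichever <fileTimeAttribute> is selected.

```python
from datetime import datetime, timedelta

# Hypothetical unit names; the guide does not list the <timeUnits> choices.
FIXED_UNITS = {"days": timedelta(days=1), "weeks": timedelta(weeks=1)}


def relative_retention_date(file_time: datetime, so_many: int,
                            time_units: str) -> datetime:
    """Return <so_many> <time_units> from <file_time>."""
    if so_many < 0:
        raise ValueError("so_many must not be negative")
    if time_units in FIXED_UNITS:
        return file_time + so_many * FIXED_UNITS[time_units]
    if time_units == "years":
        try:
            return file_time.replace(year=file_time.year + so_many)
        except ValueError:
            # 29 February with a non-leap target year: use 28 February.
            return file_time.replace(year=file_time.year + so_many, day=28)
    raise ValueError(f"unsupported time unit: {time_units}")


# A file time of 1 September 2015, 12:00, retained for 7 years
print(relative_retention_date(datetime(2015, 9, 1, 12, 0), 7, "years"))
# prints 2022-09-01 12:00:00
```

Years are handled separately from days and weeks because a year is not a fixed number of days.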


Celerra FLR Legal Hold Capabilities

Only Legal Hold withOUT Enforcement is available on Celerra FLR servers.

More specifically, while a metadata tag for legal hold may be set for files on registered Celerra FLR repositories, the standard IS1200 Actionable Services legal hold option “Enforce Legal Hold at the repository level” is not available for Celerra FLR files. The IS1200 cannot change the file privileges on registered Celerra FLR repositories to prevent users from moving, changing, or deleting files, even those with the legal hold metadata tag set.

Celerra FLR Collection Options

Collections done from either Web-Admin or the eDiscovery Case Manager to Celerra FLR repositories include options for setting the retention options on the target. See “Celerra FLR Retention Capabilities” on page 18 for details on setting these options.

Administration

Web-Admin can register, classify, and do Single Step Collections to Celerra FLR repositories.

Searching

A new metadata namespace, “Retention”, contains the following new fields:

retentionlock, retentiondate, retentionsetdate, retentionsetuser, retentionreportdate

All these fields are viewable using the Show Metadata icon from search results, but only retentionsetuser is routinely indexed and can be searched for.

Actionable Services like Copy and Move that are applied to search results from Celerra FLR repositories contain new interfaces for setting retention options on their targets. A new Actionable Service tab called Retention allows extending retention settings. See “Celerra FLR Retention Capabilities” on page 18 for details on setting these new retention options.

Reporting

Web-Reports contains a report category called Retention reports. Retention reports list expired, soon-to-expire, and locked files allowing administrators to manage these files with Actions that reset retention settings or delete expired files.

Retention reports are only for repositories, such as Celerra FLR repositories, that implement specific retention features for the files they contain. Retention reports run on filers without retention capabilities return empty reports.

Retention reports prefixed by "Snaplock" are designed for SnapLock repositories. Running these on Celerra FLR repositories will return empty reports or errors.

Actions like Copy or Move that are applied to report results from Celerra FLR repositories contain new interfaces for setting retention options on their targets. See “Celerra FLR Retention Capabilities” on page 18 for details on setting these new retention options.

Glossary

This glossary contains terms related to disk storage subsystems, networks, file management, and eDiscovery. Many of these terms are used in this manual.

A

active case In eDiscovery situations, a company may have more than one legal issue (case) in progress at a time. Often it is advantageous to limit job or search scope to just one case. When the user interface scope is limited to a particular single case, that case is the active case.

Active Directory (AD) A technology created by Microsoft that provides a variety of network services, including: LDAP-like directory services, Kerberos-based authentication, and DNS-based naming and other network information.

Actions, Actionable Services

Services such as copy, move, delete, tagging, and so on, that can be applied to search and report results and allow the IS1200 to be an effective file management tool for registered repositories.

Access Control List (ACL)

A file system level data file that specifies how users or groups may access resources on a computer or network, like an application, file or printer, and the rights they have to it, for example read access, write access, and so forth. For more information on how the IS1200 may use ACLs, see the Controlling ACL Checking section of the Configuration Files and Utilities appendix of any IS1200 User Guide.

Advanced Search A search made from the IS1200 Advanced Search link. Allows searching for extracted metadata by tag-value pairs, and allows multiple variable and boolean searches.

Agents See “connectors” on page 25.

Assignment Rules An assignment rule is a type of classification rule. It tags files with metadata and assigns files to policy groups. Assignment rules are contained in Assignment Rule Sets (ASRs). See the Policies: Classification, Extraction and Assignment Rules chapter of any IS1200 User Guide for more details.

Auditing A service that allows the IS1200 to record all system events according to who did what, when, and the event result. This data is especially useful to Legal Service Providers when providing an audit trail for responsive data produced during eDiscovery. Complete details are available in the Auditing and Data Verification chapter of any IS1200 User Guide.

Authorization Rule A policy rule that filters search results to ensure that the assigned files can only be viewed by authorized users. IS1200 authorization policies may be used to add additional levels of security to the Access Control Lists (ACLs) for file objects found in registered data repositories. See the Policy Groups: Authorization Policies chapter of any IS1200 User Guide for more details.

Authentication The process of identifying users based on user name and password to ensure that only authorized users can access the IS1200.

B

Basic Search A search made from the Search page using only the Search field. Searches only the content found in the fullText field populated during classifications.

C

CAS Device EMC’s Content Addressed Storage (CAS) devices are cluster-able archival devices that host archival business file content such as email, office productivity files (like word processing and spreadsheet files), images, and other file documents.

CASID

A unique IS1200 ID for each classified file that the system generates during basic classification.

Centera Server The EMC Centera server is a networked storage system specifically designed to store and provide fast, easy access to fixed content (information in its final form). It is a CAS device providing long-term retention and assured integrity, designed to store and manage data that requires retention or has legally mandated retention periods, for example medical records and files relevant to legal matters.

Celerra Server An EMC server designed to store and manage archival data. The Celerra File Level Retention (FLR) server also allows enforcing enterprise or governmental retention policies.

checkpoints, checkpointing

Checkpoints and checkpointing allow IS1200 jobs and services to resume more efficiently if the job or service is paused or stopped before it completes. Basically, the IS1200 records “bookmarks” about what file or object was last processed. This allows the IS1200 to skip to the bookmark—the checkpoint—when the job or service is resumed, and avoid reprocessing all the files and objects already processed.

However, checkpoints are not set for every file accessed. Instead, most jobs divide file processing into “batches”, and the checkpoints indicate where batches started. Consequently, when a job restarts at a checkpoint, some objects may be reprocessed and, in cases such as a 'Copy' service with the 'enable-versioning' option selected, duplicate versioned files will be created on the target repository when those objects are reprocessed.
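The batch behavior described above can be illustrated with a minimal Python sketch. This is not IS1200 code: the function name `run_job`, the batch size, and the in-memory checkpoint are all assumptions, chosen only to show why objects handled after the last checkpoint are processed a second time when a job resumes.

```python
def run_job(objects, batch_size, checkpoint=0, stop_after=None):
    """Process objects in batches and return (processed, checkpoint).

    checkpoint is the index at which the most recent batch started.
    stop_after simulates pausing the job after that many objects
    have been handled in this run (hypothetical parameter).
    """
    processed = []
    handled = 0
    for start in range(checkpoint, len(objects), batch_size):
        checkpoint = start  # bookmark recorded at the start of each batch
        for obj in objects[start:start + batch_size]:
            if stop_after is not None and handled == stop_after:
                return processed, checkpoint
            processed.append(obj)
            handled += 1
    return processed, len(objects)


files = ["a", "b", "c", "d", "e", "f", "g"]
# The job is paused in the middle of the second batch.
first_run, bookmark = run_job(files, batch_size=3, stop_after=5)
# Resuming restarts at the beginning of that batch.
second_run, _ = run_job(files, batch_size=3, checkpoint=bookmark)
reprocessed = sorted(set(first_run) & set(second_run))
```

In this sketch the objects "d" and "e" are handled in both runs, which is the situation in which a versioning copy would create duplicate versions on the target.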

Classification Rule Rules that the system implements during data classification to extract metadata, tag files, and assign files to policy groups. The two types of classification rules are extraction rules and assignment rules.

Classification Service Sometimes called a “crawl”. An IS1200 service that accesses job-specified registered repositories and extracts and records their metadata to later facilitate comprehensive and cross-repository searches. Classifications extract metadata according to extraction rules, compute digests for all objects, and assign files to policy groups according to assignment rules. See “Assignment Rules” on page 22, “Extraction Rules” on page 29, “Hash Values” on page 30, and “Policy Groups” on page 34 for more details.

Classifications may be “full”, where every object in the specified repositories is parsed and its metadata repopulated in the indexes and databases, or they may be “differential”; see “Differential Classifications” on page 27 for more details.

Cluster A set of IS1200 appliance nodes working as a unit. A cluster can contain a maximum of four nodes. A cluster can be used to control other clusters; see “Information Center Server” on page 31 for details.

CAS Content Addressable Storage

Rather than addressing data objects by file name at a physical location, a CAS device uses a content address (a hash-code identifier) based on file contents to store file objects in a flat file system that maximizes storage efficiency. Storing an object returns a unique identifier (the Content Address) that is used to store and retrieve data objects.
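As an illustration of content addressing in general, the following Python sketch keys each stored object by a hash of its bytes. It is a toy model, not the Centera implementation: the class name `ContentStore` and the choice of SHA-256 are assumptions, since this guide does not state which hash function a CAS device uses.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: objects are keyed by a hash of their bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not from a name or location.
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data  # identical content maps to one entry
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def __len__(self):
        return len(self._objects)


store = ContentStore()
addr1 = store.put(b"quarterly report")
addr2 = store.put(b"quarterly report")     # same content, stored only once
addr3 = store.put(b"quarterly report v2")  # different content, new address
```

Because the address depends only on the bytes, storing the same content twice yields the same address and consumes no additional space in this model.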

CSV Comma Separated Values

A file type used to transfer data between applications such as databases and spreadsheets.

CLI Command Line Interface

The CLI is a traditional command line interface that allows direct communications with the IS1200 “backend” using the set of commands defined in the IS1200 Command Line Interface Reference Guide.

Concepts Search The standard IS1200 software supports keyword exploration. However, in the initial stages of the legal discovery process (often called eDiscovery), keyword search alone may not be as concise or as time-efficient as required by standard legal timetables.

Concepts augments standard keyword searching by automatically suggesting filters based on the results of a current search. By default it looks for concepts based on persons, countries, noun groups, organizations, company names, and products.

Concepts Search is an optional module that requires an additional license key for each IS1200 cluster node. See the IS1200 Concepts Search User and Configuration Guide for complete details.

conceptfinder Ruleset The conceptfinder ruleset is an assignment ruleset that extracts the concepts listed in the Review/Analysis Results Grouping Concepts pane, which is only available when a valid Concepts license is installed on the IS1200. The conceptfinder ruleset must be used in deep classifications to get the best results in Review/Analysis from the Concepts heading of the Results Grouping pane.

The ConceptFinder_DWF assignment ruleset combines both the conceptfinder ruleset and the DocsWithoutFullText ruleset. See “DocsWithoutFullText Assignment Ruleset” on page 28 for more details.

connectors Connectors are IS1200 optional modules that allow an IS1200 to work with repository types beyond the standard CIFS and NFS repositories. See “optional modules” on page 34 for more details. Optional module connectors require separate licenses to be purchased and installed on all nodes of an IS1200 cluster. For a complete list of optional modules available, see the Introduction chapter of any IS1200 User Guide.

Some connectors, such as the Microsoft Exchange Server Connector, require agents. Agents are additional server platforms, usually Windows servers, that provide the additional CPU cycles and network staging the IS1200 needs to work with the repository types they connect to.

All connectors have their own user guides, which can be accessed from the Kazeon Documentation link on the IS1200 Manager page (https://<yourIS1200Name>/manager).

Container file/object A file (object) that contains other files (sub-objects), such as ZIP, TAR, JAR, PST, or NSF files. The container file is often called the “parent” and the contained objects are called “children”. Container objects should not be confused with files that have embedded objects, such as Microsoft Word files that have embedded charts or graphics (OLE).

Custodian A legal term used by Legal Service Providers (LSP) and other legal personnel to describe the owners or responsible parties for electronic documents pertinent (responsive) to a legal matter.

D

Data

A file of any type and size such as a short email, a word processor document, or a large spreadsheet.

Datamap A report that lists the electronic storage locations of all possible sources of relevant ESI. This can include standard file servers, groupware servers, and email servers (and their backup and archive systems), as well as custodians’ desktop and laptop computers.

Data-Mount The NFS file system that is accessed by the IS1200 to parse data and extract metadata.

Data Server The file server that exports an NFS or CIFS file system so that the IS1200 can classify data on the file system to create metadata.

Data-Share The CIFS file system to be accessed by the IS1200 to extract metadata.

Data Repository A networked file system registered with the IS1200 so it can be classified, searched, and reported on. Data repositories created on the IS1200 itself (sometimes called localdatafs) are strongly discouraged!

Data Verification Builds on Auditing and is only available when system auditing is enabled. For job services like Actionable Services Copy or Move, Legal Hold Copy, and Single Step Collections, Data Verification generates an audit trail proving that files were not altered during these actions. This is especially valuable in eDiscovery situations. Complete details are available in the Auditing and Data Verification chapter of any IS1200 User Guide.

Deduplication A process that identifies file or email object and sub-object duplicates based on their digest values (see “Digest Values” on page 27 for details).

In the 4.7.0 and prior versions of the IS1200 software, deduplication was only available for export actions (Actionable Services such as Download, Legal Export, and Copy). This allowed exporting only the unique files and email objects from a set of search results. With IS1200 version 4.8.0, deduplication's functionality is expanded and is automatically applied during case collections and processing to allow displaying deduplicated search results. Note that when deduplication is applied to the display of search results, duplicates are only suppressed from display; in exported file sets, however, duplicates are physically removed.

Deduplication is available only in the ECS version of IS1200 and is applicable only in case context.

The search results view can be configured as either a deduplication view or a non-deduplication view. This makes it possible to see whether any object in the search results has duplicates, and which results are duplicates of an original.

Besides the automatic deduplication of collections and processing, deduplication may also be started manually from the IS1200's case dashboard.

Deduplication reports describing how a particular job or service applied deduplication are available. The reports can be accessed from the IS1200 case dashboard as well as from web search. Reports can list all results, only unique (deduplicated) results, or percentages of unique and duplicates.

Reduplication is a process that allows the duplicates of unique files to be identified so tagging processes can apply metadata tags to each unique file as well as all of its copies. Legal Tags reduplication can be done after documents are added to the case.
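The relationship between deduplication and reduplication can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: SHA-256 stands in for the IS1200 digest, the function names `deduplicate` and `reduplicate_tags` are hypothetical, and the first path seen for each digest is arbitrarily treated as the unique file.

```python
import hashlib
from collections import defaultdict


def digest(content: bytes) -> str:
    # Stand-in for the IS1200 digest value; SHA-256 is an assumption.
    return hashlib.sha256(content).hexdigest()


def deduplicate(results):
    """Group (path, content) pairs by digest; the first path seen is the unique one."""
    groups = defaultdict(list)
    for path, content in results:
        groups[digest(content)].append(path)
    return dict(groups)


def reduplicate_tags(groups, tags_for_unique):
    """Copy the tags applied to each unique file onto all of its duplicates."""
    tagged = {}
    for paths in groups.values():
        tags = tags_for_unique.get(paths[0], set())
        for path in paths:
            tagged[path] = set(tags)
    return tagged


results = [
    ("/share1/memo.doc", b"Q3 forecast"),
    ("/share2/copy of memo.doc", b"Q3 forecast"),
    ("/share1/notes.txt", b"meeting notes"),
]
groups = deduplicate(results)
unique_paths = [paths[0] for paths in groups.values()]
tagged = reduplicate_tags(groups, {"/share1/memo.doc": {"responsive"}})
```

Here the deduplicated view would show two results, while reduplication ensures that the "responsive" tag applied to the unique memo also reaches its copy on the second share.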

Differential Classifications

Differential classifications do not re-classify all file objects in the selected repositories. Instead, they examine the metadata from previous crawls, and if there is no previous metadata (indicating the object is new since the last classification) or the metadata has changed (based on atime or mtime changes), then the object is parsed and its metadata re-populated in the database.

Note: System classification configuration settings default to using mtime to determine if files have changed for differential classifications. If atime is desired instead, see the Using atimes for Differential Crawls section of the Configuration Files and Utilities appendix of any IS1200 User Guide for details on resetting the default to atime.

Additionally, atime may be applied only to selected classifications by initiating them from the Command Line Interface; see the add service deep-classification command and the crawl-atime-check-enabled option in the IS1200 Command Line Interface Reference Guide for details.
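The decision a differential classification makes for each object can be sketched as follows. This is a simplified illustration, not the IS1200 logic: the function `needs_classification`, the dictionary layout, and the integer timestamps are assumptions. Only the rule itself (parse the object if it has no previous metadata, or if the chosen timestamp has changed) comes from the description above.

```python
def needs_classification(path, current_stat, previous_metadata, use_atime=False):
    """Return True if the object is new or its timestamp changed since the last crawl.

    current_stat and the stored entries are dicts with 'mtime' and 'atime' keys.
    mtime is compared by default, mirroring the default described above.
    """
    previous = previous_metadata.get(path)
    if previous is None:
        return True  # no metadata from an earlier crawl: the object is new
    key = "atime" if use_atime else "mtime"
    return current_stat[key] != previous[key]


previous_metadata = {
    "/data/a.txt": {"mtime": 100, "atime": 150},
    "/data/b.txt": {"mtime": 200, "atime": 250},
}
current = {
    "/data/a.txt": {"mtime": 100, "atime": 175},  # read, but not modified
    "/data/b.txt": {"mtime": 220, "atime": 260},  # modified
    "/data/c.txt": {"mtime": 300, "atime": 300},  # new file
}
to_parse_mtime = sorted(
    p for p, s in current.items() if needs_classification(p, s, previous_metadata)
)
to_parse_atime = sorted(
    p for p, s in current.items()
    if needs_classification(p, s, previous_metadata, use_atime=True)
)
```

With mtime, only the modified and new files are parsed. With atime, a file that was merely read since the last crawl is parsed as well, which is why the choice of timestamp affects how much work a differential classification does.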

Digest Values Digests are numerical values calculated based on file and email content and are unique for all unique objects. Digest values allow file objects to be compared very quickly. Digests are calculated during basic and deep classifications or during collections or processing when indexing is enabled.

Digests are calculated differently for standard files, emails, and container objects. For standard files, a physical digest is computed for the entire file much like a hash value.

For email objects, just the subject, the message content (including attachments), and certain specific addresses are combined and an email digest value is calculated from the combination. Container objects, like ZIP or PST files, and their sub-objects have digests calculated both as complete objects and as individual sub-objects.

Note: Calculating email digests requires access to the email object's fullText and only classifications that include the fullText rule can produce email digests. Emails classified without the fullText rule receive the same physical digest that other files do. Consequently, identical emails on different repositories, one classified with and one without the fullText rule, will not be identified as duplicates.

Domino Server (Lotus) A Lotus server providing groupware solutions and storage.

Domino XML Language (DXL)

A Lotus version of eXtensible Markup Language (XML) used to import and export Lotus email files.

DocsWithoutFullText Assignment Ruleset

Some file objects, such as graphics files (for example .jpeg, .gif, or .bmp files), contain no text, and hence will have no fullText extracted by the FullTextRuleset; see “fullText” on page 30 for more details. In legal cases, these files may still contain responsive information, but not textual information that can be located by text searches. The DocsWithoutFulltext assignment ruleset identifies these files and adds the metadata tag and value “DocWithoutFulltext=true” to all files that contain no searchable text. This allows these files to be easily searched for later, and inspected for legal responsiveness by non-search methods. The ConceptFinder_DWF assignment ruleset combines the DocsWithoutFullText ruleset with the conceptfinder ruleset. See “conceptfinder Ruleset” on page 24 for more details.

Note: Parent file objects that don’t contain text (such as .zip, .tar, and .pst files) are not tagged with the DocWithoutFulltext tag.

Documentum Server (EMC)

The EMC Documentum server manages business content including documents, photos, video, medical images, e-mail, Web pages, fixed content, XML-tagged documents, and so on. The Documentum core is a repository that stores content securely under compliance rules and appears as a unified environment, even though content may reside on multiple servers and physical storage devices within a distributed environment.

E

eDiscovery The process of reviewing electronic files to determine their relevance and responsiveness to a legal matter or case.

eDiscovery Case Manager

An IS1200 tab that facilitates eDiscovery for Legal Service Providers.

Electronic Discovery Reference Model (EDRM)

The EDRM was a project created to provide standards and guidelines for the electronic discovery market. The model defines a common, flexible, and extensible framework for the development, selection, evaluation, and use of electronic discovery products and services.

Enterprise Vault A Symantec networked repository for archived email.

eth1, eth2 Most IS1200 platforms require two Ethernet connections for proper deployment. These connections are called eth1 and eth2, must each have unique IP addresses, and must be gigabit (1 Gb/sec) or faster connections. Additionally, all network segments between eth1 and all registered metadata and data repositories must be gigabit.

eth1 is used to communicate between the IS1200 and its registered repositories. The IS1200 hostname should be DNS mapped to the eth1 IP address.

eth2 must be connected to a private network between the IS1200 nodes and is used to coordinate and balance system-wide operations. The eth2 IP address should not be DNS mapped.

Extended Attributes User-defined keywords that are extracted during data classification.

Extraction Rules Extraction rules are a type of classification rule. They extract user-defined keywords (custom metadata) to add to the metadata file. Extraction rules are grouped into Extraction Rule Sets (ERSs). See the Policies: Classification, Extraction and Assignment Rules chapter of any IS1200 User Guide for more details.

Exchange Server (Microsoft)

A Microsoft server designed to store and manage email.

F

Federation A defined group of member-clusters on a Federation server that can be managed, searched, and reported on as a group. Member-clusters are referred to as Federated clusters.

Federation Server A single-node IS1200 server, with a Federation license, that allows consolidated searching and reporting of up to eight Federated member-clusters of its defined Federation.

fullText fullText is the “content” portion of a file, for example the textual content of word processing files and the message body of emails.

fulltext is an extraction rule that is used to save file textual content as metadata to the Search Index during classifications. It saves up to 10 megabytes of content by default. This default may be changed, but changing it is not recommended. Fulltext extraction is required by Review/Analysis for the Previewer pane to work and to generate Concepts in the Results Grouping pane.

fulltext is extracted differently for container objects and sub-objects, and for files with embedded objects.

Container objects (such as ZIP or PST files) and their sub-objects are classified individually, and the fulltext of the parent container file and of each child sub-object is extracted and added to the relevant metadata repository separately.

Files with embedded objects (such as a Microsoft Word file with an embedded spreadsheet) are classified together. The fulltext of the embedded object is included in the fulltext of its parent object and not collected separately.

For more details on fullText, see Chapter 1 of the IS1200 Metadata Reference Guide.

G

Groupware Collaborative software designed to help people involved in common tasks achieve their goals. Incorporates services such as email, calendaring, text chat, wiki, web-sharing, document control, and advanced search.

H

Hash Values Hash values are used to compare one file with another for duplicates. An extremely simplified description of hashing is that the numeric values of all bytes in a file are added into a grand total. The chances of two different files yielding the same result (hash value) are very small, so hash values can be used to identify duplicate files, or to compare files with the same name to decide if they have been modified.

Computing a hash on an entire file is called a full-hash, and computing a hash on a portion of the file is called a partial-hash. A partial hash may be used to increase classification speed, and hashing can be turned on or off; turning it off also increases classification speed.
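The difference between the simplified "grand total" description, a full hash, and a partial hash can be seen in a short Python sketch. The function names are hypothetical, SHA-256 is an assumed stand-in for whatever hash the IS1200 actually computes, and the four-byte partial-hash limit is chosen only to keep the example small.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """The simplified description: add up the numeric value of every byte."""
    return sum(data)


def full_hash(data: bytes) -> str:
    """Hash the entire content."""
    return hashlib.sha256(data).hexdigest()


def partial_hash(data: bytes, limit: int = 4) -> str:
    """Hash only the first `limit` bytes: faster, but blind to later changes."""
    return hashlib.sha256(data[:limit]).hexdigest()


original = b"contract draft 1"
copy = b"contract draft 1"
edited = b"contract draft 2"
reordered = original[::-1]  # same bytes in a different order

same_sum = additive_checksum(original) == additive_checksum(reordered)
full_matches_copy = full_hash(original) == full_hash(copy)
full_detects_edit = full_hash(original) != full_hash(edited)
full_detects_reorder = full_hash(original) != full_hash(reordered)
partial_misses_edit = partial_hash(original) == partial_hash(edited)
```

The plain byte total cannot tell the original from its reversed copy, which is why it is only a simplification. The partial hash trades accuracy for speed: an edit beyond the hashed portion goes unnoticed.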

I

identity A single entry in the Identity Vault database. The identity contains a single username and password that the IS1200 can retrieve when it needs to access a registered data or metadata repository or another server, such as an authentication service.

Identity Vault An encrypted database of usernames and passwords the IS1200 uses to store the credentials used to access registered data repositories, send email notifications, and work with authentication services.

Information Center Server

The standard IS1200 server offers clustering as a scalable solution for classifying, searching, and reporting on registered network repositories. While clustering is ideal for scaling to large numbers of files on a LAN, it is not a viable solution for WANs. Enterprises with multiple IS1200 clusters deployed, or IS1200 clusters deployed in remote offices, need the ability to set up and manage unified reports and searches across all their clusters. The IS1200 Information Center server provides this solution.

Each Federation server supports one federation. A Federation may have up to eight clusters (with four nodes each) included in it. Once a federation is established, it becomes a central management point allowing classifications, search, and reports to be set up or managed on all the federation’s members from the Information Center server. See the IS1200 Information Center User and Configuration Guide for complete details.

Intelligent Platform Management Interface (IPMI)

IS1200 clusters may contain more than one node. Normally each node communicates with the others to share information and workload. The IS1200 appliance includes an Intelligent Platform Management Interface (IPMI) to shut down nodes when individual node or software errors would degrade the overall cluster performance. The IPMI is an autonomous micro-controller, installed in all cluster nodes, that the cluster’s “leader” node uses to power down nodes with errors or performance problems. The IPMI requires its own unique IP address, but communicates over the eth1 port; see “eth1, eth2” on page 29 for more details.

K

Kazeon EVAgent An IS1200 service, installed on the Enterprise Vault server, that allows the IS1200 to directly open and access Enterprise Vault email for classification services.

Kaz-mount The NFS file system on which the IS1200 stores metadata (the IS1200 metadata repository).

Kazeon Query Language (KQL)

A programming language used in classification and assignment rules to identify files that should receive specified metadata tags.

KQL Reserved Words The KQL language reserves the following words. Consequently, they are not allowed to be searched for, or used as tags or aliases.

"ADD", "ALL", "ALTER", "AND", "ANY", "AS", "ASC", "AVG", "BETWEEN", "BY", "CASCADE", "CHECK", "COLUMN", "COUNT", "DESC", "DISTINCT", "ESCAPE", "EXISTS", "FROM", "FULL", "GRANT", "GROUP", "HAVING", "IN", "INTO", "IS", "JOIN", "KEY", "LEFT", "LIKE", "MAX", "MIN", "NOT", "NULL", "ON", "OR",

"ORDER", "OUTER", "REVOKE", "RIGHT", "SELECT", "SET", "SUM", "UNION", "UNIQUE", "UPDATE", "VALUES", "VIEW", "WHERE"

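A tag or alias name can be checked against the reserved words listed above before it is used. The Python sketch below is a hypothetical helper, not part of the IS1200 or KQL: the function name `is_reserved` is invented for illustration, and treating the comparison as case-insensitive is an assumption, since this guide lists the words only in upper case.

```python
# Reserved words copied from the list above.
KQL_RESERVED_WORDS = frozenset("""
ADD ALL ALTER AND ANY AS ASC AVG BETWEEN BY CASCADE CHECK COLUMN COUNT
DESC DISTINCT ESCAPE EXISTS FROM FULL GRANT GROUP HAVING IN INTO IS JOIN
KEY LEFT LIKE MAX MIN NOT NULL ON OR ORDER OUTER REVOKE RIGHT SELECT SET
SUM UNION UNIQUE UPDATE VALUES VIEW WHERE
""".split())


def is_reserved(word: str) -> bool:
    """Return True if the word collides with a KQL reserved word.

    Case-insensitive matching is an assumption made for this sketch.
    """
    return word.strip().upper() in KQL_RESERVED_WORDS


candidate_tags = ["Custodian", "Group", "ReviewStatus", "key"]
rejected = [tag for tag in candidate_tags if is_reserved(tag)]
```

In this example "Group" and "key" would be rejected as tag names, while "Custodian" and "ReviewStatus" would be accepted.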
Kaz-server The file server where the metadata repository is located.

Kaz-share The CIFS file system on which the IS1200 stores metadata.

Kaz Schema Defines the set of metadata fields used to build a Search Index for registered data repositories (file systems).

L

Legal Hold Files placed on legal hold are either copied to a secure secondary location where they can be preserved for later use, or are locked in their original locations against further change until a legal matter is resolved.

Legal Service Provider (LSP)

A lawyer or trained legal professional that provides legal services for a fee.

Local Refers to the local resources (usually the metadata repository) of the Federation server.

localdatafs A data repository created on the IS1200 itself. This practice is not recommended.

localkazfs A metadata repository created on the IS1200 itself. This practice is not recommended.

Logging rule Logging rules audit user actions on files such as file access, creation, modification, and deletion.

M

Manifest Reports Manifests are reports that summarize the results of an IS1200 job or service. Manifests are produced for Collections (from either Administration or Case Mgmt) and for some Actionable Services. Collection Manifests summarize what files were, or were not, collected during a collection. Actionable Service Manifests reconcile Actionable Services object-counts with the search result object-counts they are performed on, because processes such as deduplication can result in the two counts not matching. The reports detail the count of differences and the reasons for the differences. For more information, see Manifests in the IS1200 Web-Search User Guide.

Note: Collection manifests are available ONLY for collections done from v4.6.0 or later; earlier versions did not generate collection manifests.

Member-cluster Any of the clusters registered to a particular Federation.

Metadata Data about data. Metadata is used to search for information and to create reports. Metadata can be file system or custom metadata that the IS1200 extracts from files during classification. File system metadata includes file type and file path, extracted during basic classification. Custom metadata is generated during deep classification.

Metadata Repository A registered repository the IS1200 uses exclusively to record the metadata extracted during classification services on the registered data repository the metadata repository is mapped to.

The primary metadata repository is the host of the repository registration database, the report results database, Environment Discovery job results, Auditing and Data Verification databases, and miscellaneous databases the cluster requires for routine operation. Collectively these are called the Cluster Data Base.

Metadata repositories created on the IS1200 itself (sometimes called localkazfs) are not recommended.

N

Namespaces IS1200 software, versions 4.0 and higher, organizes metadata fields into a hierarchy defined by namespaces. Namespaces group similar sets of tags; for example, all the file level tags such as FileType, FileSize, aTime, and cTime are grouped together in the System namespace. See the IS1200 Metadata Reference Guide for complete details.

Network File System (NFS)

A protocol used primarily by Unix-based computers for accessing computer systems and filers over a network.

Network Information System (NIS)

A network naming, administration, and authentication system for smaller networks that was developed by Sun Microsystems and is used primarily by Unix systems.

Node A single IS1200 appliance.

Notes Storage File (NSF)

A standardized storage file format used by Lotus to store email, attachments, notes, calendars, and so on.

O

optional modules The standard IS1200 license provides a default set of features that allows the IS1200 to register, classify, and search and report on CIFS and NFS data repositories. Optional modules are additional software licenses that can add further capabilities, such as being able to work with repository types other than CIFS and NFS, or providing Concepts Search capabilities, or applying legal hold. Some optional modules require connectors, see “connectors” on page 25 for more details. For a complete list of available optional modules, see the Introduction chapter of any IS1200 User Guide.

P

PEA Files A Pool Entry Authorization (PEA) file is generated by the Centera server administrator. A PEA file defines what applications and users can perform read, write, delete, query, copy, or hold operations for Centera objects.

Policy Groups A policy group associates one or more authorization rules and logging rules with one or more files to protect information and audit user actions on files.

PST Files Personal STorage files are generally used by email programs like Microsoft Outlook to store user email locally. PST files are also called “composite” files, because they are packages meant to efficiently store a number of smaller related files. Another example of a composite file is a ZIP storage file.

R

Retention The process of enforcing corporate or legal standards for how long certain kinds of files must be preserved for access. Examples of retained files include files responsive to legal matters and medical records.

Roles All IS1200 users have a role, either admin, auditor, or end-user. If a legal license is installed, there may also be legaladmin, legalsupervisor, legalreviewer, or custodian roles. Roles determine what parts of the IS1200 interface may be seen, and how much of search and report results are displayed.

S

Search Analytics Pre-Processing

Search Analytics Pre-processing was introduced in release 4.5.0 to minimize search results display time and improve the overall efficiency of eDiscovery culling. Analytics Pre-processing is an integral, automatic, post-processing job performed after any job that modifies the Search Index. Analytics Pre-processing trades an increased post-job indexing period for significantly reduced search results display times after the affected jobs complete.

A variety of jobs require Search Index changes and therefore require Analytics Pre-processing. These include Collections, Classifications, Delete, and Tagging jobs. The time required by Analytics Pre-processing is determined primarily by the number of objects in the affected data repository, the number of distinct analytic (result filter grouping) attributes (such as custodians, mail senders, mail recipients, sender domains, recipient domains, and so on), and the read/write performance of the metadata repository associated with the data repository.

Additionally, once any Analytics Pre-processing job is launched, all subsequent Analytics Pre-processing jobs (that might be required by other concurrent jobs-in-progress) wait for the current Analytics Pre-processing job to finish. However, before beginning any Analytics Pre-processing job for a particular data repository, the IS1200 checks all other jobs-in-progress for that repository to see if they might also require Analytics Pre-processing. If other jobs are found, the IS1200 waits for all these jobs to finish in order to launch a single Analytics Pre-processing job for all the jobs that affected the Search Index for that data repository.

Therefore, there are two best practices suggested for scheduling jobs that affect the Search Index:

• Schedule large classifications or collections such that both they and the Analytics Pre-Processing they require can fully complete before starting any other job. This allows the IS1200 to most efficiently schedule the required processing resources. Large jobs are those that affect data repositories with tens of thousands of objects or terabytes of data.

• Schedule small jobs (such as incremental collections, or post-search tagging operations) to run concurrently so the IS1200 can identify their common Analytics Pre-processing requirements and group them into a single job.

Note: IS1200s that are upgraded to v4.5.0 may need some additional configuration to make the most efficient use of Analytics Pre-Processing. See the Configuring the IS1200 To Use Proactive Indexing section of the Configuration Files and Utilities appendix of any IS1200 User Guide for complete details.

Search Index An IS1200 database that stores and indexes the file content metadata (including extended attributes and fullText) for standard and custom user-defined metadata produced by extraction rules during classifications.

SharePoint Server (Microsoft)

A Microsoft server in the groupware category.

snippets A snippet is a sub-set of a document’s actual content. Snippets are only displayed if they are enabled in Review/Analysis Preferences, and only in Paragraph View immediately under the first line of the result listing.

After a keyword search completes, result snippets are created as small standard size chunks of data taken from the text surrounding a search query hit. For example, if a search is made for “medicine”, the snippet will contain about 300 bytes of the text surrounding the paragraph where the word “medicine” was found. If multiple search hits are found, the most relevant hit is used to create the snippet. For searches made without keywords, snippets are simply the first 300 bytes of file text.

Snippet size is configurable; see the Configuration Files and Utilities appendix of any IS1200 User Guide for details on setting snippet size. In all cases, snippets are taken from the result file’s fullText.
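The snippet behavior described in this entry can be approximated with the following Python sketch. It is an illustration only: the function `make_snippet` is hypothetical, the window is simply centered on the first keyword hit (the IS1200 uses the most relevant hit, and how relevance is ranked is not described here), and the 300-byte default mirrors the figure quoted above.

```python
def make_snippet(full_text: str, keyword: str = "", size: int = 300) -> str:
    """Return up to `size` bytes of text around the first keyword hit,
    or the first `size` bytes when there is no keyword or no hit."""
    data = full_text.encode("utf-8")
    needle = keyword.encode("utf-8").lower()
    hit = data.lower().find(needle) if needle else -1
    if hit < 0:
        start = 0
    else:
        # Centre the window on the hit, clamped to the text boundaries.
        centre = hit + len(needle) // 2
        start = max(0, min(centre - size // 2, len(data) - size))
    # Ignore any multi-byte character cut in half at the window edges.
    return data[start:start + size].decode("utf-8", errors="ignore")


text = "x" * 500 + " medicine " + "y" * 500
keyword_snippet = make_snippet(text, "medicine")
no_keyword_snippet = make_snippet(text)
short_snippet = make_snippet("short text", "text")
```

The keyword snippet contains the hit with surrounding text on both sides, the keyword-free snippet is simply the first 300 bytes, and a text shorter than the snippet size is returned whole.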

SourceOne Archive Server (EMC)

The EMC SourceOne server is a comprehensive, policy-based system that automatically collects, organizes, indexes and retains messages and associated attachments and stores them in designated archives connected to shared storage. EMC SourceOne provides indexed searching that works with both EMC storage and other brands such as IBM or NetApp.

Special Characters The IS1200 supports alphanumeric ASCII and UTF-8 characters.

Non-alphanumeric ASCII characters are defined as Special Characters and include the following:

‘ “ - _ \ / ! @ # $ % ^ & * + = { } [ ] ( ) < > | : ; , . ? ~ `

Special characters are not universally supported in the IS1200 interfaces. The following limitations must be noted:

Search Queries and Special Characters: Special characters pose a searching challenge. Because the IS1200 tokenization removes special characters from indexed text as it is classified, special characters are never entered into the IS1200 metadata indexes. Consequently, special characters may not be directly searched for. For more details see Tokenisation and Stemming in the IS1200 Web-Search User Guide. While special characters may not be directly searched for, the text they are included in can be searched. For example, the string "-ACME-" is tokenized on the hyphens and recorded in the metadata only as "ACME". Consequently, searching for the string with the hyphens (-) will NOT work. However, you can search for “?ACME?” (using the question mark wildcard), which returns results such as “!ACME!”, “@ACME.”, and so on. See the IS1200 Web-Search User Guide for more details on wildcards.
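The effect of tokenization on special characters can be shown with a small Python sketch. It is a simplified model, not the IS1200 tokenizer: the function `tokenize` and the regular expression are assumptions, and stemming and wildcard handling are not modeled.

```python
import re


def tokenize(text: str) -> list:
    """Split on runs of non-alphanumeric characters and drop the separators."""
    return [token for token in re.split(r"[\W_]+", text) if token]


indexed_tokens = tokenize("Invoice from -ACME- dated 2015/03/01")

# A literal query containing the hyphens matches no indexed token,
# while the bare word does.
literal_query_found = "-ACME-" in indexed_tokens
bare_word_found = "ACME" in indexed_tokens
```

Because the hyphens never reach the index in this model, only the bare word "ACME" can be matched, which is the behavior the paragraph above describes.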

Note: The question mark character ( ? ) may not be searched for in filepaths,
