IBM Spectrum Discover Update Version Release

(1)

(2)

Advanced Technology Group – Storage experts cover a variety of technical topics.

Audience: Clients who have or are considering acquiring IBM Storage solutions. Business Partners and IBMers are also welcome.

To automatically receive announcements of upcoming Accelerate with IBM Storage webinars, Clients, Business Partners and IBMers are welcome to send an email request to [email protected].

Accelerate with IBM Storage Support Site: https://www.ibm.com/support/pages/node/1125513

Accelerate with IBM Storage Technical Webinar Series

2021 Upcoming Webinars:

February 11 - TS7700 Best Practices
Register Here: https://ibm.biz/Bdfxei

February 18 - IBM Cloud Object Storage Level 201 Plus
Register Here: https://ibm.biz/BdfFD9

February 25 - DS8900F Management GUI Demo
Register Here: https://ibm.biz/BdfFbY

(3)

Please take a moment to share your feedback with our team!

You can access this 5-question survey via Menti.com with code 15 75 27 5 or direct link https://www.menti.com/mkg7a2x6q8

Or

QR Code

(4)

Agenda

• User Interface and User Experience Improvements

• Native deployment on OpenShift

• Backup and Restore

• Importing tags from external sources

• Data Management Policies, beyond Scale ILM/HSM

• AFM data movement

• Moonwalk data movement

(5)

1. Modernized look

2. Usability improvements

3. Navigation reorganization

(6)

Modernized Look and Usability Improvements:

• Login screen
- Show password toggle
- Store last user’s name to log in quickly

• Dashboard
- Single page load
- Performance increase
- Transition indication (loading state)

• About section
- Displays current version
- Link to Help (Knowledge Center)

• Tables
- Consistency across all tables

• Notification
- Larger notification

• Change password
- User summary available
- Change password available throughout the UI

(7)

- Our users need a way to reach across data, to find data that they need

- Conducts user’s search efforts through an intuitive workflow

- Leveraging complex queries without the need to know a querying language

A search bar that builds queries in real time as the user clicks to add items to the query. This bar is visible throughout the user’s journey, and

(8)


(9)

New

Spectrum Discover GUI Navigation

• Better organization and visually pleasing

- New hamburger menu

- Able to expand and collapse sections

- Role based view

- No longer takes a portion of the UI

(10)


New

Query Builder in Spectrum Discover 2.0.4

Press “Add” to enter the Tag Value Chart for each tag desired

Query summary

• A summary of the user’s tag value selection count.

Potential record count

• Displays an estimate of the total number of records for the current tag value selection.
• Allows users to fine-tune or broaden their query selection without having to perform a full search query.
• Especially useful on potentially long-running queries.

• Record counts estimated below 1000

(11)

Query Builder in Spectrum Discover 2.0.4: Tag Value Chart

(12)


Query Builder in Spectrum Discover 2.0.4 Continued…

→ Selected options are now in the search bar

• Can edit or clear tags already adjusted

(13)

• Select desired options and

Query Builder in Spectrum Discover 2.0.4 Continued…

Equivalent to cancel if “Update query” was not pressed.

(14)

→ Selected options are now in the search bar

• Can edit or clear tags already adjusted

• Add more tags to the Query Builder

Query Builder in Spectrum Discover 2.0.4 Continued…

(15)

→

NOTE: “Select All” will put a check in each option. However, this would not be added to the search string since

(16)

Query Builder in Spectrum Discover 2.0.4 Continued…

Alphanumeric Tag Value Selector

• Owner values can be filtered using alphanumerical filter bar or search textbox
• Two views available
- Alphanumerical view

(17)

→ The search string for the query built

Copy the search string into the cut buffer



When ready, press “View results” to see the results of the search

(18)

Query Builder in Spectrum Discover 2.0.4: SQL Query

Copy the search string into the cut buffer

Display previous searches

Similar to 2.0.3 search textbox. SQL WHERE query can be entered into the large text-area.

Query grouping is available via the ‘Group by’ dropdown menu.

Query limits can also be specified using the ‘Limit results’ textbox.

Custom SQL ordering can be applied using the ‘Sort by’ textbox.

Expected syntax is ‘column1 desc, column2 asc…’
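The ‘Sort by’ syntax above can also be composed programmatically before it is pasted into the textbox. The following is a minimal Python sketch under stated assumptions: the helper `build_sort_by` and the column names `size` and `mtime` are hypothetical illustrations, not part of Spectrum Discover.

```python
from typing import Iterable, Tuple


def build_sort_by(columns: Iterable[Tuple[str, str]]) -> str:
    """Join (column, direction) pairs into the 'column1 desc, column2 asc' form."""
    parts = []
    for name, direction in columns:
        direction = direction.lower()
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")
        parts.append(f"{name} {direction}")
    return ", ".join(parts)


# The column names below are hypothetical examples.
sort_by = build_sort_by([("size", "DESC"), ("mtime", "asc")])
print(sort_by)
```

Rejecting anything other than asc/desc keeps a typo from silently producing a string the ‘Sort by’ textbox would not accept.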

(19)

Query Builder in Spectrum Discover 2.0.4 – Condensed Results

Just like before, the initial results of the search are displayed condensed

(20)


Query Builder in Spectrum Discover 2.0.4 – Individual Results

The “Individual” button in the “Results view” section will show the individual records for all items in the search results

(21)

Query Builder in Spectrum Discover 2.0.4 – View Individual Records

Selecting a line, then pressing “View individual records” only displays individual records of selected line(s)

(22)

Query Builder in Spectrum Discover 2.0.4: Table Columns

Table columns displayed are now under the gear box. Tap the gear box again to make the list disappear.

(23)

(24)

Spectrum Discover 2.0.4: Account Settings

• The Account Settings now make it easy
- To find the User’s role
- To find the User’s domain
- To find the User’s e-mail
• Change password

(25)

Now Red Hat OpenShift applications can easily catalog and index data, bringing additional optimization to AI workflows and faster data classification

• Easier to deploy in a multicloud configuration
• Up to 50% less memory resources used, lowering costs vs. current VM deployment

IBM Spectrum Scale + IBM Spectrum Discover + Red Hat OpenShift all integrated

(26)

Benefits of OpenShift deployment

• For customers:

- No requirement for VMware

- Scalability

- Control of platform (e.g., OS updates)

• IBM:

- Platform supported by the customer (OS, CRI-O, RHOS)

What’s New?

✓ Network Layer / Service Mesh

✓ Different Db2 implementation (Db2 common container vs Db2u)

✓ Everything is containerized – no external scripts (backup/restore, maintenance mode)

❖ Not necessarily file-system access outside of containers

➢ “podman” vs. “docker”

➢ “oc” vs. “kubectl”

(27)

(28)

(29)

• Removed Spectrum Scale
- Overkill for a single node OVA deployment
- Issues with Spectrum Protect for Virtual Environments

• Preferred backup and restore on OVA
- VMware Snapshot

• Secondary backup / restore method on OVA
- Previous method of backing up the database.

• Need to support running in a purely containerized environment

(30)

Preferred Backup and Restore on OVA

• VMware Backup
- Set Spectrum Discover to maintenance mode
  - kubectl patch SpectrumDiscover spectrumdiscover --type merge -p '{"spec":{"maintenance":"on"}}'
- Suspend Db2wh container
  - docker exec -it Db2wh write-suspend
  - docker exec -it Db2wh stop
- Snapshot
  - Right Click on the VM
  - Snapshots - Uncheck the “Snapshot the virtual machine’s memory”

(31)

• VMware Restore
- Restore
  - Right Click VM
  - Snapshots
  - Manage Snapshots
  - Choose your snapshot
  - Revert To
- Bring Spectrum Discover out of maintenance mode
  - kubectl patch SpectrumDiscover spectrumdiscover --type merge -p '{"spec":{"maintenance":"off"}}'
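The maintenance-mode toggle used before the snapshot and after the restore can be wrapped in a small helper so the JSON patch is never mistyped. This is a minimal Python sketch that only builds and prints the kubectl command shown on the slides; the helper name `maintenance_patch_cmd` is hypothetical, and actually running the command (for example with `subprocess.run`) is left to the operator.

```python
import json
import shlex
from typing import List


def maintenance_patch_cmd(state: str) -> List[str]:
    """Build the kubectl patch command from the slides for state 'on' or 'off'."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    # Compact separators reproduce the patch exactly as written on the slide.
    patch = json.dumps({"spec": {"maintenance": state}}, separators=(",", ":"))
    return ["kubectl", "patch", "SpectrumDiscover", "spectrumdiscover",
            "--type", "merge", "-p", patch]


for state in ("on", "off"):
    # Print only; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(maintenance_patch_cmd(state)))
```

Generating the patch with `json.dumps` avoids the mixed straight and curly quotes that are easy to introduce when copying commands out of slides.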

(32)

• Like secondary OVA option except it’s run within a pod

• oc exec -it spectrum-discover-backup-restore-74dcf47c9c-9228n -- bash

• python3 initialSetup.py

• python3 backup.py

• python3 restore.py -r "2020-11-17"

• Runs ansible underneath the covers for all the Kubernetes interactions

• Additional log: /opt/ibm/metaocean/backup-restore/logs/ansible.log

(33)

Importing tag information from external analytics jobs.

(34)

Import Tags Application

What is it?

• Allows a Data Administrator to import a set of externally-curated tag values into Spectrum Discover

Motivations for this function

• Supports Data Accelerator for AI and Analytics (DAAA)

• An external analytics job might generate tag information for a set of S3/COS objects.

• This information can be utilized in an Import Tags policy to merge these tags into Spectrum Discover, extending its records with new information.

• This new information can then be used as part of a larger AI/Analytics pipeline

• Other potential use cases for this function might include:

• Extending Discover records with Scale user-defined attributes or S3 x-amz-meta tag information

(35)


(36)

Import Tags Application: Requirements and Restrictions

• Import Tags policies created via the IBM Spectrum Discover REST API

• Limited support in the GUI

• Import Tags policies managed by Data Administrator user

• This is initially to support DAAA use case, managed by Data Administrator

• Only a single Import Tags policy can run at a time

(37)

• Identify the COS or S3 data connection name of the data source connection
• Identify the vault name (COS) or bucket name (S3) of the data source connection
• Scan the S3/COS data source that is associated with the external tags
- Creates data source records in the Discover database
• Copy external tag comma-separated-value (CSV) file onto the vault/bucket so that Discover can access it
- This file can be copied before or after the scan
• Define tags in Discover

(38)

Import Tag File Format & Content Requirements

Format: comma-separated values (CSV)

Content:

• First row in the file is a header row.
- The value in the 1st column in this row is ignored by Discover
- The 2nd through Nth columns in this row must be existing Discover tag names (defined before running the policy)
• The first column is the object full path name prefixed with the bucket (S3) or vault (COS) name and ‘/’
- For example, if the vault (COS) name is bucket01, and object name is car1/image1.png, then the first column entry is bucket01/car1/image1.png.
• Subsequent columns in the CSV file contain the tag values to be imported into Spectrum Discover for the associated object records

Example CSV file:

objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
bucket01/car1/image1.png,1,3,0,1,1,0,1
bucket01/car2/image1.png,1,6,0,0,0,0,12
bucket01/car2/image2.png,1,3,0,2,1,0,1
bucket01/car3/image1.png,1,3,0,2,1,0,2

Given the preceding file, the following tags should be defined in Discover:

bus, tree, stop_sign, red_light, yellow_light, green_light, pedestrian

and Discover should have records for each of these objects in the datasource “bucket01”.
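The file rules above can be checked before the CSV is copied to the vault or bucket. The following is a minimal Python sketch, not Discover's own validation: the function `check_tag_file` is hypothetical, and the set of defined tags is supplied by hand here rather than read from Discover.

```python
import csv
import io
from typing import Dict, Set


def check_tag_file(text: str, bucket: str, defined_tags: Set[str]) -> Dict[str, Dict[str, str]]:
    """Validate an Import Tags CSV and return {object_path: {tag_name: value}}."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        raise ValueError("empty tag file")
    header, records = rows[0], rows[1:]
    tag_names = header[1:]  # the value in the first header column is ignored
    unknown = [t for t in tag_names if t not in defined_tags]
    if unknown:
        raise ValueError(f"tags not defined in Discover: {unknown}")
    result: Dict[str, Dict[str, str]] = {}
    for row in records:
        if not row:  # skip blank lines
            continue
        if len(row) != len(header):
            raise ValueError(f"wrong column count in row: {row}")
        path = row[0]
        if not path.startswith(bucket + "/"):
            raise ValueError(f"{path!r} is not prefixed with {bucket + '/'!r}")
        result[path] = dict(zip(tag_names, row[1:]))
    return result


SAMPLE = """objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
bucket01/car1/image1.png,1,3,0,1,1,0,1
bucket01/car2/image1.png,1,6,0,0,0,0,12
bucket01/car2/image2.png,1,3,0,2,1,0,1
bucket01/car3/image1.png,1,3,0,2,1,0,2
"""
TAGS = {"bus", "tree", "stop_sign", "red_light", "yellow_light", "green_light", "pedestrian"}

parsed = check_tag_file(SAMPLE, "bucket01", TAGS)
print(len(parsed), "object records checked")
```

A check like this catches an undefined tag name or a missing bucket prefix locally, before a policy run has to fail on it.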

(39)

(40)

Defining an Import Tags Policy

Policies are defined in JSON format, such as in the following example:

{
  "pol_id": "import_bucket01",
  "action_id": "IMPORT_TAGS",
  "action_params": {
    "agent": "ImportTags",
    "source_connection": "cos_bucket01",
    "tag_file_path": "bucket01/A2D2_labels.csv",
    "tag_file_type": "csv"
  },
  "schedule": "NOW",
  "pol_state": "active",
  "pol_filter": "datasource='bucket01'"
}

• pol_id: User-defined policy name
• source_connection: connection name
• tag_file_path: Import Tag File. Must reside on prefixed vault or bucket
• pol_filter: User-defined filter
• bucket01: vault name (COS) or bucket name (S3). Discover stores this value in the datasource tag

NOTE:

• JSON specified on system making the REST API call
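Because the JSON lives on the system making the REST API call, it can be generated there instead of typed by hand. This is a minimal Python sketch that reproduces the example policy from this slide; the helper `import_tags_policy` and its parameter names are hypothetical, and only the field names and values shown in the example are used.

```python
import json


def import_tags_policy(pol_id: str, connection: str, bucket: str, tag_file: str) -> dict:
    """Build an Import Tags policy body with the fields shown in the example."""
    return {
        "pol_id": pol_id,
        "action_id": "IMPORT_TAGS",
        "action_params": {
            "agent": "ImportTags",
            "source_connection": connection,
            # The tag file path is prefixed with the vault or bucket name.
            "tag_file_path": f"{bucket}/{tag_file}",
            "tag_file_type": "csv",
        },
        "schedule": "NOW",
        "pol_state": "active",
        "pol_filter": f"datasource='{bucket}'",
    }


policy = import_tags_policy("import_bucket01", "cos_bucket01", "bucket01", "A2D2_labels.csv")
print(json.dumps(policy, indent=2))
```

Redirecting the printed output to a file such as importTags.json gives a body with plain ASCII quotes, which matters because curly quotes copied from slides are not valid JSON.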

(41)

In Discover v2.0.4, Import Tags policies must be created and executed using the Discover REST API.

Curl example (TOKEN obtained via Discover REST API, SDHOST is IP or FQDN of Discover system):

$ curl -k -H "Authorization: Bearer ${TOKEN}" \
https://${SDHOST}/policyengine/v1/policies \
-d @importTags.json -H "Content-type:application/json" -X POST

A successful request returns HTTP code 201.
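The same POST can be prepared from Python with only the standard library. This is a minimal sketch that builds the request without sending it; the host name `discover.example.com`, the placeholder token and the helper `build_policy_request` are assumptions for illustration, while the endpoint path and headers are taken from the curl example.

```python
import json
import urllib.request


def build_policy_request(sd_host: str, token: str, policy: dict) -> urllib.request.Request:
    """Prepare the POST to the policy engine endpoint used in the curl example."""
    return urllib.request.Request(
        url=f"https://{sd_host}/policyengine/v1/policies",
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-type": "application/json",
        },
        method="POST",
    )


# Host and token are placeholders; the policy body is abbreviated here.
req = build_policy_request("discover.example.com", "TOKEN", {"pol_id": "import_bucket01"})
print(req.get_method(), req.full_url)
# Sending is deliberately left out: urllib.request.urlopen(req) would perform
# the call, and curl's -k corresponds to disabling certificate verification.
```

Keeping construction separate from sending makes it easy to inspect the URL, headers and body before anything reaches the Discover system.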

(42)

Managing a running Import Tags policy cont.

(43)

Import Tags policies can be paused/resumed/stopped/deleted via the IBM Spectrum Discover GUI or REST API calls

For more information, visit the IBM Spectrum Discover documentation in IBM Knowledge Center

https://www.ibm.com/support/knowledgecenter/SSY8AC

(44)

Data movement with Discover: Moving data between Scale file system and cloud with AFM

(45)

User creates data management policy

• Specify data policy type: COPY, MOVE, or TIER (determined by applications registered)
• Define filter to identify files or objects to be managed
• Identify Application to be used for data management
• Specify source connection, destination, etc.

Policy execution

• Spectrum Discover policy engine generates job request message(s)
• Job request messages placed on Kafka egress topic
• Data movement Application reads from topic
• Application performs the required data movement operation

(Figure: IBM Spectrum Discover (VMware, KVM, or OCP) with Policy Engine and Data Management Policy; Job Request Msg and Job Response Msg pass through the Egress and Ingress topics to the Data Mover Application; protocols shown: CES/NFS, SMB, NFS, S3)
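The policy execution steps above can be pictured with a small simulation. This is a minimal Python sketch, not the Discover implementation: an in-memory `queue.Queue` stands in for the Kafka egress topic, and the `JobRequest` fields, the batch size and the connection names are all hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class JobRequest:
    policy_type: str        # COPY, MOVE, or TIER
    source_connection: str
    destination: str
    paths: List[str]


def generate_job_requests(policy_type: str, source: str, destination: str,
                          matched: List[str], batch_size: int = 2) -> Iterator[JobRequest]:
    """Policy engine side: split the filter matches into job request messages."""
    for i in range(0, len(matched), batch_size):
        yield JobRequest(policy_type, source, destination, matched[i:i + batch_size])


def run_data_mover(topic: "queue.Queue[JobRequest]") -> List[str]:
    """Data mover side: read each message and report the operation it would perform."""
    done = []
    while not topic.empty():
        job = topic.get()
        for path in job.paths:
            done.append(f"{job.policy_type} {job.source_connection}:{path} -> {job.destination}")
    return done


egress_topic: "queue.Queue[JobRequest]" = queue.Queue()  # stand-in for the Kafka topic
for msg in generate_job_requests("COPY", "cos_bucket01", "scale_fs1", ["a", "b", "c"]):
    egress_topic.put(msg)
results = run_data_mover(egress_topic)
print("\n".join(results))
```

The point of the sketch is the decoupling: the policy engine only produces messages, and the registered application decides how the movement is carried out.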

(46)

Application Registration

• Data mover applications must be registered with Spectrum Discover

• Spectrum Discover application registration is a mechanism by which a specific application shares its definition with Spectrum Discover

• Once the application is registered, the Spectrum Discover policy engine creates application-specific Kafka topics (work and completion), which are used to interface with the application. After the application is successfully registered with Spectrum Discover, it will be available for Policy creation

• The response to a successful registration message contains the Kafka Broker IP, Port and the names of the Kafka work and completion queues/topics created for the particular application

(47)

Data Management Movement Applications

(48)

Spectrum Discover and Data Movement

Spectrum Discover metadata and policies deliver more effective data movement

(Figure labels: IBM Spectrum Discover Data Management; Scenarios; Spectrum Scale ILM, HSM, AFM-S3; Third Party Data Mover (Moonwalk); Third Party Storage Systems; Spectrum Scale; IBM COS, Red Hat Ceph; Spectrum Archive; Tape; NFS, SMB, S3)

(49)

Spectrum Discover has a new built-in (data mover) application ‘ScaleAFM’ which supports

• COPY operation with
- Source: IBM COS or S3 connection
- Destination: Spectrum Scale connection
• Scan source connection before creating data management policy to update Discover DBMS for policy execution
• Enabling live events for the Spectrum Scale connection aids in keeping the Spectrum Discover DBMS updated

Prerequisites

• Spectrum Scale cluster at version 5.1 or later
• Spectrum Scale fileset with AFM

(50)

Discover Driving Scale/AFM Data Movement Steps

In Spectrum Scale, create a new AFM/S3 fileset

Prerequisites

• Spectrum Scale v5.1+
• Fileset created with new AFM/S3 options
• Caching policy is set appropriately (all options supported – AFM function will handle copying changes back/forth if needed)
• Enable cache eviction settings for Fileset

(Figure: AFM relationship between IBM Spectrum Scale and IBM Cloud Object Storage; servers with CPUs & GPUs; shared NVMe storage)

(51)

In Spectrum Discover, Create a Data Connection to Scale and Scan it


(52)

Discover Driving Scale/AFM Data Movement

In Spectrum Discover, Create a Data Connection to the IBM COS Bucket and Scan it


(53)

In Spectrum Discover, Import Tags for Dataset (Optional)

• Discover updates existing data source connection from tag file

(54)

Discover Driving Scale/AFM Data Movement

In Spectrum Discover, Create & Run a Data Mover Policy


(55)

In Discover, Check for Policy Completion


(56)

Discover Driving Scale/AFM Data Movement – More Information, Details

• Refer to Scale version 5.1 documentation:

https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/ibmspectrumscale510_welcome.html

• See “Planning for AFM to cloud object storage” in the Scale Concepts, Planning, and Installation Guide, and “Configuring AFM to cloud object storage” in the Scale Administration Guide

• Once Scale and AFM are configured, refer to the Discover version 2.0.4 documentation:

https://www.ibm.com/support/knowledgecenter/SSY8AC_2.0.4/isd204_welcome.html

(57)

(58)

Moonwalk

Moonwalk is the first third-party data mover certified with Spectrum Discover

• Extends ability to move data between third-party storage and IBM storage
• Also for data movement between IBM storage systems if Spectrum Scale ILM data movement is not suitable
• Supports data movement into IBM Spectrum Scale* or IBM COS
• Support for Spectrum Scale via NFS today
• Native Spectrum Scale support available Q3/4
• Moonwalk is NOT included with Spectrum Discover
• Operates as a plug-in from Discover perspective
• Moonwalk purchased separately

Moonwalk value

• Ability to move data across diverse storage systems, object stores and cloud endpoints
• Policy engine scans the storage system and can take actions based on basic metadata attributes such as last access timestamp, size, file type, owner etc.
• REST API for extension/integration
• Massively scalable with no middleware
• Stateless architecture with no imposed limits on file/object count or capacity

Spectrum Discover value

• Metadata catalog for smarter decision making on what data to move
• Ability to select datasets for movement based on system and user-defined, custom metadata as well as content
• Spectrum Discover has tighter integration with Spectrum Scale, IBM COS, and Red Hat Ceph and is able to build indexes based on live events without having to scan the

(59)


(60)

Architecture: Spectrum Discover with Moonwalk

(Figure: control path and data path between IBM Spectrum Discover (Reporting, Dashboard, Search, Move/Copy/Tier, Policy Engine), the Data Mover, IBM Cloud Object Storage and IBM Spectrum Scale; access via Native API, SMB, NFS)

(61)

• The capabilities of Spectrum Discover data movement can be used to transfer data between data sources, based on a wide range of system and user-defined metadata

• Note: COPY and MOVE take care of object-to-file conversion and vice versa

• The currently supported data movement operations with Spectrum Discover and Moonwalk are the following:

Operation | Source Type | Destination Type
MOVE | NFS, S3, COS, SMB/CIFS | NFS, S3, COS, SMB/CIFS
COPY | NFS, S3, COS, SMB/CIFS | NFS, S3, COS, SMB/CIFS
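The support matrix above can be encoded as a lookup so that a planned policy is checked before it is created. This is a minimal Python sketch that covers only the MOVE and COPY rows listed here; the function `is_supported` is a hypothetical helper and does not query Spectrum Discover or Moonwalk.

```python
from typing import Dict, FrozenSet, Tuple

ENDPOINT_TYPES: FrozenSet[str] = frozenset({"NFS", "S3", "COS", "SMB/CIFS"})

# operation -> (allowed source types, allowed destination types), per the table
SUPPORTED: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = {
    "MOVE": (ENDPOINT_TYPES, ENDPOINT_TYPES),
    "COPY": (ENDPOINT_TYPES, ENDPOINT_TYPES),
}


def is_supported(operation: str, source_type: str, destination_type: str) -> bool:
    """Return True if the combination appears in the table above."""
    entry = SUPPORTED.get(operation.upper())
    if entry is None:
        return False
    sources, destinations = entry
    return source_type.upper() in sources and destination_type.upper() in destinations


print(is_supported("COPY", "SMB/CIFS", "COS"))
print(is_supported("MOVE", "NFS", "FTP"))
```

Any operation or endpoint type that is not in the table is reported as unsupported by this sketch, so the lookup would need extending if the matrix grows.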

(62)

Moonwalk: System Overview and Components

• Moonwalk drivers or gateways perform data operations as directed by defined policies

• Data operations include tiering, move, copy and recall, as well as a range of operations to assist disaster recovery

• Data is streamed directly between endpoints and storage without any intermediary staging on disk

• When installed in a Gateway configuration, Moonwalk software may function as a plugin container which provides extended support to enable access to third-party protocols and special devices. Device specific configuration details (such as sensitive encryption keys and authentication details) are contained and isolated from the source file server platforms and/or

(63)

The Moonwalk application registers with Spectrum Discover via Moonwalk Admin Center:

(64)

(65)

(66)

(67)

(68)

(69)

1. COPY from SMB (or NFS) to NFS. This can be leveraged to show how customers can move from "other" vendor storage onto IBM Spectrum Scale (with Scale file system as the storage for target NFS server connection).
Metadata: Product code, project identifier to accommodate restructuring
Options: Preservation of original namespace hierarchy in target file system

2. MOVE from SMB (or NFS) to COS. Similar story: how to move data from other storage systems onto COS.
Metadata: Searching for concerning information, content-search for “trigger” strings, if-then-else logic
Options: Preservation of original namespace hierarchy in COS

3. TIER from NetApp to S3. Identify inactive (cool > cold) datasets on expensive, high-performance NetApp filers and tier (archive) content to S3 or S3-compatible cloud storage.
Metadata: Searching for inactive files, obsolete (but compliance-related) files, etc., using metadata as well as content

(70)

Getting Started

Further information & evaluation software: ibm.moonwalkinc.com/spectrum-discover

Email the Moonwalk IBM Team: [email protected]

(71)

Contact your IBM seller, IBM Business Partner, or Moonwalk for

• More information,

• Demonstrations, or

• Proof-of-Concept possibilities with the

(72)


(73)

• NetApp® is a registered trademark of the NetApp® company.

• Isilon® is a registered trademark owned by Dell Inc. or its subsidiaries.

• Windows® is a registered trademark of Microsoft.

(74)
