Advanced Technology Group – Storage experts cover a variety of technical topics.
Audience: Clients who have or are considering acquiring IBM Storage solutions. Business Partners and IBMers are also welcome.
To automatically receive announcements of upcoming Accelerate with IBM Storage webinars, Clients, Business Partners and IBMers are welcome to send an email request to [email protected].
Accelerate with IBM Storage Support Site: https://www.ibm.com/support/pages/node/1125513
Accelerate with IBM Storage Technical Webinar Series
2021 Upcoming Webinars:
February 11 - TS7700 Best Practices
Register Here: https://ibm.biz/Bdfxei
February 18 - IBM Cloud Object Storage Level 201 Plus
Register Here: https://ibm.biz/BdfFD9
February 25 - DS8900F Management GUI Demo
Register Here: https://ibm.biz/BdfFbY
Please take a moment to share your feedback with our team!
You can access this 5-question survey via Menti.com with code 15 75 27 5 or the direct link https://www.menti.com/mkg7a2x6q8
Agenda
• User Interface and User Experience Improvements
• Native deployment on OpenShift
• Backup and Restore
• Importing tags from external sources
• Data Management Policies, beyond Scale ILM/HSM
• AFM data movement
• Moonwalk data movement
1. Modernized look
2. Usability improvements
3. Navigation reorganization
Modernized Look and Usability Improvements:
- Login screen
  - Show password toggle
  - Stores the last user’s name for quicker login
- Dashboard
  - Single page load
  - Performance increase
  - Transition indication (loading state)
- About section
  - Displays current version
  - Link to Help (Knowledge Center)
- Tables
  - Consistency across all tables
- Notification
  - Larger notification
- Change password
  - User summary available
  - Change password available throughout the UI
- Our users need a way to reach across data, to find data that they need
- Conducts user’s search efforts through an intuitive workflow
- Leveraging complex queries without the need to know a querying language
A search bar that builds queries in real time as the user clicks to add items to the query. This bar is visible throughout the user’s journey, and
*New*
Spectrum Discover GUI Navigation
• Better organization and visually pleasing
- New hamburger menu
- Able to expand and collapse sections
- Role based view
- No longer takes a portion of the UI
*New*
Query Builder in Spectrum Discover 2.0.4
Press “Add” to enter the Tag Value Chart for each tag desired
Query summary
• A summary of the user’s tag value selection count.
Potential record count
• Displays an estimate of the total number of records for the current tag value selection.
• Allows users to fine-tune or broaden their query selection without having to perform a full search query.
• Especially useful on potentially long-running queries.
• Record counts estimated below 1000
Query Builder in Spectrum Discover 2.0.4: Tag Value Chart
Query Builder in Spectrum Discover 2.0.4 Continued…
→ Selected options are now in the search bar
Can edit or clear tags already adjusted
Select desired options and
Query Builder in Spectrum Discover 2.0.4 Continued…
Equivalent to cancel if “Update query” was not pressed.
→ Selected options are now in the search bar
Can edit or clear tags already adjusted
Add more tags to the Query Builder
Query Builder in Spectrum Discover 2.0.4 Continued…
NOTE: “Select All” will put a check in each option. However, this would not be added to the search string since
Query Builder in Spectrum Discover 2.0.4 Continued…
Alphanumeric Tag Value Selector
• Owner values can be filtered using the alphanumerical filter bar or the search textbox
• Two views available
Alphanumerical view
→ The search string for the query built
Copy the search string into the cut buffer
When ready, press “View results” to see the results of the search
Query Builder in Spectrum Discover 2.0.4: SQL Query
Copy the search string into the cut buffer
Display previous searches
Similar to 2.0.3 search textbox. SQL WHERE query can be entered into the large text-area.
Query grouping is available via the ‘Group by’ dropdown menu.
Query limits can also be specified using the ‘Limit results’ textbox.
Custom SQL ordering can be applied using the ‘Sort by’ textbox.
Expected syntax is ‘column1 desc, column2 asc…’
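As a small illustration of the ‘Sort by’ syntax, the following sketch joins (column, direction) pairs into the expected comma-separated form. The helper name and the column names `size` and `filename` are hypothetical, chosen only for the example; they are not taken from the Discover schema.

```python
def build_sort_by(columns):
    """Join (column, direction) pairs into 'col1 desc, col2 asc' form."""
    parts = []
    for name, direction in columns:
        direction = direction.lower()
        if direction not in ("asc", "desc"):
            raise ValueError(f"unsupported sort direction: {direction!r}")
        parts.append(f"{name} {direction}")
    return ", ".join(parts)


# Hypothetical column names, used only to show the shape of the string.
sort_by = build_sort_by([("size", "DESC"), ("filename", "asc")])
print(sort_by)
```

The resulting string is what would be typed into the ‘Sort by’ textbox.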
Query Builder in Spectrum Discover 2.0.4 – Condensed Results
Just like before, the initial results of the search are displayed condensed
Query Builder in Spectrum Discover 2.0.4 – Individual Results
The “Individual” button in the “Results view” section will show the individual records for all items in the search results
Query Builder in Spectrum Discover 2.0.4 – View Individual Records
Selecting a line, then pressing “View individual records” only displays individual records of selected line(s)
Query Builder in Spectrum Discover 2.0.4: Table Columns
Table columns displayed are now under the gear box
Tap the gear box again to make the list disappear
Spectrum Discover 2.0.4: Account Settings
• The Account Settings now make it easy
  • To find the User’s role
  • To find the User’s domain
  • To find the User’s e-mail
• Change password
Now Red Hat OpenShift applications can easily catalog and index data, bringing additional optimization to AI workflows and faster data classification
• Easier to deploy in a multicloud configuration
• Up to 50% less memory resources used, lowering costs vs. current VM deployment
IBM Spectrum Scale + IBM Spectrum Discover + Red Hat OpenShift all integrated
Benefits of OpenShift deployment
• For customers:
  - No requirement for VMware
  - Scalability
  - Control of platform (e.g., OS updates)
• IBM:
  - Platform supported by the customer (OS, CRI-O, RHOS)
What’s New?
✓ Network Layer / Service Mesh
✓ Different Db2 implementation (Db2 common container vs Db2u)
✓ Everything is containerized: no external scripts (backup/restore, maintenance mode)
❖ Not necessarily file-system access outside of containers
➢ “podman” vs. “docker”
➢ “oc” vs “kubectl”
• Removed Spectrum Scale
  • Overkill for a single node OVA deployment
  • Issues with Spectrum Protect for Virtual Environments
• Preferred backup and restore on OVA
  • VMware Snapshot
• Secondary backup / restore method on OVA
  • Previous method of backing up the database.
• Need to support running in a purely containerized environment
Preferred Backup and Restore on OVA
• VMware Backup
  • Set Spectrum Discover to maintenance mode
    • kubectl patch SpectrumDiscover spectrumdiscover --type merge -p '{"spec":{"maintenance":"on"}}'
  • Suspend Db2wh container
    • docker exec -it Db2wh write-suspend
    • docker exec -it Db2wh stop
  • Snapshot
    • Right Click on the VM
    • Snapshots - Uncheck the “Snapshot the virtual machine’s memory” option
• VMware Restore
  • Restore
    • Right Click VM
    • Snapshots
    • Manage Snapshots
    • Choose your snapshot
    • Revert To
  • Bring Spectrum Discover out of maintenance mode
    • kubectl patch SpectrumDiscover spectrumdiscover --type merge -p '{"spec":{"maintenance":"off"}}'
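The two maintenance-mode commands differ only in the value carried by the JSON merge patch. A minimal sketch of generating both command lines from one helper is shown below; the helper name is hypothetical, and the script only prints the commands rather than running them, so it makes no assumption about the cluster it would be used against.

```python
import json
import shlex


def maintenance_patch_command(state):
    """Build the kubectl argument list that switches maintenance mode."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    # Compact separators reproduce the payload exactly as shown on the slide.
    patch = json.dumps({"spec": {"maintenance": state}}, separators=(",", ":"))
    return ["kubectl", "patch", "SpectrumDiscover", "spectrumdiscover",
            "--type", "merge", "-p", patch]


for state in ("on", "off"):
    # shlex.join adds the shell quoting needed around the JSON payload.
    print(shlex.join(maintenance_patch_command(state)))
```

Generating the payload with `json.dumps` avoids the mismatched-quote problems that are easy to introduce when the patch is typed or pasted by hand.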
• Like secondary OVA option except it’s run within a pod
• oc exec -it spectrum-discover-backup-restore-74dcf47c9c-9228n -- bash
• python3 initialSetup.py
• python3 backup.py
• python3 restore.py -r "2020-11-17"
• Runs ansible under the covers for all the Kubernetes interactions
• Additional log: /opt/ibm/metaocean/backup-restore/logs/ansible.log
Importing tag information from external analytics jobs.
Import Tags Application
What is it?
• Allows a Data Administrator to import a set of externally-curated tag values into Spectrum Discover
Motivations for this function
• Supports Data Accelerator for AI and Analytics (DAAA)
• An external analytics job might generate tag information for a set of S3/COS objects.
• This information can be utilized in an Import Tags policy to merge these tags into Spectrum Discover, extending its records with new information.
• This new information can then be used as part of a larger AI/Analytics pipeline
• Other potential use cases for this function might include:
• Extending Discover records with Scale user-defined attributes or S3 x-amz-meta tag information
Import Tags Application: Requirements and Restrictions
• Import Tags policies created via the IBM Spectrum Discover REST API
• Limited support in the GUI
• Import Tags policies managed by Data Administrator user
• This is initially to support DAAA use case, managed by Data Administrator
• Only a single Import Tags policy can run at a time
• Identify the COS or S3 data connection name of the data source connection
• Identify the vault name (COS) or bucket name (S3) of the data source connection
• Scan the S3/COS data source that is associated with the external tags
  • Creates data source records in the Discover database
• Copy external tag comma-separated-value (CSV) file onto the vault/bucket so that Discover can access it
  • This file can be copied before or after the scan
• Define tags in Discover
Import Tag File Format & Content Requirements
Format: comma-separated values (CSV)
Content:
• First row in the file is a header row.
  • The value in the 1st column in this row is ignored by Discover
  • The 2nd through Nth columns in this row must be existing Discover tag names (defined before running the policy)
• The first column is the object full path name prefixed with the bucket (S3) or vault (COS) name and ‘/’
  • For example, if the vault (COS) name is bucket01, and object name is car1/image1.png, then the first column entry is bucket01/car1/image1.png.
• Subsequent columns in the CSV file contain the tag values to be imported into Spectrum Discover for the associated object records
Example CSV file:
objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
bucket01/car1/image1.png,1,3,0,1,1,0,1
bucket01/car2/image1.png,1,6,0,0,0,0,12
bucket01/car2/image2.png,1,3,0,2,1,0,1
bucket01/car3/image1.png,1,3,0,2,1,0,2
Given the preceding file, the following tags should be defined in Discover:
bus, tree, stop_sign, red_light, yellow_light, green_light, pedestrian
and Discover should have records for each of these objects in the datasource “bucket01”.
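The file rules above lend themselves to a quick check before the CSV is copied to the vault or bucket. The following is a minimal sketch, not part of Spectrum Discover: the function name is hypothetical, the set of defined tags is supplied by the caller, and the requirement that every row has as many columns as the header is an assumption added for the example.

```python
import csv
import io

SAMPLE = """objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
bucket01/car1/image1.png,1,3,0,1,1,0,1
bucket01/car2/image1.png,1,6,0,0,0,0,12
"""

DEFINED_TAGS = {"bus", "tree", "stop_sign", "red_light",
                "yellow_light", "green_light", "pedestrian"}


def check_tag_file(text, bucket, defined_tags):
    """Return a list of problems found in an Import Tags CSV (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header, data = rows[0], rows[1:]
    # The first header cell is ignored; the rest must be existing tag names.
    for tag in header[1:]:
        if tag not in defined_tags:
            problems.append(f"undefined tag in header: {tag}")
    for lineno, row in enumerate(data, start=2):
        if not row:
            continue  # skip blank lines
        # The first column must be the object path prefixed with bucket/vault.
        if not row[0].startswith(bucket + "/"):
            problems.append(f"line {lineno}: path not prefixed with {bucket}/")
        # Assumption for this sketch: one value per header column.
        if len(row) != len(header):
            problems.append(
                f"line {lineno}: expected {len(header)} columns, got {len(row)}")
    return problems


print(check_tag_file(SAMPLE, "bucket01", DEFINED_TAGS))
```

An empty list means the file matches the stated rules; each string otherwise names one violation.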
Defining an Import Tags Policy
Policies are defined in JSON format, such as in the following example:
{
  "pol_id": "import_bucket01",
  "action_id": "IMPORT_TAGS",
  "action_params": {
    "agent": "ImportTags",
    "source_connection": "cos_bucket01",
    "tag_file_path": "bucket01/A2D2_labels.csv",
    "tag_file_type": "csv"
  },
  "schedule": "NOW",
  "pol_state": "active",
  "pol_filter": "datasource='bucket01'"
}
• pol_id: user-defined policy name
• source_connection: connection name
• tag_file_path: Import Tag File; must reside on the prefixed vault or bucket
• pol_filter: user-defined filter
• Vault name (COS) or bucket name (S3): Discover stores this value in the datasource tag
NOTE:
• JSON specified on system making the REST API call
In Discover v2.0.4, Import Tags policies must be created and executed using the Discover REST API.
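Because the policy is plain JSON prepared on the calling system, it can be generated rather than typed by hand. The sketch below rebuilds the example policy from its variable parts; the function name and its parameters are hypothetical, and the field names and fixed values are copied from the example policy, not from a schema reference.

```python
import json


def import_tags_policy(pol_id, connection, bucket, tag_file_name):
    """Assemble an Import Tags policy shaped like the example policy."""
    return {
        "pol_id": pol_id,
        "action_id": "IMPORT_TAGS",
        "action_params": {
            "agent": "ImportTags",
            "source_connection": connection,
            # The tag file path is prefixed with the vault or bucket name.
            "tag_file_path": f"{bucket}/{tag_file_name}",
            "tag_file_type": "csv",
        },
        "schedule": "NOW",
        "pol_state": "active",
        "pol_filter": f"datasource='{bucket}'",
    }


policy = import_tags_policy("import_bucket01", "cos_bucket01",
                            "bucket01", "A2D2_labels.csv")
body = json.dumps(policy, indent=2)
print(body)
```

Serializing with `json.dumps` guarantees straight ASCII quotes in the payload, which matters because typographic quotes pasted from a slide produce invalid JSON.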
Curl example (TOKEN obtained via Discover REST API, SDHOST is IP or FQDN of Discover system):
$ curl -k -H "Authorization: Bearer ${TOKEN}" \
  https://${SDHOST}/policyengine/v1/policies \
  -d @importTags.json -H "Content-type:application/json" -X POST
Successful response (http code 201):
Managing a running Import Tags policy cont.
Import Tags policies can be paused/resumed/stopped/deleted via the IBM Spectrum Discover GUI or REST API calls
For more information, visit the IBM Spectrum Discover documentation in IBM Knowledge Center
https://www.ibm.com/support/knowledgecenter/SSY8AC
Data movement with Discover: Moving data between Scale file system and cloud with AFM
User creates data management policy
• Specify data policy type: COPY, MOVE, or TIER
(determined by applications registered)
• Define filter to identify files or objects to be managed
• Identify Application to be used for data management
• Specify source connection, destination, etc.
Policy execution
• Spectrum Discover policy engine generates job request message(s)
• Job request messages placed on Kafka egress topic
• Data movement Application reads from topic
• Application performs the required data movement operation
[Diagram: the Policy Engine in IBM Spectrum Discover (VMware, KVM, or OCP) places Job Request Msgs for a Data Management Policy on the Egress Topic; the Data Mover Application (CES/NFS, SMB, NFS, S3) returns Job Response Msgs on the Ingress Topic]
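The request/response flow described above can be pictured with an in-memory stand-in. This is a sketch only: two Python queues play the role of the Kafka egress and ingress topics, the message fields (`policy`, `action`, `path`, `status`) are hypothetical and not the real Discover message format, and one job request per matching record is an assumption made for the example.

```python
import queue

egress = queue.Queue()   # stands in for the Kafka egress (job request) topic
ingress = queue.Queue()  # stands in for the Kafka ingress (job response) topic


def policy_engine_submit(policy, matching_paths):
    """Place one job request per matching record on the egress topic."""
    for path in matching_paths:
        egress.put({"policy": policy["name"],
                    "action": policy["type"],
                    "path": path})


def data_mover_run(move):
    """Read job requests, perform the movement, and report each result."""
    while not egress.empty():
        job = egress.get()
        try:
            move(job["action"], job["path"])
            status = "success"
        except OSError as exc:
            status = f"failed: {exc}"
        ingress.put({"policy": job["policy"],
                     "path": job["path"],
                     "status": status})


policy = {"name": "copy_example", "type": "COPY"}
policy_engine_submit(policy, ["bucket01/car1/image1.png",
                              "bucket01/car2/image1.png"])

moved = []  # records what the stand-in data mover was asked to do
data_mover_run(lambda action, path: moved.append((action, path)))

responses = []
while not ingress.empty():
    responses.append(ingress.get())
for response in responses:
    print(response)
```

The point of the decoupling is that the policy engine never calls the data mover directly; it only writes requests and reads responses, so any registered application that speaks the topic protocol can do the actual movement.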
Application Registration
• Data mover applications must be registered with Spectrum Discover
• Spectrum Discover application registration is a mechanism by which a specific application shares its definition with Spectrum Discover
• Once the application is registered, the Spectrum Discover policy engine creates application-specific Kafka topics (work and completion), which are used to interface with the application. After the application is successfully registered with Spectrum Discover, it will be available for Policy creation
• The response to a successful registration message contains the Kafka Broker IP, Port and the names of the Kafka work and completion queues/topics created for the particular application
Spectrum Discover and Data Movement
Spectrum Discover metadata and policies deliver more effective data movement
[Diagram: IBM Spectrum Discover Data Management driving data movement applications (Spectrum Scale ILM, HSM, AFM-S3, and the third-party data mover Moonwalk) across Spectrum Scale, IBM COS, Red Hat Ceph, Spectrum Archive tape, and third-party storage systems over NFS, SMB, and S3]
Spectrum Discover has a new built-in (data mover) application ‘ScaleAFM’ which supports
• COPY operation with
  • Source: IBM COS or S3 connection
  • Destination: Spectrum Scale connection
• Scan source connection before creating data management policy to update Discover DBMS for policy execution
• Enabling live events for the Spectrum Scale connection aids in keeping the Spectrum Discover DBMS updated
Prerequisites
• Spectrum Scale cluster at version 5.1 or later
• Spectrum Scale fileset with AFM
Discover Driving Scale/AFM Data Movement Steps
1. In Spectrum Scale, create a new AFM/S3 fileset
   Prerequisites
   • Spectrum Scale v5.1+
   • Fileset created with new AFM/S3 options
   • Caching policy is set appropriately (all options supported; AFM function will handle copying changes back/forth if needed)
   • Enable cache eviction settings for Fileset
2. In Spectrum Discover, create a Data Connection to Scale and scan it
3. In Spectrum Discover, create a Data Connection to the IBM COS Bucket and scan it
4. In Spectrum Discover, import tags for the dataset (optional)
   • Discover updates existing data source connection from tag file
5. In Spectrum Discover, create and run a Data Mover Policy
6. In Discover, check for policy completion
[Diagram, repeated on each step: servers with CPUs & GPUs and shared NVMe storage; IBM Spectrum Scale and IBM Cloud Object Storage linked by an AFM relationship, with the Scale connection and COS data connection labeled]
Discover Driving Scale/AFM Data Movement – More Information, Details
• Refer to Scale version 5.1 documentation:
  https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/ibmspectrumscale510_welcome.html
  • See “Planning for AFM to cloud object storage” in Scale Concepts, Planning, and Installation Guide, and “Configuring AFM to cloud object storage” in Scale Administration Guide
• Once Scale and AFM are configured, refer to Discover version 2.0.4 documentation:
  https://www.ibm.com/support/knowledgecenter/SSY8AC_2.0.4/isd204_welcome.html
© Copyright IBM Corporation 2021
Moonwalk
Moonwalk is the first third-party data mover certified with Spectrum Discover
• Extends ability to move data between third-party storage and IBM storage
• Also for data movement between IBM storage systems if Spectrum Scale ILM data movement is not suitable
• Supports data movement into IBM Spectrum Scale* or IBM COS
  • Support for Spectrum Scale via NFS today
  • Native Spectrum Scale support available Q3/4
• Moonwalk is NOT included with Spectrum Discover
  • Operates as a plug-in from Discover perspective
  • Moonwalk purchased separately
Moonwalk value
• Ability to move data across diverse storage systems, object stores and cloud endpoints
• Policy engine scans the storage system and can take actions based on basic metadata attributes such as last access timestamp, size, file type, owner etc.
• REST API for extension/integration
• Massively scalable with no middleware
• Stateless architecture with no imposed limits on file/object count or capacity
Spectrum Discover value
• Metadata catalog for smarter decision making on what data to move
• Ability to select datasets for movement based on system and user-defined, custom metadata as well as content
• Spectrum Discover has tighter integration with Spectrum Scale, IBM COS, and Red Hat Ceph and is able to build indexes based on live events without having to scan the
Architecture: Spectrum Discover with Moonwalk
[Diagram: IBM Spectrum Discover (Reporting Dashboard, Search, Move/Copy/Tier Policy Engine) connects over the control path to Data Movers; the data path runs between data sources such as IBM Cloud Object Storage and IBM Spectrum Scale via native API, SMB, and NFS]
• The capabilities of Spectrum Discover data movement can be used to transfer data between data sources, based on a wide range of system and user-defined metadata
• Note: COPY & MOVE take care of object-to-file conversion and vice versa
• The currently supported data movement operations with Spectrum Discover and Moonwalk are the following:
Operation   Source Type               Destination Type
MOVE        NFS, S3, COS, SMB/CIFS    NFS, S3, COS, SMB/CIFS
COPY        NFS, S3, COS, SMB/CIFS    NFS, S3, COS, SMB/CIFS
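The table can be read as a simple lookup: an operation is allowed when both endpoint types appear in its row. A minimal sketch of that reading follows; the function and constant names are hypothetical, and the data mirrors only the two rows of the table, so anything not listed there is reported as unsupported.

```python
ENDPOINT_TYPES = frozenset({"NFS", "S3", "COS", "SMB/CIFS"})

# One entry per table row: operation -> (source types, destination types).
SUPPORTED_OPERATIONS = {
    "MOVE": (ENDPOINT_TYPES, ENDPOINT_TYPES),
    "COPY": (ENDPOINT_TYPES, ENDPOINT_TYPES),
}


def is_supported(operation, source_type, destination_type):
    """Return True if the table lists this operation for both endpoint types."""
    row = SUPPORTED_OPERATIONS.get(operation)
    if row is None:
        return False
    sources, destinations = row
    return source_type in sources and destination_type in destinations


print(is_supported("COPY", "SMB/CIFS", "NFS"))
print(is_supported("MOVE", "NFS", "FTP"))
```

Keeping source and destination sets separate per row means the lookup stays correct if a later release supports an endpoint type in only one direction.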
Moonwalk: System Overview and Components
• Moonwalk drivers or gateways perform data operations as directed by defined policies
• Data operations include tiering, move, copy and recall, as well as a range of operations to assist disaster recovery
• Data is streamed directly between endpoints and storage without any intermediary staging on disk
• When installed in a Gateway configuration, Moonwalk software may function as a plugin container which provides extended support to enable access to third-party protocols and special devices. Device specific configuration details (such as sensitive encryption keys and authentication details) are contained and isolated from the source file server platforms and/or
The Moonwalk application registers with Spectrum Discover via Moonwalk Admin Center:
1. COPY from SMB (or NFS) to NFS. This can be leveraged to show how customers can move from "other" vendor storage on to IBM Spectrum Scale (with Scale file system as the storage for target NFS server connection).
   Metadata: Product code, project identifier to accommodate restructuring
   Options: Preservation of original namespace hierarchy in target file system
2. MOVE from SMB (or NFS) to COS. Similar story: how to move data from other storage systems onto COS.
   Metadata: Searching for concerning information, content-search for “trigger” strings, if-then-else logic
   Options: Preservation of original namespace hierarchy in COS
3. TIER from NetApp to S3. Identify inactive (cool > cold) datasets on expensive, high performance NetApp filers and tier (archive) content to S3 or S3-compatible cloud storage.
   Metadata: Searching for inactive files, obsolete (but compliance-related) files, etc., using metadata as well as content
Getting Started
Further information & evaluation software: ibm.moonwalkinc.com/spectrum-discover
Email the Moonwalk IBM Team: [email protected]
Contact your IBM seller, IBM Business Partner, or Moonwalk for
• More information,
• Demonstrations, or
• Proof-of-Concept possibilities with the
• NetApp® is a registered trademark of the NetApp® company.
• Isilon® is a registered trademark owned by Dell Inc. or its subsidiaries.
• Windows® is a registered trademark of Microsoft.