
HP RA for SAS® Visual Analytics on HP ProLiant BL460c Gen8 Servers running Linux

Performance results with a concurrent workload of 5 light users and 1 heavy user accessing a 112GB dataset

Table of contents

Executive summary

Introduction

Overview

Solution components

Capacity and sizing

Workload description

Workload data/results

Analysis and recommendations

Configuration guidance

Bill of materials

Summary

Implementing a proof-of-concept

Executive summary

We live and work in a new era of extreme business speed with heightened customer, partner, and employee expectations. To better compete and grow, businesses demand more innovation, speed, and flexibility from their data centers.

Additionally, SAS Visual Analytics™ (VA) customers require a hardware/software configuration that can deliver data analysis results quickly and accurately. To meet that need, enterprise customers demand reliable and fast servers that can scale to meet their business analytics requirements. This paper details a reference architecture that provides the performance SAS VA requires on a cluster of servers in a Massively Parallel Processing (MPP) environment.

This paper highlights the key findings from running SAS VA using the Visual Analytics workload suite running on eight HP ProLiant BL460c Gen8 servers.

SAS ran the tests and has certified that the test suite was run correctly, and that the performance results meet the needs of a typical user.

Target audience: The target audience for this Reference Architecture is the IT community studying solutions for their environments. Business users and IT professionals who are interested in implementing a SAS VA solution may find this paper useful for a sample SAS configuration and a demonstration of the HP ProLiant server’s scalability.

Document purpose: The purpose of this document is to describe a recommended architecture, highlighting recognizable benefits to technical audiences.

This white paper describes testing performed in August through October 2013.

Introduction

SAS Visual Analytics allows your organization to explore all relevant data quickly and easily. You can look at more options, uncover hidden opportunities, identify key relationships and make more precise decisions faster than ever before.

One of the keys to being able to successfully deploy SAS Visual Analytics is having servers that can economically provide the robust performance required to enable SAS VA. The BL460c Gen8 server is just such a server. With an ideal balance of performance and scalability, this server blade offers a simpler way to manage a data center. It is engineered with enhanced memory and storage capacities and the next-generation HP Integrated Lights-Out (iLO) Management Engine. With features that give it improved flexibility and simplified management, the HP ProLiant BL460c Gen8 Server Blade is an ideal choice for data center computing.

This SAS Visual Analytics Explorer reference architecture is intended to provide an understanding of the expected performance and system resources required to support ad hoc data exploration and reporting.

Overview

The HP ProLiant BL460c Gen8 Server Blade is a dual-socket server blade engineered for unprecedented performance, enhanced flexibility and simplified management, which makes it the standard for data center computing. It packs in more performance with a 33% increase in memory DIMM count and Intel® Xeon® E5-2600 v2 Processors. It is also more flexible with the HP FlexibleLOM, which provides the ability to customize server networking today and to meet future needs without overhauling server hardware.

SAS VA is delivered in two formats. The first is a single-server implementation, also referred to as SMP. The second, which is the topic of this paper, is massively parallel processing, also referred to as MPP. The HP factory pre-installs both the operating system software and SAS Visual Analytics, and integrates hardware and software, accelerating the time to deployment.

The HP ProLiant BL460c Gen8 server is a SAS VA approved configuration offering the type of performance required to successfully enable SAS VA at an economical price point.

SAS VA MPP has very prescriptive configuration rules.

Beginning with SAS VA 6.1, the minimum number of cores per processor is eight. Additionally, with the introduction of the Intel Xeon E5 v2 processors, 10-core and 12-core processors are now available, meaning that a total of twenty or twenty-four cores per server is now available and is approved by SAS. Each of the cores must have a clock speed of at least 2.6GHz.

Memory must be at least 16GB per core. As a result, if using two eight-core Intel Xeon E5 processors, the minimum memory is 256GB per server. With 10-core processors, the minimum is 320GB, and with 12-core processors, it is 384GB. Additionally, memory must run at a minimum of 1600MHz. Faster memory is approved and recommended because it will improve performance. Providing 256GB of memory per server requires 16GB RDIMMs. Going above 256GB of memory requires 32GB LRDIMMs running at 1866MHz, which are only available when specifying E5-2600 v2 processors. 32GB LRDIMMs are available for the original E5-2600 processors (used for this testing and specified in the bill of materials), but they are not approved because they do not meet the minimum memory speed of 1600MHz.
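The per-server rules above reduce to simple arithmetic, which the following minimal Python sketch illustrates. It is not an HP or SAS tool: the names `ServerSpec`, `min_memory_gb` and `violations` are hypothetical, and only the thresholds (8 cores per processor, 2.6GHz, 16GB per core, 1600MHz) come from the text.

```python
# Illustrative check of the per-server rules stated above.
# ServerSpec, min_memory_gb and violations are hypothetical names.
from dataclasses import dataclass

GB_PER_CORE = 16          # minimum memory per core (from the text)
MIN_CORES_PER_CPU = 8     # minimum cores per processor
MIN_CLOCK_GHZ = 2.6       # minimum core clock speed
MIN_MEMORY_MHZ = 1600     # minimum memory speed


@dataclass
class ServerSpec:
    sockets: int
    cores_per_cpu: int
    clock_ghz: float
    memory_gb: int
    memory_mhz: int


def min_memory_gb(sockets: int, cores_per_cpu: int) -> int:
    """Minimum memory for one server: 16GB for every core."""
    return sockets * cores_per_cpu * GB_PER_CORE


def violations(spec: ServerSpec) -> list:
    """Return the rules a server breaks; an empty list means it passes."""
    problems = []
    if spec.cores_per_cpu < MIN_CORES_PER_CPU:
        problems.append("fewer than 8 cores per processor")
    if spec.clock_ghz < MIN_CLOCK_GHZ:
        problems.append("clock speed below 2.6GHz")
    if spec.memory_gb < min_memory_gb(spec.sockets, spec.cores_per_cpu):
        problems.append("less than 16GB of memory per core")
    if spec.memory_mhz < MIN_MEMORY_MHZ:
        problems.append("memory slower than 1600MHz")
    return problems


# Dual-socket servers with 8-, 10- and 12-core processors.
for cores in (8, 10, 12):
    print(cores, "cores per CPU ->", min_memory_gb(2, cores), "GB minimum")
```

For dual-socket servers this reproduces the 256GB, 320GB and 384GB minimums quoted above.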

The part number for the BL460c Gen8 server with E5-2600 v2 processors differs from the part number specified in the bill of materials, which is for BL460c Gen8 servers with the original E5-2600 processors.

Figure 1. Reference Architecture with eight HP ProLiant BL460c Gen8 servers

[Figure: an HP BladeSystem c7000 enclosure holding eight HP ProLiant BL460c Gen8 blades, labeled as one head node and data nodes.]

Solution components

The hardware and operating system software used to achieve the results were:

• HP BladeSystem c7000 Platinum Enclosure

• 8 x BL460c Gen8 Server Blades, each having:

– 2 Intel Xeon E5-2680 8-core 2.7GHz processors*

– 256GB of memory (16 x 16GB RDIMMs)

– 1 x dual-port FlexibleLOM 10GbE Ethernet interface, of which two ports were used on the head node and one port was used on the data nodes

– 2 x 600GB 10K RPM internal SAS (Serial Attached SCSI) disk drives

– Red Hat® Enterprise Linux 6.2

* Note: The systems that were tested included the E5-2680 8-core processors running at 2.7GHz. The minimum specification for SAS VA requires E5-2670 8-core processors running at 2.6GHz, which is what is reflected in the bill of materials.

The software components used to achieve the results were:

• SAS™ 9.4

• SAS Visual Analytics Server™ 6.2

• SAS LASR Analytic Server™

• SAS Metadata Server™

• SAS Visual Analytics Middle Tier™

Capacity and sizing

SAS VA is an in-memory analytics engine. This means that the data to be analyzed must fit in memory during the analysis. With 256GB of memory per blade and 7 data nodes, the total size of the data to be analyzed cannot exceed 896GB, which is half of the 1,792GB (7 x 256GB) of aggregate data-node memory. Multiple concurrent datasets may make up the in-memory footprint. Datasets that have been loaded into memory may be unloaded to make room for new datasets. The restriction is on the total size of all datasets simultaneously in memory at any given time.

The configuration rules are relatively stringent. If a customer requires concurrently loaded data models in excess of this limitation, they will need to add servers to this configuration. Likewise, if a customer's concurrent datasets require less memory, the number of server blades may be reduced to a minimum supported configuration of 3 data nodes plus 1 head node, for a total of 4 nodes. If even less memory is required, customers may opt for a single Symmetric Multiprocessing node.
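As a rough illustration of the memory limit described above, the sketch below estimates how many data nodes a given in-memory footprint would need. It assumes the 896GB limit for 7 data nodes scales linearly, which works out to 128GB of data per 256GB data node. The paper does not state that rule explicitly, and the function names are hypothetical. Treat this as arithmetic on the stated limit, not as a sizing guideline.

```python
import math

# Derived from the figures above: 896GB across 7 data nodes.
# Linear scaling per node is an assumption, not a documented rule.
USABLE_GB_PER_DATA_NODE = 896 / 7   # 128.0
MIN_DATA_NODES = 3                  # minimum supported MPP configuration
HEAD_NODES = 1


def data_nodes_needed(footprint_gb: float) -> int:
    """Data nodes needed to hold all concurrently loaded datasets."""
    by_memory = math.ceil(footprint_gb / USABLE_GB_PER_DATA_NODE)
    return max(MIN_DATA_NODES, by_memory)


def total_nodes(footprint_gb: float) -> int:
    """Data nodes plus the single head node."""
    return data_nodes_needed(footprint_gb) + HEAD_NODES


for footprint in (112, 896, 1000):
    print(f"{footprint}GB in memory -> {total_nodes(footprint)} nodes in total")
```

Under these assumptions, the 112GB test dataset would fit the minimum 4-node configuration, while 896GB needs all 8 nodes tested here.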

Important

This test scenario can provide a benchmark for comparing hardware and/or software products; it is not intended to be used as a sizing guideline. In the real world, server performance is highly dependent upon the application design and workload profiling. Further, the HP Converged Infrastructure for SAS High-Performance Visual Analytics (HP CI for SAS H-PVA) allows for multiple choices on many of the components, so that while tests and specifications focus on the baseline or minimum configuration requirements, customer requirements may dictate faster processors or processors with more cores and/or additional memory. Disk drives will then need to be adjusted to be larger, in a similar fashion.

As with any laboratory testing, the performance metrics quoted in this paper are idealized. In a production environment, these metrics may be impacted by a variety of factors.

As a matter of best practice for all deployments, HP recommends implementing a proof-of-concept using a test environment that matches as closely as possible the planned production environment. In this way, appropriate performance and scalability characterizations can be obtained. For help with a proof-of-concept, contact an HP Solution Design Services (SDS) representative at sastech@hp.com.

Workload description

The scenario is designed to generate a heavy load on the server. The goal is to demonstrate CPU usage characteristics and server response time to users’ ad-hoc analytical requests.

In this scenario, business analysts use SAS Visual Analytics Server to explore their company’s sales and operational data to quickly discover trends. The goal is to rapidly reveal opportunities that can improve revenue or operational efficiency. The scenario is designed to approximate the types of usage that would occur during a monthly, quarterly or annual reporting cycle with a mixture of users: ones needing summary reports or graphs and analytical users who need quick answers to questions posed by management.

The data for this test is made up of the following:

• The table has 417 million rows, 46 columns, and is 112GB.

• There are more than 6 years of daily detail at the product description level.

• The geography hierarchy includes region, state, city and facility.

• The product hierarchy includes brand, line, category, and description.

• The time hierarchy includes year, month, and date.

• The measures include revenue, expense (Capex, material, operational staffing), employee counts, profit, product quality, and unit capacity.

The concurrent user base is made up of 2 types of users, light and heavy. Light users typically perform less CPU intensive activities such as generating summarized reports or simple univariate statistics and graphs. Heavy users perform more advanced statistical analyses such as correlations.

For this scenario, 5 concurrent light users perform the following actions:

• Log on to SAS Visual Analytics Explorer.

• Select the data table, report type, and variables to include for their analysis. Report types include: bar charts, line charts, box plots, cross-tabulations, and heat maps.

• There is 5-15 seconds of think time between drag-and-drop actions.

• After displaying the report, there is 1-3 minutes of think time.

• The user creates a total of three reports following this process.

• After displaying reports, users log off and back on after random delays of 60-90 seconds.

For this scenario, 1 heavy user performs the following actions:

• Log on to SAS Visual Analytics Explorer.

• Select the data table for their analysis.

• Select 10 variables for a correlation analysis.

• There is 5-15 seconds of think time between drag-and-drop actions.

• After displaying the report, there is 1-3 minutes of think time.

• The user creates a total of three reports following this process.

• After displaying reports, users log off and back on after random delays of 60-90 seconds.
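To give a feel for how much of each session is scripted idle time, the sketch below adds up the think-time ranges listed above for one light-user session. The number of drag-and-drop actions per report is not given in the paper, so `ACTIONS_PER_REPORT` is a hypothetical value chosen only for illustration. Server response time is excluded.

```python
# Think-time budget for one light-user session, using the ranges listed above.
# ACTIONS_PER_REPORT is a hypothetical value; the paper does not state it.
ACTIONS_PER_REPORT = 4
REPORTS_PER_SESSION = 3

ACTION_THINK_S = (5, 15)     # seconds between drag-and-drop actions
REPORT_THINK_S = (60, 180)   # 1-3 minutes after a report is displayed
RELOGIN_DELAY_S = (60, 90)   # delay before logging back on


def session_think_time(bound: int) -> int:
    """Total scripted idle time in seconds; bound 0 = shortest, 1 = longest."""
    per_report = ACTIONS_PER_REPORT * ACTION_THINK_S[bound] + REPORT_THINK_S[bound]
    return REPORTS_PER_SESSION * per_report + RELOGIN_DELAY_S[bound]


shortest, longest = session_think_time(0), session_think_time(1)
print(f"idle time per session: {shortest}s to {longest}s")
```

With four actions per report, a session carries between 300 and 810 seconds of think time, so each simulated user issues requests only intermittently.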

Workload data/results

HP LoadRunner is used to drive a one hour, 6-user scenario. Users enter the processing queue at 10-second intervals, so all users are active within 1 minute. Users are engaged in report design and exploratory data analysis activities. As a user’s processing cycle completes, the session logs off and is replaced by a new user session, maintaining a full level of concurrency. The test runs for 60 minutes at full concurrency and ramps down over 1 minute.
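The ramp-up described above can be checked with a few lines of arithmetic. This sketch assumes the first user starts at time zero, which the paper does not state. The constant names are illustrative and are not LoadRunner settings.

```python
# Ramp-up and run-length arithmetic for the 6-user scenario described above.
# Assumes the first user enters at t = 0 (not stated in the paper).
USERS = 6
RAMP_INTERVAL_S = 10          # one new user every 10 seconds
STEADY_STATE_S = 60 * 60      # 60 minutes at full concurrency
RAMP_DOWN_S = 60              # 1 minute ramp-down

start_times = [i * RAMP_INTERVAL_S for i in range(USERS)]
full_concurrency_at = start_times[-1]
total_run_s = full_concurrency_at + STEADY_STATE_S + RAMP_DOWN_S

print("user start times (s):", start_times)
print("full concurrency reached at:", full_concurrency_at, "s")
print("total run length:", total_run_s, "s")
```

Under that assumption the last user starts 50 seconds in, consistent with all users being active within 1 minute.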

During the 60-minute scenario, the response times were as follows:

• Box plots, bar charts, line charts, cross-tabulations, and heat maps complete in 10 seconds or less on average.

• Correlations for 10 variables complete in 13 seconds or less on average.

Figure 2. CPU utilization during the test run (y-axis: CPU utilization, 0 to 100; x-axis: test duration)

Analysis and recommendations

HP recommends the HP ProLiant BL460c Gen8 servers, as configured, for SAS Visual Analytics environments where customers' memory requirements dictate a multi-server environment.

Frequently asked questions.

Q: What if I’m looking to build an environment that is half the size of this RA? Could I use any of the data presented here?

A: Yes. You would simply reduce the number of data nodes so that the memory footprint meets the concurrent model memory requirement.

Q: What if I’m looking to build an environment that is twice the size of this RA? Could I use any of the data presented here?

A: Yes. You may add data nodes to scale up the memory footprint to match your concurrent models' memory requirements.

Q: What if my workload is slightly different? How would I leverage this document for my purposes?

A: This test scenario can provide a benchmark for comparing hardware and/or software products; it is not intended to be used as a sizing guideline. In the real world, server performance is highly dependent upon the application design and workload profiling.

As with any laboratory testing, the performance metrics quoted in this paper are idealized. In a production environment, these metrics may be impacted by a variety of factors.

As a matter of best practice for all deployments, HP recommends implementing a proof-of-concept using a test environment that matches as closely as possible the planned production environment. In this way, appropriate performance and scalability characterizations can be obtained. For help with a proof-of-concept, contact an HP Solution Design Services (SDS) representative at sastech@hp.com.

Configuration guidance

As mentioned before, the hardware and operating system software configuration for SAS VA is very prescriptive. Customers must purchase processors with a minimum of eight cores. Those processors must also have a minimum clock speed of 2.6GHz. Ten-core and twelve-core processors are approved as long as their clock speed meets the minimum requirement. Memory must be at least 16GB per core and run at a minimum speed of 1600MHz. All storage is internal to the server on Serial Attached SCSI (SAS) disk drives with at least 10K RPM speed. All nodes require two 600GB 10K RPM drives. It is

Two logical networks are required.

The first is between the head node and all of the data nodes. For configurations smaller than 16 server blades, this communication happens inside of the BladeSystem enclosure. As configurations grow, inter-enclosure communications may utilize DAC cables within the same rack. When configurations are sufficiently large that they require more than one rack, top-of-rack switches are needed to facilitate the communication.

The second network attachment is between the head node and the users. Typically this is accomplished using one 10GbE connection, but customers that have not deployed 10GbE may use four bonded 1GbE connections to provide a higher level of aggregate throughput. InfiniBand is also a viable alternative, although it is not listed in the bill of materials. Using InfiniBand requires a Network Solutions Architect to work with the HP CI for SAS H-PVA team to ensure a successful implementation.

Bill of materials

Note

Part numbers are at time of publication and subject to change. The bill of materials does not include complete support options or other rack and power requirements. If you have questions regarding ordering, please consult with your HP Reseller or HP Sales Representative for more details: hp.com/large/contact/enterprise/index.html

Table 1. Bill of materials

Qty Part Number Description

Rack Infrastructure

1 BW904A HP 642 1075mm Shock Intelligent Rack

1 BW904A#001 HP Factory Express Base Racking Service

1 HA454A1-000 HP Factory Express Solution Package 4 SVC

1 HC784S HP SAS Visual Analytics SW Inst SVC

1 BW932A HP 600mm Rack Stabilizer Kit

1 BW930A HP Air Flow Optimization Kit

1 BW906A HP 42U 1075mm Side Panel Kit

4 H5M60A HP 8.3kVA 208V 36out NA bPDU

Blade System Enclosure

1 681844-B21 HP BLc7000 CTO 3 IN LCD Plat Enclosure

1 HA454A1-003 HP Fctry Express Blade Svr Pkg 4 SVC

2 638526-B21 HP BLc VC Flex-10/10D Module Option

1 517521-B21 HP 6X 2400W Gold Ht Plg FIO Pwr Sply Kit

1 456204-B21 HP BLc7000 DDR2 Encl Mgmt Option

1 517520-B21 HP BLc 6X Active Cool 200 FIO Fan Opt

1 433718-B21 HP BLc7000 10K Rack Ship Brkt Opt Kit

1 677595-B21 HP BLc 1PH Intelligent Power Mod FIO Opt

BL460c Gen8 Servers – Head Node

1 641016-B21 HP BL460c Gen8 10Gb FLB CTO Blade

1 662064-L21 HP BL460c Gen8 E5-2670 FIO Kit**

1 662064-B21 HP BL460c Gen8 E5-2670 Kit**

16 672631-B21 HP 16GB 2Rx4 PC3-12800R-11 Kit

2 652583-B21 HP 600GB 6G SAS 10K 2.5in SC ENT HDD

1 684211-B21 HP Flex-10 10Gb 2P 530FLB FIO Adptr

1 339778-B21 HP RAID 1 Drive Setting

BL460c Gen8 Servers – Data Nodes

7 641016-B21 HP BL460c Gen8 10Gb FLB CTO Blade

7 662064-L21 HP BL460c Gen8 E5-2670 FIO Kit

7 662064-B21 HP BL460c Gen8 E5-2670 Kit

112 672631-B21 HP 16GB 2Rx4 PC3-12800R-11 Kit

14 652583-B21 HP 600GB 6G SAS 10K 2.5in SC ENT HDD

7 684211-B21 HP Flex-10 10Gb 2P 530FLB FIO Adptr

7 339778-B21 HP RAID 1 Drive Setting

Mandatory Software

8 BC318A RHEL Srv 2 Skt 1 Guest 9x5 1yr Lic*

Optional Software

1 C6N32A HP Insight Control Encl FIO Bundle 8 Lic

1 HA110A1 Opt 7FX c7000 Enclosure HW Supp*

8 HA110A1 Opt 7XE HP BL4xxc Svr Bld HW Support*

* 1 year support is the minimum to be quoted for Proof of Concept or Demo orders; 3 year support is the minimum for operational systems, particularly after the PoC/Demo phase – SKU H1K92A3 rather than HA110A1 for 3 year support is recommended in that case, as well as SKU BC333A rather than BC330A for RHEL 24x7 3 year License.

** Note: The systems that were tested included the E5-2680 8-core processors running at 2.7GHz. The minimum specification for SAS VA requires E5-2670 8-core processors running at 2.6GHz which is what is reflected here, in the bill of materials.

Summary

The world is moving more and more quickly with each passing day.

Businesses have been collecting more and more data. The ability to analyze this big data is key to being able to more effectively guide and run a business.

The more quickly and thoroughly the data is analyzed, the more timely the information and the better an organization is able to react to changes in its business climate.

SAS Visual Analytics is just the product to accelerate that analysis, but it requires fast, reliable and economical computers to enable that analysis. The BL460c Gen8 servers are a good choice to meet this need for customers. Properly configured, these servers allow analysis of larger data sets and allow for larger numbers of users to access those data sets concurrently than would be available when running SAS VA SMP.

Implementing a proof-of-concept

As a matter of best practice for all deployments, HP recommends implementing a proof-of-concept using a test environment that matches as closely as possible the planned production environment. In this way, appropriate performance and scalability characterizations can be obtained. For help with a proof-of-concept, contact an HP Solution Design Services (SDS) representative at sastech@hp.com.

For more information

HP Converged Infrastructure for SAS High-Performance Visual Analytics http://www8.hp.com/us/en/products/solutions/product-detail.html?oid=5405512#!tab=features

HP Converged Infrastructure hp.com/go/convergedinfrastructure

SAS Visual Analytics sas.com/visual-analytics

HP Factory Express hp.com/go/factoryexpress

HP ProLiant Servers hp.com/go/proliant

HP BladeSystem hp.com/go/bladesystem

HP Networking hp.com/go/networking

HP FlexFabric Networks hp.com/go/flexfabric

HP Insight Control hp.com/go/insightcontrol

To help us improve our documents, please provide feedback at hp.com/solutions/feedback.

Sign up for updates

hp.com/go/getupdated

© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
