Technical white paper
HP Project Moonshot and the Redstone
Development Server Platform
Introduction
Challenges facing today’s hyperscale data center
Elements of Project Moonshot
Server platforms
HP Discovery Lab
HP Pathfinder Program
Redstone Development Server Platform
Chassis design
Server trays
Compute cartridges
Storage cartridges
Configuring the server trays
SoC architecture
Quad-core processor subsystem
What are ARM processors and why use them?
Availability
Redstone software environment
Operating systems
What applications are appropriate?
Estimated power, space, and cost savings
Application porting and re-compiling
Conclusion
For more information
Introduction
HP Project Moonshot is a multi-year, multi-phase program designed to unlock the potential savings of extreme low-energy processors. The roots of Project Moonshot come from HP Labs’ research beginning in 2008 that pioneered the idea of low-energy processors within the enterprise space. For more information about its origins, see www.hpl.hp.com/news/2011/oct-dec/moonshot.html.
Project Moonshot consists of three key elements:
• Low-energy server platforms
• HP Pathfinder program, a partner program to enable comprehensive solutions built around extreme low-energy platforms
• HP Discovery Lab, where customers can test and validate their solutions
The Redstone Development Server Platform is a proof-of-concept server for Project Moonshot that lets customers and partners begin investigating the possibilities of this new architecture. It leverages the existing ProLiant SL6500 chassis with server trays that can hold a mixture of storage and compute cartridges. A single server tray supports up to 72 compute nodes, for a total of 288 compute nodes (servers) per 4U chassis, four times the density of our space-optimized ProLiant SL platforms. Because it uses an ARM-based processor, it also has drastically reduced power requirements, approximately a tenth that of typical x86-based servers. For specific applications such as highly parallel workloads that don’t make effective use of high-end CPUs, the Redstone Development Server can achieve better effective performance per watt while using a simpler, more energy-efficient, and less expensive core.
We’ve written this paper primarily for IT managers and technology decision makers who have, or are contemplating, extreme scale-out applications in their data centers. The paper gives an overview of the Project Moonshot program and describes the architecture of the Redstone Development Server Platform. It also describes the target workloads we expect to benefit most from a platform like the Redstone Development Server Platform.
Challenges facing today’s hyperscale data center
Many things that we do every day, such as checking email accounts, posting onto social media sites, browsing web pages, and searching web indexes or portals, are not compute-intensive. They do, however, have high I/O throughput and memory footprint requirements. IT architects working at this scale typically use cluster techniques to run massively parallel workloads that distribute data across many nodes, often in cloud environments. Using typical x86 server CPUs, which are designed for compute-intensive enterprise applications, in these environments means underutilizing compute capacity and wasting energy.
Distributed workloads in cloud environments often run at low processor utilization levels of 20% or less, yet administrators pay for the cost of a premium CPU.
Virtualization can address the low CPU utilization problem if you can consolidate multiple workloads that are somewhat balanced, such as enterprise applications or infrastructure-as-a-service. Virtualization does not adequately address the needs of scale-out applications and web serving, where the I/O component is much larger and the amount of processing required per unit of data is much smaller. In these environments, consolidating through virtualization effectively reduces the network, memory, and I/O bandwidth per unit of data, which makes the large I/O problem worse. Project Moonshot takes the approach of using energy-efficient CPUs that balance performance and cost to match the needs of data-intensive applications.
Another issue that overwhelms IT managers in hyperscale environments is the sheer number of devices they must manage, power, and cool. With today’s rack-mount x86 platforms, you can have between 20 and 40 servers in a 42U rack. Scale-out optimized platforms like HP ProLiant SL can increase the density to 80 servers in each rack. Each server comes with its own management controller, network controllers, storage controllers, OS instance, device drivers, and so on. So every time you add a server, you must also procure multiple I/O devices and manage, secure, power, and cool them. While HP BladeSystem c-Class enclosures also provide a shared infrastructure, Project Moonshot takes the sharing to a new level by integrating the processor and chipset onto a single piece of silicon and sharing other resources across the system.
Elements of Project Moonshot
The innovative Moonshot architecture opens up new opportunities to rethink hardware design, workloads, and the software ecosystem around them. For this reason, Project Moonshot consists of three elements:
• Low-energy server platforms: The Redstone Development Server is the first of the Project Moonshot servers. It is available for select customers and partners to validate solutions for low-energy servers. Multiple production server models will follow.
• HP Pathfinder program: This program is part of the HP AllianceONE program that brings together ISVs, compute, storage, and networking companies to develop an entire ecosystem around extreme low-energy platforms.
• HP Discovery Lab: This lab enables customers to experiment, test, and benchmark hardware and software for low-energy servers. They will be able to use the Redstone Development Server Platform, future Moonshot production servers, and traditional servers for comparison.
While other companies may have programs to develop energy-efficient platforms or may support other aspects of extreme scale-out computing, only HP has the resources and industry-leading position to develop this comprehensive approach.
Server platforms
Project Moonshot platforms enable running thousands of servers in a rack in a cost-effective manner. To do this, we put extreme low-energy processors with dedicated memory on small compute cartridges. The chipset to control the system periphery is integrated with the processor as part of a system-on-a-chip (SoC). We are moving other elements such as power, storage controllers, network fabric, and management into the system infrastructure and sharing them. As a result, when you add a compute cartridge (server node), you will no longer have to power, cool, and manage other infrastructure elements. As we move forward with production platforms, we will leverage our strengths in management, storage, and networking to implement these elements as shared devices across the system.
One goal of Project Moonshot is to align specific silicon architectures—purpose-built SoCs—with the right workload to provide optimal results. Therefore, Project Moonshot platforms will be able to use multiple types of processors to support different workloads. While the Redstone Development Server (our first Moonshot offering) uses Calxeda EnergyCore™ processors based on ARM® cores, the Moonshot architecture can incorporate other ARM-powered processors and low-energy x86 processors such as Intel Atom.
HP Discovery Lab
The HP Discovery Lab helps customers and partners validate and test their scale-out workloads on an extreme low-energy architecture. Because the Moonshot architecture is different from typical x86 servers, it opens new opportunities for optimizing software to exploit it and for developing benchmarking tools to understand it. The lab lets participants investigate, test, and benchmark applications in a secure and confidential environment to determine which computing infrastructure is best suited to their application. The HP campus in Houston, Texas, houses the first Discovery Lab (see the YouTube video), staffed by HP technology experts and partners. Customers and partners who have solutions they believe will benefit from a scale-out infrastructure can access the lab on site or via remote connections. We also plan to open Discovery Labs in Europe and Asia later.
HP Pathfinder Program
We’re inviting software, compute, storage, and networking providers to become partners to develop the applications and software needed for low-energy compute solutions. Partners will help publish observations such as best practices, create de facto industry standards, and develop formal industry standards to speed up innovation in the hyperscale market. Our initial partners include hardware partners ARM, AMD, Intel, and Calxeda, and OS partners such as SUSE, Canonical, and Red Hat. We are actively recruiting emerging processor, operating system, and ISV solution partners to create a robust developer community that includes hardware and software vendors.
The Pathfinder program and the Discovery Lab are ongoing HP investments as we develop the Moonshot architecture and the ecosystem surrounding it. If your organization is interested in participating in the Pathfinder program, visit the Moonshot website at www.hp.com/go/moonshot for more information.
Redstone Development Server Platform
The Redstone Development Server is a proof-of-concept server to allow HP, developers, and partners to begin investigating the potential of extreme low-energy servers.
Chassis design
Redstone uses the same 4U chassis as the ProLiant SL6500 (Figure 1). We’ve optimized the chassis for power and cooling efficiency. It includes the following hardware components:
• Four 2U, half-width pluggable server trays
• Four common-slot power supplies (750 W or 1200 W) with N+1 redundancy
• Eight shared fans with N+1 redundancy
• Embedded ProLiant Power Management Controller that monitors system power and temperature and adjusts fans as needed
We developed high-efficiency common-slot power supplies to increase power efficiency without degrading server performance. The Platinum-level certification by the 80Plus program and the EPRI (Electric Power Research Institute) ensures one of the highest levels of power efficiency in the industry—up to 94%.
Figure 1: A Redstone Development Server holds four server trays in the front of the chassis and four power supplies in the back. (Front view: four server trays in the compact 4U chassis. Rear view: power supplies and fans.)
Server trays
Each 2U, half-width server tray holds 18 compute or storage cartridges that plug vertically into the server tray (Figure 2). The front of each server tray contains a single serial port interface and four 10 Gb SFP+ ports to route network signals out the front of the chassis.
Figure 2: Up to 18 compute or storage cartridges plug vertically into the Redstone server tray.
Compute cartridges
Each compute cartridge contains four Calxeda EnergyCore SoCs that have their own dedicated memory and OS instance, so each cartridge holds four discrete servers (Figure 3). A single tray can hold up to 72 servers (18 cartridges x 4 SoCs per cartridge).
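The density figures quoted for Redstone follow from simple multiplication. The sketch below reproduces them in Python. The constant and function names are illustrative only, and the rack-level number assumes a 42U rack filled entirely with 4U Redstone chassis, which this paper does not state.

```python
# Density arithmetic for the Redstone chassis, using figures from this paper.
SOCS_PER_CARTRIDGE = 4       # each SoC is one discrete server
CARTRIDGES_PER_TRAY = 18
TRAYS_PER_CHASSIS = 4
CHASSIS_HEIGHT_U = 4


def servers_per_tray(compute_cartridges: int = CARTRIDGES_PER_TRAY) -> int:
    """Servers in one tray holding the given number of compute cartridges."""
    if not 0 <= compute_cartridges <= CARTRIDGES_PER_TRAY:
        raise ValueError("a tray holds between 0 and 18 cartridges")
    return compute_cartridges * SOCS_PER_CARTRIDGE


def servers_per_chassis() -> int:
    """Servers in a chassis with four all-compute trays."""
    return servers_per_tray() * TRAYS_PER_CHASSIS


def servers_per_rack(rack_height_u: int = 42) -> int:
    """Upper bound for a rack that holds nothing but Redstone chassis.

    Assumption: no rack units are reserved for switches or power distribution.
    """
    return (rack_height_u // CHASSIS_HEIGHT_U) * servers_per_chassis()


if __name__ == "__main__":
    print(servers_per_tray())      # 72
    print(servers_per_chassis())   # 288
    print(servers_per_rack())      # 2880
```

Under that assumption, a rack reaches the "thousands of servers" range, compared with the 80 servers per rack cited earlier for ProLiant SL platforms.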
Figure 3: A Redstone compute cartridge contains four separate servers that share power and cooling with other compute and/or storage cartridges in a chassis.
Each compute cartridge contains:
• Four EnergyCore SoCs with quad-core processors
• Four mini-DIMM slots (one per SoC), each holding up to 4 GB ECC memory
• Four micro SD ports (one per SoC) for optional local boot using standard micro SD cards
• Four direct-attached SATA 3 Gb/s ports per SoC
Storage cartridges
If your application needs local storage, the Redstone Development Server supports optional storage cartridges. A storage cartridge consists of either two SFF HDDs (Figure 4) or two SFF SSDs. Each SoC can connect to four drives (two cartridges). This means a server tray fully optimized for storage would contain 16 storage cartridges and 2 compute cartridges. The drives are directly attached using cables between the drives and the SATA ports on the compute cartridge.
Figure 4: Each storage cartridge includes two SFF drives: either HDDs or SSDs.
Configuring the server trays
You can optimize the server trays with a mix of compute and storage cartridges for the specific workloads you’ll be running. For example, you can configure for maximum compute capabilities or for maximum storage. Table 1 shows some sample configurations.
Table 1: Available servers and drives depending on configuration
Configuration                 Compute cartridges per server tray    Storage cartridges per server tray
All-compute                   18 (72 SoCs)                          0*
Mix of compute and storage    6 (24 SoCs)                           12 (24 drives)
Maximum storage solution      2 (8 SoCs)                            16 (32 drives)
*Network or MicroSD boot
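The sample configurations in Table 1 can be checked against the limits described above: 18 cartridge slots per tray, four SoCs per compute cartridge, two drives per storage cartridge, and four SATA ports per SoC. The following Python sketch is only an illustration of that bookkeeping; the `TrayConfig` class and its method names are hypothetical and not part of any HP tool.

```python
from dataclasses import dataclass
from typing import List

SLOTS_PER_TRAY = 18
SOCS_PER_CARTRIDGE = 4
DRIVES_PER_STORAGE_CARTRIDGE = 2
SATA_PORTS_PER_SOC = 4


@dataclass(frozen=True)
class TrayConfig:
    """One server tray described by its cartridge counts (hypothetical model)."""
    compute_cartridges: int
    storage_cartridges: int

    @property
    def socs(self) -> int:
        return self.compute_cartridges * SOCS_PER_CARTRIDGE

    @property
    def drives(self) -> int:
        return self.storage_cartridges * DRIVES_PER_STORAGE_CARTRIDGE

    def problems(self) -> List[str]:
        """Return a list of violated limits; an empty list means the mix fits."""
        found = []
        if self.compute_cartridges < 0 or self.storage_cartridges < 0:
            found.append("cartridge counts must not be negative")
        if self.compute_cartridges + self.storage_cartridges > SLOTS_PER_TRAY:
            found.append("more cartridges than the 18 slots in a tray")
        if self.drives > self.socs * SATA_PORTS_PER_SOC:
            found.append("more drives than SATA ports on the compute cartridges")
        return found


if __name__ == "__main__":
    for cfg in (TrayConfig(18, 0), TrayConfig(6, 12), TrayConfig(2, 16)):
        print(cfg, cfg.socs, "SoCs,", cfg.drives, "drives,", cfg.problems() or "ok")
```

The maximum storage row sits exactly at the SATA limit: 8 SoCs with 4 ports each give 32 ports for 32 drives, which is why a tray cannot trade another compute cartridge for storage.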
SoC architecture
The Calxeda EnergyCore SoC includes a quad-core processor and wraps other integrated components around it to make it an enterprise-ready system on a chip. It includes a management co-processor and a network fabric interface, as well as interface controllers for storage and networking. While the long-term vision of Project Moonshot is to share storage and network fabrics across the entire chassis, Redstone only partially incorporates those capabilities. It does share power and cooling through the chassis.
The Calxeda EnergyCore SoC includes the following functionality (Figure 5):
• EnergyCore quad-core processor subsystem
• I/O controllers for Ethernet, PCIe, SATA, and SD storage cards
• Integrated system management co-processor
Figure 5: The Calxeda EnergyCore SoC has a typical power rating of 5W and includes its own Ethernet controller, network fabric switch, and management co-processor.
The network supports 10 Gb Ethernet using an integrated fabric switch. By reducing the need for expensive centralized switches, the embedded network fabric overcomes the challenge of interconnecting thousands of server nodes in a single chassis by conventional methods. The embedded mesh fabric lets you connect nodes in a variety of redundant topologies.
The integrated management capability supports common systems management protocols such as IPMI 2.0. It supports several remote management scenarios, including a remote console to access the SoC through a Serial-over-LAN connection.
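As an illustration of what IPMI 2.0 access could look like in practice, the Python sketch below builds command lines for the open-source `ipmitool` client. It uses only standard `ipmitool` options (`-I lanplus`, `-H`, `-U`, `-E`). Whether the Redstone management co-processor accepts these particular commands is an assumption; the host address and user name are placeholders, and the helper function name is hypothetical.

```python
from typing import List


def ipmitool_command(host: str, user: str, *action: str) -> List[str]:
    """Build an ipmitool invocation over the IPMI 2.0 'lanplus' interface.

    The -E option tells ipmitool to read the password from the
    IPMI_PASSWORD environment variable instead of the command line.
    """
    if not action:
        raise ValueError("an ipmitool action such as 'sol activate' is required")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *action]


if __name__ == "__main__":
    # Placeholder address from the documentation range 192.0.2.0/24.
    node = "192.0.2.10"
    power_status = ipmitool_command(node, "admin", "chassis", "power", "status")
    sol_console = ipmitool_command(node, "admin", "sol", "activate")
    print(" ".join(power_status))
    print(" ".join(sol_console))
    # To actually run a command: subprocess.run(power_status, check=True)
```

Building the argument list separately from executing it keeps the sketch testable without management hardware and avoids shell quoting issues.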
Each SoC is a separate server running its own OS instance. The SoC typically operates at 5 W of power and uses about 0.5 W when idle. If you add one 4 GB DIMM, it consumes about 1 W more.
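The per-node figures above lend themselves to a quick estimate. The Python sketch below adds up SoC and DIMM power only; it deliberately ignores fans, drives, the network fabric, and power-supply losses, so it is a lower bound rather than a chassis power rating. The function names are illustrative.

```python
# Figures quoted in this paper for one Calxeda EnergyCore node.
SOC_TYPICAL_W = 5.0   # SoC under typical load
SOC_IDLE_W = 0.5      # SoC when idle
DIMM_W = 1.0          # one 4 GB DIMM


def node_power_w(soc_w: float = SOC_TYPICAL_W, dimms: int = 1) -> float:
    """Power for one server node: the SoC plus its DIMM (one slot per SoC)."""
    if dimms < 0:
        raise ValueError("dimms must not be negative")
    return soc_w + dimms * DIMM_W


def nodes_power_w(nodes: int, soc_w: float = SOC_TYPICAL_W, dimms: int = 1) -> float:
    """Combined SoC and memory power for a number of identical nodes."""
    return nodes * node_power_w(soc_w, dimms)


if __name__ == "__main__":
    print(node_power_w())       # 6.0 W per node with one DIMM
    print(nodes_power_w(288))   # 1728.0 W for a full chassis of nodes
```

A fully populated chassis of 288 nodes therefore draws roughly 1.7 kW for processors and memory alone under typical load.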
Quad-core processor subsystem
The Redstone Development Server uses the quad-core Calxeda ECX-1000 EnergyCore processor subsystem, based on ARM 32-bit cores. The quad-core processor subsystem includes server-grade features such as an integrated ECC memory controller supporting DDR3 memory and a shared 4 MB L2 cache with ECC. The processor subsystem has the following characteristics:
• Core frequency ranges of 1.1–1.4 GHz
• Individual power domains per core to minimize overall power consumption
• A fully IEEE-754 compliant floating point unit (FPU) with single- and double-precision operations
• NEON® technology extensions for multimedia and SIMD processing
• Integrated 4 MB shared L2 cache with ECC
• Integrated memory controller
– 72-bit data path with ECC
– DDR3 (1.5 V) and DDR3L (1.35 V) at 800/1066/1333 MHz
(Figure 5 diagram labels: four ARM cores, L2 cache, memory controller, EnergyCore Management Controller, I/O controllers for Ethernet, PCIe, SATA, and SD, and EnergyCore Fabric Switch, with 4 GB DRAM attached. SoC power: 5 W. DIMM power: 1 W. Source: Calxeda)
What are ARM processors and why use them?
The vast majority of embedded and consumer devices today—for example, smart phones, tablets, e-readers, printers, and scanners—use processors based on ARM cores, which means they are plentiful and well understood by the engineering community. The ARM architecture uses a RISC instruction set, for which ARM Holdings and partners have developed a rich set of silicon available today. The vast and versatile ARM partner base builds highly optimized SoC components that combine right-sized performance with low power usage.
The target applications for our extreme low-energy servers are highly parallel workloads that don’t make effective use of all the CPU cycles available today in high-end processors. By moving to ARM processors for these application environments, we can achieve better effective performance per watt while using a simpler, more energy-efficient, and less expensive core. Simplifying the main CPU core frees up additional silicon resources: Our silicon partners can take the ARM processor architecture and design even more specialized SoC implementations around it. They can integrate other components such as memory or network controllers for deeper reductions of system cost and power for a particular application domain. Mobile and embedded platforms already have a long history of doing this. With Project Moonshot, we’re driving these advantages into the server market.
The simpler instruction set and aggressive low-power designs of ARM implementations give ARM cores an intrinsic energy and cost advantage compared to high-performance processors such as x86.
Availability
We will make limited quantities of the Redstone Development Server available to customers and partners participating in the HP Discovery Lab. The servers will be available in the second half of 2012.
Redstone software environment
Many software vendors today offer robust software components for ARM processors, primarily in mobile applications and web serving. We plan to leverage this ecosystem and enhance it to address specific requirements of the hyperscale server market. This section briefly discusses the major software components and workloads that we plan to investigate with Redstone.
Operating systems
We are designing the Redstone Development Server initially to run on Canonical’s Ubuntu and Red Hat’s Fedora distributions. We will explore the use of other operating systems in the near future.
What applications are appropriate?
Moonshot platforms in general will be best suited for one or more of these application areas:
• Simple content delivery—web front-end servers with relatively little request processing or with video servers fetching and shipping unmodified video
• Large distributed memory caching—server clusters running software such as Memcached
• Big-data applications—simple database searches on in-memory databases, scalable analytics, and MapReduce applications with lightweight CPU requirements
These workloads and application types are common in the cloud and service provider environments today. They can operate as massively parallel scale-out applications and are constrained most often by I/O rather than by CPU or memory. This makes them ideal for Moonshot platforms.
For the Redstone Development Server specifically, the web front-end applications are likely to be the most suitable. Most web developers use the open-source LAMP software stack (Linux OS, Apache HTTP Server, MySQL database, and PHP/Python/Perl scripting) to build dynamic web sites and web servers. The LAMP software stack has a reputation for being a light computational workload, where fetching and delivering data is more important than computational power.
The Redstone Development Server may not be as effective for more memory-intensive applications such as Hadoop and Memcached, because they might need more memory than the 32-bit Redstone processors can address. Follow-on Moonshot platforms may be better suited for such applications.
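To make the memory constraint concrete: a 32-bit address reaches at most 2^32 bytes, or 4 GiB, which matches the 4 GB DIMM limit per SoC. The Python sketch below shows how that ceiling translates into node counts for an in-memory data set. The 3 GiB of usable memory per node is an illustrative assumption (it leaves room for the OS and process overhead), not a measured value.

```python
import math

GIB = 2 ** 30


def addressable_bytes(address_bits: int = 32) -> int:
    """Largest amount of memory reachable with the given address width."""
    return 2 ** address_bits


def nodes_for_dataset(dataset_gib: float, usable_gib_per_node: float) -> int:
    """Nodes needed to hold a data set entirely in memory across a cluster."""
    if usable_gib_per_node <= 0:
        raise ValueError("usable memory per node must be positive")
    return math.ceil(dataset_gib / usable_gib_per_node)


if __name__ == "__main__":
    print(addressable_bytes(32) // GIB)    # 4 (GiB per 32-bit node)
    # Assumption: about 3 GiB per node remains for cached data.
    print(nodes_for_dataset(256, 3.0))     # 86 nodes for a 256 GiB cache
```

The more memory a workload needs per node, the more nodes it takes to hold the same data, which is the effect this paragraph cautions about for caches and Hadoop-style jobs.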
Estimated power, space, and cost savings
To illustrate the amount of savings Redstone can deliver, let’s consider a scenario of a target workload requiring 400 x86 servers today. We estimate a total cost of ownership of $3.3 million, which includes the following:
• Server acquisition costs (dual-socket 1U servers in 10 racks) amortized over 3 years
• Networking acquisition costs (top of rack switches and cables) amortized over 3 years
• 3-year power costs
• 3-year cooling costs
• Space costs amortized over 15 years
To reach the same performance with the Redstone Development Server, we estimate needing about 1600 servers at a cost of $1.2 million. Figure 6 shows estimated savings when using the Redstone Development Server instead of traditional x86 servers.
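The arithmetic behind this scenario can be sketched in a few lines of Python. The inputs are the totals given in this paper and in Figure 6 (400 x86 servers, a 4:1 server ratio, $3.3 million versus $1.2 million, 91 kW versus 9.9 kW). The chassis count assumes fully populated all-compute chassis, which is our assumption for illustration. The space and complexity percentages in Figure 6 depend on inputs not given here, so the sketch does not attempt to reproduce them.

```python
import math

SERVERS_PER_CHASSIS = 288   # all-compute Redstone chassis
CHASSIS_HEIGHT_U = 4


def percent_reduction(before: float, after: float) -> float:
    """Relative saving of 'after' compared with 'before', in percent."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return (before - after) / before * 100.0


def redstone_servers(x86_servers: int, ratio: int = 4) -> int:
    """Redstone servers needed at the stated ratio per x86 server."""
    return x86_servers * ratio


def chassis_needed(servers: int) -> int:
    """Whole chassis required, assuming every slot holds a compute cartridge."""
    return math.ceil(servers / SERVERS_PER_CHASSIS)


if __name__ == "__main__":
    servers = redstone_servers(400)
    chassis = chassis_needed(servers)
    print(servers, "servers in", chassis, "chassis,", chassis * CHASSIS_HEIGHT_U, "U")
    print(f"cost saving:   {percent_reduction(3.3, 1.2):.1f}%")
    print(f"energy saving: {percent_reduction(91.0, 9.9):.1f}%")
```

The 1,600 servers fit in six chassis, or 24U, which is consistent with the half rack shown in Figure 6 if a 42U rack is assumed. The computed cost saving of about 63.6% appears as 63% in the figure.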
Figure 6: In hyperscale environments, use of the Redstone Development Server shows tremendous promise for savings.
Traditional x86 ($3.3M): 400 servers, 10 racks, 20 switches, 1,600 cables, 91 kilowatts
HP Redstone ($1.2M): 1,600 servers, 1/2 rack, 2 switches, 41 cables, 9.9 kilowatts
Redstone savings: 89% less energy, 94% less space, 63% less cost, 97% less complexity
These estimated savings apply to workloads for which you can use a 4:1 ratio of Redstone servers to traditional x86 servers to achieve equal throughput performance. For mainstream IT workloads with higher computational requirements, you are not likely to see the same cost and power savings. Traditional approaches might give better results for such workloads because you are likely to need more servers to meet the same performance. In some cases, you might choose to trade off cost and power for the space savings achieved in the Moonshot program.
Application porting and re-compiling
Many software applications are already available and supported on the ARM instruction set. Other server-specific applications and system software must be recompiled and tuned for ARM. That’s because the ARM instruction set is not binary compatible with x86, and because the performance tradeoffs are different. Efforts are already under way to expand the software available on ARM. In May 2012, Calxeda demonstrated a reference cluster platform running Ubuntu 12.04 LTS on EnergyCore hardware at the Ubuntu Developer and Cloud Summit events in Oakland, California (http://armservers.com/2012/05/07/calxeda-demonstrates-ubuntu-12-04-lts-on-energycore-soc).
Because of the different system, performance, and granularity tradeoffs, we expect that our customers will want to test and benchmark their applications on a Moonshot platform before moving to that platform for production. In some cases, they might want to port and tune their applications to get the benefits of the architecture. We are making available the HP Discovery Lab exactly for that purpose.
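Because ARM and x86 binaries are not interchangeable, deployment tooling for a mixed environment usually has to select a build per architecture. The Python sketch below shows one way to do that with the standard library's `platform.machine()`. The artifact naming scheme and function names are hypothetical and serve only to illustrate the idea.

```python
import platform
from typing import Optional


def instruction_set_family(machine: str) -> str:
    """Map a machine string (as returned by platform.machine()) to a family."""
    name = machine.lower()
    if name.startswith("arm") or name.startswith("aarch64"):
        return "arm"
    if name in ("x86_64", "amd64", "i386", "i686", "x86"):
        return "x86"
    return "other"


def artifact_name(app: str, machine: Optional[str] = None) -> str:
    """Pick the build of an application that matches the host architecture.

    The '<app>-<family>.tar.gz' naming is an assumption for this example.
    """
    family = instruction_set_family(machine or platform.machine())
    return f"{app}-{family}.tar.gz"


if __name__ == "__main__":
    print(artifact_name("webfrontend", "armv7l"))   # webfrontend-arm.tar.gz
    print(artifact_name("webfrontend", "x86_64"))   # webfrontend-x86.tar.gz
    print(artifact_name("webfrontend"))             # depends on the host
```

Interpreted code such as PHP, Python, or Perl scripts typically needs no per-architecture build, but the interpreter and any native extensions do.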
Conclusion
Project Moonshot and the Redstone Development Server address the needs of data centers deploying servers at massive scale, such as thousands of servers at a time.
Project Moonshot server platforms will use extremely low-power ARM or x86 processors and will allow the sharing of power, cooling, network controllers, storage controllers, and system management among servers. Along with the platform development, we have developed the HP Discovery Lab and the Pathfinder Program for customers and partners to try out the platforms, port and tune their applications, run benchmarks, and develop software to take full advantage of low-power, hyperscale servers.
The Redstone Development Server Platform is the first step in Project Moonshot. It will provide up to 288 servers in a single 4U chassis with an integrated high-bandwidth network fabric, storage controllers for optional local storage, and system management. It will be available to select customers in the second half of 2012. Following quickly behind the Redstone proof-of-concept server will be full production servers available to any customer. Redstone and other follow-on Moonshot platforms complement the existing HP ProLiant portfolio. Because Moonshot platforms target scale-out application workloads that require high I/O but low compute cycles, they will not replace the existing ProLiant BL, DL, ML, or SL lines.
If you think Project Moonshot could meet your project needs, please visit www.hp.com/go/moonshot for more information about how to join the Pathfinder program.
For more information
Visit the URLs listed below if you need additional information.
Resource description    Web address
HP Project Moonshot website    www.hp.com/go/moonshot
HP Project Moonshot business whitepaper    http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA3-9839ENW.pdf
Calxeda EnergyCore ECX-1000 technical specifications    www.calxeda.com/products/energycore/ecx1000/techspecs
HP industry-standard server white papers    www.hp.com/servers/technology
HP labs paper, “Server designs for warehouse-computing environments”    www.hpl.hp.com/personal/Partha_Ranganathan/papers/2009/2009_MicroTopPicks_microblades.pdf
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. AMD is a trademark of Advanced Micro Devices, Inc.