WHITE PAPER
7/2013
Automating Test Data Centers
Test labs and testing data centers constitute a massive investment for technology vendors, converged infrastructure solution providers, systems integrators, and service providers. Large technology providers routinely fill tens if not hundreds of thousands of square feet of real estate with millions of dollars in equipment consuming stratospheric amounts of electricity. This huge expenditure is necessitated by the fundamental need to validate the quality and interoperability of complex products and services, replicate technical issues and verify fixes, and in some cases, demonstrate and deliver proof of concept testing for customers. However, the lack of automation in these testing environments presents a significant barrier to a reasonable return on the investment. Furthermore, typical manual operations in these costly data centers prevent the speed of execution and management visibility needed to maintain decent market velocity. This paper explores the case for test lab and data center automation, introduces the TestShell test and lab automation solution and its benefits, and reviews best practices for successfully automating complex test data center environments. By implementing TestShell’s powerful automation capabilities with proven methodologies, test data centers can achieve substantially greater efficiency, productivity and management visibility, leading to significant CAPEX and OPEX savings, faster time to market and increased business competitiveness.
Introduction
The Case for Test Data Center Automation: the Challenges
Test data centers are normally established to serve as a shared, dynamic infrastructure for development, Quality Assurance (QA), technical support and field engineers to perform a variety of critical testing tasks. Such data centers generally include many instances of multi-vendor equipment representing the full data center stack of computing, storage, network, and virtualization components, and three types of testing predominate here:
Interoperability or certification tests
These tests ensure that new products will work properly with the variety of other products installed in target deployment environments. For product vendors and systems integrators, this may include testing against common customer deployment architectures as well as custom testing for strategic accounts. For service providers, this means testing and certifying new devices and software vis-à-vis all relevant production architectures.
Technical issue replication and fix verification
In this use case, when a customer field issue is reported, technical support engineers, and sometimes field engineers as well, need to assemble the same deployment configuration in which the field problem occurred and replicate the problem so that escalation engineers can examine it in context. When the fix is ready, support and field engineers must verify that the problematic behavior has been resolved in that specific configuration before delivering the fix to the customer.
Customer demonstration and proof of concept (PoC) testing
Technology vendors and service providers often have to provide demonstrations and perform PoC tests for customers to show that the proposed solution will work in the customer’s architectural and multi-vendor environment.
[Figure: the test data center as a shared resource for Development, QA, Tech Support, and Field Engineering]
All three types of testing require dynamic reuse of multi-vendor equipment and virtual resources in variable configurations or test topologies. For example, a test may require verifying the interoperability and performance of a new version of a hypervisor in a test topology consisting of multiple server, storage, and SAN switch vendor equipment. While each of the three use cases is characterized by somewhat different processes and durations, the simplicity and speed with which infrastructure can be deployed to support these tests is plainly an important factor.
Unfortunately, in the vast majority of testing data centers today, deploying costly resources in test topologies is anything but simple and fast. Ironically, the enterprise and carrier-class technology that is being tested is becoming increasingly virtualized, agile and software-controlled. Yet, the infrastructure management, setup, and provisioning processes of the typical test data center are overwhelmingly manual in nature, clearly evidenced by the following:
Absence of inventory visibility
In most data centers, equipment inventory is not tracked in a way that provides visibility to engineers. While most organizations track assets for financial purposes, what passes for inventory management by engineers is a simple spreadsheet that is often poorly maintained. As a result, it is difficult to tell without exhaustive investigation what equipment exists, what is being used by whom, and what is actually available.
Offline test topology design
In the absence of usable inventory visibility, test topology is designed offline without regard to resource availability. Visio or other diagramming tools are generally used to produce what is essentially the electronic version of a paper drawing, which is then printed to aid a time-consuming and arduous hunt for the relevant equipment.
Chaotic connectivity management and costly errors
Once inventory is found that appears to be available, engineers must manually re-cable connections between the equipment. With different people adding, moving and changing components, typically without up-to-date documentation, errors such as accidentally disconnecting someone else’s test inevitably occur. Lamentably, test breaks are a common occurrence in most test data centers today.
[Figure: Testing types: interoperability testing, functional testing, PoC testing]
Manual provisioning
An engineer, after painstakingly assembling a physical topology, must then perform a variety of further time-consuming logical provisioning steps, for example: loading a particular hypervisor version, changing OS images on networking equipment, setting up logical connectivity between servers and virtualized storage devices, and instantiating virtual machines. Test engineers may be highly knowledgeable about the components they are testing, but in effect they spend the vast majority of their time on low-level provisioning tasks.
Time-consuming setup
Such laborious setup efforts can be summed up in a word, waste: waste of space, waste of equipment and waste of power resources:
High test setup to actual testing ratio. Technology test data center engineers can easily spend a week on the setup process for a test that takes less than a day to run.
Hoarding and poor resource sharing. Given the time that engineers often have to spend physically locating, connecting and provisioning test resources into a ready-to-use test topology, it is understandable that they do not want others to make any changes. The time-consuming setup process also makes it too costly to release resources even if a test is not ready to start immediately, so that expensive data center components may remain idle, often powered up and consuming electricity, for days or even weeks on end until a test is ready to run.
Very low device utilization. Tens of millions of dollars in capital equipment are typically only 15% to 20% utilized.
Excessive power usage. Test data center managers report instances where preemptive power-downs of large test environments caused only a few test engineers to complain, leading managers to wonder why all the other equipment was powered on continuously in the first place.
The degree of waste generated by manual operations at test data centers has significant implications. To begin with, device utilization of less than 20% means that as demand for testing expands with the deployment of new devices and business growth, the rate of investment in data center capacity will rise sharply. With data centers costing anywhere from $1K to $3K per square foot including equipment costs, this can lead to huge, unnecessary CAPEX outlays over time. Wasted power costs are also daunting. Assuming modest annual power costs of $50 per square foot for a 50 W/square foot data center, the expansion of capacity needed to accommodate low device utilization can lead to hefty additional OPEX.
Besides the bottom-line cost of wasted test data center capacity due to operations being performed manually, there is also the top-line issue of slower business velocity. With setup to testing ratios as high as 80:20, testing cycles are longer. This delays product or service releases, or causes organizations to compromise test coverage and quality, which results in more problems found in the field, where they are much costlier to fix after products are released, and consequently in delayed customer adoption and revenues.
[Figure: key figures: device utilization 20%; cost per square foot $1K-$3K; annual power cost per square foot $50; setup to testing ratio 80:20]
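The cost argument above can be made concrete with a rough back-of-the-envelope model. The sketch below is illustrative only: it assumes that the floor space required scales inversely with device utilization, and the function name, the 1,000 square feet of fully used capacity, and the $2K per square foot midpoint are assumptions chosen for the example rather than figures from any specific lab.

```python
from dataclasses import dataclass


@dataclass
class LabCost:
    floor_space_sqft: float   # total floor space needed at this utilization
    capex_usd: float          # one-time build and equipment cost
    annual_power_usd: float   # yearly power cost


def estimate_lab_cost(busy_sqft: float, utilization: float,
                      capex_per_sqft: float = 2000.0,
                      power_per_sqft_year: float = 50.0) -> LabCost:
    """Rough cost of a lab whose equipment is only partly utilized.

    busy_sqft is the floor space the lab would need if every device were
    in use all the time (a hypothetical measure of testing demand).
    Simplifying assumption: required space = busy_sqft / utilization.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    space = busy_sqft / utilization
    return LabCost(space, space * capex_per_sqft, space * power_per_sqft_year)


# Compare the same testing demand at 20% and at 40% device utilization.
for u in (0.20, 0.40):
    cost = estimate_lab_cost(1000, u)
    print(f"utilization {u:.0%}: {cost.floor_space_sqft:,.0f} sq ft, "
          f"CAPEX ${cost.capex_usd:,.0f}, power ${cost.annual_power_usd:,.0f}/year")
```

Under these assumptions, doubling utilization from 20% to 40% halves the floor space needed for the same testing demand, and the estimated CAPEX and annual power cost fall in the same proportion.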
TestShell Test and Lab Automation Solution
TestShell is the industry’s leading test and lab automation solution, designed to help test data centers achieve dramatically higher efficiency and productivity, leading to significant CAPEX and OPEX savings, faster test cycle completion, and better top-line performance and competitiveness. TestShell’s groundbreaking innovation lies in the delivery of a fully integrated, object-oriented software framework for automating the test data center, including:
Centralized, live infrastructure and resource inventory
Inventory-aware test topology design
Shared, calendar-based resource and topology reservation
Connectivity mapping and automated connectivity control
Easy-to-create automated task provisioning
Non-programmer-friendly automation: workflow creation is based on a library of highly reusable test objects that can be built from a wide variety of sources and leveraged to construct auto-discovery, auto-baselining and other automated maintenance routines, as well as full test automation workflows.
TestShell’s architecture avoids the pitfalls of script-based approaches to automation, which cannot scale due to their high maintenance costs. A best-of-breed commercial solution deployed by industry leaders worldwide, TestShell offers the fastest path to a successfully and sustainably automated testing system. Using TestShell, the world’s leading technology and service providers turn their test lab data centers from chaotic, manually operated environments into highly efficient Lab as a Service (LaaS) clouds, enabling data center managers and engineers to:
Manage data center inventory, including physical DUT (Device Under Test) and testing equipment, L1 switches, and virtual resources such as virtual machines and virtual switches, in a live, searchable database of resource objects tagged with searchable attributes. This eliminates manual searches for equipment on racks and enables engineers to interface with the data center infrastructure efficiently via software. TestShell’s inventory and resource management allows for object hierarchies which can represent relatively simple nested resources such as chassis, blades and ports, or complex pre-integrated resource stacks such as converged infrastructure and “data center in a box” solutions.
Create test topologies via a software GUI that allows engineers to drag and drop resource objects onto a canvas, visually ascertain availability, design and sanity-check connectivity, and save the entire topology as a higher-level object in the resource library for later reuse by themselves or by other engineers.
Schedule resources and entire test topologies through a common calendaring system. Resource conflicts can be swiftly resolved since it is easy to determine who is using any specific resource at any given time.
Manage connectivity remotely by generating patching or cabling requests to lab administrators or, if Layer 1 switches are used, by automatically connecting test topologies.
Make device provisioning error-free by building automation objects for common provisioning tasks launched via a right-click menu in a graphical view of the test topology. Device provisioning can include uploading OS images, or common configurations such as creating GRE tunnels or routing adjacencies between virtual switches.
Create auto-discovery and auto-baselining processes that leverage TestShell’s extensive array of control interfaces, GUI automation and scripting capabilities to streamline the management of inventory and device states.
Roll out full test automation, including integration of existing automation scripts as testing objects, as well as creation of new test automation objects through screen, GUI and other forms of capture.
Generate real-time reports and dashboards on data center device utilization, topology reservations vs. activations, and even comprehensive test results.
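To illustrate the kind of utilization reporting mentioned above, the following sketch computes per-device utilization from a list of reservation records. It is a minimal example under stated assumptions: the record layout, device names and function name are hypothetical and do not represent TestShell’s data model or API.

```python
from collections import defaultdict
from datetime import datetime


def device_utilization(reservations, window_start, window_end):
    """Return {device: fraction of the window during which it was reserved}.

    reservations: iterable of (device, start, end) tuples; this layout is a
    hypothetical stand-in for whatever the reservation system exports.
    Assumes reservations for the same device do not overlap. Devices with
    no reservation in the window are simply absent from the result.
    """
    window = (window_end - window_start).total_seconds()
    busy = defaultdict(float)
    for device, start, end in reservations:
        # Clip each reservation to the reporting window.
        s, e = max(start, window_start), min(end, window_end)
        if e > s:
            busy[device] += (e - s).total_seconds()
    return {device: seconds / window for device, seconds in busy.items()}


week_start = datetime(2013, 7, 1)
week_end = datetime(2013, 7, 8)
records = [
    ("san-switch-1", datetime(2013, 7, 1, 9), datetime(2013, 7, 2, 9)),   # 24 h
    ("san-switch-1", datetime(2013, 7, 4, 0), datetime(2013, 7, 4, 18)),  # 18 h
    ("server-12", datetime(2013, 6, 30), datetime(2013, 7, 1, 12)),       # 12 h fall inside the week
]
for device, share in sorted(device_utilization(records, week_start, week_end).items()):
    print(f"{device}: {share:.1%} of the week reserved")
```

A report of this kind makes idle equipment visible, which is the precondition for the utilization gains discussed in this paper.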
TestShell’s Beneficial Impact on Test Data Center Operations
Adoption and deployment of TestShell’s powerful automation capabilities in the test data center leads to significant, positive top and bottom-line impacts for technology vendors and service providers:
A dramatic increase in the velocity of test cycle completion
One organization reported that after automating their demonstration test data center they were able to increase customer demo delivery from 37 demos per month to over 700 demos per month within two years, an increase of almost 1800%. A service provider that automated its home gateway certification testing environment and process was able to reduce certification time from three weeks to just four days. A greater demo velocity leads to a wider and shorter sales pipeline and rising top-line revenues.
Significant savings in lab CAPEX and OPEX
Organizations deploying lab automation software report increases of 50% to 200% in device utilization, resulting in capital budget savings as well as accompanying savings in space, power, and cooling costs. In some cases, these cost savings are dwarfed by the prevention of serious business disruptions that would otherwise be caused by unchecked equipment growth, especially in cases where no additional real estate or power is available on campus to grow labs, putting a stranglehold on sales. By increasing device utilization, labs not only save on the bottom line, but also ensure that they do not become a bottleneck for corporate top-line growth.
Increased competitiveness
Just as re-engineering and automating front office business processes leads to greater business agility, responsiveness to customers and ability to execute, automation in this critical area of technology equipment sales cycles provides a tangible competitive edge that is reflected in customer satisfaction, employee productivity, return on capital, and positive brand effects.
[Figure: benefits: 1800% increase in test operation; 50% to 200% cost savings; tangible competitive edge]
[Figure: traditional test automation challenges: costly programming, no scalability, low penetration, script bloat, personnel dependency]
[Figure: Layer 1 switch control]
Best Practices for Test Data Center Automation
To achieve successful automation outcomes, best-of-breed technology is critical, but it must also be accompanied by best practice methodologies. While every lab environment is to some extent unique, certain best practice guidelines apply to the majority of data center testing environments. These include:
The Need to Evolve Physical Layer Connectivity
Software-based automation benefits from a structured, documented, and easy to operate physical connectivity environment. Most data centers are architected according to the Telecommunications Industry Association’s TIA-942 standard layout with main and horizontal distribution areas; however, many data centers do not employ structured cabling systems fully, largely relying instead on point-to-point cable connections, which impede automation because changes are physically difficult to implement. It is therefore strongly recommended that point-to-point cable connections be eliminated, and that data center connectivity be moved, to the extent possible, towards a “lights-out” operating mode enabled by software-controlled Layer 1 switches. Naturally, most organizations cannot embed Layer 1 switch connectivity in one fell swoop, and while automation of all physical layer connectivity is desirable, this may not be necessary, at least not initially. Layer 1 switches should be prioritized where connectivity changes are frequent and/or need to be implemented rapidly. For other connections, point-to-point cabling can be migrated to either passive or intelligent patch panels.
Resourcing the Automation Infrastructure Service
The most successful lab automation deployments tend to assign personnel with data architecture and programming skills to build and maintain the object library of inventory resources, test topologies, provisioning and shared testing objects and workflows. Once TestShell software is implemented as the interface between users (test engineers) and the actual test infrastructure, the broader user community can leverage this library to build and reserve topologies, easily perform provisioning, and progress into test automation as the library is built out. Dedicating resources to maintaining the object library as an infrastructure service is strongly recommended; otherwise, if the utility and ease of use of the object library are not maintained to a high standard, users will abandon the automation system, squandering the investment.
[Figure: Automation in practice: (A) L1 switch and patch panel control; (B) shared test objects: (B1) reusable test automation objects, (B2) resources tagged (Name, Version, Model, IP Address) for fast tracking]
One of the most important up-front tasks is building the object library that represents the physical and virtual resources inside the data center. This involves gathering a comprehensive list of test lab assets and resources, designing a resource structure (e.g. chassis, blades, ports), and applying data tags to resources so that users can easily search for and locate the resources they need in the resource library. Once the inventory has been architected into an appropriate object hierarchy and imported into TestShell, auto-discovery processes can be designed to automatically update the inventory.
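The resource structure and tagging described above can be pictured with a small data model. The following is a minimal sketch, not TestShell’s actual schema: the class, field names, tag keys and device names are all hypothetical and serve only to show a chassis/blade/port hierarchy that can be searched by tag.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Resource:
    """One node in a hypothetical inventory hierarchy (not TestShell's schema)."""
    name: str
    kind: str                                   # e.g. "chassis", "blade", "port"
    tags: Dict[str, str] = field(default_factory=dict)
    children: List["Resource"] = field(default_factory=list)

    def walk(self) -> Iterator["Resource"]:
        """Yield this resource and all nested resources, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def find(root: Resource, **wanted: str) -> List[Resource]:
    """Return every resource under root whose tags match all wanted values."""
    return [r for r in root.walk()
            if all(r.tags.get(k) == v for k, v in wanted.items())]


# Example inventory: one chassis with two blades, one of which exposes ports.
chassis = Resource("chassis-01", "chassis", {"vendor": "VendorA"}, [
    Resource("blade-1", "blade", {"vendor": "VendorA", "os": "v1.2"}, [
        Resource("port-1", "port", {"speed": "10G"}),
        Resource("port-2", "port", {"speed": "1G"}),
    ]),
    Resource("blade-2", "blade", {"vendor": "VendorA", "os": "v2.0"}),
])

print([r.name for r in find(chassis, speed="10G")])   # ['port-1']
```

Keeping tags as free-form key/value pairs is one possible design choice; it lets new searchable attributes be added without changing the hierarchy itself.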
A Phased Approach
Successful automation systems tend to be built in phases, where each phase aims for a visible productivity gain and return on investment in a relatively short time in order to drive user engagement and momentum and create realistic expectations.
Generally speaking, work on automating the basic visibility, topology design and reservation/sharing of infrastructure resources is the first and easiest phase in automating the test data center environment because it delivers the most immediate and tangible benefits. For example, if two groups can successfully share an expensive set of testing resources rather than making duplicative purchases, the return on investment in automation will be swift.
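The sharing scenario above depends on being able to tell quickly whether two reservations for the same resource collide. The sketch below shows one simple way to express that check; the `Reservation` and `Calendar` names, the group names and the in-memory storage are assumptions made for illustration and do not describe how TestShell implements its calendar.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Reservation:
    resource: str
    owner: str
    start: datetime
    end: datetime      # exclusive


class Calendar:
    """Minimal in-memory reservation calendar (illustrative only)."""

    def __init__(self) -> None:
        self._booked: List[Reservation] = []

    def conflict(self, req: Reservation) -> Optional[Reservation]:
        """Return an existing reservation that overlaps req, if any."""
        for r in self._booked:
            # Two half-open intervals overlap when each starts before the other ends.
            if (r.resource == req.resource
                    and req.start < r.end and r.start < req.end):
                return r
        return None

    def reserve(self, req: Reservation) -> bool:
        """Book req unless it clashes; return True when booked."""
        if req.start >= req.end:
            raise ValueError("reservation must end after it starts")
        if self.conflict(req) is not None:
            return False
        self._booked.append(req)
        return True


cal = Calendar()
qa = Reservation("traffic-gen-1", "QA", datetime(2013, 7, 1, 9), datetime(2013, 7, 1, 17))
support = Reservation("traffic-gen-1", "Support", datetime(2013, 7, 1, 15), datetime(2013, 7, 1, 18))
later = Reservation("traffic-gen-1", "Support", datetime(2013, 7, 1, 17), datetime(2013, 7, 1, 19))

print(cal.reserve(qa))        # True
print(cal.reserve(support))   # False: overlaps the QA booking
print(cal.reserve(later))     # True: starts exactly when QA ends
```

Because `conflict` returns the clashing reservation rather than a bare yes/no, the owner of the existing booking is immediately known, which is what makes conflicts quick to resolve.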
The second automation phase involves transforming low-level device provisioning tasks into easy-to-invoke, menu-driven tasks in the automation GUI. Such provisioning tasks typically start with the basic provisioning steps needed to get DUTs to a particular state, such as uploading OS images or applying patches. More advanced provisioning tasks involve common configuration steps to ready the logical layer of a test topology, such as configuring VLANs, routing adjacencies, or creating tunnels on physical or virtual switches. These automated provisioning objects help test engineers to accomplish the routine tasks that often dominate their workdays more expeditiously, allowing them to focus more on higher order thinking to achieve maximal test coverage.
The third phase, which is short of full test automation, involves creating automated maintenance routines. Examples include auto-discovery, which helps keep the inventory up to date, and auto-baselining, which restores devices to their default provisioning states on a timed basis. Such routines require development of a comprehensive set of device control/interface automation objects for all the devices needed in the test infrastructure so that they can be leveraged across multiple maintenance automation processes.
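The auto-baselining idea described above can be sketched as a comparison between a device’s current state and its default state, followed by running a reusable provisioning task for each setting that has drifted. The example below is a simplified illustration: the setting names, device name and task registry are made up, and a real routine would read state from and apply changes to actual devices through their control interfaces.

```python
from typing import Callable, Dict, List

# A provisioning task is any callable that takes a device name.
# The task names and device states below are made up for illustration.
Task = Callable[[str], None]


def plan_baseline(current: Dict[str, str], baseline: Dict[str, str]) -> List[str]:
    """Return the settings whose current value differs from the baseline."""
    return sorted(k for k, v in baseline.items() if current.get(k) != v)


def restore(device: str, current: Dict[str, str], baseline: Dict[str, str],
            tasks: Dict[str, Task]) -> List[str]:
    """Run the registered task for every drifted setting, then record the fix."""
    drifted = plan_baseline(current, baseline)
    for setting in drifted:
        tasks[setting](device)            # e.g. re-upload the default OS image
        current[setting] = baseline[setting]
    return drifted


log: List[str] = []
tasks: Dict[str, Task] = {
    "os_image": lambda d: log.append(f"{d}: load default image"),
    "vlan": lambda d: log.append(f"{d}: reset VLANs"),
}
state = {"os_image": "v2.0-beta", "vlan": "default"}
default = {"os_image": "v1.2", "vlan": "default"}

print(restore("switch-7", state, default, tasks))   # ['os_image']
print(log)
```

The same registry of tasks could also back the menu-driven provisioning of the second phase, which is why building reusable task objects first pays off in the maintenance routines that follow. In this sketch a setting with no registered task raises a `KeyError`; a production routine would need to handle that case explicitly.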
[Figure: Automation in 3 steps: (1) topology design and resource sharing; (2) advanced automated provisioning; (3) automated test routines]
Full test automation requires the greatest time investment before results are seen, since a large library of building block test objects must be created before testers can effectively start to author their own test automation workflows in the TestShell GUI.
Conclusion
Test data centers represent a huge investment of capital and human resources. TestShell automation, accompanied by best practices methodologies, offers organizations an opportunity to optimize test data center performance. The net result is significant CAPEX and OPEX savings, increased business velocity and agility, and higher levels of market competitiveness.
For more information about QualiSystems, visit our website at www.qualisystems.com
WP-TA-1