Survey Results and Statistics
9. DA AND DATA VIRTUALIZATION
Data Virtualization (DV), while not a new concept, is still quite misunderstood.
Data virtualization allows an organization to make its enterprise data easily available to business users. From a more technical standpoint, data virtualization is a form of middleware that leverages high-performance software and an advanced computing architecture to integrate and deliver data from multiple, disparate sources in a loosely coupled, logically federated manner. It differs from traditional ETL/data warehouse solutions by leaving the data in place, in the originating data sources, and extracting it as and when the consuming applications need it. With the growth of data and the increasing complexity of IT infrastructures over the past decade or so, data virtualization is becoming ever more important. It can now provide numerous benefits to enterprises in many different arenas. Some of those benefits include:
• Gaining more business insights by leveraging all your data – Empowering people with instant access to all the data they want, the way they want it.
• Responding faster to your ever-changing analytics and BI needs – Five to ten times faster time to solution than traditional data integration.
• More cost-effective than data replication and consolidation – Reduces unnecessary copying of data. Data virtualization’s streamlined approach reduces complexity and saves money.
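The "leave the data in place" idea described above can be made concrete with a small sketch. This is not any vendor's product or API: the `VirtualLayer` class, the two sample sources (an in-memory SQLite table and a CSV string), and all names and figures in it are hypothetical, chosen only to illustrate a federated view that reads its sources at query time instead of copying them into a warehouse.

```python
import csv
import io
import sqlite3
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical source 1: an operational database that owns the orders.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (2, 75.5), (1, 30.0)]
)

# Hypothetical source 2: a CSV export maintained by another team.
CUSTOMERS_CSV = "customer_id,name\n1,Acme\n2,Globex\n"


def read_orders() -> List[Row]:
    """Fetch orders directly from the originating database."""
    cur = orders_db.execute("SELECT customer_id, amount FROM orders")
    return [{"customer_id": cid, "amount": amt} for cid, amt in cur]


def read_customers() -> List[Row]:
    """Fetch customers directly from the CSV source."""
    reader = csv.DictReader(io.StringIO(CUSTOMERS_CSV))
    return [{"customer_id": int(r["customer_id"]), "name": r["name"]} for r in reader]


class VirtualLayer:
    """Toy federation layer: stores how to reach each source, not the data."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], List[Row]]] = {}

    def register(self, name: str, fetch: Callable[[], List[Row]]) -> None:
        self._sources[name] = fetch

    def revenue_by_customer(self) -> Dict[str, float]:
        """Join the two sources at request time and total the amounts."""
        names = {c["customer_id"]: c["name"] for c in self._sources["customers"]()}
        totals: Dict[str, float] = {}
        for order in self._sources["orders"]():
            name = names[order["customer_id"]]
            totals[name] = totals.get(name, 0.0) + order["amount"]
        return totals


layer = VirtualLayer()
layer.register("orders", read_orders)
layer.register("customers", read_customers)

before = layer.revenue_by_customer()

# A new order lands in the source system; no reload or ETL job is run.
orders_db.execute("INSERT INTO orders VALUES (2, 24.5)")
after = layer.revenue_by_customer()

print(before)
print(after)
```

Because the layer holds only fetch functions, the second call picks up the new order without any copy step. A real data virtualization platform adds query optimization, caching, and security on top of this basic pattern.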
Yet even with those benefits, numerous enterprises are not moving forward with virtualization technologies. The first analysis section of this paper [Section 4–Addressing Enterprise Needs] clearly demonstrated the ambiguous position of data virtualization in the modern enterprise. Figure 9 asked respondents which elements of Data Management should or should not be included in their data architectures. DV placed first for “should be included” at 58.9%, and also first for “should not be included” at 18.6%.
Clearly, it remains one of the least-understood and least-utilized of all the elements discussed in this paper.
Therefore, to help provide further clarity to this often misunderstood architectural technology, the survey asked five questions about the utilization of data virtualization at the enterprise level.
Survey Results and Statistics
The first DV question of the survey asked which statement best represents the respondent’s organizational view of DV. The top two answers [Figure 23] show where DV stands within the Data Management industry and why more education is necessary:
• We are not very familiar with DV: 32.3%.
• We know what DV is, but not considering seriously at this time: 28.6%.
When viewed alongside the next two questions [Figures 24 and 25], the most prevalent path to DV becomes clearer. Each option in these questions was rated on a 1-5 scale, with 5 being the most likely to use or the best option. The results are shown in two formats: as a percentage and as a rating average. The best uses of DV for the respondents’ organizations are as an Agile BI Enabler (19.8%/3.03 rating) and for Access to New Data Sources (11.8%/2.96 rating) [Figure 24].
Figure 23 (133 respondents): Which of the following best represents your company’s view of Data Virtualization?
[Chart not reproduced; axis 0.0% to 35.0%. Answer options:]
• We are using or actively pursuing adoption of Data Virtualization technologies
• We know what DV is, and are keen to learn more about its benefits and uses
• We know what DV is, but not considering seriously at this time
• We are not very familiar with DV
The main factors or “pain points” that are pushing the respondents’ organizations towards DV integration into their existing systems are:
• Real-Time or On Demand access to information: 26.9%/3.21
• Reduce Replication of Data/Silos: 22.6%/3.18
• Time to Market/Agility: 24.3%/3.27
Figure 24 (123 respondents): How and where would you use Data Virtualization? (rate each on scale of 1-5, 5 being most likely to use)
[Chart not reproduced; axis 0 to 140; ratings from 1 (Least Likely to Use) to 5 (Most Likely to Use). Options rated:]
• Enterprise Strategy – Implement at enterprise / broad level to create Data Services / IaaS across analytical and operational uses
• Agile BI Enabler – Component to add agility to BI, EDW, MDM initiatives
• Single View Applications – Support Portal, Call Center etc. initiatives
• Managed Migration – Abstraction Layer for managed migration, mergers, acquisitions
• Access New Data Sources – Integrate Unstructured, Semi-Structured, Web, Cloud data more easily
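The two reporting formats used for the rating questions, a percentage and a rating average, can rank options differently. The sketch below shows how. It rests on an assumption the paper does not state: that the percentage is the share of respondents who gave an option the top rating. The function name `summarize_ratings` and the response counts are hypothetical and are not survey data.

```python
from typing import Dict, Tuple


def summarize_ratings(counts: Dict[int, int]) -> Tuple[float, float]:
    """Return (percent giving the top rating, mean rating) for one option.

    `counts` maps each rating value (e.g. 1..5) to the number of
    respondents who chose it. Treating the reported percentage as the
    top-rating share is an assumption made for this illustration.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses")
    top_rating = max(counts)
    top_pct = 100.0 * counts[top_rating] / total
    average = sum(rating * n for rating, n in counts.items()) / total
    return round(top_pct, 1), round(average, 2)


# Hypothetical counts (not from the survey): option A is polarizing,
# option B is rated moderately by most respondents.
option_a = {1: 40, 2: 10, 3: 10, 4: 10, 5: 30}
option_b = {1: 5, 2: 15, 3: 40, 4: 30, 5: 10}

summary_a = summarize_ratings(option_a)
summary_b = summarize_ratings(option_b)

print("A:", summary_a)  # A: (30.0, 2.8)
print("B:", summary_b)  # B: (10.0, 3.25)
```

Under this assumption, option A leads on percentage while option B leads on rating average, so an option can top one measure without topping the other.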
When asked to consider the most preferred approach to DV, respondents were given three different choices. “Best of Breed Data Virtualization Platform” had the highest percentage at 16.7%, though the highest rating average was for “BI Tools with Integrated Federation Capability” at 2.90 [Figure 26]:
Figure 25 (114 respondents): What are the main factors or pain points with current integration approach that is driving you to consider Data Virtualization (Rate 1-5)
[Chart not reproduced; axis 0 to 120; ratings from 1 (Least Important) to 5 (Most Important). Factors rated:]
• Time to Market / Agility
• Lower Integration Costs
• Real-time or On Demand access to information
• Reduce Replication of Data / Silos
• Complexity / Heterogeneity – Access XML, Big Data, NoSQL, Unstructured, Web
• Abstraction – Unified Business Views of Data
• Data Services Delivery – Secure enterprise data sharing
The final question for this section asked respondents to rank their criteria for the selection of a DV tool. They were given seven separate choices, with a possible ranking of 1-4 (4 being the best). The top three choices (they could select more than one) were [Figure 27]:
• Pricing/Total Cost of Ownership: 39.6%/3.12
• Performance, Caching, Scalability Features: 31.4%/2.84
• Ability to Handle Structured and Unstructured Data: 28%/2.49
Figure 26 (102 respondents): Which approach to Data Virtualization do you support more (Rate on 1-5 scale)
[Chart not reproduced; axis 0 to 120; ratings from 1 to 5. Approaches rated:]
• Extension to Incumbent Data Integration Vendors’ Products
• BI Tools with Integrated Federation Capability
• Best of Breed Data Virtualization Platform
Analysis of Results
The results of these questions suggest that it is still early in the consideration and adoption of data virtualization as part of an enterprise data strategy.
For example, the scores for the different ways of using data virtualization in Figure 24 were generally even. Yet the relatively high score in Figure 24 for accessing new data sources (integrating unstructured, semi-structured, cloud, and web data) contrasts somewhat with the relatively low score in Figure 25, which covers pain points and drivers for data virtualization, for “Complexity/Heterogeneity–Access XML, Big Data, NoSQL, Unstructured, Web.” One might infer that accessing new (and “big”) data sources is less of a priority, and that providing faster access to data in the data warehouse (“… add agility to BI, EDW” and “real-time or On-Demand access to information”) is in fact the more critical driver for introducing data virtualization into the enterprise today.
Figure 27 (104 respondents): Rank the Criteria for selection of DV tool (Rate on 1-4 scale)
[Chart not reproduced; axis 0 to 120; ratings from 1 (Low) to 4 (High). Criteria ranked:]
• Source Breadth – Access to most number of sources
• Specify your most important sources (other than databases):
• Ability to handle structured and unstructured data
• Modeling, Transformation, Governance Capabilities
• Performance, Caching, Scalability Features
• Data Services Publishing Options (SQL, Web Services, JSON, Portlets)
• Pricing / Total Cost of Ownership