
Cloud Integration and the Big Data Journey - Common Use-Case Patterns

OVERVIEW

The advent of cloud and hybrid architectures has enabled clients to rapidly stand up technology stacks that traditionally required specialized expertise and long lead times. Big Data, an umbrella term encompassing ingestion, processing, and analytics around structured and semi-structured data sets, has been revolutionary for the data warehousing and analytics market. These data sets, including data from cloud-based solutions, sensors, and Internet-enabled devices, are often large and difficult to process using standard relational data warehousing methodologies. Big Data solutions take an alternative approach to processing these data sets by leveraging both cloud-based and non-relational technologies to derive analytical value.

One significant customer problem with Big Data involves the rapidly changing technology stacks and the specialized code required to work effectively in the space. Companies are reluctant to invest too deeply in one technology for fear of this rapid change. However, many innovative applications leverage Big Data to improve customer satisfaction, reduce operational risk, and increase sell-through. Customers want the benefit of Big Data, but often do not know how much of an investment is required to begin.

To that end, we will review three specific customer use case patterns in this white paper. These use cases discuss both cloud architectures and Big Data solutions in detail, and show how to remove complexity, reduce operational risk, and improve customer satisfaction.

This technical brief is intended to be a companion to “The Big Data Journey” webinar. The author would like to extend his thanks to John Haddad of Informatica, who provided some of the architecture slides within this white paper.

A White Paper | August, 2014

USE CASE 1: REMOVE COMPLEXITY

A pharmaceutical client uses several cloud-based applications for sales force and operations enablement. These applications allow business analysts to rapidly provision new functionality on the fly.

However, IT must also possess the agility to rapidly and continuously provision and integrate cloud-based application data while maintaining existing data warehouse integrity and data lineage. We leverage cloud services solutions like Informatica Cloud Services (ICS) to help address this challenge.

At our pharmaceutical client, the sales team manages multiple new products and adds new Salesforce (SFDC) columns at the rate of one or two a week. The existing ETL process had to replicate data from SFDC down to the main enterprise data warehouse (EDW). Each new column required a corresponding ETL change and an update to jobs, causing significant IT development churn.

We leveraged ICS’s SFDC replication solution to mirror each SFDC table into a staging environment within the DW. The ICS workflow is managed through a web-based interface, which is available to the same business analyst who adds fields to SFDC. If a new column has been added to SFDC, the analyst logs into ICS and, in less than 5 minutes, configures the new column to be replicated to the DW.

In the diagram above, the green databases represent existing SQL Server databases that were not impacted by the switch in replication architecture. We simply removed the existing ETL code feeding the ‘SQL 2008 Replication Stage’ target, and replaced it with an ICS endpoint.

Once replicated to a DW staging environment, the SFDC tables are wrapped with views to create a dimensional analytical layer. This layer is immediately available to trained business analysts using BI and visualization tools to perform data analysis. Insights from these analyses are vetted and implemented by the DW team and then turned into operational reporting in the enterprise BI environment on a weekly basis.
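
As a rough illustration of the view-wrapping step, the sketch below uses Python's built-in sqlite3 module as a stand-in for the client's SQL Server warehouse. The staging table stg_sfdc_account, its columns, and the view dim_account are hypothetical names chosen for the example; the actual schema is not given in this paper. The point it shows is that a view with an explicit column list keeps working when a newly replicated column lands in staging.

```python
import sqlite3


def build_dimensional_view(conn: sqlite3.Connection) -> None:
    """Create a hypothetical SFDC staging table and wrap it with a view."""
    conn.executescript(
        """
        CREATE TABLE stg_sfdc_account (
            Id TEXT PRIMARY KEY,
            Name TEXT,
            BillingState TEXT,
            Product_Focus__c TEXT
        );
        -- The view exposes analyst-friendly names and an explicit column list.
        CREATE VIEW dim_account AS
        SELECT Id               AS account_key,
               Name             AS account_name,
               BillingState     AS billing_state,
               Product_Focus__c AS product_focus
        FROM stg_sfdc_account;
        """
    )


conn = sqlite3.connect(":memory:")
build_dimensional_view(conn)
conn.execute(
    "INSERT INTO stg_sfdc_account VALUES ('001A', 'Acme Clinic', 'MA', 'Cardiology')"
)

# Simulate a newly replicated SFDC column arriving in the staging table.
conn.execute("ALTER TABLE stg_sfdc_account ADD COLUMN Region__c TEXT")

# The existing view is unaffected by the extra staging column.
rows = conn.execute(
    "SELECT account_key, account_name, billing_state, product_focus FROM dim_account"
).fetchall()
```

In this sketch the new column would only reach analysts once someone adds it to the view definition, which keeps the analytical layer stable while staging changes underneath it.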

USE CASE 2: REDUCE OPERATIONAL RISK

An oil and gas client was looking to understand performance and maintenance activity around their wells. Specifically, the oil and gas industry uses a large number of sensors to monitor well activity. These sensors measure pressure, level, and flow rates, and are prevalent within the industry. They come with operational monitoring solutions that allow technicians to spot up-to-the-second deviations and apply corrective action.

Maintenance, as you can imagine, is critical for both production and safety, and often the earlier a problem is caught, the cheaper it is to fix. Our client was very interested in knowing about maintenance issues as soon as possible, and ideally, applying preventative maintenance to prevent a larger issue.

In order to apply more intelligence towards preventative maintenance, the customer wanted to load sensor data to an existing data warehousing solution. However, when existing ETL infrastructure was leveraged to stream sensor data directly into the warehouse, we quickly encountered performance issues around the sheer volume of data that was being sourced.

If you think about it, sensors report readings in real time. With multiple sensors per well, the volume of data easily eclipsed hundreds of gigabytes a day for the client’s production wells. This created a serious problem with both performance and the expense of storing the data.

Upon further analysis, we realized that we needed to do two things with the full array of sensor data. We were looking to apply algorithms to spot deviations in time series data, specifically deviations that went above a certain threshold for a period of time. These deviations may change, based on measurements of multiple sensor arrays. In short, we were attempting to apply matrix algebra to the existing series of sensor data. Once the deviations were spotted, we wanted to provide time-bounded series of this data to the BI environment for reporting and simple analysis.
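
The core of the first requirement, finding readings that stay above a threshold for a sustained period, can be sketched as follows. This is a minimal single-sensor illustration only: the function name, the sample pressure values, and the threshold are assumptions made for the example, and it does not attempt the multi-sensor matrix algebra described above.

```python
from typing import List, Tuple


def sustained_deviations(
    readings: List[float], threshold: float, min_samples: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of
    consecutive readings above `threshold` lasting at least `min_samples`."""
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current above-threshold run began
    for i, value in enumerate(readings):
        if value > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                runs.append((start, i))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(readings) - start >= min_samples:
        runs.append((start, len(readings)))
    return runs


# Hypothetical pressure readings from one sensor, one value per time step.
pressure = [100, 101, 130, 131, 132, 99, 140, 100, 135, 136, 137, 138]
windows = sustained_deviations(pressure, threshold=120, min_samples=3)
```

Each returned window marks a time-bounded slice of the series, which is the kind of reduced output that could be passed on to the BI environment in place of the raw stream. The single spike at index 6 is ignored because it does not last long enough.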

The combination of these two requirements allowed us to introduce a Big Data approach into our overall solution pattern in order to perform ELT pre-processing of this data, by applying matrix algebra to the large volumes of sensor data. This sensor data also resembled JSON data structures in nature, and was more suitable for a Big Data solution, specifically Hadoop. We leveraged Hadoop to filter the data, apply matrix algebra to look for anomalies, and roughly model the filtered records for data warehouse ingestion, by transforming the sensor records from JSON to a relational structure.
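
The JSON-to-relational step can be pictured with the small sketch below. It is plain Python rather than a Hadoop job, and the record layout (well_id, ts, and a readings object keyed by sensor name) is an assumed shape for illustration; the client's actual sensor format is not described in this paper.

```python
import json
from typing import Dict, Iterator, List, Tuple

# One relational row: (well_id, timestamp, sensor_name, value)
Row = Tuple[str, str, str, float]


def flatten_reading(record: Dict) -> Iterator[Row]:
    """Turn one nested sensor record into one row per sensor value."""
    for sensor, value in sorted(record["readings"].items()):
        yield (record["well_id"], record["ts"], sensor, float(value))


def json_lines_to_rows(lines: List[str]) -> List[Row]:
    """Parse newline-delimited JSON records into flat rows for loading."""
    rows: List[Row] = []
    for line in lines:
        rows.extend(flatten_reading(json.loads(line)))
    return rows


# Hypothetical record in the assumed layout.
sample = [
    '{"well_id": "W-17", "ts": "2014-08-01T00:00:00Z", '
    '"readings": {"pressure": 131.5, "flow": 42}}',
]
rows = json_lines_to_rows(sample)
```

The resulting rows have a fixed set of columns, so they can be bulk-loaded into a warehouse table without the warehouse needing to understand the nested source format.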

PowerCenter BDE allows you to leverage Big Data solutions within your EDW environment, while leveraging existing skillsets.

This solution allowed us to significantly reduce load time and space consumed for the EDW. More importantly, the customer cut the time needed to find maintenance issues by almost 90%. The majority of these savings came from the time spent ingesting and processing the data, along with reduced operational expense and load on the data warehouse.

USE CASE 3: IMPROVE CUSTOMER SATISFACTION

An online retail company has been selling to customers over the internet for many years, and has accumulated a large data warehouse on customer activity during that time. The retailer is now interested in linking social media elements and real-time customer website navigation into their selling strategy, due to significant user adoption of shopping via mobile. This likely comes as no surprise to many readers of this white paper. During the last five years, mobile shopping has become mainstream, and dominant in some sectors such as books and electronics.

However, the retailer also discovered that mobile customers have a higher rate of shopping cart abandonment compared to traditional laptop browser customers. For various reasons and distractions, mobile customers are abandoning more shopping carts; converting even a small share of these abandoned carts would result in a significant revenue increase for our retailer.

In order to get more mobile conversions, our retailer wanted to provide a more personalized shopping experience to mobile customers, by dynamically modifying content as the customer interacts with the site. The content would present both products of higher interest, as well as potentially offer aggressive pricing on selected items for certain shopping cart mixes.

We leveraged a Big Data / DW solution pattern built around a NoSQL database in two ways: 1) to crunch weblogs in real time, and 2) to analyze a customer’s Twitter stream, in order to provide items of interest and potential discounting. All of this was linked to a historical customer score made available by the traditional data warehouse via a web service.
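
To make the pattern concrete, the sketch below combines a session's weblog activity with a historical customer score to decide what to present. Everything in it is hypothetical: the tab-separated log format, the /product/ URL prefix, the 0.7 discount cutoff, and both function names are assumptions for illustration. It processes a list of lines rather than a live stream, it leaves out the Twitter analysis, and the score is passed in directly where the real solution would fetch it from the data warehouse web service.

```python
from collections import Counter
from typing import Dict, Iterable, List


def top_viewed_products(
    weblog_lines: Iterable[str], session_id: str, n: int = 2
) -> List[str]:
    """Return the n most-viewed products for one session.

    Assumed line format: '<session_id>\t<url>'.
    """
    prefix = "/product/"
    counts: Counter = Counter()
    for line in weblog_lines:
        sid, url = line.rstrip("\n").split("\t")
        if sid == session_id and url.startswith(prefix):
            counts[url[len(prefix):]] += 1
    return [product for product, _ in counts.most_common(n)]


def choose_offer(
    products: List[str], historical_score: float, discount_cutoff: float = 0.7
) -> Dict[str, object]:
    """Pair products of interest with a discount flag driven by the
    customer's historical score (hypothetical 0-to-1 scale)."""
    return {
        "products": products,
        "discount": historical_score >= discount_cutoff,
    }


log = [
    "s1\t/product/boots",
    "s1\t/product/tent",
    "s2\t/product/stove",
    "s1\t/product/boots",
    "s1\t/cart",
]
top = top_viewed_products(log, "s1")
offer = choose_offer(top, historical_score=0.82)
```

Keeping the score lookup separate from the weblog crunching mirrors the split in the pattern above: the fast-moving session data lives on the Big Data side, while the slowly changing customer history stays in the warehouse.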

Again, you can leverage Informatica’s PowerCenter Big Data Edition (“BDE”) in order to ease the processing of weblogs and connectivity to Big Data solutions. In addition, you can leverage the Social Media Connector to connect directly to a customer’s Twitter stream to source that data into the NoSQL database for further analysis. PowerCenter BDE allows you to leverage Big Data solutions within your EDW environment, while leveraging existing skillsets.

ABOUT CORPORATE TECHNOLOGIES

Corporate Technologies provides high value services to clients. Through the effective application of technologies like Business Intelligence, Data Integration and Management, Enterprise and Cloud Computing, we help clients implement the right IT solutions to empower business innovation and dynamic scalability. From leveraging business intelligence to rethinking the efficiency of the data center, we are your strategic partner for everything from data management to information delivery. Today’s IT solutions have to be highly integrated to solve the complex business challenges that organizations face. Your business cannot afford to work with multiple consulting organizations specializing in “silos of experience.” Corporate Technologies’ engineering team understands how the implementation of any new technology must support both the business and infrastructure requirements. Our ability to successfully integrate Business Intelligence, Data Management and Systems Technologies by merging complex system and application structures is a rarity in the industry. We focus on solving complex business challenges. We create long term relationships with our clients and partners to deliver recommendations and innovative, high quality, high value IT solutions.
